mlstream.models package¶
Submodules¶
mlstream.models.base_models module¶
-
class
mlstream.models.base_models.
LumpedModel
¶ Bases:
object
Model that operates on lumped (daily, basin-averaged) inputs.
-
load
(model_file: pathlib.Path) → None¶ Loads a trained and pickled model.
Parameters: model_file (Path) – Path to the stored model.
-
predict
(ds: mlstream.datasets.LumpedBasin) → numpy.ndarray¶ Generates predictions for a basin.
Parameters: ds (LumpedBasin) – Dataset of the basin to predict. Returns: Array of predictions. Return type: np.ndarray
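The load/predict contract above can be illustrated with a minimal, self-contained sketch. It does not use mlstream: MeanModel, fit, save and the n_steps argument are hypothetical stand-ins (the documented predict takes a LumpedBasin and returns an np.ndarray), and the only thing taken from the documentation is that load restores a pickled model from a Path and returns None.

```python
import pickle
import tempfile
from pathlib import Path
from typing import List, Optional


class MeanModel:
    """Hypothetical stand-in that mimics the load/predict interface."""

    def __init__(self) -> None:
        self.mean: Optional[float] = None

    def fit(self, targets: List[float]) -> None:
        # "Training" here only stores the mean of the targets.
        self.mean = sum(targets) / len(targets)

    def save(self, model_file: Path) -> None:
        with model_file.open("wb") as f:
            pickle.dump(self.mean, f)

    def load(self, model_file: Path) -> None:
        # Restores the pickled state in place and returns None.
        with model_file.open("rb") as f:
            self.mean = pickle.load(f)

    def predict(self, n_steps: int) -> List[float]:
        # One prediction per time step (a plain list instead of np.ndarray).
        if self.mean is None:
            raise RuntimeError("call fit() or load() first")
        return [self.mean] * n_steps


trained = MeanModel()
trained.fit([1.0, 2.0, 3.0])
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.p"
    trained.save(path)
    restored = MeanModel()
    restored.load(path)
restored_predictions = restored.predict(2)
```

Because load mutates the instance instead of returning a new object, the caller constructs the model first and then restores its state.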
-
mlstream.models.lstm module¶
Large parts of this implementation are taken from https://github.com/kratzert/ealstm_regional_modeling.
-
class
mlstream.models.lstm.
EALSTM
(input_size_dyn: int, input_size_stat: int, hidden_size: int, batch_first: bool = True, initial_forget_bias: int = 0)¶ Bases:
torch.nn.Module
Implementation of the Entity-Aware-LSTM (EA-LSTM).
Model details: https://arxiv.org/abs/1907.08456
Parameters: - input_size_dyn (int) – Number of dynamic features, i.e. those passed to the LSTM at each time step.
- input_size_stat (int) – Number of static features, i.e. those used to modulate the input gate.
- hidden_size (int) – Number of hidden/memory cells.
- batch_first (bool, optional) – If True, expects batch inputs of shape [batch, seq, features]; otherwise, the shape has to be [seq, batch, features]. By default True.
- initial_forget_bias (int, optional) – Value of the initial forget gate bias, by default 0
-
forward
(x_d: torch.Tensor, x_s: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor]¶ Performs a forward pass on the model.
Parameters: - x_d (torch.Tensor) – Tensor, containing a batch of sequences of the dynamic features. Shape has to match the format specified with batch_first.
- x_s (torch.Tensor) – Tensor, containing a batch of static features.
Returns: - h_n (torch.Tensor) – The hidden states of each time step of each sample in the batch.
- c_n (torch.Tensor) – The cell states of each time step of each sample in the batch.
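The role of the static features can be made concrete with a small sketch of the EA-LSTM recurrence from the linked paper: the input gate is computed once from x_s, while the forget, cell and output gates are computed from x_d at every time step. This is a plain-Python illustration for a single sample with random weights and zero biases, not the mlstream implementation; ealstm_forward and its helpers are hypothetical names.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def matvec(w: Matrix, x: Vector) -> Vector:
    # Computes w @ x for a weight matrix of shape [out, in].
    return [sum(w_k * x_k for w_k, x_k in zip(row, x)) for row in w]


def rand_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def ealstm_forward(
    x_d: Matrix, x_s: Vector, hidden_size: int, seed: int = 0
) -> Tuple[Matrix, Matrix, Vector]:
    """x_d has shape [seq, input_size_dyn], x_s has shape [input_size_stat]."""
    rng = random.Random(seed)
    n_dyn, n_stat = len(x_d[0]), len(x_s)
    w_i = rand_matrix(hidden_size, n_stat, rng)
    # The f, g and o gates each see the dynamic input concatenated with h[t-1].
    w_f, w_g, w_o = (rand_matrix(hidden_size, n_dyn + hidden_size, rng) for _ in range(3))

    # Input gate: depends only on the static features, so it is constant in time.
    i_gate = [sigmoid(z) for z in matvec(w_i, x_s)]

    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    h_n, c_n = [], []
    for x_t in x_d:
        xh = x_t + h  # list concatenation of dynamic input and previous hidden state
        f = [sigmoid(z) for z in matvec(w_f, xh)]
        g = [math.tanh(z) for z in matvec(w_g, xh)]
        o = [sigmoid(z) for z in matvec(w_o, xh)]
        c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c, i_gate, g)]
        h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
        h_n.append(h)
        c_n.append(c)
    return h_n, c_n, i_gate


x_d = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # 3 time steps, 2 dynamic features
x_s = [1.0, -1.0, 0.5]                       # 3 static features
h_n, c_n, i_gate = ealstm_forward(x_d, x_s, hidden_size=4)
```

As in the documented return values, h_n and c_n hold one state vector per time step; i_gate is returned here only to show that it does not change across the sequence.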
-
reset_parameters
()¶ Initializes all learnable parameters of the LSTM.
-
class
mlstream.models.lstm.
LSTM
(input_size: int, hidden_size: int, batch_first: bool = True, initial_forget_bias: int = 0)¶ Bases:
torch.nn.Module
Implementation of the standard LSTM.
Parameters: - input_size (int) – Number of input features
- hidden_size (int) – Number of hidden/memory cells.
- batch_first (bool, optional) – If True, expects batch inputs of shape [batch, seq, features]; otherwise, the shape has to be [seq, batch, features]. By default True.
- initial_forget_bias (int, optional) – Value of the initial forget gate bias, by default 0
-
forward
(x: torch.Tensor) → Tuple[torch.Tensor, torch.Tensor]¶ Performs a forward pass on the model.
Parameters: x (torch.Tensor) – Tensor containing a batch of input sequences. The shape must match the format specified by the batch_first argument.
Returns: - h_n (torch.Tensor) – The hidden states of each time step of each sample in the batch.
- c_n (torch.Tensor) – The cell states of each time step of each sample in the batch.
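The two input layouts selected by batch_first differ only in the order of the first two axes. The sketch below shows the conversion on nested lists so that it runs without PyTorch; to_seq_first is a hypothetical helper, not part of mlstream.

```python
from typing import List

Tensor3D = List[List[List[float]]]


def to_seq_first(batch_first_data: Tensor3D) -> Tensor3D:
    # [batch, seq, features] -> [seq, batch, features]:
    # zip(*...) groups the t-th time step of every sample together.
    return [list(step) for step in zip(*batch_first_data)]


# Two samples, three time steps, one feature per step.
batch_first_data = [
    [[1.0], [2.0], [3.0]],
    [[4.0], [5.0], [6.0]],
]
seq_first_data = to_seq_first(batch_first_data)
```

After the conversion, seq_first_data[t][b] holds the same feature vector as batch_first_data[b][t].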
-
reset_parameters
()¶ Initializes all learnable parameters of the LSTM.
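To illustrate what the initial_forget_bias parameter implies, the following sketch evaluates the forget-gate activation for the two default values that appear on this page (0 for LSTM and EALSTM, 5 for LumpedLSTM and Model). It assumes the usual LSTM formulation, in which the forget gate is a sigmoid of its pre-activation, and that all other contributions to the pre-activation are close to zero at initialization; forget_gate_at_init is a hypothetical helper.

```python
import math


def forget_gate_at_init(initial_forget_bias: float) -> float:
    # Sigmoid of the bias alone, i.e. assuming the weighted inputs are ~0.
    return 1.0 / (1.0 + math.exp(-initial_forget_bias))


gate_bias_0 = forget_gate_at_init(0)  # about half of the cell state is kept
gate_bias_5 = forget_gate_at_init(5)  # almost all of the cell state is kept
```

Under these assumptions, a larger initial bias means the cell state is carried over almost unchanged in the first training steps.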
-
class
mlstream.models.lstm.
LumpedLSTM
(num_dynamic_vars: int, num_static_vars: int, use_mse: bool = True, no_static: bool = False, concat_static: bool = False, run_dir: pathlib.Path = None, n_jobs: int = 1, hidden_size: int = 256, learning_rate: float = 0.001, learning_rates: Dict[KT, VT] = {}, epochs: int = 30, initial_forget_bias: int = 5, dropout: float = 0.0, batch_size: int = 256, clip_norm: bool = True, clip_value: float = 1.0)¶ Bases:
mlstream.models.base_models.LumpedModel
(EA-)LSTM model for lumped data.
-
load
(model_file: pathlib.Path) → None¶ Loads a trained and pickled model.
Parameters: model_file (Path) – Path to the stored model.
-
predict
(ds: mlstream.datasets.LumpedBasin) → numpy.ndarray¶ Generates predictions for a basin.
Parameters: ds (LumpedBasin) – Dataset of the basin to predict. Returns: Array of predictions. Return type: np.ndarray
-
-
class
mlstream.models.lstm.
Model
(input_size_dyn: int, input_size_stat: int, hidden_size: int, initial_forget_bias: int = 5, dropout: float = 0.0, concat_static: bool = False, no_static: bool = False)¶ Bases:
torch.nn.Module
Wrapper class that connects the LSTM/EA-LSTM with a fully connected layer.
-
forward
(x_d: torch.Tensor, x_s: torch.Tensor = None) → Tuple[torch.Tensor, torch.Tensor, torch.Tensor]¶ Runs a forward pass through the model.
Parameters: - x_d (torch.Tensor) – Tensor containing the dynamic input features of shape [batch, seq_length, n_features].
- x_s (torch.Tensor, optional) – Tensor containing the static catchment characteristics, by default None.
Returns: - out (torch.Tensor) – Tensor containing the network predictions
- h_n (torch.Tensor) – Tensor containing the hidden states of each time step
- c_n (torch.Tensor) – Tensor containing the cell states of each time step
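The concat_static flag is not described on this page. One plausible reading, stated here as an assumption, is that the static catchment characteristics are appended to the dynamic features at every time step, so that a standard LSTM can consume them. The sketch below shows that operation for a single sample on plain lists; concat_static_features is a hypothetical helper.

```python
from typing import List


def concat_static_features(x_d: List[List[float]], x_s: List[float]) -> List[List[float]]:
    # x_d: [seq_length, n_dynamic], x_s: [n_static]
    # Result: [seq_length, n_dynamic + n_static], with x_s repeated at each step.
    return [step + x_s for step in x_d]


x_d = [[1.0, 2.0], [3.0, 4.0]]  # two time steps, two dynamic features
x_s = [9.0]                     # one static feature
x_concat = concat_static_features(x_d, x_s)
```

Under this reading, the network input size grows from n_dynamic to n_dynamic + n_static, and no separate static input is needed.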
-
mlstream.models.nseloss module¶
-
class
mlstream.models.nseloss.
NSELoss
(eps: float = 0.1)¶ Bases:
torch.nn.Module
Calculates (batch-wise) NSE Loss.
Each sample i is weighted by 1 / (std_i + eps)^2, where std_i is the standard deviation of the discharge of the basin to which the sample belongs.
Parameters: eps (float) – Constant added to the weight for numerical stability and smoothing. Defaults to 0.1.
-
forward
(y_pred: torch.Tensor, y_true: torch.Tensor, q_stds: torch.Tensor)¶ Calculates the batch-wise NSE loss function.
Parameters: - y_pred (torch.Tensor) – Tensor containing the network prediction.
- y_true (torch.Tensor) – Tensor containing the true discharge values
- q_stds (torch.Tensor) – Tensor containing the discharge std (calculated over training period) of each sample
Returns: The batch-wise NSE-Loss
Return type: torch.Tensor
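The weighting described above can be written out as a short sketch. It follows the stated weight 1 / (std_i + eps)^2 per sample; averaging the weighted squared errors over the batch is an assumption, and the function operates on plain Python sequences rather than torch.Tensor so that it runs without PyTorch. The name nse_loss is hypothetical.

```python
from typing import Sequence


def nse_loss(
    y_pred: Sequence[float],
    y_true: Sequence[float],
    q_stds: Sequence[float],
    eps: float = 0.1,
) -> float:
    # Per-sample weight: 1 / (std_i + eps)^2, as stated in the class description.
    weights = [1.0 / (std + eps) ** 2 for std in q_stds]
    # Weighted squared error of each sample.
    scaled = [w * (p - t) ** 2 for w, p, t in zip(weights, y_pred, y_true)]
    # Assumption: the batch-wise loss is the mean of the weighted errors.
    return sum(scaled) / len(scaled)


# Same absolute error in two basins with different discharge variability.
loss = nse_loss(y_pred=[1.0, 1.0], y_true=[0.0, 0.0], q_stds=[0.9, 1.9])
```

In this example the error from the basin with the larger discharge standard deviation contributes a quarter as much as the error from the other basin, which is the intended effect of the weighting.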
-
-
class
mlstream.models.nseloss.
XGBNSEObjective
(dummy_target, actual_target, q_stds, eps: float = 0.1)¶ Bases:
object
Custom NSE XGBoost objective.
This is a bit of a hack: We use a unique dummy target value for each sample, allowing us to look up the q_std that corresponds to the sample’s station. When calculating the loss, we replace the dummy with the actual target so the model learns the right thing.
-
neg_nse_metric_sklearn
(estimator, X, y_true)¶ Negative NSE metric for sklearn.
-
nse
(y_pred, y_true, q_stds)¶
-
nse_metric_xgb
(y_pred, y_true)¶ NSE metric for XGBoost.
-
nse_objective_xgb
(y_pred, dtrain)¶ NSE objective for XGBoost (non-sklearn API).
-
nse_objective_xgb_sklearn_api
(y_true, y_pred)¶ NSE objective for XGBoost (sklearn API).
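The dummy-target lookup described for XGBNSEObjective can be sketched as follows. This is a simplified, hypothetical stand-in and not the mlstream code: it uses integer dummy targets as dictionary keys and returns the gradient and Hessian of the unnormalized weighted squared error w_i * (y_pred_i - y_true_i)^2, which is an assumption about the exact loss scaling. The (y_true, y_pred) argument order mirrors the documented sklearn-API objective.

```python
from typing import Dict, List, Sequence, Tuple


class DummyTargetObjective:
    """Hypothetical, simplified illustration of the lookup trick."""

    def __init__(
        self,
        dummy_target: Sequence[int],
        actual_target: Sequence[float],
        q_stds: Sequence[float],
        eps: float = 0.1,
    ) -> None:
        # Each unique dummy value identifies one sample and maps to its
        # actual target and the discharge std of its station.
        self.lookup: Dict[int, Tuple[float, float]] = {
            d: (a, s) for d, a, s in zip(dummy_target, actual_target, q_stds)
        }
        self.eps = eps

    def objective(
        self, y_dummy: Sequence[int], y_pred: Sequence[float]
    ) -> Tuple[List[float], List[float]]:
        grad, hess = [], []
        for d, p in zip(y_dummy, y_pred):
            actual, std = self.lookup[d]  # replace the dummy with the real target
            w = 1.0 / (std + self.eps) ** 2
            grad.append(2.0 * w * (p - actual))  # d/dp of w * (p - actual)^2
            hess.append(2.0 * w)                 # second derivative
        return grad, hess


obj = DummyTargetObjective(
    dummy_target=[0, 1, 2],
    actual_target=[1.0, 2.0, 3.0],
    q_stds=[0.9, 0.9, 1.9],
)
# The booster may pass any subset of samples in any order.
grad, hess = obj.objective(y_dummy=[2, 0], y_pred=[3.0, 2.0])
```

The lookup is what makes the per-station weight available inside the objective, since the objective callback only receives labels and predictions.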
-
mlstream.models.sklearn_models module¶
-
class
mlstream.models.sklearn_models.
LumpedSklearnRegression
(model: sklearn.base.BaseEstimator, no_static: bool = False, concat_static: bool = True, run_dir: pathlib.Path = None, n_jobs: int = 1)¶ Bases:
mlstream.models.base_models.LumpedModel
Wrapper for scikit-learn regression models on lumped data.
-
load
(model_file: pathlib.Path) → None¶ Loads a trained and pickled model.
Parameters: model_file (Path) – Path to the stored model.
-
predict
(ds: mlstream.datasets.LumpedBasin) → numpy.ndarray¶ Generates predictions for a basin.
Parameters: ds (LumpedBasin) – Dataset of the basin to predict. Returns: Array of predictions. Return type: np.ndarray
-
mlstream.models.xgboost module¶
-
class
mlstream.models.xgboost.
LumpedXGBoost
(no_static: bool = False, concat_static: bool = True, use_mse: bool = False, run_dir: pathlib.Path = None, n_jobs: int = 1, seed: int = 0, n_estimators: int = 100, learning_rate: float = 0.01, early_stopping_rounds: int = None, n_cv: int = 5, param_dist: Dict[KT, VT] = None, param_search_n_estimators: int = None, param_search_n_iter: int = None, param_search_early_stopping_rounds: int = None, reg_search_param_dist: Dict[KT, VT] = None, reg_search_n_iter: int = None, model_path: pathlib.Path = None)¶ Bases:
mlstream.models.base_models.LumpedModel
Wrapper for XGBoost model on lumped data.
-
load
(model_path: pathlib.Path)¶ Loads a trained and pickled model.
Parameters: model_path (Path) – Path to the stored model.
-
predict
(ds: mlstream.datasets.LumpedBasin) → numpy.ndarray¶ Generates predictions for a basin.
Parameters: ds (LumpedBasin) – Dataset of the basin to predict. Returns: Array of predictions. Return type: np.ndarray
-