stouputils.data_science.mlflow_utils module

This module contains utility functions for working with MLflow, including functions for:

Getting the artifact path from the current mlflow run
Getting the weights path
Getting the runs by experiment name
Getting the runs by model name
Logging the history of the model to the current mlflow run
Starting a new mlflow run
Getting the best run by a metric
Loading a model from a run

get_artifact_path(from_string: str = '', os_name: str = 'posix') → str

Get the artifact path from the current mlflow run (without the file:// prefix).

Handles the different path formats for Windows and Unix-based systems.

Parameters:

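from_string (str) – Path to the artifact (optional, defaults to the artifact path of the current mlflow run)
os_name (str) – OS name (optional, defaults to os.name; the 'posix' shown in the signature is presumably the value of os.name where this page was built)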

Returns:

The artifact path

Return type:

str
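
As a rough illustration of the prefix handling described above, the sketch below strips a file:// scheme from an artifact URI so that it reproduces the documented get_weights_path examples. The helper name strip_file_prefix is hypothetical, and this is not necessarily how the library implements it.

```python
def strip_file_prefix(uri: str, os_name: str = "posix") -> str:
    """Hypothetical helper: drop a leading file:// scheme from an artifact URI."""
    if os_name == "nt":
        # Windows URIs look like file:///C:/..., so the third slash is dropped too.
        prefix = "file:///"
    else:
        # POSIX URIs look like file:///path, where the third slash is the root.
        prefix = "file://"
    # Leave the string untouched when it carries no file scheme.
    return uri[len(prefix):] if uri.startswith(prefix) else uri


print(strip_file_prefix("file:///path/to/artifact", os_name="posix"))
print(strip_file_prefix("file:///C:/path/to/artifact", os_name="nt"))
```

Treating the third slash differently per platform is the design choice here: on POSIX it is the filesystem root, while on Windows it only separates the scheme from the drive letter.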

get_weights_path(from_string: str = '', weights_name: str = 'best_model.keras', os_name: str = 'posix') → str

Get the weights path from the current mlflow run.

Parameters:

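from_string (str) – Path to the artifact (optional, defaults to the artifact path of the current mlflow run)
weights_name (str) – Name of the weights file (optional, defaults to “best_model.keras”)
os_name (str) – OS name (optional, defaults to os.name; the 'posix' shown in the signature is presumably the value of os.name where this page was built)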

Returns:

The weights path

Return type:

str

Examples

>>> get_weights_path(from_string="file:///path/to/artifact", weights_name="best_model.keras", os_name="posix")
'/path/to/artifact/best_model.keras'

>>> get_weights_path(from_string="file:///C:/path/to/artifact", weights_name="best_model.keras", os_name="nt")
'C:/path/to/artifact/best_model.keras'

get_runs_by_experiment_name(experiment_name: str, filter_string: str = '', set_experiment: bool = False) → list[Run]

Get the runs by experiment name.

Parameters:

experiment_name (str) – Name of the experiment
filter_string (str) – Filter string to apply to the runs
set_experiment (bool) – Whether to set the experiment as the active MLflow experiment

Returns:

List of runs

Return type:

list[Run]
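
To make the filter_string parameter more concrete, here is a minimal sketch that builds a string in MLflow's run search filter syntax. The helper build_filter and the metric and parameter names are hypothetical, and it assumes filter_string is forwarded to MLflow's run search unchanged.

```python
def build_filter(min_metrics: dict[str, float], params: dict[str, str]) -> str:
    """Hypothetical helper: join conditions using MLflow's search filter syntax."""
    # Metric conditions compare against numbers.
    clauses = [f"metrics.{name} >= {value}" for name, value in min_metrics.items()]
    # Parameter values are stored as strings, so they are quoted.
    clauses += [f"params.{name} = '{value}'" for name, value in params.items()]
    return " and ".join(clauses)


filter_string = build_filter({"val_accuracy": 0.9}, {"optimizer": "adam"})
print(filter_string)
```

The resulting string could then be passed as the filter_string argument.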

get_runs_by_model_name(experiment_name: str, model_name: str, set_experiment: bool = False) → list[Run]

Get the runs by model name.

Parameters:

experiment_name (str) – Name of the experiment
model_name (str) – Name of the model
set_experiment (bool) – Whether to set the experiment as the active MLflow experiment

Returns:

List of runs

Return type:

list[Run]

log_history(history: dict[str, list[Any]], prefix: str = 'history', **kwargs: Any) → None

Log the history of the model to the current mlflow run.

Parameters:

history (dict[str, list[Any]]) – History of the model (usually the history attribute of a Keras History object, i.e. history.history)
**kwargs (Any) – Additional arguments to pass to mlflow.log_metric
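
The sketch below shows one plausible way a per-epoch history dictionary could be turned into metric logging calls. It is not the library's implementation: the function name log_history_sketch, the injectable log_metric callable, and the "<prefix>_<metric>" key format are all assumptions made for illustration.

```python
from typing import Any, Callable


def log_history_sketch(
    history: dict[str, list[Any]],
    log_metric: Callable[..., None],
    prefix: str = "history",
) -> None:
    """Hypothetical sketch: log each value with its epoch index as the step."""
    for metric, values in history.items():
        for epoch, value in enumerate(values):
            # Assumed key format "<prefix>_<metric>"; the real format may differ.
            log_metric(f"{prefix}_{metric}", value, step=epoch)


# Record the calls in a list instead of contacting a tracking server.
calls: list[tuple[str, Any, int]] = []
log_history_sketch(
    {"loss": [0.9, 0.5], "accuracy": [0.6, 0.8]},
    log_metric=lambda key, value, step: calls.append((key, value, step)),
)
print(calls)
```

With an active run, mlflow.log_metric could be passed as log_metric, since it accepts key, value and step arguments.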

start_run(mlflow_uri: str, experiment_name: str, model_name: str, override_run_name: str = '', **kwargs: Any) → str

Start a new mlflow run.

Parameters:

mlflow_uri (str) – MLflow URI
experiment_name (str) – Name of the experiment
model_name (str) – Name of the model
override_run_name (str) – Override the run name (if empty, it will be set automatically)
**kwargs (Any) – Additional arguments to pass to mlflow.start_run

Returns:

Name of the run (suffixed with the version number)

Return type:

str

get_best_run_by_metric(experiment_name: str, metric_name: str, model_name: str = '', ascending: bool = False, has_saved_model: bool = True) → Run | None

Get the best run by a specific metric.

Parameters:

experiment_name (str) – Name of the experiment
metric_name (str) – Name of the metric to sort by
model_name (str) – Name of the model (optional, if empty, all models are considered)
ascending (bool) – Whether to sort in ascending order (default: False, i.e. maximum metric value is best)
has_saved_model (bool) – Whether to consider only runs whose model has been saved (default: True)

Returns:

The best run or None if no runs are found

Return type:

Run | None
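
The effect of the ascending flag can be sketched with plain Python objects. RunRecord is a hypothetical stand-in for an MLflow Run, and skipping runs that lack the metric is an assumption; the has_saved_model filter is left out of this sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RunRecord:
    """Hypothetical stand-in for an MLflow Run, holding only what the sketch needs."""
    run_id: str
    metrics: dict[str, float] = field(default_factory=dict)


def best_run(
    runs: list[RunRecord], metric_name: str, ascending: bool = False
) -> Optional[RunRecord]:
    """Return the run with the best value of metric_name, or None if no run has it."""
    candidates = [run for run in runs if metric_name in run.metrics]
    if not candidates:
        return None
    # ascending=False means the largest value wins; ascending=True the smallest.
    pick = min if ascending else max
    return pick(candidates, key=lambda run: run.metrics[metric_name])


runs = [
    RunRecord("a", {"val_accuracy": 0.81, "val_loss": 0.52}),
    RunRecord("b", {"val_accuracy": 0.93, "val_loss": 0.31}),
    RunRecord("c", {}),
]
print(best_run(runs, "val_accuracy").run_id)
print(best_run(runs, "val_loss", ascending=True).run_id)
```

For metrics where lower is better, such as a loss, ascending=True is the natural choice.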

load_model(run_id: str, model_type: Literal['keras', 'pytorch'] = 'keras') → Any

Load a model from MLflow.

Parameters:

run_id (str) – ID of the run to load the model from
model_type (Literal["keras", "pytorch"]) – Type of model to load (default: “keras”)

Returns:

The loaded model

Return type:

Any
