Databricks MLflow — manage models

Michal Molka
3 min read · Feb 3, 2023


Let’s assume that you have your data prepared and you want to use MLflow to manage and use your ML model.

The first step is to set an experiment.

The empty experiment is present here:

And here:

So far, the experiment is empty. This code changes that: it creates five runs. A run is a component of an experiment. An experiment can contain many runs, and every run can have different parameters.

The picture below shows the experiment with its runs.

What has happened in the code?

Line 13: mlflow.start_run — creates a new run for the experiment. In this case the run is attached to the experiment from the first step. If we want to attach a run to another experiment we can provide an experiment id as an argument:

with mlflow.start_run(experiment_id=2277472460643381, run_name='random_forest_' + str(i), description='MLFlow_wines_experiment_description'):

Lines 22, 23: mlflow.log_param — parameters saved to a run.

Lines 24, 25: mlflow.log_metric, mlflow.log_metrics — metrics saved to a run.

Lines 26, 27: mlflow.set_experiment_tag, mlflow.set_experiment_tags — tags saved to an experiment.

Lines 30, 31: mlflow.set_tag, mlflow.set_tags — tags saved to a run.

All above (regarding runs) are present here:

Line 28: mlflow.log_text — saves the provided string as a text file artifact of the run.

Line 38: mlflow.pyfunc.log_model — creates model artifacts for a run.

Here are artifacts:

Now, we can register the model for later use.

Here are all the registered models. As you can see, we registered a [wine_quality] model. We used the run ID of a particular run, [random_forest_model4].

If we want to use the newest model version for prediction, we need to change its stage. Here, version 6 is moved to Production, and the older one (version 3) is marked as Archived.

The model is ready. Here are two examples of how to load the model into a notebook.

The MLflow Python API offers many more possibilities. Some examples are in this post: Databricks MLflow — Python API a few examples

Here is one additional feature that you may find useful: run comparison, where you select the runs you are interested in and analyze their properties side by side.

Here is the code: MLflow manage models

