MLflow is an open-source framework that streamlines machine learning development: tracking experiments, packaging code into reproducible runs, packaging models for deployment, and managing models in a central repository. Specifically, MLflow empowers Data Scientists and Machine Learning Engineers to develop machine learning models effectively through four components: MLflow Tracking, MLflow Projects, MLflow Models, and MLflow Model Registry.
I have used MLflow successfully for the last two years across two different projects, and from that experience I can share the following:
Note: MLflow Models is a very helpful component; however, I must share that I ran into some issues serializing models from MLlib (Apache Spark’s scalable machine learning library) when using MLflow versions earlier than 2.0.
Finally, to illustrate how easy it is to use, below I share a code snippet that uses the MLflow Tracking and MLflow Models components:
import mlflow

# `model` is assumed to be an already-trained Keras model, and the
# artifact files below are assumed to exist in the working directory.

# Start an MLflow run to track the experiment
with mlflow.start_run():
    # Log parameters
    mlflow.log_param("learning_rate", 0.001)
    mlflow.log_param("batch_size", 32)

    # Log metrics
    mlflow.log_metric("accuracy", 0.85)
    mlflow.log_metric("loss", 0.42)

    # Log artifacts (e.g., model weights, visualizations)
    mlflow.log_artifact("model_weights.h5")
    mlflow.log_artifact("visualization.png")

    # Save the trained model in the Keras "flavor"
    mlflow.keras.log_model(model, "model")