Awesome MLOps Tools

Kelvin Salton do Prado
May 27, 2020

With the hype around data science and AI, companies face a new challenge: building, deploying, and managing machine learning models in production. As a result, many new tools and a new technology stack have emerged that software engineers may not be familiar with.

In this scenario, the role of the machine learning engineer (MLE) emerges: someone responsible for providing tools that facilitate the deployment and management of machine learning models in production.

But even machine learning engineers find it hard to keep up with the new tools that appear every single day. So I decided to create an awesome list of machine learning engineering tools to help you stay up to date with the latest and most useful tools on the market.

Let’s see some of them:

Data Exploration

Tools for performing data exploration.

  • Google Colab: Hosted Jupyter notebook service that requires no setup.
  • Jupyter Notebook: Web-based environment for interactive computing.
  • JupyterLab: The next-generation user interface for Project Jupyter.

Data Version Control

Tools for performing data version control.

  • DVC: Management and versioning of datasets and ML models.
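
To make the idea of data versioning concrete, here is a minimal sketch in plain Python. It assumes the simplest possible approach: a dataset version is identified by a hash of the file's contents, so any edit produces a new identifier and reverting the edit restores the old one. The names `fingerprint` and `demo_fingerprints`, and the file `train.csv`, are hypothetical; this illustrates the concept only and is not how DVC itself is implemented or used.

```python
import hashlib
import tempfile
from pathlib import Path


def fingerprint(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return a SHA-256 hex digest of the file's contents, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def demo_fingerprints() -> tuple:
    """Fingerprint a toy dataset before an edit, after it, and after reverting."""
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "train.csv"  # hypothetical dataset file
        data.write_text("x,y\n1,2\n")
        original = fingerprint(data)
        data.write_text("x,y\n1,2\n3,4\n")  # append a row
        edited = fingerprint(data)
        data.write_text("x,y\n1,2\n")  # revert the edit
        reverted = fingerprint(data)
    return original, edited, reverted
```

Calling `demo_fingerprints()` returns three digests: the first and third match, and the second differs. A real tool adds storage, remotes, and integration with Git on top of this kind of content addressing.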

Data Visualization

Tools for data visualization, reports and dashboards.

  • Metabase: The simplest, fastest way to get BI and analytics to everyone.
  • Redash: Connect to any data source, easily visualize and share data.
  • Superset: Modern, enterprise-ready BI web application.
  • Tableau: Powerful, fast-growing data visualization tool.

Feature Store

Feature store tools for data serving.

  • Feast: End-to-end open source feature store for machine learning.

Hyperparameter Tuning

Tools and libraries to perform hyperparameter tuning.

  • Katib: Kubernetes-based system for Hyperparameter Tuning.
  • Tune: Library for experiment execution and hyperparameter tuning.
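
To show what these tools automate, here is a minimal random search sketch using only the Python standard library. The `objective` function is a made-up stand-in for a validation loss, and the parameter names and ranges are assumptions chosen for illustration. Neither Katib nor Tune is used here; they add scheduling, parallel trials, and smarter search strategies on top of a loop like this one.

```python
import random


def objective(learning_rate: float, depth: int) -> float:
    """Toy stand-in for a validation loss (lower is better)."""
    return (learning_rate - 0.1) ** 2 + 0.01 * (depth - 6) ** 2


def random_search(n_trials: int, seed: int = 0) -> tuple:
    """Sample random configurations and return (best_params, best_loss)."""
    rng = random.Random(seed)  # seeded so the search is reproducible
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = {
            "learning_rate": rng.uniform(0.001, 0.5),
            "depth": rng.randint(2, 12),
        }
        loss = objective(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss
```

For example, `random_search(200)` returns the best configuration found in 200 trials. Because the generator is seeded, running it again with the same arguments gives the same result.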

Knowledge Sharing

Tools for sharing knowledge with the entire team/company.

  • Knowledge Repo: Knowledge sharing platform for data scientists.
  • Kyso: One place for data insights so your entire team can learn from it.

Machine Learning Platform

Complete machine learning platform solutions.

  • CNVRG: End-to-end ML platform to build and deploy AI models at scale.
  • DataRobot: AI platform that democratizes DS and automates ML.
  • Domino: One place for your DS tools, results, models, and knowledge.
  • H2O: Open-source leader in AI with a mission to democratize AI.
  • Hopsworks: Open-source platform for operating ML models.
  • Iguazio: Data science platform that automates MLOps with pipelines.
  • Kubeflow: Makes deployments of ML workflows on Kubernetes simple, portable, and scalable.
  • Pachyderm: Combines data lineage with end-to-end pipelines on k8s.
  • SageMaker: Fully managed service to build, train, and deploy models.

Model Lifecycle

Tools for managing model lifecycle (experiments, parameters and metrics).

  • Comet: Track your datasets, code changes, experiments and models.
  • MLflow: Open source platform for the machine learning lifecycle.
  • Neptune AI: The most lightweight experiment management tool.
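
As a rough picture of what "tracking experiments, parameters and metrics" means, the sketch below logs each run as a small JSON file and then looks up the best one. Everything in it is an assumption made for illustration: the helper names `log_run`, `best_run`, and `demo_tracking`, the run names, and the metric values are invented. It does not use or imitate the API of Comet, MLflow, or Neptune.

```python
import json
import tempfile
from pathlib import Path


def log_run(store: Path, name: str, params: dict, metrics: dict) -> Path:
    """Write one experiment run as a JSON file and return its path."""
    store.mkdir(parents=True, exist_ok=True)
    run_file = store / f"{name}.json"
    record = {"name": name, "params": params, "metrics": metrics}
    run_file.write_text(json.dumps(record))
    return run_file


def best_run(store: Path, metric: str) -> dict:
    """Return the logged run with the highest value for `metric`."""
    runs = [json.loads(p.read_text()) for p in sorted(store.glob("*.json"))]
    return max(runs, key=lambda run: run["metrics"][metric])


def demo_tracking() -> dict:
    """Log two made-up runs and return the one with the best accuracy."""
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp) / "runs"
        log_run(store, "run-a", {"lr": 0.1}, {"accuracy": 0.81})
        log_run(store, "run-b", {"lr": 0.01}, {"accuracy": 0.86})
        return best_run(store, "accuracy")
```

Dedicated tools go much further (UIs, artifact storage, model registries), but the core record is the same: which parameters produced which metrics.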

Model Serving

Tools for serving models in production.

  • BentoML: Open-source platform for high-performance model serving.
  • Cortex: Machine learning model serving infrastructure.
  • KFServing: K8S custom resource definition for serving ML models.
  • PredictionIO: Deployment and querying predictive results via APIs.
  • Seldon: Take your ML projects from POC to production easily.
  • TensorFlow Serving: Flexible, high-performance serving system.

Optimization Tools

Optimization tools related to model scalability in production.

  • Dask: Provides advanced parallelism enabling performance at scale.
  • Mahout: Distributed linear algebra framework.
  • MLlib: Apache Spark’s scalable machine learning library.
  • Modin: Speed up your Pandas workflows by changing a single line.
  • Ray: Fast and simple framework for building distributed applications.
  • Singa: Distributed training of deep learning and ML models.

Workflow Tools

Tools and frameworks to create workflows or pipelines.

  • Kedro: Library that uses best practices for data and ML pipelines.
  • Metaflow: Helps scientists build and manage real-life DS projects.
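
The core idea behind a workflow tool is to describe steps and their dependencies, then let the framework run them in a valid order. Here is a minimal sketch of that idea using `graphlib` from the Python standard library (Python 3.9+). The step names and the `run_pipeline` and `demo_pipeline` helpers are hypothetical, and this is not how Kedro or Metaflow pipelines are written.

```python
from graphlib import TopologicalSorter


def run_pipeline(steps: dict, deps: dict) -> dict:
    """Run each step after its dependencies and return every step's output.

    `steps` maps a step name to a function; `deps` maps a step name to the
    list of steps whose outputs it receives as arguments, in order.
    """
    results = {}
    for name in TopologicalSorter(deps).static_order():
        inputs = [results[dep] for dep in deps.get(name, [])]
        results[name] = steps[name](*inputs)
    return results


def demo_pipeline() -> dict:
    """Wire up a toy load -> clean -> features -> train pipeline."""
    steps = {
        "load": lambda: [1, 2, 3, None],
        "clean": lambda rows: [r for r in rows if r is not None],
        "features": lambda rows: [r * 2 for r in rows],
        "train": lambda feats: sum(feats) / len(feats),  # toy "model": the mean
    }
    deps = {"clean": ["load"], "features": ["clean"], "train": ["features"]}
    return run_pipeline(steps, deps)
```

Calling `demo_pipeline()` runs the four steps in dependency order and returns a dictionary with each step's output. Real workflow frameworks add data catalogs, retries, versioning, and remote execution around the same dependency graph.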

To check out the complete and updated list, visit the repository on GitHub:

If you liked it, please give it a clap 👏 and a star ⭐️ on GitHub.

Thank you all ❤️
