
Introducing the Backend.AI MLOps platform, FastTrack.

Jihyun Kang · 6 min read

In this article, we introduce FastTrack, the MLOps platform of Backend.AI. With FastTrack, users can compose the steps of data preprocessing, training, validation, deployment, and inference into a single pipeline, and easily customize each step while building it. We cover the background of Backend.AI FastTrack, its distinctive features, and why MLOps platforms are needed in the first place.

Emergence of MLOps Platforms

Over the past few years, AI has been adopted not only in the IT industry but across most industries undergoing digital transformation, as organizations strive to extract meaningful predictions from scattered data and respond to rapidly changing markets. To put AI to good use, teams must now handle far more than model training and optimization: hardware procurement that accounts for data I/O, model version management, and many other operational stages. The concept that emerged from this need is MLOps (Machine Learning Operations). For more details, please refer to the MLOps series on the Lablup technology blog; if you are new to the concept, we recommend reading it before this FastTrack introduction.

History of FastTrack

In 2019, responding to demand for DevOps-style pipelines, Lablup added pipeline functionality to Backend.AI as a beta release. We developed and tested features for simplifying the creation and management of complex pipelines and for operating one-way pipelines that branch into two or more paths midway. However, as pipeline solutions such as Airflow, MLflow, and Kubeflow gained popularity alongside the rise of MLOps, we shifted direction: instead of promoting our pipeline functionality to an official feature, we chose to integrate and support these open-source pipeline tools.

Meanwhile, as AI development pipelines grew increasingly complex and it became clear that open-source MLOps pipeline tools could not meet users' diverse requirements, we decided to revive Backend.AI's pipeline functionality. During the revival and prototyping process, the plan evolved from an integrated pipeline feature into a standalone MLOps pipeline solution that runs alongside the Backend.AI cluster and can reflect users' requests immediately.

Thus, after this winding history, Lablup's AI/MLOps solution was named FastTrack, inspired by the expedited clearance lanes found in airports and logistics hubs, and its first official version is being tested with Backend.AI 22.09.

What is FastTrack?

FastTrack is a machine learning workflow platform that lets users define multiple units of work on a Backend.AI cluster and run them as a directed acyclic graph (DAG). With each step of a machine learning pipeline mapped to a session, users can combine steps such as data preprocessing, training, validation, deployment, monitoring, and optimization into a single workflow as needed. In other words, sessions that users previously had to create one by one on a Backend.AI cluster are organized into a workflow, and FastTrack automatically schedules the next step as each one finishes, making it easier to build and reuse models.
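The "run as a DAG, schedule the next step when the previous one finishes" idea can be sketched in a few lines. This is not FastTrack's actual scheduler; the task names and graph below are hypothetical, chosen only to illustrate how a DAG yields an execution order:

```python
from collections import deque

def schedule(dag):
    """Return an execution order for a workflow given as {task: [upstream tasks]}.

    A task becomes runnable only once all of its upstream tasks have finished,
    which is exactly the DAG scheduling behavior described above.
    """
    # Number of unfinished upstream dependencies per task.
    pending = {task: len(deps) for task, deps in dag.items()}
    # Reverse edges: which tasks are unblocked when this one finishes.
    downstream = {task: [] for task in dag}
    for task, deps in dag.items():
        for dep in deps:
            downstream[dep].append(task)

    ready = deque(task for task, n in pending.items() if n == 0)
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)  # in FastTrack, this is roughly where a session would launch
        for nxt in downstream[task]:
            pending[nxt] -= 1
            if pending[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(dag):
        raise ValueError("cycle detected: the workflow is not a DAG")
    return order

# A hypothetical four-step pipeline.
pipeline = {
    "preprocess": [],
    "train": ["preprocess"],
    "validate": ["train"],
    "deploy": ["validate"],
}
print(schedule(pipeline))  # → ['preprocess', 'train', 'validate', 'deploy']
```

The acyclicity check matters: a cycle would mean some task waits on itself, which is why workflow platforms require the graph to be a DAG in the first place.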

FastTrack Structure and Features

In FastTrack, workflow templates are called pipelines, and executed workflows are called pipeline jobs. Likewise, the units of work within a workflow are called tasks, and their executed counterparts are called task instances. The following diagram shows how each step progresses in FastTrack.

Pipeline

A pipeline is a collection of tasks and the relationships between them, structured as a directed acyclic graph (DAG). To create an AI workflow, you create a pipeline; FastTrack then automatically generates a dedicated folder for it in the Backend.AI cluster, so you can verify through its artifacts whether training is progressing well. FastTrack also lets users modify the relationships between tasks with a drag-and-drop interface and immediately see the changes reflected both as a diagram and as a YAML file. Because pipelines are managed as YAML files, they are easy to export, import, and share among users.
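To make the YAML representation concrete, a pipeline of this shape might look roughly as follows. The exact schema FastTrack uses is not shown in this article, so every field name below is an assumption for illustration only:

```yaml
# Hypothetical pipeline definition -- field names are assumed, not FastTrack's schema.
name: image-classifier
tasks:
  - name: preprocess
    image: python:3.x          # any image supported by the Backend.AI cluster
    command: python preprocess.py
  - name: train
    image: ngc-pytorch
    depends_on: [preprocess]   # edges of the DAG
    command: python train.py
  - name: validate
    image: ngc-pytorch
    depends_on: [train]
    command: python validate.py
```

Because the whole graph lives in one plain-text file like this, exporting, importing, and sharing a pipeline reduces to copying a file.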

Pipeline Job

A pipeline job is the actual object created from a pipeline's definition, and it cannot be modified while running. In the FastTrack GUI, the color of each node indicates whether the corresponding unit of work is executing. As with pipelines, the information and relationships of the constituent task instances are managed in YAML format. Once all task instances have finished, the pipeline job's status is reported as either success or failure.
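One way a pipeline job's final status could be derived from its task instances is an aggregation like the one below. The state names are assumptions for illustration; FastTrack's actual state model may differ:

```python
def pipeline_job_status(task_states):
    """Aggregate hypothetical task-instance states into one pipeline-job status.

    Any failed instance fails the job; the job succeeds only when every
    instance has succeeded; otherwise it is still in progress.
    """
    if any(s == "FAILED" for s in task_states):
        return "FAILURE"
    if all(s == "SUCCEEDED" for s in task_states):
        return "SUCCESS"
    return "RUNNING"

print(pipeline_job_status(["SUCCEEDED", "SUCCEEDED"]))  # → SUCCESS
print(pipeline_job_status(["SUCCEEDED", "FAILED"]))     # → FAILURE
```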

Task

A task is the smallest unit of execution in a pipeline, and resources can be allocated to each task according to its purpose. For example, a task dedicated to model training can be given more GPU resources than a preprocessing task, so resources are used efficiently. The execution environment can also be specified per task: images supported by the Backend.AI cluster, such as TensorFlow, PyTorch, Python 3.x, NGC TensorFlow, and NGC PyTorch, can be used as-is, with no Docker build step. Virtual folders created in the Backend.AI cluster can likewise be mounted into each task as needed.

Task Instance

A task instance is the actual object created from a task's definition when a pipeline job is created. In other words, running an AI workflow means executing the task instances of a pipeline job in the order given by their specified predecessor/successor relationships. Currently, task instances correspond one-to-one with sessions on the Backend.AI cluster, and a session's state is synchronized with its task instance's state; in the future, this will be extended to execution units other than sessions.

Conclusion

We have introduced FastTrack, the Backend.AI MLOps platform, and discussed MLOps more broadly. The latest version of FastTrack is currently 22.09, and we plan to add various convenience features such as pipeline versioning, pipeline dependencies, task resource optimization, and GitHub-based model/data store support. In line with Lablup's motto, "Make AI Accessible, anyone, anytime, anywhere", we will keep making it easier for everyone to build automated models with FastTrack. We appreciate your continued interest in our future endeavors.