Integrate Azure Databricks MLflow

How to use Azure Databricks Managed MLflow in Nussknacker Cloud

This blogpost is a companion to “How to train and register ML models in Azure Databricks”. Here, we show you how to integrate your Nussknacker Cloud account with the ML model trained and registered as described in the companion blogpost, and then use it in a credit card fraud detection scenario. Alternatively, you can follow along with your own ML model.

---------------------------

MLflow is the open-source, industry-standard solution for ML model versioning, tracking, and lifecycle management. Nussknacker Cloud already provides excellent integration with models registered in MLflow, as described in the blogposts: Real-Time ML-Driven Recommendations and ML models inference in fraud detection.

In this blogpost, we explore how to integrate your Nussknacker Cloud account with Azure Databricks, a popular platform for collaborative data science and machine learning. We demonstrate how Nussknacker simplifies inference with models registered in Azure Databricks managed MLflow. We assume you already have your ML model trained and registered in Azure Databricks managed MLflow, and you want to use it for inference in your Nussknacker scenarios.

As an example ML model, we use a simple decision tree classifier trained on a Kaggle dataset of credit card fraud data. The model was registered in Azure Databricks as “credit-card-fraud-classifier” (version 1).

You can explore the full training and registration process of this model in the post “How to train and register ML models in Azure Databricks”.

Exposing ML models registered in Azure Databricks

The first step is to expose the Azure Databricks MLflow Model Registry externally, enabling Nussknacker to communicate with it and access its registered ML models.

Note the Azure Databricks Service URL

In the Overview section of your Azure Databricks Service, note down the provided URL. It will be needed later when configuring Nussknacker integration.

Set Up Service Principals and Secrets

- Inside your Azure Databricks workspace, click on your user icon in the top-right corner and select "Settings".

- Navigate to "Identity and access".

- Click the "Manage" button in the "Service principals" section.

- Click "Add Service Principal" and "Add new" in the modal window.

- Choose "Databricks Managed", provide a name (e.g., "nu-cloud"), and click "Add".

- Click on the newly created service principal's name.

- Pick the "Allow unrestricted cluster creation" option and click "Update".

- Navigate to the "Secrets" tab and click "Generate New Secret". Specify the desired lifetime for the new secret and then click "Generate".

- Copy the generated "Client ID" and "Secret" values and store them securely.

Grant View Permission to your workspace

- Navigate to "Workspace" in the left-hand menu and select the "Workspace" icon in the "Workspace" view.

- Click the Share button in the top-left corner of the page.

- In the "Type to add multiple users, groups or service principal" field, enter the service principal name (e.g., "nu-cloud") that you created earlier.

- Set the permission level to "Can View", then click the "Add" button.

- Close the sharing modal to finalize the changes.

Grant Read Permission to the registered ML model

- In the left-hand menu inside your Azure Databricks workspace, navigate to "Models" under the "Machine Learning" section and select your ML model.

- Click the "Permissions" button in the top-right corner of the model's page.

- In the "Select user, group, and service principal..." field, enter the service principal name (e.g., "nu-cloud") that you created earlier.

- Set the permission level to "Can Read" and then click "Add".

- Save the changes by clicking the "Save" button.

The Azure Databricks MLflow model registry and the registered ML model are now accessible for read operations from outside the Azure Databricks platform.
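If you want to confirm the permissions from outside the platform before moving on, a lookup of the registered model through the MLflow REST API is one option. The sketch below only builds the request; it assumes the model lives in the workspace model registry, that the `/api/2.0/mlflow/registered-models/get` endpoint is reachable on your workspace URL, and that you already hold an OAuth access token for the service principal. The workspace URL and token shown are placeholders, and `build_model_lookup_request` is a hypothetical helper name.

```python
import urllib.parse
import urllib.request


def build_model_lookup_request(server_url, model_name, access_token):
    """Build (but do not send) a GET request for a registered model's metadata."""
    query = urllib.parse.urlencode({"name": model_name})
    # Assumption: workspace model registry exposed via the MLflow REST API.
    url = f"{server_url.rstrip('/')}/api/2.0/mlflow/registered-models/get?{query}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


if __name__ == "__main__":
    # Placeholder values: use your own Service URL and a real access token.
    request = build_model_lookup_request(
        "https://adb-1234567890123456.7.azuredatabricks.net",
        "credit-card-fraud-classifier",
        "<access-token>",
    )
    print(request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

A successful response would suggest that the "Can Read" permission is in effect; a permission error would point back to the sharing steps above.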

Configuring Azure Databricks MLflow Integration in Nussknacker

The next step is to configure Nussknacker to access the Azure Databricks managed MLflow and provide components for running inference with the registered ML models.

Open the Integrations tab in your Nu Cloud account, locate the “MLflow” card, and click the “Add New” button.

Fill out the integration form with the following information:

- Name: enter the identifier for the ML component that will be used within Nussknacker to refer to your model.

- Server URL: this is the URL from the "Overview" section of your Azure Databricks Service.

- Token URL: append `/oidc/v1/token` to the Server URL to form the complete Token URL used for authentication.

- Client ID: this is the "Client ID" saved when we created the service principal in Azure Databricks.

- Client Secret: this is the "Secret" saved when we created the service principal in Azure Databricks.

- Scope: set this to "all-apis".

Make sure that Nussknacker can successfully connect to the Azure Databricks managed MLflow using the provided configuration by clicking the “Check Connection” button. If the connection is successful, apply the configuration by clicking the “Submit” button. The MLflow component that provides access to ML models registered in the configured Azure Databricks managed MLflow is now available for use in your Nussknacker scenarios, simplifying ML model integration into your data processing flows.
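The form fields map onto a standard OAuth client-credentials request against the Databricks token endpoint. The sketch below shows how those values fit together; it is not Nussknacker's actual implementation. It builds the request without sending it, and `build_token_request` and the placeholder credentials are hypothetical.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(server_url, client_id, client_secret, scope="all-apis"):
    """Build (but do not send) an OAuth client-credentials token request."""
    # Token URL = Server URL + /oidc/v1/token, as in the integration form.
    token_url = f"{server_url.rstrip('/')}/oidc/v1/token"
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": scope}
    ).encode("ascii")
    # Client ID and secret travel as HTTP Basic credentials.
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    basic = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values: substitute your Service URL, Client ID and Secret.
    request = build_token_request(
        "https://adb-1234567890123456.7.azuredatabricks.net",
        "<client-id>",
        "<client-secret>",
    )
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

If the “Check Connection” button reports a failure, sending this request manually can help tell apart a wrong URL from wrong credentials.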

Using Azure Databricks MLflow Integration

The configured integration with Azure Databricks managed MLflow is available as a dedicated component located under the Enrichers section in the Creator Panel of a Nussknacker Cloud scenario. The component’s name consists of the uppercased “Name” provided during the Integration configuration, followed by the “ML Model” suffix.

After dragging the ML component onto the scenario creation panel and double-clicking it, you will be presented with a configuration form in which you can select the desired ML model (out of the models that you registered in Azure Databricks MLflow) by specifying its name and version. Our example model was registered as “credit-card-fraud-classifier” with version 1, so this is the model we select.

Once you select the model, the configuration form is populated with input fields for the model's input parameters. Each field is labeled with the name of the model’s input parameter and its expected data type. This is where you configure the bindings for your ML model's input parameters.

For our example, we created a simple credit card fraud detection demo scenario in which the ML model is directly connected to the Kafka source, providing transaction events with fields mapping directly to the model’s input parameters.

In more realistic scenarios, these events could be part of much more complex pipelines before reaching the ML model.

The output of the ML model is bound to a variable name specified in the “Output variable name” field.

This variable is used to access the ML model’s outputs by their names, as specified in the model’s signature during its registration in Azure Databricks.

In our demo scenario, the output from the ML model, i.e., the decision whether a given transaction is fraudulent or not, is used to forward the transaction event to the appropriate Kafka Sink topic, which contains either accepted or blocked transactions.

This event forwarding is performed by the Filter component, which uses the ML model’s credit card fraud prediction obtained from the model’s named output parameter.
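The routing decision can be illustrated in plain Python. This is only a sketch of the logic, not how the Filter component is configured in Nussknacker. The output name `is_fraud` and both topic names are hypothetical; your model's signature and your Kafka setup determine the real ones.

```python
# Hypothetical topic names, for illustration only.
ACCEPTED_TOPIC = "transactions.accepted"
BLOCKED_TOPIC = "transactions.blocked"


def route_transaction(classifier_output, output_name="is_fraud"):
    """Pick the sink topic from the model's named output.

    `classifier_output` stands in for the variable configured in
    "Output variable name"; `output_name` is an assumed output name.
    """
    is_fraud = bool(classifier_output[output_name])
    return BLOCKED_TOPIC if is_fraud else ACCEPTED_TOPIC


if __name__ == "__main__":
    print(route_transaction({"is_fraud": 1}))
    print(route_transaction({"is_fraud": 0}))
```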

Final thoughts

Nussknacker Cloud makes it easy to integrate with ML models registered in Azure Databricks managed MLflow. Once the service principal and permissions are in place, the setup comes down to filling in a single integration form, and the registered ML models appear as regular enricher components, ready to be used for inference anywhere in your Nussknacker scenarios.
