Create an import job from Azure Blob Storage to a file system

Azure Managed Lustre integrates with Azure Blob Storage to simplify the process of importing data from a blob container to a file system. You can configure this integration during cluster creation, and you can create an import job any time after the cluster is created.

In this article, you learn how to use the Azure portal to create an import job that imports data from a blob container into an existing Azure Managed Lustre file system.

Note

When you import data from a blob container to an Azure Managed Lustre file system, only the file names and metadata are imported into the Lustre file system namespace.

The actual contents of a blob are imported when a client first accesses the file. When you first access the data, there's a slight delay while the Lustre Hierarchical Storage Management (HSM) feature pulls in the blob contents to the corresponding file in the file system. This delay only occurs the first time a file is accessed.

You can prefetch the contents of a blob container by using the Lustre lfs hsm_restore command from a mounted client with sudo capabilities. To learn more, see Restore data from Blob Storage.
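As a rough illustration of the prefetch approach described in this note, the following Python sketch walks a directory on a mounted client and builds one lfs hsm_restore invocation per file. The helper names (iter_files, restore_command, prefetch) and the mount path /mnt/lustre/data are assumptions made for this example and aren't part of Azure Managed Lustre or Lustre. The sketch defaults to a dry run that only returns the commands; it assumes that the lfs client tools and sudo are available when dry_run is set to False.

```python
import os
import subprocess
from typing import Iterator, List


def iter_files(root: str) -> Iterator[str]:
    """Yield the path of every file under root, including subdirectories."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)


def restore_command(path: str) -> List[str]:
    """Build the lfs hsm_restore invocation for a single file."""
    return ["sudo", "lfs", "hsm_restore", path]


def prefetch(root: str, dry_run: bool = True) -> List[List[str]]:
    """Build a restore command for each file under root.

    With dry_run=False, each command is also executed on the client.
    The list of commands is returned in both cases.
    """
    commands = [restore_command(path) for path in iter_files(root)]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # /mnt/lustre/data is a hypothetical mount path; replace it with your own.
    for cmd in prefetch("/mnt/lustre/data", dry_run=True):
        print(" ".join(cmd))
```

Running the script as shown prints the commands without executing them, which lets you review the file list before you trigger any restores.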

Prerequisites

Create an import job

To begin importing data from a blob container into an Azure Managed Lustre file system, create an import job. In this section, you learn how to create, configure, and start an import job in the Azure portal.

Note

Only one import or export job can run at a time. For example, if an import job is in progress, and you attempt to start another import or export job, you get an error.

Configure import options and start the job

To configure the import options and start the job, follow these steps:

  1. In the Azure portal, open your Azure Managed Lustre file system. Under Settings, select Blob integration.

  2. Select + Create new job.

  3. Select Import from the Job type dropdown menu.

  4. Enter a name for the import job in the Job Name field.

  5. Select a value for the Conflict resolution mode field. This setting determines how the import job handles conflicts between existing files in the file system and the new files that you import. In this example, we select Skip. To learn more, see Conflict resolution mode.

  6. Select a value for Error tolerance. This setting determines how the import job handles errors that occur during the import process. In this example, we select Allow errors. To learn more, see Error tolerance.

  7. To filter the data that you're importing from Blob Storage, enter import prefixes. The Azure portal allows you to enter up to 10 prefixes. In this example, we specify the prefixes /data and /logs. To learn more, see Import prefix.

  8. After the job is configured, select Start to begin the import process.

The following screenshot shows the import job configuration settings in the Azure portal.

Screenshot that shows the portal settings to create an import job.
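If you record job settings outside the portal, for example in a runbook, it can help to check them before you enter them. The following Python sketch models the fields from the preceding steps and enforces the limit of 10 import prefixes. The ImportJobSettings class, its field names, and the job name import-job-1 are hypothetical and aren't part of any Azure SDK; the default values mirror the selections used in this example.

```python
from dataclasses import dataclass, field
from typing import List

MAX_PREFIXES = 10  # Limit on import prefixes that the Azure portal accepts.


@dataclass
class ImportJobSettings:
    """Hypothetical record of the portal fields; not an Azure SDK type."""

    job_name: str
    conflict_resolution_mode: str = "Skip"   # Value selected in this example.
    error_tolerance: str = "Allow errors"    # Value selected in this example.
    import_prefixes: List[str] = field(default_factory=list)

    def validate(self) -> None:
        """Raise ValueError if more prefixes are listed than the portal allows."""
        if len(self.import_prefixes) > MAX_PREFIXES:
            raise ValueError(
                f"{len(self.import_prefixes)} prefixes given; "
                f"the portal allows up to {MAX_PREFIXES}"
            )


if __name__ == "__main__":
    settings = ImportJobSettings(
        job_name="import-job-1",  # Hypothetical job name.
        import_prefixes=["/data", "/logs"],
    )
    settings.validate()
    print(settings)
```

The check runs locally and doesn't contact Azure. It only catches a prefix list that the portal would reject.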

Monitor the import job

After you create the import job, you can monitor its progress to confirm that it finishes successfully. In this section, you learn how to monitor the import job in the Azure portal.

To view the job details, follow these steps:

  1. In the Azure portal, open your Azure Managed Lustre file system. Under Settings, select Blob integration.
  2. From the list of recent jobs, select the import job that you want to monitor.
  3. The Job details pane displays information about the job, including the job status, start time, blobs imported, and any errors or conflicts that occurred during the import process.

The following screenshot shows the job details for an import job in the Azure portal.

Screenshot that shows the job details for an import job.

When the job finishes, you can view the logging container to see detailed information about the import process, including any errors or conflicts that occurred. This information is available only after the job finishes.