azure_data_lake_gen2

Sends message parts as files to an Azure Data Lake Gen2 filesystem. Each file is uploaded with the filename specified by the path field.

Introduced in version 4.38.0.

# Config fields, showing default values
output:
  label: ""
  azure_data_lake_gen2:
    storage_account: ""
    storage_access_key: ""
    storage_connection_string: ""
    storage_sas_token: ""
    filesystem: messages-${!timestamp("2006")} # No default (required)
    path: ${!counter()}-${!timestamp_unix_nano()}.txt
    max_in_flight: 64

To give each file a different path, use function interpolations in the path field. These are evaluated for each message of a batch.

This output supports several authentication methods, but only one of the following is required:

  • storage_connection_string
  • storage_account and storage_access_key
  • storage_account and storage_sas_token
  • storage_account to access via DefaultAzureCredential

If more than one is set, storage_connection_string takes priority.

If the storage_connection_string does not contain the AccountName parameter, please specify it in the storage_account field.
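As a minimal sketch of two of the authentication options listed above, the fragment below shows a connection-string config and a storage account plus SAS token config. The two YAML documents are alternatives, not one combined config. The environment variable names, the account name examplestorageaccount, and the filesystem name messages are hypothetical placeholders, and the sketch assumes secrets are supplied through environment variable interpolation rather than written into the file.

```yaml
# Option 1: authenticate with a connection string.
# AZURE_STORAGE_CONNECTION_STRING is an assumed environment variable name.
output:
  azure_data_lake_gen2:
    storage_connection_string: ${AZURE_STORAGE_CONNECTION_STRING}
    filesystem: messages # hypothetical filesystem name
    path: ${!counter()}-${!timestamp_unix_nano()}.txt
---
# Option 2: authenticate with a storage account and a SAS token.
# The account name and AZURE_STORAGE_SAS_TOKEN are assumed placeholders.
output:
  azure_data_lake_gen2:
    storage_account: examplestorageaccount
    storage_sas_token: ${AZURE_STORAGE_SAS_TOKEN}
    filesystem: messages # hypothetical filesystem name
    path: ${!counter()}-${!timestamp_unix_nano()}.txt
```

Setting only one method per config avoids relying on the priority rule described above.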

Performance

This output performs better when multiple messages are sent in parallel. You can tune the maximum number of in-flight messages (or message batches) with the max_in_flight field.
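For illustration, a config that raises the limit above the default of 64 might look like the sketch below. The value 128 is arbitrary rather than a recommendation, and the account and filesystem names are hypothetical; a suitable value depends on the workload and is best found by measurement.

```yaml
output:
  azure_data_lake_gen2:
    storage_account: examplestorageaccount # hypothetical account name
    filesystem: messages                   # hypothetical filesystem name
    path: ${!counter()}-${!timestamp_unix_nano()}.txt
    max_in_flight: 128 # arbitrary example value; the default is 64
```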

Fields

storage_account

The storage account to access. This field is ignored if storage_connection_string is set.

Type: string

Default: ""

storage_access_key

The storage account access key. This field is ignored if storage_connection_string is set.

Type: string

Default: ""

storage_connection_string

A storage account connection string. This field is required if storage_account and storage_access_key / storage_sas_token are not set.

Type: string

Default: ""

storage_sas_token

The storage account SAS token. This field is ignored if storage_connection_string or storage_access_key is set.

Type: string

Default: ""

filesystem

The name of the Data Lake Storage filesystem to upload messages to. This field supports interpolation functions.

Type: string

# Examples
filesystem: messages-${!timestamp("2006")}

path

The path of each message to upload within the filesystem. This field supports interpolation functions.

Type: string

Default: "${!counter()}-${!timestamp_unix_nano()}.txt"

# Examples
path: ${!counter()}-${!timestamp_unix_nano()}.json
path: ${!meta("kafka_key")}.json
path: ${!json("doc.namespace")}/${!json("doc.id")}.json
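Because the path is evaluated per message, it can also be used to group uploads into date-based directories. The sketch below is one possible layout and not a prescribed convention: it assumes the Bloblang functions now() and uuid_v4() and the ts_format method are available in your version, and the account and filesystem names are hypothetical.

```yaml
output:
  azure_data_lake_gen2:
    storage_account: examplestorageaccount # hypothetical account name
    filesystem: messages                   # hypothetical filesystem name
    # Yields paths of the form <year>/<month>/<day>/<random UUID>.json,
    # using the time at which each message is written.
    path: '${!now().ts_format("2006/01/02")}/${!uuid_v4()}.json'
```

A random UUID gives each file a unique name without depending on a counter, which resets when the process restarts.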

max_in_flight

The maximum number of messages to have in flight at a given time. Increase this to improve throughput.

Type: int

Default: 64