git

A Git input that clones (or pulls) a repository and reads the repository contents.

Introduced in version 4.51.0.

# Config fields, showing default values
input:
  label: ""
  git:
    repository_url: https://github.com/username/repo.git # No default (required)
    branch: main
    poll_interval: 10s
    include_patterns: []
    exclude_patterns: []
    max_file_size: 10485760
    checkpoint_cache: "" # No default (optional)
    checkpoint_key: git_last_commit
    auth:
      basic:
        username: ""
        password: ""
      ssh_key:
        private_key_path: ""
        private_key: ""
        passphrase: ""
      token:
        value: ""
    auto_replay_nacks: true

The git input clones the specified repository (or pulls updates if it is already cloned) and reads the contents of its files, filtered by include_patterns, exclude_patterns, and max_file_size. It then polls the repository at every poll_interval for new commits and emits a message when changes are detected.
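As a starting point, the behaviour described above can be tried with a small config such as the following sketch. The repository URL is the placeholder used throughout this page, the 30s interval is an arbitrary choice, and the stdout output is an assumption made only so the example is complete.

```yaml
# Minimal sketch: poll a repository and print what the input emits.
input:
  git:
    repository_url: https://github.com/username/repo.git # placeholder URL
    branch: main
    poll_interval: 30s # arbitrary; the default is 10s

# Assumption: stdout is used here purely for inspection.
output:
  stdout: {}
```

All other fields are left at the defaults listed in the config block above.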

Metadata

This input adds the following metadata fields to each message:

  • git_file_path
  • git_file_size
  • git_file_mode
  • git_file_modified
  • git_commit
  • git_mime_type
  • git_is_binary
  • git_encoding (present if the file was base64 encoded)
  • git_deleted (only present if the file was deleted)

You can access these metadata fields using function interpolation.
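For illustration, the sketch below uses function interpolation inside a log processor to print a few of these fields for each message. The choice of a log processor and the message wording are assumptions; only the metadata keys come from the list above.

```yaml
pipeline:
  processors:
    - log:
        level: INFO
        # Each ${! ... } expression is resolved per message.
        message: 'read ${! metadata("git_file_path") } (${! metadata("git_file_size") } bytes) at commit ${! metadata("git_commit") }'
```

Fields that are only sometimes present, such as git_deleted and git_encoding, resolve to null when they are absent, so downstream logic should account for that.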

Fields

repository_url

The URL of the Git repository to clone.

Type: string

# Examples
repository_url: https://github.com/username/repo.git

branch

The branch to check out.

Type: string

Default: "main"

poll_interval

Duration between polling attempts

Type: string

Default: "10s"

# Examples
poll_interval: 10s

include_patterns

A list of file patterns to include (e.g., '**/*.md', 'configs/*.yaml'). If empty, all files will be included. Supports glob patterns: *, /**/, ?, and character ranges [a-z]. Any character with a special meaning can be escaped with a backslash.

Type: array

Default: []

exclude_patterns

A list of file patterns to exclude (e.g., '.git/**', '**/*.png'). These patterns take precedence over include_patterns. Supports glob patterns: *, /**/, ?, and character ranges [a-z]. Any character with a special meaning can be escaped with a backslash.

Type: array

Default: []
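A sketch of how include_patterns and exclude_patterns might be combined is shown below. The specific patterns are illustrative assumptions, chosen to select Markdown and YAML files while skipping images; because exclude patterns take precedence, a PNG file under configs/ would still be skipped.

```yaml
input:
  git:
    repository_url: https://github.com/username/repo.git # placeholder URL
    include_patterns:
      - '**/*.md'        # Markdown files at any depth
      - 'configs/*.yaml' # YAML files directly under configs/
    exclude_patterns:
      - '.git/**'        # repository internals
      - '**/*.png'       # images, even if an include pattern matched
```

Quoting each pattern avoids YAML treating a leading * as an alias marker.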

max_file_size

The maximum size, in bytes, of files to include. Files larger than this are skipped. Set to 0 for no limit. The default of 10485760 bytes is 10 MiB (10 × 1024 × 1024).

Type: int

Default: 10485760

checkpoint_cache

A cache resource to store the last processed commit hash, allowing the input to resume from where it left off after a restart.

Type: string

checkpoint_key

The key to use when storing the last processed commit hash in the cache.

Type: string

Default: "git_last_commit"
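The two checkpoint fields work together with a cache resource defined elsewhere in the config. The sketch below assumes a file-based cache; the label git_checkpoints and the directory path are hypothetical names chosen for this example.

```yaml
input:
  git:
    repository_url: https://github.com/username/repo.git # placeholder URL
    checkpoint_cache: git_checkpoints # must match the cache label below
    checkpoint_key: git_last_commit   # the default key, shown explicitly

cache_resources:
  - label: git_checkpoints # hypothetical label
    file:
      directory: /var/lib/connect/git_checkpoints # hypothetical path
```

A cache that persists across restarts is the natural choice here, since an in-memory cache would lose the stored commit hash when the process stops.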

auth

Authentication options for the Git repository

Type: object

auth.basic

Basic authentication credentials

Type: object

auth.basic.username

Username for basic authentication

Type: string

Default: ""

auth.basic.password

Password for basic authentication

Type: string

Default: ""

auth.ssh_key

SSH key authentication

Type: object

auth.ssh_key.private_key_path

Path to SSH private key file

Type: string

Default: ""

auth.ssh_key.private_key

SSH private key content

Type: string

Default: ""

auth.ssh_key.passphrase

Passphrase for the SSH private key

Type: string

Default: ""

auth.token

Token-based authentication

Type: object

auth.token.value

Token value for token-based authentication

Type: string

Default: ""
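As an example of the auth block, the sketch below configures token-based authentication and reads the token from an environment variable rather than writing it into the file. The variable name GIT_TOKEN is an assumption for illustration.

```yaml
input:
  git:
    repository_url: https://github.com/username/repo.git # placeholder URL
    auth:
      token:
        value: ${GIT_TOKEN} # hypothetical environment variable
```

The same environment-variable approach can be applied to auth.basic.password and auth.ssh_key.passphrase to keep secrets out of version-controlled configs.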

auto_replay_nacks

Whether messages that are rejected (nacked) at the output level should be automatically replayed indefinitely, eventually resulting in back pressure if the cause of the rejections is persistent. If set to false, these messages will instead be deleted. Disabling auto replays can greatly improve the memory efficiency of high-throughput streams, as the original shape of the data can be discarded immediately upon consumption and mutation.

Type: bool

Default: true