Skip to content

sqlite

Stores messages in an SQLite database and acknowledges them at the input level.

# Config fields, showing default values
buffer:
sqlite:
path: "" # No default (required)
pre_processors: [] # No default (optional)
post_processors: [] # No default (optional)

Stored messages are then consumed as a stream from the database and deleted only once they are successfully sent at the output level. If the service is restarted Redpanda Connect will make a best attempt to finish delivering messages that are already read from the database, and when it starts again it will consume from the oldest message that has not yet been delivered.

Delivery guarantees

Messages are not acknowledged at the input level until they have been added to the SQLite database, and they are not removed from the SQLite database until they have been successfully delivered. This means at-least-once delivery guarantees are preserved in cases where the service is shut down unexpectedly. However, since this process relies on interaction with the disk (wherever the SQLite DB is stored) these delivery guarantees are not resilient to disk corruption or loss.

Batching

Messages that are logically batched at the point where they are added to the buffer will continue to be associated with that batch when they are consumed. This buffer is also more efficient when storing messages within batches, and therefore it is recommended to use batching at the input level in high-throughput use cases even if they are not required for processing.

Fields

path

The path of the database file, which will be created if it does not already exist.

Type: string

pre_processors

An optional list of processors to apply to messages before they are stored within the buffer. These processors are useful for compressing, archiving or otherwise reducing the data in size before it’s stored on disk.

Type: array

post_processors

An optional list of processors to apply to messages after they are consumed from the buffer. These processors are useful for undoing any compression, archiving, etc that may have been done by your pre_processors.

Type: array

Examples

Batching at the input level greatly increases the throughput of this buffer. If logical batches aren’t needed for processing add a split processor to the post_processors.

yaml
input:
batched:
child:
sql_select:
driver: postgres
dsn: postgres://foouser:foopass@localhost:5432/testdb?sslmode=disable
table: footable
columns: [ '*' ]
policy:
count: 100
period: 500ms
buffer:
sqlite:
path: ./foo.db
post_processors:
- split: {}