openai_embeddings

Generates vector embeddings to represent input text, using the OpenAI API.

Introduced in version 4.32.0.

```yaml
# Config fields, showing default values
label: ""
openai_embeddings:
  server_address: https://api.openai.com/v1
  api_key: "" # No default (required)
  model: text-embedding-3-large # No default (required)
  text_mapping: "" # No default (optional)
  dimensions: 0 # No default (optional)
```

This processor sends text strings to the OpenAI API, which generates vector embeddings. By default, the processor submits the entire payload of each message as a string. To submit only part of the payload, use the `text_mapping` configuration field to select the text you want to embed.

To learn more about vector embeddings, see the OpenAI API documentation.

Examples

Compute embeddings for some generated data and store them in Pinecone:

```yaml
input:
  generate:
    interval: 1s
    mapping: |
      root = {"text": fake("paragraph")}
pipeline:
  processors:
    - openai_embeddings:
        model: text-embedding-3-large
        api_key: "${OPENAI_API_KEY}"
        text_mapping: "root = this.text"
output:
  pinecone:
    host: "${PINECONE_HOST}"
    api_key: "${PINECONE_API_KEY}"
    id: "root = uuid_v4()"
    vector_mapping: "root = this"
```

Fields

server_address

The OpenAI API endpoint that the processor sends requests to. Update the default value to use another OpenAI-compatible service.

Type: string

Default: "https://api.openai.com/v1"
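For example, a minimal sketch of pointing the processor at a locally hosted OpenAI-compatible server. The address, model name, and key handling here are placeholders, not defaults; substitute whatever your service actually exposes:

```yaml
openai_embeddings:
  # Hypothetical local OpenAI-compatible endpoint
  server_address: http://localhost:11434/v1
  # Some self-hosted services ignore the key but the field is still required
  api_key: "unused"
  # Placeholder embedding model name served by the local endpoint
  model: my-local-embedding-model
```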

api_key

The API key for the OpenAI API.

Type: string

model

The name of the OpenAI model to use.

Type: string

```yaml
# Examples
model: text-embedding-3-large
model: text-embedding-3-small
model: text-embedding-ada-002
```

text_mapping

The text you want to generate a vector embedding for. By default, the processor submits the entire payload as a string.

Type: string
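As in the example above, this field takes a mapping whose result is the string to embed. A sketch of a mapping that concatenates two fields into a single string (the `title` and `body` field names are assumptions about your message shape):

```yaml
openai_embeddings:
  model: text-embedding-3-large
  api_key: "${OPENAI_API_KEY}"
  # Embed a concatenation of hypothetical title and body fields
  text_mapping: 'root = this.title + "\n" + this.body'
```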

dimensions

The number of dimensions the resulting output embeddings should have. Only supported in text-embedding-3 and later models.

Type: int
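For example, a sketch of requesting embeddings truncated to a fixed size with a `text-embedding-3` model; 256 here is an arbitrary illustrative value, not a recommendation:

```yaml
openai_embeddings:
  model: text-embedding-3-small
  api_key: "${OPENAI_API_KEY}"
  # Request 256-dimensional embeddings instead of the model's default size
  dimensions: 256
```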