cohere_embeddings
Generates vector embeddings to represent input text, using the Cohere API.
Introduced in version 4.37.0.
# Config fields, showing default values
label: ""
cohere_embeddings:
  base_url: https://api.cohere.com
  api_key: "" # No default (required)
  model: embed-english-v3.0 # No default (required)
  text_mapping: "" # No default (optional)
  input_type: search_document
  dimensions: 0 # No default (optional)

This processor sends text strings to the Cohere API, which generates vector embeddings. By default, the processor submits the entire payload of each message as a string, unless you use the text_mapping configuration field to customize it.
To learn more about vector embeddings, see the Cohere API documentation.
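The example configuration on this page maps the processor's result straight into a vector field, which suggests that the message content after this processor is the embedding itself. If you also need the original document downstream, one possible approach is to wrap the processor in a branch processor. The following is a minimal sketch under that assumption; the embedding field name is hypothetical.

```yaml
pipeline:
  processors:
    - branch:
        # The nested processors run on a copy of the message,
        # so the original payload is left untouched.
        processors:
          - cohere_embeddings:
              model: embed-english-v3.0
              api_key: "${COHERE_API_KEY}"
              text_mapping: "root = this.text"
        # Assumption: the branch result is the embedding vector.
        # "embedding" is a hypothetical field name.
        result_map: "root.embedding = this"
```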
Examples
Compute embeddings for some generated data and store them in Qdrant.
input:
  generate:
    interval: 1s
    mapping: |
      root = {"text": fake("paragraph")}
pipeline:
  processors:
    - cohere_embeddings:
        model: embed-english-v3.0
        api_key: "${COHERE_API_KEY}"
        text_mapping: "root = this.text"
output:
  qdrant:
    grpc_host: localhost:6334
    collection_name: "example_collection"
    id: "root = uuid_v4()"
    vector_mapping: "root = this"

Fields
base_url
The base URL to use for API requests.
Type: string
Default: "https://api.cohere.com"
api_key
The API key for the Cohere API.
Type: string
model
The name of the Cohere model to use.
Type: string
# Examples
model: embed-english-v3.0
model: embed-english-light-v3.0
model: embed-multilingual-v3.0
model: embed-multilingual-light-v3.0

text_mapping
A Bloblang mapping that produces the text you want to generate a vector embedding for, such as root = this.text. By default, the processor submits the entire payload as a string.
Type: string
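As an illustration of text_mapping, the sketch below builds the text to embed from two fields instead of the whole payload. The title and body field names are hypothetical and stand in for whatever your documents contain.

```yaml
pipeline:
  processors:
    - cohere_embeddings:
        model: embed-english-v3.0
        api_key: "${COHERE_API_KEY}"
        # "title" and "body" are hypothetical fields of the input document.
        text_mapping: |
          root = this.title + "\n\n" + this.body
```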
input_type
Specifies the type of input passed to the model.
Type: string
Default: "search_document"
| Option | Summary |
|---|---|
| classification | Used for embeddings passed through a text classifier. |
| clustering | Used for embeddings run through a clustering algorithm. |
| search_document | Used for embeddings stored in a vector database for search use cases. |
| search_query | Used for embeddings of search queries run against a vector database to find relevant documents. |
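The options above imply a pairing: documents are embedded with search_document when they are indexed, and queries are embedded with search_query when they are looked up. A minimal sketch of the query side might look like the following, assuming each incoming message carries the user's query in a hypothetical query field.

```yaml
pipeline:
  processors:
    - cohere_embeddings:
        model: embed-english-v3.0
        api_key: "${COHERE_API_KEY}"
        # Use the same model that embedded the stored documents.
        input_type: search_query
        # "query" is a hypothetical field holding the search text.
        text_mapping: "root = this.query"
```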
dimensions
The number of dimensions of the output embedding. This is only available for embed-v4 and newer models. Possible values are 256, 512, 1024, and 1536.
Type: int
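A sketch of requesting smaller embeddings is shown below. The model identifier embed-v4.0 is an assumption about how the embed-v4 model is named in the Cohere API, so check the Cohere model list before relying on it.

```yaml
pipeline:
  processors:
    - cohere_embeddings:
        # Assumption: "embed-v4.0" is the identifier of the embed-v4 model.
        model: embed-v4.0
        api_key: "${COHERE_API_KEY}"
        # Must be one of 256, 512, 1024, or 1536.
        dimensions: 512
```

Whatever value you choose typically has to match the vector size configured for the collection or index that stores the embeddings.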