openai_speech

Generates audio from a text description and other attributes, using OpenAI API.

Introduced in version 4.32.0.

Common
Advanced

# Common config fields, showing default values
label: ""
openai_speech:
  server_address: https://api.openai.com/v1
  api_key: "" # No default (required)
  model: tts-1 # No default (required)
  input: "" # No default (optional)
  voice: alloy # No default (required)

# Advanced config fields, showing default values
label: ""
openai_speech:
  server_address: https://api.openai.com/v1
  api_key: "" # No default (required)
  model: tts-1 # No default (required)
  input: "" # No default (optional)
  voice: alloy # No default (required)
  response_format: mp3 # No default (optional)

This processor sends a text description and other attributes, such as a voice type and format to the OpenAI API, which generates audio. By default, the processor submits the entire payload of each message as a string, unless you use the input configuration field to customize it.

To learn more about turning text into spoken audio, see the OpenAI API documentation.

Fields

`server_address`

The Open API endpoint that the processor sends requests to. Update the default value to use another OpenAI compatible service.

Type: string

Default: "https://api.openai.com/v1"

`api_key`

The API key for OpenAI API.

Type: string

`model`

The name of the OpenAI model to use.

Type: string

# Examples

model: tts-1

model: tts-1-hd

`input`

A text description of the audio you want to generate. The input field accepts a maximum of 4096 characters.

Type: string

`voice`

The type of voice to use when generating the audio. This field supports interpolation functions.

Type: string

# Examples

voice: alloy

voice: echo

voice: fable

voice: onyx

voice: nova

voice: shimmer

`response_format`

The format to generate audio in. Default is mp3. This field supports interpolation functions.

Type: string

# Examples

response_format: mp3

response_format: opus

response_format: aac

response_format: flac

response_format: wav

response_format: pcm