ollama_moderation
Checks the safety of LLM responses, using the Ollama API.
Introduced in version 4.42.0.
```yaml
# Common config fields, showing default values
label: ""
ollama_moderation:
  model: llama-guard3 # No default (required)
  prompt: "" # No default (required)
  response: "" # No default (required)
  runner:
    context_size: 0 # No default (optional)
    batch_size: 0 # No default (optional)
  server_address: http://127.0.0.1:11434 # No default (optional)
```

```yaml
# Advanced config fields, showing default values
label: ""
ollama_moderation:
  model: llama-guard3 # No default (required)
  prompt: "" # No default (required)
  response: "" # No default (required)
  runner:
    context_size: 0 # No default (optional)
    batch_size: 0 # No default (optional)
    gpu_layers: 0 # No default (optional)
    threads: 0 # No default (optional)
    use_mmap: false # No default (optional)
  server_address: http://127.0.0.1:11434 # No default (optional)
  cache_directory: /opt/cache/connect/ollama # No default (optional)
  download_url: "" # No default (optional)
```

This processor checks LLM response safety using either llama-guard3 or shieldgemma. To check whether a given prompt is safe, use the ollama_chat processor instead; this processor classifies responses only.
By default, the processor starts and runs a locally installed Ollama server. Alternatively, to use an already running Ollama server, add your server details to the server_address field. You can download and install Ollama from the Ollama website.
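For instance, here is a minimal sketch that points the processor at a server you already run; the address is the assumed default Ollama port, and the interpolated prompt and response fields are illustrative, reading from fields of the input document:

```yaml
pipeline:
  processors:
    - ollama_moderation:
        model: llama-guard3
        # Assumes an Ollama server is already listening on the default port.
        server_address: http://127.0.0.1:11434
        # Illustrative: prompt and response are read from the input document.
        prompt: "${!this.prompt}"
        response: "${!this.response}"
```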
For more information, see the Ollama documentation.
Examples
This example uses Llama Guard 3 to check whether another model responded with safe or unsafe content.
```yaml
input:
  stdin:
    scanner:
      lines: {}
pipeline:
  processors:
    - ollama_chat:
        model: llava
        prompt: "${!content().string()}"
        save_prompt_metadata: true
    - ollama_moderation:
        model: llama-guard3
        prompt: "${!@prompt}"
        response: "${!content().string()}"
    - mapping: |
        root.response = content().string()
        root.is_safe = @safe
output:
  stdout:
    codec: lines
```

Fields
model
The name of the Ollama LLM to use.
Type: string
| Option | Summary |
|---|---|
| llama-guard3 | When using llama-guard3, two pieces of metadata are added: @safe with a value of yes or no, and @category with the safety category that was violated. For more information, see the Llama Guard 3 Model Card. |
| shieldgemma | When using shieldgemma, the model outputs a single piece of metadata, @safe, with a value of yes if the response does not violate its defined safety policies, or no if it does. |
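Downstream processors can act on this metadata. A minimal sketch, assuming llama-guard3 and input documents that carry their own prompt and response fields, that drops anything flagged as unsafe:

```yaml
pipeline:
  processors:
    - ollama_moderation:
        model: llama-guard3
        prompt: "${!this.prompt}"       # illustrative: prompt stored on the input document
        response: "${!this.response}"   # illustrative: response stored on the input document
    - mapping: |
        # Keep responses that llama-guard3 marked as safe; drop the rest.
        root = if @safe == "yes" { this } else { deleted() }
```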
```yaml
# Examples

model: llama-guard3

model: shieldgemma
```

prompt
The input prompt that was used with the LLM. If using ollama_chat, you can use save_prompt_metadata to save the prompt as metadata.
This field supports interpolation functions.
Type: string
response
The LLM’s response to classify if it contains safe or unsafe content. This field supports interpolation functions.
Type: string
runner
Options for the model runner that are used when the model is first loaded into memory.
Type: object
runner.context_size
Sets the size of the context window used to generate the next token. Using a larger context window uses more memory and takes longer to process.
Type: int
runner.batch_size
The maximum number of requests to process in parallel.
Type: int
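A hedged sketch combining the two runner fields above; the values are illustrative, not recommendations:

```yaml
ollama_moderation:
  model: llama-guard3
  prompt: "${!@prompt}"
  response: "${!content().string()}"
  runner:
    context_size: 4096 # illustrative: tokens of context used when generating
    batch_size: 8      # illustrative: at most 8 requests processed in parallel
```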
runner.gpu_layers
This option allows offloading some layers to the GPU for computation. This generally results in increased performance. By default, the runtime decides the number of layers dynamically.
Type: int
runner.threads
Set the number of threads to use during generation. For optimal performance, it is recommended to set this value to the number of physical CPU cores your system has. By default, the runtime decides the optimal number of threads.
Type: int
runner.use_mmap
Maps the model into memory. This is only supported on Unix systems and allows loading only the necessary parts of the model as needed.
Type: bool
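A sketch of the advanced runner options, assuming a Unix host with a GPU; the numbers are placeholders to tune for your hardware:

```yaml
ollama_moderation:
  model: llama-guard3
  prompt: "${!@prompt}"
  response: "${!content().string()}"
  runner:
    gpu_layers: 32 # placeholder: offload 32 layers to the GPU
    threads: 8     # placeholder: match your physical core count
    use_mmap: true # memory-map the model (Unix only)
```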
server_address
The address of the Ollama server to use. Leave the field blank to have the processor start and run a local Ollama server, or specify the address of your own local or remote server.
Type: string
```yaml
# Examples

server_address: http://127.0.0.1:11434
```

cache_directory
If server_address is not set, the directory in which to download the Ollama binary and use as a model cache.
Type: string
```yaml
# Examples

cache_directory: /opt/cache/connect/ollama
```

download_url
If server_address is not set, the URL from which to download the Ollama binary. Defaults to the official Ollama GitHub release for your platform.
Type: string
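A hedged sketch for mirrored or air-gapped setups, where the binary comes from an internal mirror and is cached locally; the mirror URL is a hypothetical placeholder:

```yaml
ollama_moderation:
  model: llama-guard3
  prompt: "${!@prompt}"
  response: "${!content().string()}"
  cache_directory: /opt/cache/connect/ollama
  # Hypothetical internal mirror; replace with a URL you control.
  download_url: https://mirror.example.com/ollama/ollama-linux-amd64.tgz
```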