
<h1 align="center"><img src="https://rivet.ironcladapp.com/img/logo-banner-wide.png" alt="Rivet Logo"></h1> <div align="center"> <picture> <source media="(prefers-color-scheme: dark)" height="200px" srcset="https://github.com/jmorganca/ollama/assets/3325447/56ea1849-1284-4645-8970-956de6e51c3c"> <img alt="logo" height="200px" src="https://github.com/jmorganca/ollama/assets/3325447/0d0b44e2-8f4a-4e99-9b52-a5c1c741c8f7"> </picture> </div>

Rivet Ollama Plugin

The Rivet Ollama Plugin is a plugin for Rivet that lets you use Ollama to run and chat with LLMs locally and easily. It adds the following nodes:

- Ollama Chat
- Ollama Embedding
- Ollama Generate
- List Ollama Models
- Get Ollama Model
- Pull Model to Ollama

Running Ollama

To run Ollama so that Rivet's default browser executor can communicate with it, you will want to start it with the following command:

OLLAMA_ORIGINS=* ollama serve

If you are using the node executor, you can omit the OLLAMA_ORIGINS environment variable.

Using the plugin

In Rivet

To use this plugin in Rivet:

  1. Open the plugins overlay at the top of the screen.
  2. Search for "rivet-plugin-ollama"
  3. Click the "Add" button to install the plugin into your current project.

In the SDK

  1. Import the plugin and Rivet into your project:

    import * as Rivet from "@ironclad/rivet-node";
    import RivetPluginOllama from "rivet-plugin-ollama";
    
  2. Initialize the plugin and register the nodes with the globalRivetNodeRegistry:

    Rivet.globalRivetNodeRegistry.registerPlugin(RivetPluginOllama(Rivet));
    

    (You may also use your own node registry if you wish, instead of the global one.)

  3. The nodes will now work when run with runGraphInFile or createProcessor.
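For example, here is a minimal sketch of registering the plugin and running a graph; the project path and graph name are placeholders, and the options shown are a subset of RunGraphOptions:

```typescript
import * as Rivet from "@ironclad/rivet-node";
import RivetPluginOllama from "rivet-plugin-ollama";

// Register the Ollama nodes before running any graph that uses them.
Rivet.globalRivetNodeRegistry.registerPlugin(RivetPluginOllama(Rivet));

// Placeholder project path and graph name -- substitute your own.
const results = await Rivet.runGraphInFile("./my-project.rivet-project", {
  graph: "My Ollama Graph",
  inputs: {},
});

console.log(results);
```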

Configuration

In Rivet

By default, the plugin will attempt to connect to Ollama at http://localhost:11434. If you would like to change this, open the Settings window, navigate to the Plugins area, and you will see a Host setting for Ollama. You can change this to the URL of your Ollama instance. For some users, http://127.0.0.1:11434 works instead of localhost.
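If you are unsure which host value is correct, a quick way to check is to request Ollama's root endpoint, which returns a short status message when the server is reachable. A sketch, assuming the default port:

```typescript
// Quick reachability check against the configured host (default shown).
const res = await fetch("http://localhost:11434/");
console.log(res.status, await res.text()); // typically 200 "Ollama is running"
```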

In the SDK

When using the SDK, you can pass a host option to the plugin to configure the host:

Using createProcessor or runGraphInFile, pass in via pluginSettings in RunGraphOptions:

await createProcessor(project, {
  ...etc,
  pluginSettings: {
    ollama: {
      host: "http://localhost:11434",
    },
  },
});
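The same setting applies when using runGraphInFile (assuming the same imports as above); a sketch with a placeholder project path and graph name:

```typescript
await runGraphInFile("./my-project.rivet-project", {
  graph: "My Ollama Graph",
  pluginSettings: {
    ollama: {
      host: "http://localhost:11434",
    },
  },
});
```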

Nodes

Ollama Chat

The main node of the plugin. Functions similarly to the Chat Node built into Rivet. Uses the /api/chat route.

Inputs

| Title | Data Type | Description | Default Value | Notes |
| --- | --- | --- | --- | --- |
| System Prompt | `string` | The system prompt to prepend to the messages list. | (none) | Optional. |
| Messages | `chat-message[]` | The chat messages to use as the prompt for the LLM. | (none) | Chat messages are converted to the OpenAI message format using "role" and "content" keys. |
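For reference, the "role"/"content" mapping mirrors the request body of Ollama's /api/chat endpoint. An illustrative sketch (the model name is just an example):

```typescript
// Roughly the shape of what is sent to Ollama's /api/chat endpoint.
const requestBody = {
  model: "llama2",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user", content: "Why is the sky blue?" },
  ],
};
```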

Outputs

| Title | Data Type | Description | Notes |
| --- | --- | --- | --- |
| Output | `string` | The response text from the LLM. | |
| Messages Sent | `chat-message[]` | The messages that were sent to Ollama. | |
| All Messages | `chat-message[]` | All messages, including the reply from the LLM. | |

Editor Settings

| Setting | Description | Default Value | Use Input Toggle | Input Data Type |
| --- | --- | --- | --- | --- |
| Model | The name of the LLM model to use in Ollama. | (Empty) | Yes | `string` |
| Prompt Format | The way to format chat messages for the prompt being sent to the Ollama model. Raw means no formatting is applied. Llama 2 Instruct follows the Llama 2 prompt format. | Llama 2 Instruct | No | N/A |
| JSON Mode | Activates JSON output mode. | false | Yes | `boolean` |
| **Parameters Group** | | | | |
| Mirostat | Enable Mirostat sampling for controlling perplexity. (Default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0) | (unset) | Yes | `number` |
| Mirostat Eta | Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. (Default: 0.1) | (unset) | Yes | `number` |
| Mirostat Tau | Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0) | (unset) | Yes | `number` |
| Num Ctx | Sets the size of the context window used to generate the next token. (Default: 2048) | (unset) | Yes | `number` |
| Num GQA | The number of GQA groups in the transformer layer. Required for some models; for example, it is 8 for llama2:70b. | (unset) | Yes | `number` |
| Num GPUs | The number of layers to send to the GPU(s). On macOS it defaults to 1 to enable Metal support, 0 to disable. | (unset) | Yes | `number` |
| Num Threads | Sets the number of threads to use during computation. By default, Ollama will detect this for optimal performance. It is recommended to set this value to the number of physical CPU cores your system has (as opposed to the logical number of cores). | (unset) | Yes | `number` |
| Repeat Last N | Sets how far back the model looks to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx) | (unset) | Yes | `number` |
| Repeat Penalty | Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1) | (unset) | Yes | `number` |
| Temperature | The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8) | (unset) | Yes | `number` |
| Seed | Sets the random number seed to use for generation. Setting this to a specific number will make the model generate the same text for the same prompt. (Default: 0) | (unset) | Yes | `number` |
| Stop | Sets the stop sequences to use. When this pattern is encountered, the LLM will stop generating text and return. | (unset) | Yes | `string` |
| TFS Z | Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. (Default: 1) | (unset) | Yes | `number` |
| Num Predict | Maximum number of tokens to predict when generating text. (Default: 128, -1 = infinite generation, -2 = fill context) | (unset) | Yes | `number` |
| Top K | Reduces the probability of generating nonsense. A higher value (e.g., 100) will give more diverse answers, while a lower value (e.g., 10) will be more conservative. (Default: 40) | (unset) | Yes | `number` |
| Top P | Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9) | (unset) | Yes | `number` |
| Additional Parameters | Additional parameters to pass to Ollama. Numbers will be parsed and sent as numbers; otherwise they will be sent as strings. See the full list of supported parameters in Ollama. | (none) | Yes | `object` |
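To illustrate the Additional Parameters behavior described above, values that parse as numbers are sent as numbers and everything else is sent as strings. The parameter names below are examples from Ollama's option list, not values the plugin requires:

```typescript
// Hypothetical Additional Parameters object.
const additionalParameters = {
  num_keep: "5",            // parses as a number, sent as 5
  penalize_newline: "true", // does not parse as a number, sent as the string "true"
};
```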

Ollama Embedding

Embedding models are models that are trained specifically to generate vector embeddings: long arrays of numbers that represent semantic meaning for a given sequence of text. The resulting vector embedding arrays can then be stored in a database, which will compare them as a way to search for data that is similar in meaning.
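To make the "compare for similar meaning" step concrete, here is a small sketch of comparing two embedding vectors with cosine similarity. This is a common approach on the database/search side, not something the node itself performs:

```typescript
// Compare two embedding vectors (as produced by the Ollama Embedding node);
// values near 1 indicate similar meaning, values near 0 indicate unrelated text.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```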

Inputs

See Editor Settings for all possible inputs.

Outputs

| Title | Data Type | Description | Notes |
| --- | --- | --- | --- |
| Embedding | `vector` | Array of numbers that represent the semantic meaning of a given sequence of text. | |

Editor Settings

| Setting | Description | Default Value | Use Input Toggle | Input Data Type |
| --- | --- | --- | --- | --- |
| Model Name | The name of the model to use. | (Empty) | Yes (default off) | `string` |
| Text | The text to embed. | (Empty) | Yes (default off) | `string` |

Ollama Generate

Previously the main node of the plugin. Allows you to send prompts to Ollama and receive responses from the installed LLMs, with deep customization options including custom prompt formats. Uses the /api/generate route.

Inputs

| Title | Data Type | Description | Default Value | Notes |
| --- | --- | --- | --- | --- |
| System Prompt | `string` | The system prompt to prepend to the messages list. | (none) | Optional. |
| Messages | `chat-message[]` | The chat messages to use as the prompt for the LLM. | (none) | Chat messages are converted to a prompt based on the "Prompt Format" editor setting. If "Raw" is selected, no formatting is performed on the chat messages, and you are expected to have already formatted them in your Rivet graphs. |

Additional inputs are available via toggles in the editor.
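As a rough illustration of the "Llama 2 Instruct" Prompt Format (see Editor Settings below), a system prompt plus one user message is folded into a single prompt string along these lines; the exact whitespace may differ from what the plugin produces:

```typescript
// Approximate Llama 2 instruct-style prompt built from a system prompt and
// one user message; not necessarily byte-for-byte what the plugin sends.
const prompt =
  "[INST] <<SYS>>\n" +
  "You are a helpful assistant.\n" +
  "<</SYS>>\n\n" +
  "Why is the sky blue? [/INST]";
```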

Outputs

| Title | Data Type | Description | Notes |
| --- | --- | --- | --- |
| Output | `string` | The response text from the LLM. | |
| Prompt | `string` | The full prompt, with formatting, that was sent to Ollama. | |
| Messages Sent | `chat-message[]` | The messages that were sent to Ollama. | |
| All Messages | `chat-message[]` | All messages, including the reply from the LLM. | |
| Total Duration | `number` | Time spent generating the response. | Only available if the "Advanced Outputs" toggle is enabled. |
| Load Duration | `number` | Time spent in nanoseconds loading the model. | Only available if the "Advanced Outputs" toggle is enabled. |
| Sample Count | `number` | Number of samples generated. | Only available if the "Advanced Outputs" toggle is enabled. |
| Sample Duration | `number` | Time spent in nanoseconds generating samples. | Only available if the "Advanced Outputs" toggle is enabled. |
| Prompt Eval Count | `number` | Number of tokens in the prompt. | Only available if the "Advanced Outputs" toggle is enabled. |
| Prompt Eval Duration | `number` | Time spent in nanoseconds evaluating the prompt. | Only available if the "Advanced Outputs" toggle is enabled. |
| Eval Count | `number` | Number of tokens in the response. | Only available if the "Advanced Outputs" toggle is enabled. |
| Eval Duration | `number` | Time spent in nanoseconds evaluating the response. | Only available if the "Advanced Outputs" toggle is enabled. |
| Tokens Per Second | `number` | Number of tokens generated per second. | Only available if the "Advanced Outputs" toggle is enabled. |
| Parameters | `object` | The parameters used to generate the response. | Only available if the "Advanced Outputs" toggle is enabled. |
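The duration outputs are reported in nanoseconds, so a tokens-per-second figure can be derived from the eval counters roughly as follows. This is a sketch of the usual computation, not necessarily the plugin's exact code:

```typescript
// Example values: 120 response tokens evaluated over 2.5 seconds.
const evalCount = 120;
const evalDurationNs = 2_500_000_000;
const tokensPerSecond = evalCount / (evalDurationNs / 1e9); // 48
```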

Editor Settings

| Setting | Description | Default Value | Use Input Toggle | Input Data Type |
| --- | --- | --- | --- | --- |
| Model | The name of the LLM model to use in Ollama. | (Empty) | Yes | `string` |
| Prompt Format | The way to format chat messages for the prompt being sent to the Ollama model. Raw means no formatting is applied. Llama 2 Instruct follows the Llama 2 prompt format. | Llama 2 Instruct | No | N/A |
| JSON Mode | Activates JSON output mode. | false | Yes | `boolean` |
| Advanced Outputs | Add additional outputs with detailed information about the Ollama execution. | No | No | N/A |
| **Parameters Group** | | | | |
| Mirostat | Enable Mirostat sampling for controlling perplexity. (Default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0) | (unset) | Yes | `number` |
| Mirostat Eta | Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. (Default: 0.1) | (unset) | Yes | `number` |
| Mirostat Tau | Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0) | (unset) | Yes | `number` |
| Num Ctx | Sets the size of the context window used to generate the next token. (Default: 2048) | (unset) | Yes | `number` |
| Num GQA | The number of GQA groups in the transformer layer. Required for some models; for example, it is 8 for llama2:70b. | (unset) | Yes | `number` |
| Num GPUs | The number of layers to send to the GPU(s). On macOS it defaults to 1 to enable Metal support, 0 to disable. | (unset) | Yes | `number` |
| Num Threads | Sets the number of threads to use during computation. By default, Ollama will detect this for optimal performance. It is recommended to set this value to the number of physical CPU cores your system has (as opposed to the logical number of cores). | (unset) | Yes | `number` |
| Repeat Last N | Sets how far back the model looks to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx) | (unset) | Yes | `number` |
| Repeat Penalty | Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1) | (unset) | Yes | `number` |
| Temperature | The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8) | (unset) | Yes | `number` |
| Seed | Sets the random number seed to use for generation. Setting this to a specific number will make the model generate the same text for the same prompt. (Default: 0) | (unset) | Yes | `number` |
| Stop | Sets the stop sequences to use. When this pattern is encountered, the LLM will stop generating text and return. | (unset) | Yes | `string` |
| TFS Z | Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. (Default: 1) | (unset) | Yes | `number` |
| Num Predict | Maximum number of tokens to predict when generating text. (Default: 128, -1 = infinite generation, -2 = fill context) | (unset) | Yes | `number` |
| Top K | Reduces the probability of generating nonsense. A higher value (e.g., 100) will give more diverse answers, while a lower value (e.g., 10) will be more conservative. (Default: 40) | (unset) | Yes | `number` |
| Top P | Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9) | (unset) | Yes | `number` |
| Additional Parameters | Additional parameters to pass to Ollama. Numbers will be parsed and sent as numbers; otherwise they will be sent as strings. See the full list of supported parameters in Ollama. | (none) | Yes | `object` |

List Ollama Models

Lists the models installed in Ollama.

Inputs

This node has no inputs.

Outputs

| Title | Data Type | Description | Notes |
| --- | --- | --- | --- |
| Model Names | `string[]` | The names of the models installed in Ollama. | |

Editor Settings

This node has no editor settings.

Get Ollama Model

Gets the model with the given name from Ollama.

Inputs

See Editor Settings for all possible inputs.

Outputs

| Title | Data Type | Description | Notes |
| --- | --- | --- | --- |
| License | `string` | Contents of the license block of the model. | |
| Modelfile | `string` | The Ollama modelfile for the model. | |
| Parameters | `string` | The parameters for the model. | |
| Template | `string` | The template for the model. | |

Editor Settings

| Setting | Description | Default Value | Use Input Toggle | Input Data Type |
| --- | --- | --- | --- | --- |
| Model Name | The name of the model to get. | (Empty) | Yes (default on) | `string` |

Pull Model to Ollama

Downloads a model from the Ollama library to the Ollama server.

Inputs

See Editor Settings for all possible inputs.

Outputs

| Title | Data Type | Description | Notes |
| --- | --- | --- | --- |
| Model Name | `string` | The name of the model that was pulled. | |

Editor Settings

| Setting | Description | Default Value | Use Input Toggle | Input Data Type |
| --- | --- | --- | --- | --- |
| Model Name | The name of the model to pull. | (Empty) | Yes (default on) | `string` |
| Insecure | Allow insecure connections to the library. Only use this if you are pulling from your own library during development. | No | No | N/A |

Local Development

  1. Run yarn dev to start the compiler and bundler in watch mode. This will automatically recompile and rebundle your changes into the dist folder. This will also copy the bundled files into the plugin install directory.
  2. After each change, you must restart Rivet to see the changes.