Ruby OpenAI
Use the OpenAI API with Ruby! 🤖❤️
Stream text with GPT-4o, transcribe and translate audio with Whisper, or create images with DALL·E...
🚀 Hire me to build your Rails+AI app | 📚 Rails AI | 🎮 Ruby AI Builders Discord | 🐦 X | 🧠 Anthropic Gem | 🚂 Midjourney Gem
Contents
- Ruby OpenAI
- Table of Contents
- Installation
- Usage
- Development
- Release
- Contributing
- License
- Code of Conduct
Installation
Bundler
Add this line to your application's Gemfile:
gem "ruby-openai"
And then execute:
$ bundle install
Gem install
Or install with:
$ gem install ruby-openai
and require with:
require "openai"
Usage
- Get your API key from https://platform.openai.com/account/api-keys
- If you belong to multiple organizations, you can get your Organization ID from https://platform.openai.com/account/org-settings
Quickstart
For a quick test you can pass your token directly to a new client:
client = OpenAI::Client.new(
access_token: "access_token_goes_here",
log_errors: true # Highly recommended in development, so you can see what errors OpenAI is returning. Not recommended in production because it could leak private data to your logs.
)
With Config
For a more robust setup, you can configure the gem with your API keys, for example in an openai.rb
initializer file. Never hardcode secrets into your codebase - instead use something like dotenv to pass the keys safely into your environments.
OpenAI.configure do |config|
config.access_token = ENV.fetch("OPENAI_ACCESS_TOKEN")
config.organization_id = ENV.fetch("OPENAI_ORGANIZATION_ID") # Optional
config.log_errors = true # Highly recommended in development, so you can see what errors OpenAI is returning. Not recommended in production because it could leak private data to your logs.
end
Then you can create a client like this:
client = OpenAI::Client.new
You can still override the config defaults when making new clients; any options not included will fall back to the global config set with OpenAI.configure. In this example only the access_token is overridden, while organization_id, request_timeout, etc. fall back to the values set globally with OpenAI.configure:
client = OpenAI::Client.new(access_token: "access_token_goes_here")
Custom timeout or base URI
- The default timeout for any request using this library is 120 seconds. You can change that by passing a number of seconds to the request_timeout option when initializing the client.
- You can also change the base URI used for all requests, e.g. to use observability tools like Helicone or Velvet.
- You can also add arbitrary other headers, e.g. for openai-caching-proxy-worker:
client = OpenAI::Client.new(
access_token: "access_token_goes_here",
uri_base: "https://oai.hconeai.com/",
request_timeout: 240,
extra_headers: {
"X-Proxy-TTL" => "43200", # For https://github.com/6/openai-caching-proxy-worker#specifying-a-cache-ttl
"X-Proxy-Refresh": "true", # For https://github.com/6/openai-caching-proxy-worker#refreshing-the-cache
"Helicone-Auth": "Bearer HELICONE_API_KEY", # For https://docs.helicone.ai/getting-started/integration-method/openai-proxy
"helicone-stream-force-format" => "true", # Use this with Helicone otherwise streaming drops chunks # https://github.com/alexrudall/ruby-openai/issues/251
}
)
or when configuring the gem:
OpenAI.configure do |config|
config.access_token = ENV.fetch("OPENAI_ACCESS_TOKEN")
config.log_errors = true # Optional
config.organization_id = ENV.fetch("OPENAI_ORGANIZATION_ID") # Optional
config.uri_base = "https://oai.hconeai.com/" # Optional
config.request_timeout = 240 # Optional
config.extra_headers = {
"X-Proxy-TTL" => "43200", # For https://github.com/6/openai-caching-proxy-worker#specifying-a-cache-ttl
"X-Proxy-Refresh": "true", # For https://github.com/6/openai-caching-proxy-worker#refreshing-the-cache
"Helicone-Auth": "Bearer HELICONE_API_KEY" # For https://docs.helicone.ai/getting-started/integration-method/openai-proxy
} # Optional
end
Extra Headers per Client
You can dynamically pass headers per client object, which will be merged with any headers set globally with OpenAI.configure:
client = OpenAI::Client.new(access_token: "access_token_goes_here")
client.add_headers("X-Proxy-TTL" => "43200")
Logging
Errors
By default, ruby-openai does not log any Faraday::Errors encountered while executing a network request to avoid leaking data (e.g. 400s, 500s, SSL errors and more - see here for a complete list of subclasses of Faraday::Error and what can cause them).
If you would like to enable this functionality, you can set log_errors to true when configuring the client:
client = OpenAI::Client.new(log_errors: true)
Faraday middleware
You can pass Faraday middleware to the client in a block, e.g. to enable verbose logging with Ruby's Logger:
client = OpenAI::Client.new do |f|
f.response :logger, Logger.new($stdout), bodies: true
end
Azure
To use the Azure OpenAI Service API, you can configure the gem like this:
OpenAI.configure do |config|
config.access_token = ENV.fetch("AZURE_OPENAI_API_KEY")
config.uri_base = ENV.fetch("AZURE_OPENAI_URI")
config.api_type = :azure
config.api_version = "2023-03-15-preview"
end
where AZURE_OPENAI_URI is e.g. https://custom-domain.openai.azure.com/openai/deployments/gpt-35-turbo
Ollama
Ollama allows you to run open-source LLMs, such as Llama 3, locally. It offers chat compatibility with the OpenAI API.
You can download Ollama here. On macOS you can install and run Ollama like this:
brew install ollama
ollama serve
ollama pull llama3:latest # In new terminal tab.
Create a client using your Ollama server and the pulled model, and stream a conversation for free:
client = OpenAI::Client.new(
uri_base: "http://localhost:11434"
)
client.chat(
parameters: {
model: "llama3", # Required.
messages: [{ role: "user", content: "Hello!"}], # Required.
temperature: 0.7,
stream: proc do |chunk, _bytesize|
print chunk.dig("choices", 0, "delta", "content")
end
}
)
# => Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?
Groq
Groq API Chat is broadly compatible with the OpenAI API, with a few minor differences. Get an access token from here, then:
client = OpenAI::Client.new(
access_token: "groq_access_token_goes_here",
uri_base: "https://api.groq.com/openai"
)
client.chat(
parameters: {
model: "llama3-8b-8192", # Required.
messages: [{ role: "user", content: "Hello!"}], # Required.
temperature: 0.7,
stream: proc do |chunk, _bytesize|
print chunk.dig("choices", 0, "delta", "content")
end
}
)
Counting Tokens
OpenAI parses prompt text into tokens, which are words or portions of words. (These tokens are unrelated to your API access_token.) Counting tokens can help you estimate your costs. It can also help you ensure your prompt text size is within the max-token limits of your model's context window, and choose an appropriate max_tokens completion parameter so your response will fit as well.
To estimate the token-count of your text:
OpenAI.rough_token_count("Your text")
If you need a more accurate count, try tiktoken_ruby.
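As a rough illustration of the budgeting described above, the sketch below picks a max_tokens value that leaves room for the prompt. The helper names (approx_token_count, completion_budget), the 4-characters-per-token heuristic and the 8192-token window are assumptions made for illustration and are not part of the gem; in real code you would swap in OpenAI.rough_token_count or tiktoken_ruby for the estimate.

```ruby
# Hypothetical helper: approximates tokens as roughly 4 characters each.
def approx_token_count(text)
  (text.length / 4.0).ceil
end

# Tokens left for the completion once the prompt and an optional
# safety margin have been subtracted from the context window.
def completion_budget(prompt, context_window:, safety_margin: 0)
  remaining = context_window - approx_token_count(prompt) - safety_margin
  [remaining, 0].max # never return a negative budget
end

prompt = "Summarise the plot of Hamlet in two sentences."
max_tokens = completion_budget(prompt, context_window: 8192, safety_margin: 100)
puts max_tokens
```

A budget of 0 signals that the prompt alone already fills the assumed window and should be shortened before sending.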
Models
There are different models that can be used to generate text. For a full list and to retrieve information about a single model:
client.models.list
client.models.retrieve(id: "gpt-4o")
Chat
GPT is a model that can be used to generate text in a conversational style. You can use it to generate a response to a sequence of messages:
response = client.chat(
parameters: {
model: "gpt-4o", # Required.
messages: [{ role: "user", content: "Hello!"}], # Required.
temperature: 0.7,
}
)
puts response.dig("choices", 0, "message", "content")
# => "Hello! How may I assist you today?"
Streaming Chat
Quick guide to streaming Chat with Rails 7 and Hotwire
You can stream from the API in realtime, which can be much faster and used to create a more engaging user experience. Pass a Proc (or any object with a #call method) to the stream parameter to receive the stream of completion chunks as they are generated. Each time one or more chunks is received, the proc will be called once with each chunk, parsed as a Hash. If OpenAI returns an error, ruby-openai will raise a Faraday error.
client.chat(
parameters: {
model: "gpt-4o", # Required.
messages: [{ role: "user", content: "Describe a character called Anna!"}], # Required.
temperature: 0.7,
stream: proc do |chunk, _bytesize|
print chunk.dig("choices", 0, "delta", "content")
end
}
)
# => "Anna is a young woman in her mid-twenties, with wavy chestnut hair that falls to her shoulders..."
Note: In order to get usage information, you can provide the stream_options parameter and OpenAI will provide a final chunk with the usage. Here is an example:
stream_proc = proc { |chunk, _bytesize| puts "--------------"; puts chunk.inspect; }
client.chat(
parameters: {
model: "gpt-4o",
stream: stream_proc,
stream_options: { include_usage: true },
messages: [{ role: "user", content: "Hello!"}],
}
)
# => --------------
# => {"id"=>"chatcmpl-7bbq05PiZqlHxjV1j7OHnKKDURKaf", "object"=>"chat.completion.chunk", "created"=>1718750612, "model"=>"gpt-4o-2024-05-13", "system_fingerprint"=>"fp_9cb5d38cf7", "choices"=>[{"index"=>0, "delta"=>{"role"=>"assistant", "content"=>""}, "logprobs"=>nil, "finish_reason"=>nil}], "usage"=>nil}
# => --------------
# => {"id"=>"chatcmpl-7bbq05PiZqlHxjV1j7OHnKKDURKaf", "object"=>"chat.completion.chunk", "created"=>1718750612, "model"=>"gpt-4o-2024-05-13", "system_fingerprint"=>"fp_9cb5d38cf7", "choices"=>[{"index"=>0, "delta"=>{"content"=>"Hello"}, "logprobs"=>nil, "finish_reason"=>nil}], "usage"=>nil}
# => --------------
# => ... more content chunks
# => --------------
# => {"id"=>"chatcmpl-7bbq05PiZqlHxjV1j7OHnKKDURKaf", "object"=>"chat.completion.chunk", "created"=>1718750612, "model"=>"gpt-4o-2024-05-13", "system_fingerprint"=>"fp_9cb5d38cf7", "choices"=>[{"index"=>0, "delta"=>{}, "logprobs"=>nil, "finish_reason"=>"stop"}], "usage"=>nil}
# => --------------
# => {"id"=>"chatcmpl-7bbq05PiZqlHxjV1j7OHnKKDURKaf", "object"=>"chat.completion.chunk", "created"=>1718750612, "model"=>"gpt-4o-2024-05-13", "system_fingerprint"=>"fp_9cb5d38cf7", "choices"=>[], "usage"=>{"prompt_tokens"=>9, "completion_tokens"=>9, "total_tokens"=>18}}
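If you want the full text and the usage figures rather than a printout of every chunk, one option is a small collector built around the stream proc. This is only a sketch: build_stream_collector is a hypothetical helper, and the sample chunks are trimmed copies of the output shown above rather than a live response.

```ruby
# Hypothetical helper: returns a proc suitable for the stream parameter,
# plus a Hash that the proc fills in as chunks arrive.
def build_stream_collector
  state = { text: +"", usage: nil }
  handler = proc do |chunk, _bytesize|
    # Content chunks carry text under choices[0].delta.content;
    # the first, last and usage chunks have no content, so dig returns nil.
    content = chunk.dig("choices", 0, "delta", "content")
    state[:text] << content if content
    # With include_usage, only the final chunk has a non-nil "usage".
    state[:usage] = chunk["usage"] if chunk["usage"]
  end
  [handler, state]
end

handler, state = build_stream_collector

# Trimmed sample chunks in the shape shown above (not a real API call).
[
  { "choices" => [{ "index" => 0, "delta" => { "role" => "assistant", "content" => "" } }], "usage" => nil },
  { "choices" => [{ "index" => 0, "delta" => { "content" => "Hello" } }], "usage" => nil },
  { "choices" => [{ "index" => 0, "delta" => {}, "finish_reason" => "stop" }], "usage" => nil },
  { "choices" => [], "usage" => { "prompt_tokens" => 9, "completion_tokens" => 9, "total_tokens" => 18 } }
].each { |chunk| handler.call(chunk, 0) }

puts state[:text]
puts state[:usage].inspect
```

In a real request you would pass handler as the stream parameter and read state once client.chat returns.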
Vision
You can use the GPT-4 Vision model to generate a description of an image:
messages = [
{ "type": "text", "text": "What's in this image?"},
{ "type": "image_url",
"image_url": {
"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
},
}
]
response = client.chat(
parameters: {
model: "gpt-4-vision-preview", # Required.
messages: [{ role: "user", content: messages}], # Required.
}
)
puts response.dig("choices", 0, "message", "content")
# => "The image depicts a serene natural landscape featuring a long wooden boardwalk extending straight ahead"
JSON Mode
You can set the response_format to ask for responses in JSON:
response = client.chat(
parameters: {
model: "gpt-4o",
response_format: { type: "json_object" },
messages: [{ role: "user", content: "Hello! Give me some JSON please."}],
temperature: 0.7,
})
puts response.dig("choices", 0, "message", "content")
# =>
# {
# "name": "John",
# "age": 30,
# "city": "New York",
# "hobbies": ["reading", "traveling", "hiking"],
# "isStudent": false
# }
You can stream it as well!
response = client.chat(
parameters: {
model: "gpt-4o",
messages: [{ role: "user", content: "Can I have some JSON please?"}],
response_format: { type: "json_object" },
stream: proc do |chunk, _bytesize|
print chunk.dig("choices", 0, "delta", "content")
end
}
)
# =>
# {
# "message": "Sure, please let me know what specific JSON data you are looking for.",
# "JSON_data": {
# "example_1": {
# "key_1": "value_1",
# "key_2": "value_2",
# "key_3": "value_3"
# },
# "example_2": {
# "key_4": "value_4",
# "key_5": "value_5",
# "key_6": "value_6"
# }
# }
# }
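The content of a JSON-mode response still arrives as a string, so it usually needs parsing before use. The sketch below shows one defensive way to do that; parse_json_content is a hypothetical helper and the sample response is hand-written in the shape used above, not real API output.

```ruby
require "json"

# Hypothetical helper: extracts and parses the JSON content of a chat
# response, returning nil when the content is missing or not valid JSON.
def parse_json_content(response)
  content = response.dig("choices", 0, "message", "content")
  return nil if content.nil?

  JSON.parse(content)
rescue JSON::ParserError
  nil
end

# Hand-written sample in the shape of a non-streamed chat response.
sample_response = {
  "choices" => [
    { "message" => { "role" => "assistant", "content" => '{"name": "John", "age": 30}' } }
  ]
}

data = parse_json_content(sample_response)
puts data["name"] if data
```

Returning nil on a parse failure keeps the caller simple; raising instead may suit code that treats malformed JSON as a hard error.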
Functions
You can describe and pass in functions and the model will intelligently choose to output a JSON object containing arguments to call them, e.g. to use your method get_current_weather to get the weather in a given location. Note that tool_choice is optional, but if you exclude it, the model will choose whether to use the function or not (see here).
def get_current_weather(location:, unit: "fahrenheit")
# Here you could use a weather api to fetch the weather.
"The weather in #{location} is nice 🌞 #{unit}"
end
messages = [
{
"role": "user",
"content": "What is the weather like in San Francisco?",
},
]
response =
client.chat(
parameters: {
model: "gpt-4o",
messages: messages, # Defined above because we'll use it again
tools: [
{
type: "function",
function: {
name: "get_current_weather",
description: "Get the current weather in a given location",
parameters: { # Format: https://json-schema.org/understanding-json-schema
type: :object,
properties: {
location: {
type: :string,
description: "The city and state, e.g. San Francisco, CA",
},
unit: {
type: "string",
enum: %w[celsius fahrenheit],
},
},
required: ["location"],
},
},
}
],
# Optional, defaults to "auto"
# Can also put "none" or specific functions, see docs
tool_choice: "required"
},
)
message = response.dig("choices", 0, "message")
if message["role"] == "assistant" && message["tool_calls"]
# For a subsequent message with the role "tool", OpenAI requires the preceding message to have a tool_calls argument.
# Append the assistant message once, before the loop, so it is not duplicated when there are several tool calls.
messages << message
message["tool_calls"].each do |tool_call|
tool_call_id = tool_call.dig("id")
function_name = tool_call.dig("function", "name")
function_args = JSON.parse(
tool_call.dig("function", "arguments"),
{ symbolize_names: true },
)
function_response =
case function_name
when "get_current_weather"
get_current_weather(**function_args) # => "The weather is nice 🌞"
else
# decide how to handle
end
messages << {
tool_call_id: tool_call_id,
role: "tool",
name: function_name,
content: function_response
} # Extend the conversation with the results of the functions
end
second_response = client.chat(
parameters: {
model: "gpt-4o",
messages: messages
}
)
puts second_response.dig("choices", 0, "message", "content")
# At this point, the model has decided to call functions, you've called the functions
# and provided the response back, and the model has considered this and responded.
end
# => "It looks like the weather is nice and sunny in San Francisco! If you're planning to go out, it should be a pleasant day."
Completions
Hit the OpenAI API for a completion using a model that supports the legacy Completions endpoint, such as gpt-3.5-turbo-instruct:
response = client.completions(
parameters: {
model: "gpt-3.5-turbo-instruct",
prompt: "Once upon a time",
max_tokens: 5
}
)
puts response["choices"].map { |c| c["text"] }
# => [", there lived a great"]
Embeddings
You can use the embeddings endpoint to get a vector of numbers representing an input. You can then compare these vectors for different inputs to efficiently check how similar the inputs are.
response = client.embeddings(
parameters: {
model: "text-embedding-ada-002",
input: "The food was delicious and the waiter..."
}
)
puts response.dig("data", 0, "embedding")
# => Vector representation of your embedding
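One common way to compare two embedding vectors is cosine similarity. The sketch below is a plain-Ruby illustration; cosine_similarity is a hypothetical helper rather than part of the gem, and the short vectors stand in for real embeddings, which are much longer.

```ruby
# Hypothetical helper: cosine similarity between two equal-length vectors.
# Values near 1.0 mean the vectors point in nearly the same direction.
def cosine_similarity(a, b)
  raise ArgumentError, "vectors must have the same length" unless a.length == b.length

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero? # avoid dividing by zero

  dot / (norm_a * norm_b)
end

# Toy vectors standing in for response.dig("data", 0, "embedding").
pizza = [0.9, 0.1, 0.0]
pasta = [0.8, 0.2, 0.1]
rocket = [0.0, 0.1, 0.9]

puts cosine_similarity(pizza, pasta)
puts cosine_similarity(pizza, rocket)
```

For large collections a vector database or an approximate nearest-neighbour index is usually a better fit than pairwise comparison in Ruby.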
Batches
The Batches endpoint allows you to create and manage large batches of API requests to run asynchronously. Currently, the supported endpoints for batches are /v1/chat/completions (Chat Completions API) and /v1/embeddings (Embeddings API).
To use the Batches endpoint, you need to first upload a JSONL file containing the batch requests using the Files endpoint. The file must be uploaded with the purpose set to batch. Each line in the JSONL file represents a single request and should have the following format:
{
"custom_id": "request-1",
"method": "POST",
"url": "/v1/chat/completions",
"body": {
"model": "gpt-4o",
"messages": [
{ "role": "system", "content": "You are a helpful assistant." },
{ "role": "user", "content": "What is 2+2?" }
]
}
}
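The request above is pretty-printed for readability; in the JSONL file each request has to sit on a single line. A minimal sketch of generating such a file body follows, where build_batch_jsonl, the system prompt and the request-N naming scheme are illustrative assumptions.

```ruby
require "json"

# Hypothetical helper: turns a list of user prompts into JSONL batch input,
# one compact JSON object per line.
def build_batch_jsonl(prompts, model: "gpt-4o")
  lines = prompts.each_with_index.map do |prompt, index|
    request = {
      custom_id: "request-#{index + 1}", # used later to match outputs to inputs
      method: "POST",
      url: "/v1/chat/completions",
      body: {
        model: model,
        messages: [
          { role: "system", content: "You are a helpful assistant." },
          { role: "user", content: prompt }
        ]
      }
    }
    JSON.generate(request)
  end
  lines.join("\n") + "\n"
end

jsonl = build_batch_jsonl(["What is 2+2?", "What is 3+3?"])
puts jsonl
```

The resulting string can be written to a .jsonl file before uploading it with the purpose set to batch.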
Once you have uploaded the JSONL file, you can create a new batch by providing the file ID, endpoint, and completion window:
response = client.batches.create(
parameters: {
input_file_id: "file-abc123",
endpoint: "/v1/chat/completions",
completion_window: "24h"
}
)
batch_id = response["id"]
You can retrieve information about a specific batch using its ID:
batch = client.batches.retrieve(id: batch_id)
To cancel a batch that is in progress:
client.batches.cancel(id: batch_id)
You can also list all the batches:
client.batches.list
Once the batch["completed_at"] is present, you can fetch the output or error files:
batch = client.batches.retrieve(id: batch_id)
output_file_id = batch["output_file_id"]
output_response = client.files.content(id: output_file_id)
error_file_id = batch["error_file_id"]
error_response = client.files.content(id: error_file_id)
These files are in JSONL format, with each line representing the output or error for a single request. The lines can be in any order:
{
"id": "response-1",
"custom_id": "request-1",
"response": {
"id": "chatcmpl-abc123",
"object": "chat.completion",
"created": 1677858242,
"model": "gpt-4o",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "2+2 equals 4."
}
}
]
}
}
If a request fails with a non-HTTP error, the error object will contain more information about the cause of the failure.
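Because the lines can come back in any order, it helps to index them by custom_id before matching them to the original requests. The sketch below assumes you have the raw JSONL text; if your client has already parsed it into an array of hashes, skip the parsing step. index_batch_output is a hypothetical helper and the sample lines are hand-written.

```ruby
require "json"

# Hypothetical helper: parses JSONL output text and indexes rows by custom_id.
def index_batch_output(jsonl_text)
  jsonl_text.each_line.map(&:strip).reject(&:empty?).to_h do |line|
    row = JSON.parse(line)
    [row["custom_id"], row]
  end
end

# Hand-written sample output, deliberately out of order.
sample = <<~JSONL
  {"id": "response-2", "custom_id": "request-2", "response": {"choices": [{"message": {"content": "3+3 equals 6."}}]}}
  {"id": "response-1", "custom_id": "request-1", "response": {"choices": [{"message": {"content": "2+2 equals 4."}}]}}
JSONL

results = index_batch_output(sample)
puts results["request-1"].dig("response", "choices", 0, "message", "content")
```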
Files
For fine-tuning purposes
Put your data in a .jsonl file like this:
{"prompt":"Overjoyed with my new phone! ->", "completion":" positive"}
{"prompt":"@lakers disappoint for a third straight night ->", "completion":" negative"}
and pass the path (or a StringIO object) to client.files.upload to upload it to OpenAI, and then interact with it:
client.files.upload(parameters: { file: "path/to/sentiment.jsonl", purpose: "fine-tune" })
client.files.list
client.files.retrieve(id: "file-123")
client.files.content(id: "file-123")
client.files.delete(id: "file-123")
For assistant purposes
You can send a file path:
client.files.upload(parameters: { file: "path/to/file.pdf", purpose: "assistants" })
or a File object:
my_file = File.open("path/to/file.pdf", "rb")
client.files.upload(parameters: { file: my_file, purpose: "assistants" })
See supported file types on API documentation.
Finetunes
Upload your fine-tuning data in a .jsonl file as above and get its ID:
response = client.files.upload(parameters: { file: "path/to/sarcasm.jsonl", purpose: "fine-tune" })
file_id = response["id"] # The upload call returns the parsed response, as in the other examples
You can then use this file ID to create a fine tuning job:
response = client.finetunes.create(
parameters: {
training_file: file_id,
model: "gpt-4o"
})
fine_tune_id = response["id"]
That will give you the fine-tune ID. If you made a mistake you can cancel the fine-tune model before it is processed:
client.finetunes.cancel(id: fine_tune_id)
You may need to wait a short time for processing to complete. Once processed, you can use list or retrieve to get the name of the fine-tuned model:
client.finetunes.list
response = client.finetunes.retrieve(id: fine_tune_id)
fine_tuned_model = response["fine_tuned_model"]
This fine-tuned model name can then be used in chat completions:
response = client.chat(
parameters: {
model: fine_tuned_model,
messages: [{ role: "user", content: "I love Mondays!" }]
}
)
response.dig("choices", 0, "message", "content")
You can also capture the events for a job:
client.finetunes.list_events(id: fine_tune_id)
Vector Stores
Vector Store objects give the File Search tool the ability to search your files.
You can create a new vector store:
response = client.vector_stores.create(
parameters: {
name: "my vector store",
file_ids: ["file-abc123", "file-def456"]
}
)
vector_store_id = response["id"]
Given a vector_store_id you can retrieve the current field values:
client.vector_stores.retrieve(id: vector_store_id)
You can get a list of all vector stores currently available under the organization:
client.vector_stores.list
You can modify an existing vector store, except for the file_ids:
response = client.vector_stores.modify(
id: vector_store_id,
parameters: {
name: "Modified Test Vector Store",
}
)
You can delete vector stores:
client.vector_stores.delete(id: vector_store_id)
Vector Store Files
Vector store files represent files inside a vector store.
You can create a new vector store file by attaching a File to a vector store.
response = client.vector_store_files.create(
vector_store_id: "vector-store-abc123",
parameters: {
file_id: "file-abc123"
}
)
vector_store_file_id = response["id"]
Given a vector_store_file_id you can retrieve the current field values:
client.vector_store_files.retrieve(
vector_store_id: "vector-store-abc123",
id: vector_store_file_id
)
You can get a list of all vector store files currently available under the vector store:
client.vector_store_files.list(vector_store_id: "vector-store-abc123")
You can delete a vector store file:
client.vector_store_files.delete(
vector_store_id: "vector-store-abc123",
id: vector_store_file_id
)
Note: This will remove the file from the vector store but the file itself will not be deleted. To delete the file, use the delete file endpoint.
Vector Store File Batches
Vector store file batches represent operations to add multiple files to a vector store.
You can create a new vector store file batch by attaching multiple Files to a vector store.
response = client.vector_store_file_batches.create(
vector_store_id: "vector-store-abc123",
parameters: {
file_ids: ["file-abc123", "file-def456"]
}
)
file_batch_id = response["id"]
Given a file_batch_id you can retrieve the current field values:
client.vector_store_file_batches.retrieve(
vector_store_id: "vector-store-abc123",
id: file_batch_id
)
You can get a list of all vector store files in a batch currently available under the vector store:
client.vector_store_file_batches.list(
vector_store_id: "vector-store-abc123",
id: file_batch_id
)
You can cancel a vector store file batch (This attempts to cancel the processing of files in this batch as soon as possible):
client.vector_store_file_batches.cancel(
vector_store_id: "vector-store-abc123",
id: file_batch_id
)
Assistants
Assistants are stateful actors that can have many conversations and use tools to perform tasks (see Assistant Overview).
To create a new assistant:
response = client.assistants.create(
parameters: {
model: "gpt-4o",
name: "OpenAI-Ruby test assistant",
description: nil,
instructions: "You are a Ruby dev bot. When asked a question, write and run Ruby code to answer the question",
tools: [
{ type: "code_interpreter" },
{ type: "file_search" }
],
tool_resources: {
code_interpreter: {
file_ids: [] # See Files section above for how to upload files
},
file_search: {
vector_store_ids: [] # See Vector Stores section above for how to add vector stores
}
},
"metadata": { my_internal_version_id: "1.0.0" }
}
)
assistant_id = response["id"]
Given an assistant_id you can retrieve the current field values:
client.assistants.retrieve(id: assistant_id)
You can get a list of all assistants currently available under the organization:
client.assistants.list
You can modify an existing assistant using the assistant's id (see API documentation):
response = client.assistants.modify(
id: assistant_id,
parameters: {
name: "Modified Test Assistant for OpenAI-Ruby",
metadata: { my_internal_version_id: '1.0.1' }
}
)
You can delete assistants:
client.assistants.delete(id: assistant_id)
Threads and Messages
Once you have created an assistant as described above, you need to prepare a Thread of Messages for the assistant to work on (see introduction on Assistants). For example, as an initial setup you could do:
# Create thread
response = client.threads.create # Note: Once you create a thread, there is no way to list it
# or recover it currently (as of 2023-12-10). So hold onto the `id`
thread_id = response["id"]
# Add initial message from user (see https://platform.openai.com/docs/api-reference/messages/createMessage)
message_id = client.messages.create(
thread_id: thread_id,
parameters: {
role: "user", # Required for manually created messages
content: "Can you help me write an API library to interact with the OpenAI API please?"
}
)["id"]
# Retrieve individual message
message = client.messages.retrieve(thread_id: thread_id, id: message_id)
# Review all messages on the thread
messages = client.messages.list(thread_id: thread_id)
To clean up after a thread is no longer needed:
# To delete the thread (and all associated messages):
client.threads.delete(id: thread_id)
client.messages.retrieve(thread_id: thread_id, id: message_id) # -> Fails after thread is deleted
Runs
To submit a thread to be evaluated with the model of an assistant, create a Run as follows:
# Create run (will use instruction/model/tools from Assistant's definition)
response = client.runs.create(
thread_id: thread_id,
parameters: {
assistant_id: assistant_id,
max_prompt_tokens: 256,
max_completion_tokens: 16
}
)
run_id = response['id']
You can stream the message chunks as they come through:
client.runs.create(
thread_id: thread_id,
parameters: {
assistant_id: assistant_id,
max_prompt_tokens: 256,
max_completion_tokens: 16,
stream: proc do |chunk, _bytesize|
if chunk["object"] == "thread.message.delta"
print chunk.dig("delta", "content", 0, "text", "value")
end
end
}
)
To get the status of a Run:
response = client.runs.retrieve(id: run_id, thread_id: thread_id)
status = response['status']
The status response can include the following strings: queued, in_progress, requires_action, cancelling, cancelled, failed, completed, or expired, which you can handle as follows:
while true do
response = client.runs.retrieve(id: run_id, thread_id: thread_id)
status = response['status']
case status
when 'queued', 'in_progress', 'cancelling'
puts 'Sleeping'
sleep 1 # Wait one second and poll again
when 'completed'
break # Exit loop and report result to user
when 'requires_action'
# Handle tool calls (see below)
when 'cancelled', 'failed', 'expired'
puts response['last_error'].inspect
break # or `exit`
else
puts "Unknown status response: #{status}"
end
end
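The loop above polls indefinitely. A bounded variant might look like the sketch below, which stops after a fixed number of attempts. wait_for_run, its parameters and the canned statuses are assumptions for illustration; the status fetcher is passed in as a callable so the example runs without an API call.

```ruby
# Hypothetical helper: polls until the run reaches a status that needs
# attention (finished, failed, or waiting on tool outputs), or gives up.
# fetch_status is any callable that returns the current status string.
def wait_for_run(fetch_status, max_attempts: 60, interval: 1,
                 stop_statuses: %w[completed requires_action cancelled failed expired])
  max_attempts.times do
    status = fetch_status.call
    return status if stop_statuses.include?(status)

    sleep interval # queued, in_progress or cancelling: wait and poll again
  end
  raise "Run did not reach a stopping status after #{max_attempts} attempts"
end

# Demo with canned statuses instead of a real API call.
canned = %w[queued in_progress completed]
final_status = wait_for_run(-> { canned.shift }, interval: 0)
puts final_status
```

With a real client the callable could wrap the retrieve call shown above, for example -> { client.runs.retrieve(id: run_id, thread_id: thread_id)["status"] }.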
If the status response indicates that the run is completed, the associated thread will have one or more new messages attached:
# Either retrieve all messages in bulk again, or...
messages = client.messages.list(thread_id: thread_id, parameters: { order: 'asc' })
# Alternatively retrieve the `run steps` for the run which link to the messages:
run_steps = client.run_steps.list(thread_id: thread_id, run_id: run_id, parameters: { order: 'asc' })
new_message_ids = run_steps['data'].filter_map do |step|
if step['type'] == 'message_creation'
step.dig('step_details', "message_creation", "message_id")
end # Ignore tool calls, because they don't create new messages.
end
# Retrieve the individual messages
new_messages = new_message_ids.map do |msg_id|
client.messages.retrieve(id: msg_id, thread_id: thread_id)
end
# Find the actual response text in the content array of the messages
new_messages.each do |msg|
msg['content'].each do |content_item|
case content_item['type']
when 'text'
puts content_item.dig('text', 'value')
# Also handle annotations
when 'image_file'
# Use File endpoint to retrieve file contents via id
id = content_item.dig('image_file', 'file_id')
end
end
end
You can also update the metadata on messages, including messages that come from the assistant.
metadata = {
user_id: "abc123"
}
message = client.messages.modify(
id: message_id,
thread_id: thread_id,
parameters: { metadata: metadata },
)
At any time you can list all runs which have been performed on a particular thread or are currently running:
client.runs.list(thread_id: thread_id, parameters: { order: "asc", limit: 3 })
Create and Run
You can also create a thread and run in one call like this:
response = client.runs.create_thread_and_run(parameters: { assistant_id: assistant_id })
run_id = response['id']
thread_id = response['thread_id']
Vision in a thread
You can include images in a thread and they will be described & read by the LLM. In this example I'm using this file:
require "openai"
# Make a client
client = OpenAI::Client.new(
access_token: "access_token_goes_here",
log_errors: true # Don't log errors in production.
)
# Upload image as a file
file_id = client.files.upload(
parameters: {
file: "path/to/example.png",
purpose: "assistants",
}
)["id"]
# Create assistant (You could also use an existing one here)
assistant_id = client.assistants.create(
parameters: {
model: "gpt-4o",
name: "Image reader",
instructions: "You are an image describer. You describe the contents of images.",
}
)["id"]
# Create thread
thread_id = client.threads.create["id"]
# Add image in message
client.messages.create(
thread_id: thread_id,
parameters: {
role: "user", # Required for manually created messages
content: [
{
"type": "text",
"text": "What's in this image?"
},
{
"type": "image_file",
"image_file": { "file_id": file_id }
}
]
}
)
# Run thread
run_id = client.runs.create(
thread_id: thread_id,
parameters: { assistant_id: assistant_id }
)["id"]
# Wait until the run is complete
status = nil
until status == "completed" do
sleep(0.1)
status = client.runs.retrieve(id: run_id, thread_id: thread_id)['status']
end
# Get the response
messages = client.messages.list(thread_id: thread_id, parameters: { order: 'asc' })
messages.dig("data", -1, "content", 0, "text", "value")
=> "The image contains a placeholder graphic with a tilted, stylized representation of a postage stamp in the top part, which includes an abstract landscape with hills and a sun. Below the stamp, in the middle of the image, there is italicized text in a light golden color that reads, \"This is just an example.\" The background is a light pastel shade, and a yellow border frames the entire image."
Runs involving function tools
In case you are allowing the assistant to access function tools (they are defined in the same way as functions during chat completion), you might get a status of requires_action when the assistant wants you to evaluate one or more function tools:
def get_current_weather(location:, unit: "celsius")
# Your function code goes here
if location =~ /San Francisco/i
return unit == "celsius" ? "The weather is nice 🌞 at 27°C" : "The weather is nice 🌞 at 80°F"
else
return unit == "celsius" ? "The weather is icy 🥶 at -5°C" : "The weather is icy 🥶 at 23°F"
end
end
if status == 'requires_action'
tools_to_call = response.dig('required_action', 'submit_tool_outputs', 'tool_calls')
my_tool_outputs = tools_to_call.map { |tool|
# Call the functions based on the tool's name
function_name = tool.dig('function', 'name')
arguments = JSON.parse(
tool.dig("function", "arguments"),
{ symbolize_names: true },
)
tool_output = case function_name
when "get_current_weather"
get_current_weather(**arguments)
end
{
tool_call_id: tool['id'],
output: tool_output,
}
}
client.runs.submit_tool_outputs(
thread_id: thread_id,
run_id: run_id,
parameters: { tool_outputs: my_tool_outputs }
)
end
Note that you have 10 minutes to submit your tool output before the run expires.
Exploring chunks used in File Search
Take a deep breath. You might need a drink for this one.
It's possible for OpenAI to share which chunks it used in its internal RAG pipeline to create its file search results.
An example spec can be found here that does this, just so you know it's possible.
Here's how to get the chunks used in a file search. In this example I'm using this file:
require "openai"
# Make a client
client = OpenAI::Client.new(
access_token: "access_token_goes_here",
log_errors: true # Don't log errors in production.
)
# Upload your file(s)
file_id = client.files.upload(
parameters: {
file: "path/to/somatosensory.pdf",
purpose: "assistants"
}
)["id"]
# Create a vector store to store the vectorised file(s)
vector_store_id = client.vector_stores.create(parameters: {})["id"]
# Vectorise the file(s)
vector_store_file_id = client.vector_store_files.create(
vector_store_id: vector_store_id,
parameters: { file_id: file_id }
)["id"]
# Check that the file is vectorised (wait for status to be "completed")
client.vector_store_files.retrieve(vector_store_id: vector_store_id, id: vector_store_file_id)["status"]
# Create an assistant, referencing the vector store
assistant_id = client.assistants.create(
parameters: {
model: "gpt-4o",
name: "Answer finder",
instructions: "You are a file search tool. Find the answer in the given files, please.",
tools: [
{ type: "file_search" }
],
tool_resources: {
file_search: {
vector_store_ids: [vector_store_id]
}
}
}
)["id"]
# Create a thread with your question
thread_id = client.threads.create(parameters: {
messages: [
{ role: "user",
content: "Find the description of a nociceptor." }
]
})["id"]
# Run the thread to generate the response. Include the "GIVE ME THE CHUNKS" incantation.
run_id = client.runs.create(
thread_id: thread_id,
parameters: {
assistant_id: assistant_id
},
query_parameters: { include: ["step_details.tool_calls[*].file_search.results[*].content"] } # incantation
)["id"]
# Get the steps that happened in the run
steps = client.run_steps.list(
thread_id: thread_id,
run_id: run_id,
parameters: { order: "asc" }
)
# Retrieve all the steps. Include the "GIVE ME THE CHUNKS" incantation again.
steps = steps["data"].map do |step|
client.run_steps.retrieve(
thread_id: thread_id,
run_id: run_id,
id: step["id"],
parameters: { include: ["step_details.tool_calls[*].file_search.results[*].content"] } # incantation
)
end
# Now we've got the chunk info, buried deep. Loop through the steps and find chunks if included:
chunks = steps.flat_map do |step|
included_results = step.dig("step_details", "tool_calls", 0, "file_search", "results")
next if included_results.nil? || included_results.empty?
included_results.flat_map do |result|
result["content"].map do |content|
content["text"]
end
end
end.compact
# The first chunk will be the closest match to the prompt. Finally, if you want to view the completed message(s):
client.messages.list(thread_id: thread_id)
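The example above says to wait for the vector store file status to be "completed" but only checks it once. Here is a minimal sketch of a polling loop for that step. wait_for_status is a hypothetical helper, not part of ruby-openai, and the attempt count and interval are arbitrary defaults.

```ruby
# Hypothetical helper: call the block until it returns one of the wanted
# statuses, sleeping between attempts. Raises if the attempts run out.
def wait_for_status(wanted: ["completed"], attempts: 30, interval: 1)
  attempts.times do
    status = yield
    return status if wanted.include?(status)

    sleep(interval)
  end
  raise "Timed out waiting for status #{wanted.join(' or ')}"
end

# Usage with the variables from the example above (makes real API calls):
#
#   wait_for_status do
#     client.vector_store_files.retrieve(
#       vector_store_id: vector_store_id,
#       id: vector_store_file_id
#     )["status"]
#   end
```

The same helper could wrap client.runs.retrieve to wait for a run to finish before listing its steps.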
Image Generation
Generate images using DALL·E 2 or DALL·E 3!
DALL·E 2
For DALL·E 2 the size of any generated images must be one of 256x256, 512x512 or 1024x1024. If not specified, the image will default to 1024x1024.
response = client.images.generate(
parameters: {
prompt: "A baby sea otter cooking pasta wearing a hat of some sort",
size: "256x256",
}
)
puts response.dig("data", 0, "url")
# => "https://oaidalleapiprodscus.blob.core.windows.net/private/org-Rf437IxKhh..."
DALL·E 3
For DALL·E 3 the size of any generated images must be one of 1024x1024, 1024x1792 or 1792x1024. Additionally, the quality of the image can be set to either standard or hd.
response = client.images.generate(
parameters: {
prompt: "A springer spaniel cooking pasta wearing a hat of some sort",
model: "dall-e-3",
size: "1024x1792",
quality: "standard",
}
)
puts response.dig("data", 0, "url")
# => "https://oaidalleapiprodscus.blob.core.windows.net/private/org-Rf437IxKhh..."
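Since the allowed sizes differ per model, it can help to check the size locally before spending an API call. This is a minimal sketch using the sizes listed in this README. ALLOWED_IMAGE_SIZES and validate_image_size! are hypothetical names, and "dall-e-2" as the model identifier is an assumption (only "dall-e-3" appears in the examples here).

```ruby
# Sizes per model as listed in this README. The "dall-e-2" key is an assumption.
ALLOWED_IMAGE_SIZES = {
  "dall-e-2" => %w[256x256 512x512 1024x1024],
  "dall-e-3" => %w[1024x1024 1024x1792 1792x1024]
}.freeze

# Hypothetical helper: return the size if it is allowed for the model,
# otherwise raise ArgumentError with the list of valid sizes.
def validate_image_size!(model, size)
  allowed = ALLOWED_IMAGE_SIZES.fetch(model) do
    raise ArgumentError, "Unknown model: #{model}"
  end
  return size if allowed.include?(size)

  raise ArgumentError, "#{size} is not valid for #{model}; use one of #{allowed.join(', ')}"
end
```

You could then pass size: validate_image_size!("dall-e-3", "1024x1792") in the parameters for client.images.generate.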
Image Edit
Fill in the transparent part of an image, or upload a mask with transparent sections to indicate the parts of an image that can be changed according to your prompt...
response = client.images.edit(
parameters: {
prompt: "A solid red Ruby on a blue background",
image: "image.png",
mask: "mask.png",
}
)
puts response.dig("data", 0, "url")
# => "https://oaidalleapiprodscus.blob.core.windows.net/private/org-Rf437IxKhh..."
Image Variations
Create n variations of an image.
response = client.images.variations(parameters: { image: "image.png", n: 2 })
puts response.dig("data", 0, "url")
# => "https://oaidalleapiprodscus.blob.core.windows.net/private/org-Rf437IxKhh..."
Moderations
Pass a string to check if it violates OpenAI's Content Policy:
response = client.moderations(parameters: { input: "I'm worried about that." })
puts response.dig("results", 0, "category_scores", "hate")
# => 5.505014632944949e-05
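Rather than reading one score at a time, you may want every category whose score crosses a threshold. Here is a minimal sketch; categories_over is a hypothetical helper and the 0.5 threshold is arbitrary. The sample hash is hand-written for illustration (only the "hate" score comes from the example above), so treat it as a stand-in for a real response.

```ruby
# Hypothetical helper: names of the categories whose score is at or above
# the threshold in the first moderation result.
def categories_over(response, threshold)
  scores = response.dig("results", 0, "category_scores") || {}
  scores.select { |_category, score| score >= threshold }.keys
end

# Hand-written sample shaped like the response used above; the "violence"
# value is invented for illustration only.
sample_response = {
  "results" => [
    { "category_scores" => { "hate" => 5.505014632944949e-05, "violence" => 0.91 } }
  ]
}

puts categories_over(sample_response, 0.5).inspect
```

With a live client you would pass the response from client.moderations instead of the sample hash.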
Whisper
Whisper is a speech-to-text model that can be used to generate text based on audio files:
Translate
The translations API takes as input the audio file in any of the supported languages and transcribes the audio into English.
response = client.audio.translate(
parameters: {
model: "whisper-1",
file: File.open("path_to_file", "rb"),
}
)
puts response["text"]
# => "Translation of the text"
Transcribe
The transcriptions API takes as input the audio file you want to transcribe and returns the text in the desired output file format.
You can pass the language of the audio file to improve transcription quality. Supported languages are listed here. You need to provide the language as an ISO-639-1 code, e.g. "en" for English or "ne" for Nepali. You can look up the codes here.
response = client.audio.transcribe(
parameters: {
model: "whisper-1",
file: File.open("path_to_file", "rb"),
language: "en", # Optional
}
)
puts response["text"]
# => "Transcription of the text"
Speech
The speech API takes as input the text and a voice and returns the content of an audio file you can listen to.
response = client.audio.speech(
parameters: {
model: "tts-1",
input: "This is a speech test!",
voice: "alloy",
response_format: "mp3", # Optional
speed: 1.0, # Optional
}
)
File.binwrite('demo.mp3', response)
# => mp3 file that plays: "This is a speech test!"
Errors
HTTP errors can be caught like this:
begin
OpenAI::Client.new.models.retrieve(id: "gpt-4o")
rescue Faraday::Error => e
raise "Got a Faraday error: #{e}"
end
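For transient failures you may prefer to retry before giving up. This is a minimal sketch of a retry wrapper with exponential backoff. with_retries is a hypothetical helper, not part of ruby-openai, and the attempt count and delays are arbitrary. It takes the error class as an argument so it works with Faraday::Error or anything else you want to rescue.

```ruby
# Hypothetical helper: run the block, retrying when error_class is raised.
# Waits base_delay, then 2x, then 4x... between attempts, and re-raises the
# last error once the attempts are used up.
def with_retries(error_class, attempts: 3, base_delay: 0.5)
  tries = 0
  begin
    yield
  rescue error_class
    tries += 1
    raise if tries >= attempts

    sleep(base_delay * (2**(tries - 1)))
    retry
  end
end

# Usage with the client from the example above (makes real API calls):
#
#   with_retries(Faraday::Error) do
#     OpenAI::Client.new.models.retrieve(id: "gpt-4o")
#   end
```

Retrying every Faraday::Error is a blunt choice; in practice you might only retry timeouts or rate-limit responses.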
Development
After checking out the repo, run bin/setup to install dependencies. You can run bin/console for an interactive prompt that will allow you to experiment.
To install this gem onto your local machine, run bundle exec rake install.
To run all tests, execute the command bundle exec rake, which will also run the linter (Rubocop). This repository uses VCR to log API requests.
[!WARNING] If you have an OPENAI_ACCESS_TOKEN in your ENV, running the specs will use this to run the specs against the actual API, which will be slow and cost you money - 2 cents or more! Remove it from your environment with unset or similar if you just want to run the specs against the stored VCR responses.
Release
First run the specs without VCR so they actually hit the API. This will cost 2 cents or more. Set OPENAI_ACCESS_TOKEN in your environment or pass it in like this:
OPENAI_ACCESS_TOKEN=123abc bundle exec rspec
Then update the version number in version.rb, update CHANGELOG.md, run bundle install to update Gemfile.lock, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.
Contributing
Bug reports and pull requests are welcome on GitHub at https://github.com/alexrudall/ruby-openai. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the code of conduct.
License
The gem is available as open source under the terms of the MIT License.
Code of Conduct
Everyone interacting in the Ruby OpenAI project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.