cui-llama.rn

This is a fork of llama.rn meant for ChatterUI.

This fork exists to update llama.cpp more frequently and to add features that are useful to ChatterUI.

The following features have been added for Android:

There is no iOS implementation for these features.

Original repo README.md below.

llama.rn

React Native binding of llama.cpp.

llama.cpp: Inference of LLaMA model in pure C/C++

Installation

npm install llama.rn

iOS

Please re-run npx pod-install.

Android

Add a ProGuard rule if ProGuard is enabled in the project (android/app/proguard-rules.pro):

# llama.rn
-keep class com.rnllama.** { *; }

Obtain the model

You can search HuggingFace for available models (Keyword: GGUF).

To get a GGUF model or quantize one manually, see the Prepare and Quantize section in llama.cpp.
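
As a quick sanity check before loading a downloaded file, you can confirm that it starts with the four ASCII bytes "GGUF", which is the magic number of the GGUF format. The sketch below is only an illustration: `looksLikeGguf` is a hypothetical helper, not part of llama.rn, and it assumes your app has already read the first bytes of the file by whatever file API it uses.

```typescript
// Hypothetical helper (not part of llama.rn): checks the 4-byte GGUF magic.
// Assumes the caller has already read the first bytes of the model file.
function looksLikeGguf(header: Uint8Array): boolean {
  const magic = [0x47, 0x47, 0x55, 0x46] // ASCII "GGUF"
  if (header.length < magic.length) return false
  return magic.every((byte, i) => header[i] === byte)
}

// Usage sketch with hand-written bytes (no file access involved)
const ggufHeader = new Uint8Array([0x47, 0x47, 0x55, 0x46, 0x03])
const zipHeader = new Uint8Array([0x50, 0x4b, 0x03, 0x04])
console.log(looksLikeGguf(ggufHeader)) // prints true
console.log(looksLikeGguf(zipHeader)) // prints false
```

A passing check only means the file claims to be GGUF; it does not guarantee that the model loads or fits in memory.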

Usage

import { initLlama } from 'llama.rn'

// Initialize a Llama context with the model (may take a while)
const context = await initLlama({
  model: 'file://<path to gguf model>',
  use_mlock: true,
  n_ctx: 2048,
  n_gpu_layers: 1, // > 0: enable Metal on iOS
  // embedding: true, // use embedding
})

const stopWords = ['</s>', '<|end|>', '<|eot_id|>', '<|end_of_text|>', '<|im_end|>', '<|EOT|>', '<|END_OF_TURN_TOKEN|>', '<|end_of_turn|>', '<|endoftext|>']

// Do chat completion
const msgResult = await context.completion(
  {
    messages: [
      {
        role: 'system',
        content: 'This is a conversation between user and assistant, a friendly chatbot.',
      },
      {
        role: 'user',
        content: 'Hello!',
      },
    ],
    n_predict: 100,
    stop: stopWords,
    // ...other params
  },
  (data) => {
    // This is a partial completion callback
    const { token } = data
  },
)
console.log('Result:', msgResult.text)
console.log('Timings:', msgResult.timings)

// Or do text completion
const textResult = await context.completion(
  {
    prompt:
      'This is a conversation between user and llama, a friendly chatbot. respond in simple markdown.\n\nUser: Hello!\nLlama:',
    n_predict: 100,
    stop: [...stopWords, 'Llama:', 'User:'],
    // ...other params
  },
  (data) => {
    // This is a partial completion callback
    const { token } = data
  },
)
console.log('Result:', textResult.text)
console.log('Timings:', textResult.timings)
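
The partial completion callback in the usage above receives one object per generated token. A minimal sketch of collecting those tokens into a running string, for example to update a chat bubble while the model is still generating, might look like this. `createTokenCollector` and `TokenData` are hypothetical names, not part of llama.rn, and the sketch assumes only the `{ token }` shape shown in the callbacks above.

```typescript
// Hypothetical helper (not part of llama.rn): accumulates streamed tokens.
// Assumes each callback payload carries a string `token` field.
type TokenData = { token: string }

function createTokenCollector(onUpdate?: (textSoFar: string) => void) {
  let text = ''
  return {
    // Intended to be passed as the second argument of context.completion(...)
    onToken(data: TokenData): void {
      text += data.token
      onUpdate?.(text)
    },
    // Returns the text accumulated so far
    getText(): string {
      return text
    },
  }
}

// Usage sketch with simulated tokens (no model required)
const collector = createTokenCollector()
for (const token of ['Hel', 'lo', '!']) {
  collector.onToken({ token })
}
console.log(collector.getText()) // prints "Hello!"
```

Keeping the accumulation in a small closure keeps the callback itself trivial and makes the streaming logic testable without a loaded model.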

The binding’s design is inspired by the server.cpp example in llama.cpp, so you can map its API to LlamaContext.

Please visit the Documentation for more details.

You can also visit the example to see how to use it.

Run the example:

yarn && yarn bootstrap

# iOS
yarn example ios
# Use device
yarn example ios --device "<device name>"
# With release mode
yarn example ios --mode Release

# Android
yarn example android
# With release mode
yarn example android --mode release

This example uses react-native-document-picker to select a model.

Grammar Sampling

GBNF (GGML BNF) is a format for defining formal grammars to constrain model outputs in llama.cpp. For example, you can use it to force the model to generate valid JSON, or speak only in emojis.

See the GBNF Guide for more details.

llama.rn provides a built-in function to convert JSON Schema to GBNF:

import { initLlama, convertJsonSchemaToGrammar } from 'llama.rn'

const schema = {
  /* JSON Schema, see below */
}

const context = await initLlama({
  model: 'file://<path to gguf model>',
  use_mlock: true,
  n_ctx: 2048,
  n_gpu_layers: 1, // > 0: enable Metal on iOS
  // embedding: true, // use embedding
  grammar: convertJsonSchemaToGrammar({
    schema,
    propOrder: { function: 0, arguments: 1 },
  }),
})

const { text } = await context.completion({
  prompt: 'Schedule a birthday party on Aug 14th 2023 at 8pm.',
})
console.log('Result:', text)
// Example output:
// {"function": "create_event","arguments":{"date": "Aug 14th 2023", "time": "8pm", "title": "Birthday Party"}}
<details> <summary>JSON Schema example (Define function get_current_weather / create_event / image_search)</summary>
{
  oneOf: [
    {
      type: 'object',
      name: 'get_current_weather',
      description: 'Get the current weather in a given location',
      properties: {
        function: {
          const: 'get_current_weather',
        },
        arguments: {
          type: 'object',
          properties: {
            location: {
              type: 'string',
              description: 'The city and state, e.g. San Francisco, CA',
            },
            unit: {
              type: 'string',
              enum: ['celsius', 'fahrenheit'],
            },
          },
          required: ['location'],
        },
      },
    },
    {
      type: 'object',
      name: 'create_event',
      description: 'Create a calendar event',
      properties: {
        function: {
          const: 'create_event',
        },
        arguments: {
          type: 'object',
          properties: {
            title: {
              type: 'string',
              description: 'The title of the event',
            },
            date: {
              type: 'string',
              description: 'The date of the event',
            },
            time: {
              type: 'string',
              description: 'The time of the event',
            },
          },
          required: ['title', 'date', 'time'],
        },
      },
    },
    {
      type: 'object',
      name: 'image_search',
      description: 'Search for an image',
      properties: {
        function: {
          const: 'image_search',
        },
        arguments: {
          type: 'object',
          properties: {
            query: {
              type: 'string',
              description: 'The search query',
            },
          },
          required: ['query'],
        },
      },
    },
  ],
}
</details> <details> <summary>Converted GBNF looks like</summary>
space ::= " "?
0-function ::= "\"get_current_weather\""
string ::=  "\"" (
        [^"\\] |
        "\\" (["\\/bfnrt] | "u" [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F] [0-9a-fA-F])
      )* "\"" space
0-arguments-unit ::= "\"celsius\"" | "\"fahrenheit\""
0-arguments ::= "{" space "\"location\"" space ":" space string "," space "\"unit\"" space ":" space 0-arguments-unit "}" space
0 ::= "{" space "\"function\"" space ":" space 0-function "," space "\"arguments\"" space ":" space 0-arguments "}" space
1-function ::= "\"create_event\""
1-arguments ::= "{" space "\"date\"" space ":" space string "," space "\"time\"" space ":" space string "," space "\"title\"" space ":" space string "}" space
1 ::= "{" space "\"function\"" space ":" space 1-function "," space "\"arguments\"" space ":" space 1-arguments "}" space
2-function ::= "\"image_search\""
2-arguments ::= "{" space "\"query\"" space ":" space string "}" space
2 ::= "{" space "\"function\"" space ":" space 2-function "," space "\"arguments\"" space ":" space 2-arguments "}" space
root ::= 0 | 1 | 2
</details>
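
Because the grammar constrains the output to the schema, the completion text can usually be parsed as JSON and dispatched on its `function` field. The sketch below assumes the `{ function, arguments }` output shape from the example above. `parseFunctionCall` and `FunctionCall` are hypothetical names, not llama.rn APIs. The parse is guarded because generation can still stop before the JSON is complete, for example when a token limit is reached.

```typescript
// Hypothetical helper (not part of llama.rn): parses grammar-constrained output.
// Assumes the { function, arguments } shape produced by the schema above.
type FunctionCall = { function: string; arguments: Record<string, unknown> }

function parseFunctionCall(text: string): FunctionCall | null {
  let parsed: unknown
  try {
    parsed = JSON.parse(text)
  } catch {
    return null // not valid JSON, e.g. generation stopped early
  }
  if (typeof parsed !== 'object' || parsed === null) return null
  const candidate = parsed as Record<string, unknown>
  const args = candidate.arguments
  if (typeof candidate.function !== 'string') return null
  if (typeof args !== 'object' || args === null || Array.isArray(args)) return null
  return { function: candidate.function, arguments: args as Record<string, unknown> }
}

// Usage sketch with the example output shown earlier
const call = parseFunctionCall(
  '{"function": "create_event","arguments":{"date": "Aug 14th 2023", "time": "8pm", "title": "Birthday Party"}}',
)
console.log(call?.function) // prints "create_event"
```

Returning `null` instead of throwing lets the caller decide whether to retry the completion or fall back to plain text.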

Mock llama.rn

We provide a mock version of llama.rn that you can use with Jest for testing:

jest.mock('llama.rn', () => require('llama.rn/jest/mock'))

NOTE

iOS:

Android:

Contributing

See the contributing guide to learn how to contribute to the repository and the development workflow.

License

MIT


Made with create-react-native-library


<p align="center"> <a href="https://bricks.tools"> <img width="90px" src="https://avatars.githubusercontent.com/u/17320237?s=200&v=4"> </a> <p align="center"> Built and maintained by <a href="https://bricks.tools">BRICKS</a>. </p> </p>