ChatARKit: Using ChatGPT to Create AR Experiences with Natural Language

Copyright 2022-2023 Bart Trzynadlowski

Demo Video

Click here for a demo video.

A car placed using ChatARKit. A frog placed using ChatARKit.

Overview

ChatARKit is an experiment to see whether ChatGPT can be harnessed to write code using custom user-defined APIs. You can speak a prompt asking ChatARKit to place objects of a certain type on nearby planes and perform some basic manipulations of their position, scale, and orientation. More interactions could readily be added. Here are some sample prompts to try:

Full disclosure: the demo video represents some of the best results I got. Performance is generally worse. ChatGPT produces highly variable results for identical prompts. It frequently injects JavaScript functions that are not present in my JavaScriptCore execution context and misinterprets the relationship between the user, cameraPosition, and planes. Sometimes it converts object descriptions into code-like identifiers (e.g., "school bus" becomes "schoolBus"), which breaks the Sketchfab querying logic. Much of this is fixable, and it's a good idea to examine the generated code, which the iOS app prints to the console.
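One way to guard against the identifier problem is to normalize a name before it reaches the Sketchfab query. The following is a minimal sketch, not code from the app: toSearchTerm is a hypothetical helper, and where it would hook into the querying logic is an assumption.

```javascript
// Hypothetical helper (not part of ChatARKit): turn a code-like identifier
// such as "schoolBus" or "school_bus" back into a plain-language search term.
function toSearchTerm(name) {
  return name
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2") // split camelCase boundaries
    .replace(/[_-]+/g, " ")                 // treat underscores and hyphens as spaces
    .replace(/\s+/g, " ")                   // collapse repeated whitespace
    .trim()
    .toLowerCase();
}
```

With this helper, toSearchTerm("schoolBus") returns "school bus", while a plain description such as "frog" passes through unchanged.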

I encourage interested developers to contact me and discuss ways this could be improved and turned into a more robust demo. And if you have any other fun ideas for AI projects, I'm always up to chat :)

ChatARKit makes use of the following projects:

Usage Instructions

1. Download the Whisper Model Weights

Before opening the iOS project, make sure to download the required Whisper model. From the root of the repository directory on macOS, run:

curl -L --output iOS/ChatARKit/ChatARKit/ggml-base.en.bin https://huggingface.co/datasets/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin

2. Obtain an OpenAI API Key

The initial release of ChatARKit used a Python bridge to interact with the ChatGPT web site using Chromium. Now, the entire app is self-contained and uses the OpenAI API directly. This is much simpler and faster but requires a paid account. Obtaining a new API key is straightforward from the account management page, shown below. Open iOS/ChatARKit/ChatARKit/APIKeys.swift and paste the key into the openAI string.

OpenAI API Key

If you are nervous about how much this will cost, note that I racked up a bill of only $0.08 after 52 queries. Set aggressive hard limits to be safe (mine is $32.00).

3. Obtain a Sketchfab API Token

Sketchfab is used to fetch 3D assets (except "cube", which is handled natively). Sign up for a free user account and then find your API token under your profile settings. Open iOS/ChatARKit/ChatARKit/APIKeys.swift and paste the token into the sketchfab string.

The profile settings can be found by clicking on your user icon in the upper right. Then, click Edit Profile.

Sketchfab Edit Profile location

The API token is under Passwords & API.

Sketchfab Token

4. Launch ChatARKit on iPhone and Try a Prompt!

Open iOS/ChatARKit/ChatARKit.xcodeproj and deploy ChatARKit to your iPhone.

Look around to ensure some planes are detected and then press the Record button and speak a prompt. For example: "Place a cube on the nearest plane." Press Stop when finished to parse the result. Receiving code from ChatGPT and then executing it can take a while. Be patient. Downloading Sketchfab models is usually the most time-consuming part of the process and it often fails. Currently, no status or error messages are printed on-screen. Running the app connected to Xcode to inspect debug output can be helpful in diagnosing problems.

How It Works

The ChatARKit iOS app has a few key parts. It sets up a JavaScript environment that runs scripts which create entities (objects with a visual representation that can be manipulated). The important source files to understand are:

The following methods and properties are exported to JavaScript:

It is interesting to note that ChatGPT is usually smart enough to implement the functionality of getNearestPlane() and sometimes even getGroundPlane() on its own. However, it occasionally gets spectacularly confused and will do something nonsensical. Luckily, it will usually end up using functions that are described to it in the prompt. ChatGPT also has a nasty habit of defining functions after it has used them, which produces code that fails to run.
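The define-after-use failure depends on how the function is written. The sketch below is illustrative only: the source does not say which form ChatGPT emitted, and tryRun and the three snippets are hypothetical names made up for this example. It shows that a hoisted function declaration tolerates being called early, while a function bound with const or var does not.

```javascript
// Runs a snippet; returns null on success, otherwise the name of the error thrown.
function tryRun(source) {
  try {
    new Function(source)();
    return null;
  } catch (error) {
    return error.name;
  }
}

// A function declaration is hoisted, so calling it before its definition works.
const declaredLater = `
  const size = double(2);
  function double(x) { return x * 2; }
`;

// A const-bound arrow function is uninitialized until its line executes,
// so the earlier call throws a ReferenceError.
const constAssignedLater = `
  const size = double(2);
  const double = (x) => x * 2;
`;

// With var, the name exists but is still undefined at the call, giving a TypeError.
const varAssignedLater = `
  var size = double(2);
  var double = function (x) { return x * 2; };
`;
```

Here tryRun(declaredLater) returns null, tryRun(constAssignedLater) returns "ReferenceError", and tryRun(varAssignedLater) returns "TypeError". If generated code fails this way, moving the definitions above their first use is one possible fix.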

Common sources of problems:

Future Work