smol-plugins

Generate OpenAI plugins for your favorite APIs

Quick Start

generate a plugin for the Etherscan API to fetch Ethereum account balances

  1. python main_no_modal.py etherscan-plugin
    this takes a few mins to run
  2. cd generated && pip install -r requirements.txt
    head into the generated repository and install requirements. There may be some dependency version issues; just remove them if that's the case
  3. python app.py
    run your app. There may be some discrepancies between openapi.yaml and app.py routes that need to be fixed.
  4. Test your plugin in the OpenAI interface by pointing it to localhost:3333

If you run into issues, you can see a working example in the etherscan-plugin repository. Note that you will need an Etherscan key.
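The route discrepancies mentioned in step 3 can be spotted with a quick script. This is a minimal sketch, not part of the repository: the function names are hypothetical, it assumes app.py declares routes with decorators such as @app.route("/path") or @app.get("/path"), and it scans openapi.yaml with regular expressions instead of a real YAML parser, so unusual formatting will be missed.

```python
import re


def openapi_paths(yaml_text):
    """Collect the path keys listed under the top-level `paths:` section."""
    paths, in_paths = set(), False
    for line in yaml_text.splitlines():
        if re.match(r"^paths:\s*$", line):
            in_paths = True
            continue
        if in_paths:
            if re.match(r"^\S", line):  # the next top-level key ends the section
                in_paths = False
                continue
            match = re.match(r"^\s+['\"]?(/[^'\":\s]*)['\"]?:\s*$", line)
            if match:
                paths.add(match.group(1))
    return paths


def app_routes(source):
    """Collect literal route strings from @app.<method>("...") decorators."""
    pattern = r"@app\.(?:route|get|post|put|delete)\(\s*['\"]([^'\"]+)['\"]"
    return set(re.findall(pattern, source))


def route_mismatches(yaml_text, source):
    """Report routes that appear on only one side (literal string comparison)."""
    spec, code = openapi_paths(yaml_text), app_routes(source)
    return {
        "missing_in_app": sorted(spec - code),
        "missing_in_spec": sorted(code - spec),
    }


if __name__ == "__main__":
    # Assumes you are inside the generated directory.
    with open("openapi.yaml", encoding="utf-8") as spec_file:
        spec_text = spec_file.read()
    with open("app.py", encoding="utf-8") as app_file:
        app_text = app_file.read()
    print(route_mismatches(spec_text, app_text))
```

Routes that only serve the plugin manifest or the spec itself will likely show up under missing_in_spec; those are expected and can be ignored.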

Generate Your Own Plugin

  1. Run python plugin_generator.py and input an app_name and description. This will auto generate a plugin template.
  2. Run main_no_modal.py on the newly created {app_name}.md file
  3. Repeat the remaining steps from Quick Start.

smol developer

Human-centric & Coherent Whole Program Synthesis aka your own personal junior developer

Build the thing that builds the thing! a smol dev for every dev in every situation

this is a prototype of a "junior developer" agent (aka smol dev) that scaffolds an entire codebase out for you once you give it a product spec, but does not end the world or overpromise AGI. instead of making and maintaining specific, rigid, one-shot starters, like create-react-app, or create-nextjs-app, this is basically create-anything-app where you develop your scaffolding prompt in a tight loop with your smol dev.

AI that is helpful, harmless, and honest is complemented by a codebase that is simple, safe, and smol - <200 lines of Python and Prompts, so this is easy to understand and customize.

<p align="center"> <img height=200 src="https://pbs.twimg.com/media/FwEzVCcaMAE7t4h?format=jpg&name=large" /> </p>

engineering with prompts, rather than prompt engineering

The demo example in prompt.md shows the potential of an AI-enabled, but still firmly human-developer-centric, workflow:

Loop until happiness is attained. Notice that AI is only used as long as it is adding value - once it gets in your way, just take over the codebase from your smol junior developer with no fuss and no hurt feelings. (we could also have smol-dev take over an existing codebase and bootstrap its own prompt... but that's a Future Direction)

Not no code, not low code, but some third thing.

Perhaps a higher order evolution of programming where you still need to be technical, but no longer have to implement every detail at least to scaffold things out.

6 minute video demo

https://youtu.be/UCo7YeTy-aE (sorry for sped up audio, we were optimizing for twitter, bad call)

arch diagram

naturally generated with gpt4, like we did for babyagi image

innovations and insights

Please subscribe to https://latent.space/ for a fuller writeup and insights and reflections

caveats

We were working on a Chrome Extension, which requires images to be generated, so we added some use-case-specific code to skip destroying/regenerating them. We haven't decided how to generalize that yet.

We don't have access to GPT4-32k, but if we did, we'd explore dumping entire API/SDK documentation into context.

The feedback loop is very slow right now (time says about 2-4 mins to generate a program with GPT4, occasionally spiking higher, even with the parallelization Modal provides), but it's a safe bet that it will go down over time (see also "future directions" below).

install

it's basically:

There are no python dependencies to wrangle thanks to using Modal as a self-provisioning runtime.

Unfortunately this project also uses 3 waitlisted things:

yes, the most important skill in being an ai engineer is social engineering to get off waitlists. Modal will let you in if you say the keyword "swyx"

you'll have to adapt this code on a fork if you want to use it on other infra. please open issues/PRs and i'll happily highlight your fork here.

trying the example chrome extension

the /generated and /exampleChromeExtension folders contain a Chrome Manifest V3 extension that reads the current page and offers a popup UI with the page title+content and a textarea for a prompt (with a default value we specify). When the user hits submit, it sends the page title+content to the Anthropic Claude API, along with the up-to-date prompt, to summarize it. The user can modify that prompt and re-send the prompt+content to get another summary view of the content.

this entire extension was generated by the prompt in prompt.md (except for the images), and was built up over time by adding more words to the prompt in an iterative process.

smol dev

basic usage

modal run main.py --prompt "a Chrome extension that, when clicked, opens a small window with a page where you can enter a prompt for reading the currently open page and generating some response from openai"   

after a while of adding to your prompt, you can extract it to a file. as long as your "prompt" ends in a .md extension, we'll go look for that file

modal run main.py --prompt prompt.md   
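The .md rule above could look roughly like this. It is a hedged sketch of the behavior as described, not the actual code in main.py; resolve_prompt is a hypothetical name.

```python
from pathlib import Path


def resolve_prompt(prompt):
    """Return the prompt text, reading it from disk when it names a .md file."""
    if prompt.endswith(".md"):
        # Treat the argument as a file path and use the file's contents.
        return Path(prompt).read_text(encoding="utf-8")
    # Otherwise the argument is the prompt itself.
    return prompt
```

With this shape, resolve_prompt("prompt.md") returns the file's contents, while any other string is passed through unchanged.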

each time you run this, the generated directory is deleted (except for images) and all files are rewritten from scratch.
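A minimal sketch of that "delete everything except images" step, assuming images are recognized by file extension. The extension list and the clean_generated name are illustrative assumptions, not taken from the repository.

```python
import os
from pathlib import Path

# Assumed set of image extensions to preserve between runs.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".svg", ".ico"}


def clean_generated(directory):
    """Delete every non-image file under `directory`; return the removed paths."""
    removed = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = Path(root) / name
            if path.suffix.lower() not in IMAGE_SUFFIXES:
                path.unlink()
                removed.append(str(path))
    return removed
```

Empty subdirectories are left in place here for simplicity, since the files are about to be rewritten anyway.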

The shared_dependencies.md file is a helper file that ensures coherence between files.

if you make a tweak to the prompt and only want it to affect one file, and keep the rest of the files, specify the file param:

modal run main.py --prompt prompt.md  --file popup.js

smol debugger

take the entire contents of the generated directory in context, feed in an error, get a response. this takes advantage of longer (32k-100k) context windows, so we don't have to do any embedding of the source.

modal run debugger.py --prompt "Uncaught (in promise) TypeError: Cannot destructure property 'pageTitle' of '(intermediate value)' as it is undefined.    at init (popup.js:59:11)"

# gpt4
modal run debugger.py --prompt "your_error msg_here" --model=gpt-4
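To illustrate the "whole directory in context" idea, here is a hedged sketch of how such a prompt might be assembled before it is sent to a model. The function names, the prompt layout, and the skipped extensions are assumptions for illustration; debugger.py may do this differently, and the model call itself is omitted.

```python
import os

# Assumed binary/image extensions that should not be pasted into a text prompt.
SKIP_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".ico")


def read_codebase(directory):
    """Map each text file's path (relative to `directory`) to its contents."""
    files = {}
    for root, _dirs, names in os.walk(directory):
        for name in names:
            if name.lower().endswith(SKIP_SUFFIXES):
                continue
            path = os.path.join(root, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                files[os.path.relpath(path, directory)] = handle.read()
    return files


def build_debug_prompt(directory, error):
    """Concatenate every file plus the error message into one prompt string."""
    parts = [
        f"{rel_path}:\n{body}"
        for rel_path, body in sorted(read_codebase(directory).items())
    ]
    return "Codebase:\n\n" + "\n\n".join(parts) + f"\n\nError:\n{error}\n"
```

Because nothing is chunked or embedded, this only works while the whole codebase fits in the model's context window, which is the tradeoff the section describes.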

smol pm

take the entire contents of the generated directory in context, and get a prompt back that could synthesize the whole program. basically smol dev, in reverse.

modal run code2prompt.py # ~0.5 second

# use gpt4
modal run code2prompt.py --model=gpt-4 # 2 mins, MUCH better results

We have done indicative runs of both, stored in code2prompt-gpt3.md vs code2prompt-gpt4.md. Note how incredibly better gpt4 is at prompt engineering its future self.

Naturally, we had to try code2prompt2code...

# add prompt... this needed a few iterations to get right
modal run code2prompt.py --prompt "make sure all the id's of the DOM elements, and the data structure of the page content (stored with {pageTitle, pageContent }) , referenced/shared by the js files match up exactly. take note to only use Chrome Manifest V3 apis. rename the extension to code2prompt2code" --model=gpt-4 # takes 4 mins. produces semi working chrome extension copy based purely on the model-generated description of a different codebase

# must go deeper
modal run main.py --prompt code2prompt-gpt4.md --directory code2prompt2code

We leave the social and technical impacts of multilayer generative deep-frying of codebases as an exercise to the reader.

future directions

things to try/would accept open issue discussions and PRs: