ImaginAIry 🤖🧠

AI imagined images. Pythonic generation of stable diffusion images and videos!

"just works" on Linux and macOS (M1), and sometimes on Windows.

# on macOS, make sure rust is installed first
# be sure to use Python 3.10; Python 3.11 is not supported at the moment
>> pip install imaginairy
>> imagine "a scenic landscape" "a photo of a dog" "photo of a fruit bowl" "portrait photo of a freckled woman" "a bluejay"
# Make an AI video
>> aimg videogen --start-image rocket.png

Stable Video Diffusion

<p float="left"> <img src="docs/assets/svd-rocket.gif" height="190"> <img src="docs/assets/svd-athens.gif" height="190"> <img src="docs/assets/svd-pearl-girl.gif" height="190"> <img src="docs/assets/svd-starry-night.gif" height="190"> <img src="docs/assets/svd-dog.gif" height="190"> <img src="docs/assets/svd-xpbliss.gif" height="190"> </p>

Rushed release of Stable Video Diffusion!

Works with Nvidia GPUs. Does not work on Mac or CPU.

On Windows you'll need to install torch 2.0 first via https://pytorch.org/get-started/locally/

Usage: aimg videogen [OPTIONS]

  AI generate a video from an image

  Example:

      aimg videogen --start-image assets/rocket-wide.png

Options:
  --start-image TEXT       Input path for image file.
  --num-frames INTEGER     Number of frames.
  --num-steps INTEGER      Number of steps.
  --model TEXT             Model to use. One of: svd, svd_xt, svd_image_decoder, svd_xt_image_decoder
  --fps INTEGER            FPS for the AI to target when generating video
  --output-fps INTEGER     FPS for the output video
  --motion-amount INTEGER  How much motion to generate. value between 0 and 255.
  -r, --repeats INTEGER    How many times to repeat the renders.   [default: 1]
  --cond-aug FLOAT         Conditional augmentation.
  --seed INTEGER           Seed for random number generator.
  --decoding_t INTEGER     Number of frames decoded at a time.
  --output_folder TEXT     Output folder.
  --help                   Show this message and exit.
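If you want to script video generation, one option is to assemble the command line in Python. The sketch below is only an illustration: `build_videogen_args` and `KNOWN_FLAGS` are hypothetical helper names, and the flags are copied from the help text above. It builds the argument list without running anything.

```python
import shlex

# Flags copied from the `aimg videogen` help text above.
KNOWN_FLAGS = {
    "--start-image", "--num-frames", "--num-steps", "--model", "--fps",
    "--output-fps", "--motion-amount", "--repeats", "--cond-aug", "--seed",
    "--decoding_t", "--output_folder",
}


def build_videogen_args(options):
    """Return the argument list for `aimg videogen` (hypothetical helper)."""
    args = ["aimg", "videogen"]
    for flag, value in options.items():
        if flag not in KNOWN_FLAGS:
            raise ValueError(f"unknown flag: {flag}")
        args += [flag, str(value)]
    return args


args = build_videogen_args({
    "--start-image": "assets/rocket-wide.png",
    "--model": "svd_xt",
    "--motion-amount": 127,
    "--seed": 1,
})
command = shlex.join(args)
print(command)
```

To actually run the command, pass `args` to `subprocess.run(args, check=True)` on a machine with a supported GPU.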

Images

<p float="left"> <img src="docs/assets/026882_1_ddim50_PS7.5_a_scenic_landscape_[generated].jpg" height="256"> <img src="docs/assets/026884_1_ddim50_PS7.5_photo_of_a_dog_[generated].jpg" height="256"> <img src="docs/assets/026890_1_ddim50_PS7.5_photo_of_a_bowl_of_fruit._still_life_[generated].jpg" height="256"> <img src="docs/assets/026885_1_ddim50_PS7.5_girl_with_a_pearl_earring_[generated].jpg" height="256"> <img src="docs/assets/026891_1_ddim50_PS7.5_close-up_photo_of_a_bluejay_[generated].jpg" height="256"> <img src="docs/assets/026893_1_ddim50_PS7.5_macro_photo_of_a_flower_[generated].jpg" height="256"> </p>

What's New

See full Changelog here

14.3.0

14.2.0

14.1.1

14.1.0

14.0.4

14.0.3

14.0.0

Run API server and StableStudio web interface (alpha)

Generate images via the API or the web interface. The feature set is much smaller than that of the command line tool.

>> aimg server

Visit http://localhost:8000/ and http://localhost:8000/docs

<img src="https://github.com/Stability-AI/StableStudio/blob/a65d4877ad7d309627808a169818f1add8c278ae/misc/GenerateScreenshot.png?raw=true" width="512">

Image Structure Control by ControlNet

(Not supported for SDXL yet)

Generate images guided by body poses, depth maps, canny edges, hed boundaries, or normal maps.

Openpose Control

imagine --control-image assets/indiana.jpg  --control-mode openpose --caption-text openpose "photo of a polar bear"
<p float="left"> <img src="docs/assets/indiana.jpg" height="256"> <img src="docs/assets/indiana-pose.jpg" height="256"> <img src="docs/assets/indiana-pose-polar-bear.jpg" height="256"> </p>

Canny Edge Control

imagine --control-image assets/lena.png  --control-mode canny "photo of a woman with a hat looking at the camera"
<p float="left"> <img src="docs/assets/lena.png" height="256"> <img src="docs/assets/lena-canny.jpg" height="256"> <img src="docs/assets/lena-canny-generated.jpg" height="256"> </p>

HED Boundary Control

imagine --control-image dog.jpg  --control-mode hed  "photo of a dalmatian"
<p float="left"> <img src="docs/assets/000032_337692011_PLMS40_PS7.5_a_photo_of_a_dog.jpg" height="256"> <img src="docs/assets/dog-hed-boundary.jpg" height="256"> <img src="docs/assets/dog-hed-boundary-dalmation.jpg" height="256"> </p>

Depth Map Control

imagine --control-image fancy-living.jpg  --control-mode depth  "a modern living room"
<p float="left"> <img src="docs/assets/fancy-living.jpg" height="256"> <img src="docs/assets/fancy-living-depth.jpg" height="256"> <img src="docs/assets/fancy-living-depth-generated.jpg" height="256"> </p>

Normal Map Control

imagine --control-image bird.jpg  --control-mode normal  "a bird"
<p float="left"> <img src="docs/assets/013986_1_kdpmpp2m59_PS7.5_a_bluejay_[generated].jpg" height="256"> <img src="docs/assets/bird-normal.jpg" height="256"> <img src="docs/assets/bird-normal-generated.jpg" height="256"> </p>

Image Shuffle Control

Generates the image based on elements of the control image. Kind of similar to style transfer.

imagine --control-image pearl-girl.jpg  --control-mode shuffle  "a clown"

The middle image is the "shuffled" input image

<p float="left"> <img src="docs/assets/girl_with_a_pearl_earring.jpg" height="256"> <img src="docs/assets/pearl_shuffle_019331_1_kdpmpp2m15_PS7.5_img2img-0.0_a_clown.jpg" height="256"> <img src="docs/assets/pearl_shuffle_clown_019331_1_kdpmpp2m15_PS7.5_img2img-0.0_a_clown.jpg" height="256"> </p>

Editing Instructions Control

Similar to InstructPix2Pix (below) but works with any SD 1.5 based model.

imagine --control-image pearl-girl.jpg  --control-mode edit --init-image-strength 0.01 --steps 30  --negative-prompt "" --model openjourney-v2 "make it anime" "make it at the beach" 
<p float="left"> <img src="docs/assets/girl_with_a_pearl_earring.jpg" height="256"> <img src="docs/assets/pearl_anime_019537_521829407_kdpmpp2m30_PS9.0_img2img-0.01_make_it_anime.jpg" height="256"> <img src="docs/assets/pearl_beach_019561_862735879_kdpmpp2m30_PS7.0_img2img-0.01_make_it_at_the_beach.jpg" height="256"> </p>

Add Details Control (upscaling/super-resolution)

Replaces existing details in an image. Good to use with --init-image-strength 0.2

imagine --control-image "assets/wishbone.jpg" --control-mode details "sharp focus, high-resolution" --init-image-strength 0.2 --steps 30 -w 2048 -h 2048 
<p float="left"> <img src="docs/assets/wishbone_headshot_badscale.jpg" height="256"> <img src="docs/assets/wishbone_headshot_details.jpg" height="256"> </p>

Image (re)Colorization (using brightness control)

Colorize black and white images or re-color existing images.

The generated colors will be applied back to the original image. You can either provide a caption or allow the tool to generate one for you.

aimg colorize pearl-girl.jpg --caption "photo of a woman"
<p float="left"> <img src="docs/assets/girl_with_a_pearl_earring.jpg" height="256"> <img src="docs/assets/pearl-gray.jpg" height="256"> <img src="docs/assets/pearl-recolor-a.jpg" height="256"> </p>

Instruction based image edits by InstructPix2Pix

(Broken as of 14.0.0)

Just tell imaginairy how to edit the image and it will do it for you!

<p float="left"> <img src="docs/assets/scenic_landscape_winter.jpg" height="256"> <img src="docs/assets/dog_red.jpg" height="256"> <img src="docs/assets/bowl_of_fruit_strawberries.jpg" height="256"> <img src="docs/assets/freckled_woman_cyborg.jpg" height="256"> <img src="docs/assets/014214_51293814_kdpmpp2m30_PS10.0_img2img-1.0_make_the_bird_wear_a_cowboy_hat_[generated].jpg" height="256"> <img src="docs/assets/flower-make-the-flower-out-of-paper-origami.gif" height="256"> <img src="docs/assets/girl-pearl-clown-compare.gif" height="256"> <img src="docs/assets/mona-lisa-headshot-anim.gif" height="256"> <img src="docs/assets/make-it-night-time.gif" height="256"> </p> <details> <summary>Click to see shell commands</summary> Use prompt strength to control how strong the edit is. For extra control you can combine with prompt-based masking.
# enter imaginairy shell
>> aimg
🤖🧠> edit scenic_landscape.jpg -p "make it winter" --prompt-strength 20
🤖🧠> edit dog.jpg -p "make the dog red" --prompt-strength 5
🤖🧠> edit bowl_of_fruit.jpg -p "replace the fruit with strawberries"
🤖🧠> edit freckled_woman.jpg -p "make her a cyborg" --prompt-strength 13
🤖🧠> edit bluebird.jpg -p "make the bird wear a cowboy hat" --prompt-strength 10
🤖🧠> edit flower.jpg -p "make the flower out of paper origami" --arg-schedule prompt-strength[1:11:0.3]  --steps 25 --compilation-anim gif

# create a comparison gif
🤖🧠> edit pearl_girl.jpg -p "make her wear clown makeup" --compare-gif
# create an animation showing the edit with increasing prompt strengths
🤖🧠> edit mona-lisa.jpg -p "make it a color professional photo headshot" --negative-prompt "old, ugly, blurry" --arg-schedule "prompt-strength[2:8:0.5]" --compilation-anim gif
🤖🧠> edit gg-bridge.jpg -p "make it night time" --prompt-strength 15  --steps 30 --arg-schedule prompt-strength[1:15:1] --compilation-anim gif
</details>

Quick Image Edit Demo

Want to just quickly have some fun? Try edit-demo to apply some pre-defined edits.

>> aimg edit-demo pearl_girl.jpg
<p float="left"> <img src="docs/assets/girl_with_a_pearl_earring_suprise.gif" height="256"> <img src="docs/assets/mona-lisa-suprise.gif" height="256"> <img src="docs/assets/luke-suprise.gif" height="256"> <img src="docs/assets/spock-suprise.gif" height="256"> <img src="docs/assets/gg-bridge-suprise.gif" height="256"> <img src="docs/assets/shire-suprise.gif" height="256"> </p>

Prompt Based Masking by clipseg

Specify advanced text based masks using boolean logic and strength modifiers. Mask syntax:

When writing strength modifiers keep in mind that pixel values are between 0 and 1.
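To build intuition for how boolean operators and a `{*n}` strength modifier could combine soft masks whose pixel values lie between 0 and 1, here is a small sketch in plain Python. It assumes OR is a per-pixel maximum, AND a per-pixel minimum, NOT is `1 - value`, and that multiplied values are clamped to 1. These are illustrative assumptions, not a description of imaginairy's internals, and the per-pixel numbers are made up.

```python
def clamp(value):
    """Keep a mask value inside the valid 0..1 range."""
    return max(0.0, min(1.0, value))


def mask_or(a, b):
    # Assumption: OR keeps the stronger of the two masks at each pixel.
    return [max(x, y) for x, y in zip(a, b)]


def mask_and(a, b):
    # Assumption: AND keeps the weaker of the two masks at each pixel.
    return [min(x, y) for x, y in zip(a, b)]


def mask_not(a):
    # Assumption: NOT inverts the strength at each pixel.
    return [1.0 - x for x in a]


def mask_multiply(a, factor):
    # The {*n} modifier: multiply the strength, then clamp back into 0..1.
    return [clamp(x * factor) for x in a]


# Hypothetical strengths of three text masks over four pixels.
face = [0.9, 0.5, 0.25, 0.0]
hair = [0.0, 0.5, 0.75, 0.0]
bandana = [0.0, 0.0, 0.125, 0.5]

# Equivalent of the mask prompt "face AND NOT (hair OR bandana){*2}".
excluded = mask_multiply(mask_or(hair, bandana), 2)
final = mask_and(face, mask_not(excluded))
print(final)
```

Because values are capped at 1, a large multiplier such as `{*6}` mostly turns weak detections into a firm selection rather than making strong ones stronger.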

>> imagine \
    --init-image pearl_earring.jpg \
    --mask-prompt "face AND NOT (bandana OR hair OR blue fabric){*6}" \
    --mask-mode keep \
    --init-image-strength .2 \
    --fix-faces \
    "a modern female president" "a female robot" "a female doctor" "a female firefighter"

<img src="docs/assets/mask_examples/pearl000.jpg" height="200">➡️ <img src="docs/assets/mask_examples/pearl_pres.png" height="200"> <img src="docs/assets/mask_examples/pearl_robot.png" height="200"> <img src="docs/assets/mask_examples/pearl_doctor.png" height="200"> <img src="docs/assets/mask_examples/pearl_firefighter.png" height="200">

>> imagine \
    --init-image fruit-bowl.jpg \
    --mask-prompt "fruit OR fruit stem{*6}" \
    --mask-mode replace \
    --mask-modify-original \
    --init-image-strength .1 \
    "a bowl of kittens" "a bowl of gold coins" "a bowl of popcorn" "a bowl of spaghetti"

<img src="docs/assets/000056_293284644_PLMS40_PS7.5_photo_of_a_bowl_of_fruit.jpg" height="200">➡️ <img src="docs/assets/mask_examples/bowl004.jpg" height="200"> <img src="docs/assets/mask_examples/bowl001.jpg" height="200"> <img src="docs/assets/mask_examples/bowl002.jpg" height="200"> <img src="docs/assets/mask_examples/bowl003.jpg" height="200">

Face Enhancement by CodeFormer

>> imagine "a couple smiling" --steps 40 --seed 1 --fix-faces

<img src="https://github.com/brycedrennan/imaginAIry/raw/master/assets/000178_1_PLMS40_PS7.5_a_couple_smiling_nofix.png" height="256"> ➡️ <img src="https://github.com/brycedrennan/imaginAIry/raw/master/assets/000178_1_PLMS40_PS7.5_a_couple_smiling_fixed.png" height="256">

Image Upscaling

Upscale images easily.

=== "CLI"

aimg upscale assets/000206_856637805_PLMS40_PS7.5_colorful_smoke.jpg --upscale-model real-hat

=== "Python"

```py
from imaginairy.api.upscale import upscale

img = upscale(img="assets/000206_856637805_PLMS40_PS7.5_colorful_smoke.jpg")
img.save("colorful_smoke.upscaled.jpg")
```

<img src="docs/assets/000206_856637805_PLMS40_PS7.5_colorful_smoke.jpg" width="25%" height="auto"> ➡️ <img src="docs/assets/000206_856637805_PLMS40_PS7.5_colorful_smoke_upscaled.jpg" width="50%" height="auto">

Upscaling uses Spandrel to make it easy to use different upscaling models. You can list the integrated models by running aimg upscale --list-models, and then select one with --upscale-model <model-name>. The option also accepts URLs if you want to upscale an image with a different model. Control the new file format/location with --format.

from imaginairy.enhancers.upscale_realesrgan import upscale_image
from PIL import Image

# load the source image, then pass that same object to the upscaler
img = Image.open("my-image.jpg")
big_img = upscale_image(img)

Tiled Images

>> imagine  "gold coins" "a lush forest" "piles of old books" leaves --tile

<img src="docs/assets/000066_801493266_PLMS40_PS7.5_gold_coins.jpg" height="128"><img src="docs/assets/000066_801493266_PLMS40_PS7.5_gold_coins.jpg" height="128"><img src="docs/assets/000066_801493266_PLMS40_PS7.5_gold_coins.jpg" height="128"> <img src="docs/assets/000118_597948545_PLMS40_PS7.5_a_lush_forest.jpg" height="128"><img src="docs/assets/000118_597948545_PLMS40_PS7.5_a_lush_forest.jpg" height="128"><img src="docs/assets/000118_597948545_PLMS40_PS7.5_a_lush_forest.jpg" height="128"> <br> <img src="docs/assets/000075_961095192_PLMS40_PS7.5_piles_of_old_books.jpg" height="128"><img src="docs/assets/000075_961095192_PLMS40_PS7.5_piles_of_old_books.jpg" height="128"><img src="docs/assets/000075_961095192_PLMS40_PS7.5_piles_of_old_books.jpg" height="128"> <img src="docs/assets/000040_527733581_PLMS40_PS7.5_leaves.jpg" height="128"><img src="docs/assets/000040_527733581_PLMS40_PS7.5_leaves.jpg" height="128"><img src="docs/assets/000040_527733581_PLMS40_PS7.5_leaves.jpg" height="128">

360 degree images

imagine --tile-x -w 1024 -h 512 "360 degree equirectangular panorama photograph of the desert"  --upscale
<img src="docs/assets/desert_360.jpg" height="128">

Image-to-Image

Use depth maps for amazing "translations" of existing images.

>> imagine --init-image girl_with_a_pearl_earring_large.jpg --init-image-strength 0.05  "professional headshot photo of a woman with a pearl earring" -r 4 -w 1024 -h 1024 --steps 50
<p float="left"> <img src="tests/data/girl_with_a_pearl_earring.jpg" width="256"> ➡️ <img src="docs/assets/pearl_depth_1.jpg" width="256"> <img src="docs/assets/pearl_depth_2.jpg" width="256"> </p>

Outpainting

Given a starting image, one can generate its "surroundings".

Example: imagine --init-image pearl-earring.jpg --init-image-strength 0 --outpaint all250,up0,down600 "woman standing"

<img src="tests/data/girl_with_a_pearl_earring.jpg" height="256"> ➡️ <img src="tests/expected_output/test_outpainting_outpaint_.png" height="256">
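One way to read an outpaint string such as `all250,up0,down600` is as a comma-separated list of direction/amount pairs, applied left to right. The sketch below follows that reading. `parse_outpaint` is a hypothetical helper, and it assumes the numbers are pixel counts, that `all` sets every side, and that later entries override earlier ones. It is not imaginairy's own parser.

```python
import re

DIRECTIONS = ("up", "down", "left", "right")


def parse_outpaint(spec):
    """Turn 'all250,up0,down600' into a per-direction dict (hypothetical helper)."""
    amounts = dict.fromkeys(DIRECTIONS, 0)
    for part in spec.split(","):
        match = re.fullmatch(r"([a-z]+)(\d+)", part.strip().lower())
        if match is None:
            raise ValueError(f"cannot parse outpaint entry: {part!r}")
        name, amount = match.group(1), int(match.group(2))
        if name == "all":
            # Assumption: "all" applies the same amount to every side.
            for direction in DIRECTIONS:
                amounts[direction] = amount
        elif name in amounts:
            # Assumption: a later entry overrides an earlier one.
            amounts[name] = amount
        else:
            raise ValueError(f"unknown direction: {name!r}")
    return amounts


expansion = parse_outpaint("all250,up0,down600")
print(expansion)
```

Under these assumptions the example above extends the image by 250 on the left and right, 600 below, and not at all above.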

Work with different generation models

<p float="left"> <img src="docs/assets/fairytale-treehouse-sd15.jpg" height="256"> <img src="docs/assets/fairytale-treehouse-openjourney-v1.jpg" height="256"> <img src="docs/assets/fairytale-treehouse-openjourney-v2.jpg" height="256"> </p> <details> <summary>Click to see shell command</summary>
imagine "valley, fairytale treehouse village covered, , matte painting, highly detailed, dynamic lighting, cinematic, realism, realistic, photo real, sunset, detailed, high contrast, denoised, centered, michael whelan" --steps 60 --seed 1 --arg-schedule model[sd14,sd15,sd20,sd21,openjourney-v1,openjourney-v2] --arg-schedule "caption-text[sd14,sd15,sd20,sd21,openjourney-v1,openjourney-v2]"
</details>
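The `--arg-schedule` values used in this README come in two shapes: a list such as `model[sd14,sd15]` and a numeric range such as `prompt-strength[2:8:0.5]`. The sketch below shows one plausible way to expand both into a list of values. `parse_schedule` is a hypothetical helper, and treating the range as `start:stop:step` with an exclusive stop (like Python's `range`) is an assumption, so check the real behavior before relying on the exact endpoints.

```python
import re
from decimal import Decimal

SCHEDULE = re.compile(r"([\w-]+)\[(.+)\]")


def parse_schedule(text):
    """Split 'name[...]' into (name, values). Hypothetical helper."""
    match = SCHEDULE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a schedule: {text!r}")
    name, body = match.groups()
    if ":" in body:
        start, stop, step = (Decimal(part) for part in body.split(":"))
        if step <= 0:
            raise ValueError("step must be positive")
        values = []
        current = start
        # Assumption: the stop value is exclusive, as in range().
        while current < stop:
            values.append(float(current))
            current += step  # Decimal avoids float rounding drift
        return name, values
    return name, [item.strip() for item in body.split(",")]


strength_name, strengths = parse_schedule("prompt-strength[2:8:0.5]")
model_name, models = parse_schedule("model[sd14,sd15,sd20,sd21,openjourney-v1,openjourney-v2]")
print(strength_name, strengths)
print(model_name, models)
```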

Prompt Expansion

You can use {} to randomly pull values from lists. A list of values separated by | and enclosed in { } will be randomly drawn from in a non-repeating fashion. Values that are surrounded by _ _ will pull from a phrase list of the same name. Folders containing .txt phraselist files may be specified via --prompt_library_path. The option may be specified multiple times. Built-in categories:

  3d-term, adj-architecture, adj-beauty, adj-detailed, adj-emotion, adj-general, adj-horror, animal, art-scene, art-movement, 
  art-site, artist, artist-botanical, artist-surreal, aspect-ratio, bird, body-of-water, body-pose, camera-brand,
  camera-model, color, cosmic-galaxy, cosmic-nebula, cosmic-star, cosmic-term, desktop-background, dinosaur, eyecolor, f-stop, 
  fantasy-creature, fantasy-setting, fish, flower, focal-length, food, fruit, games, gen-modifier, hair, hd,
  iso-stop, landscape-type, national-park, nationality, neg-weight, noun-beauty, noun-fantasy, noun-general, 
  noun-horror, occupation, painting-style, photo-term, pop-culture, pop-location, punk-style, quantity, rpg-item, scenario-desc, 
  skin-color, spaceship, style, tree-species, trippy, world-heritage-site

Examples:

imagine "a {lime|blue|silver|aqua} colored dog" -r 4 --seed 0 (note that it generates a dog of each color without repetition)

<img src="docs/assets/000184_0_plms40_PS7.5_a_silver_colored_dog_[generated].jpg" height="200"><img src="docs/assets/000186_0_plms40_PS7.5_a_aqua_colored_dog_[generated].jpg" height="200"> <img src="docs/assets/000210_0_plms40_PS7.5_a_lime_colored_dog_[generated].jpg" height="200"> <img src="docs/assets/000211_0_plms40_PS7.5_a_blue_colored_dog_[generated].jpg" height="200">

imagine "a {_color_} dog" -r 4 --seed 0 will generate four different-colored dogs. The colors will be pulled from an included phraselist of colors.

imagine "a {_spaceship_|_fruit_|hot air balloon}. low-poly" -r 4 --seed 0 will generate images of spaceships or fruits or a hot air balloon
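The non-repeating behavior of `{a|b|c}` lists can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not imaginairy's implementation: `expand` is a hypothetical helper, it shuffles each list once and then walks through it in order, and it does not resolve `_name_` phraselists.

```python
import random
import re

GROUP = re.compile(r"\{([^{}]*)\}")


def expand(prompt, n, seed=0):
    """Return n prompts, cycling through each {a|b|c} list without early repeats."""
    rng = random.Random(seed)
    pools = []
    for match in GROUP.finditer(prompt):
        options = match.group(1).split("|")
        rng.shuffle(options)  # random order, but every option appears once per cycle
        pools.append(options)

    results = []
    for i in range(n):
        remaining = iter(pools)

        def pick(_match):
            # Take the i-th option of the next list, wrapping around if n is larger.
            options = next(remaining)
            return options[i % len(options)]

        results.append(GROUP.sub(pick, prompt))
    return results


dogs = expand("a {lime|blue|silver|aqua} colored dog", n=4)
for line in dogs:
    print(line)
```

With four options and four renders, each color shows up exactly once. Only when the number of renders exceeds the list length do values start to repeat.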

<details> <summary>Python example</summary>
from imaginairy.enhancers.prompt_expansion import expand_prompts

my_prompt = "a giant {_animal_}"

expanded_prompts = expand_prompts(n=10, prompt_text=my_prompt, prompt_library_paths=["./prompts"])
</details>

Credit to noodle-soup-prompts where most, but not all, of the wordlists originate.

Generate image captions (via BLIP)

>> aimg describe assets/mask_examples/bowl001.jpg
a bowl full of gold bars sitting on a table

Example Use Cases

>> aimg
# Generate endless 8k art
🤖🧠> imagine -w 1920 -h 1080 --upscale "{_art-scene_}. {_painting-style_} by {_artist_}" -r 1000 --steps 30 --model sd21v

# generate endless desktop backgrounds
🤖🧠> imagine --tile "{_desktop-background_}" -r 100

# convert a folder of images to pencil sketches
🤖🧠> edit other/images/*.jpg -p "make it a pencil sketch"

# upscale a folder of images
🤖🧠> upscale my-images/*.jpg

# generate kitchen remodel ideas
🤖🧠> imagine --control-image kitchen.jpg -w 1024 -h 1024 "{_interior-style_} kitchen" --control-mode depth -r 100 --init-image 0.01 --upscale --steps 35 --caption-text "{prompt}"

Additional Features

How To

For full command line instructions run aimg --help

from imaginairy import imagine, imagine_image_files, ImaginePrompt, WeightedPrompt, LazyLoadingImage

url = "https://upload.wikimedia.org/wikipedia/commons/thumb/6/6c/Thomas_Cole_-_Architect%E2%80%99s_Dream_-_Google_Art_Project.jpg/540px-Thomas_Cole_-_Architect%E2%80%99s_Dream_-_Google_Art_Project.jpg"
prompts = [
    ImaginePrompt("a scenic landscape", seed=1, upscale=True),
    ImaginePrompt("a bowl of fruit"),
    ImaginePrompt([
        WeightedPrompt("cat", weight=1),
        WeightedPrompt("dog", weight=1),
    ]),
    ImaginePrompt(
        "a spacious building", 
        init_image=LazyLoadingImage(url=url)
    ),
    ImaginePrompt(
        "a bowl of strawberries", 
        init_image=LazyLoadingImage(filepath="mypath/to/bowl_of_fruit.jpg"),
        mask_prompt="fruit OR stem{*2}",  # amplify the stem mask x2
        mask_mode="replace",
        mask_modify_original=True,
    ),
    ImaginePrompt("strawberries", tile_mode=True),
]
for i, result in enumerate(imagine(prompts)):
    # do something with each result; a per-result filename avoids overwriting earlier images
    result.save(f"my_image_{i}.jpg")

# or

imagine_image_files(prompts, outdir="./my-art")

Requirements

Running in Docker

See the example Dockerfile (works only on machines where you can pass the GPU into the container)

docker build . -t imaginairy
# you really want to map the cache or you end up wasting a lot of time and space redownloading the model weights
docker run -it --gpus all -v $HOME/.cache/huggingface:/root/.cache/huggingface -v $HOME/.cache/torch:/root/.cache/torch -v `pwd`/outputs:/outputs imaginairy /bin/bash

Running on Google Colab

Example Colab

Q&A

Q: How do I change the cache directory for where models are stored?

A: Set the HUGGINGFACE_HUB_CACHE environment variable.
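If you launch imaginairy from a Python script, one way to apply this setting to a single run is to pass a modified environment to the child process. The sketch below only prepares that environment. `env_with_model_cache` is a hypothetical helper, and the cache path is a placeholder.

```python
import os
from pathlib import Path


def env_with_model_cache(cache_dir, base_env=None):
    """Copy an environment and point HUGGINGFACE_HUB_CACHE at cache_dir."""
    env = dict(os.environ if base_env is None else base_env)
    env["HUGGINGFACE_HUB_CACHE"] = str(Path(cache_dir).expanduser())
    return env


# Placeholder path; base_env is given explicitly so the example is reproducible.
env = env_with_model_cache("/mnt/big-disk/hf-cache", base_env={"PATH": "/usr/bin"})
print(env["HUGGINGFACE_HUB_CACHE"])
```

Passing `env=env` to `subprocess.run([...], env=env)` applies the cache location to that one command and leaves your shell configuration untouched.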

Q: How do I free up disk space?

A: The AI models are cached in ~/.cache/ (or HUGGINGFACE_HUB_CACHE). To delete the cache remove the following folders:

Not Supported