Awesome
<img src="./samples/artificial_intelligence.png" width="250px"></img>
artificial intelligence
<img src="./samples/cosmic_love_and_attention.png" width="250px"></img>
cosmic love and attention
<img src="./samples/fire_in_the_sky.png" width="250px"></img>
fire in the sky
<img src="./samples/a_pyramid_made_of_ice.png" width="250px"></img>
a pyramid made of ice
<img src="./samples/a_lonely_house_in_the_woods.png" width="250px"></img>
a lonely house in the woods
<img src="./samples/marriage_in_the_mountains.png" width="250px"></img>
marriage in the mountains
<img src="./samples/a_lantern_dangling_from_the_tree_in_a_foggy_graveyard.png" width="250px"></img>
lantern dangling from a tree in a foggy graveyard
<img src="./samples/a_vivid_dream.png" width="250px"></img>
a vivid dream
<img src="./samples/balloons_over_the_ruins_of_a_city.png" width="250px"></img>
balloons over the ruins of a city
<img src="./samples/the_death_of_the_lonesome_astronomer.png" width="250px"></img>
the death of the lonesome astronomer - by <a href="https://github.com/moirage">moirage</a>
<img src="./samples/the_tragic_intimacy_of_the_eternal_conversation_with_oneself.png" width="250px"></img>
the tragic intimacy of the eternal conversation with oneself - by <a href="https://github.com/moirage">moirage</a>
<img src="./samples/demon_fire.png" width="250px"></img>
demon fire - by <a href="https://github.com/WiseNat">WiseNat</a>
Big Sleep
<a href="https://twitter.com/advadnoun">Ryan Murdock</a> has done it again, combining OpenAI's <a href="https://github.com/openai/CLIP">CLIP</a> and the generator from a <a href="https://arxiv.org/abs/1809.11096">BigGAN</a>! This repository wraps up his work so it is easily accessible to anyone who owns a GPU.
You will be able to have the GAN dream up images using natural language with a one-line command in the terminal.
A user-made notebook with bug fixes and added features, such as Google Drive integration, is also available.
Install
$ pip install big-sleep
Usage
$ dream "a pyramid made of ice"
Images are saved to the directory from which the command is invoked.
Advanced
You can invoke this in code with
from big_sleep import Imagine
dream = Imagine(
    text = "fire in the sky",
    lr = 5e-2,
    save_every = 25,
    save_progress = True
)
dream()
You can now train on more than one phrase by separating the phrases with the "|" delimiter.
Train on Multiple Phrases
In this example we train on three phrases:
an armchair in the form of pikachu
an armchair imitating pikachu
abstract
from big_sleep import Imagine
dream = Imagine(
    text = "an armchair in the form of pikachu|an armchair imitating pikachu|abstract",
    lr = 5e-2,
    save_every = 25,
    save_progress = True
)
dream()
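The "|" delimiter can be pictured as a plain split of the prompt string. The sketch below is only an illustration: `split_phrases` is a hypothetical helper, not part of the `big_sleep` API, and the library's own parsing may differ.

```python
def split_phrases(text, delimiter="|"):
    """Split a prompt into individual phrases, dropping empty pieces.

    Hypothetical helper for illustration; not a big_sleep function.
    """
    return [part.strip() for part in text.split(delimiter) if part.strip()]


prompt = "an armchair in the form of pikachu|an armchair imitating pikachu|abstract"
for phrase in split_phrases(prompt):
    print(phrase)
```

Each resulting phrase would then act as its own target during training.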
Penalize certain prompts as well!
In this example we train on the three phrases from before,
and penalize the phrases:
blur
zoom
from big_sleep import Imagine
dream = Imagine(
    text = "an armchair in the form of pikachu|an armchair imitating pikachu|abstract",
    text_min = "blur|zoom",
)
dream()
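One way to picture how `text` and `text_min` might interact: similarity to the wanted phrases lowers the loss, while similarity to the penalized phrases raises it. The following is a minimal sketch under stated assumptions. `cosine_similarity` and `prompt_loss` are hypothetical names, the short vectors stand in for CLIP embeddings, and the library's actual loss may be weighted or structured differently.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def prompt_loss(image_vec, wanted_vecs, penalized_vecs=()):
    """Hypothetical loss: reward wanted phrases, punish penalized ones."""
    # Higher similarity to wanted phrases gives a lower (better) loss.
    loss = -sum(cosine_similarity(image_vec, v) for v in wanted_vecs) / len(wanted_vecs)
    # Higher similarity to penalized phrases gives a higher (worse) loss.
    if penalized_vecs:
        loss += sum(cosine_similarity(image_vec, v) for v in penalized_vecs) / len(penalized_vecs)
    return loss


# Toy embeddings (assumed values, not real CLIP outputs).
image = [1.0, 0.0, 0.0]
wanted = [[1.0, 0.0, 0.0]]
penalized = [[0.0, 1.0, 0.0]]
print(prompt_loss(image, wanted, penalized))
```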
You can also set a new text with the .set_text(<str>) method
dream.set_text("a quiet pond underneath the midnight moon")
And reset the latents with .reset()
dream.reset()
To save the progression of images during training, supply the --save-progress flag. The --save-every flag sets how often an image is saved.
$ dream "a bowl of apples next to the fireplace" --save-progress --save-every 100
Because the GAN is class-conditioned, Big Sleep often steers off the manifold into noise. Use the --save-best flag to save the highest-scoring image (as judged by the CLIP critic) to {filepath}.best.png in your folder.
$ dream "a room with a view of the ocean" --save-best
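The idea behind saving the best image can be sketched as a running maximum over critic scores. This is an illustration only: `BestTracker` is a hypothetical class, the scores are made-up numbers, and "higher is better" is an assumption (the library may instead track a minimum loss).

```python
class BestTracker:
    """Remember the step with the highest score seen so far (hypothetical)."""

    def __init__(self):
        self.best_score = float("-inf")
        self.best_step = None

    def update(self, step, score):
        """Return True when this step beats the previous best."""
        if score > self.best_score:
            self.best_score = score
            self.best_step = step
            return True
        return False


tracker = BestTracker()
scores = [0.21, 0.34, 0.30, 0.12]  # assumed critic scores, one per step
for step, score in enumerate(scores):
    if tracker.update(step, score):
        # A real run would overwrite the ".best.png" file at this point.
        print(f"new best at step {step}")
```

Keeping only the best snapshot means a later drift into noise does not overwrite an earlier good result.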
Larger model
If you have enough memory, you can also try using a bigger vision model released by OpenAI for improved generations.
$ dream "storm clouds rolling in over a white barnyard" --larger-model
Experimentation
The --max-classes flag restricts the number of BigGAN classes that Big Sleep may use (15 in the example below). This may make training more stable, at the cost of some expressivity.
$ dream 'a single flower in a withered field' --max-classes 15
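Restricting the class count can be pictured as keeping only the strongest entries of a per-class weight vector and zeroing the rest. The sketch below assumes the class conditioning is a plain list of weights. `restrict_to_top_classes` is a hypothetical helper, and the library's actual mechanism may differ.

```python
def restrict_to_top_classes(class_weights, max_classes):
    """Keep the max_classes largest weights and zero out all others.

    Hypothetical helper for illustration; not a big_sleep function.
    """
    if max_classes >= len(class_weights):
        return list(class_weights)
    # Indices of the largest weights, strongest first.
    ranked = sorted(range(len(class_weights)),
                    key=lambda i: class_weights[i],
                    reverse=True)
    keep = set(ranked[:max_classes])
    return [w if i in keep else 0.0 for i, w in enumerate(class_weights)]


weights = [0.1, 0.9, 0.3, 0.5]  # toy weights for four classes
print(restrict_to_top_classes(weights, max_classes=2))
```

With fewer active classes the optimizer has fewer directions to wander in, which matches the stability and expressivity trade-off described above.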
Alternatives
<a href="https://github.com/lucidrains/deep-daze">Deep Daze</a> - CLIP and a deep SIREN network
Citations
@misc{unpublished2021clip,
title = {CLIP: Connecting Text and Images},
author = {Alec Radford and Ilya Sutskever and Jong Wook Kim and Gretchen Krueger and Sandhini Agarwal},
year = {2021}
}
@misc{brock2019large,
title = {Large Scale GAN Training for High Fidelity Natural Image Synthesis},
author = {Andrew Brock and Jeff Donahue and Karen Simonyan},
year = {2019},
eprint = {1809.11096},
archivePrefix = {arXiv},
primaryClass = {cs.LG}
}