From Obama hitting the nae-nae in outer space to Fidel Castro eating ice cream painted in Vincent van Gogh’s style, computer-generated images are reaching a level never seen before. Why? Because the machine has learned to make images from the words we feed it.
OpenAI’s DALL·E 2 and Google’s Parti and Imagen allow us to call forth complicated computer-generated imagery with simple text prompts. If you’ve been on the internet in any way over the past couple of months, you have probably seen something called “Dall-E mini”: an AI model that creates images based on the prompts you give it. Boris Dayma created it in July 2021 as “part of a competition held by Google and an AI community called Hugging Face”.
Algorithms, artificial intelligence, images, media, databases: all words that bubble like a movie pitch directed by a hybrid of Philip K. Dick and Stanley Kubrick. But it’s real, as real as a 3×3 grid of nine images can get. Dall-E mini borrows its name from DALL·E, developed by the research lab OpenAI: a portmanteau of Pixar’s animated robot WALL-E and the Spanish surrealist artist Salvador Dalí.
Generating cursed images with Dall-E mini is my new favorite thing, help pic.twitter.com/vL59j5UPk3
— Roxi (kickflip fox) (@thefoxycritter) June 6, 2022
Korean artist Jihyun Park uses incense sticks to burn thousands of tiny holes into rice paper until recognizable images of clouds, mountains, and trees emerge. He says that “the burning of incense sticks creates emptiness where once was substance”. Park strives for a balance between light and dark, substance and emptiness. Rather than adding, the artist takes away. Jihyun Park is what AI would be if it were a Korean contemporary artist.
Text-to-image generators begin in a cloud of diffusion and extract whatever meaning they can find from references in the form of words.
The AI machine, artist, or whatever we call it, works much the same way. It first inspects every level of its data through a cloud of visual smoke, then learns the linguistic relationship between pictures and their descriptions, and finally creates variations of a specific image. The AI becomes an all-encompassing organ living in an environment of noise and sketches, one that comes back to us with a finished work, a fabrication our eyes recognize. The incense is not burned, yet something real emerges from the smoke. Kind of like pointillism in reverse.
Imagen (Google’s own text-to-image generator) is a Diffusion model, which learns to convert a pattern of random dots to images.
— Yonghui Wu, Distinguished Software Engineer
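To make that quote concrete, here is a toy sketch of the diffusion idea in Python with NumPy. Everything in it is illustrative: the tiny `target` “image”, the single noise level, and the shortcut of reusing the known noise where a real model like Imagen would use a trained network’s *prediction* of that noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny 8x8 "image" standing in for a training example: a bright square.
target = np.zeros((8, 8))
target[2:6, 2:6] = 1.0

def add_noise(image, t, noise):
    """Forward diffusion: blend the image toward pure noise (t=0 clean, t=1 noise)."""
    return (1.0 - t) * image + t * noise

def denoise(noisy, t, predicted_noise):
    """Reverse step: subtract the noise back out.
    A real diffusion model *learns* to predict the noise from the noisy input;
    here we cheat and pass in the noise we actually used."""
    return (noisy - t * predicted_noise) / (1.0 - t)

noise = rng.standard_normal(target.shape)

t = 0.9                                 # heavily noised: mostly "random dots"
noisy = add_noise(target, t, noise)     # destroy the image
recovered = denoise(noisy, t, noise)    # walk back toward the image

print(np.allclose(recovered, target))   # True: recovery works when the noise is known
```

The hard part a trained model solves is exactly the step this sketch skips: predicting `noise` from `noisy` alone, conditioned on a text prompt, so that starting from pure random dots it can denoise its way to a brand-new picture.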
Our ideas of what is possible are expanding. This is not a small fling that will fizzle and fade as the rest of the trends do. You can find random, unorthodox creations, from Mickey Mouse as a fascist leader to Fisher-Price’s “My First Jail Cell”. In the vast ocean that is social media, everything is possible. But it goes beyond simple memes or lazy tours de force. It’s a testament to “choosing all words with some disregard; / No better choice than a somewhat blurred song / That treats clear and not-clear as but one”, as Verlaine would put it.
It’s a unique literary approach to image-making. Artists Matthew Dryhurst and Holly Herndon claim that “these tools will soon contain all the elements necessary to produce limitless resolution compositions guided by language and stylistic prompts”. Some may dismiss it as a disembodied cognitive system trained on billions of data points, producing images that are weird or strange. But new experiences are weird until they are no longer. It’s a shift from attempts to reflect objective reality to subjective play.