In December 2020, OpenAI released DALL-E, an artificial intelligence (AI) model that generates images from textual descriptions. It is based on a 12-billion parameter version of the GPT-3 transformer architecture and was trained on a dataset of text-image pairs: textual descriptions of images collected from the Internet, paired with the corresponding images.
DALL-E can generate images of birds, flowers, animals, humans, and everyday objects, as well as renderings of more abstract concepts. The results are often coherent and surprisingly plausible, though not always photo-realistic.
The program is still in its early stages, but it’s already capable of some impressive feats.
For example, when asked to generate an image of “a baby panda eating bamboo,” DALL-E produced a painting of a panda cub chomping on a stalk of bamboo.
To use the DALL-E API, you first need to create a DALL-E account. You can then create a new app or use an existing one. Creating a new app requires a name, a description, and a choice of programming language; the DALL-E API currently supports Python, Java, Node.js, and Go.
Once you have created an app, you can generate images by calling the generate method. This method takes two parameters: a textual description of the image you want to generate, and the desired width of the generated image. The textual description can be up to 256 characters long.
The DALL-E API will return a JSON object that contains the generated image. The image will be in the PNG format and will have a width that is equal to or less than the desired width.
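The request and response described above can be sketched in Python. This is a minimal illustration, not a documented client library: the endpoint URL and the JSON field names (`text`, `width`, `image`) are assumptions, and the live HTTP call is replaced with a canned response so the shapes are easy to see.

```python
import base64

# Hypothetical endpoint for the generate call described above.
API_URL = "https://api.example.com/dalle/generate"  # placeholder, not a real URL

def build_generate_request(text: str, width: int) -> dict:
    """Build the JSON body for a generate call.

    The textual description is limited to 256 characters, so we
    validate that on the client rather than letting the server reject it.
    """
    if len(text) > 256:
        raise ValueError("description must be at most 256 characters")
    return {"text": text, "width": width}

def extract_image(response_json: dict) -> bytes:
    """Decode PNG bytes from an assumed base64-encoded `image` field."""
    return base64.b64decode(response_json["image"])

# Demonstrate with a canned response instead of a live HTTP call:
request_body = build_generate_request("a baby panda eating bamboo", 512)
sample_response = {
    "image": base64.b64encode(b"\x89PNG...").decode("ascii"),
    "width": 512,
}
png_bytes = extract_image(sample_response)
print(request_body, png_bytes[:4])
```

In a real app you would POST `request_body` to the endpoint and feed the parsed JSON to `extract_image`; the returned image’s width may be less than the width you requested.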
You can also use the DALL-E API to generate images from scratch, without providing a textual description. To do this, you need to call the generate method with the following parameters:
-text: “” (an empty string)
-width: the desired width of the generated image
An optional style parameter specifies the type of image you want to generate. The possible values are: “default”, “abstract”, “animals”, “birds”, “faces”, “flowers”, “food”, “humans”, “objects”, and “scenes”.
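A “from scratch” call then combines an empty text with a width and a style. Again a sketch under the same assumptions: the field names mirror the parameter list above, and the allowed style values are taken from it.

```python
# Style values listed above; lowercased for comparison.
ALLOWED_STYLES = {
    "default", "abstract", "animals", "birds", "faces",
    "flowers", "food", "humans", "objects", "scenes",
}

def build_scratch_request(width: int, style: str = "default") -> dict:
    """Build a generate request with no textual description.

    The text field is an empty string, as the parameter list above
    describes; the style is validated against the documented values.
    """
    if style not in ALLOWED_STYLES:
        raise ValueError(f"unknown style: {style!r}")
    return {"text": "", "width": width, "style": style}

print(build_scratch_request(256, "animals"))
```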
As OpenAI noted in its blog post about the program, the results are often surreal.
But they’re also sometimes eerily accurate. When asked to generate an image of “a person riding a unicycle on a tightrope over a cityscape,” DALL-E produced a painting of a tightrope walker crossing the skyline of a city.
The program is based on a transformer, the same type of neural network architecture that powers GPT-3, rather than the generative adversarial networks (GANs) more commonly associated with realistic image synthesis. It models the text and the image as a single sequence of tokens and generates the image one token at a time. Trained on hundreds of millions of text-image pairs, DALL-E can generate images that are plausible, even if they’re not perfect.
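The autoregressive idea can be illustrated with a toy sampling loop: text tokens and image tokens form one sequence, and each image token is sampled conditioned on everything before it. The “model” here is a random stand-in for a transformer forward pass, and the vocabulary and sequence sizes are tiny for clarity (a real DALL-E emits 1,024 image tokens over a much larger vocabulary).

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 16          # toy image-token vocabulary
IMAGE_TOKENS = 8    # toy count; the real model emits 1,024 tokens

def next_token_logits(sequence):
    """Stand-in for a transformer forward pass: returns fake logits.

    A real model would condition these logits on `sequence`.
    """
    return rng.normal(size=VOCAB)

def sample_image_tokens(text_tokens):
    """Sample image tokens one at a time, appending each to the sequence."""
    seq = list(text_tokens)
    for _ in range(IMAGE_TOKENS):
        logits = next_token_logits(seq)
        probs = np.exp(logits) / np.exp(logits).sum()  # softmax
        seq.append(int(rng.choice(VOCAB, p=probs)))
    return seq[len(text_tokens):]

tokens = sample_image_tokens([1, 2, 3])
print(tokens)  # 8 image tokens, which a decoder would turn into pixels
```

The key design point is that the same next-token machinery that lets GPT-3 continue a sentence lets DALL-E “continue” a caption with an image.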
This is just the beginning for DALL-E. As the program is further developed, it will become even more capable of generating realistic images.
In the meantime, anyone who wants to use the program can do so through the DALL-E API.