Google introduced the Imagen neural network
May 24, 2022
And it generates images at least as well as DALL-E 2.
Google has announced Imagen, a neural network that converts a text query into images. It is a direct competitor to OpenAI's DALL-E 2, and in some scenarios it works even better.
To interpret the text query, the neural network relies on large language models, the same kind of models that underlie natural language processing algorithms such as GPT-3.
The system works in three stages. In the first, it draws a small 64 x 64 pixel image and refines it step by step so that it better matches the original request. The image is then scaled up to 256 x 256 pixels, and Imagen refines the details. In the third stage, the same process is repeated on a canvas of the final size, 1024 x 1024 pixels.
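The three-stage flow can be sketched in a few lines of Python. This is only an illustration of how the resolutions chain together, not Google's implementation: the function names `base_stage`, `upscale_stage` and `generate` are hypothetical, the first stage returns random noise instead of a real drawing, and the "refinement" is plain nearest-neighbor upscaling. Only the sizes 64, 256 and 1024 come from the description above.

```python
import random

# Canvas sizes named in the article; everything else is an illustrative assumption.
STAGE_SIZES = [64, 256, 1024]


def base_stage(prompt, size, seed=0):
    """Hypothetical stand-in for stage one: a size x size grid of gray values.

    A real system would draw an image that matches the prompt; here the prompt
    only seeds a random generator so the output is reproducible.
    """
    rng = random.Random(f"{seed}:{prompt}")
    return [[rng.random() for _ in range(size)] for _ in range(size)]


def upscale_stage(image, target_size):
    """Hypothetical stand-in for stages two and three.

    Enlarges the image by copying each pixel into a block (nearest neighbor).
    A real system would also add detail at this point; this sketch does not.
    """
    factor = target_size // len(image)
    return [
        [image[y // factor][x // factor] for x in range(target_size)]
        for y in range(target_size)
    ]


def generate(prompt):
    """Run the three stages in order and record the canvas size after each."""
    image = base_stage(prompt, STAGE_SIZES[0])
    shapes = [(len(image), len(image[0]))]
    for size in STAGE_SIZES[1:]:
        image = upscale_stage(image, size)
        shapes.append((len(image), len(image[0])))
    return image, shapes


if __name__ == "__main__":
    _, shapes = generate("Panda makes latte art")
    for number, (height, width) in enumerate(shapes, start=1):
        print(f"Stage {number}: {width} x {height} pixels")
```

Each step multiplies the side of the canvas by four, so the final image has 256 times as many pixels as the first draft. That may be one reason to start small: the rough composition is settled on a cheap canvas, and the larger stages only have to fill in detail.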
The paper notes that Imagen handles complex queries better than DALL-E 2. For example, for the query “Panda makes latte art”, DALL-E 2 returned only latte art depicting pandas, while the Google neural network produced mostly correct results.
But Google also admits that neither neural network could handle the query “horse riding astronaut”: both stubbornly put the astronaut on the horse rather than the other way around. Both clearly have room to grow.
Evaluations by independent human raters show that Imagen outperforms DALL-E 2 in accuracy and relevance. Although such a comparison is subjective, the results are still impressive, given that DALL-E 2 had until now been a benchmark that other neural networks of this kind could not match.
In any case, Imagen remains an experimental project for now, which ordinary users cannot access. It is not clear how long it will be before Google creates an open access service based on it.