Generating images using generative adversarial networks based on text descriptions

Marzhan Turarova, Roza Bekbayeva, Lazzat Abdykerimova, Murat Aitimov, Aigulim Bayegizova, Ulmeken Smailova, Leila Kassenova, Natalya Glazyrina


Modern developments in the fields of natural language processing (NLP) and computer vision (CV) emphasize the increasing importance of generating images from text descriptions. The presented article analyzes and compares two key methods in this area: generative adversarial network with conditional latent semantic analysis (GAN-CLS) and ultra-long transformer network (XLNet). The main components of GAN-CLS, including the generator, discriminator, and text encoder, are discussed in the context of their functional tasks—generating images from text inputs, assessing the realism of generated images, and converting text descriptions into latent spaces, respectively. A detailed comparative analysis of the performance of GAN-CLS and XLNet, the latter of which is widely used in the organic light-emitting diode (OEL) field, is carried out. The purpose of the study is to determine the effectiveness of each method in different scenarios and then provide valuable recommendations for selecting the best method for generating images from text descriptions, taking into account specific tasks and resources. Ultimately, our paper aims to be a valuable research resource by providing scientific guidance for NLP and CV experts.


Discriminator; Extra-long transformer network; Generative adversarial network with conditional latent semantic; Generator; Machine learning; Natural language processing

Full Text:



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).