Soppari, Kavitha and Vangapally, Bhanu and Sohail, Syed Sameer and Dubba, Harish (2025) Text to image generation using BERT and GAN. International Journal of Science and Research Archive, 14 (1). pp. 720-725. ISSN 2582-8185
IJSRA-2025-0137.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial Share Alike.
Abstract
Generating images from text is a challenging task that combines natural language processing and computer vision. Existing generative adversarial network (GAN)-based models usually employ text encoders pre-trained on image-text pairs. However, these encoders often fail to capture the semantic complexity of text unseen during pre-training, making it difficult to produce images that accurately match the supplied descriptions. To address this problem, we present a novel text-to-image generation model built on BERT, a highly successful pre-trained language model in natural language processing. Fine-tuning BERT on a large text corpus allows it to encode rich textual information, improving its suitability for image generation tasks. Experiments on the CUB_200_2011 dataset show that our approach outperforms baseline models on both qualitative and quantitative measures.
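The pipeline the abstract describes — a fine-tuned BERT text encoder whose sentence embedding conditions a GAN generator — can be sketched as follows. This is a minimal illustration under assumed dimensions, not the authors' implementation: the toy `TextEncoder` stands in for a fine-tuned BERT (which in practice would come from a library such as `transformers`), and the generator architecture is a generic conditional DCGAN-style upsampler.

```python
import torch
import torch.nn as nn

class TextEncoder(nn.Module):
    """Stand-in for a fine-tuned BERT: maps token ids to a sentence embedding.
    In the paper's setting this role is played by BERT's pooled output."""
    def __init__(self, vocab_size=1000, embed_dim=128, sent_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.proj = nn.Linear(embed_dim, sent_dim)

    def forward(self, token_ids):
        # Mean-pool token embeddings into a single sentence vector.
        return self.proj(self.embed(token_ids).mean(dim=1))

class Generator(nn.Module):
    """Conditional generator: concatenates noise with the text embedding
    and upsamples to a 64x64 RGB image."""
    def __init__(self, noise_dim=100, sent_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim + sent_dim, 512 * 4 * 4),
            nn.Unflatten(1, (512, 4, 4)),
            nn.ConvTranspose2d(512, 256, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),
        )

    def forward(self, noise, sent_emb):
        return self.net(torch.cat([noise, sent_emb], dim=1))

encoder = TextEncoder()
generator = Generator()
tokens = torch.randint(0, 1000, (2, 16))  # batch of 2 dummy tokenized captions
noise = torch.randn(2, 100)
images = generator(noise, encoder(tokens))
print(images.shape)  # torch.Size([2, 3, 64, 64])
```

Training would pair this generator with a discriminator that scores image–text pairs, so the adversarial loss pushes generated images to match their captions.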
Item Type: | Article |
---|---|
Official URL: | https://doi.org/10.30574/ijsra.2025.14.1.0137 |
Uncontrolled Keywords: | Text to image generation; Multimodal data; BERT; GAN; High quality |
Depositing User: | Editor IJSRA |
Date Deposited: | 13 Jul 2025 14:19 |
URI: | https://eprint.scholarsrepository.com/id/eprint/635 |