Zapiski Nauchnykh Seminarov POMI, 2023, Volume 530, Pages 24–37
(Mi znsl7430)
Vector graphics generation with LLMs: approaches and models
B. Timofeenko (a), V. Efimova (a), A. Filchenkov (b)
(a) ITMO University
(b) GO AI LAB
Abstract:
The task of generating vector graphics with AI is under-researched. Recently, large language models (LLMs) have been successfully applied to many downstream tasks. For example, modern LLMs achieve remarkable quality in code generation tasks and are publicly accessible. This study compares approaches to vector graphics generation with two LLMs, ChatGPT (GPT-3.5) and GPT-4. GPT-4 shows noticeable improvements over ChatGPT. Both models easily generate geometric primitives but struggle even with simple objects. The results produced by GPT-4 visually resemble what the prompts describe but are inaccurate. GPT-4 is able to correct its output according to instructions. Recognizing an object from an SVG image is also challenging for both models: they correctly recognize only primitive objects.
Key words and phrases:
large language models, vector graphics, generative AI, image generation, text-to-image synthesis.
Received: 06.09.2023
Citation:
B. Timofeenko, V. Efimova, A. Filchenkov, “Vector graphics generation with LLMs: approaches and models”, Investigations on applied mathematics and informatics. Part II–2, Zap. Nauchn. Sem. POMI, 530, POMI, St. Petersburg, 2023, 24–37
Linking options:
https://www.mathnet.ru/eng/znsl7430 https://www.mathnet.ru/eng/znsl/v530/p24