Unlike the widespread #AIart, this task came with the constraint of adhering to the original text, and thus of keeping in check some peculiarities of generative models, such as #hallucinations. When working on my task, I quickly discovered the obvious: passages of #DescriptiveText are somewhat easier to generate images for (see the RedCap example). However, fairy tales contain far fewer descriptions than one might recall from childhood memories. So the challenge of the task was to generate illustrations for #narrative as well. While it might be too ambitious to illustrate a whole sequence of events (which is what makes a narrative), even depicting a single event requires an interaction or a scene composition. Interactions and compositions, however, were notoriously difficult for generative models to get right, until in mid-2022 Google's Parti (https://arxiv.org/abs/2206.10789) made a notable breakthrough by linking image generation to (text) transformer models.
The first stage of the process is converting the intended text into a prompt without deviating from the original vocabulary. This means condensing the content into a single phrase and replacing words that cannot be visualised on their own, e.g. "this", "here", "he", with what they refer to.
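A minimal sketch of what this first stage could look like in code, assuming the referents have already been resolved by hand while reading the passage (the mapping and the example sentence are my own illustrations, not taken from the tale):

```python
import re

# Hypothetical referents, resolved by hand for a given passage.
REFERENTS = {
    "he": "the wolf",
    "she": "the girl",
    "here": "in the forest",
    "this": "the red cap",
}

def to_prompt(sentence: str, referents: dict) -> str:
    """Condense a sentence into a prompt by substituting words that
    cannot be visualised on their own with what they refer to."""
    tokens = re.findall(r"\w+|[^\w\s]", sentence.lower())
    substituted = [referents.get(t, t) for t in tokens]
    # Drop punctuation, which adds nothing to the prompt.
    return " ".join(t for t in substituted if re.match(r"\w", t))

print(to_prompt("Then he met her here, under the old oak.", REFERENTS))
# -> "then the wolf met her in the forest under the old oak"
```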
In the second stage, based on the outcome of the first, I isolated parts of the prompt to be removed, added or replaced in order to improve the composition of the image: adding important elements and removing unwanted ones.
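A sketch of this second stage, assuming the prompt is treated as a comma-separated list of parts; the concrete edit operations below are hypothetical examples of the kind of refinement described, not the actual edits used:

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class PromptEdit:
    remove: tuple[str, ...] = ()                       # unwanted elements to drop
    add: tuple[str, ...] = ()                          # important elements to introduce
    replace: dict[str, str] = field(default_factory=dict)  # part -> improved wording

def refine_prompt(prompt: str, edit: PromptEdit) -> str:
    """Apply remove/add/replace operations to a comma-separated prompt."""
    parts = [p.strip() for p in prompt.split(",")]
    parts = [edit.replace.get(p, p) for p in parts if p not in edit.remove]
    parts.extend(edit.add)
    return ", ".join(parts)

base = "the wolf meets the girl, in the forest, oil painting"
edit = PromptEdit(
    remove=("oil painting",),
    add=("storybook illustration", "wide shot"),
    replace={"in the forest": "under an old oak in a dark forest"},
)
print(refine_prompt(base, edit))
# -> "the wolf meets the girl, under an old oak in a dark forest,
#    storybook illustration, wide shot"
```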