Unlike the widespread #AIart, this task came with the constraint of adhering to the original text, and thus of keeping in check some peculiarities of generative models, such as #hallucinations. When working on my task, I quickly discovered the obvious: #DescriptiveText-s are somewhat easier to generate for (example: RedCap). However, fairy tales contain far fewer descriptions than one might recall from childhood memories. So the challenge of the task was to generate illustrations also for #narrative. While it might be too ambitious to try to illustrate a whole sequence of events (which is what makes a narrative), even depicting a single event requires an interaction or a scene composition. Interactions and compositions, however, were notoriously difficult for generative models to get right, until in mid-2022 Google's Parti (https://arxiv.org/abs/2206.10789) made a notable breakthrough by linking image generation to (text) transformer models.
Other models followed: Midjourney v4 in November and Structured Diffusion Guidance (https://arxiv.org/abs/2212.05032) in December. Better composition became notably easier to achieve, which allowed me to proceed with (and complete) my task of illustrating fairy tales. All resulting images can be seen in the paper, but the other important outcome is the definition of a preliminary process for generating images aligned with the original story text.
In the second stage, considering the outcome of the first one, I aimed to isolate parts of the prompt to be removed, added, or replaced in order to improve the composition of the image: adding important elements and removing unwanted ones.
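A minimal sketch of how this second stage could be mechanized, assuming the prompt is kept as a list of fragments and systematically re-rendered after each edit. The fragments and the `generate_image` call are hypothetical illustrations, not the actual prompts or backend used in the experiments:

```python
# Sketch of stage two: generate prompt variants by removing, adding,
# or replacing fragments, then render each variant for comparison.

BASE_FRAGMENTS = [
    "a girl in a red hooded cape",      # hypothetical example fragments
    "walking through a dark forest",
    "carrying a wicker basket",
]

def build_prompt(fragments):
    """Join prompt fragments into a single comma-separated prompt."""
    return ", ".join(fragments)

def edit_variants(fragments, add=None, remove=None, replace=None):
    """Yield the base prompt plus one variant per requested edit."""
    yield list(fragments)
    if remove is not None:
        yield [f for f in fragments if f != remove]
    if add is not None:
        yield list(fragments) + [add]
    if replace is not None:
        old, new = replace
        yield [new if f == old else f for f in fragments]

for variant in edit_variants(
    BASE_FRAGMENTS,
    add="a wolf watching from behind a tree",
    remove="carrying a wicker basket",
    replace=("walking through a dark forest", "standing at the forest edge"),
):
    prompt = build_prompt(variant)
    print(prompt)
    # image = generate_image(prompt)  # hypothetical text-to-image backend
```

Keeping the prompt as a fragment list rather than a free-form string makes each edit a single, comparable change, so the effect of every fragment on the composition can be judged in isolation.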
Once the composition is at least roughly right, the third stage is to choose a style that works in the model's favour: one that reduces hallucinations, yet keeps the image easy for the viewer to interpret. For fairy tales, "book illustration" is one possibility.
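The third stage amounts to appending a style descriptor to the already-composed prompt and comparing the results. Only "book illustration" is named in the text; the alternative styles below are assumptions added for comparison, and `generate_image` remains a hypothetical backend call:

```python
# Sketch of stage three: try candidate style suffixes on a fixed composition.

STYLE_CANDIDATES = [
    "book illustration",          # style mentioned in the text
    "watercolor storybook art",   # assumption: alternative for comparison
    "woodcut print",              # assumption: alternative for comparison
]

def styled_prompt(composition, style):
    """Append a style descriptor to a composition prompt."""
    return f"{composition}, {style}"

composition = (
    "a girl in a red hooded cape standing at the forest edge, "
    "a wolf watching from behind a tree"
)
for style in STYLE_CANDIDATES:
    print(styled_prompt(composition, style))
    # image = generate_image(styled_prompt(composition, style))  # hypothetical
```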