Thoughts on prompt execution 032423

After using DALL-E for a few months, I have recently switched mostly to Midjourney, and I have noticed some differences in the way the two AIs handle prompt instructions. My approach to generative art often involves prompts in which unusual or poorly defined actors engage in equally unusual or poorly defined activities, with the goal of seeing how creatively the AI interprets those instructions.

I have noticed that Midjourney excels at representing the actors and the settings derived from prompts (the nouns), but without style instructions those representations can be rather formulaic and prosaic. Even “surreal” images tend to be so only in a way characterized by a certain look and feel.

Both AIs can have trouble interpreting the activities commanded by prompts (the verbs), but the Midjourney interpretations tend to be more formulaic and prosaic.

DALL-E, on the other hand, seems to have less powerful and detailed rendering abilities, but its rendering without style instructions will yield a wider range of look and feel, without enforcing a dominant style. Likewise, I find DALL-E yields more interesting results for vague and unusual activity prompts.

As an example, I fed both DALL-E and Midjourney the following simple prompt: “A kachina is hallucinating in a slot canyon”. It contains only an actor, a setting, and an activity for that actor; it lacks any style or other instructions. I wanted to see how the AIs would interpret this “on their own”, so to speak.

The Midjourney generations show a beautifully rendered actor in an equally impressive setting, but the activity of hallucination is hard to tease out of the images. Some of the kachinas are modeled in a way that somewhat suggests hallucination, as is also true for some of the canyon walls, but it is “hallucination light” at best.

The DALL-E generations do not represent the actor at all; the kachinas are missing. They focus instead on the ways the activity of hallucination affects the setting of the canyon. I find these images to be subtle and engaging interpretations of the idea of hallucinating.

So, two different approaches, yielding different results. I plan to continue these experiments, and I hope to try even more AIs in the future.
