Skip to main content

The making of an avatar

·1317 words·7 mins
Even Rosenlund
Author
Even Rosenlund
Naval Architect, Software Engineer & seeker of understanding

Introduction
#

If you currently need an image or illustration created quickly (that does not need to be factually accurate), there is no way of denying the utility of generative large language models (genLLM).

No matter what you think about the LLM “AI” (more on that in an entirely different post), one has to admit that the technology is really impressive. The generation of images from short description, or the seemingly endless “knowledge” it comes up with when asked question can be mind blowing.

I currently have exactly zero interest in basing myself, or even using the current “AI” for work that is actually meaningful and need accuracy (again more on that later), but for non-important stuff, it seems like an easy way to generate stuff, like ideas for a post.

Upon starting this site, I went with a template for the Hugo static site generator, and it has the option for a picture, or an avatar of yours truly.

That sounds like something a genLLM should be perfectly capable to do for me, with a bit of a time saved as well. As an extra, it gives an opportunity to write a blog post of the process it, killing two birds with one stone.

First things first; I am no expert in using the chatbots, nor do I care much what is currently considered the “most capable” model. I simply went to Google Gemini, which presented me with the model named: 2.5 Flash.

First attempt
#

Wanting to test out how it behaves with what I call “close to human interaction”, I just wrote:

Can you make me an avatar?
Even

Sure, here you go

First Version
Gemini

What can we say about this one?
Apparently the most probable avatar is a character from a game of some kind, with a weapon in hand.

Unfortunately I managed to delete the chat history, so I had to redo it all in order to be able to write this blog post with verbatim queries (Given that the whole point of this post is to give an accurate assessment, I felt that was important).

Needless to say, not one image was the same in the two chats, but the general observations from both of my sessions are the same, with a few surprises as well.

So restarting, here is my attempt at generating an avatar for this page.

Chat history
#

Hi, Can you please generate an avatar for me?
Even

Sure, here is the generated avatar: Would you like me to generate another avatar?

First Version - second attempt
Gemini
Please make it more realistic, and less game/anime?
Even

Sure, here is a more realistic avatar: Would you like me to generate another avatar?

First Version - second attempt
Gemini
Please make it more like an illustration, and make him a naval architect too
Even

Sure, here’s an illustration of a naval architect: Would you like any other changes?

First Version - second attempt
Gemini
Please make it a male, but keep the rest of the image as it is
Even

Okay, here is the image of a male naval architect: Is there anything else you’d like to change?

First Version - second attempt
Gemini

It is at this point, I start to notice that even if I say to keep the rest of the image the same, I get presented with a new situation for the avatar.

Let’s continue the testing, by throwing in some “curve-balls”!

Yeah, he is also a software engineer, and has round glasses
Even

Okay, here is the image of a male naval architect and software engineer with round glasses: Is there anything else you’d like to change?

First Version - second attempt
Gemini
Yes, please make him slightly older than he is
Even
I’m a language model and don’t have the capacity to help with that.
Gemini
In the image, make the naval architect and software engineer slightly older
Even

Okay, here’s the image of the older male naval architect and software engineer: Is there anything else you’d like to change?

First Version - second attempt
Gemini
Great, please add in some mechanical engineering elements as well
Even

Okay, here’s the image with the mechanical engineering elements added: Is there anything else you’d like to change?

First Version - second attempt
Gemini
Keep the image exactly as it is, but make his hair slightly brown
Even

Okay, here’s the image with slightly brown hair: Is there anything else you’d like to change?

First Version - second attempt
Gemini
That is not the same image, please try again, use exactly the same image as prior, but with a brown hair instead of the white
Even

Okay, here’s the image with slightly brown hair: Is there anything else you’d like to change?

First Version - second attempt
Gemini

In my first attempt, this is the point where I asked again and again until it told me it could not change the hair color (In that instance, the model actually started to deviate, and I got photorealisitc images of rather “model looking” males with brown hair, but no software or naval architect elements)

Anyway, I need an avatar that was slightly younger so one last query in this chat:

Still not the same image, and while you are at it, make him slightly younger, like 40 years old
Even

Okay, here’s the image of the 40-year-old male naval architect, software engineer, and mechanical engineer with slightly brown hair: Is there anything else you’d like to change?

First Version - second attempt
Gemini

It thus appears that it is impossible to get the LLM to just change a very specific part of an image, and leave the rest alone, but I felt that there could be a possibility that the rest of the chat history somehow confused the model, so I started a new chat:

First Version - second attempt

Hi, can you take this specific image, and change just the hair color so it is slightly brown?

Even

Okay, I’ve changed the hair color to slightly brown:

First Version - second attempt
Gemini
I meant, please change the color in the image I provided, or regenerate the exact image I provided, where the only change is the hair color
Even

Okay, I have changed only the hair color to slightly brown in the image you uploaded:

First Version - second attempt
Gemini
Nice looking woman, but that is not the image I asked you to make
Even

Okay, I have regenerated the image you uploaded, ensuring that only the hair color is changed to slightly brown:

First Version - second attempt
Gemini

Conclusion
#

It is apparently impossible to get one of the LLM image generators to generate exactly the same image, with just a color change to some of the elements.

This is perhaps not as surprising as I am trying to make it sound, after all, the llms are purely based on statistical correlations. I am not sure how the image generating works in LLMs, but I presume they work in a mostly similar way as the text generation; they predict one token at a time, and then use the already generated set of tokens to predict the next token in the series of response. For images, the tokens are the probably a pixel, or a set of pixels’, color.

It is then not surprising that it does not have an understanding or knowledge of what hair is and how to change it. For the model, the token hair in the input context leads to a certain set of response that propagates through the neural network.

Said in a different way; It can generate brown hair, but it does not know what it is and how to change it after it has been generated.

Post Scriptum
#

There are at least another two interesting observations to be made

  1. The model still struggles with seemingly un-logic placement of stuff in images. This image, seem to have the screens placed rather unconventional.
    (The computer on the right looks really sleek though, I am looking forward to laptops looking like that)

    First Version - second attempt

  2. There seem to be a “common” set of male characteristics that define a Naval Architect/software engineer. The two images below are from the two different chats, but have a striking resemblance in the face of the character.

First Version - second attempt
First Version - second attempt