I'm still finding VQGAN+CLIP with the ImageNet model a really interesting tool for image synthesis. You can define an initial image, which is then iterated on to match a text prompt. For example, here's a picture of Adastral Park, iterated using the prompt "Cyberpunk".
Or how about "fairy-tale castle"?
With a light touch you can shift an image into another style; with a heavier touch you can create entirely new scenes that retain only a reminiscence of the original.
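For a sense of what the iteration loop actually does, here's a minimal sketch of CLIP-guided optimisation starting from an existing photo. It's not the real VQGAN+CLIP notebook code: it optimises raw pixels rather than VQGAN latent codes, so results will be much cruder, and the file name, prompt, and hyperparameters are purely illustrative. It assumes PyTorch, torchvision, and OpenAI's `clip` package are installed.

```python
# Sketch only: CLIP-guided iteration from an initial image, in pixel space.
# The real VQGAN+CLIP pipeline optimises VQGAN latents instead of pixels.
import torch
import clip
from PIL import Image
from torchvision import transforms

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Encode the steering prompt once.
text_features = model.encode_text(clip.tokenize(["Cyberpunk"]).to(device))
text_features = text_features / text_features.norm(dim=-1, keepdim=True)

# Start from an existing photo (path is illustrative) and make it trainable.
init = transforms.ToTensor()(
    Image.open("adastral_park.jpg").convert("RGB").resize((224, 224))
)
image = init.unsqueeze(0).to(device).requires_grad_(True)

# CLIP's standard input normalisation constants.
clip_mean = torch.tensor([0.48145466, 0.4578275, 0.40821073], device=device).view(1, 3, 1, 1)
clip_std = torch.tensor([0.26862954, 0.26130258, 0.27577711], device=device).view(1, 3, 1, 1)

optimizer = torch.optim.Adam([image], lr=0.02)
for step in range(200):
    optimizer.zero_grad()
    img_features = model.encode_image((image.clamp(0, 1) - clip_mean) / clip_std)
    img_features = img_features / img_features.norm(dim=-1, keepdim=True)
    loss = -(img_features @ text_features.T).mean()  # maximise similarity to the prompt
    loss.backward()
    optimizer.step()
```

The "light touch" versus "heavy touch" trade-off corresponds roughly to how many iterations you run and how strongly the prompt loss is weighted against staying close to the initial image.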
I can see this being highly applicable to games, but I also wonder if there's an opportunity in data visualisation. By weighting prompts appropriately, and combining text and image prompts, you could create consistent transformations that change how a user experiences data.
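For the prompt-weighting idea, a rough sketch of how several text prompts and an image prompt could be blended into one steering signal is below. The prompts, weights, and file name are all illustrative assumptions, not anything from a real pipeline; the point is only that each prompt becomes a CLIP embedding with its own weight in the loss.

```python
# Sketch: combining weighted text prompts and an image prompt into one loss.
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

weighted_prompts = [("data centre at night", 1.0), ("Cyberpunk", 0.5)]  # illustrative
targets, weights = [], []
with torch.no_grad():
    for text, weight in weighted_prompts:
        feats = model.encode_text(clip.tokenize([text]).to(device))
        targets.append(feats / feats.norm(dim=-1, keepdim=True))
        weights.append(weight)

    # An image can steer the output in the same way as a text prompt.
    style = preprocess(Image.open("style_reference.jpg")).unsqueeze(0).to(device)
    feats = model.encode_image(style)
    targets.append(feats / feats.norm(dim=-1, keepdim=True))
    weights.append(0.25)

def prompt_loss(image_features):
    """Weighted sum of cosine distances to each prompt embedding."""
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    return sum(w * (1 - (image_features @ t.T).mean())
               for t, w in zip(targets, weights))
```

Keeping the prompts and weights fixed across a whole dataset is what would make the transformations consistent from one image to the next.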