Frequently Asked Questions
Who are you, and what are you doing here?
I'm Amorphant. I make things, and am now making fake real photographs of monster girls, along with this wiki.
Why is your FAQ about AI and not monster girls?
There's an ongoing misinformation-based harassment campaign against AI tool users. That's why I've created this FAQ and outlined my workflow and "artistic credentials". Being able to link to it is the best way to respond to active harassment, creating a shining beacon of Barbra Streisand.
Generative AI tools
What are AI-generated images?
In case you missed it, a new type of AI called the diffusion model is capable of generating high-quality images. These models learn concepts by observing billions of images of everything. While you can have a model generate images entirely for you with text prompting, there are many other means of controlling its output to varying degrees. Diffusion models can be used as render engines for CAD software, for example.
Under the hood, they're built on the same core technique as large language models such as ChatGPT: the attention mechanism used by transformers. If you're curious, here's a good introductory video by respected mathematics educator Grant Sanderson of 3Blue1Brown, which he produced for the LLM exhibit in the Computer History Museum in California. Lengthy deep-dives into the technology are also available on his channel.
What are AI-assisted images?
AI-assisted is a term used to describe images that involve both human output and AI output. Several examples are:
- A drawing, photograph or painting created by a person, then run through a filter using AI
- A 3D work created by a person, where textures and lighting are applied by AI as a filter
- An AI-generated image that's then reworked or heavily modified by a person
Is generative AI theft, as some say?
No, neither metaphorically nor literally. Diffusion models don't deprive the original creators of their works, and they don't store any portion of those works in any way. They can't be used to recreate works they learned from, unless a model is maliciously crafted to do so. The only works that they have a chance of reproducing verbatim are cultural cornerstones that are short and heavily used in our media in general, like the phrase "in God we trust" or the lyrics to Happy Birthday.
Another popular false claim is the idea that copyrighted material is used without permission when, in actuality, no copyright violation happens. Copyrighted material shouldn't be reproduced in whole or part by anyone but the copyright holder, and when it comes to diffusion models and LLMs, none of it is.
This needs to be emphasized because fabricated claims are being spread that it has already been "proven" that diffusion models store parts of source images and stitch them together to create new images. It's in fact easy to prove to oneself that this is not possible. Go to CivitAI NSFW⚠️ and get a 6 GB model, preferably a good finetune of SDXL NSFW⚠️, then drop it into a Stable Diffusion client like Automatic1111. With an 8 GB GPU, you can generate vastly more unique images in unique styles than could possibly be stored as reusable pieces in 6 GB, even using JPEG compression.
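The storage argument above can be checked with some back-of-envelope arithmetic. The sketch below is only an illustration: the 6 GB model size comes from the paragraph above, while the training-set size (2 billion, standing in for "billions of images") and the 50 KB JPEG size are assumptions I've picked for the example, not measured figures.

```python
# Back-of-envelope check: how many bytes of model exist per training image?
# The image count and JPEG size are assumptions for illustration only.

MODEL_SIZE_BYTES = 6 * 10**9    # the "6 GB" model file mentioned above
TRAINING_IMAGES = 2 * 10**9     # assumed: "billions of images", taken as 2 billion
SMALL_JPEG_BYTES = 50 * 10**3   # assumed: a heavily compressed ~50 KB JPEG


def bytes_per_image(model_bytes: int, image_count: int) -> float:
    """Average number of model bytes available per training image."""
    return model_bytes / image_count


def images_that_fit(model_bytes: int, image_bytes: int) -> int:
    """How many images of a given size could literally be stored in the file."""
    return model_bytes // image_bytes


budget = bytes_per_image(MODEL_SIZE_BYTES, TRAINING_IMAGES)
capacity = images_that_fit(MODEL_SIZE_BYTES, SMALL_JPEG_BYTES)
share = capacity / TRAINING_IMAGES

print(f"Bytes of model per training image: {budget:.1f}")
print(f"50 KB JPEGs that would fit in the file: {capacity:,}")
print(f"Share of the training set that could be stored: {share:.4%}")
```

Under these assumed figures, that works out to about 3 bytes per training image, and room for 120,000 small JPEGs, or 0.006% of the training set. It's an average, so it says nothing about any single image, but it does show that the file is far too small to be an archive of reusable image pieces.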
It should be noted that, although the hate directed at AI images is unwarranted, one legitimate concern is that of styles being ripped off. Precisely imitating everything about another person's style, say by including "in the style of a painting by john doe" in a prompt, is extremely shady even if it isn't theft. More reputable models, including the one that I use, obscure or forbid the use of specific artists' styles. My own style was carefully created and tuned by me using tools such as iterative LoRA training to slowly approach what I wanted, rather than asking the diffusion model for a specific look.
For more information on how attention-based transformers work, see Grant Sanderson's introductory video from the LLM exhibit in the Computer History Museum in California, the same one linked above.
What about attribution?
Attribution doesn't apply to the mechanics of generative AI, in the same way that it doesn't apply to human works not inspired by a specific artist. Let's say you listen to all of the roughly 100,000 country music albums in existence. You've now got a solid understanding of all things country, and make a country album of your own. Who do you credit?
Why is there such venom in the backlash?
The companies that first created diffusion models were scummy enough to suggest replacing artists with them. Given that these models can only produce things of value in the hands of artists, and are tools for artists, marketing them as replacements for artists is about the worst kind of sales pitch the companies could have conceived of.
Unfortunately, anger about that gets misdirected at people making legitimate use of these tools as part of a larger creative process, and detractors feel justified in using any tactics to fight against that use. This includes fabricating misinformation and behaving as toxically as they feel entitled to. Compounding this is the fact that it's a disruptive technology to begin with.
Do you support the companies in question?
No, none of them get a red cent from me. I use a free model run locally on a PC with a beefy GPU. I fine-tune it myself, developing a style based on an aesthetic I wanted that no one was doing. I'm not releasing any of my own tunings, nor the specific techniques I use to bring things to life.
Will lost jobs become a problem?
No. While a permanent decrease in the number of total jobs would be a problem, the idea that this actually happens is based on the lump of labour fallacy. If it were true, only a small percentage of us would be employed today due to past inventions like the printing press taking all the jobs. The labor market is an actual market, and it behaves like a market.
There will be a shift, but it will look like the many other shifts of the same kind that hit the labor market. It also won't be as pronounced as some fear, with the world now discovering that it takes nearly as much time to get good output from diffusion models as to create by other means.
Is it artistic?
There are various use cases, which fall along a spectrum. Text prompting is obviously less artistic than manually editing an image afterward, or creating a textureless 3D scene in Blender and using a diffusion model for the final render. In my current workflow, I'm doing the last of these.
Where did the deadpan go?
It's tucked away for the moment, as to some, this is serious business.
Resources
⚠️ Before generating your own images, it's important to know that the base models produce awful results compared to the better community finetunes. You have been warned.
Stable Diffusion clients
- Automatic1111, a WebUI-based client
  - ✅ Most popular
  - ✅ User friendly
  - ❌ Limited control
  - ❌ Not updated much anymore
- Forge, a fork of Automatic1111
  - ✅ Improved performance
  - ✅ Fixes some Automatic1111 issues
  - ❌ Breaks some Automatic1111 extensions
- ComfyUI, a rich but complex client
  - ✅ Modular workflows for flexibility
  - ✅ Extremely configurable
  - ❌ Challenging to learn and use
Diffusion models
- CivitAI, the most popular diffusion model repository
  - ⚠️ NSFW
- Hugging Face, the most popular machine learning resource site
Would you like to know more?