Monster Amor
FAQ



Virtual photographs of monster girls in the real world
== <big><big>'''Frequently Asked Questions'''</big></big> ==
=== '''Who ''are'' you, and what are you doing here?''' ===
I'm [[Amorphant]]. I make things, and am now making <s>fake</s> real photographs of [[Monster girl|monster girls]], along with this wiki.
=== '''Why is your FAQ about AI and not monster girls?''' ===
There's an ongoing misinformation-based harassment campaign against AI tool users. It's why I've created this FAQ, as well as outlined my [[Amorphant#Workflow|workflow]] and "[[Amorphant#Artistic_Credentials|artistic credentials]]". Sometimes, being able to link to it is the best way to respond to active harassment, creating a shining beacon of ''[https://en.wikipedia.org/wiki/Streisand%20effect Barbra Streisand]''.
== <big><big>'''Generative AI tools'''</big></big> ==
=== '''What are AI-generated images?''' ===
In case you missed it, a new type of AI called the diffusion model is capable of generating high-quality images. Diffusion models learn concepts by observing billions of images of ''everything''. While you can have a model generate images entirely for you with text prompting, there are many other means of controlling its output to varying degrees. Diffusion models can be used as render engines for CAD software, for example.
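The mechanic is easy to sketch in code: generation starts from pure noise and refines it over many steps. The toy loop below is ''not'' a real diffusion model (a real one runs a trained neural network to predict the noise at each step); it fakes the prediction with a fixed target array, purely to show the shape of the sampling process. All names and numbers are illustrative:

```python
import numpy as np

# Toy sketch of a diffusion sampling loop: begin with pure noise and
# repeatedly apply a "denoiser" that nudges the image toward something
# coherent. A real model would run a trained network to predict the
# noise; here the prediction is faked with a fixed target array.
rng = np.random.default_rng(seed=0)
target = np.linspace(0.0, 1.0, 16)       # stands in for a "clean image"
x = rng.normal(size=16)                  # step 0: pure Gaussian noise

for step in range(50):
    predicted_clean = target             # a real model *predicts* this
    x = x + 0.1 * (predicted_clean - x)  # take a small step toward it

print(f"max deviation from target: {np.abs(x - target).max():.4f}")
```

After fifty small steps the noise has all but vanished, which is exactly the behavior the real sampling loop exploits.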

Under the hood, they're the same type of AI as large language models such as ChatGPT: [https://en.wikipedia.org/wiki/Transformer%20(deep%20learning%20architecture) transformers] using the [https://en.wikipedia.org/wiki/Attention_Is_All_You_Need attention mechanism]. If you're curious, here's a good [https://www.youtube.com/watch?v=LPZh9BOjkQs introductory video] by respected mathematics educator Grant Sanderson of [https://www.youtube.com/@3blue1brown 3blue1brown], which he produced for the LLM exhibit in the [https://computerhistory.org/ Computer History Museum] in California. Lengthy deep-dives into the technology are also available on his channel.
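Since the attention mechanism is the load-bearing piece, here is a minimal single-head sketch of scaled dot-product attention in NumPy. The token count and dimension are arbitrary toy values, and this is one equation out of a full transformer, not a working model:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)        # how well each query matches each key
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                   # weighted average of value vectors

rng = np.random.default_rng(1)
Q = rng.normal(size=(4, 8))              # 4 "tokens", 8 dimensions each
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
print(attention(Q, K, V).shape)          # (4, 8): one context-mixed vector per token
```

Each output row is a mixture of all the value rows, weighted by how strongly that token "attends" to every other token; stacking many such heads and layers gives a transformer.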
=== '''What are AI-assisted images?''' ===
'''AI-assisted''' is a term used to describe images that involve both human output and AI output. Several examples are:
* A drawing, photograph or painting created by a person, then run through a filter using AI
* A 3D work created by a person, where textures and lighting are applied by AI as a filter
* An AI-generated image that's then reworked or heavily modified by a person
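The "filter" items above are typically implemented as img2img: the human-made input is partially noised, then handed to the model to denoise, with a strength setting controlling how far the result may drift from the input. The toy sketch below shows only the noising half; the "denoiser" and all numbers are stand-ins, not any specific client's API:

```python
import numpy as np

rng = np.random.default_rng(seed=2)
human_image = np.linspace(0.0, 1.0, 16)   # stands in for a drawing or photo

def noised_start(image, strength):
    """Mix `strength` worth of noise into the input, as img2img does
    before denoising; higher strength lets the model repaint more."""
    noise = rng.normal(size=image.shape)
    return (1.0 - strength) * image + strength * noise

subtle = noised_start(human_image, strength=0.2)  # light AI "filter" pass
heavy = noised_start(human_image, strength=0.9)   # near-total repaint

print(np.abs(subtle - human_image).mean() < np.abs(heavy - human_image).mean())  # True
```

Low strength preserves the human work and only restyles it; high strength keeps little more than the composition.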
=== '''Is generative AI theft, as some say?''' ===
No, not metaphorically and not literally: these models not only don't deprive the original creator of their works, they don't store any portion of those works in any way. They can't be used to recreate works they learned from, outside of maliciously crafting a model to do so. The only works they have a chance of reproducing verbatim are cultural cornerstones that are short and heavily used in our media in general, like the phrase "in God we trust" or the lyrics to Happy Birthday.

Another popular false claim is the idea that copyrighted material is used without permission when, in actuality, no copyright violation happens. Copyrighted material shouldn't be reproduced in whole or part by anyone but the copyright holder, and when it comes to diffusion models and LLMs, ''[https://www.youtube.com/watch?v=9-Jl0dxWQs8 none of it is]''.

This needs to be emphasized because fabricated claims are being spread that it's already been "proven" that diffusion models store parts of source images and stitch them together. It's in fact ''easy'' to prove to oneself that this is not possible: go to [https://civitai.com/ CivitAI] <small>'''''NSFW'''''</small>⚠️ to get a 6 GB model, preferably a [https://civitai.com/models/133005/juggernaut-xl good finetune of SDXL] <small>'''''NSFW'''''</small>⚠️, then get a Stable Diffusion client like [https://github.com/AUTOMATIC1111/stable-diffusion-webui Automatic1111] to drop the model into. With an 8 GB GPU, one can generate ''vastly'' more unique images in unique styles than could possibly be stored as re-usable pieces in 6 GB, even using JPEG compression.
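The back-of-the-envelope arithmetic behind that experiment is worth spelling out. Assuming roughly 500 KB for one 1024×1024 JPEG (an illustrative figure, not a measurement), a 6 GB checkpoint could hold only about twelve thousand images verbatim, while seed choice alone already yields billions of distinct generations:

```python
model_size_bytes = 6 * 1024**3   # a typical ~6 GB SDXL finetune checkpoint
jpeg_bytes = 500 * 1024          # assumed size of one 1024x1024 JPEG

images_that_would_fit = model_size_bytes // jpeg_bytes
print(f"{images_that_would_fit:,} images storable at JPEG sizes")  # 12,582

seeds = 2**32                    # assuming a 32-bit seed space, as common clients use
print(f"{seeds:,} distinct generations from seed choice alone")    # 4,294,967,296
```

Twelve thousand stored images cannot account for billions of distinct outputs, before even counting prompts, resolutions, and samplers.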

It should be noted that, despite the hate directed at AI images being unwarranted, one legitimate concern is that of styles being ripped off. Precisely imitating everything about another person's style, say by including "in the style of a painting by john doe" in a prompt, is, even if not theft, extremely shady. More reputable models obscure or forbid invoking specific artists' styles, including the one that I use. My own style was carefully created and tuned by me using tools such as iterative LoRA training to slowly approach what I wanted, rather than asking the diffusion model for a specific look.

For more information on how attention-based transformers work, see Grant Sanderson's [https://www.youtube.com/watch?v=LPZh9BOjkQs introductory video] from the LLM exhibit in the [https://computerhistory.org/ Computer History Museum] in California, the same one linked above.
=== '''What about attribution?''' ===
Attribution doesn't apply to the mechanics of generative AI, in the same way that it doesn't apply to human works not inspired by a specific artist: Let's say you listen to all of the roughly 100,000 country music albums in existence. You've now got a solid understanding of all things country, and make a country album of your own. Who do you attribute?
=== '''Why is there such venom in the backlash?''' ===
The companies that first created these models were scummy enough to suggest ''replacing'' artists with them. Given that diffusion models can only produce things of value in the hands of artists, and are tools ''for'' artists, marketing them as replacements for artists is about the worst kind of sales pitch they could have conceived of.

Unfortunately, anger about that gets misdirected at people making legitimate use of these tools as part of a larger creative process, and detractors feel justified in using '''''any''''' tactics to fight against it, including fabricating misinformation and behaving as toxically as they please. Compounding this is the fact that it's a disruptive technology to begin with.
=== '''Do you support the companies in question?''' ===
No, none of them get a ''red cent'' from me. I use a free model run locally on a PC with a beefy GPU. I fine-tune it myself, developing a style based on an aesthetic I wanted that no one was doing. I'm not releasing any of my own tunings, nor the specific techniques I use to bring things to life.
=== '''Will lost jobs become a problem?''' ===
No. While a permanent decrease in the number of total jobs ''would'' be a problem, the idea that this actually happens is based on the [https://en.wikipedia.org/wiki/Lump%20of%20labour%20fallacy lump of labour fallacy]. If it were true, only a small percentage of us would be employed today due to past inventions like the printing press taking all the jobs. The labor market is an actual market, and it behaves like a market.
There ''will'' be a shift, but it will look like the many other shifts of the same kind that have hit the labor market. It also won't be as pronounced as some fear: the world is now discovering that getting good output from diffusion models takes nearly as much time as creating by other means.
=== '''Is it artistic?''' ===
There are various use cases, creating a spectrum. Text prompting is obviously less artistic than manually editing an image afterward, or creating a 3D scene manually and using a diffusion model for the final pass/render. In my [[Amorphant#Workflow|current workflow]], I'm doing the latter.
=== '''Where did the deadpan go?''' ===
It's tucked away for the moment, as to some, this is ''serious business''.
Latest revision as of 14:33, 21 December 2024

== '''Resources''' ==
⚠️ Before generating your own images, it's '''important''' to know that the base models produce ''awful'' results compared to the better community finetunes. You have been warned.
=== '''Stable Diffusion clients''' ===
* [https://github.com/AUTOMATIC1111/stable-diffusion-webui Automatic1111], a WebUI-based client
** ✅ Most popular
** ✅ User friendly
** ❌ Limited control
** ❌ Not updated much anymore
* [https://github.com/lllyasviel/stable-diffusion-webui-forge Forge], a fork of Automatic1111
** ✅ Improved performance
** ✅ Fixes some Automatic1111 issues
** ❌ Breaks some Automatic1111 extensions
* [https://github.com/comfyanonymous/ComfyUI ComfyUI], a rich but complex client
** ✅ Modular workflows for flexibility
** ✅ Extremely configurable
** ❌ Challenging to learn and use
=== '''Diffusion models''' ===
* [https://civitai.com/ CivitAI], the most popular diffusion model repository
** ⚠️ '''''NSFW'''''
* [https://huggingface.co/ Hugging Face], the most popular machine learning resource site

Would you like to know more?