Revision as of 00:45, 11 December 2024
Frequently asked questions
What are AI generated images?
In case you haven't heard, a new type of AI called a diffusion model is capable of generating high-quality images. These models learn concepts by observing billions of images of everything. While you can have them generate images entirely for you with text prompting, there are many other ways of controlling their output to varying degrees. They can be used as render engines for CAD software, for example.
Under the hood, they share core machinery with large language models such as ChatGPT: many are built on transformers, and nearly all rely on the attention mechanism.
Is generative AI theft, as some say?
No, neither metaphorically nor literally. The difference is easy to see once the reality is separated from the misinformation: the models don't deprive original creators of their works, and they don't store copies of those works. They can't be used to recreate works they learned from unless someone maliciously crafts them to do so. The only works they have a chance of reproducing are cultural cornerstones that are short and heavily repeated throughout our media, like the phrase "In God We Trust" or the lyrics to Happy Birthday.
Another popular claim that falls apart under scrutiny is the idea that copyrighted material is used without permission when, in actuality, no copyright violation happens in any way. Copyrighted material shouldn't be reproduced in whole or in part by anyone but the copyright holder, and when it comes to diffusion models and LLMs, none of it is.
It should also be noted that a vocal minority directs a massive amount of toxic misinformation and behavior at imagery that makes use of AI tools.
That said, although most of the hate directed at AI art is unwarranted, one legitimate concern is styles being ripped off to a shameless degree. Inspiration is one thing, but precisely imitating everything specific about another person's style, say by including "in the style of a painting by john doe" in a prompt, is extremely shady even if it isn't theft. More reputable models, including the one I use, obscure or forbid the use of specific artists' styles.
For the curious, here's a good introductory video by respected mathematics educator Grant Sanderson of 3Blue1Brown going over how these models work. Lengthy deep dives into the technology are also available on his channel.
Why is there such backlash?
The companies that first created these models were scummy enough to suggest replacing artists with them. Given that the models only produce things of value in the hands of artists, and are tools for artists rather than replacements, marketing them as artist replacements is about the worst sales pitch those companies could have conceived of.
Unfortunately, anger about that gets misdirected at people making legitimate use of these tools as part of an actual creative process. Detractors feel justified in using any tactic to fight the tools, including flat-out fabricating misinformation and behaving as toxically as they please. This is compounded by the fact that it's a disruptive technology.
Do you support these companies?
No, none of them gets a red cent from me. I use a free model run locally on a PC with a beefy GPU. I fine-tune it myself, developing a style based on an aesthetic I wanted that no one was coming close to. I'm not releasing any of my own tunings, or the techniques I developed to bring things to life.
Will lost jobs become a problem?
No. This claim is based on the lump of labor fallacy, the assumption that there is a fixed amount of work to go around. If it were true, we would have rampant unemployment due to the printing press. The labor market is an actual market, and behaves like one. There will be a shift, but it will look like the many other shifts of the same kind that have hit the labor market.
Is it artistic?
It depends on the use case; there is a spectrum. Text prompting alone is obviously less artistic than manually editing an image afterward, or creating a textureless 3D scene in Blender and using a diffusion model for the final render.
Where is the deadpan?
Tucked away for the moment, as to some, this is serious business.
When are you going to finish this FAQ?
Probably in the next few days.
Would you like to know more?