
LAwLzaWU1A t1_j5xvmyt wrote

I genuinely do not understand why you find that creepy and worrisome. We have allowed humans to do the exact same thing since the beginning of art, yet it only seems to be an issue when an AI does it. Is it just that people were unaware of it before, and now that they realize how the world works they are reacting to it?


If you have ever commissioned an artist to draw something for you, would you suddenly find it creepy and worrisome if you knew that said artist had once seen an ISIS video on the news? Seeing that ISIS video did alter how the artist's brain was wired, and could potentially have influenced how they drew your picture in some way (maybe a lot, maybe just 0.0001%, depending on what picture you asked them to draw).


The general advice is that if you don't want someone to see your private vacation photos, don't upload them to public websites for everyone to see. Training data sets like LAION did not hack into people's phones and steal the pictures. The pictures ended up in LAION because they were posted to the public web, where anyone could see them. This advice was true before AI tools were invented, and it will remain true in the future: if you don't want someone to see your picture, don't post it on the public web.


Also, there would be ethical problems even if we limited this technology to massive corporations. First of all, it's ridiculous to say "we should limit this technology to massive corporations because they will behave ethically". Come on.

But secondly, and more importantly, what about companies that don't produce their own content to train their AI on, but instead rely on user-submitted content? If Facebook and Instagram included a clause saying they were allowed to train their AI models on submitted images, do you think people would stop using Facebook? Hell, for all I know they might already have a clause allowing this. I doubt many people are actually aware of what they allow or don't allow in the terms of service they agree to when signing up for websites.


Edit:

It is also important to understand the amount of data that goes into these models and data sets. LAION-5B consists of 5.85 billion images. That is a number so large that it is near impossible for a human to even comprehend it. Here is a good, quick and easy visualization of what one billion is. And here is a longer and more stark visualization, because the first video actually uses 100,000 dollars as the "base unit", which by itself is almost too big for humans to comprehend.

Even if someone were to find 1 million images of revenge porn or whatever in the data set, that is still just 0.02% of the data set, which in and of itself is not the same as 0.02% of the final model produced by the training. We're talking about a million images maybe affecting the output by something on the order of 0.02%.
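To make that proportion concrete, here is a quick back-of-the-envelope check in Python. The 1 million figure is the hypothetical one from above, not a real count:

    # Back-of-the-envelope: what fraction of LAION-5B would a
    # hypothetical 1 million problematic images represent?
    dataset_size = 5_850_000_000  # images in LAION-5B
    hypothetical_bad = 1_000_000  # purely illustrative figure from above

    fraction = hypothetical_bad / dataset_size
    print(f"{fraction:.2%}")  # -> 0.02%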

How much inspiration does a human draw from the works they have seen? Do we give humans a pass just because we can't quantify how much influence a human artist drew from any particular thing they have seen and experienced?


I also think the scale of these data sets brings up another point. What would a proposed royalty structure even look like? Does an artist who had 100 of their images included in the data set get 100/5,000,000,000 of a dollar (0.000002% of a dollar)? That also assumes their works actually contributed to the final model in an amount matching their share of images in the data set. LAION-5B is about 240TB of data, while a model trained on it is ~4GB; 99.99833% of the data is discarded when going from training data to model.

How do we accurately calculate the amount of influence you had on a final model that is 0.00167% the size of the data set, of which you contributed 0.000002%? Not to mention that these AIs might create internal models within themselves, which would further diminish the percentages.

Are you owed 0.000002% of 0.00167%? And even that assumes the user of the program accounts for none of the contribution.
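Just as a sketch of how tiny these numbers get when multiplied together, here is the same naive arithmetic in Python. All figures are the illustrative ones from this comment, and treating contribution as proportional to image and byte counts is exactly the naive assumption being questioned, not a real attribution method:

    # Hypothetical royalty arithmetic using the illustrative figures above.
    images_in_dataset = 5_000_000_000  # rounded dataset size used above
    artist_images = 100                # the artist's hypothetical contribution

    dataset_bytes = 240e12  # ~240 TB of training data
    model_bytes = 4e9       # ~4 GB final model

    share_of_dataset = artist_images / images_in_dataset
    model_to_data_ratio = model_bytes / dataset_bytes

    print(f"share of data set:   {share_of_dataset:.6%}")     # 0.000002%
    print(f"model vs. data size: {model_to_data_ratio:.5%}")  # 0.00167%

    # Naively multiplying the two together, as the comment does:
    combined = share_of_dataset * model_to_data_ratio
    print(f"combined share: {combined:.2e}")  # ~3.33e-13, effectively zero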

It's utterly ridiculous. These things are being discussed by people who have no understanding of how any of it works, and it really shows.

1

LAwLzaWU1A t1_j5xtymw wrote

And the consequence of that is that Disney could say that artists who, without consent, used Disney works to learn how to draw owe them royalties. I don't think that is what is going to happen, but logically that is the implication.


If you go through some of the lawsuits being filed regarding AI, you will see that what they argue is not exclusive to AI art tools. For example, the lawsuit from Getty seems to simply state that it should be considered illegal to "use the intellectual property of others - absent permission or consideration - to build a commercial offering of their own financial benefit".

That wording applies to human artists as well, not just AI. Did you use someone else's intellectual property to build a commercial offering, like artists on Fiverr advertising that they will "draw X in the style of Disney"? Then you might be affected by the outcome of this lawsuit, even if you don't use AI art tools. Hell, do your drawings draw inspiration from Disney? Then you have most likely used Disney as "training data" for your own craft, and it could be argued that these rulings apply to you too.


I understand that artists are mainly focused on AI tools, but since an AI tool in many ways functions like a human (it sees publicly available data and learns from it), these lawsuits could affect human artists too.


And like I said earlier, the small artists who are worried that big companies might use AI tools instead of hiring them are completely missing the mark with these lawsuits, because the big companies can afford to buy and train on their own data sets. Disney will have no problem getting the legal right to train its future AI on whatever data it wants. These lawsuits will only harm individuals and small companies by making it harder for them to match the AI capabilities of big companies.


It is my firm belief that these tools have to be as open and free to use by anyone as possible, in order to ensure that massive companies don't get an even bigger advantage over everyone else. At the end of the day, the big companies currently suing companies like Stability AI are doing so for their own gain. Getty Images doesn't want people to be able to generate their own "stock images", because that's their entire business. Disney doesn't want the average Joe to be able to recreate their characters and movies with ease. They want to keep that ability to themselves.

1

LAwLzaWU1A t1_j5p7pc7 wrote

Making it illegal to use pictures for learning, even if they are publicly available, is exactly what the lawsuits are about, and a huge portion of people (mainly artists who have had their art used for learning) support this idea.

In my opinion it's very stupid, but that's what a lot of people are asking for, without realizing the consequences if such a system were put in place (not that it could be to begin with).

0

LAwLzaWU1A t1_j5nns9q wrote

Also worth pointing out that it's done by an organization that represents companies like Disney.

My guess is that massive companies like Disney are very interested in setting a precedent that if their pictures are used for learning, they deserve payment. They will have their own data sets to train their AI on anyway, so they will still be able to use the technology.

These types of lawsuits will only serve to harm individuals and small companies, while giving massive companies a big advantage.

12