FuturologyBot

The following submission statement was provided by /u/Maxie445: --- "Microsoft AI boss Mustafa Suleyman incorrectly believes that the moment you publish anything on the open web, it becomes “freeware” that anyone can freely copy and use. When CNBC’s Andrew Ross Sorkin asked him whether “AI companies have effectively stolen the world’s IP,” he said: >"I think that with respect to content that’s already on the open web, the social contract of that content since the ‘90s has been that it is fair use. Anyone can copy it, recreate with it, reproduce with it. That has been “freeware,” if you like, that’s been the understanding." I am not a lawyer, but even I can tell you that [the moment you create a work](https://www.copyright.gov/help/faq/faq-general.html#:~:text=Your%20work%20is%20under%20copyright%20protection%20the%20moment%20it%20is%20created), it’s automatically protected by copyright in the US. You don’t even need to apply for it, and you certainly don’t void your rights just by publishing it on the web. In fact, it’s [so difficult to waive your rights](https://creativecommons.org/public-domain/cc0/) that lawyers had to come up with [special web licenses](https://en.wikipedia.org/wiki/Copyleft#:~:text=Copyleft%20is%20the%20legal%20technique,be%20preserved%20in%20derivative%20works.) to help! Fair use, meanwhile, is not granted by a “social contract” — it’s granted by a court. It’s a legal defense that allows *some* uses of copyrighted material once that court [weighs what you’re copying, why, how much, and whether it’ll harm the copyright owner](https://www.copyright.gov/fair-use/#:~:text=Fair%20use%20is%20a%20legal,protected%20works%20in%20certain%20circumstances.)." 
--- Please reply to OP's comment here: https://old.reddit.com/r/Futurology/comments/1dshg9r/microsofts_ai_boss_thinks_its_perfectly_ok_to/lb2f5mh/


THE-BS

So they shouldn't have a problem with this copy of windows 10 I found..


FartyPants69

Right?! This is such easy "logic" to defeat that it's beyond shameless. His argument is that the moment anyone posts a Microsoft product online, it's "fair use." Somehow I have a feeling he'd object to that assertion pretty damn quickly. I'm so fucking tired of big tech's rapacious bullshit.


shion12312

I feel you buddy


69CunnyLinguist69

What did he feel like?


surle

Boiled cabbage


ProudMount

My cabbages!


Audio9849

Yeah didn't an IT refurbisher recently go to jail because of Microsoft? Basically he was refurbishing old dell machines and selling them and since they almost always used to come with a Microsoft key on the machine he was using that for the OS and Microsoft had a problem with that and sued him. Went to jail for 3 years I think.


Lifesagame81

Microsoft sells and distributes refurbishment discs. Lundgren had tens of thousands of copies made abroad, imported them, and sold them to refurbishers. Customs brought the case against him, not Microsoft, and they actually issued him a warning the first time they caught him importing these counterfeit discs. He continued to import and sell them.


Sourpowerpete

You use this logic to point out that it isn't morally correct. I use this logic to say DMCA is bullshit and current copyright laws and business models still don't fully grasp the idea of unlimited free widespread instant distribution.


oshinbruce

People should get paid for the work. Problem with the current model is only those that can afford a team of lawyers can take advantage of that.


Elissiaro

And also the copyright lasts way, waaaay past the creator's death. Iirc it's like 80 years. Basically a whole lifetime.


Terpomo11

I say reduce it to the original term of 14 years plus an optional 14-year extension.


Jack_Harb

I agree with you. Basically every one of us is a criminal. We all used pictures or texts out of the internet for free for our PowerPoint slides in school. The artists out there learned on redrawing existing pictures, without paying anything to the OC for the learning material. Hell even Reddit is just such a criminal system. People post, repost or write so many threads with content they don’t own. Still they do it and Reddit is growing and growing and making money. So basically they make money with content they don’t own and blame it on the user if someone is mad. In the end we are ALL pirating in one way or another, since everything on the web is accessible. But people don’t realize they are pirates. They throw stones out of a glasshouse not knowing they are in a glasshouse. The current laws do not update fast enough to keep up with technology.


jayvil

Pirating windows is just a lose lose situation. Yea, technically they didn't receive money from you directly but they can still collect info from your computer through their telemetry and sell that to third party.


Talinoth

They do that anyway even if you buy a legal copy. "Lose-lose" adequately describes *paying* for Windows.


IAteAGuitar

Not if you use an unattended (and cleaned up of all the bloatware and spyware) version. I have very few moral problems using one after reading this post.


Cr4zko

Do you think Microsoft gives a fuck if you pirate Windows? If they did, why did they stop trying after XP? No, it's the businesses that pirate that they go after.


reallyserious

Is that cleaning of the iso done by a third party tool? That would mean you have to trust that third party to not install a root kit and whatnot.  Also, MS could just push an update that installs all their stuff again.


LeftHandofNope

And hubris


Catch_022

Yes, but iirc Microsoft doesn't really care if people pirate their OS - they would prefer you to use a 'stolen' copy of Windows rather than Apple or Linux for e.g.


FlappyBoobs

Actually yes, they are fine with it. MS have always allowed copies of their software (even says it on the disc or cd) and twice have given legit registered windows away for free. Windows 11 is still basically free. You can get the ISO from MS direct, and use it unregistered without penalty, a key to register can be got by most people for free from MS provided they have an older copy of windows post XP, but it's a little hoop jumpy for my tastes. Most of their profit comes from business licenses which is why they tend to leave personal people alone, especially as most people still buy from SIs


username_elephant

They also low key provide support for pirated versions to cut the spread of malware.


dreadcain

I'm running windows 11 with an xp key right now, though idk if they stopped allowing that sometime after I registered mine


Somnambulist815

what do you mean, always? You don't recall Bill Gates putting out an open letter to tech devs about how much he hated freeware and file sharing?


FlappyBoobs

I mean sure, fine, you are the best kind of correct. But Bill Gates also said: >I believe that if you show people the problems and you show them the solutions they will be moved to act. So I guess he can be proved wrong and change his mind ;)


Mindfucker223

They don't, never have. Why do you think there is a windows key generator on Github? They make more money if you use windows than when you don't


visarga

They tolerated piracy in many countries for decades. At least for home use.


Badgerized

"Found".... -yar- -har- -har- -har- ![gif](emote|free_emotes_pack|joy)


Faleya

I mean you can even find it on the official microsoft page, sure you get the "this is unlicensed" info but thats about it.


danielv123

It also prevents you from changing your desktop background through the new settings app. Can still right click image and "set as background" though.


stprnn

Ironically,they don't.


IIlIIlIIlIlIIlIIlIIl

Also even if they did pirated copies of Windows are not posted by Microsoft themselves so it wouldn't really be the same thing as what the guy is talking about. Unregistered (not pirated) Windows though would be fair-game, which again always has been.


Etroarl55

It’s supposedly not secure anymore based on that new LTT video. And conveniently won’t be supported by Microsoft either as they try to push everyone to w11


Aetheus

Link to the vid? I've heard that grey market keys can theoretically be deactivated remotely anytime, but I've yet to hear of it actually happening in the wild.


Seralth

They 100% can be deactivated at any time. Microsoft very much has the power to basically kill the grey market entirely. Realistically it won't happen since they learned after XP that it's pointless. It's better to give everyone a copy and allow piracy en masse. Build the product to collect data and other metrics that are worth more money. Microsoft only benefits from you stealing from them in basically every reasonable sense. Larger market share, data harvesting, bulk contracts to system integrators that are the supply of the grey market. Microsoft will profit off you no matter what. You /can't/ steal from Microsoft. They have entirely sidestepped the problem. Hell the only reason windows for consumers even has a price tag at all at this point is because people will still pay it so there's no reason to not take those people's money if they are entirely willing to give it to them. Monetize EVERYTHING.


novophx

they literally do not care, you can use unregistered official windows forever only difference with unofficial copy is text on bottom right


Siebje

In fact, I wanted to start a software company, so I googled what could be a good name for it. I found this name "Microsoft" online, which even already comes with a logo. Looks good to me, I think I'll use that.


haritos89

The windows copy you found wasn't posted legally. The masterpiece I drew was. Artists copy (or steal) content all the time. Why are we now shocked over this practice? Because a program does it? I think people are asking the wrong question / blaming the wrong things.


amanda_sac_town

What are you talking about, you can download isos of most windows operating systems directly from Microsofts website...


Human-Sorry

Their company was built on the theft of IP from employees. Of course the company culture condones this as it works quite well for them. Most companies have followed suit. You now sign your IP over for a meager paycheck. Read that fine print. Boycotting is the only recourse. https://livingwage.mit.edu/ and/or End Crapitalism r/SolarPunk


Ax0nJax0n01

I think you mean 11


TheLowClassics

Point taken. Open season on ms software.  Sail the high seas. They’re cool with it.


dumpling-loverr

They're welcomed as honorary members on r/piracy. Big tech already uses user uploaded data to train their AI models whether we like it or not anyway. Reddit even agreed to use the site to train OpenAI.


taimusrs

> Reddit even agreed to use the site to train OpenAI More like OpenAI already used it anyway, and maybe decided to give a bit of a kickback because Sam Altman is also Reddit shareholder. AI companies already crawled websites before anybody knew about the bot


VenomsViper

>More like OpenAI already used it anyway, and maybe decided to give a bit of a kickback No. The whole third party app thing was because of this and Google. They were using the Reddit API to be able to digest all of the posts and comments with an API call. Reddit decided it needed to be paid and paid well for essentially being the central hub of training language model AIs and started to charge hefty sums to access the API. This is why small third party Reddit app devs had to shut down. With so many API calls made from apps like RIF the cost got insane. But not insane for the likes of OpenAI and Google, who now pay for Reddit API calls. Worth noting for anyone that is missing their third party apps on Android, there is a way to make it happen. There's still a floor for number of API calls an app makes before money enters into it and there's a way to register an app with Reddit that is "yours" but is just a build of RiF for example. DM me if you'd like to know how.


v0gue_

I know that was sarcastic, but I'm pretty sure MS *is* cool with it. They obviously don't encourage pirating their software, but they make their money from cloud services. They'd probably rather you pirate their software and remain in their telemetry ecosystem than not have their software at all


hawklost

Microsoft will even do updates on pirated software because they prefer to keep malware at bay than to allow it to harm their rep. People saying 'oh, lets just pirate Windows' ignore the reality that Windows has been given away for free multiple times, and Microsoft has never prosecuted individuals (companies are different) for pirating a copy.


VenomsViper

Can confirm. Have had pirated Windows since Windows 7 and it just keeps letting me use support, upgrade to the next version for free, etc.


_163

You can also literally download a windows ISO for free from their website, and if you install it without a license pretty much all that happens is you have the "activate windows" watermark, which can be removed anyway


InnerDorkness

I built this software from portions of existing software that 100 of my friends gave me, I didn’t pirate shit


Hoggel123

I can't wait until AI starts citing its sources, and they're all from porn or malicious sites


LindsayLuohan

Q: How do you find the hypotenuse of a triangle. A: The big cock hypotenuse is found by…


MrGOOGIE

The t.m.i Length times Girth over Angle of the Shaft (aka YAW) divided by mass over WIDTH.


Pubelication

> Under the agreement, the company behind the ChatGPT chatbot will get access to Reddit content, while it will also bring AI-powered features to the social media platform. https://www.bbc.com/news/articles/cxe92v47850o.amp


quondam47

Not that I agree with it but it makes sense for companies to do deals like these since the AI companies are just going to scrape your site regardless.


WeeklyBanEvasion

It would be hard to cite thousands of sources simultaneously


fuck_the_fuckin_mods

Lots of people *still* have zero clue how any of this works. They seem to think it’s just making a collage from chunks of a few different sources. That is not at all what is happening (obviously) but many seem to have trouble getting away from this misconception.


wellboys

I mean, it is. It's a probability machine that responds to natural language prompts in order to create a facsimile of your intended product. Or maybe I'm wrong; please educate me then.


IIlIIlIIlIlIIlIIlIIl

It's just super fancy auto-resolve. It doesn't quite "cite" a source as much as it goes through a bunch of relevant sources and gives you the output based on all of them. Unless it's literally quoting, for every word it says it'd have hundreds or thousands of sources, so it's generally just difficult to boil it down to one thing to cite.


wellboys

I don't disagree with you.


kaibee

> it goes through a bunch of relevant sources and gives you the output based on all of them. Wrong. Once the model is trained/being used, there is no more going through sources.


danielv123

The sources are already gone through. I guess you can cite the whole training set and context window for every token produced.


fuck_the_fuckin_mods

In terms of image generation, there is no way to track which individual pixel or group of pixels came from where. That’s not how it works. There are no intact “chunks” of something copied from somewhere else. The output is for all intents and purposes “original.” Same with text really. You might incidentally end up with similar wording to an individual source, but it’s really looking at patterns across thousands or millions of sources and amalgamating those patterns into something “original.” It’s not “quoting” or “copying” anything. That’s kind of the whole idea. In the same way I can look at a thousand Disney characters and design my own unique character that shows similarities to Disney’s style without infringing on copyright, generative AI can do more or less the same thing. It should be judged through the same lens with the same laws. As to scraping data from the open web, that’s common practice for all kinds of purposes, and would need new laws that apply to all of them. As it stands, the guy seems like a douche, but he’s not really wrong. I can scrape a million Disney character images from Google image search, study them intensively, and create something “in the style of” Disney, without violating any laws (unless I directly copied their logo, or trademarked colors or whatever).


WhyWasXelNagaBanned

The problem is that ***machines are not people***. Machines do not "draw inspiration" from looking at a thousand characters, like people do. The machine requires the direct input of source data to teach it and generate things based off of that data. The human artists who created the source data used to teach the machine should rightly be compensated for their work being used, as it is often done without their permission.


EqualityWithoutCiv

Problem is, copyright law most of the time fucks over the poor so much more than the rich in its current state.


Macaw

working as intended. The golden rule, those with the gold makes the rules.


Parada484

Copyright law is a huge field. Large enough to fill specialty law firms with lawyers to practice in. Large enough to fill libraries with secondary sources regarding its origins, explaining statutes, and discussing the common law decisions of hundreds of cases. Copyright law is what allows Project Gutenberg to make thousands of works publicly available. It helps start-ups gain competitive advantage through patents. It forms the backbone of Open Source licensing agreements that have helped launch dozens of technologies. The issue is much, much more complicated than just rich people creating rules. Does that happen? Oh yeah (looking at you Disney), but it is by no means an entire branch of the law designed to aid rich people. If that's what you're looking for then mosey on over to Trusts and Estates/Tax Planning. That's my wheelhouse and I guarantee that the rich have a field day over here.


Janktronic

> Copyright law is what allows Project Gutenberg to make thousands of works publicly available. Copyright law is also what **prevents** Project Gutenberg from making countless thousands MORE publicly available, when they should be. Namely everything that had its copyright retroactively extended by the copyright law of 1976. They fucking STOLE the public domain.


-The_Blazer-

Right? Disney gets a 100-year-long copyright on the concept itself of Mickey Mouse wearing a specific set of clothes that they can ruin people's life with for even minor infractions, but your thing that's been out for 3 months can be copied (and then used) into the learning data of a private, for-profit AI system that is more locked-down than the Coca Cola formula and strikes billion-dollar agreements with other corporations.


Hubbardia

> Problem is, ~~copyright~~ law most of the time fucks over the poor so much more than the rich in its current state FTFY


D4RK3N3R6Y

Not really, in my country you get prosecuted only if you profit out of it.


parke415

If it’s something that I, a human being, am allowed to use freely, then AI should as well. Just make sure the AI cites sources whenever a human would be expected to.


lynxbird

> Just make sure the AI cites sources whenever a human would be expected to. This would trigger so many legal issues that they will never fully disclose all the sources.


FluffyCelery4769

It will just make them up lol


maybelying

This is it. AI should be free to learn from public information, but restricted from simply copying and misrepresenting existing content as their own, just like we are.


Masonjaruniversity

Companies who use the internet to train their models should 100% have to pay out to the public similar to the Alaskan Permanent Fund Dividend. The internet is a resource that they 100% need to train their models. We provide that resource. Citizens of the world should get a piece of that as well as free access to whatever discoveries the models come up with. Again, we’re giving them access to the training data. They’re going to make trillions of dollars with the multitude of applications they’ll be able to apply this technology to. I know this 100% isn’t going to happen because how else are we gonna have immortal trillionaires.


Macaw

it is going to be privatize the profits, socialize the losses, as usual.


CremousDelight

I agree, similar kind of thing as government-funded research and the public deserving access to it. If AI is trained on the people, then it should be for the people.


maybelying

In Alaska, companies paying into that fund are taking resources that can't be replenished, which justifies the fee. Charging companies for allowing AI to access the internet is like charging people to access a public library. The information is out there and nothing is being lost.


FactChecker25

> Companies who use the internet to train their models should 100% have to pay out to the public similar to the Alaskan Permanent Fund Dividend. This makes zero sense. Do you need to pay out to the public for using the internet? Why would they? There is absolutely no legal standing to support what you're proposing.


Days_End

How about a compromise? You get the same amount you got from everyone who learned to program, draw, write, or fix plumbing from the internet.


-The_Blazer-

This already exists, it's called selling a textbook or a course.


Auno94

Funny thing, most of the things people do to earn money are things they have learned in a school that they have paid for. While AI does not pay for it and wants to earn money from scraping it from the internet and remixing it


visarga

You never pay for information in school, just for tutoring.


Whotea

Writers don’t have to pay anyone to read something online and get inspiration from it so why should they 


Accomplished_Cap_994

Please cite your sources to comment


Krazygamr

This is the problem I have with ChatGPT now. It tells me things, but I need to know what it's referencing because I don't want to keep asking it questions. There comes a point where it is better/faster to reference the source.


mangopanic

The thing is, it's not referencing anything. I think people assume LLMs are pulling information from sources, but it's literally just a sophisticated word predictor. Its "source" is "my weights estimate word X appears 80% of the time in this context"
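
To make the "word predictor" point concrete, here is a toy sketch. It is only an illustration under loud assumptions: real LLMs use neural networks over tokens, not the simple word-pair counts used here, and the function names and the tiny corpus are made up for the example. What it does show is that after training, the model holds only follow-on statistics, with no record of which sentence any count came from.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Turn raw follow-counts into probabilities for the next word."""
    followers = counts.get(prev.lower(), Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # word never seen in training: nothing to predict from
    return {word: count / total for word, count in followers.items()}


# Hypothetical training text, invented purely for this example.
corpus = (
    "zebras have stripes . zebras have stripes . "
    "zebras have hooves . tigers have stripes ."
)
model = train_bigram(corpus)
probs = next_word_probs(model, "have")
print(probs)  # "stripes" follows "have" 3 times out of 4, "hooves" once
```

Asking this toy model to "cite" where it learned that "have" is usually followed by "stripes" has no answer: four separate sentences contributed to one number.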


LichtbringerU

Some models are connected to the internet and can pull current data or links. But in general yeah true.


bremidon

Usually just saying "include references" works for me.


Sixhaunt

It doesnt always work and often it doesn't know the source or will make it up. It's like if you were asked where you learned that zebras have stripes. You have just seen it often in zoos or tv and it's been mentioned often enough but you don't really have a specific source to point to. You could reference specific instances you remember of seeing it but you're not going to remember where the original source is that you learned it from and sometimes things are never explicitly spoken but instead are inferred so it doesn't necessarily have a source. You might also reference the dictionary/encyclopedia's entry on zebras or something, even though you have never actually looked at that page in your life but can assume the fact to be there and so it makes sense for the AI to make up guesses for sources even if they are not always valid.


bremidon

Agreed that it does not *always* work, but often enough that I generally don't have a problem. And if something is generated with bad references, I just regenerate.


hawklost

ChatGPT isn't something you should trust with randomly asking questions. **Nothing** it says in the wild should be considered factual. Now, if you ask it to look at a specific article and summarize it, that is different. But if you just ask if 'Who was the first president of the US' you shouldn't trust its answer even if it is likely 100% going to answer correctly.


TyrialFrost

>AI cites sources whenever a human would So literally nowhere except in academic papers and some forms of journalism?


parke415

Guess so! Want more restrictions? Place them on humans, too.


RelativetoZero

Oh my god! I love restrictions! /s


rascal6543

I'm imagining the shitshow that would occur when AIs start citing the source as ChatGPT, and it's beautiful


Mad1ibben

Except for the whole profit thing. People can still be sued for using IP, this pretending the internet is a buffer has been repeatedly proven in court to not be a valid argument. This is the same thing with an extra step. As long as nobody is making money off of it it is legit, as soon as whoever is interacting with the AI makes profit off of that IP they are as much in violation as a producer that has swiped a sample.


nextnode

No. No one is getting sued for having learnt things off the web and internalized the content. Which is what the commentator was saying. There is a difference between using other works as inspiration and including them in your work.


bremidon

>As long as nobody is making money off of it it is legit No, that is not how copyrights work. If you want to enforce copyrights across the web, get ready for entering a world of pain. There are no hard and fast rules. There are guidelines that are all weighed against each other. Things like whether you are creating something new, whether it is in the public interest, whether it would infringe on the original author being able to profit off their original, whether it is parody, or whether it transforms the original work so much as to no longer say the same thing: all these (plus more) get looked at and balanced against each other. The internet \*has\* been a place where things are a bit more loose, but that is more out of convention than because it is strictly legal. The only thing stopping the big companies from making our lives hell is that they already tried it, and we made them regret it. Let's not give them any hope that they can try again.


parke415

Right, but the same applies to profit-seeking humans.


flatulentence

Technically we owe all our wealth to neanderthals


Njumkiyy

Frankly I agree with you. AI isn't some big evil, but a tool. The better it gets, the better it's likely to positively impact humanity. The Internet ended a whole bunch of jobs and changed the landscape of possible jobs to a degree we only really saw with things like the printing press but you don't see a massive amount of people saying that we are ultimately worse from it. Same thing with Photoshop and digital drawings lowering the bar for entry into the arts. We should be taking steps to ease the transition to AI though as it's got the ability to be massively harmful if used incorrectly.


war-and-peace

The way everything seems to going, the only thing AI will be used for is to serve better ads for us.


notirrelevantyet

That's not at all the way everything seems to be going though. Why do you feel that way?


Njumkiyy

I don't really know about you, but I've used AI to help me in calculus assignments, learning where I went wrong or figuring out the steps of certain areas I may have misunderstood after reading my textbook, writing small lines in SQL and Java, to helping me expand backstories that I write for DnD characters and generating pictures of them instead of just pulling a random google image picture. Chatgpt and other AI programs definitely have benefits beyond just ads, it depends on how you use them. That isn't even mentioning how scientific communities are using LLM AI's to basically brute force different types of material sciences. It definitely will increase the amount of low effort content, but it also helps in tons of ways as well


size_matters_not

With regards to the printed press, this is not true. The whole dumbing down and partisanship of the media has been driven by the internet as media companies slash costs due to advertising revenues tumbling. If anyone has ever complained about fake news or clickbait - that’s a direct result of the internet on the press.


parke415

I agree, and much as I feel about GMO foods, I support them as long as they’re clearly labeled as such. A “Made By Artificial Intelligence” disclaimer will suffice.


TaqPCR

When polled 80% of Americans were in favor of mandatory labels for food containing DNA. (vs 82% for GMOs) The general public is not qualified to know what should be on a label.


StateChemist

Some companies are doing this now, and your cellphone uses enough AI that every one of your personal photos gets tagged.  Every photoshop user would be tagged and the distinction between human works and AI works gets blurrier instead of clearer.


Mephisto506

Sure. Just try using the IP of a big corporation and see how far you get.


nsfwtttt

I use them every day when I “train” my brain on reading, writing, talking, drawing. So far no one said anything as long as I didn’t copy directly.


parke415

Disney and Nintendo fan art abound.


Fresque

Flashbacks of petabytes of pokeporn


PremedicatedMurder

Why? Why are you granting an AI (a product which eats other products) the same rights as a human being? That's like saying: If I, a human being, am allowed to vote, then AI should as well. Completely nonsensical.


FillThisEmptyCup

When artists tell me I should get perpetual royalties for my work as a programmer or bricklayer, I might consider their request for royalties on training data. Until then, I’ve seen the same people who are complaining loudly happily take chinese and sweatshop labor for their disposable products… as well as buy knockoff products that plainly put US and european brands out of business. I don’t have much sympathy left.


Stahlreck

> That's like saying: If I, a human being, am allowed to vote, then AI should as well. No...that is not even remotely the same thing. Are the requirements for voting the same ones as opening the internet? I kinda doubt it.


bremidon

Careful. Be very careful. I see lots of people jumping in with rash comments. The problem is that he is more right than wrong. We spent a \*lot\* of time and effort to try to make sure that we can just copy and paste whatever we find without being afraid of being sued into oblivion. If you want to see how it can all go wrong, just look at YouTube. Copyright claims are absolutely a type of warfare there. Even when the law is 100% on your side, good luck trying to get YouTube to pay attention. The way it is set up, you end up facing some lengthy expensive legal battles. And if you are a YouTuber that depends on that channel, you also face the possibility of having your entire presence and livelihood zapped. The moment we start screaming about needing stronger copyrights, the big companies will happily swoop in and we will find ourselves effectively locked out of participating on the Internet. And do not think that there is any easy way to somehow block off AI learning and not blow back on us. There isn't, and the lobbies will be more than happy to screw over 99% of people so their bosses can get a fatter paycheck.


MrSimQn

This is something I thought of during the great AI art debate of 1-2 years ago. When people started making AI models to mimic the art style of certain artists and some groups started advocating that an art style should fall under copyright to protect the artist. But seemingly no one had the foresight that Disney or another mega conglomerate would just swoop in and own not only the art itself but also any art that would fall under the same "art style".


IIlIIlIIlIlIIlIIlIIl

People generally just seem mad that AIs can do the things that they do but easier/automatically. When a human knows all of the artworks from X and makes their own completely original artworks but in the style of X that's ✨*inspiration/a tribute/a modern take on*✨ but an AI does the exact same thing and that's copyright infringement!!!


MrSimQn

Don't get me wrong, I'm not some massive AI art supporter. Morally I side with the artist who took years to hone their craft and develop their skills vs a tool that can generate art on a whim. However, now Pandora's box is open and we have to make the best of it.


Sad-Set-5817

People are mad that a machine is training off of their final professional copyrighted work and selling it in a way in which the actual artist gets none of the benefit. The machine doesn't learn like a human, nor does it matter, you cannot grant copyright to an image you did not create. Granting AI the same copyright protections as if a human made it would only hurt society as a whole. People are mad that "hustle culture" types are wholesale plagiarizing creators' work by having ChatGPT reword popular videos and reupload them with AI footage. AI adds nothing that isn't already in its training data. It adds nothing, just remixing already existing data. This isn't progress. I don't know why we are so keen on replacing artists with their own work.


TaxIdiot2020

No one considered that this is literally how actual artists learn, either. People observe other types of art, work on their own skills, but ultimately use what they've already seen as the basis of their own work.


notmyrealnameatleast

Yeah that's the thing. That's how democracy is undermined. You want to push some new law or something, you first push something else that makes the public want to make that law, then you swoop in and do what the public wants...


notirrelevantyet

If the browser was created today people would be outraged that it has right click save as functionality.


FelixtheFarmer

And that is why at the domain I help run we recently blocked all access from Microsoft's Autonomous System so they can no longer crawl our website for their AI training. Over the last few weeks we had noticed 30 or 40 instances of their crawler at a time indexing our site and wondered what was going on. Now CloudFlare blocks all attempts from their AS and we won't be letting them back in ever again.


yoomiii

Can you tell the difference between Microsoft "AI" crawlers and Bing search engine crawlers?


FelixtheFarmer

On our Domain Bing shows up as Bing\[Bot\] in the list of online users, the relevant part of its user agent is this bit "+http://www.bing.com/bingbot.htm". The other user agent was different but I don't remember what it was and CloudFlare's logs only go back a few days. Blocking Microsoft's entire AS is a bit of a blunt tool to use and it does block Bing as well but that is a price we're willing to pay. We also block Facebook after they repeatedly sent waves of crawlers over the site, their crawlers had "AI" in the user agent name so they were easy to identify. I don't feel that the knowledge built up by our user base is there for large corporations to help themselves to. They could ask and we could poll the members but they just thought they had the right to take it without even asking.
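A rough sketch of the kind of user-agent sorting described above, for anyone who wants something finer than an AS-wide block. The Bing marker is the one quoted in the comment; the case-sensitive "AI" check and the `ExampleAI-Crawler` name are assumptions for illustration, not real crawler signatures, and user agents can be spoofed, so treat this as a first-pass filter only.

```python
def classify_user_agent(user_agent: str) -> str:
    """Sort a User-Agent header into a coarse bucket."""
    # Marker quoted in the comment above for Bing's search crawler.
    if "bing.com/bingbot.htm" in user_agent:
        return "search-crawler"
    # Naive, case-sensitive check for "AI" in the agent name.
    # This is an assumption; real AI crawlers may not label themselves.
    if "AI" in user_agent:
        return "ai-crawler"
    return "other"


if __name__ == "__main__":
    samples = [
        "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
        "ExampleAI-Crawler/1.0",  # hypothetical agent name
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0 Safari/537.36",
    ]
    for ua in samples:
        print(classify_user_agent(ua), "<-", ua)
```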


arothmanmusic

I'd venture to guess this is how most humans think as well, as evidenced by the existence of memes.


space_monster

The title is a logical fallacy anyway, begging the question - the argument is assumed to be correct in the premise. Using public content for training is not stealing. People do it all the time. Directly reproducing copyrighted content is however a copyright infringement. Modifying public content is fair use.


chcampb

His terms are wrong but the crux of the issue relates to copying vs perceiving. It's generally accepted as fair use to make a copy of a text or image from the internet, in order to process it. What is being stretched is the nature of "process." In general, a human is totally free to perceive web content. Perceive in this case means it is routed through the brain in a way that the brain can recall or use the content in some form to synthesize new information or be creative or answer questions. If a human can do it, AI can do it. To be more clear, if we had a switch in our brains to turn off learning - make it impossible to store content you see long term, to learn, to synthesize. Would we require that people flip this switch in order to access internet content to which they have no explicit right to learn? No, of course not, and also we would consider this borderline abuse.


joomla00

The difference is when a human does it, there are natural limitations that we all, as humans, agree is acceptable. Limitations both in learning, and in production. A machine that can, in hours, consume, "learn", then output instantaneously and infinitely is not equivalent. We're in a new frontier, so we need updated laws. In the end, human laws are there to protect humans.


Kirbyoto

>Limitations both in learning, and in production. Even though people make their living as chefs, it is not possible to copyright a recipe - because it is ultimately a simple list of ingredients and instructions. This is because not everything a human produces can be copyrighted. Some things just belong to the general public, and they *have* to in order for society to function. >In the end, human laws are there to protect humans. And you don't think that, for example, huge corporations will be better at leveraging those "human laws" than the average person is?


nextnode

Uhm, no. I never heard about any limitations to what you can learn nor am I agreeing to any. If you want to have limitations about what we can do with the material - such as sharing it with others - sure, but then we can just apply it to both humans and machines, and you better be able to formalize that as a law rather than some undefined expectation that only exists in your head.


InsufferableMollusk

This is kind of where my thinking is too. And besides, anyone that believes that efforts to hamstring Western AI efforts aren’t at least *partially* encouraged by bad actors, is naive AF. These things *will* happen, regardless. Some authoritarian nation with access to your data and your work will use it to train an AI. They won’t GAF about our laws 😆


The_Iron_Goat

He’s conveniently pretending not to know that, especially in the case of images, the original creator is often NOT the one who put it out there where a google or bing search could find it


FartyPants69

That's also irrelevant, as the creator owns copyright from the moment the shutter snaps, whether they know it or not.


Leave_Hate_Behind

Then everybody needs to stop that the AI is the reason that all this content is there in the first place.


Cr4zko

I have to agree... it's not stealing if you put it out there for free to people to gawk at.


EngineerBig1851

Ha yes, *stealing*. Guess I stole this post, and the article too. Gonna go steal some more reddit posts.


IamSeekingAnswers

"It's ok for me to use this picture, I found it on Google Images"


duckrollin

This is a shitty article. Training AI falls under Fair Use, it's not stealing and nothing goes missing from the originator. Developers "steal" code every day, it's called Open Source and it's brought software dev forward decades. You "steal" a reddit post every time you read it or show it to someone. This is the free sharing of information and Mustafa Suleyman is absolutely right.


Omnom_Omnath

Is it stealing content when a person reads something on the open web?


MikeDizzIe

Sweet, that's where I found the cracked versions of office.


stonertear

Yep, it's all fair game if it's on the internet and not locked behind a paywall. I don't see an issue with AI looking at content that is freely out there on websites. It's the same shit when we read or scrape websites.


Ok-Seaworthiness7207

Funny how the phrase "rules for thee but none for me" is drifting from being pointed at governments to now tech corps...


Aware-Feed3227

It’s perfectly okay to steal Microsoft products as well


Gibbonici

I think the only real salient point in this whole article is this line from Suleyman: >That’s a grey area, and I think it’s going to work its way through the courts. It's specifically in reference to robots.txt, the file any website can use to request that bots don't crawl it for any reason other than indexing. It's never been binding and relies entirely on goodwill to be honoured. It's an artefact from the idealistic days of the early internet which probably does need some kind of international legal ruling to strengthen it. Somehow. In a world full of competing nations, where international law is fluid and optional wherever an advantage can be had. But the important part of that line is the courts and how internet content can and should be protected. It's easy to say "no, it's all protected by copyright and nobody should be able to use it or make money out of it without express permission", but think about what that means. How many times per week, or even per day, do we read something on the internet, then use that in our body of knowledge? I'm a web developer and at this point nobody in my field could do their job without reading content from the internet and adapting it for their own use. I don't doubt that it's the same in almost every professional and creative field to some degree. AI uses information from the internet in the same way. It doesn't just search its databases and paste content directly into its responses. It doesn't even store that information verbatim in its databases. It uses that information to build a context around a given subject, and then generates its own content derived from that context. Pretty much the same as humans do. So how do we differentiate fair use for humans from fair use for AI? In both cases content is consumed, processed and turned into knowledge which is then used to create more content.
In both cases millions of dollars, if not billions, are made out of it, either by humans collectively, or by the companies that develop, maintain, and run AI. Of course, we could just say if it's AI then it can't use it this way, but if you're human you can. Nice, simple and tidy, right? Only we're back to the problem with international law and competitive nations. All it takes is for one power, the US, the EU, China or - who knows? - Nigeria or Brazil to set themselves up as an AI friendly nation and we're back at square one. We're living in a very complex time where new technologies have arisen over a very short period of time, and our global system and legal systems, even our moral and social systems, have yet to catch up with it. It's easy to forget how very new AI is, and easy to forget how very new the internet itself is. Hell, even affordable, universal computer technology is only a few decades old. All of these rapid advances compound and intersect in ways that we, as humans, are still struggling to understand, and by the time we start to grasp one thing, a load of other things have already emerged. We're living in the most universally transformative time in human history, more so than the Industrial Revolution, or the Renaissance, or the golden ages of Athens and Rome. And it's happening at a pace that's measured in months and years instead of decades and centuries. There are no simple, easy answers to any of this. It's going to take us a while to figure it out. Right now we're still chasing the bus without really understanding what the bus is.


FactChecker25

I do not agree with the outrage in here. The AI model is basically doing what humans would do, where they read various sources to gain knowledge and form an opinion. Even if you look at artists, you can often tell who their influences were. I never see anyone suing filmmakers accusing them of being inspired by Stanley Kubrick or suing photographers for being inspired by Ansel Adams. But when it comes to AI everyone is demanding to know where the "inspiration" came from. I think if we applied the same level of scrutiny to human artists you'd find similar levels of "creative plagiarism".


Pavement-69

The Coca Cola logo is on the internet, so it's fair use and I'm free to use it however I want!?! Wow!!! I did not know this... 🤦🏻‍♂️🤦🏻‍♂️🤦🏻‍♂️


AugustsEve

If reading content on the open web is stealing so is reading a billboard. Where do the folks against this want the line drawn? AI reads and regurgitates information, same way we all do. If AI provided an answer to a question based on information it has consumed from the open web, it's stealing? What about when we as humans do it, because there's precious little of our individual knowledge that we've earned for ourselves. Are we all to never repeat any bit of information we haven't personally discovered? Limiting access to knowledge is never a good thing.


FlorinidOro

Lmao wtf? Sooooo if I buy Microsoft keys on AliBaba for a couple bucks, leave me the F*** alone


CBrinson

Microsoft fought and lost this legal battle over LinkedIn. They wanted to stop people from scraping and copying it and courts ordered they had to let people use the data since it was on the open web.


Shoose

Is looking at something, then imitating it exactly what humans do anyway?


KoolKat5000

This article is clickbait. And judging by the comments on here, people misunderstand copyright law and people misunderstand GPTs.


SilveredFlame

Tech companies for 30 years: "Downloading stuff from the internet is stealing!" Tech companies today: "Hey it's on the internet it's free!" Personally, I *agree* with the AI folks here. AI isn't doing anything we don't do, it's just better at it. We read stuff or otherwise consume it, then at some point we generate something from junk we've taken in. Same with art. We look at/learn different styles of art then generate something based on that knowledge and what we're trying to create.


OrbitOli

It *is* different though? We don't produce 100s of images in minutes, AI doesn't know if what it made is good or bad, it compares it to 1000s of pieces of art it has seen before and tries to make it similar but it does not know technique on how to do things, it copy and pastes by the pixel. And if it's not good enough according to the user they can just say "do it again" and out comes another 100 images.


green_meklar

Copying isn't stealing, copyright is a bad idea and always has been, and the sooner AI can kill it the better. Learning from the Internet is what *actual humans* already do anyway. Is it a problem when biological neural nets incorporate Web content into their training dataset? Should we all pay for the data we store in our brains? Actually I'm pretty sure a lot of media companies *would* make us pay for the data we store in our brains if they could, which to me just seems like an argument against IP. It seems really arbitrary to declare that storing data on magnetic disks is somehow fundamentally worse than storing it in meat.


Maxie445

"Microsoft AI boss Mustafa Suleyman incorrectly believes that the moment you publish anything on the open web, it becomes “freeware” that anyone can freely copy and use. When CNBC’s Andrew Ross Sorkin asked him whether “AI companies have effectively stolen the world’s IP,” he said: >"I think that with respect to content that’s already on the open web, the social contract of that content since the ‘90s has been that it is fair use. Anyone can copy it, recreate with it, reproduce with it. That has been “freeware,” if you like, that’s been the understanding" I am not a lawyer, but even I can tell you that [the moment you create a work](https://www.copyright.gov/help/faq/faq-general.html#:~:text=Your%20work%20is%20under%20copyright%20protection%20the%20moment%20it%20is%20created), it’s automatically protected by copyright in the US. You don’t even need to apply for it, and you certainly don’t void your rights just by publishing it on the web. In fact, it’s [so difficult to waive your rights](https://creativecommons.org/public-domain/cc0/) that lawyers had to come up with [special web licenses](https://en.wikipedia.org/wiki/Copyleft#:~:text=Copyleft%20is%20the%20legal%20technique,be%20preserved%20in%20derivative%20works.) to help! Fair use, meanwhile, is not granted by a “social contract” — it’s granted by a court. It’s a legal defense that allows *some* uses of copyrighted material once that court [weighs what you’re copying, why, how much, and whether it’ll harm the copyright owner](https://www.copyright.gov/fair-use/#:~:text=Fair%20use%20is%20a%20legal,protected%20works%20in%20certain%20circumstances.)."


Geetee52

Shouldn’t there be some sort of designation like a disclaimer or unique icon or something that discloses when content is AI generated?


CremousDelight

Should be easy enough to just do things this way, right?


Chuckleyan

When I was still a practicing attorney I was involved in a couple of lawsuits wherein some dipwad thought that just because something had been posted online that copyright had been waived, and they were reproducing it for their own profit. There was no real defense both times and it was strictly about the tally of the damages. Social contract my butt.


AlfaLaw

Same. Especially at the beginning of memedom, marketing agencies just thought it would be cool to have their brands post all kinds of copyright infringement in their social media.


mdog73

Did they lose cases just because they viewed a web page? I think that’s fine as long as they aren’t directly using the content. It’s just for learning purposes like a human would.


AlfaLaw

You are correct.


yakofalltrades

Never trust a capitalist further than you can throw the building they work in.


PhotographyBanzai

Microsoft guy trying to redefine law with his logic. 🤣👏👏 If their LLMs were completely open source then I'd kinda think he is all about the betterment of humanity or whatever he is trying to get at, but nah.


Windowplanecrash

Then the AI programs should simply be appropriated, given to all the people to use as they please. It’s only fair if it works both ways


ExoticWeapon

That is how the open web has worked since its inception generally speaking.


actual1

Movies, music, books, stories and memoirs…..open web free.


Swaggy669

So he would have no problem if I helped myself to his purchased groceries after he finished shopping. Because we are in public.


Sedu

"Copyright is for corporations, not for people." Increasingly there is this idea that human beings create ideas which are free for all to use, but corporations own whatever they touch. It is beyond vile.


TheSecondTraitor

I'm ok with it too. If I can read about a topic from reddit or learn to draw from deviantart, why shouldn't a neural network of some megacorporation?


[deleted]

Their software like Windows and Office is on the internet. I'll just take his word then. Folks, don't buy Microsoft software, just do what their AI boss does.


swibirun

Anyone remember the Cook's Source infringement hullabaloo (aka But honestly Monica)? This sounds familiar: *But honestly Monica, the web is considered 'public domain' and you should be happy we just didn't 'lift' your whole article and put someone else's name on it!*


NogginToggin

Whelp, it's official. MS supports piracy! Yo ho, yo ho, Adobe suite for ye~


90ssudoartest

I foresee the internet regressing back to 1.0 days of bland text on a wall.


AHardCockToSuck

What is the difference between a human learning from content they see online and AI? Just treat the final product the same, does it break copyright?


TheBlackestIrelia

There is no one who works in any of these AI models that actually cares about the intellectual property of anyone besides themselves. All the models are made on stealing shit lol


momolamomo

Once you build a house, someone else can copy the plans and build the same house halfway around the world and you wouldn’t know. That’s his point. When a design is out there, the only thing that “protects” against its use is annoying reactive bureaucracy. Which only works a small amount of time and only when you the owner have been tipped off that someone’s using your intellectual property. Once a design is out there in the web, it’s liable to be reproduced


100GbE

\_UGH 5 DAYS IN AND IT'S STILL BEING SPAMMED MY HEAD HURTS\_ LET ME EVEN GUESS THE FIRST COMMENT: **OH SO I CAN JUST PIRATE SOFTWARE.** VERY ORIGINAL - VERY CLEVER - SEE YOU TOMORROW MORNING. FUCK MY CAPS LOCK BECAUSE FUCK SHORT TERM MEMORY! :D


NFTArtist

As someone who's had my art literally stolen online, there's nothing more annoying than people saying if it's online you're free to download, modify and reuse it. Apparently we are now living in China where everything is fair game to be counterfeited. I'll just download some pictures from Disney's website, print them as posters and sell them on eBay. Or grab art from Nintendo and make my own version of Super Mario, according to redditors that's totally not theft or IP infringement.


nuke-from-orbit

He is wrong, of course, but so is everyone saying training AI on copyrighted material is violating the rights of copyright holders. Copyright is violated when copyrighted material is published by someone else, not when it's processed. Google indexing the full-text web is not a breach of copyright as long as they only publish fair use amounts of each text. Gen AI is only violating copyright if a user is using it to reproduce and publish text and images.


1eho101pma

By your logic "Copyright isn't violated if you only process it", any AI company should be able to scrape reddit for data at any time with no repercussions. I wonder why Reddit can be paid for training AI but individuals who publish online are fair game.


nuke-from-orbit

Reddit has a TOS which forbids scraping. AI companies scraping should respect robots.txt which is a machine-readable file which has been around for 25+ years for websites to tell scrapers what content is fair game. Edit: [30 years](https://en.m.wikipedia.org/wiki/Robots.txt)
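For anyone curious what "respecting robots.txt" looks like from the scraper's side, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleAIBot` agent name, and the URL are made up for illustration; a real crawler would fetch the site's actual file first. Nothing enforces the answer, the check only tells a well-behaved bot what the site has asked for.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named bot is asked to stay out,
# everyone else is allowed everywhere.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/thread/1"
    for agent in ("ExampleAIBot", "SomeOtherBot"):
        print(agent, "allowed:", is_allowed(EXAMPLE_ROBOTS_TXT, agent, url))
```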


Snafuregulator

So pirating their software is cool then? Because that's the reasoning I am hearing


RelativetoZero

You can just download a .iso from Microsoft. You need a MSLive account to manage it fully. That's how MS keeps track of things now as far as I can tell.


Dapper_Energy777

Nah, you get the iso then get KMS and avoid their bullshit


Zobe4President

I think the issue is that the Internet has quite literally EVERYTHING on it in some crevice or another, so anything the AI does "Create" will undoubtedly resemble something off the internet, so it will be difficult to gauge when the AI is cloning and when it is amalgamating from its training data what it believes represents the command it was given. Even then the training data is from the internet so that line of thinking becomes a negative feedback loop.


xiikjuy

life is easier once you let yourself cross the red line of ethics lol


Beer-Milkshakes

I've got to agree on principle. Because I've been "stealing" content found on the web for 2 decades. Copyright and IP laws really don't matter when you have a VPN and everything has a crack file.


Centralredditfan

Idiots like this will cause everything to be hidden behind login portals.


Arcticmarine

Which is how things should be protected. If it's on the public internet, it is available to be consumed by the public. All they are doing is consuming things. If you wanted to crawl the internet to find out what the most common letters used in the English language were, should that be allowed? I would argue absolutely, and AI companies aren't doing anything different. If you want to protect something, don't put it online or put it behind a login and make it private, that's all there is to it.
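The letter-counting example is easy to make concrete. A minimal sketch, assuming the page text has already been fetched and is passed in as plain strings (the crawling step is left out, and the sample pages are invented): only aggregate counts come out, none of the original text is kept.

```python
from collections import Counter


def letter_frequencies(pages):
    """Count how often each letter a-z appears across an iterable of page texts."""
    counts = Counter()
    for text in pages:
        # Lowercase first, then keep only plain ASCII letters.
        counts.update(ch for ch in text.lower() if "a" <= ch <= "z")
    return counts


if __name__ == "__main__":
    sample_pages = ["Hello", "world"]  # stand-ins for crawled page text
    print(letter_frequencies(sample_pages).most_common(3))
```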


TheRatingsAgency

Sweet. Use his logic to nab a bunch of Microsoft stuff then that’s on the “open web”.


mpbh

This was always reddit users' stance too... Don't tell me we're changing tunes