Google says public data is fair game for training its AIs

we're just being honest, says web giant

Google has updated its privacy policy to confirm it scrapes public data from the internet to train its AI models and services – including its chatbot Bard and its cloud-hosted pro

the fine print

"Google uses information to improve our services and to develop new products, features and technologies that benefit our users and the public"

theregister.com/2023/07/06/goo

Some folks are unhappy not only that their own content is being used to build machine learning systems that replicate their work, thus potentially endangering their livelihoods, but also that the output of the models flies too close to copyright or license infringement by, for instance, regurgitating this training data unaltered.

As you should know, just because something's on the internet doesn't mean you can automatically use it for whatever purpose you feel like: terms and conditions may apply.


imo if things are on the open web then it's fair game

if folk like content providers don't want their stuff being used, there are ways to keep it off the open web, and then AI wouldn't be able to use it

they can use paywall services like Patreon, Substack, Rumble, Medium etc., to name a few. all subscription-based platforms can't be used for AI learning like open platforms can, cause they have good terms & conditions about that

open web stuff is a free for all

@ecksmc When I worked in IT, we just blocked scrapers and left normal traffic alone, and did nothing to impact our real users. There are/were many tools available to throttle abuse of a system that don't involve bothering users.

