Some folks are unhappy not only that their own content is being used to build machine learning systems that replicate their work, thus potentially endangering their livelihoods, but also that the output of the models flies too close to copyright or license infringement by, for instance, regurgitating this training data unaltered.
As you should know, just because something's on the internet doesn't mean you can automatically use it for whatever purpose you feel like: terms and conditions may apply.
@ecksmc When I worked in IT, we just blocked scrapers and left normal traffic alone, and did nothing to impact our real users. There are/were many tools available to throttle abuse of a system that don't involve bothering users.
imo if things are on the open web then it's fair game
if folk like content providers don't want their stuff being used, there are ways to keep it off the open web, and then AI wouldn't be able to use it
they can use paywall services like Patreon, Substack, Rumble, Medium, to name a few. subscription-based platforms can't be used for AI learning the way open platforms can, cause they have good terms & conditions about that
open web stuff is a free for all