#(IMAGINE. hottest legal battle of 2025 is google vs. disney litigating whether ai generated fanfiction is fair use……….)
gender-trash · 2 years
incredibly funny how a bunch of people interpreted “ao3 was almost certainly scraped as part of the gpt training dataset because it’s a big easily accessible body of english language text, so you can prompt gpt with surprisingly vague stuff and it will autocomplete with snarry underage or wangxian a/b/o” as “elon musk Personally is Currently scraping ao3 and training an ai to plagiarize fic, going to go lock ALL my works on ao3 IMMEDIATELY”
its. its already in the dataset. how do you think these things work. “locking my works to registered users only until after the scraping stops!” my dude the ao3 team just needs to like add a robots.txt and check the useragent and stuff to prevent this from happening in the future*, and theyre already on it, but not only is the existing body of work presumably In the Dataset, the model has ALREADY BEEN TRAINED. that omelet isnt going to get unscrambled
(*im assuming that everyone gathering datasets for large language models is being reasonably Polite about it bc these are both very simple to circumvent — if this assumption is false then ao3 might need to graduate to Offensive Measures but also we would definitely need to bully the culprits off of hacker news)
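(for the curious, a rough sketch of what "add a robots.txt and check the useragent" amounts to, and why both only stop polite crawlers. this is an illustration under made-up assumptions, not what ao3 actually runs: the bot name `ExampleLLMBot`, the robots.txt contents, the example.org URL, and both helper functions are hypothetical. the only real API used is `urllib.robotparser` from the python standard library.)

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: tells one named crawler to stay out entirely.
ROBOTS_TXT = """\
User-agent: ExampleLLMBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

# Hypothetical server-side blocklist of User-Agent substrings.
BLOCKED_AGENT_SUBSTRINGS = ("examplellmbot",)


def polite_crawler_may_fetch(user_agent: str, url: str) -> bool:
    """What a well-behaved crawler does: ask robots.txt before fetching."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


def server_allows(user_agent_header: str) -> bool:
    """What the site can do: refuse requests whose User-Agent is blocklisted."""
    header = user_agent_header.lower()
    return not any(blocked in header for blocked in BLOCKED_AGENT_SUBSTRINGS)


if __name__ == "__main__":
    url = "https://example.org/works/123"
    # A crawler that identifies itself honestly is turned away by both checks.
    print(polite_crawler_may_fetch("ExampleLLMBot/1.0", url))
    print(server_allows("ExampleLLMBot/1.0"))
    # A crawler that claims to be a browser passes both checks.
    print(polite_crawler_may_fetch("Mozilla/5.0", url))
    print(server_allows("Mozilla/5.0 (X11; Linux x86_64)"))
```

(both checks rely entirely on the crawler telling the truth about who it is, which is the "very simple to circumvent" part.)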
anyway im not taking any Stance one way or the other on the “ai art debate” (other than maybe “none of you know what the hell you’re talking about”) but we’re definitely going to see a whole new world of copyright claims against the big art models and ml researchers developing new tools for “removing” stuff from a trained model, and i for one think that it will be SO entertaining to watch