#AI data scraping
Text
So I'm hearing UESP is getting scraped by AI companies.
I spent the better part of today and yesterday downloading as many images as I could from there in case it's DDoS'd or otherwise taken out of action.
82 notes
Text
Because of AI data scraping
Hemlock, Nightshade, Queen Anne’s Lace and Dancing on the Clock will no longer be available to non-registered archive users. I hate that I have to do this, but I want to protect my works from further exploitation and blatant copyright violation for the purposes of generative AI.
3 notes
Text
Block This AI-Tool Account On Fanfiction Dot Net ASAP
If you are still using FFnet like I am, block this AI-tool account that pretends to be a fic writer and randomly leaves reviews in a very ominous way that bothers me. I got a barrage of emails informing me of sudden reviews and follows. Each review is a copypasta, and I felt like my fics were being branded. SO BLOCK IT IS!



Log in, choose the account option, find the block users option like in the screengrab below, then add this FFId: 16123984 into the slot and select save.

I honestly don't know if this app can data scrape or not. Since we don't have an option to lock or private our FFnet accounts, keep blocking anything that you think is suspicious.
Here is the Link to the account. They have one AI-Generated so-called Naruto Fic, which literally has nothing to do with Naruto.

Account ID: 16123984
Account Link:
Share, reblog, amplify!
#ffnet#fanfiction dot net#fanfiction dot hell#AI data scraping#ai tools#ai is theft#data scraping#ai is a plague#fic writing#fic writers#writing for the web#writers on tumblr#writing community#writers#fic writer problems#ao3 writer#ao3 author#fanfiction#fanfic writing#fanfic woes#artificial intelligence
10 notes
Text
What should you ask a ProcureTech solution provider before you agree to see their demo?
#Agent-Based (AI) Model#Agentic AI#AI data scraping#generative AI#ProcureTech demos#strand commonality
0 notes
Note
If you want a fic to be finished, then perhaps leave a comment for the writer?
oh I wasn’t aware it was feeding the ai. I’ve inserted hundreds of fics into chatgpt for their continuation or for a different plot within the same context just for fun and out of curiosity… but I’ve never posted any of them…
Indeed, anything that is given to an AI can be used by it later to draw from. That's why it doesn't matter whether you post them or not: it now has access to those writers' texts without their permission.
~Mod L
#fanfiction#fanfiction writers#fandom writers#WIP#WIPs#work in progress#Ai#ai data scraping#ai generate fiction#not cool bro#what a mean thing to do.#ao3#AO3#Archive of Our Own#archiveofourown.org#kudos#comments#commenting
36K notes
Note
will you be locking your stories so that only registered users can access them?
Oh, because of the recent news of AI developers/companies data scraping from AO3?
Don't worry, I will be keeping my stories public. I thought about it, but at this point it's most likely too late to do anything about it, especially for older stories when the data scraping has already happened. They've got what they wanted from the site to train their AI with.
From what I've heard, AO3 only caught on after the deed was done, and they've now implemented measures against it (or at least against the common form of it). It might prevent future data scraping of the more recent works, but it won't ever be enough to truly stop it from happening again.
So I get why they're recommending AO3 users to lock their work for registered users only (and/or select the option to hide their work from search engines). It will help to make it a little harder for other AI scraper bots. However, I wouldn't be surprised if a creative programmer or whatnot codes a scraper to somehow bypass such measures, maybe even make an account to get access that way.
It's not impossible. Look at Tumblr, we got bots that can make accounts, send messages and interact with posts. All because behind their creation was a determined person/s who made it happen.
In the end, most writers will have to decide what's best for them and if it's worth limiting their audience or not.
0 notes
Text
And while we're talking about ai theft: turn. off. grammarly. Disable it. Delete it. Get that shit off of your computer ASAP.
I never realized how much of my shit is scanned by grammarly until today. It scans my emails, my text posts on this bewitched platform, my wips on google docs, my youtube comments--literally everything ive ever typed on my laptop is scanned by grammarly. And I've been allowing this to happen for years.
Turn. Off. Grammarly.
#blue rambles#ai theft#ai#grammarly#data scraping#i dont like how many suggestions are made when im editing my writing anyway#theyre distracting+irrelevant half the time#and the only reason i even have grammarly is bc of uni
12K notes
Note
How are you liking what's happening with ao3 and the AI? Does it discourage you in any way from publishing your stories?
Great question. I haven't archive locked my stories and don't plan to. That's a personal decision I've made for myself and my own content, and that doesn't mean I don't wholeheartedly support my fellow authors who do so. But I'm of the (again personal) opinion that my works already have been scraped, and will continue to be scraped in some capacity. As have all of my text posts on here.
I appreciate the work the OTW is doing to take down data on other sites where it has been scraped. I think that's absolutely the right course of action. But personally, I am under no illusions that by archive-locking my fics, I am 100% preventing the scraping/sharing/AI use of my content. And at this point, even when we first learned of that big "scrape" a while back, it was too late.
My goal is to make my content as widely available for readers as possible, which comes with drawbacks. Archive-locking fics came with a significant reduction in hits/comments/kudos for some authors, and I decided that was a risk I personally did not want to take. Especially when, again, I was of the belief that many of my fics had already been scraped/were vulnerable to being scraped before we learned about these mass-scraping incidents.
Additionally, I'm quite certain people have been feeding my fics into AI processors, ChatGPT, etc, for a while now. It's not something I have control over, and people will continue to do it even when they know it's wrong. Even with ao3 accounts.
I don't own my fanfiction content, I can't make money off of it, and I don't want to. This would be a very different conversation if I did. Truthfully, my only hope is that by continuing to write a/b/o, and large amounts of it, I can "spike" whatever dataset is using my fics. That thought brings me joy, even if it's a little silly and far-fetched with these better algorithms.
#asks#anon#ao3#archive of our own#myfic#theresurrectionist#writing#data scraping#OTW#AI generators#chat gpt
190 notes
Text
Agreeing with the above, but also very curious how that's going to play out with companies/organisations that use google drive as their main storage/shared work place... (also how that would then affect cyber security and data protection regulations in varying countries)
Like one of my old jobs had everything from funding documents and draft articles to workflows on google drive, and a non-zero amount of those were on google doc or sheets (despite my desperate pleading due to hating how it messed up with Microsoft Word and Excel formatting)...
Hey, I usually don't make serious posts, but this one is a big deal to me and many other small creatives, so please read or spread the message, especially in light of the WGA and SAG-AFTRA strikes.
Google Labs is introducing a new ai that will just scrape your documents for ideas, plots, and other things to feed its algorithms. This information is stored SEPARATELY from your Google account.
Google will be collecting this user data and will have at its disposal free scripts for any content it wants to produce, without giving a cent to the original authors of the scripts, novels, character organizers, fics, etc. that the ai would be heavily basing these generated stories on. But that's not all: these ai's are also showing these works to human reviewers in order to "help improve quality of products", sharing works with real, actual people without the author's consent to see if the ai is responding well enough to the prompts and input (aka the entire documents, as this is a fully integrated ai -_-)
The reason this is going to be so terrible is that not if but when Google decides to open a production company (youtube movies will be first but they likely will open another streaming platform), the scripts their ai writes for them are probably going to be almost wholly plagiarized from docs user content that was stolen, without paying an actual writer anything at all.
If you are at all creative in the slightest PLEASE get your work off of docs before the features are fully integrated, scrub your google drive, and move to another program. Microsoft, Libre (free and open source), ANYTHING to protect your works, and if possible move away from Google products generally. Use a different search engine even. I'm going to link two TikTok creators from whom I found this information, as well as a list of alternative products to use away from Google in hopes of avoiding them. Make them feel the backlash. It doesn't matter if they already have your data; take charge of your internet person and DON'T GIVE IN
Ecosia (browser)
DuckDuckGo (browser)
Microsoft Products (Not amazing but not drifting yet, I like the free version enough)
Libre Office (free & open source!!)
Waze (maps)
Please add on if you can any information or resources! Stay vigilant and stay informed
3K notes
Text
EMERGENCY AUTHOR UPDATE
I feel like this needs to be warned about. Everything on Ao3 that isn't set to private HAS been data scraped and fed to 3 data sites that provide data for AI training, including writing and artwork.
Yes, this includes my entire Ennead series and everything else I've ever written and posted. As well as anything you all have written but not made private.
Ao3's legal team is fighting it and one site has made the data unavailable, but the other two aren't based in the USA so the fight is harder.
This is frustrating and upsetting news, especially for those of us who now need to pick between our Guest readers, who have supported us for a long time, and protecting the hard work that we've put our hearts and souls into. I just ask that we support each other and our choices during this time.
The link here has more details but from now on, until I can be sure there's a way to protect my work, which I've spent decades writing and planning, my stories will be posted for members of the site only.
#fanfiction#creative writing#writer#author#authors#writing woes#ao3#ai scrapers#data scraping#ai#artificial intelligence#technology#ai model#fandom#writerscommunity#writers on tumblr#writing#ao3 writer
180 notes
Text
I don’t have a posted DNI for a few reasons but in this case I’ll be crystal clear:
I do not want people who use AI in their whump writing (generating scenarios, generating story text, etc.) to follow me or interact with my posts. I also do not consent to any of my writing, posts, or reblogs being used as inputs or data for AI.
#not whump#whump community#ai writing#beans speaks#blog stuff#:/ stop using generative text machines that scrape data from writers to ‘make your dream scenarios’#go download some LANDSAT data and develop an AI to determine land use. use LiDAR to determine tree crown health by near infrared values.#thats a good use of AI (algorithms) that I know and respect.#using plagiarized predictive text machines is in poor taste and also damaging to the environment. be better.
279 notes
Text
Hey so just saw this on Twitter and figured there are some people who would like to know: @infinitytraincrew is apparently getting deleted tonight, so if you wanna archive it, do it now
#infinity train#third-party sharing#owen dennis#anti ai#tumblr staff making stupid decisions again#cryptid says stuff#don't just glaze it actively nightshade it#ai scraping#data privacy
421 notes
Text
I really think a very simple self-applied user CAPTCHA category field for posting works would ease the way while policy can be discussed.
I personally wouldn't mind checking off another mandatory box alongside ratings and warnings and what have you that said my fan work was "created without the aid of A.I. machine scraping/production."
I would definitely appreciate being able to filter out any A.I. machine-created works just like I do for tropes, characters, ratings, pairings, fandoms, or anything else I dislike.
AI and Data Scraping on the Archive
An update regarding the OTW’s position on AI and scraping on AO3: https://otw.news/04ba07
#ao3#AI data scraping#otw#allow users a filterable captcha-type category for machine-created fanworks
4K notes
Text

Massive PSA to ALL artists who may be on Twitter, please protect yourself. I just wiped all my art off that shit site, so unfortunately a lot of NSFW art that could be viewed on there will not be available. Some is available on Slasher app, and a bit on Pixiv. I will try to post old lewd art on pixiv and any relevant lewds on Slasher app. I'm really thinking of looking into just getting my own website to display all works uncensored depending on the cost.
To give more context on what the XAi program will be doing here is a link to an article with interview answers from the Mustyness himself:
A quote from said article:
'He also claimed that xAI’s use of Twitter data would not be much different from what many are already using the platform for, adding that it would primarily be used for “text training” and “image and video training.”
“I guess we will use the public tweets — obviously not anything private* — for training as well, just like basically everyone else has,” Musk said.'
He will be using data from private accounts; what Elon means by "private" is likely data such as birthdates and addresses.
I added ALT descriptions now to the image, I didn't have time before (I'm working on an overly detailed art piece right now), I hope that helps all with screen readers!
2K notes
Text
If anyone's on the (super uncool but sometimes necessary in order to get a job) website LinkedIn, they have made AI data collecting opt-OUT.
So settings and privacy → data privacy → Data for Generative AI Improvement → Off
While you're there, dedicate a good 10 minutes to going through the rest of the settings. THERE ARE SO MANY.
And they're all turned on.
103 notes
Text
i feel so terrible doing this but i had to. i locked all my ao3 works under archive only because i learned about the extent of the data scraping that happened. as much as i love writing and i love sharing it with as many people as possible, i cannot allow the works which i have poured my heart, my soul, and most importantly, my time into to be scraped up and fed heartlessly to an ai bot. so for now, they're only visible to archive users.
#livs rambles#stardew fanfic#ao3 fanfic#fan fic writing#fan fiction#fanfic#ao3 data scrape#ao3 ai scraping#ao3#ao3 writer
40 notes