#ao3 scrape
This tool is optional. No one is required to use it, but it's here if you want to know which of your AO3 fics were scraped. Locked works were not 100% protected from this scrape. Currently, I don't know of any next steps you should be taking, so this is all informational.
Most people should use this link to check if they were included in the March 2025 AO3 scrape. This will show up to 2,000 scraped works for most usernames.
Or you can use this version, which is slower but does a better job if your username is a common word. This version also lets you look up works by work ID number, which is useful if you're looking for an orphaned or anonymous fic.
If you have more than 2,000 published works, first off, I am jealous of your motivation to write that much. But second, that won't display right on the public version of the tools. You can send me an ask (preferred) or DM (if you need to) to have me do a custom search for you if you have more than 2,000 total works under 1 username. If you send an ask off-anon asking me to search a name, I'll assume you want a private answer.
In case this post breaches containment: this is a tool that only has access to the work IDs, titles, author names, chapter counts, and hit counts of the scraped fics for this most recent scrape by nyuuzyou discovered in April 2025. There is no other work data in this tool. This never had the content of your works loaded to it, only info to help you check if your works were scraped. If you need additional metadata, I can search my offline copy for you if you share a work ID number and tell me what data you're looking for. I will never search the full work text for anyone, but I can check things like word counts and tags.
Please come yell if the tool stops working, and I'll fix as fast as I can. It's slow as hell, but it does load eventually. Give it up to 10 minutes, and if it seems down after that, please alert me via ask! Anons are on if you're shy. The link at the top is faster and handles most users well.
On mobile, enable screen rotation and turn your phone sideways. It's a litttttle easier to use like that. It works better if you can use desktop.
Some FAQs below the cut:
First off! If you're seeing an old version of this post, you may not have seen that we now have our first tool to poison AO3 fics! This is still experimental, and it's likely we'll find issues with it as people start using it! But if you want something like Glaze and Nightshade but for fic, this is what we have right now. Before you decide to use it, please read all the info you can--most importantly, using the poison in its current state makes your fic inaccessible to certain users. All the TTS tools I've tried work with this as long as your readers know to save the fic in a certain way! But people who need to download an offline copy to adjust the colors and can't do that with an AO3 site skin will NOT be able to download your work with the current version of the poison. For downloading EPUBs, it preliminarily looks like Calibre can support "unpoisoning" the fic so it's readable again.
"What do I need to do now?": At this time, the main place where this dataset was shared is disabled. As far as I'm aware, you don't need to do anything, but I'll update if I hear otherwise. If you're worried about getting scraped again, locking your fics to users only is NOT a guarantee, but it's a little extra protection. There are methods that can protect you more, but those will come at a cost of hiding your works from more potential readers as well.
"I know AO3 will be scraped again, and I'm willing to put a silly amount of effort into making my fics unusable for AI!": Excellent, stick around here. I'm currently trying to keep up with anyone working on solutions to poison our AO3 fics, and I will be reblogging information about doing this as I come across it.
"I want my fics to be unusable for AI, but I wanna be lazy about it.": You're so real for that, bestie. It may take a while, but I'm on the lookout for data poisoning methods that require less effort, and I will boost posts regarding that once I find anything reputable.
"I don't want to know!": This tool is 100% optional. If you don't want to know, simply don't click the link. You are totally welcome to block me if it makes you feel more comfortable.
"Can I see the exact content they scraped?": Nope, not through me. I don't have the time to vet every single person to make sure they are who they say they are, and I don't want to risk giving a scraped copy of your fic to anyone else. If you really want to see this, you can find the info out there still and look it up yourself, but I can't be the one to do it for you.
"Are locked fics safe?": Not safe, but so far, it appears that locked fics were scraped less often than public fics. The only fics I haven't seen scraped as of right now are fics in unrevealed collections, which even logged-in users can't view without permission from the owner.
"My work wasn't a fic. It was an image/video/podfic.": You're safe! All the scrape got was stuff like the tags you used and your title and author name. The work content itself is a blank gap based on the samples I've checked.
"It's slow.": Unfortunately, a 13 million row data dashboard is going to be on the slow side. I think I've done everything I can to speed it up, but it may still take up to 10 minutes to load if you use the second link. It's faster if you can use desktop or the first link, but it should work on your phone too.
"My fic isn't there.": The cut-off date is around February 15th, 2025 for oneshots, but chapters posted up to March 21st, 2025 have been found in the data so far. I had to remove a few works from the dataset because the data was all skrungly and breaking my tool. (The few fics I removed were NOT in English.) Otherwise, from what I can tell so far, the scraper's code just... wasn't very good, so most likely, your fic was missed by random chance.
Thanks to everyone who helped with the cost to host the tool! I appreciate you so so so much. As of this edit, I've received more donations than what I paid to make this tool so you do NOT need to keep sending money. (But I super appreciate everyone who did help fund this! I just wanna make sure we all know it's all paid for now, so if you send any more that's just going to my savings to fix the electrical problems with my house. I don't have any more costs to support for this project right now.)
(Made some edits to the post on 27-May-2025 to update information!)
6K notes
I don't want my works to be on sketchy AI training websites, but I also don't want sketchy AI training websites to have my real human name and contact information, which is generally required for DMCA notices. You see my problem.
791 notes
PSA: Your AO3 work might have been scraped for a GenAI dataset
(PLEASE NOTE: At this time, access to the dataset has been disabled. You can file a DMCA takedown request if your work is included in the scraped data to campaign for full deletion, but at this time it is considered unnecessary.)
This is intended as a general notification for all the writers following this blog who post on AO3.
Approximately 12.6 million public works were scraped from AO3 and several other published-work sites, including PaperDemon, which has regular updates on the situation as well as resources. Here is the link to their updates.
Affected AO3 users are those whose work IDs were public-facing (that is to say, not locked to registered AO3 users only), and whose work IDs are between 1 and 63,200,000.
Your work ID is the number in your work's site URL, like so:

This work has an ID only in the 61-million range, and so was scraped for the dataset.
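If you'd rather check the number than eyeball it, here is a minimal sketch in Python. It assumes the usual AO3 URL shape of `archiveofourown.org/works/<number>`, and the example URL and ID are made up for illustration. The function names are hypothetical, and the upper bound is simply the figure quoted above. Being inside the range does not prove a work was scraped (locked or randomly missed works may be absent), so treat it as a first filter only.

```python
import re
from typing import Optional

# Upper bound quoted in the post above; treat it as approximate.
MAX_SCRAPED_WORK_ID = 63_200_000


def extract_work_id(url: str) -> Optional[int]:
    """Return the numeric work ID from an AO3 work URL, or None if there is none."""
    match = re.search(r"/works/(\d+)", url)
    return int(match.group(1)) if match else None


def in_scraped_range(work_id: int) -> bool:
    """True if the ID falls inside the ID range the post says was scraped."""
    return 1 <= work_id <= MAX_SCRAPED_WORK_ID


if __name__ == "__main__":
    # Made-up example URL, for illustration only.
    example_url = "https://archiveofourown.org/works/61234567/chapters/1"
    work_id = extract_work_id(example_url)
    if work_id is not None:
        print(work_id, in_scraped_range(work_id))
```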
If you wish to lock your works to be visible to logged-in users only, you can do so through Works > Edit Works > All > Edit > Only show to registered users.
#mod ziva#ao3#genai#ao3 scrape#as of time of posting the user has uploaded to at least 2 other dataset sites as well as huggingface
312 notes
Locked fics are not safe
Thank you to the lovely people that are helping others go through the dataset and figure out if their fics were scraped or not. However, locking our fics now or before hasn't helped;
locked fics were also scraped.
I haven't locked any of my own fics but I have seen others who either had a lot of locked fics or had entirely locked fics on the archive come forward and mention that the dataset also included them.
This means that the scraper had an account.
I'm so sorry for everyone involved (including myself) but at this point locking our fics does nothing if we know they also had an account. It only means that our fics are reaching a smaller audience than before.
To my own audience, my works will still be available to read with or without an account, I'm not going to let them scare me into hiding.
198 notes
Updated Pin Post - IF YOU'RE HERE ABOUT THE AO3 SCRAPE
Please move over to ao3scrapesearch! (I wanted to have a blog where I could bring in other admins to help out as needed, and my fandom blog ain't it anymore.)
What's Next
You can check out this guide to submitting a DMCA claim on this guy. The main download source of the dataset has been permanently disabled! You shouldn't need to submit DMCA claims as of now, but please be aware that there are other sources out there to download this dataset.
I also temporarily made this blog viewable only in the Tumblr dash just to minimize people from 4chan coming in here. It'll go back to normal eventually.
(You guys are welcome to follow this blog if you want, but I am normally a Klance blog, so you're gonna get the daily Klance spam on your dash from me if you do!)
115 notes
whoever scraped ao3 i hope u fall into a pile of legos
59 notes
50 of my 52 stories on AO3 were scraped without my permission to be used as a training set for AI.
50
of 52.
Of those were original works, one of which was a character study for my book characters that a follower wanted to read.
I've filed my takedown request, but there is a chance it may not be removed. If it is not removed, I will retire from writing indefinitely, as I do not want this to become the norm and have my writing be fed to AI to create endless slop.
82 notes
Soooo, apparently, someone by the name of "nyuuzyou" recently scraped tons of fanfics off of AO3 and posted them to various generative ai database sites. It seems like fics which were only viewable by people with profiles were safe, so a bunch of people are locking their accounts. I don't know if I'll be locking mine yet, but if you've been using AO3 as a guest, you may want to sign up soon. If you do have an account, and your fics were victims of the scrape, you might be able to file a copyright claim. This reddit post seems to have most of the information, including what works were scraped and where the data was posted.
Or, if you don't want to read all that, here's the user's profile on Hugging Face (one of the ai database websites)
#i don't even have reddit#but this post had the most information i could find#i'm so glad someone who doesn't care about my stories fed them to a plagarism machine so that it can make worse versions of my stories#literally so mad rn#whyyy#anti ai#anti generative ai#anti genai#ai scraping#ao3#archive of our own#ao3 scrape#fanfic authors#fanfic#fanfiction#fanfic writing#ao3 news
80 notes
in the wake of ao3 getting data scraped and the new weird purist mindset in fandom that’s making people harass creators, right now is a great time to remember;
- readers, make an ao3 account to look at fics. it’s free, and it encourages artists to privatize their work to protect it without fear of losing their follower base.
- comment! interact! if you liked something specific in the fic don’t be shy to tell the author, building and strengthening fandom community will help the authors and other creators.
- authors, lock your fics, I know it’s scary and it’s ultimately up to you what you think is right but locking fics discourages data scrapers, and if they do get the gall to make an account to steal fics it helps ao3 track data scrapers more efficiently.
- fandom, do not be afraid, i know it’s hard right now but ao3 is an amazing resource as an archive as well as an accessible library of amazing fan works and others. the community poured time effort and money into ao3, it’d be a shame if it had to be shut down over something as petty as this, working together is how we will make ao3 a safer and stronger place for creatives.
- lastly dear authors, i know it’s the worst when your hard work is scraped and fed into something as soulless as an ai. two years ago my whole profile on instagram was scraped for an ai by one of my close online friends, that is the main reason why i completely transferred to tumblr. it hurts but do not give up, you have amazing creative souls, and despite everyone calling fanfiction cringe, authors like you breathe life into fandom and franchises altogether.
it is all up to you if you want to keep creating, but just know that us readers will miss you and we understand if it’s too much.
#ao3 scrape#defend ao3 authors#protect your authors#and protect fandom spaces#protect ao3#anyway i’ll get off my high horse
49 notes
A podcast I listen to just posed the question of "will the Ao3 fic scrape result in people accidentally generating work emails full of smut due to the nature of Ao3 content?" and honestly I think that would be both incredibly hilarious and completely deserved.
47 notes
Note
Hiya! I know it's been a little while but I just wanted to let you know I finally got around to making the web version of that fic poisoning tool I made about a month ago. It's at https://tricksofloki.github.io/ficpoison.html if you're interested :)
OHOHOHO!
Alright, I gave this a little test on my own fic over here. Quick little review/notes for anyone interested! (But the tl;dr is that I approve based on my initial review of the original code and based on using this web tool to automate running the code.)
This version is super easy to use. I'll be honest; I was struggling trying to figure out how to run the code locally before because that is not a coding language I personally use, and this website takes out the hard part of doing that. You need to do the one-time task of creating a work skin to enable the "poison" CSS used, and you need to make sure that work skin is enabled for any work you're going to use this on. The code to put into your work skin is available at the link. If you already have a work skin you use, you can just add this class to it. (I think the tutorial I linked to does a good job walking you through how, but I'm open to doing a tutorial on this blog if anyone wants that.)
If you're poisoning an existing fic, first have a backup copy. Once you poison it, that copy is going to be annoying to UN-poison if you ever want to, so you should keep a private copy on your PC or phone or wherever so you have the unpoisoned version available. Once you do this, your copy on AO3 is poisoned, and it would take a fair amount of effort to unpoison as the author. Upside: as the author, you can see all the CSS stuff in the background, so if you really need to unpoison a copy as the author with full access to it, it's not impossible. Just really annoying.
For reference, here's what I can see as the author with access to the edit page:
I can clearly see where the poison is if I really wanted to go back through and unpoison.
And here is what I can see in a copy scraped with nyuuzyou's code:
You can definitely see it's messed up by looking, but you don't see an active callout to where exactly the poison code is. Keep in mind that not every scraper uses the same code as nyuuzyou, and more sophisticated code may pull something more sophisticated than the plain text from nyuuzyou's tool. Other scrapers may be pulling fics with the formatting and everything, and I don't know exactly what that output looks like. Depending on what their output is, if they can see the class for the poison, they can pretty easily code something to remove it. That's me being overly conservative, I suspect. I haven't heard of any scrapers who have bothered with anything more than plain text, and this isn't an issue unless they're grabbing the full HTML. (Translation: From what I know, this is NOT an issue. Yet. So this is not a weakness of the poison tool. Yet.)
Based on the output, anyone who's doing a half decent job of cleaning up the data they scrape would toss my fic out of the dataset. It's full of what look like typos because the poison got placed mid-word, so it looks like I just suck at writing. If your goal is to get tossed out of the dataset, this is perfect. If a scraper isn't paying attention at all, you can contribute some really terrible training data if they leave your fic in the set because your poisoned fic is going to be full of words that don't even exist thanks to the word placement.
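To make the plain-text behavior concrete, here is a minimal sketch of what a text-only scraper roughly does. It uses Python's standard `html.parser` and is not nyuuzyou's actual code. The class name `hidden-filler` and the sample sentence are hypothetical stand-ins, not the real class or output of the poison tool. The point is only that a scraper which keeps text and throws away tags also throws away the information that some of that text was hidden from readers.

```python
from html.parser import HTMLParser


class PlainTextExtractor(HTMLParser):
    """Collects text nodes and ignores tags and classes, like a plain-text scraper."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Every text node is kept, whether or not CSS would hide it from readers.
        self.chunks.append(data)


def to_plain_text(html: str) -> str:
    parser = PlainTextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


# "hidden-filler" is a hypothetical class name, not the tool's real one.
sample = '<p>The cas<span class="hidden-filler">window</span>tle was quiet.</p>'
print(to_plain_text(sample))  # The caswindowtle was quiet.
```

A reader with the work skin enabled would see "The castle was quiet.", while the extracted text contains a word that doesn't exist.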
As far as using the tool, I used an existing fic. I went into the edit page for the chapter, scrolled to the bottom and left the text editor on the default HTML mode. I copied everything in that box. (Easy method: click into the box where you can type out the fic, and press "Ctrl" and "A" to select all, then "Ctrl" and "C" to copy.) I went to the tab with all-hail-trash-prince's tool, and I pasted it into the box on the left.
I clicked "Apply poison" and the poisoned fic appeared in the right box. I copied the poisoned fic from the right box, went back to my fic on AO3 with my custom work skin already enabled, and I pasted the poison fic in place of the original fic. I clicked the preview button to make sure it would look normal, and it did. So I clicked to update the chapter with the poison block included.
I loaded the chapter with the default Microsoft screen reader turned on, and it didn't read any of the poison data, only the real fic that is visible on the screen, so success there.
So that brings us to applying this to a brand new fic. For those, you're going to go through the motions of posting a fic as usual, but instead of clicking post when you're done, you're going to swap that text editing mode over to HTML and copy everything in there. Take it to the poison tool, paste it in, and grab your poisoned copy. Go back to AO3, make sure your poison work skin is enabled, and then replace the original fic with the poison fic, making sure to stay in the HTML editing mode while you do.
(Sneaky quick edit after posting: sometimes the tool leaves you with a dangling <p> or </p> or <em>. Make sure you always preview the chapter after poisoning it, and you can go back in to the rich text editor to delete any of the floating tags that were accidentally put in by the poison.)
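If you'd like a second check on top of the preview, here is a rough sketch of a dangling-tag finder using Python's standard `html.parser`. It is not part of the poison tool, the function name is hypothetical, and it is stricter than a browser (it expects every `<p>` and `<em>` to be explicitly closed), so treat its output as hints about where to look rather than a verdict.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag.
VOID_TAGS = {"br", "hr", "img"}


class TagBalanceChecker(HTMLParser):
    """Tracks open tags and records closing tags that have no matching opener."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if tag not in self.open_tags:
            self.problems.append(f"stray </{tag}>")
            return
        # Pop back to the matching opener; anything skipped was never closed.
        while self.open_tags:
            top = self.open_tags.pop()
            if top == tag:
                break
            self.problems.append(f"unclosed <{top}>")


def find_dangling_tags(html: str) -> list:
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    # Whatever is still open at the end of the chapter was never closed.
    return checker.problems + [f"unclosed <{tag}>" for tag in checker.open_tags]


print(find_dangling_tags("<p>Hello <em>world</p>"))  # ['unclosed <em>']
```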
The last downside I notice is that your word count is immediately wrong. My 34k fic looks like a 43k fic after poisoning the first 16k words. Technically, you don't have to tell people the true word count of your fic but like. That feels a little rude to the reader, so I think it would be kind to briefly put the true word count either at the bottom of your summary or in your first author's note.
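For anyone curious about the arithmetic behind that, here is a tiny sketch using the rounded numbers from this paragraph (34k true words, 43k displayed, first 16k words poisoned). The function name is hypothetical, and the rate is only a ballpark since the inputs are rounded and the filler is presumably placed somewhat randomly.

```python
def filler_stats(true_words: int, displayed_words: int, poisoned_words: int):
    """Return (filler words added, filler words per poisoned real word)."""
    filler = displayed_words - true_words
    return filler, filler / poisoned_words


filler, rate = filler_stats(34_000, 43_000, 16_000)
print(filler, rate)  # 9000 0.5625
```

So in this one test, the displayed count picked up roughly one extra word for every two real words in the poisoned portion, which is why it's worth noting the true word count somewhere readers will see it.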
To me, the downsides of having to create a custom work skin (that trash-prince has kindly already written for everyone) and having the wrong word count displayed... are nothing. In comparison to having my fic be easy to scrape, I'll take those slight downsides any day. From what I know of the current scraping landscape, this is a reasonably effective way to make your fic useless to anyone who scrapes it, and there are people out there who will be scraping AO3 again.
I'm curious to hear anyone else's thoughts if they check this tool out or try it for themselves, so don't be shy! I'm one person, so maybe I can't catch everything. If you're seeing something that I'm not, I want to hear about it.
And if anyone wants a more visual step by step, you are welcome to yell my way. If this text post is clear enough for everyone, I won't bother, but if a more visual walkthrough will help anyone, then I'm happy to do it!
EDIT: Just tossing in a summary of feedback I've seen from others below!
The tool is pulling from a list of most popular English words, which means it may add inappropriate verbiage to G-rated fics. See this ask for info. trash-prince has made adjustments based on the initial words spotted, but please kindly report any other concerning poison words you find, particularly slurs and other wording that cannot be interpreted in a SFW way.
154 notes
welp. I just got it confirmed this afternoon that twenty five of my fics got caught up in the last AI scrape on AO3.
I've had my fics locked for over a year now, so. that officially means nothing anymore. these poor excuses for human beings are using accounts in bad faith, or otherwise getting around the registered-users-only lock to get at them.
if you're worried about your own work, you can go here and request the nice person digging through the metadata see what turns up when they search your username. if you don't want to make an account on AI Assholes R Us there, you can also message @occasionalklance here on tumblr, who is the person kind enough to do all the data-digging on their off-hours.
all this to say, I am tired, and not inclined to finish or post anything right now or maybe for a while, especially because the assholes over on hugg/ingface are saying they've set a scraper to capture any newly posted work. they might just be talking out of their collective ass to scare people, but I just don't feel like fucking with it.
to all genAI knobgobblers, I wish a very "Explosive Diarrhea Forever" 🖤
#archive of our own#ao3#ao3 scrape#fanfic#fanfiction#ai scraping#genAI scrape#fuck genAI#fuck genAI users#fuck genAI programmers
51 notes
So due to the recent ao3 scrape that happened, I wanted to make a post to express my thoughts on AI art and writing.
I do NOT support AI creations.
It is theft, plain and simple. Stealing other people’s creations that they worked hard on and giving them no credit for their involvement in the model training is terrible and it needs to stop. It isn’t true art at all. And to those who claim that they worked hard on their ai art, that is blatantly false. No personal effort was put in. They essentially just commissioned a machine to make their ideas into art. There’s no respect for the process of making it, which I personally believe is essential.
I do understand that it’s frustrating when you have ideas that you want put into form, but have no artistic ability. I can’t draw at all. But what do I do to solve this issue? Commission ACTUAL people when I have the funds.
The art community, both those who do drawings and those who write, we all need to support each other and build each other up. There’s so much potential in people, and it’s always special to see people blossom. To inspire each other and others who are just starting.
We can work together to make something more and special. We can do it.
34 notes
i am so pissed, SO PISSED that YET ANOTHER person has come along and scraped ao3 for their generative ai bullshit. every -- not overstating this -- EVERY public work has been processed for HuggingFace and its derivatives. people i know are restricting their works to registered users only, and i recommend you do the same. paperdemon, (another site that got scraped -- bless their hearts), has taken up the responsibility of keeping track of the datasets borne of the stolen content. two of the datasets (artgram and itaku) have been deleted, but the others (ao3, artfol, character hub, paintberri, paperdemon art, paperdemon writing) are only temporarily disabled due to the owners of the sites filing a DMCA.
i'm so fucking angry despite the fact that the ao3 people are already in the process of trying to remove it. they stole the work of people i respect, people i take inspiration from, and even MY work. i can't believe it. i have FAR more respect for the shit i took this morning than i ever will for those parasitic lobotomites. but hey -- on the bright side, ziff davis (parent company of IGN, PCMag, Mashable) is suing OpenAI. hopefully when their pockets take a hit, ai jagoffs get the message and stop scraping and let generative ai die
#ao3#fuck ai#fuck generative ai#anti generative ai#anti ai art#anti chatgpt#anti ai#ao3 scrape#paperdemon#paintberri#artfol#itaku#artgram#character hub
29 notes
frank herbert would hate knowing our dune fics were scraped from ao3 to train ai😔😔
#dune#lady jessica#leto atreides#jessica atreides#duke leto atreides#leto x jessica#paul atreides#dune by frank herbert#frank herbert#ao3#ao3 fanfic#ao3 scrape
30 notes
JUST FYI GUYS
Due to the scraping of Ao3 I will be privating all my works (if you have an ao3 account you can still read them)
#danny phantom#dpxdc#ao3 writer#ao3 author#ao3 fanfic#batman#dp fanfic#deadtired heist#deadtiredship#danny fenton#ao3 scrape#ao3 private work
25 notes