Skyler | adult | side blog for talking about AI on AO3 (/neg) | Commonly requested info in the pinned post! Anons are on, and asks are preferred over DMs! :) I'm a real person, so please critique but do it nicely
Don't wanna be here? Send us removal request.
Text
One more thing I did want to address: I am not looking to skirt any AO3 terms of service, stated or even implied. You won't see me encourage anything on this blog that I don't fully believe is allowed per AO3. A notable thing I'm seeing is some people saying this breaks a part of the TOS about technology, so I just wanted to walk through that section together.
This is the section people are referring to, as far as I'm aware. The relevant subsections are 1, 6, and 7.
Section 1: You can't violate the Content Policy.
Ignoring the parts that aren't relevant to this conversation, only "spam" and "technical integrity" really get close to what poisoning is.
Spam is defined on AO3 as "repeated identical or nearly identical posts." This is not what the poison tool does. You're not reposting the same fic multiple times. You're adding extraneous words to your unique fics.
Technical integrity is what I believe people are getting hung up on. However, technical integrity is about harming AO3's or others' hardware or software. With the poison, you are essentially deciding how your fic will display in certain places, but this does not harm any software or hardware. It only makes your fic appear formatted as you intended. There is no damage done to anyone's technology using the poisoning method I advocate for on this blog.
If you go into the FAQ on this section of the TOS, AO3 further elaborates on what they mean by technical integrity. To address that entire section: 1) This poison does not attempt to hack the site or exploit a code vulnerability. It uses only CSS which AO3 has specifically allowed to be used on the site. 2) This poison does not contain any viruses or unwanted programs. 3) This poison does not redirect users to spam sites - though I do encourage you to share an OPTIONAL link to your readers to tell them where they can request a clean copy of the fic from you. This is not redirecting in the sense AO3 means here; this provides free will to follow or not follow the link. 4) The poison does not attempt to undermine or evade compliance with any of the TOS. AO3 specifically allows the display: none feature, meaning you are allowed to hide certain words or paragraphs in your fic as it displays on the site. It's officially supported by AO3.
Section 6: You can't share anything that contains software viruses or other computer code, files, or programs designed to interrupt, destroy, or limit the functionality of any computer, hardware, or telecommunications equipment.
The poisoned copy of your fic does not contain any software viruses or computer code whatsoever.
The code designed by all-hail-trash-prince does contain computer code, which means you potentially cannot share all-hail-trash-prince's actual code on AO3. Talk about it elsewhere if you want to be overly cautious. I'd argue you can even talk about it and link it on AO3 because the code is not designed to interrupt any computer, hardware, or equipment; it's only intended to adjust how your fanfiction displays on that equipment. The poisoned fic itself cannot harm any tech. I only advise talking about it off AO3 out of an abundance of caution. But again, that's regarding the code itself. There is no code in this sense in your poisoned fic--it only contains some CSS formatting after you apply the poison, no code.
Section 7: You can't interfere with or disrupt AO3, any OTW-hosted content, or any services, sites, servers or networks connected to OTW sites.
This heavily overlaps with the above. Put simply, this means your fic can't do anything to harm AO3's website. Poisoning your fic does not harm the website. Again, this method of poisoning is just adding some CSS to determine how your fic will display to others.
If you guys are referring to some other section when you say this breaks policy, please point me in the right direction. I've been through the TOS so so so many times and haven't spotted anything to indicate using this method of poisoning is not allowed, but I'm happy to review anything I somehow missed.
I'm advocating for this because I think it's the current best way to be able to share your work publicly while reducing the risk of it being used in a way you never intended for it to be used. Based on all the information I have from AO3, nothing about this is against any of its terms. And as always, you can choose not to use this regardless.
12 notes
·
View notes
Text
Hi guys! I have an anon about making the EPUB downloads work, and it may take a minute for me to fully confirm things, but I do think we have a way to make those work with Calibre. I'll answer the ask after I'm completely sure :) just didn't wanna leave you hanging, anon!
6 notes
·
View notes
Text
vimeo
How to use the anti-AI fic poison
This technique is anti-AMATEUR AI. You aren't doing anything against the big commercial names with this! You are reducing your risk of being useful to an amateur building a new model. You are discouraging individuals like ny*uzy*u, not corporations like Ch*tG*T.
AGAIN. You are not required to use this. This information is being provided to give you the option to make a decision for yourself.
31 notes
·
View notes
Note
anon because I truly don't want to get into a slapfight here but. have you... tried asking any actual, current AIs if they can read your "poisoned" fanfiction? because. out of scientific curiosity I ran your recommended poisoner on some text (actually the text of the poisoner website itself) and then asked the bargain-basement, free-tier version of chatgpt what it thought the original text was. it was word-perfect. it had no trouble with it at all.
like this situation does suck and I respect that you want to do something about it but... I really don't think this method is going to do anything to even slightly inconvenience any AIs worth the name and plenty to hurt real people in the process. sorry :( you don't have to post my message or anything but maybe reconsider?
No worries, and I am not trying to get into a fight with anyone here either, including anyone who disagrees and chooses not to use any of the information provided. I try to be consistently clear about that, but to repeat: nothing on this blog is meant to read as, "YOU HAVE TO DO THIS."
(I am autistic and can struggle with how I come across to people, so if any of you are thinking I'm being a jackass, please tell me so. It is most likely not intentional.)
That is good information to have, and I did face my reluctance to use AI at all to confirm for myself. I only cried a little as I created a Ch*tG*T account so I could share and ask it what it sees in the scraped copy of my poisoned fic.
And then I yapped too much, so the rest is under the cut!
Ch*tG*T, fed a directly scraped copy of the fic, at first hallucinated an entirely different story with metadata, characters, and a plot that aren't in the scraped JSONL file at all. I scraped only my own fic, so the entire file is only metadata and the poisoned work text for my one fic. It came up with a supernatural story about an Elena experiencing grief and healing when this is actually a Voltron Klance fic about getting together post-canon. There's admittedly a short section about grief in chapter 2, but definitely no Elena or supernatural elements.
I directed the AI toward where I know the actual work data is stored since it first said that there's no work text in the file (wrong), and after some prodding, it could tell me about my own fic, including providing a clean copy of the first paragraph. So yes, AI can read the poisoned fic. If your goal is to stop existing AI models from reading your work, this poison isn't for you.
For funsies, I also asked if it could spot signs that this fic is poisoned, and it said yes, pointing me specifically toward the nonsensical words that came from the poison code smashing multiple words together. Again, this isn't a problem for me. I am not specifically trying to sneak my dirty data into a training set, though that would be a huge, funny bonus in my opinion. Interestingly enough, it could not understand all of the content of the poison. I gave it my original poisoned copy of this fic, the copy that had a racial slur before I caught it. Ch*tG*T insisted both the real and the adversarial text contained no slurs.
And I'll apologize at this point because maybe I haven't been clear enough about my own goals here. I am not trying to defeat huge existing commercial AIs with a shit-ton of resources behind them.
And as long as I'm creating a fic that a human can read, then yes, a human can just copy the human-readable view and feed it to whatever existing AI model they want. Part of being in this community is having to trust that someone else in the community wouldn't do that to you, but I know that's easier for those of us with less popular fics. But let's say you're popular, or one of your readers is just super selfish... Fed a copy of the poisoned fic, a commercial AI with a lot of expensive resources available can decrypt what it's reading even with the poison, or the person can direct it toward a historic clean copy if all else fails. An invested human is going to be the one interacting with an existing commercial AI, and they're likely smart enough to work around any obstacles. I can't stop a single asshole human who chooses to devalue your work by directly feeding it to AI, one fic at a time.
Whether it's a selfish but real reader or a mass scraper who doesn't care whose fic they're grabbing, complete abstinence from sharing your fics on any service connected to the internet in any way is the only way to guarantee your fics will never be scraped and used to train AI. Fanfiction is my biggest hobby, so that's not an acceptable method to me on a personal level. All I can do as someone who still wants to share my stories is to minimize the risk of the negative outcomes I can foresee.
The negative outcomes I know of:
Someone feeds my fic straight to Ch*tG*T. The only way to prevent this is to never share my fics at all, since I don't know who will decide to pass it along to AI.
Someone can't access my work. I choose to partially prevent this by continuing to share my work online because that is part of what I enjoy about fanfiction. This is the area being negatively impacted by my choice of protection, and as mentioned, I'm looking to reduce this impact over time. I don't like cutting people out of accessing my works, but I can't test the promising methods without actually testing them.
Someone amateur mass-scrapes works including mine, for the purpose of training a new AI model. This is the area I'm focused on. This is something I can try to improve by testing and adopting new techniques. This isn't doing anything to existing AI models; this is trying to stop fresh new ones from gaining anything from my own work.
As someone whose job revolves a lot around data analysis, I know the headache of dirty data. If I don't spot it, I make some terrible recommendations to my clients. If it's bad enough, I lose my company millions of dollars of business with my failure to spot that bad data. If I'm lucky, my internal team catches it before the client ever sees it, and that's still annoying because I have to do another fresh analysis of the data after cleaning it, which is time-consuming. If I do spot the dirty data up front, it's still often a massive pain to figure out how to get it out without changing my clean data, and it makes my job slower and more annoying. It's my job, so I do it.
You hand me that same type of stuff when I'm off the clock? I'm not bothering with it. I'll take the easier method of finding a quick surefire way to spot the dirty data, and I'll toss all those data points out and stick with the clean ones. I won't bother trying to salvage the junk because I only have so many hours in a day, and I can usually achieve what I want with the clean data I do have.
I might overcome it if it's all dirty in the exact same way and I can quickly figure out how to fix it. I'm much less likely to bother if I spot various different types of dirty in there, such as what you'd get from a community of authors using 20+ different poisoning tools instead of all sharing the same exact one.
The goal to me is to be that annoying dirty data that gets tossed aside for being too annoying. Or if the person using the data for AI training is really lazy and doesn't bother to check for poison, then I get the pleasure of being 71 drops toward making their results make less sense. At the bare minimum, I'm trying to be a minor annoyance that makes the scraper's day just a teensy bit worse as they go through additional data cleaning work before they can use my writing.
I did figure I might as well go all the way if I was going to cave and use AI today anyway, so I also asked Ch*tG*T how many fics poisoned in this current style would need to be present to have a slight but notable impact on a new model trained on the data. If you trust the AI, the answer is minimum 0.2% of all scraped works need to contain poison, assuming the new model is trained only on the scraped fics. Meaning only 26,000 of these poisoned fics could've made some fun results if they'd been poisoned at the time of nyuuzyou's scrape. The more after that, the more messed up the model gets. Again, assuming no other training data is used to try to get around this, and assuming the trainer doesn't bother removing the junk fics and sticking to the unpoisoned ones.
And I've visited this last part before in a couple reblogs now, but it bears repeating: this is something you have to choose to do or not to do on a personal level. I prioritize doing the little bit I can to protect my writing over having my fics be easily downloadable, as long as the demographics among my readers stay the same. There is no known demand for the download feature among the specific people who read my fics, so this is not a priority for me until I hear otherwise. I'm open to making it a priority for myself based on how readers react to the notices I've provided about the poison, but that has been a non-issue for me.
I'm choosing to make it a priority anyway because I want anyone here to have the option to use this without hurting their readers. However, we are not completely there yet, so if native download functionality is important to you, the poison isn't for you yet! With a simple workaround, this works fine with screen readers on mobile. With no adjustments at all, this works fine with screen readers on desktop reading directly through the site. Right now, the only issue I'm aware of is that people who need access to change the colors and fonts on a dime can't do that on a downloaded copy; addressing that is my current priority.
Even looking at the native download functionality, it does appear that Calibre is able to decode the poison and allow you to read a poisoned EPUB. I'm looking to confirm this in detail when I have a minute, and this may cover the above issue on closer look.
tl;dr I'm playing with my own fics--this info is being provided for you to make your own decision if you'd like the option to do the same. I'm not trying to stop existing commercial AI like Ch*tG*T. I'm trying to annoy the amateurs out of using my data for new models or other projects I didn't consent to. I'm not striving to hurt real human readers, and I mean it when I encourage you guys to bring accessibility and other concerns to me (just be nice please). This is all a work in progress.
15 notes
·
View notes
Text
Thank you guys for being patient with me! I've been a lot busier with real life than I expected to be, but I'm hoping to sit down this weekend and record my screen while I walk through:
A visual tutorial to apply the poison as a writer
Notes on how to check for expected errors as a writer
Notes on accessibility and who SHOULDN'T use the poison
Tips on how to navigate a poisoned fic as a reader
And anything else that you guys tell me you're interested in understanding better
So if there's something you have for point #5, leave it in the replies/reblogs/tags of this post or send an ask if you'd rather ask anonymously! I'm hoping to start recording tomorrow evening.
9 notes
·
View notes
Note
Reblogging to add a visual on how readers can still save an unpoisoned copy of the fic, as well as some notes on each method!
"Print as PDF"
Pros:
Works on mobile browsers
Works with TTS apps that support PDFs
Cons:
Ugly as all hell with little to no options to format it differently unless you have a paid copy of a PDF editing program
Not as easy as clicking AO3's download button
On PC:
1) Load the fic on AO3 as usual.
2) Absolutely do NOT click this button. It'll make an ugly unreadable copy.
3) Make sure this button says "Chapter by Chapter." If it says "Entire Work," then you're only looking at the first chapter, and this process will ONLY save chapter 1. Just click that button now if you need to switch it out.
4) Press "Ctrl" + "P" or click through the way you usually print something from your browser.
5) Instead of printing to whatever your normal printer is (if you use one at all!), change this option to "Save to PDF." If you want to change any other print settings you can do that now.
6) Click save, and it will pop up a save dialog asking where you want to save it. From there, just do whatever you normally do when you use the regular download button on AO3!
On mobile:
1) In your browser, go to the fic.
2) Still definitely do NOT click the normal download button.
3) Still definitely make sure the fic is on view entire work mode. The button at the top says "Chapter by Chapter" if you're in the right mode.
4) Click the menu you use to view more options when you're in a browser tab. Mine looks like this in the Samsung browser, but yours may be a little different.
5) Pick the "Print/PDF" option.
6) If yours has settings, awesome. You can play with those. Mine is bare bones, so I can just click here to change my printer and that's it. Use this setting to select "Save as PDF" to get a PDF.
7) Click the save button and continue the way you normally would with the regular AO3 download button.
Save as HTML
Pros:
This is more customizable because you're downloading all of the HTML and theoretically can change settings for fonts and backgrounds
Cons:
...BUT that may not be easy. That's new territory for me, so I'm willing to look into it more if anyone is interested in the potential, but I haven't had a need to use it because I always read fics in my browser.
And this could be intimidating if you've never played with HTML files before. PDFs may feel a lot simpler.
And it's still ugly, though in my opinion not AS bad as the print to PDF option.
Still not as easy as clicking AO3's download button.
On PC:
1) Load the work, same as above.
2) Same as above, do NOT use AO3's button to save.
3) Same as above, make sure you've loaded the ENTIRE work, not just the first chapter.
4) Press "Ctrl" + "S" to save the entire webpage.
5) Use the save dialog to put the file wherever you want it. Think of it as similar to a PDF or EPUB of your fic. It probably gets saved wherever you put those.
(Pro tip: if the file name it suggests has the phrase "Chapter 1" near the end, you forgot to switch over to viewing the entire work! Cancel out and switch to the entire work before you actually save, unless you're really trying to save only one chapter of the fic.)
6) Go to the place where you saved the fic. When you save HTML like this, it will create both an HTML file and a folder of files it uses to help that HTML file understand what it's supposed to look like. When you're ready to open it, you want to ignore the folder and open the actual HTML file.
7) If you've never opened an HTML file before, it might ask you what program you want to use to open the file. I select "Firefox" because that's my default browser. It should work with Firefox, Chrome, Edge, whatever you use to look at websites normally.
8) The top looks ugly, exactly like how websites look with your internet is getting interrupted while loading them. Scroll down a bit and the fic itself looks fine. It's just the top that's weird.
Neither option is amazing, so you might want to ask yourself if this is worth it to you. Personally, I don't know of my fics having any readers who are downloading the fic to read offline, so it's not something I care much about for my own fics. I do care how it affects your fics, so it's still something on my to-do list. Whether that comes from finding an alternative download method that does work well or by modifying the poisoning method while still poisoning the part of the website mass scrapers go after, I'll keep looking to see if there's something we can do to improve the fic-downloading aspect of this.
Honorable mentions:
One riskier thing you could try is including an alternate link to the unpoisoned fic in your A/N. Generally, AI scraping is automated, so they're not actually reading the contents of your A/Ns or following any links you put in there, but you should know that they could theoretically do that. It's probably far more effort than they'd be willing to go to, but it's still a risk. That being said, if you want to take that risk, just use the A/N to put a link to a read-only copy of your fic on your choice of alternate host such as Ellipsis. (Not trying to advertise for them specifically, but I just really don't wanna rec something like GDocs with the decisions it's made around AI. Just use whatever host you like that allows guest access and lets you make a read-only/no editing copy.)
And finally, the less risky version of that is more manual, but you can tell readers in your A/N that you are willing to privately share a clean copy of the fic. Just give them somewhere to contact you, and use your best judgment. I highly doubt someone scraping is going to go to the effort to DMing you for a clean copy, but again. It's not impossible.
Hiya! I know it's been a little while but I just wanted to let you know I finally got around to making the web version of that fic poisoning tool I made about a month ago. It's at https://tricksofloki.github.io/ficpoison.html if you're interested :)
OHOHOHO!
Alright, I gave this a little test on my own fic over here. Quick little review/notes for anyone interested! (But the tl;dr is that I approve based on my initial review of the original code and based on using this web tool to automate running the code.)
This version is super easy to use. I'll be honest; I was struggling trying to figure out how to run the code locally before because that is not a coding language I personally use, and this website takes out all of the hard part of doing that. You need to do the one time task of creating a work skin to enable the "poison" CSS used, and you need to make sure that work skin is enabled for any work you're going to use this on. The code to put into your work skin is available at the link. If you already have a work skin you use, you can just add this class to it. (I think the tutorial I linked to does a good job walking you through how, but I'm open to doing a tutorial on this blog if anyone wants that.)
If you're poisoning an existing fic, first have a backup copy. Once you poison it, that copy is going to be annoying to UN-poison if you ever want to, so you should keep a private copy on your PC or phone or wherever so you have the unpoisoned version available. Once you do this, your copy on AO3 is poisoned, and it would take a fair amount of effort to unpoison as the author. Upside: as the author, you can see all the CSS stuff in the background, so if you really need to unpoison a copy as the author with full access to it, it's not impossible. Just really annoying.
For reference, here's what I can see as the author with access to the edit page:
I can clearly see where the poison is if I really wanted to go back through and unpoison.
And here is what I can see in a copy scraped with nyuuzyou's code:
You can definitely see it's messed up by looking, but you don't see an active callout to where exactly the poison code is. Keep in mind that not every scraper uses the same code as nyuuzyou, and more sophisticated code may pull something more sophisticated than the plain text from nyuuzyou's tool. Other scrapers may be pulling fics with the formatting and everything, and I don't know exactly what that output looks like. Depending on what their output is, if they can see the class for the poison, they can pretty easily code something to remove it. That's me being overly conservative, I suspect. I haven't heard of any scrapers who have bothered with anything more than plain text, and this isn't an issue unless they're grabbing the full HTML. (Translation: From what I know, this is NOT an issue. Yet. So this is not a weakness of the poison tool. Yet.)
Based on the output, anyone who's doing a half decent job of cleaning up the data they scrape would toss my fic out of the dataset. It's full of what look like typos because the poison got placed mid-word, so it looks like I just suck at writing. If your goal is to get tossed out of the dataset, this is perfect. If a scraper isn't paying attention at all, you can contribute some really terrible training data if they leave your fic in the set because your poisoned fic is going to be full or words that don't even exist thanks to the word placement.
As far as using the tool, I used an existing fic. I went into the edit page for the chapter, scrolled to the bottom and left the text editor on the default HTML mode. I copied everything in that box. (Easy method: click into the box where you can type out the fic, and press "Ctrl" and "A" to select all, then "Ctrl" and "C" to copy.) I went to the tab with all-hail-trash-prince's tool, and I pasted it into the box on the left.
I clicked "Apply poison" and the poisoned fic appeared in the right box. I copied the poisoned fic from the right box, went back to my fic on AO3 with my custom work skin already enabled, and I pasted the poison fic in place of the original fic. I clicked the preview button to make sure it would look normal, and it did. So I clicked to update the chapter with the poison block included.
I loaded the chapter with the default Microsoft screen reader turned on, and it didn't read any of the poison data, only the real fic that is visible on the screen, so success there.
So that brings us to applying this to a brand new fic. For those, you're going to go through the motions of posting a fic as usual, but instead of clicking post when you're done, you're going to swap that text editing mode over to HTML and copy everything in there. Take it to the poison tool, paste it in, and grab your poisoned copy. Go back to AO3, make sure your poison work skin is enabled, and then replace the original fic with the poison fic, making sure to stay in the HTML editing mode while you do.
(Sneaky quick edit after posting: sometimes the tool leaves you with a dangling <p> or </p> or <em>. Make sure you always preview the chapter after poisoning it, and you can go back in to the rich text editor to delete any of the floating tags that were accidentally put in by the poison.)
The last downside I notice is that your word count is immediately wrong. My 34k fic looks like a 43k fic after poisoning the first 16k words. Technically, you don't have to tell people the true word count of your fic but like. That feels a little rude to the reader, so I think it would be kind to briefly put the true word count either at the bottom of your summary or in your first author's note.
To me, the downsides of having to create a custom work skin (that trash-prince has kindly already written for everyone) and having the wrong word count displayed... are nothing. In comparison to having my fic be easy to scrape, I'll take those slight downsides any day. From what I know of the current scraping landscape, this is a reasonably effective way to make your fic useless to anyone who scrapes it because people are out there that will be scraping AO3 again.
I'm curious to hear anyone else's thoughts if they check this tool out or try it for themselves, so don't be shy! I'm one person, so maybe I can't catch everything. If you're seeing something that I'm not, I want to hear about it.
And if anyone wants a more visual step by step, you are welcome to yell my way. If this text post is clear enough for everyone, I won't bother, but if a more visual walkthrough will help anyone, then I'm happy to do it!
EDIT: Just tossing in a summary of feedback I've seen from others below!
The tool is pulling from a list of most popular English words, which means it may add inappropriate verbiage to G-rated fics. See this ask for info. trash-prince has made adjustments based on the initial words spotted, but please kindly report any other concerning poison words you find, particularly slurs and other wording that cannot be interpreted in a SFW way.
164 notes
·
View notes
Text
Some additional critique I'd appreciate some brainstorming on!
I gave this a shot knowing the tools I already have, and I could not get the native print to PDF option to print in any style except the default font and black font on white background.
Whatever solution we go for with this, it needs to at a minimum:
Have a free version so that anyone can use it
Offer the reader's choice of fonts
Offer the reader's choice of background AND font color
Works with either a PDF or an HTML file
If you have anything that you think can do this, please share! I will review and see what actually works.
And if anyone else has accessibility critiques, please share them. Don't be an ass about it, but I do want to know how this current solution is or may possibly negatively affect you. The goal is to continuously improve the solution so it works for more and more people and ideally doesn't exclude anyone.
This tool is optional. No one is required to use it, but it's here if you want to know which of your AO3 fics were scraped. Locked works were not 100% protected from this scrape. Currently, I don't know of any next steps you should be taking, so this is all informational.
Most people should use this link to check if they were included in the March 2025 AO3 scrape. This will show up to 2,000 scraped works for most usernames.
Or you can use this version, which is slower but does a better job if your username is a common word. This version also lets you look up works by work ID number, which is useful if you're looking for an orphaned or anonymous fic.
If you have more than 2,000 published works, first off, I am jealous of your motivation to write that much. But second, that won't display right on the public version of the tools. You can send me an ask (preferred) or DM (if you need to) to have me do a custom search for you if you have more than 2,000 total works under 1 username. If you send an ask off-anon asking me to search a name, I'll assume you want a private answer.
In case this post breaches containment: this is a tool that only has access to the work IDs, titles, author names, chapter counts, and hit counts of the scraped fics for this most recent scrape by nyuuzyou discovered in April 2025. There is no other work data in this tool. This never had the content of your works loaded to it, only info to help you check if your works were scraped. If you need additional metadata, I can search my offline copy for you if you share a work ID number and tell me what data you're looking for. I will never search the full work text for anyone, but I can check things like word counts and tags.
Please come yell if the tool stops working, and I'll fix as fast as I can. It's slow as hell, but it does load eventually. Give it up to 10 minutes, and if it seems down after that, please alert me via ask! Anons are on if you're shy. The link at the top is faster and handles most users well.
On mobile, enable screen rotation and turn your phone sideways. It's a litttttle easier to use like that. It works better if you can use desktop.
Some FAQs below the cut:
"What do I need to do now?": At this time, the main place where this dataset was shared is disabled. As far as I'm aware, you don't need to do anything, but I'll update if I hear otherwise. If you're worried about getting scraped again, locking your fics to users only is NOT a guarantee, but it's a little extra protection. There are methods that can protect you more, but those will come at a cost of hiding your works from more potential readers as well.
"I know AO3 will be scraped again, and I'm willing to put a silly amount of effort into making my fics unusable for AI!": Excellent, stick around here. I'm currently trying to keep up with anyone working on solutions to poison our AO3 fics, and I will be reblogging information about doing this as I come across it.
"I want my fics to be unusable for AI, but I wanna be lazy about it.": You're so real for that, bestie. It may take awhile, but I'm on the lookout for data poisoning methods that require less effort, and I will boost posts regarding that once I find anything reputable.
"I don't want to know!": This tool is 100% optional. If you don't want to know, simply don't click the link. You are totally welcome to block me if it makes you feel more comfortable.
"Can I see the exact content they scraped?": Nope, not through me. I don't have the time to vet every single person to make sure they are who they say they are, and I don't want to risk giving a scraped copy of your fic to anyone else. If you really want to see this, you can find the info out there still and look it up yourself, but I can't be the one to do it for you.
"Are locked fics safe?": Not safe, but so far, it appears that locked fics were scraped less often than public fics. The only fics I haven't seen scraped as of right now are fics in unrevealed collections, which even logged-in users can't view without permission from the owner.
"My work wasn't a fic. It was an image/video/podfic.": You're safe! All the scrape got was stuff like the tags you used and your title and author name. The work content itself is a blank gap based on the samples I've checked.
"It's slow.": Unfortunately, a 13 million row data dashboard is going to be on the slow side. I think I've done everything I can to speed it up, but it may still take up to 10 minutes to load if you use the second link. It's faster if you can use desktop or the first link, but it should work on your phone too.
"My fic isn't there.": The cut-off date is around February 15th, 2025 for oneshots, but chapters posted up to March 21st, 2025 have been found in the data so far. I had to remove a few works from the dataset because the data was all skrungly and breaking my tool. (The few fics I removed were NOT in English.) Otherwise, from what I can tell so far, the scraper's code just... wasn't very good, so most likely, your fic was missed by random chance.
Thanks to everyone who helped with the cost to host the tool! I appreciate you so so so much. As of this edit, I've received more donations than what I paid to make this tool so you do NOT need to keep sending money. (But I super appreciate everyone who did help fund this! I just wanna make sure we all know it's all paid for now, so if you send any more that's just going to my savings to fix the electrical problems with my house. I don't have any more costs to support for this project right now.)
(Made some edits to the post on 27-May-2025 to update information!)
7K notes
·
View notes
Text
Reblogging some accessibility critique in hopes we can collectively get some brainstorming over how to get around this! If any of you have ideas, I'm interested in hearing them. I'm not striving to make my stories less accessible to people who want to read them, so the more we can reduce the impact to any accessibility tools, the better.
The major issues I'm seeing here are:
People having a routine that works for them already and not being willing to adjust. (All the current method offers now is a suggestion that any author using it includes a note in the first A/N to alert readers and offer guidance on workarounds. That's relying on the author to understand well enough to give that guidance and to care enough to take the time to do so, and it's relying on the reader paying attention to the A/N before they try to download the fic.)
TTS apps that specifically require a PDF or other downloaded copy of the fic instead of being able to read from the browser. (That could be all of them. I personally have only used PC-based TTS before testing the app prev mentioned using, so I'm not well-versed in what mobile apps have up for offer. If anyone wants to infodump, I'm happy to hear it.)
At this time, I do believe it's not going to be possible to accommodate this, but I would love to be proven wrong. The issue is that the download button you have been using for a decade is hooked to the same area of the website that is the easiest and most reliable to scrape without being detected and stopped.
We're brainstorming, so there's no dumb idea in my opinion. If you think it's dumb anyway, my anons are still on if you want to contribute without attaching your name to the idea.
This tool is optional. No one is required to use it, but it's here if you want to know which of your AO3 fics were scraped. Locked works were not 100% protected from this scrape. Currently, I don't know of any next steps you should be taking, so this is all informational.
Most people should use this link to check if they were included in the March 2025 AO3 scrape. This will show up to 2,000 scraped works for most usernames.
Or you can use this version, which is slower but does a better job if your username is a common word. This version also lets you look up works by work ID number, which is useful if you're looking for an orphaned or anonymous fic.
If you have more than 2,000 published works, first off, I am jealous of your motivation to write that much. But second, that won't display right on the public version of the tools. You can send me an ask (preferred) or DM (if you need to) to have me do a custom search for you if you have more than 2,000 total works under 1 username. If you send an ask off-anon asking me to search a name, I'll assume you want a private answer.
In case this post breaches containment: this is a tool that only has access to the work IDs, titles, author names, chapter counts, and hit counts of the scraped fics for this most recent scrape by nyuuzyou discovered in April 2025. There is no other work data in this tool. This never had the content of your works loaded to it, only info to help you check if your works were scraped. If you need additional metadata, I can search my offline copy for you if you share a work ID number and tell me what data you're looking for. I will never search the full work text for anyone, but I can check things like word counts and tags.
Please come yell if the tool stops working, and I'll fix as fast as I can. It's slow as hell, but it does load eventually. Give it up to 10 minutes, and if it seems down after that, please alert me via ask! Anons are on if you're shy. The link at the top is faster and handles most users well.
On mobile, enable screen rotation and turn your phone sideways. It's a litttttle easier to use like that. It works better if you can use desktop.
Some FAQs below the cut:
"What do I need to do now?": At this time, the main place where this dataset was shared is disabled. As far as I'm aware, you don't need to do anything, but I'll update if I hear otherwise. If you're worried about getting scraped again, locking your fics to users only is NOT a guarantee, but it's a little extra protection. There are methods that can protect you more, but those will come at a cost of hiding your works from more potential readers as well.
"I know AO3 will be scraped again, and I'm willing to put a silly amount of effort into making my fics unusable for AI!": Excellent, stick around here. I'm currently trying to keep up with anyone working on solutions to poison our AO3 fics, and I will be reblogging information about doing this as I come across it.
"I want my fics to be unusable for AI, but I wanna be lazy about it.": You're so real for that, bestie. It may take awhile, but I'm on the lookout for data poisoning methods that require less effort, and I will boost posts regarding that once I find anything reputable.
"I don't want to know!": This tool is 100% optional. If you don't want to know, simply don't click the link. You are totally welcome to block me if it makes you feel more comfortable.
"Can I see the exact content they scraped?": Nope, not through me. I don't have the time to vet every single person to make sure they are who they say they are, and I don't want to risk giving a scraped copy of your fic to anyone else. If you really want to see this, you can find the info out there still and look it up yourself, but I can't be the one to do it for you.
"Are locked fics safe?": Not safe, but so far, it appears that locked fics were scraped less often than public fics. The only fics I haven't seen scraped as of right now are fics in unrevealed collections, which even logged-in users can't view without permission from the owner.
"My work wasn't a fic. It was an image/video/podfic.": You're safe! All the scrape got was stuff like the tags you used and your title and author name. The work content itself is a blank gap based on the samples I've checked.
"It's slow.": Unfortunately, a 13 million row data dashboard is going to be on the slow side. I think I've done everything I can to speed it up, but it may still take up to 10 minutes to load if you use the second link. It's faster if you can use desktop or the first link, but it should work on your phone too.
"My fic isn't there.": The cut-off date is around February 15th, 2025 for oneshots, but chapters posted up to March 21st, 2025 have been found in the data so far. I had to remove a few works from the dataset because the data was all skrungly and breaking my tool. (The few fics I removed were NOT in English.) Otherwise, from what I can tell so far, the scraper's code just... wasn't very good, so most likely, your fic was missed by random chance.
Thanks to everyone who helped with the cost to host the tool! I appreciate you so so so much. As of this edit, I've received more donations than what I paid to make this tool so you do NOT need to keep sending money. (But I super appreciate everyone who did help fund this! I just wanna make sure we all know it's all paid for now, so if you send any more that's just going to my savings to fix the electrical problems with my house. I don't have any more costs to support for this project right now.)
(Made some edits to the post on 27-May-2025 to update information!)
7K notes
·
View notes
Text
Maybe I'm not getting it! To me, it makes the most sense to use an AO3 site skin with the font and background color that is accessible to me. I'm just getting hung up on the why of downloading as an epub just for accessibility if it's only for the colors. (Genuine question, not trying to be a dick. Just trying to understand why you have to do it this way so I can work on improving our methods to exclude fewer people.)
This tool is optional. No one is required to use it, but it's here if you want to know which of your AO3 fics were scraped. Locked works were not 100% protected from this scrape. Currently, I don't know of any next steps you should be taking, so this is all informational.
Most people should use this link to check if they were included in the March 2025 AO3 scrape. This will show up to 2,000 scraped works for most usernames.
Or you can use this version, which is slower but does a better job if your username is a common word. This version also lets you look up works by work ID number, which is useful if you're looking for an orphaned or anonymous fic.
If you have more than 2,000 published works, first off, I am jealous of your motivation to write that much. But second, that won't display right on the public version of the tools. You can send me an ask (preferred) or DM (if you need to) to have me do a custom search for you if you have more than 2,000 total works under 1 username. If you send an ask off-anon asking me to search a name, I'll assume you want a private answer.
In case this post breaches containment: this is a tool that only has access to the work IDs, titles, author names, chapter counts, and hit counts of the scraped fics for this most recent scrape by nyuuzyou discovered in April 2025. There is no other work data in this tool. This never had the content of your works loaded to it, only info to help you check if your works were scraped. If you need additional metadata, I can search my offline copy for you if you share a work ID number and tell me what data you're looking for. I will never search the full work text for anyone, but I can check things like word counts and tags.
Please come yell if the tool stops working, and I'll fix as fast as I can. It's slow as hell, but it does load eventually. Give it up to 10 minutes, and if it seems down after that, please alert me via ask! Anons are on if you're shy. The link at the top is faster and handles most users well.
On mobile, enable screen rotation and turn your phone sideways. It's a litttttle easier to use like that. It works better if you can use desktop.
Some FAQs below the cut:
"What do I need to do now?": At this time, the main place where this dataset was shared is disabled. As far as I'm aware, you don't need to do anything, but I'll update if I hear otherwise. If you're worried about getting scraped again, locking your fics to users only is NOT a guarantee, but it's a little extra protection. There are methods that can protect you more, but those will come at a cost of hiding your works from more potential readers as well.
"I know AO3 will be scraped again, and I'm willing to put a silly amount of effort into making my fics unusable for AI!": Excellent, stick around here. I'm currently trying to keep up with anyone working on solutions to poison our AO3 fics, and I will be reblogging information about doing this as I come across it.
"I want my fics to be unusable for AI, but I wanna be lazy about it.": You're so real for that, bestie. It may take awhile, but I'm on the lookout for data poisoning methods that require less effort, and I will boost posts regarding that once I find anything reputable.
"I don't want to know!": This tool is 100% optional. If you don't want to know, simply don't click the link. You are totally welcome to block me if it makes you feel more comfortable.
"Can I see the exact content they scraped?": Nope, not through me. I don't have the time to vet every single person to make sure they are who they say they are, and I don't want to risk giving a scraped copy of your fic to anyone else. If you really want to see this, you can find the info out there still and look it up yourself, but I can't be the one to do it for you.
"Are locked fics safe?": Not safe, but so far, it appears that locked fics were scraped less often than public fics. The only fics I haven't seen scraped as of right now are fics in unrevealed collections, which even logged-in users can't view without permission from the owner.
"My work wasn't a fic. It was an image/video/podfic.": You're safe! All the scrape got was stuff like the tags you used and your title and author name. The work content itself is a blank gap based on the samples I've checked.
"It's slow.": Unfortunately, a 13 million row data dashboard is going to be on the slow side. I think I've done everything I can to speed it up, but it may still take up to 10 minutes to load if you use the second link. It's faster if you can use desktop or the first link, but it should work on your phone too.
"My fic isn't there.": The cut-off date is around February 15th, 2025 for oneshots, but chapters posted up to March 21st, 2025 have been found in the data so far. I had to remove a few works from the dataset because the data was all skrungly and breaking my tool. (The few fics I removed were NOT in English.) Otherwise, from what I can tell so far, the scraper's code just... wasn't very good, so most likely, your fic was missed by random chance.
Thanks to everyone who helped with the cost to host the tool! I appreciate you so so so much. As of this edit, I've received more donations than what I paid to make this tool so you do NOT need to keep sending money. (But I super appreciate everyone who did help fund this! I just wanna make sure we all know it's all paid for now, so if you send any more that's just going to my savings to fix the electrical problems with my house. I don't have any more costs to support for this project right now.)
(Made some edits to the post on 27-May-2025 to update information!)
7K notes
·
View notes
Text
Not sure why the attitude in the tags since this literally works with the app you mentioned using the basic workaround suggested above.
For anyone interested @ Voice works with this poisoning method as long as you (the reader) use the "Print to PDF" workaround that authors can choose to mention in their A/Ns on poisoned fics.
This tool is optional. No one is required to use it, but it's here if you want to know which of your AO3 fics were scraped. Locked works were not 100% protected from this scrape. Currently, I don't know of any next steps you should be taking, so this is all informational.
Most people should use this link to check if they were included in the March 2025 AO3 scrape. This will show up to 2,000 scraped works for most usernames.
Or you can use this version, which is slower but does a better job if your username is a common word. This version also lets you look up works by work ID number, which is useful if you're looking for an orphaned or anonymous fic.
If you have more than 2,000 published works, first off, I am jealous of your motivation to write that much. But second, that won't display right on the public version of the tools. You can send me an ask (preferred) or DM (if you need to) to have me do a custom search for you if you have more than 2,000 total works under 1 username. If you send an ask off-anon asking me to search a name, I'll assume you want a private answer.
In case this post breaches containment: this is a tool that only has access to the work IDs, titles, author names, chapter counts, and hit counts of the scraped fics for this most recent scrape by nyuuzyou discovered in April 2025. There is no other work data in this tool. This never had the content of your works loaded to it, only info to help you check if your works were scraped. If you need additional metadata, I can search my offline copy for you if you share a work ID number and tell me what data you're looking for. I will never search the full work text for anyone, but I can check things like word counts and tags.
Please come yell if the tool stops working, and I'll fix as fast as I can. It's slow as hell, but it does load eventually. Give it up to 10 minutes, and if it seems down after that, please alert me via ask! Anons are on if you're shy. The link at the top is faster and handles most users well.
On mobile, enable screen rotation and turn your phone sideways. It's a litttttle easier to use like that. It works better if you can use desktop.
Some FAQs below the cut:
"What do I need to do now?": At this time, the main place where this dataset was shared is disabled. As far as I'm aware, you don't need to do anything, but I'll update if I hear otherwise. If you're worried about getting scraped again, locking your fics to users only is NOT a guarantee, but it's a little extra protection. There are methods that can protect you more, but those will come at a cost of hiding your works from more potential readers as well.
"I know AO3 will be scraped again, and I'm willing to put a silly amount of effort into making my fics unusable for AI!": Excellent, stick around here. I'm currently trying to keep up with anyone working on solutions to poison our AO3 fics, and I will be reblogging information about doing this as I come across it.
"I want my fics to be unusable for AI, but I wanna be lazy about it.": You're so real for that, bestie. It may take awhile, but I'm on the lookout for data poisoning methods that require less effort, and I will boost posts regarding that once I find anything reputable.
"I don't want to know!": This tool is 100% optional. If you don't want to know, simply don't click the link. You are totally welcome to block me if it makes you feel more comfortable.
"Can I see the exact content they scraped?": Nope, not through me. I don't have the time to vet every single person to make sure they are who they say they are, and I don't want to risk giving a scraped copy of your fic to anyone else. If you really want to see this, you can find the info out there still and look it up yourself, but I can't be the one to do it for you.
"Are locked fics safe?": Not safe, but so far, it appears that locked fics were scraped less often than public fics. The only fics I haven't seen scraped as of right now are fics in unrevealed collections, which even logged-in users can't view without permission from the owner.
"My work wasn't a fic. It was an image/video/podfic.": You're safe! All the scrape got was stuff like the tags you used and your title and author name. The work content itself is a blank gap based on the samples I've checked.
"It's slow.": Unfortunately, a 13 million row data dashboard is going to be on the slow side. I think I've done everything I can to speed it up, but it may still take up to 10 minutes to load if you use the second link. It's faster if you can use desktop or the first link, but it should work on your phone too.
"My fic isn't there.": The cut-off date is around February 15th, 2025 for oneshots, but chapters posted up to March 21st, 2025 have been found in the data so far. I had to remove a few works from the dataset because the data was all skrungly and breaking my tool. (The few fics I removed were NOT in English.) Otherwise, from what I can tell so far, the scraper's code just... wasn't very good, so most likely, your fic was missed by random chance.
Thanks to everyone who helped with the cost to host the tool! I appreciate you so so so much. As of this edit, I've received more donations than what I paid to make this tool so you do NOT need to keep sending money. (But I super appreciate everyone who did help fund this! I just wanna make sure we all know it's all paid for now, so if you send any more that's just going to my savings to fix the electrical problems with my house. I don't have any more costs to support for this project right now.)
(Made some edits to the post on 27-May-2025 to update information!)
7K notes
·
View notes
Note
Nah, I figured it wasn't any bad intention on your part! Honestly, I personally left the random fucks and assholes in my fics that are rated T or higher because I thought those were funny. It's only the slur I went through and edited out.
Anyone who was worried about this: the words listed under the cut in the above post have been removed.
If you spot any others, please be kind when reporting them! It's easiest to correct the words if you report all the ones you're concerned about at once rather than finding and sharing them one by one. And repeating from trash-prince's tags: please don't report words that have a common SFW meaning such as balls.
The tool works great! I do have a concern though, similar to the coded version, various cuss words that would not be good for a G fanfic but also a lot of old dated slurs that got in. I had to go over four or fives times because they took me by surprise.
Thank you, anon! Good thing to be aware of, particularly for anyone who's posting G-rated fics. Here is AO3's official guidance to rating your fics. It's not super detailed and doesn't have recommendation on what swearing and slurs should do to your rating, but I would personally hesitate to have any sort of those in a G-rated fic of mine.
Couple options you have:
Decide you don't mind that since it's invisible to most people or since your fics are all rated T or higher. If this is your choice, please know that the poison is visible to people who choose to turn off the work skin and to people who download the fic to read offline.
Review the list of words that may be inserted by the poisoning tool. After poisoning each fic, do a quick Ctrl + F for any words you're uncomfortable with and substitute another word anywhere those ones ended up. This is easier if you paste in the poison and then switch over to the rich text editor in AO3.
If you're even a little into coding, all-hail-trash-prince has made the full code available. Removing a few lines from the poison word database would be a simple modification to your own local copy of the tool.
And as always, there's the option to not use the tool at all if it's not to your liking in its current state. It may be worth kindly mentioning to the creator to see if they would be willing to put in the extra effort to modify what they've created and possibly make a separate G-rated version of the tool. The words were pulled straight from a list of the most popular English words, so I see no ill intent in including any of the words in that list.
I'm gonna put it under the cut so you don't have to see, but here's a list of words I saw in the database that someone might find objectionable. If I missed any, let me know. You can review the full list of poison words here. Use one of the above 4 options to deal with this as you see fit.
arse
ass
bastard
bitch
cock
cunt
damn
dick
fag
fuck
I'm not typing out the n-slur, but that's in there
piss
pussy
shit
slut
swastika
tit
twat
whore
I left out words I saw that can have an innocent meaning, like snatch and prick. Any word in this list may appear as part of another word, but if you search up fuck, then you're also going to find any motherfucker that got sprinkled in.
13 notes
·
View notes
Text
For sure! As mentioned, it's a personal decision for anyone who wants to, and that aspect is touched on in one of the links above. If you prioritize your readers having a little more freedom in how they interact with your work, then this method isn't for you.
The default AO3 setting is to use the creator's work skin, so this isn't an issue for most people. A good way for authors who choose to use this poisoning method to address those who DO have that setting flipped is to mention it in the opening author's note. It's one simple button to click to enable the custom work skin as a reader who has them disabled by default, and the A/N can be a cue to do that for anyone who doesn't figure it out on their own.
The creator of this code also did recommend making an obvious mention in the A/N that you need to use an alternate download method for the fic. People can still easily download the fic by "printing" it but selecting "print to PDF" to save a copy as it visually appears. Alternatively, you can save an entire webpage as HTML and it will maintain the intended formatting. You just can't click the native AO3 download button on a poisoned fic.
And correct! It doesn't seem to negatively impact TTS. I tested on the only TTS software I have easy access to, but if someone else finds that there's a popular TTS that doesn't cooperate with this method, please let me know.
There's no required way to use this tool if you choose to use it, but in my opinion, the best way is to include a mention of the major potential impacts in the first A/N, including the inability to use the native download button and the requirement to view the work with the work skin enabled.
This tool is optional. No one is required to use it, but it's here if you want to know which of your AO3 fics were scraped. Locked works were not 100% protected from this scrape. Currently, I don't know of any next steps you should be taking, so this is all informational.
Most people should use this link to check if they were included in the March 2025 AO3 scrape. This will show up to 2,000 scraped works for most usernames.
Or you can use this version, which is slower but does a better job if your username is a common word. This version also lets you look up works by work ID number, which is useful if you're looking for an orphaned or anonymous fic.
If you have more than 2,000 published works, first off, I am jealous of your motivation to write that much. But second, that won't display right on the public version of the tools. You can send me an ask (preferred) or DM (if you need to) to have me do a custom search for you if you have more than 2,000 total works under 1 username. If you send an ask off-anon asking me to search a name, I'll assume you want a private answer.
In case this post breaches containment: this is a tool that only has access to the work IDs, titles, author names, chapter counts, and hit counts of the scraped fics for this most recent scrape by nyuuzyou discovered in April 2025. There is no other work data in this tool. This never had the content of your works loaded to it, only info to help you check if your works were scraped. If you need additional metadata, I can search my offline copy for you if you share a work ID number and tell me what data you're looking for. I will never search the full work text for anyone, but I can check things like word counts and tags.
Please come yell if the tool stops working, and I'll fix as fast as I can. It's slow as hell, but it does load eventually. Give it up to 10 minutes, and if it seems down after that, please alert me via ask! Anons are on if you're shy. The link at the top is faster and handles most users well.
On mobile, enable screen rotation and turn your phone sideways. It's a litttttle easier to use like that. It works better if you can use desktop.
Some FAQs below the cut:
"What do I need to do now?": At this time, the main place where this dataset was shared is disabled. As far as I'm aware, you don't need to do anything, but I'll update if I hear otherwise. If you're worried about getting scraped again, locking your fics to users only is NOT a guarantee, but it's a little extra protection. There are methods that can protect you more, but those will come at a cost of hiding your works from more potential readers as well.
"I know AO3 will be scraped again, and I'm willing to put a silly amount of effort into making my fics unusable for AI!": Excellent, stick around here. I'm currently trying to keep up with anyone working on solutions to poison our AO3 fics, and I will be reblogging information about doing this as I come across it.
"I want my fics to be unusable for AI, but I wanna be lazy about it.": You're so real for that, bestie. It may take awhile, but I'm on the lookout for data poisoning methods that require less effort, and I will boost posts regarding that once I find anything reputable.
"I don't want to know!": This tool is 100% optional. If you don't want to know, simply don't click the link. You are totally welcome to block me if it makes you feel more comfortable.
"Can I see the exact content they scraped?": Nope, not through me. I don't have the time to vet every single person to make sure they are who they say they are, and I don't want to risk giving a scraped copy of your fic to anyone else. If you really want to see this, you can find the info out there still and look it up yourself, but I can't be the one to do it for you.
"Are locked fics safe?": Not safe, but so far, it appears that locked fics were scraped less often than public fics. The only fics I haven't seen scraped as of right now are fics in unrevealed collections, which even logged-in users can't view without permission from the owner.
"My work wasn't a fic. It was an image/video/podfic.": You're safe! All the scrape got was stuff like the tags you used and your title and author name. The work content itself is a blank gap based on the samples I've checked.
"It's slow.": Unfortunately, a 13 million row data dashboard is going to be on the slow side. I think I've done everything I can to speed it up, but it may still take up to 10 minutes to load if you use the second link. It's faster if you can use desktop or the first link, but it should work on your phone too.
"My fic isn't there.": The cut-off date is around February 15th, 2025 for oneshots, but chapters posted up to March 21st, 2025 have been found in the data so far. I had to remove a few works from the dataset because the data was all skrungly and breaking my tool. (The few fics I removed were NOT in English.) Otherwise, from what I can tell so far, the scraper's code just... wasn't very good, so most likely, your fic was missed by random chance.
Thanks to everyone who helped with the cost to host the tool! I appreciate you so so so much. As of this edit, I've received more donations than what I paid to make this tool so you do NOT need to keep sending money. (But I super appreciate everyone who did help fund this! I just wanna make sure we all know it's all paid for now, so if you send any more that's just going to my savings to fix the electrical problems with my house. I don't have any more costs to support for this project right now.)
(Made some edits to the post on 27-May-2025 to update information!)
7K notes
·
View notes
Note
The tool works great! I do have a concern though, similar to the coded version, various cuss words that would not be good for a G fanfic but also a lot of old dated slurs that got in. I had to go over four or fives times because they took me by surprise.
Thank you, anon! Good thing to be aware of, particularly for anyone who's posting G-rated fics. Here is AO3's official guidance to rating your fics. It's not super detailed and doesn't have recommendation on what swearing and slurs should do to your rating, but I would personally hesitate to have any sort of those in a G-rated fic of mine.
Couple options you have:
Decide you don't mind that since it's invisible to most people or since your fics are all rated T or higher. If this is your choice, please know that the poison is visible to people who choose to turn off the work skin and to people who download the fic to read offline.
Review the list of words that may be inserted by the poisoning tool. After poisoning each fic, do a quick Ctrl + F for any words you're uncomfortable with and substitute another word anywhere those ones ended up. This is easier if you paste in the poison and then switch over to the rich text editor in AO3.
If you're even a little into coding, all-hail-trash-prince has made the full code available. Removing a few lines from the poison word database would be a simple modification to your own local copy of the tool.
And as always, there's the option to not use the tool at all if it's not to your liking in its current state. It may be worth kindly mentioning to the creator to see if they would be willing to put in the extra effort to modify what they've created and possibly make a separate G-rated version of the tool. The words were pulled straight from a list of the most popular English words, so I see no ill intent in including any of the words in that list.
I'm gonna put it under the cut so you don't have to see, but here's a list of words I saw in the database that someone might find objectionable. If I missed any, let me know. You can review the full list of poison words here. Use one of the above 4 options to deal with this as you see fit.
arse
ass
bastard
bitch
cock
cunt
damn
dick
fag
fuck
I'm not typing out the n-slur, but that's in there
piss
pussy
shit
slut
swastika
tit
twat
whore
I left out words I saw that can have an innocent meaning, like snatch and prick. Any word in this list may appear as part of another word, but if you search up fuck, then you're also going to find any motherfucker that got sprinkled in.
#if it gives you an idea of how prevalent this is....#i poisoned about 300k words of fic with this tool#my only hard no i've seen in that word list is the n-slur#among my 300k of original words the poison tool inserted 11 total n-slurs that i had to search and remove#feels weird as hell to be doing a ctrl+f for that word but like.#11 instances in all of that fic? not the worst it could be
13 notes
·
View notes
Text
Gonna pass this around once more with an addition!
Here is a tool that can be used to poison your current and future AO3 fics if you would like to reduce the chances that your work will be useful to scrapers in the future. You can thank the lovely @all-hail-trash-prince for writing the code and for making it more accessible to those who don't know how to run code.
Here is me doing a very brief runthrough on the current version of the tool (as of this reblog). Here is me yapping about it in video if you prefer that format.
I'm not saying you have to use this tool. I'm just boosting it as one potential path forward for those who want to reduce the quality of their fics for future scrapers without reducing the quality for readers on AO3, including those using screen readers to narrate your fics. You can choose to use it for zero fics, one fic, some of your fics, or all of your fics. Entirely up to you if you want to do this.
Poisoning your fics now does not undo nyuuzyou's scrape. The data, while a little less accessible now, is still out there in a few locations. There's no undoing that, and there will always be a copy of the fics scraped by nyuuzyou. Poisoning your fics is a method to protect you from future scrapes. The protection doesn't reduce the risk that you will be scraped, but it highly increases the chance that the scraper will not be able to use your fic for anything useful, including AI training. There's no surefire way to protect your fics short of not sharing them at all, but this tool provides the best balance I've seen of protecting them from unwanted use and still allowing you to share with real readers.
And once more for anyone who hasn't caught it yet: I am not directly affiliated with AO3. This is a third party tool that you may choose to use or may choose to ignore.
Editing to add: This code should probably be considered in an alpha phase. This is code written by one person, vetted by another who knows coding in general but not this specific language. It may have bugs to iron out. It may have unexpected impacts, particularly to the accessibility of your fics. If you're not okay with that, please send your specific concerns here so I can look into them AND/OR wait for us to get through testing this code in actual use. Any concerns and bugs you report here are appreciated because they help everyone involved make this better over time.
This tool is optional. No one is required to use it, but it's here if you want to know which of your AO3 fics were scraped. Locked works were not 100% protected from this scrape. Currently, I don't know of any next steps you should be taking, so this is all informational.
Most people should use this link to check if they were included in the March 2025 AO3 scrape. This will show up to 2,000 scraped works for most usernames.
Or you can use this version, which is slower but does a better job if your username is a common word. This version also lets you look up works by work ID number, which is useful if you're looking for an orphaned or anonymous fic.
If you have more than 2,000 published works, first off, I am jealous of your motivation to write that much. But second, that won't display right on the public version of the tools. You can send me an ask (preferred) or DM (if you need to) to have me do a custom search for you if you have more than 2,000 total works under 1 username. If you send an ask off-anon asking me to search a name, I'll assume you want a private answer.
In case this post breaches containment: this is a tool that only has access to the work IDs, titles, author names, chapter counts, and hit counts of the scraped fics for this most recent scrape by nyuuzyou discovered in April 2025. There is no other work data in this tool. This never had the content of your works loaded to it, only info to help you check if your works were scraped. If you need additional metadata, I can search my offline copy for you if you share a work ID number and tell me what data you're looking for. I will never search the full work text for anyone, but I can check things like word counts and tags.
Please come yell if the tool stops working, and I'll fix as fast as I can. It's slow as hell, but it does load eventually. Give it up to 10 minutes, and if it seems down after that, please alert me via ask! Anons are on if you're shy. The link at the top is faster and handles most users well.
On mobile, enable screen rotation and turn your phone sideways. It's a litttttle easier to use like that. It works better if you can use desktop.
Some FAQs below the cut:
"What do I need to do now?": At this time, the main place where this dataset was shared is disabled. As far as I'm aware, you don't need to do anything, but I'll update if I hear otherwise. If you're worried about getting scraped again, locking your fics to users only is NOT a guarantee, but it's a little extra protection. There are methods that can protect you more, but those will come at a cost of hiding your works from more potential readers as well.
"I know AO3 will be scraped again, and I'm willing to put a silly amount of effort into making my fics unusable for AI!": Excellent, stick around here. I'm currently trying to keep up with anyone working on solutions to poison our AO3 fics, and I will be reblogging information about doing this as I come across it.
"I want my fics to be unusable for AI, but I wanna be lazy about it.": You're so real for that, bestie. It may take awhile, but I'm on the lookout for data poisoning methods that require less effort, and I will boost posts regarding that once I find anything reputable.
"I don't want to know!": This tool is 100% optional. If you don't want to know, simply don't click the link. You are totally welcome to block me if it makes you feel more comfortable.
"Can I see the exact content they scraped?": Nope, not through me. I don't have the time to vet every single person to make sure they are who they say they are, and I don't want to risk giving a scraped copy of your fic to anyone else. If you really want to see this, you can find the info out there still and look it up yourself, but I can't be the one to do it for you.
"Are locked fics safe?": Not safe, but so far, it appears that locked fics were scraped less often than public fics. The only fics I haven't seen scraped as of right now are fics in unrevealed collections, which even logged-in users can't view without permission from the owner.
"My work wasn't a fic. It was an image/video/podfic.": You're safe! All the scrape got was stuff like the tags you used and your title and author name. The work content itself is a blank gap based on the samples I've checked.
"It's slow.": Unfortunately, a 13 million row data dashboard is going to be on the slow side. I think I've done everything I can to speed it up, but it may still take up to 10 minutes to load if you use the second link. It's faster if you can use desktop or the first link, but it should work on your phone too.
"My fic isn't there.": The cut-off date is around February 15th, 2025 for oneshots, but chapters posted up to March 21st, 2025 have been found in the data so far. I had to remove a few works from the dataset because the data was all skrungly and breaking my tool. (The few fics I removed were NOT in English.) Otherwise, from what I can tell so far, the scraper's code just... wasn't very good, so most likely, your fic was missed by random chance.
Thanks to everyone who helped with the cost to host the tool! I appreciate you so so so much. As of this edit, I've received more donations than what I paid to make this tool so you do NOT need to keep sending money. (But I super appreciate everyone who did help fund this! I just wanna make sure we all know it's all paid for now, so if you send any more that's just going to my savings to fix the electrical problems with my house. I don't have any more costs to support for this project right now.)
(Made some edits to the post on 27-May-2025 to update information!)
7K notes
·
View notes
Note
Hi! The display: none style got me thinking about using it for those that don’t know code and I’ve been experimenting with using it just as a skin, so it’s hidden from users. At this point the only problem is it has to be used manually as in putting paragraphs of junk in and classifying the paragraph as “Hey hide this one” I also shrunk the text for good measure, so it looks like a paragraph break line if creator style is off. I’m curious if you think that would mess up screen readers? I remember you saying you’d test hover text when you had a chance. Do you think you could add this to your list of things to text too? Thanks for all you're doing!
Just tested the display:none style while looking at the tool by @all-hail-trash-prince! My default screen reader does not read anything aloud with display:none enabled. It skips over those chunks automatically.
3 notes
·
View notes
Note
Hiya! I know it's been a little while but I just wanted to let you know I finally got around to making the web version of that fic poisoning tool I made about a month ago. It's at https://tricksofloki.github.io/ficpoison.html if you're interested :)
OHOHOHO!
Alright, I gave this a little test on my own fic over here. Quick little review/notes for anyone interested! (But the tl;dr is that I approve based on my initial review of the original code and based on using this web tool to automate running the code.)
This version is super easy to use. I'll be honest; I was struggling trying to figure out how to run the code locally before because that is not a coding language I personally use, and this website takes out all of the hard part of doing that. You need to do the one time task of creating a work skin to enable the "poison" CSS used, and you need to make sure that work skin is enabled for any work you're going to use this on. The code to put into your work skin is available at the link. If you already have a work skin you use, you can just add this class to it. (I think the tutorial I linked to does a good job walking you through how, but I'm open to doing a tutorial on this blog if anyone wants that.)
If you're poisoning an existing fic, first have a backup copy. Once you poison it, that copy is going to be annoying to UN-poison if you ever want to, so you should keep a private copy on your PC or phone or wherever so you have the unpoisoned version available. Once you do this, your copy on AO3 is poisoned, and it would take a fair amount of effort to unpoison as the author. Upside: as the author, you can see all the CSS stuff in the background, so if you really need to unpoison a copy as the author with full access to it, it's not impossible. Just really annoying.
For reference, here's what I can see as the author with access to the edit page:
I can clearly see where the poison is if I really wanted to go back through and unpoison.
And here is what I can see in a copy scraped with nyuuzyou's code:
You can definitely see it's messed up by looking, but you don't see an active callout to where exactly the poison code is. Keep in mind that not every scraper uses the same code as nyuuzyou, and more sophisticated code may pull something more sophisticated than the plain text from nyuuzyou's tool. Other scrapers may be pulling fics with the formatting and everything, and I don't know exactly what that output looks like. Depending on what their output is, if they can see the class for the poison, they can pretty easily code something to remove it. That's me being overly conservative, I suspect. I haven't heard of any scrapers who have bothered with anything more than plain text, and this isn't an issue unless they're grabbing the full HTML. (Translation: From what I know, this is NOT an issue. Yet. So this is not a weakness of the poison tool. Yet.)
Based on the output, anyone who's doing a half decent job of cleaning up the data they scrape would toss my fic out of the dataset. It's full of what look like typos because the poison got placed mid-word, so it looks like I just suck at writing. If your goal is to get tossed out of the dataset, this is perfect. If a scraper isn't paying attention at all, you can contribute some really terrible training data if they leave your fic in the set because your poisoned fic is going to be full or words that don't even exist thanks to the word placement.
As far as using the tool, I used an existing fic. I went into the edit page for the chapter, scrolled to the bottom and left the text editor on the default HTML mode. I copied everything in that box. (Easy method: click into the box where you can type out the fic, and press "Ctrl" and "A" to select all, then "Ctrl" and "C" to copy.) I went to the tab with all-hail-trash-prince's tool, and I pasted it into the box on the left.
I clicked "Apply poison" and the poisoned fic appeared in the right box. I copied the poisoned fic from the right box, went back to my fic on AO3 with my custom work skin already enabled, and I pasted the poison fic in place of the original fic. I clicked the preview button to make sure it would look normal, and it did. So I clicked to update the chapter with the poison block included.
I loaded the chapter with the default Microsoft screen reader turned on, and it didn't read any of the poison data, only the real fic that is visible on the screen, so success there.
So that brings us to applying this to a brand new fic. For those, you're going to go through the motions of posting a fic as usual, but instead of clicking post when you're done, you're going to swap that text editing mode over to HTML and copy everything in there. Take it to the poison tool, paste it in, and grab your poisoned copy. Go back to AO3, make sure your poison work skin is enabled, and then replace the original fic with the poison fic, making sure to stay in the HTML editing mode while you do.
(Sneaky quick edit after posting: sometimes the tool leaves you with a dangling <p> or </p> or <em>. Make sure you always preview the chapter after poisoning it, and you can go back in to the rich text editor to delete any of the floating tags that were accidentally put in by the poison.)
The last downside I notice is that your word count is immediately wrong. My 34k fic looks like a 43k fic after poisoning the first 16k words. Technically, you don't have to tell people the true word count of your fic but like. That feels a little rude to the reader, so I think it would be kind to briefly put the true word count either at the bottom of your summary or in your first author's note.
To me, the downsides of having to create a custom work skin (that trash-prince has kindly already written for everyone) and having the wrong word count displayed... are nothing. In comparison to having my fic be easy to scrape, I'll take those slight downsides any day. From what I know of the current scraping landscape, this is a reasonably effective way to make your fic useless to anyone who scrapes it because people are out there that will be scraping AO3 again.
I'm curious to hear anyone else's thoughts if they check this tool out or try it for themselves, so don't be shy! I'm one person, so maybe I can't catch everything. If you're seeing something that I'm not, I want to hear about it.
And if anyone wants a more visual step by step, you are welcome to yell my way. If this text post is clear enough for everyone, I won't bother, but if a more visual walkthrough will help anyone, then I'm happy to do it!
EDIT: Just tossing in a summary of feedback I've seen from others below!
The tool is pulling from a list of most popular English words, which means it may add inappropriate verbiage to G-rated fics. See this ask for info. trash-prince has made adjustments based on the initial words spotted, but please kindly report any other concerning poison words you find, particularly slurs and other wording that cannot be interpreted in a SFW way.
164 notes
·
View notes
Text
Hi, sorry I have like 20 unanswered asks, but I also am on week..... 10? 11? of working wayyyyyy too many hours. So, my bad! I see 'em, I'm trying to respond to anything that needs a quick response, and I'll get to the others when I'm a little less busy! <3
Feel free to follow up if I'm being too slow on something more urgent. I don't think there's anything like that at this point, but if there is, don't be afraid to check in again!
9 notes
·
View notes