#ao3 data scrape
Text
aight
huggingface ao3 scrapers try not to create a false narrative challenge level impossible. try not to commit hasty generalization challenge. try to touch grass challenge.
here’s the thread if yall wanna reblog with the logical fallacies yall find in their arguments.
#idom inflammatory#hugging face#ao3#logical fallacies#fuck ai#anti ai#anti generative ai#anti genai#anti gen ai#fuck generative ai#fuck genai#fuck gen ai#ao3 data scrape
16 notes
Text
i feel so terrible doing this but i had to. i locked all my ao3 works under archive only because i learned about the extent of the data scraping that happened. as much as i love writing and i love sharing it with as many people as possible, i cannot allow the works i have poured my heart, my soul, and most importantly, my time into to be scraped up and fed heartlessly to an ai bot. so for now, they're only visible to archive users.
#livs rambles#stardew fanfic#ao3 fanfic#fan fic writing#fan fiction#fanfic#ao3 data scrape#ao3 ai scraping#ao3#ao3 writer
40 notes
Text
hey, anyone on ao3 who was a part of the data scrape;
if you are in america, I will help you file a DMCA against the website hosting the data set. sorry i can’t provide help to other countries, as I’m only educated in american law.
32 notes
Text
25APR2025 AO3 Data Scrape on Hugging Face
There is a new AO3 data scrape on HuggingFace: https://huggingface.co/datasets/Chat-Error/archiveofourown-newest The "archiveofourown-newest" dataset contains approximately 14,806,149 works, while Archive of Our Own publicly listed a total of approximately 14,880,000 works as of April 23, 2025. If your works precede that date, it's likely they are in this dataset. I submitted a DMCA takedown to HuggingFace at [email protected] , and if you have bandwidth I recommend you do the same. You can also report the dataset by clicking the three dots and posting a dispute; however, you'll likely find the poster unhelpful. Do BOTH. Should you not know what to say, there are plentiful DMCA takedown templates online, or you can copy mine.
Note that the people who posted the dataset are not the actual agents to act on the DMCA; HuggingFace is. The posters are likely to try to circumvent whatever it is you post by saying:
Hello, Thank you for identifying the relevant works. Please note that you must include valid contact information, including name, address, email address, and telephone number if possible. Once this is done, we may process your request. Sincerely, Anonymous
Funny and notable that they chose to sign this "Anonymous."
Edit: In case it's not abundantly clear, do not give these random thieves your personal info! GO THROUGH HUGGINGFACE.
2nd Edit: as of 6PM EST, the data set has been taken down!
3rd Edit: as of 27APR2025... They uploaded it as a different dataset
#AO3#FUCK AI#AO3 Data Scrape#HuggingFace#Theft#ao3 community#DCMA#DCMA Takedown#ao3 writer#ao3 author#ao3 fanfic#archiveofourown#anti ai#fuck generative ai
30 notes
Text
im so so thankful my fics were spared from the data scrape since i only started writing in march.
for all the authors whose fics *were* scraped, im sending love and light and hoping they get taken down + that no ai is trained using your hard work. its wishful thinking, yes, but its something.
people stealing and profiting off of your art and hard work is the worst feeling. im so sorry :/
#ao3#ao3 author#ao3 data scrape#fuck generative ai#fuck ai#ai “artists” can fuck off too#suck my non-existant dick for all i care#typing a prompt in a chat isnt talent#fuck off
12 notes
Text
I hope whoever scraped Ao3 for AI training gets hit with the Ao3 authors curse like we’ve never seen before.
Sincerely, an Ao3 writer whose work got scraped (and has experienced the curse)
12 notes
Text
My works and AI
As most of you know, Ao3 was victim to a MASSIVE data scrape last week, with the scraped works uploaded as an AI dataset to a site called HuggingFace. As all of you should know, I am incredibly anti AI. It steals work from artists, it propels global warming and it tends to target those most vulnerable in our society.
Now, because AI is so integrated in how we interact with the internet now, there is only so much I can do to avoid it. But there is one thing I can do, although it breaks my heart.
As of today, all of my works on my GoldenAvenger02 account, as well as my Brentinator account, are only visible to registered users of Ao3.
I went back and forth on this a lot. On one hand, I want my works to be accessible to all people, even if they can't make an Ao3 account. But on the other hand, I've posted nearly 200 fics across both accounts at this point, and I know that if I continue to leave my fics open for guests to see, there is a chance that all of my fics, my nearly 10 years of writing, will be stolen from me to train some shitty AI program.
This was a really hard decision for me, but this is what I've decided. I will still be posting my Ninjago fics on here for sure, but if you do not plan on making an Ao3 account, please let me know what other fics I should be posting on here.
And finally. Please, for the love of whatever higher power or lack thereof you believe in, stop fucking using AI. I don't care WHAT it is. Change your browser (I use Ecosia), switch away from Google Docs, teach yourself how to write and draw, delete Chat GPT and c.ai.
Whenever you use AI, you are stealing hard work. You are making our planet unliveable. You are actively contributing to millions of people losing access to their work.
7 notes
Text
Because of AI data scraping
Hemlock, Nightshade, Queen Anne’s Lace and Dancing on the Clock will no longer be available to non-registered archive users. I hate that I have to do this, but I want to protect my works from further exploitation and blatant copyright violation for the purposes of generative AI.
3 notes
Text
IMPORTANT NOTE FOR MY AO3 FICS
PLEASE NOTE: I explicitly do NOT give consent for any of my works to be used for AI training of any kind, now or at any point in the future. All of my works are now archive-locked because of an AI data scrape.
I apologize to anyone without an account who was reading my works previously, but I refuse to let my hard work be taken advantage of by these disgusting scraping bots.
I will continue to post my work here on Tumblr and AO3 both, but if you want to bookmark or save them on your AO3, you will have to be logged in.
2 notes
Text
had to archive lock tots and coffee cuz this is getting scary oh my god
3 notes
Text
For the time being, I have had to lock all my fics on ao3 due to the bot scrape. I hope to unlock them all soon, but I won't be risking my work. Sorry to anyone reading them as a guest.
This is really frustrating as a writer, and as far as I know there isn't much I can do about it. All I can do is hope my works have avoided the scrapes so far.
1 note
Note
How are you live what's happening with ao3 and the AI? Does it discourage you in any way from publishing your stories?
Great question. I haven't archive locked my stories and don't plan to. That's a personal decision I've made for myself and my own content, and that doesn't mean I don't wholeheartedly support my fellow authors who do so. But I'm of the (again personal) opinion that my works already have been scraped, and will continue to be scraped in some capacity. As have all of my text posts on here.
I appreciate the work the OTW is doing to take down data on other sites where it has been scraped. I think that's absolutely the right course of action. But personally, I am under no illusions that by archive-locking my fics, I am 100% preventing the scraping/sharing/AI use of my content. And at this point, even when we first learned of that big "scrape" a while back, it was too late.
My goal is to make my content as widely available for readers as possible, which comes with drawbacks. Archive-locking fics came with a significant reduction in hits/comments/kudos for some authors, and I decided that was a risk I personally did not want to take. Especially when, again, I was of the belief that many of my fics had already been scraped/were vulnerable to being scraped before we learned about these mass-scraping incidents.
Additionally, I'm quite certain people have been feeding my fics into AI processors, ChatGPT, etc, for a while now. It's not something I have control over, and people will continue to do it even when they know it's wrong. Even with ao3 accounts.
I don't own my fanfiction content, I can't make money off of it, and I don't want to. This would be a very different conversation if I did. Truthfully, my only hope is that by continuing to write a/b/o, and large amounts of it, I can "spike" whatever dataset is using my fics. That thought brings me joy, even if it's a little silly and far-fetched with these better algorithms.
#asks#anon#ao3#archive of our own#myfic#theresurrectionist#writing#data scraping#OTW#AI generators#chat gpt
192 notes
Text
EMERGENCY AUTHOR UPDATE
I feel like this needs to be warned about. Everything on Ao3 that isn't set to private HAS been data scraped and fed to 3 data sites that provide data for AI training, including writing and artwork.
Yes, this includes my entire Ennead series and everything else I've ever written and posted. As well as anything you all have written but not made private.
Ao3's legal team is fighting it and one site has made the data unavailable, but the other two aren't based in the USA so the fight is harder.
This is frustrating and upsetting news, especially for those of us who now need to pick between our Guest readers who have supported us for a long time and protecting the hard work that we've put our hearts and souls into and I just ask that we support each other and our choices during this time.
The link here has more details but from now on, until I can be sure there's a way to protect my work, which I've spent decades writing and planning, my stories will be posted for members of the site only.
#fanfiction#creative writing#writer#author#authors#writing woes#ao3#ai scrapers#data scraping#ai#artificial intelligence#technology#ai model#fandom#writerscommunity#writers on tumblr#writing#ao3 writer
182 notes
Text
Hey peeps, just a note to say that, due to just finding out about the recent data scrape on AO3, I'm making the decision to privatise all of my fics on there. I wanted them to be accessible but I'm honestly so upset that they got scraped. Only my 10 most recent ones didn't, and that's such an upsetting thing for me.
I'll still be posting the chapters on here, where I'm semi confident I set my account correctly, so at least there's that.
11 notes
Text
Block This AI-Tool Account On Fanfiction Dot Net ASAP
If you are still using FFnet like I am, block this AI Tool Account That Pretends to be a Fic Writer and randomly leaves reviews in a very ominous way that bothers me. I got a barrage of emails informing me of sudden reviews and follows. Each review is a copypasta, and I felt like my fics were being branded. SO BLOCK IT IS!



Log in, choose the account option, find the block users option like in the screengrab below, then add this FFId: 16123984 into the slot and select save.

I honestly don't know if this app can data scrape or not. Since we don't have an option to lock or private our FFnet accounts, keep blocking anything that you think is suspicious.
Here is the Link to the account. They have one AI-Generated so-called Naruto Fic, which literally has nothing to do with Naruto.

Account ID: 16123984
Account Link:
Share, reblog, amplify!
#ffnet#fanfiction dot net#fanfiction dot hell#AI data scraping#ai tools#ai is theft#data scraping#ai is a plague#fic writing#fic writers#writing for the web#writers on tumblr#writing community#writers#fic writer problems#ao3 writer#ao3 author#fanfiction#fanfic writing#fanfic woes#artificial intelligence
10 notes
Text
Thai Drama Stats Special Edition:
The Great Archive Lockdown 🔒
Hi folks! In case you weren't aware, there are various scraping bots that trawl through AO3 and use the data for AI training, content mill sites, or other vaguely nefarious purposes. One site, "Fanfic Books", is essentially creating an unauthorized mirror of AO3. Here are some posts about it.
To combat this, many users have recently chosen to "Archive Lock" their fics.
What is Archive Locking?
An archive locked work, or "restricted" work, is only visible to users who are logged into AO3. This prevents anonymous users (and bots that aren't using login credentials) from reading your fic or finding your fic in searches. This doesn't block all scraping bots, but it should keep most of them out.
What does this have to do with fandom stats?
The AO3 scraping I do doesn't use login credentials, so I can't count archive locked fics. That's totally fine! I am in no way telling you to stop archive locking! Lock or unlock to your heart's content!
It does, however, mean that the data I pulled from my Thai Drama AO3 Trends Dashboard this week (July 1 - July 7, 2024) are looking especially strange.
Holy moly! We actually have negative growth. More fics were locked than posted, which is why the Net New is negative. I'd estimate that about 1% of all previously public Thai Drama fics were archive locked this week.
This matches trends on all of AO3. This week, the total number of publicly available fics actually decreased by 0.7% -- and that's including all the new fics being posted!
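To make the "negative Net New" arithmetic concrete, here is a minimal sketch. It assumes Net New is simply the change in publicly visible fics between two logged-out scrapes, and that deletions and unlocks are negligible. The function names and example counts are hypothetical and are not figures from the dashboard.

```python
def net_new(public_before: int, public_after: int) -> int:
    """Change in publicly visible fics between two scrapes (can be negative)."""
    return public_after - public_before


def percent_change(public_before: int, public_after: int) -> float:
    """Net New expressed as a percentage of the earlier public count."""
    if public_before == 0:
        raise ValueError("public_before must be non-zero")
    return 100.0 * (public_after - public_before) / public_before


def estimated_newly_locked(public_before: int, public_after: int, newly_posted: int) -> int:
    """Rough estimate of newly locked fics.

    Assumes: public_after = public_before + newly_posted - newly_locked,
    i.e. deletions and unlocks are ignored.
    """
    return public_before + newly_posted - public_after


# Hypothetical example counts, for illustration only.
before, after, posted = 10_000, 9_930, 50
print(net_new(before, after))                         # change in public fics
print(round(percent_change(before, after), 2))        # percent change
print(estimated_newly_locked(before, after, posted))  # estimated fics locked
```

With these made-up numbers, the public count shrinks even though new fics were posted, which is the same pattern described above: more fics were locked than posted.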
When did this happen?
The timing for both Thai Drama fandom and all of AO3 is pretty consistent.
For Thai Drama fandom, most of the locking happened on Friday, July 5, but there was also some locking on Sunday.
When we look at all of AO3, it seems like most of the mass-locking happened on July 5th as well, with additional locking happening all throughout the weekend.
Which Thai Drama fandoms were most affected?
When we look at sheer numbers, KinnPorsche, of course, has a lot of newly-locked fics. 3 Will Be Free, My Engineer, and Dark Blue Kiss were locked down a lot as well.
When we look at the top fandoms by negative growth, The Player saw almost all of its fics vanish overnight. 3 Will Be Free was cut neatly in half.
This data is cool I guess, but... so what?
If these numbers are accurate, they represent another sudden and massive shift towards archive locking on AO3.
According to @star-grazing's stats about archive locking in December 2022, the total number of archive locked works on AO3 increased by 70% in just a couple weeks after a reddit post went viral about AI bots scraping AO3 for machine learning material.
Those stats show that in December 2022, 5.79% of AO3 fics were archive locked. When I checked the numbers again today, 9.37% of all works were archive locked.
Using rough estimates from the last few days of AO3 data, I'd say that the total number of archive locked works increased by 8% since last Thursday (7/4). And trends seem to indicate that the great lockdown is still going!
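For anyone who wants to compare the two share figures quoted above (5.79% in December 2022 and 9.37% now), here is a small sketch. Note that it measures the change in the *share* of works that are locked, which is not the same thing as the change in the *count* of locked works (the 70% and 8% figures), since the total number of works on AO3 also grew over that period. The function name is hypothetical.

```python
def share_change(old_share_pct: float, new_share_pct: float) -> tuple[float, float]:
    """Return (percentage-point change, relative change in percent) between two shares."""
    if old_share_pct == 0:
        raise ValueError("old_share_pct must be non-zero")
    points = new_share_pct - old_share_pct
    relative = 100.0 * points / old_share_pct
    return points, relative


# Shares quoted in the post: December 2022 vs. today.
points, relative = share_change(5.79, 9.37)
print(f"{points:.2f} percentage points")            # 3.58 percentage points
print(f"{relative:.1f}% relative increase in share")  # 61.8% relative increase in share
```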
Anyway...
Thanks for sticking with me! This is a really fun time to be collecting AO3 stats :) If you have more questions, feel free to reach out. I also put some more details under the cut! Thanks y'all!
Are we sure it was archive locking, and not some other data issue?
Er, good question. It's my best guess, and I've tried to rule out other potential culprits. The AO3 Fandom Trend Analysis Dashboard, which has data about all fandoms on AO3, doesn't seem to show anything amiss. Their data uses login credentials, meaning they can count archive locked fics.
I also went through several tags manually while logged in and logged out to compare numbers from this week to previous weeks. It doesn't seem like there was a mass deletion or retag that I could see.
I also used the "restricted:true" search operator to search for archive locked fics while logged in. A lot of those missing fics pop back up!
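The logged-in vs. logged-out comparison described above boils down to a subtraction. Here is a minimal sketch, assuming you have already noted down the work count for the same tag while logged in and while logged out. The function name and the example counts are hypothetical, and this does not fetch anything from AO3.

```python
def locked_estimate(logged_in_count: int, logged_out_count: int) -> tuple[int, float]:
    """Estimate archive locked fics for a tag from two manual counts.

    logged_in_count:  works visible while logged in (public + archive locked)
    logged_out_count: works visible while logged out (public only)
    Returns (estimated locked count, locked share in percent).
    """
    if logged_out_count > logged_in_count:
        raise ValueError("logged-out count should not exceed logged-in count")
    locked = logged_in_count - logged_out_count
    share = 100.0 * locked / logged_in_count if logged_in_count else 0.0
    return locked, share


# Hypothetical counts for one tag, for illustration only.
locked, share = locked_estimate(40_000, 36_000)
print(locked, round(share, 1))
```

If the logged-in total stays steady from week to week while the logged-out total drops, that points to locking rather than mass deletion or retagging.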
I absolutely welcome other theories though, if you think of one!
Is this still happening?
Seemingly yes, for Thai Dramas at least! When I checked the "All Thai Dramas" AO3 search this morning, the total number of Thai Drama fics had dropped below the 40K mark - lower than when I first started keeping track a month ago!
We probably have a lot more archive locking in our future!
How do I archive lock my own fics?
There's a really good tutorial over here.
Help! I don't have an AO3 account, so I can't read all these archive locked fics anymore.
Please message me! I have some spare invites.
Which fandoms are the most "locked down"?
I'm not sure, but there is a Fanlore article about Hockey RPF and the Fourth Wall which provides some comparison stats. Hockey fandom has traditionally been one of the most locked down fandoms; less than half of hockey rpf fics are publicly available.
You can also peruse this AO3 search to see all archive locked fics.
#fandom stats#ao3 psa#archive locking#restricted works#archive of our own#special edition#thai drama fanfic stats#i should probably try to figure out a way to count archive locked fics eventually but...#sigh... i would have to stop scraping with google sheets which took uhh#SOOO long to figure out#it's fine. such is the nature of data scraping
15 notes