#data scraping sites
Explore tagged Tumblr posts
Text
The thing about tumblr is that there's a panic about how the site is dying and falling apart literally every other week so eventually if you've been here long enough you just get zen about it. Like if anything specific is your breaking point do whatever you want but personally they're gonna have to pry me out of the vents like a feral raccoon before I leave. Anyway if you're new here and you see people talk about how something is the end of tumblr and you're afraid they're correct I just want you to know I've been here through probably like 300 ends of tumblr. I'm not saying it will never happen for real but statistically I remain skeptical.
#this is about the midjourney deal thing but#if it breaks containment and starts circulating every time smth happens I'm gonna laugh#anyway: yeah sure it's scummy on tumblr's part. but like.#I don't honestly believe there's a site that ISN'T getting scraped by ai and data mining these days#so what are you even really losing#yeah yeah I toggled off the data sharing whatever#it's just not nearly the scummiest thing tumblr has done#and not even close to the scummiest thing a major social media platform has done#so people seeing people be like "WELL I'M LEAVING OVER THIS"#I'm just like. lol okay.
225 notes
Note
Didnt Instagram/facebook change their policies so by having an account you automatically agree to have your content train their AI with no way to turn it off? Good enough reason to not post there at all if you ask me
i think unfortunately just about all websites are doing this no matter what they say.
#Deviantart is no better#Twitter is no better#Tumblr is… ok idk what is going on with tumblr cause i feel like this site is like a rat sewer house#But im sure its also data scraping#No one doing ai scrapes is caring about a site's terms of service
43 notes
Text
you know i think it's really funny (derogatory) that tumblr seemingly has the ability to create popups for ad-free, crabs, anything that will get them money, but when it comes to using popups to guide new users around the website layout they're just like "uww what can we POSSIBLY do?? all these new people only know twitter and they couldn't possibly read a popup explaining how to use the site!! let's just change everything visually appealing about tumblr to cater to our inability to make popups for helpful reasons uwu!!".
like you know what i want? If you're testing a change on me I want a pop up that says so. I don't want to have to rely on the @changes blog to see that. If i'm testing a change I should be given a feedback form that is specifically about that change. I want the results of the feedback forms for changes to be publicly viewable to the userbase. you could learn so much about how this website actually functions for users through that, but no. we can only make popups when it fiscally benefits us! we can't use polls to get a sense of what the userbase is feeling! we can't publicize what % of feedback about a change was positive or negative. we couldn't...possibly...
#not dogs#tumblr#getting real sick of this shit#if you can make a pop up for ad free you can make a pop up explaining how to use this site#without changing what makes the site function and be visually appealing to those that have been here for a decade plus#look i understand tumblr is struggling to survive in this capitalist hellscape#but literally so is every other social media with the exception of those that#scrape people for their data#so you have a choice. you can be ethical or you can be profitable#which is unfortunate.#but maybe you'd be closer to profitable if you took your users seriously.
112 notes
Text
considering posting art n stuff here again. No promises, but thinking about it
#still dont love the ai data scraping shit going on but it's so universal and standard practice at this point#that turning up my nose from it is pretty pointless and its gonna happen no matter where or how i share stuff#additionally still not a fan of this site but the alternatives are all either objectively worse or i have no idea how to#gain a following on them. i have an established community here and starting from scratch elsewhere is. Very Hard as many know
7 notes
Text
Sorry but yall are gonna have to pry me out of this place
#ive got back up accounts on insta cohost and bluesky#but this site will have to die for me to stop using it#sorry but do you really think other sites ARENT scraping your data too? (<- studied AI at college)#anyway love yall im @dreaminginmysoup or @dreaminmysoup everywhere that im on
10 notes
Text
observation-wise i do think it's interesting how enraged people were about how a giant query that returned pretty much everything ever posted (and unposted. drafts and unanswered asks and whatnot) on the site was done (which. to my knowledge. STILL doesn't have an answer regarding the question of whether or not the data included in that query was already sold) and that tumblr was going to start partnering with AI companies to train their models and then a couple of posts went around like "okie dokie guys NOW after that query was done we implemented an opt-out toggle <3 and we trust in Good Faith that the companies will respect this toggle <3" and then everyone was like Oh Okay <3 Yay <3 and suddenly everyone's fine again. 10/10 example of a collective sunk cost fallacy mentality. at this point it's kind of free entertainment to watch
#obviously if you post anything online you are implicitly acknowledging the risk of it being scraped. that isn't the point#the point is that a REALLY shitty dick move was pulled and like. nobody cares about it. at all#despite the fact that if this happened a year ago to another site like half the people posting about it would've been saying shit like#'haha that's what those idiots get for staying on a site that just wants to mine them for data. companies don't care about their users.#thank god tumblr is different <3' when it's like. guys. you realize tumblr hasn't been different for at least 6 years now. right.#you realize that the 'hellsite (affectionate)' marketing ploy was just that. a marketing ploy.#i realize some people will read this and go 'get off your high horse you're literally posting this on tumblr'#and i mean. yeah. that's the point HAHA
2 notes
Text

#so apparently everyones data has Already been scraped. including mine#i already feel artblocked. i feel like this will just kill my drive to draw entirely#god. everything is so awful#i've been betrayed by the site i use daily#hell on earth#my post#my ramblings
3 notes
Text
went ahead and made a cohost, @visorlights. I don't plan on going anywhere, but might as well cave at this point.
#should make a pinned post with external links I suppose. when I remember my pf login.#I don't really expect any platform to be fully exempt from data scraping tbh regardless of site policy. so no point in abandoning for that.
4 notes
Text
just saw a fic writer i liked happily share and thank someone for making ai fan "art" of their oc and i-
#i have never unfollowed and blocked so fast#writers what is up??? what is up with this kind of behavior?#were we not just concerned over the fact ai like chatgpt are data scraping sites like ao3?#but oh ppl using and supporting ai like midjourney (which u have to pay for) is ok??#the same ai that was trained off the hard work of countless artists without their permission#the gall to even call it fan art….disgusting
7 notes
Text
I think forums and personal websites should make a comeback
#fate rambles#with Twitter going down the shithole#and all these places scraping data for ai#lets go back to days when we all had personal sites with silly html and site forums#i used to hit up pokecharms and serenesforest back in the day myself
6 notes
Text
I am in the process of changing my AO3 settings so that only registered users can access my content. I refuse to allow people to train AI with my work. I recently had a guest request that I unlock the first part of a series on AO3.
I'm sorry, but I'm not unlocking my stuff for you, because AO3 has advised one of the few ways to prevent data scraping is by restricting access. This annoys me because my work's availability has to be limited because of someone else's intrusion. My apologies if this restricts your viewing/reading. Get an AO3 account or try to reset your passwords, please. Authors put their work on AO3 for free, and while we appreciate your comments and love, help us by not supporting AI.
Part of me feels like why should I bother to unlock my work if you can't be bothered to reset your passwords, or reregister on the site? That would make me feel like a bit of an a-hole for saying that though. I do appreciate that passwords are a pain, but if you value the stuff you read for free, I dunno, maybe you should make the effort? At the end of the day, I don't know you, so I'm not going to trust an anonymous 'guest'.


#data scraping#ai training#changing my ao3 settings#apologies if you are having difficulties accessing my content#but I am not willing to unlock my writing for you if you can't be bothered to reset your passwords or reregister on the site
4 notes
Text
Reddit sues Anthropic, accusing the AI company of illegally scraping data from its site
Social media platform Reddit has sued the artificial intelligence company Anthropic, alleging that it is illegally "scraping" the comments of Reddit users to train its chatbot Claude. Reddit claims that Anthropic has used automated bots to access Reddit's content despite being asked not to do so, and "intentionally trained on the personal data of Reddit users without ever requesting their…
0 notes
Text
lol apparently someone recently ran an LLM as an account on CWS and it proceeded to violate basic site rules by generating nearly 1200 garbage wordlinks that were either duplicates, too specific, or contained information that is not useful site-wide, thus creating a shitload of extra dictionary maintenance work for the unpaid volunteer staff who have been manually merging, deleting or otherwise un-fucking all of these wordlinks
very cool, solving the problem of "hm, i don't think these guys work hard enough" (sarcasm) by bloating the dictionary database with a bunch of crap
#and then apologizing with a long (AI generated) journal entry about how I'm just a widdle LLM and i didn't undewstand da wules#the account was barred from creating or editing WLs partway through its spree otherwise it probably would've made way more#bitch do the work yourself like the rest of us#nadia rambles#i wonder if they also scraped the site for conlang data or nah... who knows!#i mean... maybe argyle or other senior staff knows i guess#either way there haven't been any announcements about it#so for now i will just go for Ignorance Is Bliss and choose to believe the site was not scraped
0 notes
Text
So... apparently the NaNoWriMo organization has been gutted and the people at the top now are fully focused on Getting That AI Money.
I have no reason to say this other than Vibes™ and the way that every other org who has pivoted to AI has behaved but I wouldn't trust anything shared with or stored on their servers not to be scraped for training LLMs. That includes pasting stuff into the site to verify your word count, if that's still a thing. (I haven't done Nano since 2015).
Also of note:
Age gating has been implemented. If you haven't added your date of birth to your profile or if you're under 18, it's supposed to lock you out of local region pages and the forums. ... It's worth noting that the privacy policy on the webpage doesn't specify how that data is stored and may not be GDPR compliant.
...
Camp events are being run solely by sponsors. Events for LGBTQIA+, disabled writers, and writers of color no longer appear to be a thing at NaNo.
Just... go read the whole thing. It's not that long. Ugh.
6K notes
Text
Made a new sideblog to post some writings on literally the day before the AI news broke, so now I'm feeling real conflicted about actually publishing any writing at all on there. Maybe I should just post a link to an external website on here, rather than crossposting.
0 notes