Tumgik
#ai scraping
artsietango · 9 months
Text
This Google Drive AI scraping bullshit actually makes me want to cry. My entire life is packed into Google Drive. All of my writing over the years, all of my academic documents, everything.
I’m just so overwhelmed with all the shit I’m going to have to move. I’m lucky to have Scrivener, but online data storage has been super important as I’ve had so many shitty computers, and the only reason I haven’t lost work is because Google Drive has been my backup storage unit.
My partner has recommended GitLab as a place to move my files. It seems useful, and I can try to explain more about what it is and how it works when I get more familiar with it. I’m unsure if it’s a text editor, or can work that way. He was explaining something about the version history that I don’t quite understand right now but might later. I’m just super overwhelmed and frustrated that this is the dystopia we live in right now.
29K notes
z-mizcellaneous-z · 9 months
Text
listen. all im saying is it would be iconic as fuck if the writers on strike wrote insane amounts of horrendously smutty omegaverse fan fiction so when the studios try to AI scrape they'll be fucked over into next year
5K notes
amalgamasreal · 7 months
Text
SOURCE
Bit of a long video but worth a watch.
TL;DW though is that hidden in the Terms and Conditions for Google's AI Labs is a nice little poison pill that says they get access to your entire Google Drive if you opt in.
So if you're an author of some type and you keep your unpublished works in your G-Drive, that means an AI will get to scrape all of it, and by opting in you will have given them permission to do it. The content creator goes on to predict that Google is going to launch their own streaming service where the scripts, and potentially the art if it's animated, will be almost or entirely AI generated using that scraped data as a baseline. The authors and artists whose work was essentially stolen in its rawest form to crib from will have zero way of fighting Google on that in our current legal system.
This is of course right in the middle of the writers' and actors' strikes, where we're seeing just what lengths studios will go to in order to screw everyone but themselves.
They go on to recommend that if you keep any creative or personal works on Google Drive, you pull them off as soon as possible and delete your entire Drive. They acknowledge that of course this doesn't mean Google has really deleted the data, but if you do it before they start compulsorily opting everyone in, there's a chance your work might get overlooked. They also recommend several free editing programs that aren't run by corporations like Google, with LibreOffice (the default office program of most Linux distros) being named.
Finally they go over methods of shaming Google, which I feel like you just have to watch for comedy's sake, so I won't describe them in full.
Now this is from me: I know the majority of people don't have the ability to build and manage a big archive just for themselves, but if you're a creative NOW IS THE TIME to educate yourself on what you can do to protect your works. Cloud storage was always iffy at best, but with AI scraping entering the mix it's now downright malignant. Get a bunch of thumb drives, buy some external hard drives, if you have the money buy a pre-built NAS, and if you really want to get into it, learn how to build your own NAS. These are the old ways from before the cloud and they're coming back again, more important than ever.
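If you go the external-drive route, here's a rough sketch of what a "copy my writing folder to the drive and double-check it" script could look like in Python. This is only an illustration under assumptions: the function names `sha256_of` and `back_up` are made up for this example, and it does nothing clever like keeping old versions or deleting files you removed.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, destination: Path) -> list[Path]:
    """Copy every file under `source` into `destination` and verify each copy.

    Files whose backup copy already has the same checksum are skipped.
    Returns the relative paths of the files that were actually copied.
    """
    copied = []
    for src_file in sorted(source.rglob("*")):
        if not src_file.is_file():
            continue
        relative = src_file.relative_to(source)
        dest_file = destination / relative
        if dest_file.exists() and sha256_of(dest_file) == sha256_of(src_file):
            continue  # already backed up and unchanged
        dest_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dest_file)  # copies contents and timestamps
        if sha256_of(dest_file) != sha256_of(src_file):
            raise OSError(f"backup copy of {relative} does not match the original")
        copied.append(relative)
    return copied
```

You'd call it with something like `back_up(Path.home() / "Writing", Path("/media/backup-drive/Writing"))`, where both paths are placeholders for your own folder and drive. The checksum comparison is there because cheap thumb drives do sometimes corrupt files silently, and it also means a second run only copies what changed.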
2K notes
fabaulti · 8 months
Text
I think most of us should take the whole ai scraping situation as a sign that we should maybe stop giving google/facebook/big corps all our data and look into alternatives that actually value your privacy.
i know this is easier said than done because everybody under the sun seems to use these services, but I promise you it’s not impossible. In fact, I made a list of a few alternatives to popular apps and services, alternatives that are privacy first, open source and don’t sell your data.
right off the bat I suggest you stop using gmail. it’s trash and not secure at all. google can read your emails. in fact, google has access to all the data on your account and while what they do with it is already shady, I don’t even want to know what the whole ai situation is going to bring. a good alternative to a few google services is skiff. they provide a secure, e2ee mail service along with a workspace that can easily import google documents, a calendar and 10 gb free storage. i’ve been using it for a while and it’s great.
a good alternative to google drive is either koofr or filen. I use filen because everything you upload on there is end to end encrypted with zero knowledge. they offer 10 gb of free storage and really affordable lifetime plans.
google docs? i don’t know her. instead, try cryptpad. I don’t have the spoons to list all the great features of this service, you just have to believe me. nothing you write there will be used to train ai and you can share it just as easily. if skiff is too limited for you and you also need stuff like sheets or forms, cryptpad is here for you. the only downside i could think of is that they don’t have a mobile app, but the site works great in a browser too.
since there is no real alternative to youtube I recommend watching your little slime videos through a streaming frontend like freetube or newpipe. besides the fact that they remove ads, they also stop google from tracking what you watch. there is a bit of functionality loss with these services, but if you just want to watch videos privately they’re great.
if you’re looking for an alternative to google photos that is secure and end to end encrypted you might want to look into stingle, although in my experience filen’s photos tab works pretty well too.
oh, also, for the love of god, stop using whatsapp, facebook messenger or instagram for messaging. just stop. signal and telegram are literally here and they’re free. spread the word, educate your friends, ask them if they really want anyone to snoop around their private conversations.
regarding browsers, you know the drill. throw google chrome/edge in the trash (they're really basically spyware disguised as browsers) and download either librewolf or brave. mozilla firefox can be a great secure option too, with a bit of tinkering.
if you wanna get a vpn (and I recommend you do) be wary that some of them are scammy. do your research, read their terms and conditions, familiarise yourself with their model. if you don’t wanna do that and are willing to trust my word, go with mullvad. they don’t keep any logs. it’s 5 euros a month with no different pricing plans or other bullshit.
lastly, whatever alternative you decide on, what matters most is that you don’t keep all your data in one place. don’t trust a service to take care of your emails, documents, photos and messages. store all these things in different, trustworthy (preferably open source) places. there is absolutely no reason google has to know everything about you.
do your own research as well, don’t just trust the first vpn service your favourite youtuber gets sponsored by. don’t trust random tech blogs to tell you what the best cloud storage service is — they get good money for advertising one or the other. compare shit on your own or ask a tech savvy friend to help you. you’ve got this.
1K notes
thenerdyindividual · 10 months
Text
So with AO3 recommending locking your fics to help prevent scraping for AI use, I know a few people (myself included) who have locked down their fics. But it’s made me curious how many people are locking so…
Also reblog this and tell me in the tags why you do or don’t plan to lock your works.
For those of you that want to lock your works but don’t want to do each fic individually, here is a tutorial for how to lock all your fics at once.
1K notes
canadiancryptid · 29 days
Text
Tumblr media
Hey so just saw this on Twitter and figured there are some people who would like to know: @infinitytraincrew is apparently getting deleted tonight, so if you wanna archive it, do it now
393 notes
dduane · 6 months
Text
Sweet holy Thoth on his e-bike, *now* what?!
I pause from Ebooks Direct-oriented work to have a glance around Tumblr, sheerly for relaxation, you understand, and then on my dashboard suddenly find... this:
Tumblr media
...Um.
@makiruz, I'm sorry that you're not sorry. But that's on me. Meanwhile, I have no desire to be complicit / further involved in having made a thief of you, as entropy's apparently been increased more than enough around here.
So if @makiruz would kindly message me so that I can send her a link to a free bundle of all the books currently in the Ebooks Direct store, I'd appreciate it. The link will be tailored so that she won't have to submit any identifying info.
I'll also be pleased if she'll pass me contact info for a/the local library system where she is (national if possible), as I'm always glad to pass my digital stuff on to libraries for free. This will make it possible for me to top the local library up as new works come out.
And for the rest of you: many thanks for your kind words about my apparent living-sainthood.* I do what I can with what I've got. It's all any of us can do.
*With an added dry look: Would it were only so. @petermorwood is smiling gently, but has wisely declined to expound further on his thoughts at this time. :)
461 notes
sprout-fics · 9 months
Note
Since you've talked about AI scraping Ao3 for works to improve its own writing... Google Documents and Microsoft Word use AI scrapers as well, which cannot be turned off.
As a former Google Docs truther... I fucking hate this news, but thought I'd share for people to see!
So, I did a little digging on this. Here is an article written by a fellow author that talks about how Google has announced they use public data to train AI models. However, the key word here is public. Google Docs are private. That said, there could be disclaimers buried in the terms and conditions that nobody reads that detail that Google is in fact using things from Google Docs. Google has been known for some shady business practices when it comes to algorithms and privacy, so it would not surprise me if this was the case. However, at the moment Google has yet to announce they are scraping people’s private docs for AI.
I couldn’t find many sources that state Microsoft is explicitly scraping people’s data from the cloud/OneDrive or online Microsoft suite. However, what I did find was a thousand and one articles explaining how individuals can deliberately scrape data from Word docs. So theoretically, an individual could copy paste your work into a Word doc and then use tools to extract data from that. Which is the same as scraping data from Ao3 or Tumblr. The difference is (from what I can tell) that these are individual programmers and not mass generated AI bots.
One thing I found in the first article is a mention of Nextcloud, which is essentially a private, self-hosted server safe from AI scraping. It’s closer to Google Drive or OneDrive than to Microsoft Office, just under your own control and safer from AI data scraping. Keep in mind I have not used this program, so I cannot vouch for how well it works, but it might be something to look into.
So, here’s the thing: At this point in the game, it’s pretty much a guarantee that we all will face some sort of AI scraping of works in the near future. As these tools rapidly evolve, the tools to combat stealing of works struggle to keep up. There are steps we all can take: we can restrict works on Ao3 to users only, we can stop posting on Tumblr. But in the end, there are ways around this if people are hellbent on stealing your works. It is endlessly frustrating, and it seriously makes me reconsider the benefits of sharing content online, knowing my works can feed programs that are designed to eliminate the need for my creativity.
406 notes
the960writers · 26 days
Text
In light of the current shit with tumblr making us opt-out of sharing our blogs with AI scrapers, I checked the state of Wordpress for this and, not surprisingly since it's the same company, you need to opt out there too.
If you have a wordpress-blog of the NAME.wordpress.com kind, you need to go into Settings and under the section Privacy, hit the checkmark for "Prevent third-party sharing for NAME.wordpress.com".
Tumblr media
I know some of us here at writeblr have secondary blogs on wordpress, so make sure to opt-out of AI scraping there.
84 notes
llyfrenfys · 30 days
Text
By the way - I've had my work scraped by AI before. I'm protected only in that the AI sucks when it comes to minoritised languages. The site where I saw my work was a scam bookstore selling a Victorian Welsh dictionary and clearly a scraping ai saw my work had the words "Welsh" and "Dictionary" in it and went ham. Resulting in a product description with bits of my work on LGBT+ terminology in it. This anecdote in itself is funny, but the practice of ai scraping is not. I'm a writer and many thousands of writers like me depend on our written output for our livelihoods/careers. Allowing ai scraping on tumblr is putting a lot of people's livelihoods at risk. I don't even earn anything from my work- but I know many others who rely on their writing to get by and I'm so worried for all of them.
I genuinely don't want to leave this site. I refuse to move anywhere else and want to make this a better place, rather than migrate platforms every few years.
Automattic, do better. Tagging @staff to voice concerns, but do so with the caveat that I know none of it is their fault. This is an Automattic issue mainly.
61 notes
writingwife-83 · 10 months
Text
Might make a poll from the perspective of readers too 🤔
277 notes
daytodaycrowthoughts · 9 months
Text
I don't jump on big political topics much but I have an idea about the AI scraping that Google is fucking around with.
If they want our words so badly? They can have them. On our terms.
Remove all the shit you care about. All of it. And you know what to replace it with? Horrible horrible things.
Not illegal horrible. But write the WORST fanfic you possibly can.
Ron Desantis/Disney r34 enemies to lovers mpreg fics
Bezos/Epstein hurt/comfort f*ck or die prison fics
Nestlé/MLM secret lovers and neither of them knows the other is a corporate spy to steal secrets fics
Make. It. Weird.
Make them SO CRINGE that they're good, like @biggest-gaudiest-patronuses 's Tony the Tiger/Grinch fic (I love it very much)
And leave that in Drive for the AI to scrape. Write pure FILTH. Write like you're thirteen again on fanfic(dot)net or wattpad. Write purposefully badly. Write in only UWU speak.
Make them regret their decision.
If Disney can have a vault of nsfw from their artists because they have cruel and unjust contracts?
Google can have the worst trained AI service because of their invasive theft.
Make Corporations Cringe Again
And, for attention, please enjoy this nice rock I picked up on my walk yesterday 💜
Tumblr media
128 notes
sidecast · 29 days
Text
moving to pillowfort
i spent the day looking at alternatives to tumblr that show less of an intent to let ai scrape their sites. pillowfort was the most sympathetic to me, so that is where you can find my art from now on, under the same url. so far, it seems to be alright and similar to what i'm used to in many ways. i recommend it and urge you to have a look at it too, especially because i'd like to stay in touch with people i met on tumblr
obviously this isn't an issue that can be solved entirely by moving to a different site. it's an issue with society, capitalism, et cetera, you know it all already. i am still choosing to move, because this was the last drop in the glass that has been slowly overfilling for the past year. tumblr is not the same as when i started using it, and i see no reason why i should keep doing so
i will not deactivate my blog for the foreseeable future, but i will also not be posting anything "of worth" from here on unless a lot of things change very quickly
28 notes
beansprean · 28 days
Note
will you still be posting your art on tumblr with the ai scraping etc? :(
For now the answer is yes, of course! most of the people who want to see my stuff are here, and i like it here. i cant take back whatever midjourney already scraped from my blog before the announcement, and all my blogs now have the 'prevent third party w/e' thing toggled on (make sure yall do that too to protect your posts! any reblogs will still have that protection as long as the OP has it toggled btw so dont stop reblogging art lol), which i trust will provide me at least as much protection as posting images on the open internet did before lol. I might increase my watermarks? idk
I did sign up for both Wafrn and Cohost just in case, and I'm also fairly active on instagram (still beansprean everywhere), but IG SUCKS for posting comics because of the aspect ratio lol. I still plan to transport My Familiar's Ghost over to AO3 eventually (very time consuming)! You can also follow my Patreon; some posts there go public after a bit if you can't pay for early access!
i will let yall know if anything changes, but for now im here im queer and im not getting out of this chair!!
26 notes
albertcamuesli · 9 months
Text
Google updated its privacy policy to disclose that its various AI services, such as Bard and Cloud AI, may be trained on public data that the company has scraped from the web.
Tumblr media
89 notes
canadiancryptid · 30 days
Text
New privacy setting just dropped! It's turned off by default!
Tumblr media
It's under blog settings, for each individual sideblog. Bottom of the page. Don't know if you can get to it from the app but you definitely can on desktop mode
61 notes