#AI Data sourcing
Explore tagged Tumblr posts
cogitotech · 3 months ago
Text
Cogito Tech Introduces DataSum, a New Standard for Ethical and Transparent AI Data Sourcing
The rapid advancement of artificial intelligence relies on vast amounts of training data, yet concerns persist regarding its sourcing, labeling, and ethical implications. Issues such as biased AI models, mislabeled datasets, and exploitative labor conditions in data annotation have underscored the need for greater transparency in data sourcing.
0 notes
cyle · 5 months ago
Text
still confused how to make any of these LLMs useful to me.
while my daughter was napping, i downloaded lm studio and got a dozen of the most popular open source LLMs running on my PC, and they work great with very low latency, but i can't come up with anything to do with them but make boring toy scripts to do stupid shit.
as a test, i fed deepseek r1, llama 3.2, and mistral-small a big spreadsheet of data we've been collecting about my newborn daughter (all of this locally, not transmitting anything off my computer, because i don't want anybody with that data except, y'know, doctors) to see how it compared with several real doctors' advice and prognoses. all of the LLMs' suggestions were between generically correct and hilariously wrong. alarmingly wrong in some cases, but usually ending with the suggestion to "consult a medical professional" -- yeah, duh. pretty much no better than old school unreliable WebMD.
then i tried doing some prompt engineering to punch up some of my writing, and everything ended up sounding like it was written by an LLM. i don't get why anybody wants this. i can tell that LLM feel, and i think a lot of people can now, given the horrible sales emails i get every day that sound like they were "punched up" by an LLM. it's got a stink to it. maybe we'll all get used to it; i bet most non-tech people have no clue.
i may write a small script to try to tag some of my blogs' posts for me, because i'm really bad at doing so, but i have very little faith in the open source vision LLMs' ability to classify images. it'll probably not work how i hope. that still feels like something you gotta pay for to get good results.
all of this keeps making me think of ffmpeg. a super cool, tiny, useful program that is very extensible and great at performing a certain task: transcoding media. it used to be horribly annoying to transcode media, and then ffmpeg came along and made it all stupidly simple overnight, but nobody noticed. there was no industry bubble around it.
LLMs feel like they're competing for a space that's as ubiquitous and useful as the things we take for granted today, like ffmpeg. they just haven't fully grasped and appreciated that smallness yet. there isn't money to be made here.
61 notes · View notes
queen-mabs-revenge · 1 month ago
Text
communist generative ai boosters on this website truly like
Tumblr media
#generative ai#yes the cheating through school arguments can skew into personal chastisement instead of criticising the for-profit education system#that's hostile to learning in the first place#and yes the copyright defense is self-defeating and goofy#yes yeeeeeeeeeees i get it but fucking hell now the concept of art is bourgeois lmaao contrarian ass reactionary bullshit#whYYYYYYY are you fighting the alienation war on the side of alienation????#fucking unhinged cold-stream marxism really is just like -- what the fuck are you even fighting for? what even is the point of you?#sorry idk i just think that something that is actively and exponentially heightening capitalist alienation#while calcifying hyper-extractive private infrastructure to capture all energy production as we continue descending into climate chaos#and locking skills that our fucking species has cultivated through centuries of communicative learning behind an algorithmic black box#and doing it on the back of hyperexploitation of labour primarily in the neocolonial world#to try and sort and categorise the human experience into privately owned and traded bits of data capital#explicitly being used to streamline systematic emiseration and further erode human communal connection#OH I DON'T KNOW seems kind of bad!#seems kind of antithetical to and violent against the working class and our class struggle?#seems like everything - including technology - has a class character and isn't just neutral tools we can bend to our benefit#it is literally an exploitation; extraction; and alienation machine - idk maybe that isn't gonna aid the struggle#and flourishing of the full panoply of human experience that - i fucking hope - we're fighting for???#for the fullness of human creative liberation that can only come through the first step of socialist revolution???#that's what i'm fighting for anyway - idk what the fuck some of you are doing#fucking brittle economic marxists genuinely defending a technology that is demonstrably violent to the sources of all value:#the soil and the worker#but sure it'll be fine - abundance babey!#WHEW.
9 notes · View notes
bananonbinary · 10 months ago
Text
god i miss when the internet wasn't garbage. you can't google anything these days without whatever answer you're looking for getting lost under a deluge of seo ai bullshit. cannot wait until the bubble pops and we might get useable search engines again.
22 notes · View notes
pikslasrce · 2 years ago
Text
i get the outrage over the ai generated mv bc i agree however it irks me that people keep pointing out the wonky/extra fingers/etc as a gotcha bc i think thats the whole point of using ai for the video they wanted the "flaws" that come with it that 'ai generated uncanny valley' vibe like even tho i disagree w it on an ethical level i do get where theyre coming from artistic direction-wise
39 notes · View notes
rjohnson49la · 5 months ago
Text
3 notes · View notes
yddaw · 1 year ago
Text
Sometimes it’s unfortunate seeing that a lot of people are anti [insert technology here]. It makes sense of course, but it seems like the idea being shared is that the technological tool itself is “bad” but not the company using it.
Like Chromium is not the same thing as Chrome itself. And AI is not only for stealing content and reselling it. But having so many companies do this and use these tools with little to no regulation (specifically on privacy) paints such a nasty image for the tool that has so much potential 😩
6 notes · View notes
canadianlucifer · 1 year ago
Text
it's so sad that i can't say "I love AI!" without a million asterisks
2 notes · View notes
mysocial8onetech · 2 years ago
Text
Meet Subject-Diffusion, the model that revolutionizes open-domain personalized image generation. No test-time fine-tuning required. Just provide a text description and a reference image and get stunning single- or multi-subject images in any domain. Learn more about this model.
4 notes · View notes
nitunio · 2 years ago
Text
I think that if a person knows that something was made using AI trained on unethically sourced data, and still uses it/likes it/supports it/defends it, then said person should stop "being mad" when their data is used to train AI without consent.
2 notes · View notes
mossquitoman · 2 years ago
Text
what do people even have against ai art lmao what
3 notes · View notes
opha · 1 year ago
Text
commercially available image generators consume an average of 3 watt-hours per generation. running the AC for an hour is a little over 1000. we usually measure energy in kilowatt-hours, btw, because watt-hours are so negligible in scope that you can pedal a bike for an hour and have enough energy for 33 image generations, or for running a single 25 watt light bulb for 4 hours.
as for water, about 500ml for anywhere between 5 and 50 generations (depends on location and season). all of that comes from data centers, which often (but not always) reuse the water and about a third of which are using non-potable wastewater for cooling in the first place. you know, the stuff you personally use up to the tune of 30-50 gallons per day by flushing the toilet, showering, running the washing machine, etc.
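for anyone who wants to check the arithmetic, here's a minimal python sketch. every constant is just the figure quoted in this post, not an independent measurement, and the 100 watt-hours for an hour of pedaling is an assumption inferred from the "33 generations" and "25 watt bulb for 4 hours" comparisons. the function names are made up for illustration.

```python
# Rough sanity check of the post's arithmetic.
# All constants are the post's own claims, not measured values.
WH_PER_IMAGE = 3.0        # watt-hours per image generation (post's figure)
AC_WH_PER_HOUR = 1000.0   # "a little over 1000" Wh for an hour of AC (post's figure)
BIKE_WH_PER_HOUR = 100.0  # ASSUMPTION: implied by 33 generations x 3 Wh
BULB_WATTS = 25.0         # light bulb from the post's comparison


def images_per_budget(budget_wh, wh_per_image=WH_PER_IMAGE):
    """Whole number of image generations that fit in an energy budget (Wh)."""
    return int(budget_wh // wh_per_image)


def water_ml_per_image(total_ml=500.0, generations=(5, 50)):
    """(low, high) millilitres of water per generation for the quoted range."""
    fewest, most = generations
    return (total_ml / most, total_ml / fewest)


if __name__ == "__main__":
    print(images_per_budget(BIKE_WH_PER_HOUR))  # generations per hour of pedaling
    print(BULB_WATTS * 4)                       # Wh used by the bulb in 4 hours
    print(images_per_budget(AC_WH_PER_HOUR))    # generations per hour of AC
    print(water_ml_per_image())                 # ml of water per generation
```

under those assumptions an hour of AC covers roughly 333 generations, and the water figure works out to somewhere between 10 and 100 ml per image.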
training large language models does consume much more energy, but it's still pretty small fish compared to any other tech industry, and that's not something that is constantly happening.
please don't be fooled by contextless figures and inaccurate analogies slung by obvious clickbait articles like "you'll be astonished how much power it takes to generate a single image!" and other people who have a vested interest in fearmongering misinformation. there are plenty of real problems in this sector, like the use of labor exploitation in the global south in fine-tuning LLMs.
See also, "We're in a drought; conserve water!" Meanwhile, bottled water companies and golf courses for rich folk empty the aquifers.
231K notes · View notes
mysocial8onetech · 2 years ago
Text
How can we leverage the power of natural language processing and artificial intelligence to automate fact-checking and make it more efficient and scalable? In this latest blog article, we describe FactLLaMA, a new model that optimizes instruction-following language models with external knowledge for automated fact-checking. We explain what FactLLaMA is and share further details about the model.
2 notes · View notes
instantedownloads · 1 month ago
Text
How to Use n8n and AI to Build an Automation System
Automation is changing how we work every day. It helps save time, reduce mistakes, and get more done with less effort. If you want to automate your tasks but don’t know where to start, this guide is for you. In this post, you will learn how to use n8n — a free, open-source automation tool — combined with AI to build smart workflows that do work for you. What Is n8n? n8n (pronounced…
0 notes
ongoing-catastrophe · 2 months ago
Text
I have a dilemma. later this month I'm gonna be teaching/grading a bunch of middle to high schoolers in speech and debate. I want to try to use this as an opportunity to give them a chance to actually learn something instead of relying on gen ai but honestly I'm not sure how to regulate this.
how do I tell if it's ai writing or just unskilled writing? if a speech is ai or just given badly? i think it's easier to pick out the wrong facts (I'll have to make a big folder but i'll manage) but even still, what if it's a genuine mistake and not ai use?
it doesn't help that the event organizers don't seem to actually care about the integrity of the event, since they told me I could "just toss it in chat gpt" when they asked me to make an introductory document
0 notes