#Data bias
the-feminine-fury · 2 years
Text
"We teach brilliance bias to children from an early age. A recent US study found that when girls start primary school at the age of five they are as likely as five year old boys to think women could be 'really really smart'. But by the time they turn six, something changes. They start doubting their gender. So much so, in fact, that they start limiting themselves if a game is presented to them as intended for 'children who are really really smart'. Five Year old girls are as likely to want to play it as boys but six-year-old girls are suddenly uninterested. Schools are teaching little girls that brilliant doesn't belong to them."
(Perez, Invisible Women: Data Bias in a World Designed for Men)
284 notes · View notes
itellmyselfsecrets · 1 year
Text
“Sex is not the reason women are excluded from data. Gender is. The female body is not the problem. The problem is the social meaning that we ascribe to that body, and a socially determined failure to account for it…The gender data gap is both a cause, and a consequence of the type of unthinking that conceives of humanity as almost exclusively male…Seeing men as the human default is fundamental to the structure of human society…If human evolution is driven by men, are women even human?” - Caroline Criado Perez (Invisible Women: Data Bias in a World Designed for Men)
11 notes · View notes
skannar · 10 months
Text
I love good audiobooks on new tech.
2 notes · View notes
d0nutzgg · 1 year
Text
The Implications of Algorithmic Bias and How To Mitigate It
AI has the potential to transform our world in ways we can't even imagine. From self-driving cars to personalized medicine, it's making our lives easier and more efficient. However, with this power comes the responsibility to consider the ethical implications and challenges that come with the use of AI. One of the most significant ethical concerns with AI is algorithmic bias.
Algorithmic bias occurs when a machine learning model is trained on data that disproportionately represents one demographic group; the model may then make inaccurate predictions for other groups, leading to discrimination. This can be a major problem when AI systems are used in decision-making contexts, such as in healthcare or criminal justice, where fairness is crucial.
But there are ways engineers can mitigate algorithmic bias in their models to help promote equality. One important step is to ensure that the data used to train the model is representative of the population it will be used on. Additionally, engineers should test their models on a diverse set of data to identify any potential biases and correct them.
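As a sketch of that first step, here's a quick representativeness check in Python. The group labels and population shares are made-up for illustration; a real audit would compare against census or user-base figures:

```python
from collections import Counter

def representation_gap(train_groups, population_shares):
    """Return each group's share in the training data minus its share
    in the population the model will serve (positive = over-represented)."""
    counts = Counter(train_groups)
    total = sum(counts.values())
    return {
        group: counts.get(group, 0) / total - pop_share
        for group, pop_share in population_shares.items()
    }

# Hypothetical training set: 80% group A, 20% group B,
# deployed on a population that is actually 50/50.
train = ["A"] * 800 + ["B"] * 200
gaps = representation_gap(train, {"A": 0.5, "B": 0.5})
print(gaps)  # A over-represented by ~0.3, B under-represented by ~0.3
```

A gap report like this is only a starting point: it catches missing groups, but not label bias within groups, which is why the per-group testing described above still matters.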
Another key step is to be transparent about the decisions made by the model, and to provide an interpretable explanation of how it reaches its decisions. This can help to ensure that the model is held accountable for any discriminatory decisions it makes.
Finally, it's important to engage with stakeholders, including individuals and communities who may be affected by the model's decisions, to understand their concerns and incorporate them into the development process.
As engineers, we have a responsibility to ensure that our AI models are fair, transparent and accountable. By taking these steps, we can help to promote equality and ensure that the benefits of AI are enjoyed by everyone.
2 notes · View notes
Text
"Invisible Women" by Caroline Criado-Perez
Thank you @womensbookclub_paris for the rec! ❤️
0 notes
filehulk · 11 months
Text
Natural Language Processing with ChatGPT: Unlocking Human-Like Conversations
Natural Language Processing (NLP) has witnessed significant advancements in recent years, empowering machines to understand and generate human-like text. One remarkable breakthrough in this domain is ChatGPT, a cutting-edge language model that leverages state-of-the-art techniques to engage in conversational exchanges. In this article, we delve into the underlying technology of ChatGPT, its…
1 note · View note
hitechbpo · 2 years
Link
Data bias in ML projects is also called machine learning bias or algorithmic bias. It arises when the training datasets used to train ML or AI models are incomplete or do not contain a true representation of the facts on the ground. Find out here about 5 types of data bias impacting your ML projects and how to fix them.
0 notes
Text
The surprising truth about data-driven dictatorships
Here’s the “dictator’s dilemma”: they want to block their country’s frustrated elites from mobilizing against them, so they censor public communications; but they also want to know what their people truly believe, so they can head off simmering resentments before they boil over into regime-toppling revolutions.
These two strategies are in tension: the more you censor, the less you know about the true feelings of your citizens and the easier it will be to miss serious problems until they spill over into the streets (think: the fall of the Berlin Wall or Tunisia before the Arab Spring). Dictators try to square this circle with things like private opinion polling or petition systems, but these capture only a small slice of the potentially destabilizing moods circulating in the body politic.
Enter AI: back in 2018, Yuval Harari proposed that AI would supercharge dictatorships by mining and summarizing the public mood — as captured on social media — allowing dictators to tack into serious discontent and defuse it before it erupted into unquenchable wildfire:
https://www.theatlantic.com/magazine/archive/2018/10/yuval-noah-harari-technology-tyranny/568330/
Harari wrote that “the desire to concentrate all information and power in one place may become [dictators’] decisive advantage in the 21st century.” But other political scientists sharply disagreed. Last year, Henry Farrell, Jeremy Wallace and Abraham Newman published a thoroughgoing rebuttal to Harari in Foreign Affairs:
https://www.foreignaffairs.com/world/spirals-delusion-artificial-intelligence-decision-making
They argued that — like everyone who gets excited about AI, only to have their hopes dashed — dictators seeking to use AI to understand the public mood would run into serious training data bias problems. After all, people living under dictatorships know that spouting off about their discontent and desire for change is a risky business, so they will self-censor on social media. That’s true even if a person isn’t afraid of retaliation: if you know that using certain words or phrases in a post will get it autoblocked by a censorbot, what’s the point of trying to use those words?
The phrase “Garbage In, Garbage Out” dates back to 1957. That’s how long we’ve known that a computer that operates on bad data will barf up bad conclusions. But this is a very inconvenient truth for AI weirdos: having given up on manually assembling training data based on careful human judgment with multiple review steps, the AI industry “pivoted” to mass ingestion of scraped data from the whole internet.
But adding more unreliable data to an unreliable dataset doesn’t improve its reliability. GIGO is the iron law of computing, and you can’t repeal it by shoveling more garbage into the top of the training funnel:
https://memex.craphound.com/2018/05/29/garbage-in-garbage-out-machine-learning-has-not-repealed-the-iron-law-of-computer-science/
When it comes to “AI” that’s used for decision support — that is, when an algorithm tells humans what to do and they do it — then you get something worse than Garbage In, Garbage Out — you get Garbage In, Garbage Out, Garbage Back In Again. That’s when the AI spits out something wrong, and then another AI sucks up that wrong conclusion and uses it to generate more conclusions.
To see this in action, consider the deeply flawed predictive policing systems that cities around the world rely on. These systems suck up crime data from the cops, then predict where crime is going to be, and send cops to those “hotspots” to do things like throw Black kids up against a wall and make them turn out their pockets, or pull over drivers and search their cars after pretending to have smelled cannabis.
The problem here is that “crime the police detected” isn’t the same as “crime.” You only find crime where you look for it. For example, there are far more incidents of domestic abuse reported in apartment buildings than in fully detached homes. That’s not because apartment dwellers are more likely to be wife-beaters: it’s because domestic abuse is most often reported by a neighbor who hears it through the walls.
So if your cops practice racially biased policing (I know, this is hard to imagine, but stay with me /s), then the crime they detect will already be a function of bias. If you only ever throw Black kids up against a wall and turn out their pockets, then every knife and dime-bag you find in someone’s pockets will come from some Black kid the cops decided to harass.
That’s life without AI. But now let’s throw in predictive policing: feed your “knives found in pockets” data to an algorithm and ask it to predict where there are more knives in pockets, and it will send you back to that Black neighborhood and tell you to throw even more Black kids up against a wall and search their pockets. The more you do this, the more knives you’ll find, and the more you’ll go back and do it again.
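You can watch this feedback loop lock in with a toy simulation (all numbers hypothetical: both neighborhoods have the identical true rate of findable contraband, and the only difference is a biased starting patrol allocation):

```python
# Two neighborhoods with the SAME true rate of "findable" offenses.
# The only asymmetry is where patrols start out concentrated.
TRUE_RATE = 0.1
patrols = [75.0, 25.0]      # biased initial allocation of 100 patrols
finds = [0.0, 0.0]

for year in range(10):
    # Finds are proportional to where you look, not to crime:
    for hood in (0, 1):
        finds[hood] += patrols[hood] * TRUE_RATE
    # "Predictive" step: send next year's patrols wherever historical
    # finds were highest -- i.e. feed the system its own output.
    total = finds[0] + finds[1]
    patrols = [100 * f / total for f in finds]

print([round(p) for p in patrols])  # [75, 25]: the initial bias never washes out
```

Even though the two neighborhoods are identical, the record shows three times as many finds in the over-patrolled one, and the "prediction" dutifully confirms the original allocation, forever.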
This is what Patrick Ball from the Human Rights Data Analysis Group calls “empiricism washing”: take a biased procedure and feed it to an algorithm, and then you get to go and do more biased procedures, and whenever anyone accuses you of bias, you can insist that you’re just following an empirical conclusion of a neutral algorithm, because “math can’t be racist.”
HRDAG has done excellent work on this, finding a natural experiment that makes the problem of GIGOGBI crystal clear. The National Survey On Drug Use and Health produces the gold standard snapshot of drug use in America. Kristian Lum and William Isaac took Oakland’s drug arrest data from 2010 and asked Predpol, a leading predictive policing product, to predict where Oakland’s 2011 drug use would take place.
[Image ID: (a) Number of drug arrests made by Oakland police department, 2010. (1) West Oakland, (2) International Boulevard. (b) Estimated number of drug users, based on 2011 National Survey on Drug Use and Health]
Then, they compared those predictions to the outcomes of the 2011 survey, which shows where actual drug use took place. The two maps couldn’t be more different:
https://rss.onlinelibrary.wiley.com/doi/full/10.1111/j.1740-9713.2016.00960.x
Predpol told cops to go and look for drug use in a predominantly Black, working class neighborhood. Meanwhile, the NSDUH survey showed that actual drug use took place all over Oakland, with a higher concentration in the Berkeley-neighboring student neighborhood.
What’s even more vivid is what happens when you simulate running Predpol on the new arrest data that would be generated by cops following its recommendations. If the cops went to that Black neighborhood and found more drugs there and told Predpol about it, the recommendation gets stronger and more confident.
In other words, GIGOGBI is a system for concentrating bias. Even trace amounts of bias in the original training data get refined and magnified when they are output through a decision-support system that directs humans to go and act on that output. Algorithms are to bias what centrifuges are to radioactive ore: a way to turn minute amounts of bias into pluripotent, indestructible toxic waste.
There’s a great name for an AI that’s trained on an AI’s output, courtesy of Jathan Sadowski: “Habsburg AI.”
And that brings me back to the Dictator’s Dilemma. If your citizens are self-censoring in order to avoid retaliation or algorithmic shadowbanning, then the AI you train on their posts in order to find out what they’re really thinking will steer you in the opposite direction, so you make bad policies that make people angrier and destabilize things more.
Or at least, that was Farrell et al’s theory. And for many years, that’s where the debate over AI and dictatorship has stalled: theory vs theory. But now, there’s some empirical data on this, thanks to “The Digital Dictator’s Dilemma,” a new paper from UCSD PhD candidate Eddie Yang:
https://www.eddieyang.net/research/DDD.pdf
Yang figured out a way to test these dueling hypotheses. He got 10 million Chinese social media posts from the start of the pandemic, before companies like Weibo were required to censor certain pandemic-related posts as politically sensitive. Yang treats these posts as a robust snapshot of public opinion: because there was no censorship of pandemic-related chatter, Chinese users were free to post anything they wanted without having to self-censor for fear of retaliation or deletion.
Next, Yang acquired the censorship model used by a real Chinese social media company to decide which posts should be blocked. Using this, he was able to determine which of the posts in the original set would be censored today in China.
That means that Yang knows what the “real” sentiment in the Chinese social media snapshot is, and what Chinese authorities would believe it to be if Chinese users were self-censoring all the posts that would be flagged by censorware today.
From here, Yang was able to play with the knobs, and determine how “preference-falsification” (when users lie about their feelings) and self-censorship would give a dictatorship a misleading view of public sentiment. What he finds is that the more repressive a regime is — the more people are incentivized to falsify or censor their views — the worse the system gets at uncovering the true public mood.
What’s more, adding additional (bad) data to the system doesn’t fix this “missing data” problem. GIGO remains an iron law of computing in this context, too.
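Here's a toy Monte Carlo (not Yang's actual model — the 40% discontent figure and the repression rates are made up) showing how self-censorship starves the measurement of exactly the data it needs:

```python
import random

random.seed(42)

def measured_discontent(true_discontent, repression, n=100_000):
    """Share of posts that read as discontented, when each genuinely
    discontented user hides their view (stays silent, or posts a loyal
    opinion instead) with probability `repression`."""
    visible = 0
    for _ in range(n):
        discontented = random.random() < true_discontent
        hides = random.random() < repression
        if discontented and not hides:
            visible += 1
    return visible / n

TRUE = 0.40  # hypothetical: 40% of the public is actually discontented
for repression in (0.0, 0.5, 0.9):
    print(repression, round(measured_discontent(TRUE, repression), 2))
# the harsher the repression, the further the measured mood drifts from
# the truth -- and piling on more censored posts can't recover the gap
```

The missing posts aren't noise that averages out with volume; they're a systematic hole in exactly the part of the distribution the dictator most needs to see.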
But it gets better (or worse, I guess): Yang models a “crisis” scenario in which users stop self-censoring and start articulating their true views (because they’ve run out of fucks to give). This is the most dangerous moment for a dictator: depending on how the dictatorship handles it, they either get another decade of rule, or they wake up with guillotines on their lawns.
But “crisis” is where AI performs the worst. Trained on the “status quo” data where users are continuously self-censoring and preference-falsifying, AI has no clue how to handle the unvarnished truth. Both its recommendations about what to censor and its summaries of public sentiment are the least accurate when crisis erupts.
But here’s an interesting wrinkle: Yang scraped a bunch of Chinese users’ posts from Twitter — which the Chinese government doesn’t get to censor (yet) or spy on (yet) — and fed them to the model. He hypothesized that when Chinese users post to American social media, they don’t self-censor or preference-falsify, so this data should help the model improve its accuracy.
He was right — the model got significantly better once it ingested data from Twitter than when it was working solely from Weibo posts. And Yang notes that dictatorships all over the world are widely understood to be scraping western/northern social media.
But even though Twitter data improved the model’s accuracy, it was still wildly inaccurate, compared to the same model trained on a full set of un-self-censored, un-falsified data. GIGO is not an option, it’s the law (of computing).
Writing about the study on Crooked Timber, Farrell notes that as the world fills up with “garbage and noise” (he invokes Philip K Dick’s delightful coinage “gubbish”), “approximately correct knowledge becomes the scarce and valuable resource.”
https://crookedtimber.org/2023/07/25/51610/
This “probably approximately correct knowledge” comes from humans, not LLMs or AI, and so “the social applications of machine learning in non-authoritarian societies are just as parasitic on these forms of human knowledge production as authoritarian governments.”
The Clarion Science Fiction and Fantasy Writers’ Workshop summer fundraiser is almost over! I am an alum, instructor and volunteer board member for this nonprofit workshop whose alums include Octavia Butler, Kim Stanley Robinson, Bruce Sterling, Nalo Hopkinson, Kameron Hurley, Nnedi Okorafor, Lucius Shepard, and Ted Chiang! Your donations will help us subsidize tuition for students, making Clarion — and sf/f — more accessible for all kinds of writers.
Libro.fm is the indie-bookstore-friendly, DRM-free audiobook alternative to Audible, the Amazon-owned monopolist that locks every book you buy to Amazon forever. When you buy a book on Libro, they share some of the purchase price with a local indie bookstore of your choosing (Libro is the best partner I have in selling my own DRM-free audiobooks!). As of today, Libro is even better, because it’s available in five new territories and currencies: Canada, the UK, the EU, Australia and New Zealand!
[Image ID: An altered image of the Nuremberg rally, with ranked lines of soldiers facing a towering figure in a many-ribboned soldier's coat. He wears a high-peaked cap with a microchip in place of insignia. His head has been replaced with the menacing red eye of HAL9000 from Stanley Kubrick's '2001: A Space Odyssey.' The sky behind him is filled with a 'code waterfall' from 'The Matrix.']
Image: Cryteria (modified) https://commons.wikimedia.org/wiki/File:HAL9000.svg
CC BY 3.0 https://creativecommons.org/licenses/by/3.0/deed.en
 — 
Raimond Spekking (modified) https://commons.wikimedia.org/wiki/File:Acer_Extensa_5220_-_Columbia_MB_06236-1N_-_Intel_Celeron_M_530_-_SLA2G_-_in_Socket_479-5029.jpg
CC BY-SA 4.0 https://creativecommons.org/licenses/by-sa/4.0/deed.en
 — 
Russian Airborne Troops (modified) https://commons.wikimedia.org/wiki/File:Vladislav_Achalov_at_the_Airborne_Troops_Day_in_Moscow_%E2%80%93_August_2,_2008.jpg
“Soldiers of Russia” Cultural Center (modified) https://commons.wikimedia.org/wiki/File:Col._Leonid_Khabarov_in_an_everyday_service_uniform.JPG
CC BY-SA 3.0 https://creativecommons.org/licenses/by-sa/3.0/deed.en
826 notes · View notes
waitineedaname · 4 months
Text
Finally, after months of work, I have completed it: the collection of all* character appearances in Fullmetal Alchemist: Brotherhood!
edit: if you want a more detailed spreadsheet on the homunculi in particular, @vuullets has a collection of all homunculi appearances in the manga! you can find it here
Some notes on this spreadsheet:
there are spoilers. obviously. proceed with caution
timestamps indicate when a character first appears in a scene, not every time they appear. if the scene changes to one without that character, and then we return to that character in another scene, that's another timestamp for a new appearance
all timestamps are approximate, give or take a few seconds based on how quickly I could pause the show
only unique flashbacks count as an appearance. if the flashback is to something we've seen in a previous episode, that is not counted as a unique appearance, but if it provides something new that we haven't seen before, it counts!
I didn't include background easter egg appearances, like when you can see Mei in the background at a train station before she's introduced
I didn't actually do all characters. there are a lot of characters, and I am just one person. sorry if you're a big fan of minor members of the military, I just couldn't do it
since Greed is kind of a special case, he deserves a specific explanation: OG Greed and Greedling are not counted as separate characters, they're both just Greed. when Greed is in control of Ling's body, that counts as an appearance for Greed, and it's not an appearance for Ling unless he's in control. if they're both in a scene together (talking in the mindscape, for example, or switching control back and forth) they each get a timestamp for when they first appear/speak in a scene
feel free to use this as a reference! I made this as a useful tool for myself, and because I'm a nerd about data. if you are also a nerd about data, I tallied up some stats, which I'll put under the cut:
only six characters broke 30 episodes. the characters with the most appearances are Edward (60), Alphonse (58), Mustang (45), Hawkeye (42), Scar (40), and Winry (31).
next highest on the list are Alex Armstrong and Mei (tied for 29), King Bradley (28), Hohenheim (26), and Ling (25).
the homunculus in the most episodes is Wrath (28), and the one in the least is Lust (11)
as previously mentioned, Alex Armstrong and Mei are in the same number of episodes (29), as are Olivier Armstrong and Marcoh (24), and Buccaneer and Ross (18)
Hughes is in only 10 episodes, the same number as Grumman and Fu
Yoki is in a whopping 23 episodes. what the fuck
the chimera in the most episodes is Zampano (21), closely followed by Darius and Jerso (20), with Heinkel falling behind at 16. The Devil's Nest chimeras are only in 2 episodes, with the exception of Bido, who is in 3
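If you also want to crunch a sheet like this, per-character episode counts take only a few lines of Python (these are toy rows in the spreadsheet's shape, not the real data):

```python
from collections import Counter

# Toy rows shaped like the spreadsheet: (character, episode, timestamp).
rows = [
    ("Edward", 1, "02:14"), ("Edward", 1, "11:03"), ("Edward", 2, "00:45"),
    ("Alphonse", 1, "02:14"), ("Greed", 14, "08:30"), ("Greed", 14, "19:12"),
]

# A character can have many timestamps per episode, but an episode
# should only count once toward their total -- so dedupe first.
unique_eps = {(character, episode) for character, episode, _ in rows}
episode_counts = Counter(character for character, _ in unique_eps)

print(episode_counts["Edward"])  # 2
```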
77 notes · View notes
butch-reidentified · 3 months
Text
MRAs love to claim that if women were in charge the world would go to shit bc we'd "get our periods and declare war," which is obviously a batshit insane, uneducated, and maximally misogynistic belief to begin with. I shouldn't have to tell you that our periods don't actually make us emotionally unstable, that in fact fewer than 20% of college-age women (women who aren't even old enough for the prefrontal cortex to finish developing, and thus are far from old enough to, for example, be eligible to run for US president) even report "severe" psychological symptoms of PMS - and this includes symptoms like depressed mood and anxiety.
in fact, PMS isn't even something all women experience. and of those who do, there's a huge variety of ways it can present. most symptoms women associate with PMS are not emotional: bloating, body soreness, headaches, oversleeping, food cravings, nausea/vomiting, hot flashes, breast tenderness....
from the article linked above: "Definitions of PMS and diagnostic criteria to identify cases have varied substantially over the years and across studies, in large part due to the heterogeneity of women’s menstrual symptom experience. Over 150 symptoms have been associated with PMS."
overwhelmingly, research shows that the effect of PMS on women in the workplace is the same as that of any other medical problem/illness: some people miss some work if it's severe enough. which, considering that symptoms can often include various types of pain that can be quite severe, as well as common illness symptoms like nausea and vomiting, it makes perfect sense that some women would need to take a day off or leave a bit early at times. what the research does NOT say is that PMS causes women to behave in irrational ways that negatively impact the quality of her work.
so let's be truthful. why would female leaders mean more war when women and girls are so overwhelmingly and horrifically sexually victimized as a result?
if most women don't even experience severe mood symptoms with PMS, and having mood symptoms doesn't mean one is unable to control her actions/behaviors (I know this concept of self-control is foreign to most men, but we're pretty good at it!), and there's absolutely zero evidence to suggest that severe PMS mood symptoms would or could ever lead to declaring war, and women old enough to hold office in most countries have many years of experience managing their pre/peri-menstrual symptoms (if they even have any), and most world leaders are past the age women stop even having periods at all, and we see that women in other leadership positions are absolutely crushing it all over the world, and there IS significant evidence showing that women in numerous fields actually outperform male peers (despite feeling significantly less respected in higher-rank positions than males feel, as well as feeling more discouraged and frustrated) and are more emotionally intelligent, there IS evidence that women are less influenced by and better at regulating anger in the workplace, and there IS indisputable evidence that men are more violent than women in general, regardless of the reason, and there IS indisputable evidence that women and girls suffer mass victimization by men during wartime... then maybe, just maybe, women are actually less likely than men to start wars. but there's only one way to find out for sure 😏
51 notes · View notes
mathysphere · 8 months
Text
Fun fact! 'Funny cross stitch' is the 16ᵗʰ most common tag given to cross-stitch listings on Etsy (right after 'pdf cross stitch' and before 'cute cross stitch'). But how many of those patterns are actually funny? What's the average humor level? And which among them is the funniest? I think it's time we found out!
Click Here* to cast your vote on any of 997 different 'funny' cross-stitch patterns, all randomly scraped from Etsy throughout the year, and check back later this month for the results!
Patterns pictured here: [dark sense] [keith haring] [love] [pew pew] [gnomes]
*the site is uhhhh a little hacked-together, so if it crashes lemme know and I'll fix it. website design is my passion
117 notes · View notes
itellmyselfsecrets · 1 year
Text
“Most of recorded human history is one big data gap. Starting with the theory of Man the Hunter, the chroniclers of the past have left little space for women's role in the evolution of humanity, whether cultural or biological. Instead, the lives of men have been taken to represent those of humans overall.” - Caroline Criado Perez (Invisible Women: Data Bias in a World Designed for Men)
10 notes · View notes
seonghwacore · 28 days
Text
be real honest. which member of your favorite group has a personality that's actually similar to yours? are they your bias or not?
34 notes · View notes
Text
Even something as basic as advice on how to exercise to keep disease at bay is based on male-biased research. If you run a general search for whether resistance training is good for reducing heart disease, you’ll come across a series of papers warning against resistance training if you have high blood pressure. This is in large part because of the concerns that it doesn’t have as beneficial an effect on lowering blood pressure as aerobic exercise, and also because it causes an increase in artery stiffness.
Which is all true. In men. Who, as ever, form the majority of research participants. The research that has been done on women suggests that this advice is not gender-neutral. A 2008 paper, for example, found that not only does resistance training lower blood pressure to a greater extent in women, women don’t suffer from the same increase in artery stiffness. And this matters, because as women get older, their blood pressure gets higher compared to men of the same age, and elevated blood pressure is more directly linked to cardiovascular mortality in women than in men. In fact, the risk of death from coronary artery disease for women is twice that for men for every 20 mm Hg increase in blood pressure above normal levels. It also matters because commonly used antihypertensive drugs have been shown to be less beneficial in lowering blood pressure in women than in men.
So to sum up: for women, the blood-pressure drugs (developed using male subjects) don’t work as effectively, but resistance training just might do the trick. Except we haven’t known that because all the studies have been done on men. And this is before we account for the benefits to women in doing resistance training to counteract osteopenia and osteoporosis, both of which they are at high risk for post-menopause.

Other male-biased advice includes the recommendation for diabetics to do high-intensity interval training; it doesn’t really help female diabetics (we don’t really know why, but this is possibly because women burn fat more than carbs during exercise). We know very little about how women respond to concussions, ‘even though women suffer from concussions at higher rates than men and take longer to recover in comparable sports’. Isometric exercises fatigue women less (which is relevant for post-injury rehabilitation) because men and women have different ratios of types of muscle fibre, but we have ‘a limited understanding of the differences’ because there are ‘an inadequate number of published studies’.
— Caroline Criado Perez, Invisible Women: Exposing Data Bias in a World Designed for Men
1K notes · View notes
petewentzisblack1312 · 2 months
Text
'mourning colour' as the name implies refers to a colour associated with mourning, for example one worn at funerals and wakes.
say your answer and where youre from in the tags, replies or in a comment.
38 notes · View notes