Shaking up the checkbox system
Some of you might already be aware that I’m considering a new system for choosing which terms are on the checkbox identity list in future annual surveys. Under the existing system, this year's results would add 12 new terms and remove none, making the checkbox identity list 45 terms long. (We usually add 1-4 new words per year and maybe remove one, so +12 and -0 is very unusual!)
At 33 terms this year, the list was already too long. People were struggling to find their identities in the list even with the filter, which harms data quality when they use the textboxes instead, and increases the risk of participants abandoning the survey. We also get a lot of people saying stuff like “you should explain what these terms mean, because I’ve not heard of half of them” - that sense of alienation is another thing that increases the likelihood of a participant closing the tab before the end.
I’ve been running a consultation since just before I posted the 2022 report, and there have been 1,339 responses, which is pretty great - thank you all very much for your participation!
This blog post will go over the hypothetical checkbox list under the established and the proposed selection systems, combined with (and informed by) the results from the consultation survey. We’ll start with an overview of the list under both systems, and then I’ll go into more detail on how each list was constructed and why.
~
THE ESTABLISHED SYSTEM
Under the current system, the checkbox list will look like this next year:
agender
androgyne
autigender
bigender
binary
boy
boygirl
butch
cisgender
demiboy
demigender
demigirl
dyke
enby
fag
faggot
feminine
femme
gay (in relation to gender)
gender non-conforming
genderfluid/fluid gender
genderflux
genderfuck
genderless
genderqueer
gendervoid
girl
girlboy
guy
lesbian (in relation to gender)
man
masculine
neutral
nonbinary
queer (in relation to gender)
questioning or unknown
trans
trans*
transfeminine
transgender
transmasculine
transsexual
woman
xenogender
none / I do not describe myself / person / "I'm just me"
That’s 45 checkboxes. Terms in bold would be new for 2023. In the survey they would be presented in a randomised order, to reduce primacy and recency bias. (Unfortunately, for bias reduction reasons I cannot sort or categorise terms in the survey in any way.)
There would also be up to 20 textboxes directly underneath where participants can type their identities that were not listed as checkboxes.
~
THE PROPOSED SYSTEM
Here’s what the checkbox list would look like next year under the proposed system:
agender
binary
cisgender
dyke
enby
fag
gender non-conforming
genderfluid/fluid gender
genderqueer
man
nonbinary
queer (in relation to gender)
questioning or unknown
trans
transfeminine
transgender
transmasculine
woman
none / I do not describe myself / "I'm just me"
That’s 19 checkboxes. Terms in bold would be new for 2023. In the survey they would be presented in a randomised order, to reduce primacy and recency bias. (Unfortunately, for bias reduction reasons I cannot sort or categorise terms in the survey in any way.)
There would also be up to 20 textboxes directly underneath where participants can type their identities that were not listed as checkboxes.
~
HOW THE ESTABLISHED SYSTEM WORKS
The 1%/3% thresholds
Currently, if a word/term is typed into a textbox by over 1% of respondents in either the 30-and-under or the 31-and-over age group, I add it to the checkbox list.
The age group thing is to make sure that over-30s, who usually only make up about 14% of responses, see words in the checkbox list that they can relate to - so that they don’t look at the survey, see a bunch of stuff that feels alienating, and then immediately check out and close the tab.
If I add a word that has a very clearly corresponding word, that is added to the list too for completeness and comparison, even if it wasn’t entered by over 1% of participants. (For example, if I add transmasculine I have to add transfeminine, or if I add transgender I have to add cisgender.)
When the list started to get a bit too long, I added another rule that lets me remove words from the checkbox list when they become less popular. Words selected by under 3% of both 30-and-unders and 31-and-overs should be removed from the checkbox list, unless they correspond with another more popular word that is remaining on the list.
The removal threshold is higher than the addition threshold, because when a word is a checkbox it is selected more often than it would have been entered into a textbox. Now that I have more data on The Checkbox Effect, I’m aware that the 3% threshold is too low and would need to be higher.
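As a rough sketch only (this is not the actual process used for the survey), the add and remove rules above could be written in Python like this. The function and parameter names are made up for illustration, and the "corresponding word" rule is reduced to a simple flag.

```python
ADD_THRESHOLD = 1.0     # percent of an age group typing the term into a textbox
REMOVE_THRESHOLD = 3.0  # percent of an age group ticking the checkbox


def should_add(typed_pct_30_under, typed_pct_31_over):
    """Add a write-in if over 1% of EITHER age group typed it in."""
    return typed_pct_30_under > ADD_THRESHOLD or typed_pct_31_over > ADD_THRESHOLD


def should_remove(ticked_pct_30_under, ticked_pct_31_over, has_popular_counterpart=False):
    """Remove a checkbox if under 3% of BOTH age groups ticked it,
    unless it corresponds with a more popular word staying on the list."""
    if has_popular_counterpart:
        return False
    return (ticked_pct_30_under < REMOVE_THRESHOLD
            and ticked_pct_31_over < REMOVE_THRESHOLD)
```

Note the asymmetry: one age group is enough to add a word, but both age groups have to drop below the line before a word is removed.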
~
HOW THE PROPOSED SYSTEM WORKS
Multiply and rank
Now that 12 checkboxes have been added over the years and 2 have been removed, we are in a position to observe how much more often a term is selected as a checkbox than it would have been written in.
I call this the Checkbox Effect, which is the tendency for a term to appear to become suddenly more popular when it gets added to the checkbox list. I speculate that it might be because a checkbox reminds people of a term that they do relate to, but that they might not have independently thought of as a term to write into a textbox. (Also, checking a box is easier than typing something.)
The best way to show you what I mean by this is to show you on a spreadsheet:
(I’d show you the numbers for the checkboxes that have been removed, but there’s only two and they’re not suitable for use in this exercise - one is 36.7x and one is 6.5x!)
So just to make sure we’re all on the same page:
Right at the top, gender non-conforming was typed into the textboxes by 1.1% of participants in 2018, and after it was added to the checkbox list it was chosen by 24.3 times as many people in 2019.
Right at the bottom, demigender was typed in by 4.1% of participants in 2015, and when it was added to the checkbox list in 2016 it was chosen by 14.8% of participants, which was 3.6 times as many people.
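For anyone who likes to see the arithmetic, here's a minimal Python sketch of how one of these ratios is worked out, using the demigender percentages quoted above. The function name is just an illustrative choice.

```python
def checkbox_effect(typed_pct, ticked_pct):
    """How many times more often a term was ticked as a checkbox
    than it had been typed into a textbox the year before."""
    return ticked_pct / typed_pct


# demigender: typed in by 4.1% in 2015, ticked by 14.8% in 2016
demigender = checkbox_effect(typed_pct=4.1, ticked_pct=14.8)
print(round(demigender, 1))  # 3.6
```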
That’s quite a wide range, but let’s use the median of 8.9 as the multiplier. (I often find that with sets like this, which have a couple of very high or very low outliers that would skew a mean, the median is more accurate.) The plan would go something like this:
Choose an optimal number of checkboxes to have in the list. Enough that everyone can choose at least one, but not so many that it gets hard to find your identity/identities on there, and not so many that it gets exhausting to scroll through and make a decision about each one (which is a thing that some participants do like to do). For the purposes of this explanation, let’s say the optimal number of checkboxes is 🥑 (an avocado emoji).
Use the multiplier (8.9x). Assume that a textbox answer would be chosen 8.9 times as often if it was a checkbox. Multiply all write-ins by 8.9. So if a term is typed in 10 times, its total (for the purposes of this exercise only) becomes 10 x 8.9 = 89.
Combine the checkbox terms (and their statistics) with the textbox terms (and their multiplied statistics), and rank them by [popularity/assumed popularity] descending, i.e. with the most popular at the top of the list.
Choose the top 🥑 terms to be checkboxes in next year’s survey.
So, we simulate how popular a write-in would be, in order to compare it to existing checkboxes and decide whether or not it should be on the list.
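The steps above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the term names and counts are placeholders (not survey data), the function name is invented, and I've assumed that a term which is already a checkbox keeps its checkbox figure rather than being multiplied.

```python
MULTIPLIER = 8.9  # the median Checkbox Effect quoted in the post


def rank_terms(checkbox_counts, textbox_counts, list_size, multiplier=MULTIPLIER):
    """Return the top `list_size` terms by actual or assumed popularity."""
    scores = dict(checkbox_counts)
    for term, typed in textbox_counts.items():
        # Assumption: existing checkbox terms keep their checkbox count;
        # only terms that are not yet checkboxes get the multiplier.
        scores.setdefault(term, typed * multiplier)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:list_size]


# Placeholder counts, for illustration only
checkboxes = {"checkbox_a": 900, "checkbox_b": 600, "checkbox_c": 70}
write_ins = {"write_in_x": 10, "write_in_y": 5}

print(rank_terms(checkboxes, write_ins, list_size=3))
# ['checkbox_a', 'checkbox_b', 'write_in_x']
```

Here write_in_x was typed 10 times, so its assumed total is about 89, which is enough to push checkbox_c (70) off a three-item list.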
When we multiply a textbox term in this way, we are guessing at how popular it would be. (For this reason, we can’t use this guesswork in the reports or the spreadsheets of results; this method would only be used to work out which terms will be checkbox options.) As we saw in the above table, the multiplier isn’t an exact, fixed number - it’s kind of a judgement call. My choice depends on a lot of factors that we can’t really see properly. What causes some words to become only 3x more “popular” and some words to become 10x more popular, or more? That means that when we choose a multiplier, we have to take into account that:
If we choose a higher multiplier, more of the uncommon words get a chance to be added to the checkbox list, which gives more different people a chance to be and feel represented by the survey - but there might be a lot of different words being added and removed year-to-year, and more of the popular and well-established words might be removed from the list than should be;
If we play it safe and choose a lower multiplier, the list would be more stable from year to year, but some textbox terms that are genuinely becoming more popular than checkbox words might appear to be less popular.
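To make that trade-off concrete, here's a small hedged sketch in Python that builds the same three-item list with a 3x and a 10x multiplier. All names and counts are placeholders invented for the example, not survey results.

```python
def top_terms(checkbox_counts, textbox_counts, multiplier, list_size):
    """Rank checkbox counts against multiplied write-in counts."""
    scores = dict(checkbox_counts)
    scores.update({term: n * multiplier for term, n in textbox_counts.items()})
    return sorted(scores, key=scores.get, reverse=True)[:list_size]


# Placeholder counts, for illustration only
checkboxes = {"established_a": 500, "established_b": 120, "established_c": 70}
write_ins = {"write_in_x": 20, "write_in_y": 10}

low = top_terms(checkboxes, write_ins, multiplier=3.0, list_size=3)
high = top_terms(checkboxes, write_ins, multiplier=10.0, list_size=3)

print(low)   # ['established_a', 'established_b', 'established_c']
print(high)  # ['established_a', 'write_in_x', 'established_b']
```

With the lower multiplier nothing changes; with the higher one a write-in jumps onto the list and an established word drops off.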
So, as I said above, the median Checkbox Effect is 8.9x. We don’t have any experience of using this method to go on, so we might as well use that for now. (We can adjust it in response to new data etc. in future years, which is reassuring.)
Next let’s choose 🥑 - the ideal number of checkbox items in the list.
In the consultation I asked people what a comfortable number of checkboxes would be for them. Here’s the results:
That’s 29.1% saying 11-20 checkboxes, followed by 25.2% saying 21-30 checkboxes. So I’d like to keep it to between 11 and 20 checkboxes, and within that margin I’d feel more comfortable aiming higher. Let’s say 🥑 = 20 next year and see how it goes.
I’m guessing you don’t need to know the exact ins-and-outs, but if I choose the same number of words from the top 10 of each of the two age groups (12), combine them (13 unique terms), and factor in the terms that have to be there no matter what (2) and the “opposites” that I would have to add automatically (4), the list is 19 items long and looks like this when sorted alphabetically:
agender
binary
cisgender
dyke
enby
fag
gender non-conforming
genderfluid/fluid gender
genderqueer
man
nonbinary
queer (in relation to gender)
questioning or unknown
trans
transfeminine
transgender
transmasculine
woman
none / I do not describe myself / "I'm just me"
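If you do want a feel for the ins-and-outs, here is a minimal Python sketch of that kind of construction: take the top picks from each age group, merge them, then add the terms that always stay and any corresponding words. The rankings below are illustrative orderings only (not the survey's), the function name is invented, and treating "questioning or unknown" as an always-included term is an assumption for the example.

```python
def build_list(ranked_30_under, ranked_31_over, per_group, always_include, counterparts):
    """Union of the top picks per age group, plus fixed terms and counterpart words."""
    chosen = set(ranked_30_under[:per_group]) | set(ranked_31_over[:per_group])
    chosen |= set(always_include)
    # Add the corresponding word for anything already chosen
    for term in list(chosen):
        if term in counterparts:
            chosen.add(counterparts[term])
    return sorted(chosen)


# Illustrative rankings only, most popular first
under_30 = ["nonbinary", "transmasculine", "queer"]
over_30 = ["nonbinary", "transgender", "genderqueer"]

# Pairs taken from the examples given earlier in the post
pairs = {"transmasculine": "transfeminine", "transgender": "cisgender"}

result = build_list(under_30, over_30, per_group=2,
                    always_include=["questioning or unknown"],
                    counterparts=pairs)
print(result)
```

Because the two age groups overlap, the merged list ends up shorter than twice the per-group number, which is why 12 picks per group can collapse to far fewer unique terms.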
Dyke and fag are new. If the multiplier we chose is too high, those two words will be less popular than expected next year, and they might fall back off the checkbox list for 2024.
I feel like most people could probably choose at least one of the listed identities, but if they can’t, they will have 20 textboxes directly underneath.
~
ADDRESSING CONCERNS ABOUT THE PROPOSED SYSTEM
Some participants expressed concern that some established and popular words would leave the checkbox list. The following words from the current checkbox list would be lost:
androgyne
bigender
boy
butch
demiboy
demigender
demigirl
femme
gay (in relation to gender)
genderflux
genderless
gendervoid
girl
lesbian (in relation to gender)
neutral
trans*
When I say they would be “lost”, that makes it seem pretty negative, but participants would still be able to type those words into the textboxes, and those textbox entries would be counted, receive the multiplier treatment, and then be considered for inclusion in the checkbox list the following year.
Some people also said that they would feel unhappy about me “moving the goalposts”, by saying this year that a word would be added to the checkbox list if it went over 1%, and then introducing new criteria after some new words meet the old criteria - so some words that should have been on the list next year will not be. For the record, the eight words that went over 1% this year in one of the two age groups and that would not be added to the checkbox list under the proposed system are:
autigender
boygirl
faggot
genderfuck
guy
masculine
transsexual
xenogender
I can definitely understand there being some annoyance about this! But data quality is already suffering with 33 checkbox options, and I’m more than a little bit worried about what would happen if that list went up to 45.
And the thing about annoyance is, I pretty much annoy people every year, even when I stick to the rules I made and am very clear and transparent about them. For example, I constantly get emails and comments in the feedback box saying that I should add various terms to the checkbox list because they are marginalised or otherwise very deserving of increased visibility, even though they have never been typed in by anywhere near 1% of participants, and sometimes are not even related to gender. (Two-spirit, intersex, and plural spring to mind.) And, over the years, people have consistently asked me to remove terms from the identity checkbox list and the pronoun list because they’re (sometimes, in some contexts, to some people) offensive, even when those terms/pronouns are very popular. (Mainly queer, trans* [no footnote], and it/it pronouns.)
So there’s two relevant things I factor into my decisions here:
1. The checkbox list is not a winner’s podium or an attempt at correctness (political or otherwise); it exists to improve data quality and to make the survey easier to fill in for the majority of participants. The bar for inclusion on the checkbox list has been incredibly low thus far (1% in only one of the two age groups), and if your word doesn’t meet it, the checkbox is unlikely to have much effect aside from confusing people. While your identity terms are valid and do deserve visibility and recognition, a checkbox in a survey is intended to observe that visibility/recognition, not create it.
2. While the Checkbox Effect undoubtedly exists and does make some terms appear more or less popular than they actually are (and I don’t think there is anything I can do about that short of ditching checkboxes and doing only textboxes) (no I will not do that), if it didn’t exist I’m pretty sure the order of the top of the list wouldn’t actually change that much. Words that are typed into the textboxes are counted and any word can be looked up in the results spreadsheet, even if it has been entered only once.
A lot of people said that they don’t like the checkbox list being limited at all, and every identity should be on there. I’d refer those people to point 1 above, and I would also suggest that perhaps you don’t want all 14,622 unique textbox entries on the checkbox list. You probably don’t even want the 2,665 entries that were typed in more than once.
A few people said that they really like poring over the checkbox list and considering whether each identity applies to them. That’s understandable, speaking as a person who spends two months of the year poring over lists of genders literally thousands long! However, if the survey is enjoyable that’s a happy side effect. The survey’s main goal is to collect very specific information from as many people as possible, as easily/quickly/efficiently as possible.
A bunch of people suggested a fancy question/answer design where there’s a textbox, and you start typing a word, and if the word is a checkbox it makes you check the box, and if the word isn’t a checkbox it lets you enter it as a textbox answer instead. This sounds extremely fancy and I love it, but:
It would be more complicated to code;
People have a filter for that question already and they’re either not seeing it or not using it, because they’re still typing checkbox terms into textboxes.
I feel that the solution here isn’t more fancy/complex design, which also carries more risk of going wrong on different platforms and all that jazz. I know it’s a little cliché, but in design it’s often true that less is more.
A couple of people expressed concern that it might be confusing if the list of identities changed a lot from year to year, and while I think that might be true for some people, at least 71% of participants this year had never done the annual survey before, and I suspect that might be fairly typical. Of those who are familiar with the annual survey and read all the reports, I’m sure a lot of them won’t have memorised last year’s list. And I don’t think it would affect whether or not people can answer that question in the survey, which is the important thing.
Several people said that they would only support it if the multiplier could be adjusted as appropriate with new data each year, which is absolutely something that I would want to do. Great minds, great minds.
And finally, it is inevitable that if the list gets shorter more people will have to type their identities. I’m hoping that fewer people typing in words that are already on the checkbox list will make up for this.
~
MY DECISION
I think I’ve got to go for it and implement the proposed system.
As for whether I should implement it for 2023 or wait until 2024, before I processed the textbox identities I was still on the fence - but after finding out that the list would grow to 45 checkboxes next year under the current system... I didn’t fall off the fence, I was shoved off it very enthusiastically.
Thankfully, most people are in favour of the system (only 5% against), and about two-thirds are either in favour or very in favour of it being implemented immediately.
~
CONCLUSIONS
Yeah, I think I have to do it, and I think I have to do it right away! I’ll use this system for the identity checkboxes only for 2023, and if it works well I’ll consider trying it for titles and pronouns for 2024.
Also, I’m really glad I did the consultation. I feel a lot better knowing how you all feel about it in advance, and people suggested a lot of potential challenges, which gave me a chance to consider all the pros and cons before making a decision.
And, as a couple of people pointed out in the consultation, I can try it for one year, and if it doesn’t work I can switch back to the old system or try something new.
Thank you everyone for your input! And, in case you missed it, you can read the 2022 report here. You can also support this project on Patreon here, and you can be notified when the 2023 survey opens by signing up to the mailing list here.