Text
“Why do Greek, Czech, Hungarian, and Swedish, with their 8 to 13 million speakers, have Google Translate support and robust Wikipedia presences, while languages the same size or larger, like Bhojpuri (51 million), Fula (24 million), Sylheti (11 million), Quechua (9 million), and Kirundi (9 million) languish in technological obscurity? Swedish, Greek, Hungarian, and Czech have a wealth of language resources, created one human at a time over centuries. They’re the languages of entire nation-states, with national TV and radio recordings that can be used as the foundation for text-to-speech models. Their speakers have the kind of disposable income that makes media companies translate popular novels and subtitle foreign movies and TV shows. They’re found in countries that tech companies imagine their customers might be living in or might at least visit on holiday, meaning it’s worth localizing interfaces and adding them as translation options. They have regularized spelling systems and dictionaries that can be rolled into spellcheckers and predictive text models. They have highly literate speakers with internet access who can contribute to projects like Wikipedia. (Speakers who can even, in the case of Swedish, create a bot to automatically make basic Wikipedia articles for rivers, mountains, and other natural features.) Language resources don’t just appear. People have to decide to create them, and those people need to be fed and watered and educated and housed and supported, whether that’s by governments or by companies or by the kind of personal wealth that lets individuals take on time-consuming intellectual hobbies. Creating parallel corpora and other language resources takes years, if it happens at all, and costs tens of millions of dollars per language.”
— Gretchen McCulloch, The widely-spoken languages we still can’t translate online. (My latest article as Wired’s Resident Linguist.)
Video
8 year old me watching at the theater for the first time:
25 year old me on my couch watching on my laptop for 1000th time:
Photo
Your arms are bigger than mine. Wow!

2018.02.16 - Hittin the Gymbo