globalmapping
Mapping Globalization
18 posts
Globalization is a system of flows: of people, money, goods, viruses, ideas, etc. To understand the system, it's best to start by mapping: rendering the complex interactions between the system’s various parts. Though the speed and density of contemporary flows may be unique, our age is not the first instance of globalization. Charting a course across centuries, we offer these maps not to fully encompass globalization, but as a means of sparking conversation about what the maps tell us. -Manish Nag and Miguel Centeno, Princeton University
globalmapping · 11 years ago
Text
A Lot More to Say About Languages Spoken in the U.S.
By Manish Nag
The answer to "What language does your state speak at home?" has become a subject of intense discussion, as Ben Blatt's original article in Slate.com and the responses from Karthick Ramakrishnan at AapiData.org and Peter Frase at Jacobin Magazine attest. At Global Mapping we decided to see whether there was anything more to add on the subject, so we dug into the U.S. Census Bureau's American Community Survey (ACS) ourselves. As it turns out, there is still more to say. Using the ACS 2008-2012 5-year estimates on languages spoken at home (Census Table B16001), here's what we found:
Language aggregation still matters.
Ramakrishnan questions Blatt's labeling of Tagalog as the third most spoken language in California. Tagalog's high rank is based on data from a 2006-08 3-year ACS file that tabulates individual Chinese languages as Cantonese, Mandarin, or Chinese based on write-in survey responses. Once these languages are aggregated under the category Chinese (as they are in the 2008-12 table), Chinese becomes the most spoken language in the state after English and Spanish.
Following this logic, we wondered what would happen if Hindi, Gujarati, and Urdu were similarly aggregated under the existing category of Indic languages (those spoken in Northern India, Pakistan, Nepal and partly in Sri Lanka). The U.S. Census Bureau may consider Hindi, Urdu, and Gujarati separately in its official capacity to help inform voter assistance decisions in particular districts, but a reasonable argument can be made for aggregating Indic languages on linguistic grounds. Hindi and Gujarati share similar words and alphabets, while Hindi and Urdu share many common words though not an alphabet. The three languages are also spoken in close proximity to one another in Northern India and Pakistan.
After aggregating Indic languages, the impact on third languages (the most spoken language in a state other than English and Spanish) is evident in 3 Eastern states (New Jersey, Pennsylvania, Delaware) and 6 Southern states (Texas, Virginia, North Carolina, Tennessee, Georgia, Alabama). In the AapiData maps on third languages, NJ, PA, DE and NC were all labeled Chinese, TX was listed as Vietnamese, VA as Korean, and TN and AL as German.
Tumblr media
After aggregation, Indic languages also become the most used non-European language throughout the eastern half of the U.S. outside the New England region.
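The aggregation step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind our maps: the category labels are written from memory of Table B16001, the helper names `aggregate` and `third_language` are hypothetical, and the counts are made-up placeholders rather than ACS figures.

```python
from collections import defaultdict

# Assumed category labels; check them against the actual B16001 column names.
INDIC_GROUP = {"Hindi", "Gujarati", "Urdu", "Other Indic languages"}


def aggregate(counts, members, label):
    """Fold every category listed in `members` into one category named `label`."""
    merged = defaultdict(int)
    for language, speakers in counts.items():
        merged[label if language in members else language] += speakers
    return dict(merged)


def third_language(counts, excluded=("English only", "Spanish")):
    """Return the most spoken category once English and Spanish are set aside."""
    candidates = {k: v for k, v in counts.items() if k not in excluded}
    return max(candidates, key=candidates.get)


# Placeholder counts for an imaginary state (NOT real ACS data).
example_state = {
    "English only": 5000,
    "Spanish": 900,
    "Chinese": 100,
    "Hindi": 60,
    "Gujarati": 50,
    "Urdu": 20,
    "Other Indic languages": 10,
}

before = third_language(example_state)
after = third_language(aggregate(example_state, INDIC_GROUP, "Indic languages"))
print(before, "->", after)
```

With these placeholder numbers no single Indic language outranks Chinese, but the aggregated category does, which is the kind of flip we see in the nine states listed above.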
Tumblr media
Absolute size matters.
Having just explained how to improve the maps that AapiData and Slate.com have created regarding third languages, we will now discuss why focusing on the numbers of non-English or non-Spanish speakers might be misleading in the larger context. Here is a bar chart showing population counts in the US as a whole for all language categories in the 2012 ACS 5-year survey, with Indic languages aggregated as before.
Tumblr media
The numbers of English-only and Spanish speakers are orders of magnitude larger than the numbers of speakers in the remaining 38 language categories. On this chart, the only other bars that can just barely be made out are Chinese, Indic languages, Tagalog, Vietnamese and French. No other language registers at all. Though a lot of internet ink is being spilled discussing language diversity, in the big picture the differences among the remaining languages are hardly noticeable once English and Spanish speakers are included on the same graph. To make sense of the sizes of all language populations at once, we need to redraw the graph on a logarithmic scale.
Tumblr media
[On a technical note, using the logarithmic scale allows for compressing counts ranging from a few hundred to a few hundred million on one chart. However, some finesse is needed in reading these charts, as the size of the units changes. For example, for bars that fall between 100K and 1 million, each white vertical line over a bar indicates an additional 100K in size. Similarly, for bars between 1 million and 10 million, each white vertical line represents an additional 1 million in value.]
Looking at the logarithmic chart, one striking observation is how close the estimated numbers of speakers are for languages other than Spanish and English. Chinese languages and Indic languages cluster around 2-3 million speakers each, while the next 10 languages from the bottom all cluster around 1 million. This provides an interesting picture of language speakers in the U.S. Aside from the spikes in the distribution for English and Spanish, the distribution of the remaining languages is reasonably flat.
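For readers who want to check the reading rules in the technical note, here is a minimal Python sketch of how a logarithmic axis behaves. The function names, the axis limits (100 to 1 billion) and the 70-character bar width are all assumptions made for illustration; they are not taken from the charts in this post.

```python
import math


def gridline_step(count):
    """Size of one minor gridline for a bar of this height on a log axis.

    A bar between 100K and 1 million gets a line every 100K; a bar between
    1 million and 10 million gets a line every 1 million, and so on.
    """
    exponent = len(str(int(count))) - 1  # number of digits minus one
    return 10 ** exponent


def bar_length(count, axis_min=100, axis_max=1_000_000_000, width=70):
    """Length of a bar, in characters, when drawn on a log10 axis."""
    span = math.log10(axis_max) - math.log10(axis_min)
    fraction = (math.log10(count) - math.log10(axis_min)) / span
    return round(fraction * width)


# Placeholder magnitudes only (not ACS figures).
for count in (300, 250_000, 3_000_000, 200_000_000):
    print(f"{count:>12,} | {'#' * bar_length(count)} (gridline every {gridline_step(count):,})")
```

Because bar length grows with the logarithm of the count, each tenfold increase adds the same length, which is why a few hundred and a few hundred million can share one chart.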
Why not view more data at once?
While we understand that Slate.com created simple, one-dimensional language maps to enhance readability and virality, something is lost when you don't show more data to readers. With the maps from Slate and AapiData, you have no way to make sense of the language variation within and across state boundaries. Although more visually complex, a map that shows more data lets many more insights be gleaned at once. Rather than feeding a story piecemeal to readers, we wanted to provide the tools for them to make their own discoveries. Following this logic, here is a map that shows the top 6 languages for each state and their magnitudes at logarithmic scale. The map is split into three pieces for this post, but the image is also available as a single map.
Tumblr media Tumblr media Tumblr media
The maps provide a way to see the data points related to our previous discussion about the importance of aggregating Indic languages in states east of the Mississippi River. You can also make interesting observations across state borders. The large number of Spanish speakers in California and Texas, over 10 million in each state, is readily apparent, as is the fact that the number of Spanish speakers in either of these states dwarfs the number of English-only speakers in any of the remaining Western states.
Within California, you can also compare the sizes of Chinese and Tagalog, the two languages at the heart of the debate between Slate and AapiData.org. At logarithmic scale, it's clear that the relative sizes of the Chinese and Tagalog populations are quite close (~25K difference). Similarly, discussions about whether Indic or Chinese languages are larger on the East Coast should be balanced by understanding that Indic and Chinese speaking populations track each other closely in size. Discussion of spoken languages must therefore be taken in the proper context, and being able to compare magnitudes simultaneously within and across states gives us this context.
At the same time, you can see the national significance of English and Spanish. Contrary to Ben Blatt's findings using 2006-08 ACS data, both New Hampshire and Louisiana have seen the decline of French and the rise of Spanish to the second language spot in their states. A 'second languages' map using our data reinforces this point.
Tumblr media
Margins of error matter.
Since ACS population estimates are calculated using samples, the U.S. Census Bureau always provides a margin of error along with any population estimate. The importance of margins of error came up while trying to choose the right data set for our maps. We initially wanted to use the 1-year estimates for 2012, as they would give us the most up-to-date results. However, table B16001 for the 2012 1-year data was missing for 10 of the 50 states. After digging into this further, it turns out that the ACS's data-quality filtering rules most likely blocked the release of this table for these states because the margins of error were too large for the estimates to be reliable. Even with the 3-year file that pools three consecutive years of data to produce average results for 2010-2012, one state was missing due to similar data-quality filtering. We therefore went to the 5-year file for 2012, which has a larger sample size and thus smaller margins of error. In addition, the ACS does not apply data-quality filtering to 5-year data products.
Paying attention to margin of error was something Peter Frase emphasized in his Jacobin magazine piece on Slate's language maps. I followed up on Frase's discussion of Native American language counts by looking at the top Native American language for each state and checking whether its estimate, less the margin of error, still exceeds zero. What I found was that 9 states had a margin of error that was greater than the estimate for the most-spoken Native American language (HI, DE, MA, MD, NH, NJ, PA, RI, VT). Not only were the counts of Native American language speakers tiny in these states (most around 100), we can't be confident that the number of speakers of the top Native American language on Blatt's map exceeds zero at all in these states. For rare population characteristics, such as speakers of Native American languages in these 9 states, the uncertainty inherent in sampling makes it difficult to produce reliable estimates.
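The zero check described above is simple enough to sketch in Python. This is a rough illustration, not the script used for this post: the function names are hypothetical, the state labels and numbers are placeholders rather than ACS figures, and it assumes the published margin of error is read as a plus-or-minus range around the estimate (the ACS publishes its margins at the 90 percent confidence level).

```python
def lower_bound(estimate, margin_of_error):
    """Low end of the range implied by an estimate and its margin of error."""
    return estimate - margin_of_error


def cannot_rule_out_zero(estimate, margin_of_error):
    """True when the range around the estimate reaches zero or below."""
    return lower_bound(estimate, margin_of_error) <= 0


def flag_unreliable(estimates):
    """Return the keys whose estimate is no larger than its margin of error.

    `estimates` maps a label to an (estimate, margin_of_error) pair.
    """
    return sorted(
        label
        for label, (estimate, moe) in estimates.items()
        if cannot_rule_out_zero(estimate, moe)
    )


# Placeholder values for two imaginary states (NOT real ACS data).
example = {
    "State A": (100, 130),   # margin of error larger than the estimate
    "State B": (2500, 400),  # estimate comfortably above its margin of error
}
print(flag_unreliable(example))
```

Any state flagged this way is one where the map label rests on an estimate that the data cannot distinguish from zero.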
Not displaying this uncertainty can thus be misleading, and this is why we include margin of error in any bar graphs using sample data, even when error bars are often indiscernible at log scale.
Ben Blatt's article sparked an engaged discussion not just on language, but also on data, how to visualize it and how to interpret it. These are all things we are passionate about at Global Mapping, so we're grateful to have the opportunity to contribute our perspectives.
Manish Nag is a Doctoral Candidate in Sociology at Princeton University. He can be reached at [email protected]. The author wishes to thank Erik Vickstrom at the U.S. Census Bureau for his help with the American Community Survey data. All mistakes are the author's alone.
globalmapping · 11 years ago
Photo
Tumblr media Tumblr media Tumblr media
Coal consumption, production and reserves. Source: eia.gov
globalmapping · 11 years ago
Photo
Tumblr media
The Umayyad Caliphate's empire at its largest extent (750 A.D.), from Spain to China. Source: William Shepherd's (1923) Historical Atlas.
globalmapping · 11 years ago
Photo
Tumblr media
Rise and Fall of the Ottoman Empire, 1300-1900 A.D.
globalmapping · 11 years ago
Text
Rise and Fall of the Byzantine Empire 476-1400 AD
globalmapping · 11 years ago
Photo
Tumblr media Tumblr media Tumblr media Tumblr media
US Military Deployments 2012
globalmapping · 11 years ago
Photo
Tumblr media Tumblr media Tumblr media Tumblr media
Global meat traffic, 2009: Beef, Poultry, Pork, Fisheries
globalmapping · 11 years ago
Photo
Tumblr media Tumblr media Tumblr media
Global Oil: Supply, Demand, and Reserves for 2011. Source: eia.gov.
globalmapping · 11 years ago
Photo
Tumblr media Tumblr media Tumblr media
American fast food around the world.
globalmapping · 11 years ago
Photo
Tumblr media Tumblr media Tumblr media
Global Grain: Trade Flows 2009
globalmapping · 11 years ago
Photo
Tumblr media
World map shows Air Postal Routes: internal postal routes, international postal routes, and airports. Smaller maps showing more detailed information of different regions are framed separately. Map was edited by the International Office of the Universal Postal Union. Source
globalmapping · 11 years ago
Photo
Tumblr media
Global air traffic, 2009-2012
Data Source
globalmapping · 11 years ago
Photo
Tumblr media
2008 Global Shipping Traffic.
Data Source
globalmapping · 11 years ago
Photo
Tumblr media
Distribution and Limits of Cultivation of the Principal Plants Useful to Mankind, 1865 Source
globalmapping · 11 years ago
Photo
Tumblr media
Trade routes and distances by existing lines, and by the Panama Canal, 1912
globalmapping · 11 years ago
Photo
Tumblr media
Exports and Imports of Corn (Maize), Oats and Barley in 1925
globalmapping · 11 years ago
Photo
Tumblr media
Steamship routes, circa 1900