Additional letters
This is a compilation of additional letters in the main scripts of the world.
Arabic script
The basic Arabic abjad has 28 letters: ا ب ت ث ج ح خ د ذ ر ز س ش ص ض ط ظ ع غ ف ق ك ل م ن ه و ي. Some languages have adapted it by including additional letters:
پ: Arabic, Balochi, Kashmiri, Khowar, Kurdish, Pashto, Persian, Punjabi, Sindhi, Urdu, Uyghur
ٻ: Saraiki, Sindhi
ڀ: Sindhi
ٺ: Sindhi
ٽ: Sindhi
ٿ: Rajasthani, Sindhi
ٹ: Kashmiri, Punjabi, Urdu
ټ: Pashto
چ: Kashmiri, Kurdish, Pashto, Persian, Punjabi, Urdu
څ: Pashto
ځ: Pashto
ڊ: Saraiki
ډ: Pashto
ڌ: Sindhi
ڈ: Kashmiri, Punjabi, Urdu
ݙ: Saraiki
ڕ: Kurdish
ړ: Ormuri, Torwali
ژ: Kurdish, Pashto, Persian, Punjabi, Urdu, Uyghur
ڑ: Punjabi, Urdu
ږ: Pashto
ݭ: Gawri, Ormuri
ݜ: Shina
ښ: Pashto
ڜ: Moroccan Arabic
ڠ: Malay
ڥ: Algerian Arabic, Tunisian Arabic
ڤ: Kurdish, Malay
ڨ: Algerian Arabic, Tunisian Arabic
ک: Sindhi
ݢ: Malay
گ: Pashto, Kurdish, Kyrgyz, Mesopotamian Arabic, Persian, Punjabi, Urdu, Uyghur
ګ: Pashto
ڱ: Sindhi
ڳ: Saraiki, Sindhi
ڪ: Sindhi
ڬ: Malay
ڭ: Algerian Arabic, Kyrgyz, Moroccan Arabic, Uyghur
ڵ: Kurdish
لؕ: Punjabi
ݪ: Gawri, Marwari
ڽ: Malay
ڻ: Sindhi
ݨ: Punjabi, Saraiki
ڼ: Pashto
ۏ: Malay
ۋ: Kyrgyz, Uyghur
ۆ: Kurdish, Uyghur
ۇ: Kyrgyz, Uyghur
ۅ: Kyrgyz
ی: Pashto
ې: Pashto, Uyghur
ىٓ: Saraiki
ێ: Kurdish
ۍ: Pashto
ئ: Pashto, Punjabi, Saraiki, Urdu
ھ: Kurdish, Punjabi, Urdu, Uyghur
ے: Punjabi, Urdu
Cyrillic script
The basic Cyrillic alphabet includes 29 letters: А а Б б В в Г г Д д Е е Ж ж З з И и Й й К к Л л М м Н н О о П п Р р С с Т т У у Ф ф Х х Ц ц Ч ч Ш ш Щ щ Ь ь Ю ю Я я. Most languages use additional letters:
Ӕ ӕ: Ossetian
Ӓ ӓ: Hill Mari, Kildin Sámi
Ӑ ӑ: Chuvash
Ґ ґ: Belarusian, Rusyn, Ukrainian
Ӷ ӷ: Abkhaz
Ѓ ѓ: Macedonian
Г' г': Kurdish
Гъ гъ: Avar, Ossetian
Гь гь: Avar
Гӏ гӏ: Avar
Ғ ғ: Azerbaijani, Bashkir, Tajik, Uzbek
Дә дә: Abkhaz
Дж дж: Bulgarian, Ossetian
Дз дз: Bulgarian, Ossetian
Ђ ђ: Montenegrin, Serbian
Ѕ ѕ: Macedonian
Ҙ ҙ: Bashkir
Є є: Rusyn, Ukrainian
Ә ә: Abkhaz, Azerbaijani, Bashkir, Dungan, Kalmyk, Kurdish, Tatar
Ә' ә': Kurdish
Ё ё: Azerbaijani, Bashkir, Buryat, Chuvash, Dungan, Hill Mari, Khalkha, Kildin Sámi, Komi-Permyak, Kyrgyz, Meadow Mari, Ossetian, Russian, Rusyn, Tajik, Tatar, Ukrainian, Uzbek
Ӗ ӗ: Chuvash
Ӂ ӂ: Moldovan
Җ җ: Dungan, Kalmyk
Жә жә: Abkhaz
З́ з́: Montenegrin
Ӡ ӡ: Abkhaz
Ӡә ӡә: Abkhaz
І і: Avar, Belarusian, Rusyn, Ukrainian
Ї ї: Rusyn, Ukrainian
Ӣ ӣ: Tajik
Ҋ ҋ: Kildin Sámi
Ј ј: Azerbaijani, Kildin Sámi, Macedonian, Montenegrin, Serbian
Ҝ ҝ: Azerbaijani
Қ қ: Abkhaz, Tajik, Uzbek
Қь қь: Abkhaz
Ҡ ҡ: Bashkir
Ҟ ҟ: Abkhaz
Ҟь ҟь: Abkhaz
Ќ ќ: Macedonian
К' к': Kurdish
Къ къ: Avar, Ossetian
Кь кь: Abkhaz, Avar
Кӏ кӏ: Avar
Кӏкӏ кӏкӏ: Avar
Кк кк: Avar
Ӆ ӆ: Kildin Sámi
Љ љ: Macedonian, Montenegrin, Serbian
Ӎ ӎ: Kildin Sámi
Ң ң: Bashkir, Dungan, Kalmyk, Kildin Sámi, Kyrgyz, Tatar
Ҥ ҥ: Meadow Mari
Ӈ ӈ: Kildin Sámi
Њ њ: Macedonian, Montenegrin, Serbian
Ө ө: Azerbaijani, Bashkir, Buryat, Kalmyk, Khalkha, Kyrgyz, Tatar
Ӧ ӧ: Hill Mari, Komi-Permyak, Kurdish, Meadow Mari
Ԥ ԥ: Abkhaz
П' п': Kurdish
Ҧ ҧ: Abkhaz
Пъ пъ: Ossetian
Ҏ ҏ: Kildin Sámi
Р' р': Kurdish
Ҫ ҫ: Bashkir, Chuvash
С́ с́: Montenegrin
Ҭ ҭ: Abkhaz
Ҭә ҭә: Abkhaz
Т' т': Kurdish
Тә тә: Abkhaz
Тъ тъ: Ossetian
Тӏ тӏ: Avar
Ћ ћ: Montenegrin, Serbian
Ӱ ӱ: Hill Mari, Meadow Mari
Ӳ ӳ: Chuvash
Ў ў: Belarusian, Dungan, Uzbek
Ӯ ӯ: Tajik
Ү ү: Azerbaijani, Bashkir, Buryat, Dungan, Kalmyk, Khalkha, Kyrgyz, Tatar
Ҳ ҳ: Abkhaz, Tajik, Uzbek
Хъ хъ: Ossetian
Хь хь: Abkhaz
Хӏ хӏ: Avar
Ҳә ҳә: Abkhaz
Һ һ: Azerbaijani, Bashkir, Buryat, Kalmyk, Kildin Sámi, Kurdish, Tatar
Һ' һ': Kurdish
Ҵ ҵ: Abkhaz
Ҵә ҵә: Abkhaz
Цә цә: Abkhaz
Цъ цъ: Ossetian
Цц цц: Avar
Цӏ цӏ: Avar
Цӏцӏ цӏцӏ: Avar
Џ џ: Abkhaz, Macedonian, Montenegrin, Serbian
Џь џь: Abkhaz
Ҹ ҹ: Azerbaijani
Ҷ ҷ: Azerbaijani, Tajik
Ч' ч': Kurdish
Чъ чъ: Ossetian
Чӏ чӏ: Avar
Чӏчӏ чӏчӏ: Avar
Ҽ ҽ: Abkhaz
Ҿ ҿ: Abkhaz
Шь шь: Abkhaz
Шә шә: Abkhaz
’: Belarusian, Ukrainian
Ъ ъ: Azerbaijani, Bashkir, Chuvash, Dungan, Hill Mari, Khalkha, Komi-Permyak, Meadow Mari, Ossetian, Russian, Rusyn, Tajik, Tatar, Uzbek
Ҍ ҍ: Kildin Sámi
Ы ы: Abkhaz, Azerbaijani, Bashkir, Belarusian, Buryat, Chuvash, Dungan, Hill Mari, Khalkha, Kildin Sámi, Komi-Permyak, Kyrgyz, Meadow Mari, Moldovan, Ossetian, Russian, Tatar
Ӹ ӹ: Hill Mari
Ҩ ҩ: Abkhaz
Э э: Azerbaijani, Bashkir, Belarusian, Buryat, Chuvash, Dungan, Hill Mari, Kalmyk, Khalkha, Kildin Sámi, Komi-Permyak, Kyrgyz, Kurdish, Meadow Mari, Moldovan, Ossetian, Russian, Tajik, Tatar, Uzbek
Ӭ ӭ: Kildin Sámi
Ԛ ԛ: Kurdish
Ԝ ԝ: Kurdish
Devanagari script
The basic Devanagari abugida includes 48 letters: अ आ इ ई उ ऊ ऋ ए पॅ ऐ ओ औ अं अः ॲं क ख ग घ ङ ह च छ ज झ ञ य श ट ठ ड ढ ण र ष त थ द ध न ल स प फ ब भ म व. But some languages add additional ones:
ॠ: Sanskrit
ऌ: Sanskrit
ॡ: Sanskrit
ॲ: Marathi
ऑ: Marathi
क़: Hindi
ख़: Hindi
ग़: Hindi
ॻ: Saraiki, Sindhi
ज़: Hindi
ॼ: Saraiki, Sindhi
झ़: Hindi
ॾ: Saraiki, Sindhi
फ़: Hindi
ड़: Hindi
ढ़: Hindi
ॿ: Saraiki, Sindhi
ळ: Garhwali, Konkani, Marathi, Rajasthani, Sanskrit
ॸ: Marwari
Geʽez script
The basic Geʽez abugida consists of 217 letters: ሀ ሁ ሂ ሃ ሄ ህ ሆ ለ ሉ ሊ ላ ሌ ል ሎ ሏ ሐ ሑ ሒ ሓ ሔ ሕ ሖ ሗ መ ሙ ሚ ማ ሜ ም ሞ ሟ ፙ ሠ ሡ ሢ ሣ ሤ ሥ ሦ ሧ ረ ሩ ሪ ራ ሬ ር ሮ ሯ ፘ ሰ ሱ ሲ ሳ ሴ ስ ሶ ሸ ሹ ሺ ሻ ሼ ሽ ሾ ሷ ቀ ቁ ቂ ቃ ቄ ቅ ቆ ቋ በ ቡ ቢ ባ ቤ ብ ቦ ቧ ተ ቱ ቲ ታ ቴ ት ቶ ቷ ቸ ቹ ቺ ቻ ቼ ች ቾ ኀ ኁ ኂ ኃ ኄ ኅ ኆ ኋ ነ ኑ ኒ ና ኔ ን ኖ ኗ አ ኡ ኢ ኣ ኤ እ ኦ ኧ ከ ኩ ኪ ካ ኬ ክ ኮ ኳ ወ ዉ ዊ ዋ ዌ ው ዎ ዐ ዑ ዒ ዓ ዔ ዕ ዖ ዘ ዙ ዚ ዛ ዜ ዝ ዞ ዟ የ ዩ ዪ ያ ዬ ይ ዮ ደ ዱ ዲ ዳ ዴ ድ ዶ ዷ ገ ጉ ጊ ጋ ጌ ግ ጎ ጓ ጠ ጡ ጢ ጣ ጤ ጥ ጦ ጧ ጰ ጱ ጲ ጳ ጴ ጵ ጶ ጷ ጸ ጹ ጺ ጻ ጼ ጽ ጾ ጿ ፀ ፁ ፂ ፃ ፄ ፅ ፆ ፈ ፉ ፊ ፋ ፌ ፍ ፎ ፏ ፚ ፐ ፑ ፒ ፓ ፔ ፕ ፖ ፗ. Certain languages use additional letters:
ቈ ቊ ቋ ቌ ቍ: Amharic, Bilen, Tigrinya
ኈ ኊ ኋ ኌ ኍ: Amharic, Bilen
ኰ ኲ ኳ ኴ ኵ: Amharic, Bilen, Tigrinya
ጐ ጒ ጓ ጔ ጕ: Amharic, Bilen, Tigrinya
ቐ ቑ ቒ ቓ ቔ ቕ ቖ: Amharic, Bilen, Harari, Tigre, Tigrinya
ቘ ቚ ቛ ቜ ቝ: Tigrinya
ቨ ቩ ቪ ቫ ቬ ቭ ቮ: Amharic, Bilen, Harari, Tigrinya
ⶓ ⶔ ጟ ⶕ ⶖ: Bilen
ኘ ኙ ኚ ኛ ኜ ኝ ኞ: Amharic, Bilen, Harari, Tigrinya
ኸ ኹ ኺ ኻ ኼ ኽ ኾ: Amharic, Harari, Tigrinya
ዀ ዂ ዃ ዄ ዅ: Amharic, Bilen, Tigrinya
ዠ ዡ ዢ ዣ ዤ ዥ ዦ: Amharic, Bilen, Tigre, Tigrinya
ጀ ጁ ጂ ጃ ጄ ጅ ጆ: Amharic, Bilen, Harari, Tigrinya
ጘ ጙ ጚ ጛ ጜ ጝ ጞ: Bilen, Tigre
ጨ ጩ ጪ ጫ ጬ ጭ ጮ: Amharic, Bilen, Harari, Tigrinya
Hebrew script
The basic Hebrew abjad has 22 letters: א ב ג ד ה ו ז ח ט י ך/כ ל ם/מ ן/נ ס ע ף/פ ץ/צ ק ר ש ת. Yiddish adds several more:
וו וי יי: Yiddish
בֿ: Yiddish
Latin script
The basic Latin alphabet consists of 26 letters: A a B b C c D d E e F f G g H h I i J j K k L l M m N n O o P p Q q R r S s T t U u V v W w X x Y y Z z. Many languages add special characters. Where several languages share a name, the country is given in parentheses to tell them apart:
Æ æ: Danish, English, Faroese, Icelandic, Kawésqar, Lule Sámi, Norwegian, Southern Sámi, Yaghan
Ɑ ɑ (Latin alpha): Duka, Fe’fe’, Mbembe, Mbo, Tigon
Ð ð (eth): Anii, Elfdalian, Faroese, Icelandic
Ǝ ǝ (turned E): Anii, Bangolan, Bissa, Bura, Kanuri, Kposo, Lama, Lukpa, Ngizim, Tamahaq, Tamasheq, Turka, Yom
Ə ə (schwa): Awing, Bafut, Bulu, Daba, Dazaga, Dii, Ewondo, Fe’fe’, Gude, Kamwe, Kasena, Kemezung, Kpelle, Lyélé, Mada, Makaa, Manenguba, Mfumte, Mofu-Gudur, Mundang, Mundani, Ngas, Nso, Nuni, Parkwa, Tarok, Teda, Temne, Vengo, Vute, Yom, Zulgo-Gemzek
Ɛ ɛ (Latin epsilon): Abidji, Adele, Adjukru, Aghem, Ahanta, Ait Seghrouchen, Ait Warain, Aja (Benin), Akan, Anii, Anyin, Ayizo, Bafia, Bafut, Baka (Cameroon), Bambara, Baoulé, Bariba, Basa (Cameroon), Beni Snous, Bhele, Bissa, Boko, Busa (Nigeria), Central Atlas Tamazight, Cerma, Chakosi, Dagaare, Dan, Dangme, Dendi, Dii, Dinka, Djerbi, Duala, Dyula, Ewe, Ewondo, Ghomara, Iznasen, Kabyle, Kako, Kemezung, Kenyang, Kposo, Kyode, Lika, Lingala, Lukpa, Maasai, Mandi (Cameroon), Manenguba, Mangbetu, Matmata, Mbelime, Medumba, Mzab-Wargla, Nawdm, Ngiemboon, Ngomba, Noni, Nuer, Sanhaja de Srair, Shawiya, Shenwa, Shilha, Tarifit, Tem, Tigon, Turka, Yoruba, Zuwara
Ɣ ɣ (Latin gamma): Air Tamajaq, Dagbani, Dinka, Ewe, Kabiye, Kabyle, Kpelle, Kposo, Lukpa, Tamahaq, Tamasheq, Tawellemet, Wakhi
ɤ (ram’s horn/baby gamma): Dan, Goo
I ı (dotless I): Crimean Tatar, Gagauz, Kazakh, Turkish
Ɪ ɪ (small capital): Kulango, Lomakka
Ɩ ɩ (iota): Bissa, Kabiye
Kʼ ĸ (kra): Inuttitut
Ł ł (L with stroke): Gwich’in, Iñupiaq, Kashubian, Navajo, Polish, Silesian, Sorbian, Venetian
Ŋ ŋ (eng): Aghem, Iñupiaq, Kemezung, Lukpa, Mandi (Cameroon), Medumba, Mundani, Nawdm, Ngiemboon, Ngomba, Noni, Northern Sámi, Nuer, Skolt Sámi, Tem, Tigon, Wuzlam
Ɔ ɔ (open O): Aghem, Akan, Bafia, Baka, Bambara, Baoulé, Bariba, Bassa, Boko, Dii, Dinka, Duala, Dyula, Ewe, Ewondo, Kako, Kemezung, Kposo, Lika, Lingala, Maasai, Mandi (Cameroon), Manenguba, Mangbetu, Mbelime, Medumba, Mundani, Nawdm, Ngiemboon, Ngomba, Nuer, Tem, Tigon, Turka, Yoruba
Œ œ: French, Lombard
Ʀ ʀ (small capital R): Alutiiq
ẞ ß (Eszett): German
Þ þ (thorn): Icelandic
Ɥ ɥ (turned H): Dan
Ʊ ʊ (upsilon): Anii, Anyin, Foodo, Lukpa, Tem, Yom
Ʌ ʌ (turned V): Dan, Ch’ol, Oneida, Temne, Tepehuán, Wounaan
Ʒ ʒ (ezh): Aja, Dagbani, Laz, Skolt Sámi
Ɂ ɂ (glottal stop): Chipewyan, Ditidaht, Dogrib, Halkomelem, Kutenai, Lushootseed, Nuu-chah-nulth, Slavey, Thompson
Ꞌ ꞌ (saltillo): Central Sama, Mexicanero, Mi'kmaq, Nahuatl, Nawat, Rapa Nui, Tlapanec
Tibetan script
The basic Tibetan abugida is formed by 34 letters: ཀ ཁ ག ང ཅ ཆ ཇ ཉ ཏ ཐ ད ན པ ཕ བ མ ཙ ཚ ཛ ཝ ཞ ཟ འ ཡ ར ལ ཤ ས ཧ ཨ ཨི ཨུ ཨེ ཨོ. Balti uses four additional characters:
ཫ: Balti
ཬ: Balti
ཁ༹: Balti
ག༹: Balti
Cyrillic English (more detailed description)
a while ago I made a post showing off a system I came up with for writing English in Cyrillic, while still preserving the wacky features of English orthography we all know and love. I never actually bothered formally defining how it’s supposed to work though beyond just a few key details, so let’s do that now!
most letters are replaced one-to-one. while there is some consideration given to how words are pronounced, it’s important for spellings that don’t make sense in the Latin orthography to still not make sense.
(this chart ignores digraphs, which I’ll get to in a sec)
most of these mappings are the exact thing someone who knows how to read the Cyrillic alphabet would expect, but here’s some things worth pointing out:
the two transcriptions for c are used for “hard c” (/k/) and “soft c” (/s/), with c /s/ transcribed as though it were pronounced [ts]
in the original post I transcribed “language” with дж for “soft g”, but since the pronunciation of g in English is far less predictable than c, having a separate thing for soft g feels too much like it’s fixing broken spellings, so I’m retconning that out of this system
e is written with the “soft sign” when it’s silent. much like silent e in English, the Cyrillic letter ь is something that used to be pronounced as a vowel but now indicates a change in pronunciation of other nearby letters
for those used to Russian spelling, yes, е is being used for normal e and not for “ye”. since e in English does cause “palatalization” on consonants before it (soft c and g), I think this is fine.
h is transcribed as though it were pronounced /x/, but see below for digraphs
the distinction between k and q is lost, with no special case given for <qu> (it’s just transcribed as though it were <ku>)
s is always с, even when it’s voiced
u is transcribed with ю when it’s pronounced like /juː/, as in кють “cute” (or reduced versions of this as in прессюрь “pressure”), and with у in all other contexts
w is given the letter ў (short u), a letter that doesn’t appear in Russian but really is just the most sensible way to write /w/ in Cyrillic
x is transcribed with the sequence кз (kz), regardless of pronunciation
y is written as й (short i) as a consonant, including when it appears as part of a digraph as in плай “play”, and as ы (yery) as a vowel
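here’s a rough Python sketch of the letter-by-letter rules above, just to show how they fit together. it’s not a full implementation: since the chart isn’t reproduced in text, the table only covers letters that the notes or example words actually mention, and ц for soft c is my reading of “as though it were pronounced [ts]”. spelling alone can’t tell you whether a c is soft or an e is silent, so the input is a word pre-split into letters with made-up tags (`/hard`, `/soft`, `/silent`, `/yoo`, `/cons`, `/vowel`) carrying that information.

```python
# Letters that need no pronunciation information.
# Only letters attested in the notes or example words are listed here.
PLAIN = {
    "a": "а", "e": "е", "h": "х", "i": "и", "k": "к", "l": "л",
    "o": "о", "p": "п", "q": "к", "r": "р", "s": "с", "t": "т",
    "u": "у", "w": "ў", "x": "кз",
}

# Letters whose transcription depends on pronunciation.
# The tag names are hypothetical, invented for this sketch.
TAGGED = {
    ("c", "hard"): "к",
    ("c", "soft"): "ц",    # assumption: the Cyrillic letter for [ts]
    ("e", "silent"): "ь",  # silent e becomes the soft sign
    ("u", "yoo"): "ю",     # u pronounced /juː/ (or a reduced version)
    ("y", "cons"): "й",    # consonant y, including in "ay"-type digraphs
    ("y", "vowel"): "ы",
}


def transcribe(tagged_word):
    """Transcribe a space-separated, optionally tagged sequence of letters."""
    out = []
    for token in tagged_word.split():
        letter, _, tag = token.partition("/")
        out.append(TAGGED[(letter, tag)] if tag else PLAIN[letter])
    return "".join(out)


print(transcribe("c/hard u/yoo t e/silent"))     # кють
print(transcribe("p r e s s u/yoo r e/silent"))  # прессюрь
print(transcribe("p l a y/cons"))                # плай
```

the tagging is the part a real tool would need a pronunciation dictionary for; the lookup itself is trivial once the tags exist.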
this stuff is pretty basic. the biggest change is with digraphs. languages written with Cyrillic don’t use digraphs nearly as often as languages written with the Latin alphabet, so those should be dealt with.
the digraphs ph, th, and ch were introduced into the Latin alphabet to transcribe the Greek letters phi, theta, and chi. when they are pronounced the way they are in Greek loanwords (regardless of actual etymology), these three digraphs are transcribed using the Cyrillic letters that directly descend from the Greek letters they are meant to represent. this includes the archaic letter ѳ (fita), the most direct analogue to English <th> in the Cyrillic script.
for normal ch (and the “soft” ch of relatively recent French loans, while we’re at it), the letter ч is the best fit, and similarly ш makes perfect sense for sh.
wh is transcribed as though it were always pronounced /hw/, including in words like “who” where it definitely is not /hw/.
gh is either transcribed like normal g (when pronounced as such) or as /x/. х is used when gh is silent (as in ѳроух “through”) or pronounced like /f/ (as in роух “rough”), transcribed according to the historical pronunciation of these words.
then finally, while <ya> is not really a digraph in English in any meaningful sense, using Cyrillic and not making use of the letter я just feels plain wrong, so it’s used for words like Янкее (Yankee).
these digraph rules depend partially on pronunciation, not entirely spelling, so like “lighthouse” is лихтхоусь with a тх rather than *лихѳоусь with a ѳ.
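the digraph rules can be sketched the same way. this is a minimal Python example under a few assumptions: the input is already split into units by hand (so “th” written as one unit means the digraph, and “t h” as two units means separate letters), the `/greek`, `/hard`, and `/silent` tags are invented for the sketch, хў for wh is my reading of “as though it were always pronounced /hw/”, and only lowercase letters that appear in the example words are covered.

```python
# Digraph units. A unit counts as a digraph only if the caller wrote it as one.
DIGRAPHS = {
    "ph": "ф",        # descends from Greek phi
    "th": "ѳ",        # fita, descends from Greek theta
    "ch/greek": "х",  # ch as in Greek loans, descends from Greek chi
    "ch": "ч",        # normal ch, and the "soft" ch of French loans
    "sh": "ш",
    "wh": "хў",       # assumption: h + w, as though always /hw/
    "gh/hard": "г",   # gh pronounced like normal g
    "gh": "х",        # gh that is silent or pronounced /f/
    "ya": "я",
}

# Single letters needed for the example words below.
LETTERS = {
    "a": "а", "e": "е", "e/silent": "ь", "h": "х", "i": "и", "k": "к",
    "l": "л", "n": "н", "o": "о", "r": "р", "s": "с", "t": "т", "u": "у",
}

TABLE = {**LETTERS, **DIGRAPHS}


def transcribe_units(units):
    """Transcribe a space-separated sequence of letter and digraph units."""
    return "".join(TABLE[unit] for unit in units.split())


print(transcribe_units("th r o u gh"))                # ѳроух
print(transcribe_units("r o u gh"))                   # роух
print(transcribe_units("l i gh t h o u s e/silent"))  # лихтхоусь
print(transcribe_units("ya n k e e"))                 # янкее
```

the “lighthouse” case works because t and h are passed as separate units, so they never hit the th entry. deciding where the unit boundaries go is exactly the pronunciation-dependent step the rules describe, which is why the sketch leaves it to the caller.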