Text
How does UTF-8 work - demonstration
See this for an explanation of what UTF-8 is.
Red bits carry no meaning - they only tell the computer the position in the character and its length.
Black bits are the actual character, so the binary number given by reading them in order gives the Unicode number of the character.
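The same split can be sketched in a few lines of Python. This is only an illustration: the function names `split_utf8_bits` and `code_point_from_payload` are made up for this post, and the "marker" half of each pair corresponds to the red bits while the "payload" half corresponds to the black bits.

```python
def split_utf8_bits(char):
    """Return a (marker_bits, payload_bits) pair for each UTF-8 byte of one character."""
    parts = []
    for byte in char.encode("utf-8"):
        bits = format(byte, "08b")
        if bits.startswith("0"):
            marker_len = 1   # single-byte character
        elif bits.startswith("10"):
            marker_len = 2   # continuation byte
        elif bits.startswith("110"):
            marker_len = 3   # first byte of a 2-byte character
        elif bits.startswith("1110"):
            marker_len = 4   # first byte of a 3-byte character
        else:
            marker_len = 5   # "11110": first byte of a 4-byte character
        parts.append((bits[:marker_len], bits[marker_len:]))
    return parts


def code_point_from_payload(char):
    """Join the payload bits in order and read them as one binary number."""
    payload = "".join(payload_bits for _, payload_bits in split_utf8_bits(char))
    return int(payload, 2)


for ch in ("A", "Д", "月"):
    print(ch, split_utf8_bits(ch), code_point_from_payload(ch), ord(ch))
```

For every character, the number rebuilt from the payload bits should match what Python's built-in `ord` reports.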
Text
The size of Unicode
Text
Basic Latin - ASCII - Block 1
The Basic Latin Unicode block (also called the ASCII block) is the first block of the Unicode standard, and the only block which is encoded in one byte in UTF-8.
Contents: upper and lower case English letters (52), Arabic numerals (10), basic punctuation marks and math symbols (33, counting the space) and control characters (33)
Number (allocated / used): 128 / 128
UTF-8 character size: 1 byte / 8 bits
Position (hex / dec): 0000-007F / 0-127
Version introduced: 1.0.0
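The counts above can be double-checked with a rough Python sketch. The helper name `count_basic_latin` is invented for this example, and lumping everything that is not a letter, digit or control character into "other" is an assumption about how the categories were counted.

```python
import unicodedata


def count_basic_latin():
    """Tally the 128 Basic Latin code points into four rough groups."""
    counts = {"letters": 0, "digits": 0, "control": 0, "other": 0}
    for cp in range(0x00, 0x80):
        ch = chr(cp)
        # Every code point in this block fits in a single UTF-8 byte.
        assert len(ch.encode("utf-8")) == 1
        if ch.isalpha():
            counts["letters"] += 1
        elif ch.isdigit():
            counts["digits"] += 1
        elif unicodedata.category(ch) == "Cc":   # Cc = control character
            counts["control"] += 1
        else:
            counts["other"] += 1                 # punctuation, symbols, space
    return counts


print(count_basic_latin())
print(len(chr(0x80).encode("utf-8")))  # the first code point after the block needs 2 bytes
```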
#unicode#unicodenthusiast#unicodenthusiast intro#not a chatacter#unicode blocks#un bl 1#ascii#basic latin
Text
What is a Unicode block?
A block is a set of characters from a single language, group of languages, or of a similar use (like math symbols).
The sizes of the blocks are always multiples of 16, and many blocks are a power of 2 in size, like 128 or 256.
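As a small illustration, here are a handful of block ranges typed in by hand (Python's standard library does not expose block data, so the `BLOCKS` table and the `block_size` helper are just hypothetical names for this sketch):

```python
# A few block ranges (first code point, last code point), typed in by hand.
BLOCKS = {
    "Basic Latin": (0x0000, 0x007F),
    "Latin-1 Supplement": (0x0080, 0x00FF),
    "Greek and Coptic": (0x0370, 0x03FF),
    "Cyrillic": (0x0400, 0x04FF),
    "CJK Unified Ideographs": (0x4E00, 0x9FFF),
}


def block_size(first, last):
    """Number of code points in an inclusive range."""
    return last - first + 1


for name, (first, last) in BLOCKS.items():
    size = block_size(first, last)
    print(f"{name}: {size} code points, multiple of 16: {size % 16 == 0}")
```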
There are 328 blocks, as of Unicode 15.1.
Text
What is UTF-8
UTF-8 is by far the most used way to encode characters in electronic communication. Its purpose is to take the numbers that the Unicode standard gives each character and put them in computer memory in such a way that they are easily readable, compact, and unambiguous.
Since the largest number possible in Unicode is 1,114,111 (U+10FFFF) and its binary representation is 100001111111111111111 (21 digits), one would think that having every 21 bits (a bit is one binary digit) in memory represent one character is a good idea. However, there are 2 problems here.
Firstly, it seems inefficient to use all 21 bits for all the characters, since the vast majority of all characters used are in the first 1/20th or so of the list. With a fixed 21-bit encoding, almost everything would start with a ton of leading zeros, wasting precious memory.
The second problem is that computers, for computer reasons that we will not get into, "like" having their memory in "bytes" of 8 bits, so a potential encoding where 2 characters can share a byte isn't ideal.
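To get a feel for the wasted space, here is a rough comparison in Python. Nobody actually uses a 21-bit encoding, so this sketch swaps in UTF-32 (a real fixed-width encoding that spends 32 bits on every character) as a stand-in; the helper name `encoded_sizes` is made up for the example.

```python
def encoded_sizes(text):
    """Return (UTF-8 size, UTF-32 size) in bytes for the same text."""
    utf8_size = len(text.encode("utf-8"))
    utf32_size = len(text.encode("utf-32-le"))  # "-le" variant: no byte-order mark added
    return utf8_size, utf32_size


sample = "Hello, Unicode! Д 月"
utf8_size, utf32_size = encoded_sizes(sample)
print(f"{len(sample)} characters: {utf8_size} bytes in UTF-8, {utf32_size} bytes in UTF-32")
```

For mostly-English text the fixed-width version comes out close to four times larger, which is the "ton of leading zeros" problem in practice.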
The way UTF-8 solves both of these is by having a variable length encoding, where each character starts with information on how many bytes it has ("0" for 1, "110" for 2, "1110" for 3, and "11110" for 4), and each byte after that has a prefix ("10") which states that it is a continuation of a character already started.
In the table above, this is illustrated by x's being the actual character, while the numbers written down are just there to tell the computer what is what and where characters begin. (Example)
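The prefix scheme can be written out by hand as a short Python sketch. This is an illustrative encoder, not how any real library implements it; the name `encode_utf8` is made up here, and Python's own `str.encode("utf-8")` is what you would use in practice.

```python
def encode_utf8(code_point):
    """Encode one Unicode code point into UTF-8 bytes by hand."""
    if code_point < 0 or code_point > 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogates cannot be encoded in UTF-8")
    if code_point < 0x80:
        # 1 byte: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([
            0b11000000 | (code_point >> 6),
            0b10000000 | (code_point & 0b111111),
        ])
    if code_point < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0b11100000 | (code_point >> 12),
            0b10000000 | ((code_point >> 6) & 0b111111),
            0b10000000 | (code_point & 0b111111),
        ])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0b11110000 | (code_point >> 18),
        0b10000000 | ((code_point >> 12) & 0b111111),
        0b10000000 | ((code_point >> 6) & 0b111111),
        0b10000000 | (code_point & 0b111111),
    ])


for ch in ("A", "Д", "月", "😀"):
    encoded = encode_utf8(ord(ch))
    print(ch, [format(b, "08b") for b in encoded], encoded == ch.encode("utf-8"))
```

Each result is compared against Python's built-in encoder, so any slip in the bit shuffling would show up as `False`.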
Another handy thing is that for the first 128 characters, the encoding is the exact same as the ASCII encoding, so any text written in ASCII is also valid UTF-8.
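A quick way to see this for yourself in Python (the variable names are just for this example):

```python
message = "Hello, world!"

ascii_bytes = message.encode("ascii")
utf8_bytes = message.encode("utf-8")

# For pure ASCII text the two encodings produce identical bytes...
print(ascii_bytes == utf8_bytes)

# ...so bytes written as ASCII can be read back as UTF-8 unchanged.
print(ascii_bytes.decode("utf-8"))
```

It only works in this direction: UTF-8 bytes for a character outside the first 128, such as Д, are not valid ASCII.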
Text
My computer didn't render the flag properly :{
The trans flag doesn't have its own code point in Unicode, so it is represented by a white flag (🏳️) joined to the trans symbol (⚧️) with an invisible zero-width joiner. Many apps don't render it, but Microsoft, a bajillion-dollar corporation, should.
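Here is a small Python sketch that spells out the pieces of that sequence. The variable name `trans_flag` is just for this example; the sequence is a white flag, an emoji variation selector, a zero-width joiner (the invisible glue), the trans symbol, and another variation selector.

```python
import unicodedata

# White flag + variation selector + zero-width joiner + trans symbol + variation selector
trans_flag = "\U0001F3F3\uFE0F\u200D\u26A7\uFE0F"

for ch in trans_flag:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))

print(len(trans_flag), "code points,", len(trans_flag.encode("utf-8")), "bytes in UTF-8")
```

If the zero-width joiner gets lost along the way, or the font has no combined glyph, the flag and the symbol are drawn side by side instead of as one flag.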

So I’ve heard that sharks are kind of big in the trans community, so this is for y’all, happy Pride Month and remember you are very cool 🏳️⚧️🏳️🌈
Text
What is Unicode?
Unicode is a standard, first published in 1991 and regularly updated since. It is a huge list of basically all characters ever used by every human language, and each one has its own number. For example, C is 67, ! is 33, 月 is 26376 and Д is 1044.
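These numbers are easy to look up yourself. In Python, for instance, the built-in `ord` gives the number of a character and `chr` goes the other way (the `code_points` dictionary is just a name for this example):

```python
# Map each example character to its Unicode number.
code_points = {ch: ord(ch) for ch in "C!月Д"}

for ch, number in code_points.items():
    # The U+XXXX form is the same number written in hexadecimal.
    print(ch, number, f"U+{number:04X}")

# chr() turns a number back into its character.
print(chr(1044))
```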
It originally had 7,129 characters, and as of Unicode 15.0 has 149,186 (of which more than half are Chinese, Japanese and Korean ideographs, many of them archaic or historical).
The standard allows for 1,114,112 code points in total, of which 1,112,064 ((2^9 + 2^5 - 1) * 2^11) can be given to characters; the other 2,048 are reserved as surrogates. Most of them are currently unassigned.
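Where 1,112,064 comes from can be checked with a bit of arithmetic, sketched here in Python. The code space runs from 0 to 0x10FFFF, and the 2,048 values from 0xD800 to 0xDFFF are set aside as surrogates that never stand for a character on their own; the variable names are just for this example.

```python
total_code_points = 0x10FFFF + 1       # everything from 0 up to U+10FFFF
surrogates = 0xDFFF - 0xD800 + 1       # reserved range that cannot hold characters
usable = total_code_points - surrogates

print(total_code_points, surrogates, usable)

# The same number, written the way it appears in the post.
print(usable == (2**9 + 2**5 - 1) * 2**11)
```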
Unicode is by far the most used way to digitally represent text: UTF-8, its most common encoding, accounts for about 98% of all websites (including Tumblr).