Character encoding


Introduction and History of Character Encoding
– Character encoding is the process of assigning numbers to graphical characters.
– It allows characters to be stored, transmitted, and transformed using digital computers.
– Code points are the numerical values that make up a character encoding.
– Code space, code page, and character map are terms used to describe the collective code points.
– Modern computer systems allow more elaborate character codes like Unicode.
– Early character codes were limited to a subset of characters used in written languages.
– Morse code, introduced in the 1840s, was the earliest well-known electrically transmitted character code.
– Various character encoding systems were developed, including Morse code, Baudot code, ASCII, and Unicode.
– The Baudot code, created by Émile Baudot in 1870, was later standardized as ITA2.
– ASCII, released in 1963, addressed the shortcomings of previous codes and was widely adopted.
– Punch card data encoding was invented by Herman Hollerith in the late 19th century.
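– IBM's Binary Coded Decimal (BCD) was a six-bit encoding scheme that extended an earlier four-bit numeric encoding to include alphabetic and special characters.
– BCD was the precursor of EBCDIC, an eight-bit encoding scheme that IBM developed in 1963 for the System/360.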
– Researchers in the 1980s faced the challenge of accommodating additional characters without wasting computing resources.
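– The compromise that eventually became Unicode dropped the assumption that each character corresponds directly to one fixed sequence of bits: characters are first mapped to abstract numbers called code points, which can then be represented in different ways, including variable-length encodings.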
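
The character-to-number mapping described above can be inspected directly in Python, whose built-in `ord` and `chr` convert between a character and its Unicode code point. This is only a minimal sketch; the helper name `describe` is made up for this example.

```python
def describe(ch: str) -> str:
    """Return the conventional 'U+XXXX' label for one character's code point."""
    return f"U+{ord(ch):04X}"

# "\u00e9" is e with acute accent, "\u20ac" is the euro sign.
for ch in ["W", "\u00e9", "\u20ac"]:
    print(ch, describe(ch), ord(ch))

# chr() is the inverse of ord(): code point -> character.
assert chr(0x57) == "W"
```

The code point is only an abstract number at this stage; how it is stored as bits depends on the encoding chosen.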

Terminology
– Informally, character encoding, character map, character set, and code page are often used interchangeably.
– A character is a minimal unit of text with semantic value.
– A character set is a collection of elements used to represent text, such as the Latin alphabet or Greek alphabet.
– A coded character set is a character set mapped to a set of unique numbers.
– The distinction between these terms has become important with the emergence of more sophisticated character encodings.

Importance of Character Encoding
– Character encoding enables worldwide interchange of text in electronic form.
– It allows for the representation of most characters used in many written languages.
– Unicode is now widely adopted and has largely replaced earlier character encodings.
– The history of character codes reflects an evolving need to convey character-based symbolic information by machine over a distance.
– The evolution of character codes has been influenced by the capabilities and limitations of early machines.

Code Pages and Code Units
– Code page is a historical name for a coded character set.
– Originally, a code page referred to a specific page number in the IBM standard character set manual.
– Other vendors, including Microsoft, SAP, and Oracle Corporation, have their own sets of code pages.
– Well-known code page suites are Windows (based on Windows-1252) and IBM/DOS (based on code page 437).
– The term code page is often used to refer to character encodings in general.
– Code unit size varies depending on the encoding.
– US-ASCII consists of 7-bit code units.
– UTF-8, EBCDIC, and GB 18030 consist of 8-bit code units.
– UTF-16 consists of 16-bit code units.
– UTF-32 consists of 32-bit code units.
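
Two points from the list above can be checked with Python's standard codecs: the same byte means different things under different code pages, and the number of code units for a given text depends on the encoding. This is a sketch under stated assumptions: `code_unit_count` is a hypothetical helper, and the little-endian codec names are used only so that no byte order mark is added to the output.

```python
raw = bytes([0xE9])  # a single byte with value 0xE9

# The same byte maps to different characters under different code pages.
as_cp1252 = raw.decode("cp1252")  # Windows-1252: LATIN SMALL LETTER E WITH ACUTE
as_cp437 = raw.decode("cp437")    # IBM/DOS code page 437: GREEK CAPITAL LETTER THETA


def code_unit_count(text: str, encoding: str, unit_bytes: int) -> int:
    """Number of code units needed to encode `text` in the given encoding."""
    return len(text.encode(encoding)) // unit_bytes


# Four characters: 'a', e-acute, euro sign, and an emoji outside the BMP.
sample = "a\u00e9\u20ac\U0001F600"
utf8_units = code_unit_count(sample, "utf-8", 1)       # 8-bit code units
utf16_units = code_unit_count(sample, "utf-16-le", 2)  # 16-bit code units
utf32_units = code_unit_count(sample, "utf-32-le", 4)  # 32-bit code units

print(as_cp1252, as_cp437)
print(utf8_units, utf16_units, utf32_units)
```

For this sample the counts differ because UTF-8 and UTF-16 are variable-length, while UTF-32 always uses one code unit per code point.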

Code Points, Characters, and Unicode Encoding Model
– A code point is represented by a sequence of code units.
– UTF-8 maps code points to a sequence of one, two, three, or four code units.
– UTF-16 uses surrogate pairs for code points with a value U+10000 or higher.
– UTF-32 represents every code point as a single code unit.
– GB 18030 commonly uses multiple code units per code point.
– What constitutes a character varies between character encodings.
– Letters with diacritics can be encoded as a single unified character or as separate characters that combine into a single glyph.
– Handling glyph variants is a choice made when constructing a character encoding.
– Some writing systems, such as Arabic and Hebrew, must accommodate graphemes that are joined in different ways in different contexts but represent the same semantic character.
– Unicode and ISO/IEC 10646 constitute a unified standard for character encoding.
– An abstract character repertoire (ACR) is the full set of abstract characters that a system supports.
– A coded character set (CCS) maps characters to code points.
– A character encoding form (CEF) maps code points to code units.
– A character encoding scheme (CES) maps code units to octets for storage or transmission.
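
The layers listed above can be walked through by hand for UTF-16. The sketch below maps a character to its code point (CCS), splits the code point into 16-bit code units using a surrogate pair where needed (CEF), and serializes those units to octets in a chosen byte order (CES). The function names `utf16_code_units` and `serialize` are hypothetical, and the code assumes the input is a valid code point that is not itself a surrogate. The last lines illustrate the point about diacritics using the standard `unicodedata` module.

```python
import unicodedata


def utf16_code_units(code_point: int) -> list[int]:
    """Character encoding form: code point -> list of UTF-16 code units."""
    if code_point < 0x10000:
        return [code_point]              # one code unit is enough
    offset = code_point - 0x10000        # 20-bit value to split in two
    high = 0xD800 + (offset >> 10)       # high (lead) surrogate: top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # low (trail) surrogate: bottom 10 bits
    return [high, low]


def serialize(units: list[int], byteorder: str) -> bytes:
    """Character encoding scheme: 16-bit code units -> octets."""
    return b"".join(u.to_bytes(2, byteorder) for u in units)


cp = ord("\U0001F600")           # coded character set: character -> code point
units = utf16_code_units(cp)     # surrogate pair for a code point >= U+10000
big = serialize(units, "big")
little = serialize(units, "little")
print([hex(u) for u in units], big.hex(), little.hex())

# A letter with a diacritic: one precomposed code point, or a base letter
# followed by a combining mark. NFC normalization composes the latter.
composed = "\u00e9"              # e with acute as a single code point
decomposed = "e\u0301"           # 'e' + COMBINING ACUTE ACCENT
print(composed == decomposed)
print(unicodedata.normalize("NFC", decomposed) == composed)
```

The hand-built bytes can be compared against Python's own `utf-16-be` and `utf-16-le` codecs as a sanity check.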


Character encoding (Wikipedia)

Character encoding is the process of assigning numbers to graphical characters, especially the written characters of human language, allowing them to be stored, transmitted, and transformed using digital computers. The numerical values that make up a character encoding are known as "code points" and collectively comprise a "code space", a "code page", or a "character map".

Punched tape with the word "Wikipedia" encoded in ASCII. Presence and absence of a hole represents 1 and 0, respectively; for example, "W" is encoded as "1010111".
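
The bit pattern quoted in the caption can be reproduced with a short sketch. It assumes 7-bit ASCII and writes each character as a string of seven binary digits; the function name `ascii_bits` is made up for this example.

```python
def ascii_bits(text: str) -> list[str]:
    """Return the 7-bit binary string for each ASCII character in `text`."""
    encoded = text.encode("ascii")   # raises UnicodeEncodeError for non-ASCII input
    return [format(byte, "07b") for byte in encoded]


bits = ascii_bits("Wikipedia")
print(bits[0])  # bit pattern for "W"
```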

Early character codes associated with the optical or electrical telegraph could only represent a subset of the characters used in written languages, sometimes restricted to upper case letters, numerals and some punctuation only. The low cost of digital representation of data in modern computer systems allows more elaborate character codes (such as Unicode) which represent most of the characters used in many written languages. Character encoding using internationally accepted standards permits worldwide interchange of text in electronic form.
