Encoding – Difference Between Encoding and Charset

Tags: character-encoding, encoding

I am confused about text encoding and charset. For many reasons, I have to
learn non-Unicode, non-UTF-8 stuff in my upcoming work.

I find the word "charset" in email headers, as in "ISO-2022-JP", but there's no
such encoding in text editors. (I looked around the different text editors.)

What's the difference between text encoding and charset? I'd appreciate it
if you could show me some use case examples.

Best Answer

Basically:

  1. charset is the set of characters you can use
  2. encoding is the way these characters are stored in memory (as bytes)
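
Here is a minimal Python sketch of those two points, assuming Python's built-in codecs. The helper name `can_encode` and the sample characters are only illustrative, not part of any library.

```python
# 1. The charset (repertoire) decides WHICH characters can be represented.
def can_encode(text, encoding):
    """Return True if every character in text exists in the encoding's repertoire."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# 2. The encoding decides HOW a character is turned into bytes.
e_acute = "é"
latin1_bytes = e_acute.encode("iso-8859-1")  # one byte in ISO-8859-1
utf8_bytes = e_acute.encode("utf-8")         # two bytes in UTF-8

print(can_encode("é", "ascii"))        # "é" is outside the ASCII repertoire
print(can_encode("é", "iso-8859-1"))   # "é" is in the ISO-8859-1 repertoire
print(can_encode("あ", "iso-8859-1"))  # "あ" is not
print(latin1_bytes, utf8_bytes)        # same character, different bytes
```

The same character ("é") is available in both ISO-8859-1 and Unicode, but the bytes that represent it differ depending on the encoding.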

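People sometimes use "charset" to refer to both the character repertoire and the encoding scheme. That is the case in email headers, where the charset parameter (e.g., ISO-2022-JP) actually names an encoding. The Unicode character set has multiple encodings, e.g., UTF-8, UTF-16, UTF-32, UCS-4, UTF-EBCDIC, Punycode, and GB18030.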
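
As a rough illustration, the sketch below encodes one character with several of Python's built-in codecs. The first three are encodings of the Unicode character set. ISO-2022-JP, the name from the email header in the question, is an encoding of the JIS character sets and is included only for comparison. The variable names are arbitrary.

```python
text = "あ"  # HIRAGANA LETTER A, Unicode code point U+3042

# Encode the same character with several encodings and keep the raw bytes.
encoded = {
    name: text.encode(name)
    for name in ("utf-8", "utf-16-be", "utf-32-be", "iso-2022-jp")
}

for name, raw in encoded.items():
    # Decoding with the same encoding recovers the original character.
    print(name, raw.hex(), raw.decode(name) == text)
```

Each encoding yields a different byte sequence, yet every one decodes back to the same character, as long as the decoder is told which encoding was used. That is the job of the charset parameter in an email header.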