Unicode is an international standard for consistently representing and handling the characters of the world's writing systems on computers.
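ASCII: 7-bit; represents only English uppercase and lowercase letters and basic symbols (128 characters)
UTF-8: variable length (1-4 bytes), ASCII compatible, the web standard
UTF-16: variable length (2 or 4 bytes); BMP characters, which cover most text in use, take 2 bytes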
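As a rough illustration of the byte counts listed above, the following Python sketch measures how many bytes a single character needs in each encoding. The helper name `encoded_lengths` is made up for this example, and `utf-16-le` is used instead of `utf-16` only so that Python does not prepend a byte order mark to the result.

```python
def encoded_lengths(ch: str) -> dict:
    """Byte length of one character per encoding (None if it cannot be encoded)."""
    lengths = {}
    for name in ("ascii", "utf-8", "utf-16-le"):
        try:
            lengths[name] = len(ch.encode(name))
        except UnicodeEncodeError:
            # The character is outside this encoding's 128-character range.
            lengths[name] = None
    return lengths


for ch in ("A", "é", "한", "😊"):
    print(f"U+{ord(ch):04X}", encoded_lengths(ch))
# U+0041  {'ascii': 1, 'utf-8': 1, 'utf-16-le': 2}
# U+00E9  {'ascii': None, 'utf-8': 2, 'utf-16-le': 2}
# U+D55C  {'ascii': None, 'utf-8': 3, 'utf-16-le': 2}
# U+1F60A {'ascii': None, 'utf-8': 4, 'utf-16-le': 4}
```

The last row is a character outside the BMP, which is why UTF-16 needs a 4-byte surrogate pair for it.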
You can convert text to ASCII, UTF-8, UTF-16, HTML entities, and URL (percent) encoding.
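The conversions listed above can be approximated with the Python standard library. This is only a sketch of one possible approach, not how this tool is implemented: the function name `convert`, the output formats (space-separated hex, hexadecimal HTML entities), and the choice to show non-ASCII characters as backslash escapes in the ASCII output are all assumptions made for the example.

```python
import html
from urllib.parse import quote


def convert(text: str) -> dict:
    """Return several encoded views of the same text (illustrative formats)."""
    return {
        # ASCII cannot hold non-ASCII characters, so show them as \uXXXX escapes.
        "ascii": text.encode("ascii", errors="backslashreplace").decode("ascii"),
        # Hex dump of the UTF-8 bytes.
        "utf-8": text.encode("utf-8").hex(" "),
        # Hex dump of the UTF-16 bytes (big-endian, no byte order mark).
        "utf-16": text.encode("utf-16-be").hex(" "),
        # Numeric character references for non-ASCII; escape &, <, > otherwise.
        "html": "".join(
            f"&#x{ord(c):X};" if ord(c) > 127 else html.escape(c) for c in text
        ),
        # Percent-encoding of the UTF-8 bytes.
        "url": quote(text, safe=""),
    }


for name, value in convert("A한").items():
    print(f"{name:7} {value}")
# ascii   A\ud55c
# utf-8   41 ed 95 9c
# utf-16  00 41 d5 5c
# html    A&#xD55C;
# url     A%ED%95%9C
```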
UTF-8 is the dominant encoding on the web and is backward compatible with ASCII. UTF-16 is more compact for East Asian text (2 bytes per BMP character, versus 3 in UTF-8) and is the native string format in Java and the Windows API.
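The size trade-off between the two encodings can be checked with a small Python sketch. The sample strings and the helper name `size_report` are arbitrary choices for illustration.

```python
def size_report(text: str) -> dict:
    """Encoded size in bytes under UTF-8 and UTF-16 (no byte order mark)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


english = "Hello, world"   # 12 ASCII characters
korean = "안녕하세요"        # 5 Hangul syllables, all in the BMP

print(size_report(english))  # {'utf-8': 12, 'utf-16': 24}
print(size_report(korean))   # {'utf-8': 15, 'utf-16': 10}
```

ASCII-heavy content such as HTML markup doubles in size under UTF-16, while Korean text shrinks by a third, which is one reason each encoding dominates where it does.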
This is usually caused by a missing charset declaration or an encoding mismatch between the server and the client. Check .
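An encoding mismatch of this kind can be reproduced in a few lines of Python. In this sketch the text is written as UTF-8 and then read back as Latin-1, which is one common mismatch; the specific pair of encodings is an assumption, since the real cause depends on the server and client involved.

```python
original = "café"

# The sender writes the text as UTF-8 bytes.
raw = original.encode("utf-8")

# The receiver wrongly assumes Latin-1, so each byte becomes its own character.
garbled = raw.decode("latin-1")
print(garbled)  # cafÃ©

# If nothing was lost, reversing the wrong step recovers the original text.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)  # café
```

The repair only works when the wrong decoder kept every byte intact. Once bytes have been replaced by `?` or U+FFFD, the original text cannot be recovered this way.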
Most emojis are encoded as 4 bytes in UTF-8. (Example: 😊 = F0 9F 98 8A)
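The four bytes in this example follow directly from the UTF-8 bit layout for code points U+10000 through U+10FFFF (`11110xxx 10xxxxxx 10xxxxxx 10xxxxxx`). The sketch below builds them by hand and compares the result with Python's built-in encoder; `utf8_four_byte` is a hypothetical helper written for this illustration.

```python
def utf8_four_byte(code_point: int) -> bytes:
    """Encode a supplementary-plane code point as 4 UTF-8 bytes by hand."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF use 4 bytes")
    return bytes([
        0xF0 | (code_point >> 18),           # lead byte: 11110 + top 3 bits
        0x80 | ((code_point >> 12) & 0x3F),  # continuation: 10 + next 6 bits
        0x80 | ((code_point >> 6) & 0x3F),   # continuation: 10 + next 6 bits
        0x80 | (code_point & 0x3F),          # continuation: 10 + last 6 bits
    ])


emoji = "😊"  # U+1F60A
manual = utf8_four_byte(ord(emoji))
print(manual.hex(" ").upper())           # F0 9F 98 8A
print(manual == emoji.encode("utf-8"))   # True
```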