
HTML Charset

To display an HTML page correctly, a web browser must know which character set (encoding) was used to write the page. Without the correct charset, special characters may appear as strange symbols or question marks.


What is Character Encoding?

Computers store text as numbers (bytes). A character encoding is a mapping between those numbers and actual characters (letters, digits, symbols).

For example, in ASCII:

  • 65 → A
  • 66 → B
  • 97 → a
  • 48 → 0 (the digit character "0")
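The mapping above can be checked directly. The following is a minimal sketch in Python (chosen here only for illustration; it is not part of the HTML page) that uses the built-in ord() and chr() functions:

```python
# Map each character to its code number.
codes = {ch: ord(ch) for ch in "ABa0"}
print(codes)  # {'A': 65, 'B': 66, 'a': 97, '0': 48}

# chr() goes the other way: from a number back to its character.
letters = [chr(n) for n in (65, 66, 97, 48)]
print(letters)  # ['A', 'B', 'a', '0']
```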

The meta charset Tag

Always declare the character encoding as the first element inside your HTML <head>. Browsers only scan the first 1024 bytes of the file for this declaration:

<meta charset="UTF-8">
Important: If you omit the <meta charset> tag and the server does not send a charset in its Content-Type header, the browser has to guess the encoding. A wrong guess causes characters like é, ñ, ü, € to display as garbled text.
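To see what such a wrong guess looks like, here is a small Python sketch. It assumes the page was saved as UTF-8 and that the browser guessed Windows-1252; the variable names are illustrative only.

```python
text = "é ñ ü €"

# The page is saved to disk as UTF-8 bytes...
raw = text.encode("utf-8")

# ...but decoded with the wrong charset (Windows-1252).
garbled = raw.decode("cp1252")
print(garbled)  # Ã© Ã± Ã¼ â‚¬

# Decoding with the correct charset restores the original text.
restored = raw.decode("utf-8")
print(restored)  # é ñ ü €
```

Each accented letter becomes two wrong characters because UTF-8 stores it in two bytes, and Windows-1252 reads every byte as a separate character.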

ASCII — The First Encoding

ASCII (American Standard Code for Information Interchange), first published in 1963, was one of the earliest widely adopted character encoding standards. It defines 128 characters using 7-bit codes (0–127):

  • English letters (A–Z, a–z)
  • Digits (0–9)
  • Basic punctuation (. , ! ? ; : etc.)
  • Control characters (line feed, tab, etc.)

ASCII covers only unaccented English text. It has no accented letters (é, ñ), no non-Latin scripts, and no symbols beyond its basic punctuation; for example, € and © are missing.
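This limit is easy to test. The sketch below defines a hypothetical helper, is_ascii_only (not a standard function), that tries to encode a string as ASCII and reports whether it fits:

```python
def is_ascii_only(text: str) -> bool:
    """Return True if every character fits in ASCII's 0-127 range."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        # At least one character has no ASCII code.
        return False


print(is_ascii_only("Hello, World!"))  # True
print(is_ascii_only("café"))           # False
```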


ANSI — Windows Default

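ANSI (Windows-1252) is an 8-bit encoding that extends ASCII to 256 code positions, adding accented Western European letters like é, ñ, ü. It was the default charset in older Windows HTML editors. The name "ANSI" is a Windows convention; Windows-1252 was never an official ANSI standard.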

Charset               Characters    Languages Supported
ASCII                 128           English only
ANSI (Windows-1252)   256           Western European languages
ISO-8859-1            256           Western European languages
UTF-8                 1,112,064     All languages, symbols, emojis
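Although Windows-1252 and ISO-8859-1 look identical in the table, they differ in a few positions. The following Python sketch compares how each charset encodes é and €; try_encode is a hypothetical helper written for this example, and "cp1252" and "iso-8859-1" are the codec names Python uses for these charsets.

```python
def try_encode(ch: str, charset: str):
    """Return the encoded bytes, or None if the charset lacks the character."""
    try:
        return ch.encode(charset)
    except UnicodeEncodeError:
        return None


for charset in ("ascii", "cp1252", "iso-8859-1", "utf-8"):
    print(charset, try_encode("é", charset), try_encode("€", charset))
```

ASCII has neither character, ISO-8859-1 has é but no €, Windows-1252 stores both in a single byte, and UTF-8 needs two bytes for é and three for €.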

UTF-8 — The Modern Standard

UTF-8 (Unicode Transformation Format, 8-bit) is the recommended charset for all HTML pages. It is a variable-length encoding that uses one to four bytes per character, and it covers:

  • All ASCII characters (backward compatible)
  • Accented and special characters from all Western languages
  • All world scripts — Arabic, Chinese, Japanese, Korean, Hindi, etc.
  • Mathematical and technical symbols
  • Emojis 😀 🚀 🎉
Best Practice: Use UTF-8 for all your HTML pages. The HTML standard requires it for new documents, and every modern browser supports it. Browsers do not assume UTF-8 automatically, so still declare it with <meta charset="UTF-8">.
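As a rough illustration of how UTF-8 handles such different characters, this Python sketch counts the bytes each sample character takes. The sample characters are arbitrary choices for the example.

```python
samples = ["A", "é", "€", "中", "😀"]

# Number of bytes each character occupies once encoded as UTF-8.
byte_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}
print(byte_lengths)  # {'A': 1, 'é': 2, '€': 3, '中': 3, '😀': 4}

# Backward compatibility: pure ASCII text yields identical bytes.
same_bytes = "Hello".encode("utf-8") == "Hello".encode("ascii")
print(same_bytes)  # True
```

ASCII characters stay at one byte, which is why an existing ASCII file is already valid UTF-8.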

The Full Charset Declaration

Here is a minimal complete HTML5 page with the charset declared correctly:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Page Title</title>
</head>
<body>

    <p>Hello World — supports all characters: é ñ ü 😀 € مرحبا</p>

</body>
</html>
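The declaration only labels the file; the file itself still has to be saved as UTF-8. The Python sketch below writes a shortened version of the page as UTF-8 bytes and then checks that the declared charset decodes it. declared_charset is a hypothetical helper that uses a simple regular expression, not a full HTML parser, and index.html is just an example file name.

```python
import re
import tempfile
from pathlib import Path

PAGE = """<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Page Title</title>
</head>
<body>
    <p>é ñ ü 😀 € مرحبا</p>
</body>
</html>
"""


def declared_charset(raw: bytes):
    """Find a <meta charset=...> declaration in the first 1024 bytes."""
    match = re.search(rb'<meta\s+charset=["\']?([\w-]+)', raw[:1024], re.IGNORECASE)
    return match.group(1).decode("ascii").lower() if match else None


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "index.html"
    # Save the page using the same encoding it declares.
    path.write_bytes(PAGE.encode("utf-8"))
    raw = path.read_bytes()

charset = declared_charset(raw)
# Assumption: the declared label is also a valid Python codec name.
decoded = raw.decode(charset)
print(charset)          # utf-8
print(decoded == PAGE)  # True
```

If an editor saved the same file as Windows-1252 instead, the declaration would still say UTF-8 and the accented characters would no longer decode correctly.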

ASCII Character Reference (0–127)

Decimal  Character    Decimal  Character    Decimal  Character
32       (space)      65       A            97       a
33       !            66       B            98       b
34       "            67       C            99       c
35       #            68       D            100      d
36       $            69       E            101      e
37       %            70       F            102      f
38       &            71       G            103      g
39       '            72       H            104      h
40       (            73       I            105      i
41       )            74       J            106      j
48       0            75       K            107      k
49       1            76       L            108      l
50       2            77       M            109      m
51       3            90       Z            122      z
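The table above lists only a selection. As a sketch, the full printable range (32 to 126) can be generated in Python; ascii_rows is a hypothetical helper name used for this example.

```python
def ascii_rows(start: int = 32, stop: int = 126):
    """Yield (decimal, character) pairs for the printable ASCII range."""
    for code in range(start, stop + 1):
        # Show the space character by name so it is visible in the output.
        label = "(space)" if code == 32 else chr(code)
        yield code, label


rows = list(ascii_rows())
for code, label in rows:
    print(f"{code:>3}  {label}")
```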