UTF

Each Unicode code point can be encoded in several different formats, called Unicode transformation formats (UTFs). For example, the letter M is the Unicode code point U+004D. In UTF-8, this code point is represented as X'4D'. In UTF-16 with big-endian byte order, it is represented as X'004D'.
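To make the example concrete, here is a minimal Python sketch that encodes the letter M in several UTFs and prints the resulting bytes as hexadecimal. The variable names (`letter`, `encodings`, `hex_forms`) are illustrative only; the codec names are the ones Python's standard library uses.

```python
# Encode the letter M (U+004D) in several UTFs and show the bytes as hex.
letter = "M"
encodings = ["utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"]

# Map each codec name to the uppercase hex form of the encoded bytes.
hex_forms = {name: letter.encode(name).hex().upper() for name in encodings}

for name, hex_form in hex_forms.items():
    print(f"{name:10} X'{hex_form}'")
```

The explicit `-be` and `-le` codecs are used so that no byte order mark is added and the output does not depend on the machine's native byte order.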

Name	UTF-8	UTF-16	UTF-16BE	UTF-16LE	UTF-32	UTF-32BE	UTF-32LE
Smallest code point	0000	0000	0000	0000	0000	0000	0000
Largest code point	10FFFF	10FFFF	10FFFF	10FFFF	10FFFF	10FFFF	10FFFF
Code unit size	8 bits	16 bits	16 bits	16 bits	32 bits	32 bits	32 bits
Byte order	N/A	BOM	big-endian	little-endian	BOM	big-endian	little-endian
Fewest bytes per character	1	2	2	2	4	4	4
Most bytes per character	4	4	4	4	4	4	4
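The byte counts in the table can be checked with a short Python sketch. The sample characters and the helper name `byte_lengths` are assumptions chosen for illustration; they are picked so that UTF-8 uses 1, 2, 3, and 4 bytes in turn.

```python
# Sample characters, written as escapes so the code points are unambiguous.
samples = {
    "U+004D": "\u004d",       # M
    "U+00E9": "\u00e9",       # e with acute accent
    "U+20AC": "\u20ac",       # euro sign
    "U+1F41F": "\U0001F41F",  # fish emoji, outside the Basic Multilingual Plane
}

def byte_lengths(ch):
    """Return the number of bytes one character occupies in each encoding."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}

for label, ch in samples.items():
    print(label, byte_lengths(ch))
```

UTF-8 grows from 1 to 4 bytes, UTF-16 uses 2 bytes until a code point needs a surrogate pair (4 bytes), and UTF-32 always uses 4 bytes.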

BOM

Byte Order Mark: the code point U+FEFF, placed at the start of a text stream to indicate its byte order.
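As a rough sketch of how a reader might detect a BOM, the following Python compares the start of a byte string against the BOM constants in the standard `codecs` module. The function name `detect_bom` and the returned labels are hypothetical, and real-world detection may need to handle input that carries no BOM at all.

```python
import codecs

# Checked in this order on purpose: the UTF-32LE mark (FF FE 00 00)
# begins with the UTF-16LE mark (FF FE), so UTF-32 must be tested first.
_BOMS = [
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
]

def detect_bom(data: bytes):
    """Return a label for the BOM at the start of data, or None if absent."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None

print(detect_bom(codecs.BOM_UTF16_BE + "M".encode("utf-16-be")))
print(detect_bom(b"M"))
```

Note that data beginning with FF FE 00 00 is ambiguous in principle (it could also be UTF-16LE text starting with U+0000); this sketch simply prefers the UTF-32 reading.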
