About UTF-8 Encoding
UTF-8 (Unicode Transformation Format - 8-bit) is a variable-width character encoding that uses 1-4 bytes per character. It's backward compatible with ASCII and is the dominant encoding on the web. Rather than showing Unicode code points, this tool outputs the actual UTF-8 byte values: the real bytes stored in memory and transmitted over networks.
Complete Guide to UTF-8 Encoding
Free Online Text to UTF-8 Byte Converter
Convert text to actual UTF-8 byte values or decode UTF-8 bytes back to text instantly. This tool shows the real bytes that computers use to store and transmit text, not just code points. Perfect for developers, network engineers, and anyone debugging character encoding issues.
Key Features
Text to UTF-8 Encoding
- Convert any text to UTF-8 bytes
- Full Unicode character support
- Handles emojis and special symbols
- Real-time conversion as you type
- Decimal and hex byte output
UTF-8 to Text Decoding
- Decode UTF-8 bytes to readable text
- Validates byte sequences
- Error detection & messages
- Handles space/comma separators
- Supports hex input (0xFF format)
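The decoding features listed above (space or comma separators, decimal or 0x-prefixed hex input, validation with error messages) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not this tool's actual implementation; the function names `parse_byte_tokens` and `decode_utf8` are hypothetical.

```python
import re


def parse_byte_tokens(text: str) -> bytes:
    """Parse byte values written as decimal or 0x-prefixed hex.

    Tokens may be separated by spaces, commas, or both.
    (Hypothetical helper; the accepted input format is an assumption.)
    """
    tokens = [t for t in re.split(r"[\s,]+", text.strip()) if t]
    values = []
    for token in tokens:
        if token.lower().startswith("0x"):
            value = int(token, 16)   # hex input such as 0xE2
        else:
            value = int(token, 10)   # plain decimal such as 226
        if not 0 <= value <= 255:
            raise ValueError(f"{token!r} is not a byte value (0-255)")
        values.append(value)
    return bytes(values)


def decode_utf8(text: str) -> str:
    """Decode the parsed bytes as UTF-8.

    Python's decoder is strict by default, so an invalid or truncated
    sequence raises UnicodeDecodeError instead of producing garbage.
    """
    return parse_byte_tokens(text).decode("utf-8")
```

For example, `decode_utf8("0xE2 0x82 0xAC")` and `decode_utf8("226, 130, 172")` both return the Euro sign, while `decode_utf8("0xE2 0x82")` raises `UnicodeDecodeError` because the three-byte sequence is cut short.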
Real-Time Processing
- Instant conversion on input
- 300ms debounce for performance
- Live byte count display
- No button clicks required
Export Options
- Download as .txt file
- Export as .html file
- Save as .json format
- One-click copy to clipboard
What is UTF-8?
UTF-8 (Unicode Transformation Format - 8-bit) is a variable-width character encoding that can represent every character in the Unicode standard. It's backward compatible with ASCII (first 128 characters are identical) and uses 1-4 bytes per character. UTF-8 is now the dominant character encoding on the web and supports all languages, symbols, and emojis.
UTF-8 Byte Ranges:
1 byte (0x00-0x7F): Basic ASCII characters (A, B, 0-9, etc.)
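2 bytes (0xC2-0xDF + 0x80-0xBF): Latin extended, Greek, Cyrillic, Arabic, Hebrew
3 bytes (0xE0-0xEF + 2 × 0x80-0xBF): Most Chinese, Japanese, and Korean characters, plus many symbols
4 bytes (0xF0-0xF4 + 3 × 0x80-0xBF): Historic and less common scripts, musical notation, emojis
Lead bytes 0xC0, 0xC1, and 0xF5-0xFF never occur in valid UTF-8: the first two could only produce overlong encodings of ASCII, and the rest would encode values beyond U+10FFFF.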
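As a rough illustration of the ranges above, the first byte alone tells a decoder how long the sequence is. The sketch below assumes the strict ranges from the current UTF-8 definition, so it rejects 0xC0, 0xC1, and anything from 0xF5 up, which can never start a valid sequence. The function name `sequence_length` is made up for this example.

```python
def sequence_length(lead: int) -> int:
    """Return the total number of bytes in the UTF-8 sequence that starts with `lead`."""
    if 0x00 <= lead <= 0x7F:
        return 1  # ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    # 0x80-0xBF are continuation bytes; 0xC0, 0xC1 and 0xF5-0xFF are never valid.
    raise ValueError(f"0x{lead:02X} cannot start a UTF-8 sequence")


print(sequence_length(0x41))  # 1
print(sequence_length(0xE2))  # 3
```

This only checks the lead byte. A full validator would also confirm that each following byte falls in 0x80-0xBF and apply the tighter second-byte limits that follow 0xE0, 0xED, 0xF0, and 0xF4.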
UTF-8 vs Code Points
Many tools claim to show "UTF-8" but actually show Unicode code points (the abstract number assigned to each character). This tool shows the actual UTF-8 bytes: the real data stored in files and sent over networks. Here's the difference:
Example: "€" (Euro sign)
Code point: U+20AC (decimal: 8364) → one number
UTF-8 bytes: 0xE2 0x82 0xAC (decimal: 226 130 172) → three bytes
Example: "😀" (Grinning face)
Code point: U+1F600 (decimal: 128512) → one number
UTF-8 bytes: 0xF0 0x9F 0x98 0x80 (decimal: 240 159 152 128) → four bytes
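The same comparison can be reproduced in Python, where `ord()` gives the code point and `str.encode("utf-8")` gives the bytes. The helper name `describe` is just for this example.

```python
def describe(char: str) -> str:
    """Show one character's code point next to its UTF-8 bytes (hex and decimal)."""
    code_point = ord(char)            # the abstract Unicode number
    utf8 = char.encode("utf-8")       # the bytes actually stored or transmitted
    hex_bytes = " ".join(f"0x{b:02X}" for b in utf8)
    dec_bytes = " ".join(str(b) for b in utf8)
    return f"U+{code_point:04X} ({code_point}) -> {hex_bytes} ({dec_bytes})"


print(describe("€"))   # U+20AC (8364) -> 0xE2 0x82 0xAC (226 130 172)
print(describe("😀"))  # U+1F600 (128512) -> 0xF0 0x9F 0x98 0x80 (240 159 152 128)
```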
How UTF-8 Encoding Works
- ASCII characters (U+0000 to U+007F): Encoded as a single byte, identical to ASCII. Example: 'A' → 0x41
- 2-byte characters (U+0080 to U+07FF): First byte has the form 110xxxxx, second 10xxxxxx. Example: 'é' → 0xC3 0xA9
- 3-byte characters (U+0800 to U+FFFF, excluding the surrogate range U+D800 to U+DFFF): First byte has the form 1110xxxx, followed by two 10xxxxxx bytes. Example: '€' → 0xE2 0x82 0xAC
- 4-byte characters (U+10000 to U+10FFFF): First byte has the form 11110xxx, followed by three 10xxxxxx bytes. Example: '😀' → 0xF0 0x9F 0x98 0x80
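The bit patterns above translate almost directly into shifts and masks. Here is a minimal hand-written encoder for illustration only. In real code you would call `str.encode("utf-8")`, and the function name `encode_code_point` is an assumption made for this sketch.

```python
def encode_code_point(cp: int) -> bytes:
    """Encode a single Unicode code point as UTF-8 using the bit patterns by hand."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"U+{cp:04X} is not an encodable Unicode scalar value")
    if cp <= 0x7F:
        # 0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


print(encode_code_point(0x20AC).hex(" "))   # e2 82 ac
print(encode_code_point(0x1F600).hex(" "))  # f0 9f 98 80
```

Each continuation byte carries 6 bits of the code point (`& 0x3F`) under the `10` prefix (`0x80 |`), and the lead byte carries whatever high bits remain.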
Common Use Cases
Debugging Encoding Issues: See the actual bytes stored in files to diagnose mojibake, garbled text, or encoding mismatches.
Network Analysis: Verify how text is encoded when transmitted over HTTP, WebSocket, or other protocols.
Database Debugging: Check UTF-8 byte sequences stored in databases to troubleshoot character set issues.
Education: Learn how UTF-8 encoding works at the byte level and understand variable-width encoding.
File Analysis: Understand how text editors and systems store characters in UTF-8 encoded files.
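To make the mojibake case concrete, here is a small sketch of how garbled text arises when UTF-8 bytes are decoded with the wrong encoding, and how it can sometimes be reversed. Windows-1252 is assumed as the wrong decoder purely for illustration; the culprit in a real bug may be a different code page. The names `misdecode` and `repair` are hypothetical.

```python
def misdecode(text: str, wrong_encoding: str = "cp1252") -> str:
    """Simulate mojibake: encode as UTF-8, then decode with the wrong encoding."""
    return text.encode("utf-8").decode(wrong_encoding)


def repair(garbled: str, wrong_encoding: str = "cp1252") -> str:
    """Undo the mistake by reversing both steps.

    This only works if no bytes were lost or replaced along the way.
    """
    return garbled.encode(wrong_encoding).decode("utf-8")


garbled = misdecode("€")
print(garbled)          # â‚¬  (three characters, one per UTF-8 byte)
print(repair(garbled))  # €
```

Looking at the raw bytes (0xE2 0x82 0xAC here) is what tells you the data itself is fine and only the decoding step is wrong.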
Quick Reference: UTF-8 Byte Examples
Programming Examples
Get UTF-8 Bytes in Different Languages:
JavaScript:
new TextEncoder().encode('€') // Uint8Array [226, 130, 172]
new TextDecoder().decode(new Uint8Array([226, 130, 172])) // '€'
Python:
'€'.encode('utf-8') # b'\xe2\x82\xac'
b'\xe2\x82\xac'.decode('utf-8') # '€'
Java:
import java.nio.charset.StandardCharsets;
byte[] bytes = "€".getBytes(StandardCharsets.UTF_8); // [-30, -126, -84] (Java bytes are signed)
String text = new String(bytes, StandardCharsets.UTF_8); // "€"
100% Privacy Guaranteed
All UTF-8 encoding and decoding is performed entirely in your web browser using JavaScript. Your text and data never leave your device - nothing is uploaded to servers, stored in databases, logged, or transmitted to any third party. Complete privacy and security for all your conversions.
Learn More About UTF-8
Want to understand how UTF-8 encoding works under the hood? Read our in-depth guide covering variable-width encoding, byte patterns, step-by-step encoding examples, and best practices.
Read: What is UTF-8?
Related Encoding & Text Tools
ASCII Converter
Convert text to ASCII character codes (0-127) and decode ASCII numbers back to text.
Hex Converter
Convert text to hexadecimal and hex to text for web development and debugging.
Base64 Encoder
Encode and decode Base64 strings for data transmission and web development.
URL Encoder
Encode and decode URLs for safe transmission. Handle special characters and query parameters.
Base Converter
Convert numbers between Binary, Octal, Decimal, and Hexadecimal number systems.
Text Editor Pro
Advanced text editing with find/replace, multi-line editing, and transformation tools.