What is URL Encoding? A Complete Guide to Percent-Encoding
Every time you search Google, click a link with special characters, or submit a web form, URL encoding is working behind the scenes. It's the invisible mechanism that keeps the web functioning — converting unsafe characters into a format that can travel safely across the internet.
Table of Contents
- What is a URL?
- Anatomy of a URL
- What is URL Encoding (Percent-Encoding)?
- Why Do We Need URL Encoding?
- How URL Encoding Works
- Reserved vs Unreserved Characters
- Common URL-Encoded Characters
- Encoding Unicode & International Characters
- URL Encoding in Programming Languages
- encodeURI vs encodeURIComponent
- Common Mistakes with URL Encoding
- Real-World Examples
What is a URL?
A URL (Uniform Resource Locator) is the address of a resource on the internet. Every web page, image, API endpoint, and downloadable file has a URL that tells your browser exactly where to find it and how to access it.
URLs were defined in RFC 1738 (1994) and later refined in RFC 3986 (2005), which is the current standard. A URL is actually a specific type of URI (Uniform Resource Identifier) — one that includes the information needed to locate and access the resource.
For example, when you type https://www.example.com/search?q=hello+world in your browser, every part of that string has a specific meaning — from the protocol to the query parameters.
Anatomy of a URL
A URL is made up of several components, each serving a specific purpose:
https://user:pass@www.example.com:443/path/page?key=value&q=test#section
└─┬─┘   └───┬───┘ └──────┬──────┘ └┬┘└───┬────┘ └──────┬───────┘ └──┬──┘
scheme  userinfo       host      port  path          query      fragment
| Component | Example | Description |
|---|---|---|
| Scheme | https | The protocol used to access the resource (http, https, ftp, mailto, etc.) |
| User Info | user:pass | Optional credentials (rarely used in modern web — security risk) |
| Host | www.example.com | The domain name or IP address of the server |
| Port | 443 | The server port (defaults: 80 for HTTP, 443 for HTTPS) |
| Path | /path/page | The specific resource location on the server |
| Query | key=value&q=test | Key-value pairs for passing data (starts with ?) |
| Fragment | #section | A bookmark within the page (never sent to the server) |
Not all components are required. A minimal URL might be just https://example.com, while a complex API call might use every component.
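To see these components programmatically, here is a minimal sketch using Python's standard `urllib.parse.urlsplit` on the example URL from the diagram. The `components` dictionary and its key names are only illustrative labels chosen to match the table above.

```python
from urllib.parse import urlsplit

url = "https://user:pass@www.example.com:443/path/page?key=value&q=test#section"
parts = urlsplit(url)

# Map the parsed attributes onto the component names used in the table.
components = {
    "scheme": parts.scheme,
    "userinfo": f"{parts.username}:{parts.password}",
    "host": parts.hostname,
    "port": parts.port,
    "path": parts.path,
    "query": parts.query,
    "fragment": parts.fragment,
}

for name, value in components.items():
    print(f"{name:<9}{value}")
```

Note that the parser returns the fragment without its leading `#` and the query without its leading `?`; those characters are delimiters rather than part of the component values.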
What is URL Encoding (Percent-Encoding)?
URL encoding, also known as percent-encoding, is a mechanism for converting characters that are not allowed in a URL into a safe representation. It works by replacing each byte of an unsafe character with a percent sign (%) followed by two hexadecimal digits representing that byte's value.
For example, a space character (which is not allowed in URLs) becomes %20, because the ASCII code for a space is 32, and 32 in hexadecimal is 20.
Before encoding: https://example.com/search?q=hello world&lang=en
After encoding:  https://example.com/search?q=hello%20world&lang=en
                                                   ^^^
                                                   space → %20
This encoding is defined in RFC 3986 and is one of the fundamental building blocks of the web. Without it, URLs containing spaces, non-ASCII characters, or special symbols would simply break.
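The arithmetic behind `%20` can be checked in a few lines. This is a small sketch for single ASCII characters only; the helper names `to_percent_triplet` and `from_percent_triplet` are hypothetical and exist just for this illustration.

```python
def to_percent_triplet(char: str) -> str:
    """Encode one ASCII character as %XX (hypothetical helper)."""
    code = ord(char)  # " " -> 32
    if code > 0x7F:
        raise ValueError("this sketch only handles single-byte ASCII characters")
    return f"%{code:02X}"  # 32 -> "20" -> "%20"


def from_percent_triplet(triplet: str) -> str:
    """Decode one %XX triplet back to its ASCII character (hypothetical helper)."""
    return chr(int(triplet[1:], 16))  # "20" -> 32 -> " "


print(ord(" "))                            # 32
print(to_percent_triplet(" "))             # %20
print(from_percent_triplet("%20") == " ")  # True
```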
Why Do We Need URL Encoding?
URLs can only contain a limited set of characters from the ASCII character set. URL encoding is necessary for several critical reasons:
- Reserved characters have special meaning: Characters like ?, &, =, and # are structural delimiters in URLs. If your data contains these characters, they must be encoded so they aren't misinterpreted.
- Spaces aren't allowed: URLs cannot contain spaces. A space in a URL would break the request because HTTP uses spaces to delimit parts of the request line.
- Non-ASCII characters: Characters like ñ, ü, 日本語, or emojis are not part of the ASCII set and must be encoded for URL transmission.
- Data integrity: Encoding ensures that data passed through URLs arrives exactly as intended, without being corrupted or misinterpreted by web servers, proxies, or browsers.
- Security: Proper encoding helps prevent injection attacks. Without it, an attacker could craft malicious URLs that break out of the intended context.
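The first point is easy to demonstrate. In this sketch, a value containing `&` is placed in a query string with and without encoding, then parsed with Python's standard `urllib.parse.parse_qs`. The parameter name `company` and the value are made up for the example.

```python
from urllib.parse import parse_qs, quote

raw_value = "AT&T"

# Without encoding, the & inside the value is read as a parameter separator.
unencoded_query = "company=" + raw_value
# With encoding, the & travels as %26 and stays part of the value.
encoded_query = "company=" + quote(raw_value, safe="")

print(parse_qs(unencoded_query))  # {'company': ['AT']}
print(parse_qs(encoded_query))    # {'company': ['AT&T']}
```

In the unencoded case the trailing `T` is silently dropped by `parse_qs` because it has no `=`; other parsers may treat it as an empty parameter instead, but either way the original value is lost.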
How URL Encoding Works
The encoding process follows these steps:
- Take the character that needs encoding.
- Convert it to its byte representation in UTF-8.
- For each byte, write a % followed by the two-digit hexadecimal value of that byte.
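The steps above can be sketched in Python as follows. This is an illustrative implementation, not a replacement for a library function: the name `percent_encode` is hypothetical, and it assumes that everything outside the RFC 3986 unreserved set (letters, digits, and `- _ . ~`) should be encoded, which is the right choice for a single data value but not for a whole URL.

```python
import string
from urllib.parse import quote

# Unreserved characters per RFC 3986: letters, digits, and - _ . ~
UNRESERVED = set(string.ascii_letters + string.digits + "-_.~")


def percent_encode(text: str) -> str:
    """Percent-encode every character that is not unreserved."""
    pieces = []
    for char in text:
        if char in UNRESERVED:
            pieces.append(char)                # safe as-is
        else:
            for byte in char.encode("utf-8"):  # step 2: UTF-8 bytes
                pieces.append(f"%{byte:02X}")  # step 3: % + two hex digits
    return "".join(pieces)


print(percent_encode("hello world"))  # hello%20world
print(percent_encode("é"))            # %C3%A9
print(percent_encode("日"))           # %E6%97%A5

# Cross-check against the standard library with no extra safe characters.
assert percent_encode("a b&c/d") == quote("a b&c/d", safe="")
```

In real code, prefer `urllib.parse.quote` (or the equivalent in your language); the sketch only makes the byte-by-byte process visible.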
For ASCII characters, this is straightforward — each character is one byte:
Character: space ASCII code: 32 Hex: 20 Encoded: %20
Character: ! ASCII code: 33 Hex: 21 Encoded: %21
Character: # ASCII code: 35 Hex: 23 Encoded: %23
Character: @ ASCII code: 64 Hex: 40 Encoded: %40
For non-ASCII characters, UTF-8 may produce multiple bytes, resulting in multiple percent-encoded triplets:
Character: é UTF-8 bytes: 0xC3 0xA9 Encoded: %C3%A9
Character: ñ UTF-8 bytes: 0xC3 0xB1 Encoded: %C3%B1
Character: 日 UTF-8 bytes: 0xE6 0x97 0xA5 Encoded: %E6%97%A5
Character: 😀 UTF-8 bytes: 0xF0 0x9F 0x98 0x80 Encoded: %F0%9F%98%80
Reserved vs Unreserved Characters
RFC 3986 divides URL characters into two categories:
Unreserved Characters (Never Need Encoding)
These characters can appear anywhere in a URL without encoding:
Letters: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
Digits: 0 1 2 3 4 5 6 7 8 9
Special: - _ . ~
Reserved Characters (Have Special Meaning)
These characters serve as delimiters in URLs. They must be percent-encoded when used as data (not as delimiters):
| Character | Encoded | Purpose in URLs |
|---|---|---|
| : | %3A | Separates scheme from rest, host from port |
| / | %2F | Separates path segments |
| ? | %3F | Starts the query string |
| # | %23 | Starts the fragment identifier |
| & | %26 | Separates query parameters |
| = | %3D | Separates key from value in query params |
| @ | %40 | Separates user info from host |
| + | %2B | Represents a space in form data (application/x-www-form-urlencoded) |
| % | %25 | The escape character itself (must be encoded as data) |
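The split between the two categories can be observed with Python's `urllib.parse.quote`. This sketch assumes Python 3.7 or later, where `~` is treated as unreserved; passing `safe=""` asks the function to encode everything except unreserved characters.

```python
from urllib.parse import quote

unreserved_sample = "AZaz09-_.~"
reserved_sample = ":/?#&=@+%"

# Unreserved characters pass through untouched.
print(quote(unreserved_sample, safe=""))  # AZaz09-_.~

# Every character from the table above is replaced by its %XX form.
print(quote(reserved_sample, safe=""))    # %3A%2F%3F%23%26%3D%40%2B%25

# By default quote() keeps "/" as a delimiter; safe="" treats it as data.
print(quote("a/b"))                       # a/b
print(quote("a/b", safe=""))              # a%2Fb
```

The last two lines show the "data versus delimiter" rule in practice: the same `/` is left alone when it separates path segments and encoded when it is part of a value.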
Common URL-Encoded Characters
Here's a quick reference for the most frequently encoded characters:
| Character | Encoded | Character | Encoded |
|---|---|---|---|
| Space | %20 | ! | %21 |
| " | %22 | # | %23 |
| $ | %24 | % | %25 |
| & | %26 | ' | %27 |
| ( | %28 | ) | %29 |
| + | %2B | , | %2C |
Encoding Unicode & International Characters
Modern URLs frequently contain international characters — Japanese, Arabic, Chinese, accented letters, and even emojis. These are handled by first converting the character to its UTF-8 byte sequence, then percent-encoding each byte:
Example: Encoding "café" in a URL
c → c (ASCII, no encoding needed)
a → a (ASCII, no encoding needed)
f → f (ASCII, no encoding needed)
é → UTF-8: 0xC3 0xA9 → %C3%A9
Result: caf%C3%A9
Full URL: https://example.com/search?q=caf%C3%A9
Browsers handle this automatically in the address bar: they show the readable characters but send the percent-encoded version in the actual HTTP request. This is sometimes called IRI (Internationalized Resource Identifier) support.
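The "café" example can be reproduced with Python's standard library, including the reverse direction. The last line is an added illustration of why both sides need to agree on UTF-8: decoding the same bytes as Latin-1 yields mojibake.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

word = "café"

encoded = quote(word, safe="")
print(encoded)                    # caf%C3%A9

# Decoding first recovers the raw bytes, then interprets them as UTF-8.
print(unquote_to_bytes(encoded))  # b'caf\xc3\xa9'
print(unquote(encoded))           # café

# Interpreting the same bytes with the wrong charset garbles the text.
print(unquote(encoded, encoding="latin-1"))  # cafÃ©
```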
URL Encoding in Programming Languages
JavaScript
// Encode a single query parameter value
encodeURIComponent("hello world & goodbye")
// → "hello%20world%20%26%20goodbye"
// Decode it back
decodeURIComponent("hello%20world%20%26%20goodbye")
// → "hello world & goodbye"
// Encode an entire URI (keeps structural chars intact)
encodeURI("https://example.com/path?q=hello world")
// → "https://example.com/path?q=hello%20world"
// Modern: URLSearchParams handles encoding automatically
const params = new URLSearchParams({ q: "hello world", lang: "en" });
params.toString();
// → "q=hello+world&lang=en"
// Modern: URL constructor
const url = new URL("https://example.com/search");
url.searchParams.set("q", "hello world & more");
url.toString();
// → "https://example.com/search?q=hello+world+%26+more"
Python
from urllib.parse import quote, unquote, urlencode
# Encode a string
quote("hello world & goodbye")
# → 'hello%20world%20%26%20goodbye'
# Decode it back
unquote("hello%20world%20%26%20goodbye")
# → 'hello world & goodbye'
# Encode query parameters from a dict
urlencode({"q": "hello world", "lang": "en"})
# → 'q=hello+world&lang=en'
# Encode with safe characters (keep slashes)
quote("/path/to/resource", safe="/")
# → '/path/to/resource'
PHP
// Encode (spaces become %20)
rawurlencode("hello world & goodbye");
// → "hello%20world%20%26%20goodbye"
// Encode (spaces become +, for form data)
urlencode("hello world & goodbye");
// → "hello+world+%26+goodbye"
// Decode
rawurldecode("hello%20world"); // → "hello world"
urldecode("hello+world"); // → "hello world"
Java
import java.net.URLEncoder;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
// Encode
URLEncoder.encode("hello world", StandardCharsets.UTF_8);
// → "hello+world"
// Decode
URLDecoder.decode("hello%20world", StandardCharsets.UTF_8);
// → "hello world"
encodeURI vs encodeURIComponent
This is one of the most common sources of confusion in web development. JavaScript provides two encoding functions, and using the wrong one can break your URLs:
| Function | Use Case | Does NOT encode (besides letters and digits) |
|---|---|---|
| encodeURI() | Encode a complete URL | - _ . ~ ! ' ( ) * ; / ? : @ & = + $ , # |
| encodeURIComponent() | Encode a single parameter value | - _ . ~ ! ' ( ) * |
const value = "price=10&currency=USD";
// ❌ WRONG: encodeURI won't encode & and = in the value
encodeURI("https://api.com/search?filter=" + value);
// → "https://api.com/search?filter=price=10&currency=USD"
// The server sees TWO parameters: filter=price=10 AND currency=USD
// ✅ RIGHT: encodeURIComponent encodes & and = in the value
"https://api.com/search?filter=" + encodeURIComponent(value);
// → "https://api.com/search?filter=price%3D10%26currency%3DUSD"
// The server sees ONE parameter: filter=price=10&currency=USD
Rule of Thumb
Use encodeURIComponent() for encoding individual values (query params, path segments). Use encodeURI() only when encoding an entire URL where the structural characters should remain intact. When in doubt, use encodeURIComponent().
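The same distinction exists outside JavaScript. As a rough Python analogue (an approximation, since `urllib.parse.quote` is not identical to either JavaScript function), the sketch below encodes a whole URL while keeping delimiters, then encodes only the value, and parses both results the way a server would. The `loose` and `strict` variable names are illustrative.

```python
from urllib.parse import parse_qs, quote, urlsplit

value = "price=10&currency=USD"
base = "https://api.com/search?filter="

# Roughly like encodeURI: delimiters are kept, so & and = in the value survive.
loose = quote(base + value, safe=";/?:@&=+$,#-_.!~*'()")

# Roughly like encodeURIComponent: encode the value alone, nothing kept safe.
strict = base + quote(value, safe="")

print(parse_qs(urlsplit(loose).query))
# {'filter': ['price=10'], 'currency': ['USD']}
print(parse_qs(urlsplit(strict).query))
# {'filter': ['price=10&currency=USD']}
```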
Common Mistakes with URL Encoding
1. Double Encoding
Encoding a value that's already encoded turns %20 into %2520 (the % gets encoded to %25). This is one of the most common URL encoding bugs:
// ❌ Double encoding
const encoded = encodeURIComponent("hello world"); // "hello%20world"
encodeURIComponent(encoded); // "hello%2520world" — WRONG!
// ✅ Only encode once
const value = "hello world";
const url = "/search?q=" + encodeURIComponent(value);
2. Using encodeURI for Query Parameters
As shown above, encodeURI() won't encode characters like & and =, which can cause parameter injection when user input contains these characters.
3. Confusing + and %20 for Spaces
There are two ways to encode a space in URLs, and they come from different standards:
- %20: RFC 3986 percent-encoding (used in path segments and general URLs)
- +: application/x-www-form-urlencoded (used in HTML form submissions and query strings)
Both are valid in query strings, but a + in a path segment is a literal plus sign, so you need to decode each value with the convention it was encoded with. decodeURIComponent() does NOT decode + as a space, only %20.
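Python makes the two conventions explicit by offering a separate pair of functions for each. A short sketch using the standard `urllib.parse` module:

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# Encoding: the two conventions differ only in how a space is written.
print(quote("hello world"))         # hello%20world
print(quote_plus("hello world"))    # hello+world

# Decoding: only the form-style decoder turns + back into a space.
print(unquote("hello+world"))       # hello+world
print(unquote_plus("hello+world"))  # hello world

# A literal plus sign must be sent as %2B to survive form-style decoding.
print(quote_plus("1+1=2"))          # 1%2B1%3D2
print(unquote_plus("1%2B1%3D2"))    # 1+1=2
```

Mixing the pairs (for example, encoding with `quote_plus` and decoding with `unquote`) is what leaves stray `+` characters in decoded text.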
4. Not Encoding File Paths
File names can contain spaces and special characters. When using file names in URLs, always encode them:
// ❌ Broken URL
const brokenUrl = "/files/" + "my report (final).pdf";
// → "/files/my report (final).pdf" (the spaces break this)
// ✅ Properly encoded
const encodedUrl = "/files/" + encodeURIComponent("my report (final).pdf");
// → "/files/my%20report%20(final).pdf"
Real-World Examples
Google Search
When you search for "what is URL encoding?" on Google:
https://www.google.com/search?q=what+is+URL+encoding%3F
                                    ^  ^   ^        ^^^
                                 spaces → +       ? → %3F
API Requests
APIs frequently require encoded parameters:
// Fetching data with special characters in filters
GET /api/users?name=O%27Brien&city=San%20Francisco
Decoded:
name = O'Brien (' → %27)
city = San Francisco (space → %20)
Email Mailto Links
mailto:user@example.com?subject=Hello%20World&body=Hi%2C%20how%20are%20you%3F
Decoded:
subject = Hello World
body = Hi, how are you?
Redirect URLs
OAuth and login flows often pass redirect URLs as parameters — a URL inside a URL:
https://auth.example.com/login?redirect=https%3A%2F%2Fapp.example.com%2Fdashboard%3Ftab%3Dsettings
Decoded redirect value:
https://app.example.com/dashboard?tab=settings
Without encoding, the inner URL's ? and = would be interpreted as part of the outer URL's query string, completely breaking the redirect.
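A URL-inside-a-URL like this can be built and unpacked safely with Python's `urllib.parse`. In this sketch, `urlencode` encodes the inner URL as a single parameter value, and `parse_qs` recovers it unchanged on the receiving side. The variable names are illustrative; the URLs are the ones from the example above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

inner = "https://app.example.com/dashboard?tab=settings"

# urlencode percent-encodes the inner URL's :, /, ? and = as data.
outer = "https://auth.example.com/login?" + urlencode({"redirect": inner})
print(outer)
# https://auth.example.com/login?redirect=https%3A%2F%2Fapp.example.com%2Fdashboard%3Ftab%3Dsettings

# The receiving side parses the outer query and gets the inner URL back intact.
recovered = parse_qs(urlsplit(outer).query)["redirect"][0]
print(recovered == inner)  # True
```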
References
- Berners-Lee, T., Fielding, R., & Masinter, L. (2005). RFC 3986 — Uniform Resource Identifier (URI): Generic Syntax. https://datatracker.ietf.org/doc/html/rfc3986
- Berners-Lee, T., Masinter, L., & McCahill, M. (1994). RFC 1738 — Uniform Resource Locators (URL). https://datatracker.ietf.org/doc/html/rfc1738
- Mozilla Developer Network. encodeURIComponent() — JavaScript. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent
- WHATWG. URL Standard. https://url.spec.whatwg.org/
- Python Software Foundation. urllib.parse — Parse URLs into components. https://docs.python.org/3/library/urllib.parse.html