What is URL Encoding? A Complete Guide to Percent-Encoding

Every time you search Google, click a link with special characters, or submit a web form, URL encoding is working behind the scenes. It's the invisible mechanism that keeps the web functioning — converting unsafe characters into a format that can travel safely across the internet.

What is a URL?

A URL (Uniform Resource Locator) is the address of a resource on the internet. Every web page, image, API endpoint, and downloadable file has a URL that tells your browser exactly where to find it and how to access it.

URLs were defined in RFC 1738 (1994) and later refined in RFC 3986 (2005), which is the current standard. A URL is actually a specific type of URI (Uniform Resource Identifier) — one that includes the information needed to locate and access the resource.

For example, when you type https://www.example.com/search?q=hello+world in your browser, every part of that string has a specific meaning — from the protocol to the query parameters.

Anatomy of a URL

A URL is made up of several components, each serving a specific purpose:

https://user:pass@www.example.com:443/path/page?key=value&q=test#section
└─┬─┘   └───┬───┘ └──────┬──────┘ └┬┘└───┬────┘ └──────┬───────┘ └──┬──┘
scheme  userinfo       host       port path          query      fragment

Component  Example            Description
Scheme     https              The protocol used to access the resource (http, https, ftp, mailto, etc.)
User Info  user:pass          Optional credentials (rarely used in modern web — security risk)
Host       www.example.com    The domain name or IP address of the server
Port       443                The server port (defaults: 80 for HTTP, 443 for HTTPS)
Path       /path/page         The specific resource location on the server
Query      key=value&q=test   Key-value pairs for passing data (starts with ?)
Fragment   #section           A bookmark within the page (never sent to the server)

Not all components are required. A minimal URL might be just https://example.com, while a complex API call might use every component.
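
As a rough illustration, Python's standard urllib.parse.urlsplit can split the example URL above into the same components. The variable name parts is only for illustration.

```python
from urllib.parse import urlsplit

# The example URL from the diagram above
parts = urlsplit(
    "https://user:pass@www.example.com:443/path/page?key=value&q=test#section"
)

print(parts.scheme)    # https
print(parts.username)  # user
print(parts.password)  # pass
print(parts.hostname)  # www.example.com
print(parts.port)      # 443
print(parts.path)      # /path/page
print(parts.query)     # key=value&q=test
print(parts.fragment)  # section
```

Note that the parser returns the query and fragment without their leading ? and # delimiters.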

What is URL Encoding (Percent-Encoding)?

URL encoding, also known as percent-encoding, is a mechanism for converting characters that are not allowed in a URL into a safe representation. It works by replacing unsafe characters with a percent sign (%) followed by two hexadecimal digits representing the character's byte value.

For example, a space character (which is not allowed in URLs) becomes %20, because the ASCII code for a space is 32, and 32 in hexadecimal is 20.

Before encoding:  https://example.com/search?q=hello world&lang=en
After encoding:   https://example.com/search?q=hello%20world&lang=en
                                                    ^^^
                                                space → %20

This encoding is defined in RFC 3986 and is one of the fundamental building blocks of the web. Without it, URLs containing spaces, non-ASCII characters, or special symbols would simply break.

Why Do We Need URL Encoding?

URLs can only contain a limited set of characters from the ASCII character set. URL encoding is necessary for several critical reasons:

  • Reserved characters have special meaning — Characters like ?, &, =, and # are structural delimiters in URLs. If your data contains these characters, they must be encoded so they aren't misinterpreted.
  • Spaces aren't allowed — URLs cannot contain spaces. A space in a URL would break the request because HTTP uses spaces to delimit parts of the request line.
  • Non-ASCII characters — Characters like ñ, ü, 日本語, or emojis are not part of the ASCII set and must be encoded for URL transmission.
  • Data integrity — Encoding ensures that data passed through URLs arrives exactly as intended, without being corrupted or misinterpreted by web servers, proxies, or browsers.
  • Security — Proper encoding helps prevent injection attacks. Without it, an attacker could craft malicious URLs that break out of the intended context.
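
To make the first point concrete, here is a small sketch using Python's standard urllib.parse. The search term is a hypothetical user input; the sketch only shows how a query-string parser reads the unencoded and encoded forms.

```python
from urllib.parse import parse_qs, quote

search_term = "rock&roll"  # hypothetical user input containing a reserved character

# Unencoded: the & inside the value is read as a parameter separator
raw_query = "q=" + search_term
broken = parse_qs(raw_query, keep_blank_values=True)
print(broken)  # {'q': ['rock'], 'roll': ['']}

# Encoded: the & travels as data and the value arrives intact
safe_query = "q=" + quote(search_term, safe="")
intact = parse_qs(safe_query)
print(intact)  # {'q': ['rock&roll']}
```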

How URL Encoding Works

The encoding process follows these steps:

  1. Take the character that needs encoding.
  2. Convert it to its byte representation in UTF-8.
  3. For each byte, write a % followed by the two-digit hexadecimal value of that byte.

For ASCII characters, this is straightforward — each character is one byte:

Character: space       ASCII code: 32     Hex: 20     Encoded: %20
Character: !           ASCII code: 33     Hex: 21     Encoded: %21
Character: #           ASCII code: 35     Hex: 23     Encoded: %23
Character: @           ASCII code: 64     Hex: 40     Encoded: %40

For non-ASCII characters, UTF-8 may produce multiple bytes, resulting in multiple percent-encoded triplets:

Character: é       UTF-8 bytes: 0xC3 0xA9      Encoded: %C3%A9
Character: ñ       UTF-8 bytes: 0xC3 0xB1      Encoded: %C3%B1
Character: 日      UTF-8 bytes: 0xE6 0x97 0xA5  Encoded: %E6%97%A5
Character: 😀      UTF-8 bytes: 0xF0 0x9F 0x98 0x80  Encoded: %F0%9F%98%80
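
The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a library function: percent_encode is a hypothetical helper, and it assumes that only letters, digits, and - _ . ~ are left unencoded.

```python
# Characters this sketch leaves untouched (an assumption of the example)
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_.~"
)

def percent_encode(text: str) -> str:
    """Percent-encode every character outside the UNRESERVED set."""
    out = []
    for ch in text:                      # step 1: take each character
        if ch in UNRESERVED:
            out.append(ch)
        else:
            # step 2: UTF-8 bytes; step 3: % plus two hex digits per byte
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)

print(percent_encode(" "))   # %20
print(percent_encode("é"))   # %C3%A9
print(percent_encode("日"))  # %E6%97%A5
print(percent_encode("😀"))  # %F0%9F%98%80
```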

Reserved vs Unreserved Characters

RFC 3986 divides URL characters into two categories:

Unreserved Characters (Never Need Encoding)

These characters can appear anywhere in a URL without encoding:

Letters:  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
          a b c d e f g h i j k l m n o p q r s t u v w x y z
Digits:   0 1 2 3 4 5 6 7 8 9
Special:  - _ . ~

Reserved Characters (Have Special Meaning)

These characters serve as delimiters in URLs. They must be percent-encoded when used as data (not as delimiters):

Character  Encoded  Purpose in URLs
:          %3A      Separates scheme from rest, host from port
/          %2F      Separates path segments
?          %3F      Starts the query string
#          %23      Starts the fragment identifier
&          %26      Separates query parameters
=          %3D      Separates key from value in query params
@          %40      Separates user info from host
+          %2B      Represents a space in form data (application/x-www-form-urlencoded)
%          %25      The escape character itself (not formally reserved, but must be encoded when used as data)
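
As a sketch of the "data versus delimiter" distinction, the Python snippet below puts a value containing / into a single path segment. The artist name and URL are hypothetical; safe="" is needed because urllib.parse.quote leaves / unencoded by default.

```python
from urllib.parse import quote, unquote

artist = "AC/DC"  # hypothetical path segment whose data contains "/"

# safe="" tells quote() to encode "/" as well
segment = quote(artist, safe="")
url = "https://example.com/artists/" + segment
print(url)  # https://example.com/artists/AC%2FDC

# The structural slashes still split the path; the encoded one stays inside its segment
segments = url.split("/")[3:]
print([unquote(s) for s in segments])  # ['artists', 'AC/DC']
```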

Common URL-Encoded Characters

Here's a quick reference for the most frequently encoded characters:

Character  Encoded    Character  Encoded
Space      %20        !          %21
"          %22        #          %23
$          %24        %          %25
&          %26        '          %27
(          %28        )          %29
+          %2B        ,          %2C

Encoding Unicode & International Characters

Modern URLs frequently contain international characters — Japanese, Arabic, Chinese, accented letters, and even emojis. These are handled by first converting the character to its UTF-8 byte sequence, then percent-encoding each byte:

Example: Encoding "café" in a URL

c → c          (ASCII, no encoding needed)
a → a          (ASCII, no encoding needed)
f → f          (ASCII, no encoding needed)
é → UTF-8: 0xC3 0xA9 → %C3%A9

Result: caf%C3%A9

Full URL: https://example.com/search?q=caf%C3%A9

Browsers handle this automatically in the address bar — they show the readable characters but send the percent-encoded version in the actual HTTP request. This is sometimes called IRI (Internationalized Resource Identifier) support.
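
For a quick check of the "café" walkthrough, Python's urllib.parse.quote and unquote use UTF-8 by default and give the same result. A minimal sketch:

```python
from urllib.parse import quote, unquote

encoded = quote("café")
print(encoded)                                    # caf%C3%A9
print("https://example.com/search?q=" + encoded)  # https://example.com/search?q=caf%C3%A9

# Decoding reverses the process: bytes first, then UTF-8 text
print(unquote(encoded))                           # café
print(unquote("%E6%97%A5%E6%9C%AC%E8%AA%9E"))     # 日本語
```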

URL Encoding in Programming Languages

JavaScript

// Encode a single query parameter value
encodeURIComponent("hello world & goodbye")
// → "hello%20world%20%26%20goodbye"

// Decode it back
decodeURIComponent("hello%20world%20%26%20goodbye")
// → "hello world & goodbye"

// Encode an entire URI (keeps structural chars intact)
encodeURI("https://example.com/path?q=hello world")
// → "https://example.com/path?q=hello%20world"

// Modern: URLSearchParams handles encoding automatically
const params = new URLSearchParams({ q: "hello world", lang: "en" });
params.toString();
// → "q=hello+world&lang=en"

// Modern: URL constructor
const url = new URL("https://example.com/search");
url.searchParams.set("q", "hello world & more");
url.toString();
// → "https://example.com/search?q=hello+world+%26+more"

Python

from urllib.parse import quote, unquote, urlencode

# Encode a string
quote("hello world & goodbye")
# → 'hello%20world%20%26%20goodbye'

# Decode it back
unquote("hello%20world%20%26%20goodbye")
# → 'hello world & goodbye'

# Encode query parameters from a dict
urlencode({"q": "hello world", "lang": "en"})
# → 'q=hello+world&lang=en'

# Slashes are kept by default (safe="/"); pass safe="" to encode them too
quote("/path/to/resource")
# → '/path/to/resource'
quote("/path/to/resource", safe="")
# → '%2Fpath%2Fto%2Fresource'

PHP

// Encode (spaces become %20)
rawurlencode("hello world & goodbye");
// → "hello%20world%20%26%20goodbye"

// Encode (spaces become +, for form data)
urlencode("hello world & goodbye");
// → "hello+world+%26+goodbye"

// Decode
rawurldecode("hello%20world");  // → "hello world"
urldecode("hello+world");       // → "hello world"

Java

import java.net.URLEncoder;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Encode (form-style: spaces become +; this Charset overload requires Java 10+)
URLEncoder.encode("hello world", StandardCharsets.UTF_8);
// → "hello+world"

// Decode
URLDecoder.decode("hello%20world", StandardCharsets.UTF_8);
// → "hello world"

encodeURI vs encodeURIComponent

This is one of the most common sources of confusion in web development. JavaScript provides two encoding functions, and using the wrong one can break your URLs:

Function              Use Case                          Does NOT encode (besides A-Z a-z 0-9)
encodeURI()           Encode a complete URL             - _ . ~ ! ' ( ) * ; / ? : @ & = + $ , #
encodeURIComponent()  Encode a single parameter value   - _ . ~ ! ' ( ) *

const value = "price=10&currency=USD";

// ❌ WRONG: encodeURI won't encode & and = in the value
encodeURI("https://api.com/search?filter=" + value);
// → "https://api.com/search?filter=price=10&currency=USD"
// The server sees TWO parameters: filter=price=10 AND currency=USD

// ✅ RIGHT: encodeURIComponent encodes & and = in the value
"https://api.com/search?filter=" + encodeURIComponent(value);
// → "https://api.com/search?filter=price%3D10%26currency%3DUSD"
// The server sees ONE parameter: filter=price=10&currency=USD

Rule of Thumb

Use encodeURIComponent() for encoding individual values (query params, path segments). Use encodeURI() only when encoding an entire URL where the structural characters should remain intact. When in doubt, use encodeURIComponent().
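
The same pitfall exists outside JavaScript. As a sketch in Python, the safe argument of urllib.parse.quote decides which reserved characters are left alone. Leaving & and = alone roughly mirrors the encodeURI() mistake, while safe="" roughly mirrors encodeURIComponent(). The mapping is approximate, not exact.

```python
from urllib.parse import parse_qs, quote

value = "price=10&currency=USD"

# Loose: & and = are treated as structure and left alone
loose = "filter=" + quote(value, safe="=&")
print(loose)            # filter=price=10&currency=USD
print(parse_qs(loose))  # {'filter': ['price=10'], 'currency': ['USD']}

# Strict: & and = are encoded as data
strict = "filter=" + quote(value, safe="")
print(strict)            # filter=price%3D10%26currency%3DUSD
print(parse_qs(strict))  # {'filter': ['price=10&currency=USD']}
```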

Common Mistakes with URL Encoding

1. Double Encoding

Encoding a value that's already encoded turns %20 into %2520 (the % gets encoded to %25). This is one of the most common URL encoding bugs:

// ❌ Double encoding
const encoded = encodeURIComponent("hello world");  // "hello%20world"
encodeURIComponent(encoded);  // "hello%2520world" — WRONG!

// ✅ Only encode once
const value = "hello world";
const url = "/search?q=" + encodeURIComponent(value);
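
The effect is not specific to JavaScript. A short Python sketch with urllib.parse shows the same double encoding, and shows that a single decode on the receiving side no longer recovers the original text:

```python
from urllib.parse import quote, unquote

once = quote("hello world")
twice = quote(once)  # the bug: encoding an already-encoded value

print(once)   # hello%20world
print(twice)  # hello%2520world

# One round of decoding only undoes one round of encoding
print(unquote(twice))           # hello%20world
print(unquote(unquote(twice)))  # hello world
```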

2. Using encodeURI for Query Parameters

As shown above, encodeURI() won't encode characters like & and =, which can cause parameter injection when user input contains these characters.

3. Confusing + and %20 for Spaces

There are two ways to encode a space in URLs, and they come from different standards:

  • %20 — RFC 3986 percent-encoding (used in path segments and general URLs)
  • + — application/x-www-form-urlencoded (used in HTML form submissions and query strings)

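Both are valid in a query string, but a + in a path segment is a literal plus sign, so you need to decode with the rules that match how the value was encoded. decodeURIComponent() does NOT decode + as a space, only %20.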
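
Python makes the two decoding conventions explicit with separate functions, which is a convenient way to see the difference. A minimal sketch using urllib.parse:

```python
from urllib.parse import unquote, unquote_plus

# unquote follows plain percent-encoding: + stays a plus sign
print(unquote("hello+world"))         # hello+world
print(unquote("hello%20world"))       # hello world

# unquote_plus follows form encoding: + becomes a space
print(unquote_plus("hello+world"))    # hello world
print(unquote_plus("hello%20world"))  # hello world

# A literal plus in form data must therefore be sent as %2B
print(unquote_plus("1%2B1"))          # 1+1
```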

4. Not Encoding File Paths

File names can contain spaces and special characters. When using file names in URLs, always encode them:

// ❌ Broken URL
const brokenUrl = "/files/" + "my report (final).pdf";
// → "/files/my report (final).pdf"  (the spaces break this)

// ✅ Properly encoded
const encodedUrl = "/files/" + encodeURIComponent("my report (final).pdf");
// → "/files/my%20report%20(final).pdf"
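
For comparison, here is the same file name encoded with Python's urllib.parse.quote. Unlike encodeURIComponent(), quote() also encodes the parentheses. Both forms decode to the same file name.

```python
from urllib.parse import quote, unquote

filename = "my report (final).pdf"

# safe="" so that a "/" inside a file name would be encoded too
url = "/files/" + quote(filename, safe="")
print(url)  # /files/my%20report%20%28final%29.pdf

# Decoding the last segment recovers the original name
print(unquote(url.rsplit("/", 1)[1]))  # my report (final).pdf
```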

Real-World Examples

Google Search

When you search for "what is URL encoding?" on Google:

https://www.google.com/search?q=what+is+URL+encoding%3F
                                    ^  ^   ^        ^^^
                                  spaces → +      ? → %3F

API Requests

APIs frequently require encoded parameters:

// Fetching data with special characters in filters
GET /api/users?name=O%27Brien&city=San%20Francisco

Decoded:
  name = O'Brien        (' → %27)
  city = San Francisco  (space → %20)
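
A request like this one could be built in Python roughly as follows. The endpoint and parameter names come from the example above. Passing quote_via=quote is a choice made here so that spaces become %20; by default urlencode() would produce + instead.

```python
from urllib.parse import parse_qs, quote, urlencode

params = {"name": "O'Brien", "city": "San Francisco"}

# quote_via=quote gives %20 for spaces (the default, quote_plus, gives +)
query = urlencode(params, quote_via=quote)
print("/api/users?" + query)
# /api/users?name=O%27Brien&city=San%20Francisco

# The server side decodes the values back to the original text
print(parse_qs(query))
# {'name': ["O'Brien"], 'city': ['San Francisco']}
```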

Email Mailto Links

mailto:user@example.com?subject=Hello%20World&body=Hi%2C%20how%20are%20you%3F

Decoded:
  subject = Hello World
  body = Hi, how are you?

Redirect URLs

OAuth and login flows often pass redirect URLs as parameters — a URL inside a URL:

https://auth.example.com/login?redirect=https%3A%2F%2Fapp.example.com%2Fdashboard%3Ftab%3Dsettings

Decoded redirect value:
  https://app.example.com/dashboard?tab=settings

Without encoding, the inner URL's ? and = would be interpreted as part of the outer URL's query string, completely breaking the redirect.
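
A sketch of the "URL inside a URL" case in Python, using the hosts from the example above. The key detail is safe="", which makes urllib.parse.quote encode :, / and ? inside the inner URL as well.

```python
from urllib.parse import parse_qs, quote, urlsplit

inner = "https://app.example.com/dashboard?tab=settings"

# Encode the whole inner URL as a single parameter value
outer = "https://auth.example.com/login?redirect=" + quote(inner, safe="")
print(outer)
# https://auth.example.com/login?redirect=https%3A%2F%2Fapp.example.com%2Fdashboard%3Ftab%3Dsettings

# The receiving side parses the outer query and gets the inner URL back intact
recovered = parse_qs(urlsplit(outer).query)["redirect"][0]
print(recovered)  # https://app.example.com/dashboard?tab=settings
```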

References

  1. Berners-Lee, T., Fielding, R., & Masinter, L. (2005). RFC 3986 — Uniform Resource Identifier (URI): Generic Syntax. https://datatracker.ietf.org/doc/html/rfc3986
  2. Berners-Lee, T., Masinter, L., & McCahill, M. (1994). RFC 1738 — Uniform Resource Locators (URL). https://datatracker.ietf.org/doc/html/rfc1738
  3. Mozilla Developer Network. encodeURIComponent() — JavaScript. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent
  4. WHATWG. URL Standard. https://url.spec.whatwg.org/
  5. Python Software Foundation. urllib.parse — Parse URLs into components. https://docs.python.org/3/library/urllib.parse.html