URL Encoding Explained: When and Why to Percent-Encode
URLs are the addresses of the web, but they can legally contain only a limited set of characters. When your URL needs to include spaces, non-ASCII characters, or characters with special meaning in URL syntax, you need percent-encoding. Understanding exactly which characters need encoding, and where, will save you from subtle bugs in API integrations and form submissions.
Why URLs Need Encoding
The URL specification (RFC 3986) defines a set of unreserved characters that may appear anywhere in a URL without encoding: uppercase and lowercase letters, digits, and the four symbols - _ . ~. All other characters either have reserved meaning (like ? starting the query string, # starting the fragment, or / separating path segments) or are simply not allowed in URLs at all.
When you need to include a reserved or disallowed character as data (for example, a search query containing an & sign, or a filename containing a space), you must encode it so the URL parser does not misinterpret it.
How Percent-Encoding Works
Percent-encoding replaces each byte of the character with a % followed by its two-digit hexadecimal value. A space (ASCII 0x20) becomes %20. An ampersand (ASCII 0x26) becomes %26. Non-ASCII characters (like accented letters or emoji) are first encoded as UTF-8 bytes, and each byte is then percent-encoded.
Examples of common characters and their percent-encoded forms:
| Character | Encoded | Notes |
|---|---|---|
| (space) | %20 | Or + in form data |
| ! | %21 | |
| " | %22 | |
| # | %23 | Reserved: fragment start |
| & | %26 | Reserved: parameter separator |
| + | %2B | Has special meaning in form data |
| / | %2F | Reserved: path separator |
| : | %3A | Reserved: scheme separator |
| = | %3D | Reserved: key/value separator |
| ? | %3F | Reserved: query start |
| @ | %40 | Reserved: userinfo separator |
| café | caf%C3%A9 | UTF-8 bytes C3 A9, each percent-encoded |
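The byte-by-byte procedure described above can be sketched by hand. This is a minimal illustration, not a replacement for the built-in functions: percentEncode is a hypothetical helper name, and it assumes the strict RFC 3986 unreserved set, so it escapes a few characters that encodeURIComponent leaves alone.

```javascript
// Hypothetical helper (not a built-in): percent-encodes every byte that is
// not an RFC 3986 unreserved character.
const UNRESERVED = /[A-Za-z0-9\-._~]/;

function percentEncode(text) {
  // TextEncoder always produces UTF-8 bytes.
  const bytes = new TextEncoder().encode(text);
  let out = "";
  for (const byte of bytes) {
    const ch = String.fromCharCode(byte);
    if (byte < 0x80 && UNRESERVED.test(ch)) {
      out += ch; // unreserved ASCII passes through unchanged
    } else {
      // "%" followed by the byte as two uppercase hex digits
      out += "%" + byte.toString(16).toUpperCase().padStart(2, "0");
    }
  }
  return out;
}

console.log(percentEncode("a b&c"));     // "a%20b%26c"
console.log(percentEncode("caf\u00e9")); // "caf%C3%A9"
```

The second call shows the UTF-8 step: the single character é becomes the two bytes C3 A9, and each byte gets its own %XX sequence.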
encodeURI vs encodeURIComponent
JavaScript provides two built-in functions for encoding URLs, and confusing them is one of the most common URL-handling bugs.
encodeURI(url)
Encodes a complete URL. It does not encode characters that are legal and meaningful in a URL: / ? # & = : @ and so on. Use this when you have a complete URL that may contain non-ASCII characters but whose structure must be preserved.
encodeURI("https://example.com/search?q=hello world&lang=en")
// returns: "https://example.com/search?q=hello%20world&lang=en"
// space encoded to %20; ? & = / : left unchanged
encodeURIComponent(value)
Encodes a component of a URL: a query parameter value, a path segment, or a hash value. It encodes everything except letters, digits, and the characters - _ . ~ ! * ' ( ), so / ? # & = are all escaped. Use this when you are building a URL by concatenating user-supplied values.
const q = "rock & roll";
const url = "https://example.com/search?q=" + encodeURIComponent(q);
// url is now: "https://example.com/search?q=rock%20%26%20roll"
// & encoded to %26 so it's not parsed as a parameter separator
If you had used encodeURI on just the query value, the & would not be encoded and the URL parser would split your parameter at the wrong place.
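To see that failure concretely, the sketch below builds the same URL both ways and parses each result back with the standard URL API. The variable names are illustrative, and example.com is the placeholder host already used above.

```javascript
const value = "rock & roll";

// encodeURI leaves "&" alone, so the value leaks into the query structure.
const wrong = "https://example.com/search?q=" + encodeURI(value);
// encodeURIComponent escapes "&" as %26, keeping the value in one piece.
const right = "https://example.com/search?q=" + encodeURIComponent(value);

console.log(wrong); // "https://example.com/search?q=rock%20&%20roll"
console.log(right); // "https://example.com/search?q=rock%20%26%20roll"

// Parse both URLs to see what a receiver would read for "q".
const wrongQ = new URL(wrong).searchParams.get("q");
const rightQ = new URL(right).searchParams.get("q");

console.log(wrongQ); // "rock " (cut off at the stray &)
console.log(rightQ); // "rock & roll"
```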
application/x-www-form-urlencoded
HTML forms submitted with method="POST" and no explicit enctype use the application/x-www-form-urlencoded format. This format is similar to percent-encoding but with one crucial difference: spaces are encoded as +, not %20.
name=John+Doe&city=New+York&country=US
When you read this format on the server (or in JavaScript via URLSearchParams), the + is decoded back to a space. If you build a query string manually using encodeURIComponent and then embed it in a form body, the space will be %20 instead of +. Most servers decode both the same way, but %20 is not what a standard form serializer produces.
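The decoding side of this can be checked in a few lines. The sketch below assumes a hand-written body string that mixes both space conventions, and the field names are made up for illustration.

```javascript
// A form body that mixes "+" and "%20" for spaces.
const body = "name=John+Doe&city=New%20York";

// URLSearchParams applies form decoding: both "+" and "%20" become spaces.
const form = new URLSearchParams(body);
console.log(form.get("name")); // "John Doe"
console.log(form.get("city")); // "New York"

// decodeURIComponent does not know the form convention, so "+" survives.
const plain = decodeURIComponent("John+Doe");
console.log(plain); // "John+Doe"

// A literal plus sign has to travel as %2B to survive form decoding.
const math = new URLSearchParams({ expr: "1+1" });
console.log(math.toString());  // "expr=1%2B1"
console.log(math.get("expr")); // "1+1"
```

This is why decoding a form body with decodeURIComponent alone tends to leave stray plus signs in names and addresses.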
// Correct way to build a form-encoded body in JavaScript:
const params = new URLSearchParams({ name: "John Doe", city: "New York" });
params.toString();
// returns: "name=John+Doe&city=New+York"
When You Do Not Need to Encode
Not everything in a URL needs encoding. The unreserved characters (A-Z a-z 0-9 - _ . ~) never need encoding anywhere in a URL. Slug-style paths (/blog/my-first-post) are already safe. Only encode when your data contains characters that would otherwise break the URL structure or be misinterpreted.
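One way to check whether a value is already safe is to test it against the unreserved set. The helper name needsEncoding below is hypothetical, and the sketch assumes you are checking a single path segment or parameter value, not a whole URL.

```javascript
// Matches strings made up only of RFC 3986 unreserved characters.
const UNRESERVED_ONLY = /^[A-Za-z0-9\-._~]*$/;

// Hypothetical helper: true when the segment contains anything
// outside the unreserved set.
function needsEncoding(segment) {
  return !UNRESERVED_ONLY.test(segment);
}

console.log(needsEncoding("my-first-post")); // false
console.log(needsEncoding("my first post")); // true

// Encoding a safe slug is a no-op, so always encoding is harmless too.
console.log(encodeURIComponent("my-first-post") === "my-first-post"); // true
```

In practice it is usually simpler to run every untrusted value through encodeURIComponent than to branch on a check like this, since safe input comes back unchanged.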
Debugging Encoding Issues
Common symptoms of URL encoding problems:
- Parameter value is truncated: an unencoded & in the value is being parsed as a separator.
- 404 on path with special characters: a space or non-ASCII character in a path segment is not encoded.
- Double-encoded URLs: encoding an already-encoded URL causes % to become %25. Always decode before re-encoding.
- API returns 400 on emoji in query string: emoji are multi-byte UTF-8 and each byte must be percent-encoded.
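Two of these symptoms are easy to reproduce in a console. The snippet below is a small sketch with illustrative variable names; it shows double encoding, the decode-then-encode fix, and how one emoji expands into four encoded bytes.

```javascript
// Encoding twice turns the "%" of "%20" into "%25".
const once = encodeURIComponent("a b");
const twice = encodeURIComponent(once);
console.log(once);  // "a%20b"
console.log(twice); // "a%2520b"

// Decoding first brings the value back to a single level of encoding.
// Note: decodeURIComponent throws a URIError on malformed input such as "100%".
const repaired = encodeURIComponent(decodeURIComponent(once));
console.log(repaired); // "a%20b"

// U+1F600 is four bytes in UTF-8, so it becomes four %XX sequences.
const emoji = encodeURIComponent("\u{1F600}");
console.log(emoji); // "%F0%9F%98%80"
```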
Summary
- Percent-encoding replaces unsafe characters with %XX hex sequences.
- Use encodeURIComponent for individual query parameter values and path segments.
- Use encodeURI only for whole URLs where you want to preserve the URL structure.
- Form encoding uses + for spaces; standard percent-encoding uses %20.
- Never encode an already-encoded URL without decoding it first.