URL Encoding: What It Is and Why It Matters

You're building a search feature. Everything works great until a user searches for "C++ algorithms" and the API returns a 400. You log the outgoing URL and see: https://api.example.com/search?q=C++ algorithms. The server reads each + as a space, and the raw space makes the whole query string malformed. You've just met URL encoding the hard way.

Every developer who works with web APIs, form submissions, or URL construction eventually runs into this. The rules aren't complicated, but there are enough edge cases that it's worth understanding properly rather than patching things one character at a time.

Why URLs have character restrictions

A URL isn't just a string of text. It's a structured syntax where specific characters mean specific things. The ? separates the path from the query string. The & separates query parameters. The # marks the fragment. The : ends the scheme and the // introduces the host. The / separates path segments.

If any of these characters appears in your data — say, a user's search query containing & or a password containing # — the URL parser will misread it as a structural character and you'll get unexpected behavior or a broken request.

RFC 3986 (the generic URI syntax standard) defines a small set of "unreserved" characters that can appear anywhere in a URL without any special meaning: A–Z, a–z, 0–9, -, _, ., ~. Everything else either has reserved structural meaning or is not allowed in its raw form at all.

The solution is percent-encoding: any character that can't be used safely gets replaced with % followed by its hexadecimal byte value.
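To see the misreading concretely, here is a small sketch using Python's standard urllib.parse module. The search value salt&pepper#1 is an invented example, and the host is the same placeholder used above.

```python
from urllib.parse import parse_qsl, quote, urlsplit

value = "salt&pepper#1"  # hypothetical user input containing reserved characters

# Unencoded: "&" starts a new parameter and "#" starts the fragment.
raw = urlsplit(f"https://api.example.com/search?q={value}")
raw_params = parse_qsl(raw.query, keep_blank_values=True)
print(raw_params)    # [('q', 'salt'), ('pepper', '')]
print(raw.fragment)  # 1

# Percent-encoded: the whole value survives as a single parameter.
encoded = urlsplit(f"https://api.example.com/search?q={quote(value, safe='')}")
encoded_params = parse_qsl(encoded.query)
print(encoded_params)  # [('q', 'salt&pepper#1')]
```

In the unencoded case the server would only ever see q=salt; the rest of the value is split off into a stray parameter and a fragment.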

How percent-encoding works

The mechanism is straightforward. Take the character's UTF-8 byte representation, write each byte as two hex digits, prefix each with %. That's it.

Some common encodings worth memorizing:

Space → %20 (or + in form data, which is a different context)
& → %26
= → %3D
+ → %2B
# → %23
/ → %2F
@ → %40

Non-ASCII characters work the same way but require more bytes. The Chinese character 你 is three bytes in UTF-8 (E4 BD A0), so it becomes %E4%BD%A0. Emoji can be four bytes. JavaScript's encodeURIComponent handles all of this automatically, which is why you should almost always reach for that rather than rolling your own encoding.
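The mechanism is small enough to write out by hand. The sketch below is for illustration only: percent_encode is a hypothetical helper name, and in real code the library function (here Python's urllib.parse.quote) is the better choice.

```python
from urllib.parse import quote

# The RFC 3986 unreserved set: letters, digits, and - _ . ~
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-_.~"
)

def percent_encode(text):
    """Illustrative only: encode every byte that is not an unreserved character."""
    out = []
    for byte in text.encode("utf-8"):      # step 1: UTF-8 bytes
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)                 # safe characters pass through
        else:
            out.append(f"%{byte:02X}")     # steps 2 and 3: two hex digits, prefixed with %
    return "".join(out)

print(percent_encode("C++ algorithms"))  # C%2B%2B%20algorithms
print(percent_encode("你"))               # %E4%BD%A0

# The standard library produces the same result when nothing is marked safe.
print(percent_encode("a b&c/d") == quote("a b&c/d", safe=""))  # True
```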

The two functions you need in JavaScript

This is where many developers go wrong: they use encodeURI where they need encodeURIComponent, or vice versa.

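encodeURIComponent encodes everything except the unreserved characters and a handful of legacy exceptions: ! * ' ( ). Use this on individual query parameter values and path segments, the actual data you're encoding.

encodeURI is designed for encoding a complete URL. It won't encode ://, /, ?, &, =, or # because those are structural parts of a URL. If you pass a full URL to encodeURIComponent, it'll destroy the structure. The reverse mistake is just as harmful: encodeURI on a single value leaves &, =, +, and # untouched, so the data can still break the query string.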

// Right: encode individual values
const query = encodeURIComponent("C++ algorithms");  // "C%2B%2B%20algorithms"
const url = `https://api.example.com/search?q=${query}`;

// Wrong: this destroys the URL structure
encodeURIComponent("https://api.example.com/search?q=foo");
// → "https%3A%2F%2Fapi.example.com%2Fsearch%3Fq%3Dfoo" — broken

// decoding
decodeURIComponent("C%2B%2B%20algorithms");  // "C++ algorithms"

When building query strings with multiple parameters, use URLSearchParams — it handles encoding automatically and is available in all modern browsers and Node.js:

const params = new URLSearchParams({
  q: "C++ algorithms",
  lang: "en",
  page: 1
});
const queryString = params.toString();
// "q=C%2B%2B+algorithms&lang=en&page=1"
// Note: URLSearchParams uses + for spaces, not %20 (application/x-www-form-urlencoded format)

In Python

from urllib.parse import quote, urlencode, quote_plus

# Encode a single path segment or query value
# (safe="" also encodes "/", which quote leaves alone by default)
quote("C++ algorithms", safe="")   # 'C%2B%2B%20algorithms'

# Form-encoding (spaces as +)
quote_plus("C++ algorithms")   # 'C%2B%2B+algorithms'

# Build a query string
urlencode({"q": "C++ algorithms", "lang": "en"})
# 'q=C%2B%2B+algorithms&lang=en'

The + vs %20 distinction

This trips people up constantly. In the application/x-www-form-urlencoded format (what HTML forms use, and what URLSearchParams produces), spaces are encoded as +. In standard URL encoding (path segments, modern APIs), spaces are %20.

Most servers handle both when decoding query parameters, but not all. If you're writing an API client and the server seems to be receiving literal + signs instead of spaces, switch from + to %20. If spaces are turning into + unexpectedly, check whether your HTTP library is using form-encoding.
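The difference shows up on both the encoding and the decoding side. A short sketch with Python's urllib.parse, reusing the query from earlier:

```python
from urllib.parse import quote, unquote, unquote_plus, urlencode

# Encoding: urlencode defaults to form style (+); quote_via=quote switches to %20.
form_style = urlencode({"q": "C++ algorithms"})
percent_style = urlencode({"q": "C++ algorithms"}, quote_via=quote)
print(form_style)     # q=C%2B%2B+algorithms
print(percent_style)  # q=C%2B%2B%20algorithms

# Decoding: only the form-aware decoder turns + back into a space.
form_decoded = unquote_plus("C%2B%2B+algorithms")
plain_decoded = unquote("C%2B%2B+algorithms")
print(form_decoded)   # C++ algorithms
print(plain_decoded)  # C+++algorithms
```

The last line is what a server sees when it decodes a form-encoded value with a plain percent-decoder: a literal + where the space should be.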

Encoding passwords and credentials in connection strings

This one genuinely causes production incidents. A database connection string looks like postgresql://user:password@host/db. If the password contains @, #, /, or other reserved characters, the URL parser will misread the structure entirely.

# Password is "p@ss/word"
# WRONG: the parser reads "p" as the password and "ss" as the host
postgresql://user:p@ss/word@localhost/mydb

# RIGHT: encode the special characters
postgresql://user:p%40ss%2Fword@localhost/mydb

The @ encodes to %40, / encodes to %2F. If your app is failing to connect to a database in production but works fine in development, this is a likely culprit — especially if someone recently rotated credentials and the new password happens to contain a special character.
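One way to avoid this is to encode the password at the point where the connection string is assembled. The sketch below uses Python's urllib.parse to show both the failure and the fix. The user, host, and database names are the placeholders from the example above, and a real database driver may parse the string with its own rules.

```python
from urllib.parse import quote, unquote, urlsplit

password = "p@ss/word"  # example password containing reserved characters

# Unencoded: the authority ends at the first "/", so the parser sees host "ss".
broken = urlsplit(f"postgresql://user:{password}@localhost/mydb")
print(broken.hostname)  # ss
print(broken.password)  # p

# Encoded: safe="" makes quote escape "/" as well as "@".
dsn = f"postgresql://user:{quote(password, safe='')}@localhost/mydb"
print(dsn)              # postgresql://user:p%40ss%2Fword@localhost/mydb

fixed = urlsplit(dsn)
print(fixed.hostname)           # localhost
print(unquote(fixed.password))  # p@ss/word
```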

Double-encoding: the other common mistake

If you encode a string that's already encoded, you break it. %20 encoded again becomes %2520 — because % itself gets encoded to %25. The server then decodes %2520 back to %20 (a literal percent sign followed by "20"), not to a space.

This happens most often when you're constructing a URL and one layer of your code does the encoding while another layer also tries to encode, or when you encode a value before storing it in a database and then encode it again when retrieving it for use in a URL.

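The rule: encode as late as possible, decode as early as possible, and never double-encode. If you're unsure whether a string is already encoded, decoding it first and then encoding it once is a workable fallback, but only when raw values can never contain a literal % followed by two hex digits; otherwise the decode step silently changes the data. The more reliable fix is to decide which layer owns encoding and keep values raw everywhere else.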
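A quick sketch of what double-encoding looks like in practice, again with Python's urllib.parse and the query from the opening example:

```python
from urllib.parse import quote, unquote

raw = "C++ algorithms"

once = quote(raw, safe="")    # encoded exactly once
twice = quote(once, safe="")  # a second layer encodes the % signs themselves
print(once)   # C%2B%2B%20algorithms
print(twice)  # C%252B%252B%2520algorithms

# A server decodes once, so it ends up with the encoded text rather than the raw value.
server_sees = unquote(twice)
print(server_sees)         # C%2B%2B%20algorithms
print(server_sees == raw)  # False
```

Keeping the raw value in variables and storage, and calling the encoder only where the URL string is built, makes it hard for a second layer to sneak in.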

Framework helpers usually have this covered

The good news: if you're using a modern HTTP client or web framework, it probably handles encoding correctly by default. fetch with URLSearchParams, Python's requests library with a params dict, Rails URL helpers — they all encode values properly. The cases where you need to think about it directly are when you're constructing URLs manually as strings, working with legacy code, or dealing with a non-standard API that has unusual encoding requirements.
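When you do have to assemble a URL by hand, a small helper that encodes each piece at the last moment keeps the rules in one place. The following is a minimal sketch using only Python's standard library; build_url, its parameters, and the example path are hypothetical and not part of any framework.

```python
from urllib.parse import quote, urlencode, urlsplit, urlunsplit

def build_url(base, segments, params):
    """Hypothetical helper: append raw path segments and query params to a base URL."""
    scheme, netloc, path, _, _ = urlsplit(base)
    # Encode each segment on its own so a "/" inside the data cannot add a path level.
    encoded = [quote(segment, safe="") for segment in segments]
    full_path = "/".join([path.rstrip("/")] + encoded)
    # urlencode produces form-style output (spaces become +).
    query = urlencode(params)
    return urlunsplit((scheme, netloc, full_path, query, ""))

url = build_url(
    "https://api.example.com/v1",
    ["files", "a/b c.txt"],
    {"q": "C++ algorithms"},
)
print(url)
# https://api.example.com/v1/files/a%2Fb%20c.txt?q=C%2B%2B+algorithms
```

Callers pass raw, unencoded values only, which keeps the helper the single place where encoding happens.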

Understanding the underlying mechanics means you can debug these situations quickly when they come up, rather than spending an hour binary-searching through a request builder wondering why the API keeps returning a 400.
