Why Your API is ‘Broken by Design’ Without JSON Schema
People might argue that JSON Schema is overkill or outdated. In this article, we'll see why that is not the case and why it is so often overlooked.
First, hear me out — You are already using a schema
Think about it: if your end users are API consumers, whether they’re external (think Rapid API) or internal teammates using your (micro?)service’s APIs, they’re counting on responses to look a certain way each time they hit an endpoint. They’re not expecting surprises; they’re expecting consistency. That consistency? It’s a schema. You’ve essentially been using an implicit schema all along; you just haven’t formalized it yet.
If you're still with me, let's discuss the benefits of having a JSON Schema.
- Validates Data: It catches issues (like invalid emails or negative ages) before they mess up your API. The best part is that most programming languages have validation libraries for JSON Schema.
- Documents Your API: JSON Schema is basically self-documenting, so other developers can see exactly what data they need to send. No guesswork!
- Keeps Your API Consistent: When every request and response follows the same structure, you avoid those frustrating “wait, what format was this field supposed to be?” moments.
- Saves Time: Many tools can even auto-generate code and documentation based on JSON Schema, so you can skip a lot of tedious setup.
Wait a minute! Do you know what JSON Schema is?
Before we dive deep into this article, let's understand what JSON Schema is. If you’re a developer and haven’t heard of JSON Schema, it's alright! Honestly, I was right there with you for the first few years of my coding life, blissfully unaware.
What’s JSON Schema, anyway?
I am sure you’re already thinking, “Oh great, yet another layer of type overload slapped onto my response data.” And honestly? You’re not entirely wrong! It can look like that at first glance. But let’s take a closer look and see it in action.
Imagine you’re setting up a User Registration form with just three fields:
- Name: Needs to be a string, anywhere from 2 to 50 characters long.
- Email: Well, it’s gotta be an email. Duh!
- Age: Only users 18 and up can register.
{
  "name": "John",
  "email": "[email protected]",
  "age": 25
}
Oh, and all three fields are required. Here’s how that schema would shape up:
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "User Registration",
  "type": "object",
  "properties": {
    "name": {
      "type": "string",
      "minLength": 2,
      "maxLength": 50
    },
    "email": {
      "type": "string",
      "format": "email"
    },
    "age": {
      "type": "integer",
      "minimum": 18
    }
  },
  "required": ["name", "email", "age"]
}
It probably looks scary at first. But it also makes sense, doesn't it?
Let's break it down and understand what each part is doing.
- $schema: This just tells you which version of JSON Schema you’re using. (More on versions later.)
- title: It’s like a label. Here, it’s “User Registration.” This doesn’t do anything technical, but it’s nice to have a name for your schema.
- type: Here’s where you define what kind of data you’re dealing with. Most APIs use object (since it’s structured data), but JSON Schema can also handle arrays, strings, numbers, and more.
- properties: This is where the magic happens. It’s where you lay down the rules for each field in your object.
- required: This is a list of fields that must be included, no exceptions.
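To make those keywords a bit more concrete, here is a minimal hand-rolled sketch in Python of what a validator does with the registration schema. It only understands the handful of keywords used above, the email check is a crude stand-in, and names like validate and the example address are made up for illustration. In a real project you would reach for a proper library instead. It is also worth knowing that many validators treat format as informational unless you switch format checking on.

```python
# Toy validator: supports only the keywords used in the registration schema.
# This is NOT a full JSON Schema implementation.

USER_REGISTRATION_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "minLength": 2, "maxLength": 50},
        "email": {"type": "string", "format": "email"},
        "age": {"type": "integer", "minimum": 18},
    },
    "required": ["name", "email", "age"],
}

# JSON Schema type names mapped to Python types (subset only).
JSON_TYPES = {"string": str, "integer": int}


def validate(payload, schema):
    """Return a list of error messages; an empty list means the payload passed."""
    if not isinstance(payload, dict):
        return ["payload: expected object"]
    # "required": every listed field must be present.
    errors = [
        f"{field}: required field is missing"
        for field in schema.get("required", [])
        if field not in payload
    ]
    # "properties": apply each field's rules only when the field is present.
    for field, rules in schema.get("properties", {}).items():
        if field not in payload:
            continue
        value = payload[field]
        expected = rules.get("type")
        # bool is a subclass of int in Python, so reject it explicitly.
        if expected and (
            not isinstance(value, JSON_TYPES[expected]) or isinstance(value, bool)
        ):
            errors.append(f"{field}: expected {expected}")
            continue
        if isinstance(value, str):
            if len(value) < rules.get("minLength", 0):
                errors.append(f"{field}: shorter than {rules['minLength']} characters")
            if "maxLength" in rules and len(value) > rules["maxLength"]:
                errors.append(f"{field}: longer than {rules['maxLength']} characters")
            # Crude approximation of "format": "email" (assumption, not the spec).
            if rules.get("format") == "email" and "@" not in value:
                errors.append(f"{field}: not an email address")
        if isinstance(value, int) and "minimum" in rules and value < rules["minimum"]:
            errors.append(f"{field}: below minimum {rules['minimum']}")
    return errors


good = {"name": "John", "email": "john@example.com", "age": 25}
bad = {"name": "J", "age": 16}

print(validate(good, USER_REGISTRATION_SCHEMA))  # no errors
print(validate(bad, USER_REGISTRATION_SCHEMA))   # missing email, short name, age under 18
```

The point is less the code itself and more the shape of the result: one schema in, a list of precise complaints out, with no hand-written if statements scattered across your handlers.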
Okay, wait! This isn’t going to be yet another “how to JSON Schema” tutorial. We’re not here for the same old spiel on what JSON Schema is and how to use it. There are already tons of blogs, videos, and tutorials out there for that. So, do your own research, and let’s skip the redundancy, shall we?
This could be your homework:
1. Explore the Constraints and Validation Rules
2. Explore the Advanced JSON Schema Features (oneOf, anyOf, and allOf)
3. Implement JSON Schema in your API Requests
4. How to Validate API Requests with JSON Schema (e.g., AJV for JavaScript, jsonschema for Python)
5. How to Integrate JSON Schema validation into CI/CD pipelines
6. Try reusing schemas across multiple API requests via $ref
Common Pitfalls and How to Avoid Them
Now you’ve got your types, properties, and validations all lined up, and your API is looking pretty solid. But at this point, it's very common to get carried away and make mistakes. That can land you or your users in a very frustrating situation. Let's cover some of the cases and see how to avoid them.
1. Overly Strict vs. Overly Lenient Schemas
It’s easy to get a little carried away with JSON Schema rules. One minute you’re setting up a few sensible validations, and the next you’re blocking every request that doesn’t fit your exact idea of perfect data. On the other side, if you go too lenient, you’ll end up with a chaotic API where pretty much anything goes. Finding the right balance is key.
So you may ask — How Strict Is Too Strict?
Imagine you have a schema for a user profile, and you’ve set it up to require a bunch of fields like firstName, lastName, phoneNumber, address, and bio. But then someone tries to create a new profile and, uh oh, they’re missing a bio. Should that really be a deal-breaker?
- The Problem with Being Too Strict: If you’re too rigid with required fields, you’ll end up blocking users over things they might not have or need. They’ll be frustrated, and your API will look unfriendly.
- The Problem with Being Too Lenient: If you go too easy, like making every field optional and adding no constraints, you’ll end up with junk data—empty fields, wrong formats, and all sorts of weird entries. Your database becomes a mess, and validation issues pop up later when you try to use that data.
Finding the Sweet Spot
Start by making only the truly necessary fields required (like userId and email for a user schema). Then, add sensible constraints, but don’t overdo it. Think of it this way: if a missing field doesn’t actually break your API, maybe it doesn’t need to be required.
Example:
{
  "type": "object",
  "properties": {
    "userId": { "type": "string" },
    "email": { "type": "string", "format": "email" },
    "phoneNumber": { "type": "string", "minLength": 10 },
    "bio": { "type": "string", "maxLength": 250 }
  },
  "required": ["userId", "email"]
}
Here, userId and email are required. phoneNumber and bio are nice-to-haves, but they won’t block the request if they’re missing.
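One thing that is easy to miss: optional does not mean unchecked. A nice-to-have field can be left out, but if it is sent, its constraints still apply. Here is a small Python sketch of that behaviour against the schema above. The check_profile helper is hypothetical and only looks at required, minLength, and maxLength, so treat it as an illustration rather than a real validator.

```python
USER_PROFILE_SCHEMA = {
    "type": "object",
    "properties": {
        "userId": {"type": "string"},
        "email": {"type": "string", "format": "email"},
        "phoneNumber": {"type": "string", "minLength": 10},
        "bio": {"type": "string", "maxLength": 250},
    },
    "required": ["userId", "email"],
}


def check_profile(payload, schema=USER_PROFILE_SCHEMA):
    """Return error messages for missing required fields and bad string lengths."""
    errors = [f"{field}: missing" for field in schema["required"] if field not in payload]
    for field, value in payload.items():
        rules = schema["properties"].get(field, {})
        if not isinstance(value, str):
            continue  # this sketch only checks the string length keywords
        if len(value) < rules.get("minLength", 0):
            errors.append(f"{field}: too short")
        if len(value) > rules.get("maxLength", float("inf")):
            errors.append(f"{field}: too long")
    return errors


# No bio, no phone number: still accepted.
print(check_profile({"userId": "u-1", "email": "a@example.com"}))

# Missing a required field: rejected.
print(check_profile({"userId": "u-1"}))

# Optional field present but too short: rejected as well.
print(check_profile({"userId": "u-1", "email": "a@example.com", "phoneNumber": "12345"}))
```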
2. Ensuring Backward Compatibility with Schema Updates
Ah, backward compatibility!!! It’s tempting to update your schema whenever you spot a better way to structure data, but changing it too often can cause big problems. If existing clients are relying on a particular structure, changing that structure will break them.
The Backward Compatibility Pitfall
Let’s say you decide to update your user schema by renaming phoneNumber to mobileNumber. Seems harmless enough, right? Wrong. Now, anyone using the old phoneNumber field is going to get errors, and suddenly your support inbox is flooded with “What happened to my API?!”
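A rename like this is exactly the kind of thing a quick automated check can catch before it ships. Below is a rough Python sketch that compares two versions of an object schema and flags removed fields, changed types, and newly required fields. The function and helper names are made up for this example, and the rules are a heuristic: what counts as breaking really depends on whether the schema describes a request or a response.

```python
def find_breaking_changes(old_schema, new_schema):
    """Compare two object schemas and list changes likely to break clients."""
    old_props = old_schema.get("properties", {})
    new_props = new_schema.get("properties", {})
    problems = []
    for name, old_rules in old_props.items():
        if name not in new_props:
            problems.append(f"removed field: {name}")
        elif old_rules.get("type") != new_props[name].get("type"):
            problems.append(f"type changed: {name}")
    # Fields that were optional (or absent) before but are mandatory now.
    newly_required = set(new_schema.get("required", [])) - set(old_schema.get("required", []))
    problems.extend(f"newly required: {name}" for name in sorted(newly_required))
    return problems


def user_schema(phone_fields, required=("userId", "email")):
    """Hypothetical helper that builds a user schema with the given phone fields."""
    props = {
        "userId": {"type": "string"},
        "email": {"type": "string", "format": "email"},
    }
    for name in phone_fields:
        props[name] = {"type": "string", "minLength": 10}
    return {"type": "object", "properties": props, "required": list(required)}


current = user_schema(["phoneNumber"])
renamed = user_schema(["mobileNumber"])                  # the risky rename
additive = user_schema(["phoneNumber", "mobileNumber"])  # old field kept around

print(find_breaking_changes(current, renamed))   # flags the removed phoneNumber
print(find_breaking_changes(current, additive))  # nothing flagged
```

Run something like this in your pipeline against the previously released schema and the rename shows up as a failed check instead of a flooded inbox.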
How to Avoid Breaking Changes
- Add New Fields, Don’t Replace: If you want to add a new field, go for it! But try not to rename or remove existing fields. If you must change a field name, consider adding the new field as optional and marking the old one as “deprecated” in your documentation. This way, you don’t break any existing clients.
- Version Your API: If you’re making a big change that will break existing setups, consider creating a new version of your API (v2), so people can choose to upgrade when they’re ready. If possible, use semantic versioning.
- Communicate Changes: If you have to make a change, be upfront with your users. Add it to your changelog, send an email, or put a notice in your docs.
Surprises are great for birthdays, not so much for API updates.
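The "add, don’t replace" advice can be sketched on the server side too. One possible approach, assuming you control the request handler, is to keep accepting the deprecated field for a while and quietly map it to the new name. The DEPRECATED_ALIASES table and normalize_payload function below are hypothetical names for illustration.

```python
# Hypothetical mapping of deprecated field names to their replacements.
DEPRECATED_ALIASES = {"phoneNumber": "mobileNumber"}


def normalize_payload(payload):
    """Return (normalized copy, deprecation notices) without mutating the input."""
    normalized = dict(payload)
    notices = []
    for old_name, new_name in DEPRECATED_ALIASES.items():
        if old_name in normalized:
            value = normalized.pop(old_name)
            # If a client sends both fields, the new one wins.
            normalized.setdefault(new_name, value)
            notices.append(f"'{old_name}' is deprecated; use '{new_name}'")
    return normalized, notices


old_client = {"userId": "u-1", "phoneNumber": "5550100000"}
new_client = {"userId": "u-2", "mobileNumber": "5550100001"}

print(normalize_payload(old_client))  # phoneNumber mapped to mobileNumber, one notice
print(normalize_payload(new_client))  # unchanged, no notices
```

The notices could go into a log or a response header so clients find out they are on borrowed time without anything breaking for them.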
There are different ways to handle breaking changes if you absolutely have to make one. You can do your own research, or let me know if I should write another article on that. (Who knows, maybe I'll do that anyway in the next 2-3 years?)
Final thoughts — Why You Should Give JSON Schema a Try
So, you made it to the end — what are you, a rare breed, a superfan, or just my soulmate? Because you actually read the whole thing! Kudos to you (and maybe to me for writing such a captivating piece).
But if you're with me so far, now’s a great time to try it out. Start small—maybe just a single schema for an API request or response—and see how it helps keep things organized. Once you see the benefits, it’s easy to expand, adding schemas for more endpoints and building up a well-structured API.
JSON Schema might seem like extra work at first, but the payoff is huge. You’ll end up with a cleaner, more predictable API that’s easier to maintain, and anyone who uses your API (including you) will appreciate the clear structure and validation. It’s one of those tools that gives you more control with less effort—and who doesn’t want that?
So, let me know if this was worth the read, if you learned a thing or two, or if you’re diving in and implementing a JSON Schema. I’d love to hear your thoughts — and see if I’ve converted you to the schema side.
Next steps
Okay well, JSON Schema is not the only way to organise your APIs. There are various other methods/tools you can use. For now, do your own research. Or just wait for my next blog (I hope it comes out soon)!
Here are some of your research topics:
- Protobuf and gRPC: Protocol Buffers (Protobuf), developed by Google, allows developers to define strict data contracts with a language-neutral format that is efficient and compact. Paired with gRPC, this approach enables fast communication and data validation across services.
- TypeScript with io-ts or zod: These libraries let you define a validation schema once, check data against it at runtime, and derive the static TypeScript type from that same definition. This avoids duplication, as one definition serves both type checking and validation.
- Yup: It provides a clean API for building validation schemas, especially useful for complex, nested objects. It’s highly compatible with front-end frameworks like React and integrates well with form libraries like Formik.
- GraphQL: A GraphQL API is described by its own strongly typed schema, and because clients ask only for the fields they need, that schema can evolve with minimal versioning, offering a more fluid experience for both backend and frontend developers.