BytePane

JSON Schema Validation: How to Validate JSON Data Structures

The Production Bug That Started This

A payment service at a mid-size fintech was accepting webhook payloads from a third-party processor. The contract said amount would always be a number. One day, the processor shipped an update that serialized amounts as strings in some edge cases ("amount": "150.00"). The payment service accepted the payload, silently coerced the string in one code path, and silently failed in another. Untangling the result took three days of reconciliation work. An explicit JSON Schema validator at the ingestion point would have rejected the malformed payload within milliseconds, with a clear error message.

Key Takeaways

  • AJV has roughly 145M weekly npm downloads and is the de facto Node.js standard for JSON Schema validation; it compiles each schema into an optimized JavaScript function
  • JSON Schema draft 2020-12 replaced items/additionalItems with prefixItems/items for tuple validation
  • Never enable AJV's coerceTypes on API inputs: it silently converts "150" to 150 and hides contract violations
  • Zod for TypeScript-first projects: one schema gives you both runtime validation AND compile-time types — zero duplication
  • Always use additionalProperties: false for request validation, never for response validation (breaks forward-compatibility)

What JSON Schema Is

JSON Schema is a vocabulary for describing the structure of JSON data. A schema is itself a JSON document that defines what valid instances must look like: which types are allowed, which properties are required, what value ranges are acceptable, what string patterns must match. The specification is maintained by the JSON Schema organization and the current stable draft is 2020-12, published December 2020.

JSON Schema powers more infrastructure than most developers realize. The OpenAPI 3.x specification uses JSON Schema for request/response body definitions. VS Code's IntelliSense for JSON files (settings.json, package.json, tsconfig.json) is driven by JSON Schema definitions from the SchemaStore catalog, which hosts over 500 schemas. Kubernetes manifest validation uses JSON Schema. JSON Forms generates UI components from JSON Schema automatically.

The simplest valid schema is {} — an empty object that accepts any valid JSON value. A schema of true also accepts everything; false rejects everything. This boolean schema form is used heavily in additionalProperties and unevaluatedProperties.

Core Keywords: The Building Blocks

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://api.example.com/schemas/user.json",
  "title": "User",
  "description": "A registered user account",
  "type": "object",

  "properties": {
    "id": {
      "type": "string",
      "format": "uuid",           // Validated with ajv-formats
      "description": "UUID v4 user identifier"
    },
    "email": {
      "type": "string",
      "format": "email",
      "maxLength": 254            // RFC 5321 maximum email length
    },
    "age": {
      "type": "integer",
      "minimum": 13,              // inclusive minimum
      "maximum": 120
    },
    "role": {
      "type": "string",
      "enum": ["admin", "editor", "viewer"]  // enumeration
    },
    "tags": {
      "type": "array",
      "items": { "type": "string" },
      "minItems": 1,
      "maxItems": 20,
      "uniqueItems": true         // no duplicate tags
    },
    "metadata": {
      "type": "object",
      "additionalProperties": { "type": "string" }  // any string-valued keys
    },
    "createdAt": {
      "type": "string",
      "format": "date-time"       // RFC 3339: "2026-04-19T10:30:00Z"
    }
  },

  "required": ["id", "email", "role", "createdAt"],
  "additionalProperties": false   // reject unknown fields
}

The required keyword and additionalProperties behavior are two of the most misunderstood aspects of JSON Schema. required is a sibling of properties — not inside it. A property listed in properties but not in required is optional. A property in required but not in properties is required but can be any type.
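
To make the split between required and properties concrete, here is a minimal hand-rolled sketch. It is not AJV and it models only the type, properties, and required keywords; the MiniSchema type, the check function, and the field names are all hypothetical, chosen for illustration.

```typescript
// Hypothetical mini-checker covering only "type", "properties", and "required".
// It illustrates the semantics; use a real validator in production.
type PrimitiveType = 'string' | 'number' | 'boolean'

interface MiniSchema {
  properties?: Record<string, { type: PrimitiveType }>
  required?: string[]
}

function check(schema: MiniSchema, data: Record<string, unknown>): string[] {
  const errors: string[] = []
  // "required" only asserts presence; it says nothing about type.
  for (const key of schema.required ?? []) {
    if (!(key in data)) errors.push(`missing required property "${key}"`)
  }
  // "properties" only constrains keys that are actually present.
  for (const [key, rule] of Object.entries(schema.properties ?? {})) {
    if (key in data && typeof data[key] !== rule.type) {
      errors.push(`"${key}" must be ${rule.type}`)
    }
  }
  return errors
}

const schema: MiniSchema = {
  properties: { email: { type: 'string' }, nickname: { type: 'string' } },
  required: ['email', 'externalId'], // externalId has no entry in properties
}

// nickname is optional, and externalId may be any type: no errors.
const optionalOmitted = check(schema, { email: 'a@example.com', externalId: 42 })
// externalId is required even though properties never mentions it.
const requiredOmitted = check(schema, { email: 'a@example.com', nickname: 'al' })
// nickname is optional, but when present it must match its declared type.
const wrongType = check(schema, { email: 'a@example.com', externalId: 'x', nickname: 7 })
```

In this sketch the first call returns no errors, the second reports the missing externalId, and the third reports only the nickname type mismatch.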

AJV: The Production Standard

AJV (Another JSON Schema Validator) is the dominant Node.js implementation with 145 million weekly npm downloads as of 2025. It compiles schemas to optimized JavaScript functions at startup — validation at runtime is just a function call, not tree traversal. This compilation approach puts AJV at 3–5 million validations per second for typical schemas.

import Ajv from 'ajv/dist/2020'  // 2020-12 build; the default 'ajv' export only bundles draft-07
import addFormats from 'ajv-formats'
import addErrors from 'ajv-errors'

// Initialize once at application startup — compilation is the expensive step
const ajv = new Ajv({
  allErrors: true,       // Collect ALL errors, not just the first
  // coerceTypes: false  // NEVER enable for API input — hides contract violations
  // useDefaults: true   // Optional: fill in default values during validation
})
addFormats(ajv)          // Enables: date-time, email, uri, uuid, ipv4, etc.
addErrors(ajv)           // Enables custom error messages per keyword

const userSchema = {
  $schema: 'https://json-schema.org/draft/2020-12/schema',
  type: 'object',
  properties: {
    id:    { type: 'string', format: 'uuid' },
    email: { type: 'string', format: 'email' },
    role:  { type: 'string', enum: ['admin', 'editor', 'viewer'] },
    age:   { type: 'integer', minimum: 13, maximum: 120 },
  },
  required: ['id', 'email', 'role'],
  additionalProperties: false,
}

// Compile once
const validate = ajv.compile(userSchema)

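// Shape the schema describes (kept in sync by hand; AJV does not infer it)
interface User {
  id: string
  email: string
  role: 'admin' | 'editor' | 'viewer'
  age?: number
}

// Validate at runtime: this is just a call to the compiled function
function validateUser(data: unknown): asserts data is User {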
  if (!validate(data)) {
    const errors = validate.errors!
      .map(e => `${e.instancePath} ${e.message}`)
      .join('; ')
    throw new Error(`Invalid user: ${errors}`)
  }
}

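// Usage in an Express route (assumes `app` is an Express app with express.json() enabled):
app.post('/users', (req, res) => {
  try {
    validateUser(req.body)
  } catch (err) {
    const message = err instanceof Error ? err.message : 'Invalid request body'
    return res.status(422).json({ error: message, errors: validate.errors })
  }
  // req.body is now typed as User
  createUser(req.body)   // createUser stands in for your own persistence function
  res.status(201).end()
})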

One critical AJV configuration decision is coerceTypes. When enabled, AJV silently converts "42" to 42, "true" to true, and so on. This is the same kind of silent coercion that caused the production bug in the opening of this article. Never enable coerceTypes for API input validation. It is appropriate only for inputs that always arrive as strings, such as query string parameters.
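
The difference between a coercing check and a strict one can be sketched in a few lines of plain TypeScript. This is a hand-rolled illustration, not AJV's implementation; coerceAmount and strictAmount are hypothetical helpers.

```typescript
// Hand-rolled illustration (not AJV's implementation) of coercing vs strict checks.
function coerceAmount(value: unknown): number {
  // Mimics type coercion: the string "150.00" quietly becomes 150.
  const n = Number(value)
  if (Number.isNaN(n)) throw new TypeError('amount is not numeric')
  return n
}

function strictAmount(value: unknown): number {
  // Mirrors { "type": "number" } with no coercion: strings are rejected.
  if (typeof value !== 'number' || !Number.isFinite(value)) {
    throw new TypeError(`amount must be a number, got ${typeof value}`)
  }
  return value
}

const payload: unknown = JSON.parse('{"amount": "150.00"}')
const amount = (payload as { amount: unknown }).amount

// The coercing path returns 150, so the contract violation goes unnoticed.
const coerced = coerceAmount(amount)

// The strict path surfaces the violation at the boundary.
let strictError = ''
try {
  strictAmount(amount)
} catch (err) {
  strictError = err instanceof Error ? err.message : String(err)
}
// strictError === 'amount must be a number, got string'
```

The strict variant is the behavior you get from a validator with coercion left off: the upstream change becomes a loud rejection instead of a quiet conversion.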

Draft 2020-12: What Changed from Older Drafts

If you are reading older JSON Schema tutorials, be aware of the breaking changes introduced across drafts 2019-09 and 2020-12. The most common migration pitfall is tuple validation:

// ❌ Draft 07 tuple validation (deprecated)
{
  "type": "array",
  "items": [                    // positional items = tuple
    { "type": "string" },       // index 0: string
    { "type": "integer" }       // index 1: integer
  ],
  "additionalItems": false      // no extra elements allowed
}

// ✅ Draft 2020-12 tuple validation
{
  "type": "array",
  "prefixItems": [              // renamed: items → prefixItems for tuples
    { "type": "string" },
    { "type": "integer" }
  ],
  "items": false                // renamed: additionalItems → items for extra
}

// Other changes since draft-07:
// $defs replaces definitions (2019-09; the old name still resolves through $ref)
// $dynamicRef / $dynamicAnchor replace $recursiveRef / $recursiveAnchor (2020-12)
// unevaluatedProperties / unevaluatedItems keywords added (2019-09)
// format is an annotation by default; validating it is opt-in (2019-09)

// AJV draft 2020-12 setup:
import Ajv2020 from 'ajv/dist/2020'
const ajv = new Ajv2020({ allErrors: true })
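
The prefixItems/items split shown above can be read as "positional checks first, then one rule for everything after". A minimal sketch of that reading follows; TupleSchema and validateTuple are hypothetical names, and the checker models only these two keywords.

```typescript
// Hypothetical mini-checker for the 2020-12 tuple keywords only.
type Check = (value: unknown) => boolean

interface TupleSchema {
  prefixItems: Check[]    // positional checks, one per leading index
  items: boolean | Check  // rule for every element after the prefix
}

function validateTuple(schema: TupleSchema, data: unknown[]): boolean {
  return data.every((value, index) => {
    const positional = schema.prefixItems[index]
    if (positional !== undefined) return positional(value)
    // Past the prefix: "items": false rejects, "items": true accepts.
    return typeof schema.items === 'boolean' ? schema.items : schema.items(value)
  })
}

const isString: Check = (v) => typeof v === 'string'
const isInteger: Check = (v) => typeof v === 'number' && Number.isInteger(v)

const closedPair: TupleSchema = { prefixItems: [isString, isInteger], items: false }

const exactPair = validateTuple(closedPair, ['alice', 30])           // true
const extraElement = validateTuple(closedPair, ['alice', 30, 'x'])   // false
const shorterArray = validateTuple(closedPair, ['alice'])            // true
```

Note the last case: prefixItems does not require the positions to be present. If the tuple must have exactly two elements, add minItems: 2 to the real schema.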

The unevaluatedProperties keyword added in 2019-09 is a more powerful version of additionalProperties that works correctly with allOf/oneOf/anyOf composition. When composing schemas with additionalProperties: false, properties evaluated in sub-schemas are invisible to the parent's additionalProperties, causing false rejections. Use unevaluatedProperties: false for correct closed-object validation across composed schemas.
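
The false-rejection problem is easier to see in a toy model. The sketch below is hypothetical and models only properties (as a list of names), allOf, and the two closing keywords; it is not how a real validator is structured, but the visibility rule it demonstrates is the one described above.

```typescript
// Hypothetical sketch: models only "properties", "allOf", and the two closing keywords.
interface ComposedSchema {
  properties?: string[]  // declared property names (types omitted for brevity)
  allOf?: ComposedSchema[]
  additionalProperties?: false
  unevaluatedProperties?: false
}

// Names evaluated by this schema and, recursively, by its allOf branches.
function evaluatedNames(schema: ComposedSchema): Set<string> {
  const names = new Set<string>(schema.properties ?? [])
  for (const branch of schema.allOf ?? []) {
    evaluatedNames(branch).forEach((n) => names.add(n))
  }
  return names
}

function rejectedKeys(schema: ComposedSchema, data: Record<string, unknown>): string[] {
  const keys = Object.keys(data)
  if (schema.additionalProperties === false) {
    // Sees only this schema object's own "properties".
    const own = new Set<string>(schema.properties ?? [])
    return keys.filter((k) => !own.has(k))
  }
  if (schema.unevaluatedProperties === false) {
    // Also sees what the allOf branches evaluated.
    const seen = evaluatedNames(schema)
    return keys.filter((k) => !seen.has(k))
  }
  return []
}

const baseEntity: ComposedSchema = { properties: ['id'] }
const userFields: ComposedSchema = { properties: ['email'] }
const user = { id: 'u1', email: 'a@example.com' }

// Rejects both keys: the parent has no "properties" of its own.
const withAdditional = rejectedKeys(
  { allOf: [baseEntity, userFields], additionalProperties: false }, user)

// Accepts the valid object, and still rejects a genuinely unknown key.
const withUnevaluated = rejectedKeys(
  { allOf: [baseEntity, userFields], unevaluatedProperties: false }, user)
const unknownField = rejectedKeys(
  { allOf: [baseEntity, userFields], unevaluatedProperties: false }, { ...user, debug: true })
```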

Schema Composition: $ref, $defs, allOf, anyOf, oneOf

Production schemas are not monolithic. They compose reusable type definitions using $defs and $ref:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$defs": {
    "Address": {
      "type": "object",
      "properties": {
        "street":  { "type": "string" },
        "city":    { "type": "string" },
        "country": { "type": "string", "pattern": "^[A-Z]{2}$" }
      },
      "required": ["street", "city", "country"]
    },
    "Money": {
      "type": "object",
      "properties": {
        "amount":   { "type": "number", "minimum": 0 },
        "currency": { "type": "string", "pattern": "^[A-Z]{3}$" }
      },
      "required": ["amount", "currency"]
    }
  },

  "type": "object",
  "properties": {
    "billingAddress":  { "$ref": "#/$defs/Address" },
    "shippingAddress": { "$ref": "#/$defs/Address" },
    "orderTotal":      { "$ref": "#/$defs/Money" }
  },
  "required": ["billingAddress", "orderTotal"]
}

// anyOf: valid if at least one schema matches (overlapping union)
{
  "anyOf": [
    { "type": "string", "format": "email" },
    { "type": "string", "format": "uri" }
  ]
}
// Accepts: email addresses OR URLs (including both at once if valid as both)

// oneOf: valid if EXACTLY one schema matches (exclusive union)
{
  "oneOf": [
    { "type": "object", "properties": { "type": { "const": "card" } }, "required": ["type", "cardNumber"] },
    { "type": "object", "properties": { "type": { "const": "bank" } }, "required": ["type", "accountNumber"] }
  ]
}
// Discriminated union — card OR bank payment, never both

// allOf: all schemas must match (useful for extension/mixin)
{
  "allOf": [
    { "$ref": "#/$defs/BaseEntity" },    // has id, createdAt, updatedAt
    { "$ref": "#/$defs/UserFields" }     // has email, role
  ],
  "unevaluatedProperties": false         // closed: no unknown fields
}
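
One way to keep the three combinators straight is to think of them as counting how many branches match. The sketch below uses plain predicate functions in place of sub-schemas; the helper names are hypothetical.

```typescript
// Sub-schemas are modeled as predicates; the combinators only count matches.
type Branch = (value: unknown) => boolean

const countMatches = (branches: Branch[], value: unknown): number =>
  branches.filter((branch) => branch(value)).length

const anyOf = (branches: Branch[]) => (v: unknown) => countMatches(branches, v) >= 1
const oneOf = (branches: Branch[]) => (v: unknown) => countMatches(branches, v) === 1
const allOf = (branches: Branch[]) => (v: unknown) =>
  countMatches(branches, v) === branches.length

const isNumber: Branch = (v) => typeof v === 'number'
const isInteger: Branch = (v) => typeof v === 'number' && Number.isInteger(v)

// 7 is both a number and an integer, so the two branches overlap.
const anyOfOverlap = anyOf([isNumber, isInteger])(7)    // true
const oneOfOverlap = oneOf([isNumber, isInteger])(7)    // false: two branches matched
const oneOfFloat = oneOf([isNumber, isInteger])(7.5)    // true: only isNumber matched
const allOfFloat = allOf([isNumber, isInteger])(7.5)    // false: isInteger failed
```

The oneOfOverlap case is the usual surprise: oneOf fails when branches overlap, which is why discriminated unions pin each branch to a distinct const value.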

AJV vs Zod: Which Should You Use?

This is not a pure performance question — it is a workflow question. The right answer depends on whether you are writing TypeScript, and whether you need language-agnostic schema sharing.

| Property | AJV | Zod | Joi |
|---|---|---|---|
| Standard | JSON Schema (all drafts) | Proprietary (exports JSON Schema) | Proprietary |
| TypeScript inference | Partial (manual type guards) | First-class (z.infer<T>) | Poor |
| Performance | ~3–5M ops/sec (compiled) | ~300K–500K ops/sec | ~200K–400K ops/sec |
| Weekly downloads | ~145M (npm) | ~30M (npm, fast growing) | ~17M (npm) |
| Cross-language | Yes (share schema as JSON) | Via zod-to-json-schema | JS only |
| Error messages | Technical (path + message) | Human-friendly, customizable | Human-friendly |
| Best for | High-throughput APIs, OpenAPI | TypeScript-first apps, tRPC, Remix | Express.js, simple validation |

// Zod: one definition gives you runtime validation + TypeScript type
import { z } from 'zod'

const UserSchema = z.object({
  id:        z.string().uuid(),
  email:     z.string().email().max(254),
  role:      z.enum(['admin', 'editor', 'viewer']),
  age:       z.number().int().min(13).max(120).optional(),
  createdAt: z.string().datetime(),
})

// TypeScript type inferred automatically — no duplication:
type User = z.infer<typeof UserSchema>
// { id: string; email: string; role: 'admin' | 'editor' | 'viewer'; age?: number; createdAt: string }

// Parse (throws on invalid) or safeParse (returns result object):
const result = UserSchema.safeParse(req.body)
if (!result.success) {
  return res.status(422).json({ errors: result.error.flatten() })
}
const user = result.data  // fully typed as User

// Export as JSON Schema for OpenAPI or cross-service sharing:
import { zodToJsonSchema } from 'zod-to-json-schema'
const jsonSchema = zodToJsonSchema(UserSchema, 'User')
// → a JSON Schema document (draft-07 by default; pass a `target` option for other dialects)

Validation Patterns for Common Scenarios

Discriminated Unions (Polymorphic Payloads)

// Payment event: card, bank transfer, or crypto wallet
const PaymentEventSchema = {
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "oneOf": [
    {
      "type": "object",
      "properties": {
        "type":           { "const": "card" },
        "last4":          { "type": "string", "pattern": "^[0-9]{4}$" },
        "brand":          { "type": "string", "enum": ["visa", "mastercard", "amex"] }
      },
      "required": ["type", "last4", "brand"],
      "additionalProperties": false
    },
    {
      "type": "object",
      "properties": {
        "type":          { "const": "bank_transfer" },
        "bankName":      { "type": "string" },
        "accountLast4":  { "type": "string", "pattern": "^[0-9]{4}$" }
      },
      "required": ["type", "bankName", "accountLast4"],
      "additionalProperties": false
    }
  ]
}

// AJV handles oneOf correctly: it validates each branch and requires exactly one match
// Default error message: "must match exactly one schema in oneOf"
// For better errors, enable AJV's discriminator keyword (borrowed from OpenAPI) and add
// "discriminator": { "propertyName": "type" } next to "oneOf" in the schema:
const ajv = new Ajv({ discriminator: true })

Conditional Validation (if/then/else)

// Shipping address only required if order type is "physical"
{
  "type": "object",
  "properties": {
    "orderType": { "type": "string", "enum": ["digital", "physical"] },
    "shippingAddress": { "$ref": "#/$defs/Address" }
  },
  "required": ["orderType"],
  "if": {
    "properties": { "orderType": { "const": "physical" } }
  },
  "then": {
    "required": ["shippingAddress"]
  }
  // "else" is optional — omitting it means no extra constraints for non-physical
}

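// Multiple conditions with allOf.
// Each "if" also lists the property in "required": without it, an object that
// omits the property satisfies "if" vacuously and "then" is enforced anyway.
{
  "allOf": [
    {
      "if": { "properties": { "role": { "const": "admin" } }, "required": ["role"] },
      "then": { "required": ["adminToken"] }
    },
    {
      "if": { "properties": { "type": { "const": "business" } }, "required": ["type"] },
      "then": { "required": ["taxId", "companyName"] }
    }
  ]
}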
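
A subtlety of if/then is that properties ignores keys that are absent, so an "if" that only constrains a property passes when that property is missing, unless the "if" also lists it in required. The toy model below covers only that narrow case; Conditional and missingKeys are hypothetical names.

```typescript
// Hypothetical sketch of if/then for one narrow case:
// "if" compares a single property to a const, "then" lists required keys.
interface Conditional {
  ifProperty: string
  ifConst: unknown
  ifRequiresPresence: boolean  // models "required": [ifProperty] inside "if"
  thenRequired: string[]
}

function missingKeys(rule: Conditional, data: Record<string, unknown>): string[] {
  const present = rule.ifProperty in data
  // "properties" skips absent keys, so without "required" an absent key passes "if".
  const ifPasses = present
    ? data[rule.ifProperty] === rule.ifConst
    : !rule.ifRequiresPresence
  if (!ifPasses) return []
  return rule.thenRequired.filter((key) => !(key in data))
}

const looseRule: Conditional = {
  ifProperty: 'role',
  ifConst: 'admin',
  ifRequiresPresence: false,
  thenRequired: ['adminToken'],
}
const strictRule: Conditional = { ...looseRule, ifRequiresPresence: true }

const looseOnEmpty = missingKeys(looseRule, {})                    // ['adminToken']
const strictOnEmpty = missingKeys(strictRule, {})                  // []
const adminNoToken = missingKeys(strictRule, { role: 'admin' })    // ['adminToken']
const viewerNoToken = missingKeys(strictRule, { role: 'viewer' })  // []
```

The looseOnEmpty result is the surprising one: an object with no role at all is told it needs an adminToken.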

Where to Validate in a Production API

Not every layer of your application should re-validate. Here is the right placement:

  • Request ingestion (always): Validate every incoming payload at the API boundary before it touches business logic. This is where AJV or Zod lives. Fail fast with 422 Unprocessable Content and a detailed error list. Never pass unvalidated data deeper.
  • Configuration loading (always): Validate config files and environment variables at startup. A misconfigured service should crash immediately on start, not fail silently mid-request. JSON Schema with AJV is a common approach for Node.js config validation.
  • Webhook ingestion (always): Third-party systems change. Validate webhook payloads against a schema and alert on schema violations — this is your early warning system for upstream contract changes.
  • Database reads (usually not): Data coming out of your own database has already been validated on write. Re-validating every query result adds latency with no real benefit unless your DB schema and application schema have drifted.
  • Inter-service calls (depends): In a microservices environment, validate responses from other teams' services at your service boundary. Assume nothing. Your compile-time types do not guarantee the other team's runtime behavior.
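
As an illustration of the "crash at startup" rule for configuration, here is a dependency-free sketch. The AppConfig shape, the PORT and DATABASE_URL variable names, and the postgres:// prefix check are assumptions made for the example; in a real service the same checks would typically live in a schema.

```typescript
// Hypothetical config shape and variable names, for illustration only.
interface AppConfig {
  port: number
  databaseUrl: string
}

function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const errors: string[] = []

  // Environment variables are always strings, so converting here is deliberate.
  const port = Number(env['PORT'])
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    errors.push('PORT must be an integer between 1 and 65535')
  }

  const databaseUrl = env['DATABASE_URL'] ?? ''
  if (!databaseUrl.startsWith('postgres://')) {
    errors.push('DATABASE_URL must start with postgres://')
  }

  // Report every problem at once, then refuse to start.
  if (errors.length > 0) {
    throw new Error(`Invalid configuration: ${errors.join('; ')}`)
  }
  return { port, databaseUrl }
}

const config = loadConfig({ PORT: '8080', DATABASE_URL: 'postgres://localhost/app' })

let startupError = ''
try {
  loadConfig({ PORT: 'abc' })
} catch (err) {
  startupError = err instanceof Error ? err.message : String(err)
}
```

In a Node.js service you would call loadConfig(process.env) once, before the server starts listening, and let the thrown error terminate the process.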

Use our JSON Formatter to pretty-print and inspect JSON payloads during development before writing your schema — it is much faster to understand structure visually than from a raw response string.

Common Mistakes and How to Avoid Them

1. type: "number" vs type: "integer"

JSON Schema's number accepts any numeric value including floats. integer requires the value to be a whole number (no decimal component). 1.0 passes integer validation in most validators because mathematically it has no fractional part; 1.5 does not.
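
In JavaScript the distinction maps onto Number.isInteger, which is a reasonable mental model for what an integer check does after parsing. The isJsonInteger helper below is a hypothetical name for that check.

```typescript
// After JSON.parse there is no separate integer type: 1.0 and 1 are the same number.
const isJsonInteger = (value: unknown): boolean =>
  typeof value === 'number' && Number.isInteger(value)

const parsedWhole = JSON.parse('1.0') as unknown

const wholeFloat = isJsonInteger(parsedWhole)  // true: no fractional part
const fractional = isJsonInteger(1.5)          // false
const numericString = isJsonInteger('1')       // false: a string is never an integer
```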

2. Missing additionalProperties: false on Request Schemas

Without additionalProperties: false, clients can send any extra fields. This can lead to mass assignment vulnerabilities if extra fields are passed directly to an ORM — for example, a client sending {"role": "admin"} on a signup endpoint that does not explicitly allow role. Always close schemas on ingestion boundaries. See our REST API best practices guide for more security patterns.
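
The effect of closing a schema can be sketched without a validator: compare the keys the client sent against the keys the endpoint allows. SIGNUP_FIELDS and unknownKeys are hypothetical names for this illustration.

```typescript
// Hypothetical allow-list for a signup endpoint.
const SIGNUP_FIELDS = ['email', 'password'] as const

// Returns the keys a closed schema (additionalProperties: false) would reject.
function unknownKeys(body: Record<string, unknown>, allowed: readonly string[]): string[] {
  return Object.keys(body).filter((key) => !allowed.includes(key))
}

const suspicious = { email: 'a@example.com', password: 'hunter2', role: 'admin' }
const legitimate = { email: 'a@example.com', password: 'hunter2' }

const rejectedFields = unknownKeys(suspicious, SIGNUP_FIELDS)  // ['role']
const cleanFields = unknownKeys(legitimate, SIGNUP_FIELDS)     // []
```

Rejecting the request outright, rather than silently stripping role, is what makes the attempt visible in logs and to the client.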

3. Validating Response Schemas with additionalProperties: false

The reverse mistake: using strict closed-object schemas for responses or third-party data you do not control. If the server adds a new field, your validator rejects valid responses. Response schemas should allow extra properties (omit the keyword or set it to true) while validating only what you care about.

4. Schema Compilation in the Hot Path

// ❌ Do NOT compile schemas per-request — compilation is expensive
app.post('/users', (req, res) => {
  const validate = ajv.compile(userSchema)  // ~1-5ms overhead every request
  validate(req.body)
})

// ✅ Compile once at module load time
const validateUser = ajv.compile(userSchema)  // runs once on startup

app.post('/users', (req, res) => {
  validateUser(req.body)  // pure function call, <0.1ms
})

Frequently Asked Questions

What is JSON Schema and what is it used for?

JSON Schema is a vocabulary for describing the structure and constraints of JSON data. It defines valid types, required fields, value ranges, string patterns, and object shapes. It powers OpenAPI 3.x request/response definitions, VS Code IntelliSense for JSON files, Kubernetes manifest validation, and API contract testing. Current stable spec: draft 2020-12.

What is the difference between JSON Schema and Zod?

JSON Schema is a language-agnostic spec stored as JSON, working across Python, Go, Java, and JavaScript. Zod is TypeScript-first with code-based schemas, providing superior TypeScript inference via z.infer<T>. Zod can export JSON Schema via zod-to-json-schema for interoperability. Use AJV for cross-language schemas; use Zod for TypeScript-first projects where type inference matters.

Which is the fastest JSON Schema validator for Node.js?

AJV is generally the fastest option. It compiles schemas to optimized JS functions at startup, achieving ~3–5 million validations per second for typical schemas. Zod is 5–10x slower (~300K–500K ops/sec) but provides better TypeScript DX. For high-throughput APIs validating thousands of requests per second, AJV's compiled validation is the right choice.

How do I validate nested objects with JSON Schema?

Use the properties keyword for object shape and required array for mandatory fields. For reusable nested types, define them in $defs and reference with $ref: "#/$defs/TypeName". This allows circular references (e.g., a node containing an array of nodes) and avoids schema duplication across multiple endpoint schemas.

What does additionalProperties: false do in JSON Schema?

It rejects objects containing keys not listed in properties. Use it for API request validation where unknown fields indicate client errors. Never use it for response validation — a new server field would break old client validators. For composed schemas (allOf/oneOf), use unevaluatedProperties: false instead, which works correctly across sub-schema boundaries.

How do I validate that a JSON field matches one of several types?

Use oneOf (exactly one match, exclusive), anyOf (at least one match, overlapping), or allOf (all match simultaneously). For pure type unions, use "type": ["string", "number"] — simpler than anyOf for type-only checks. For discriminated unions (different shapes based on a type field), use oneOf with const on the discriminator property.

Can JSON Schema validate date and datetime formats?

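Yes, via the format keyword. Draft 2020-12 defines date-time (RFC 3339, e.g. 2026-04-19T10:30:00Z), date, time, email, uri, uuid, and ipv4, among others. The spec treats format as an annotation by default, and AJV ships no format implementations in its core package: add them with ajv-formats via addFormats(new Ajv()). Without the plugin, AJV's default strict mode refuses to compile a schema that uses an unknown format; set validateFormats: false if you want format treated as a non-validating annotation.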
