BytePane

REST API Best Practices: Design, Security & Performance

API · 18 min read

84% of organizations experienced an API security incident in the past 12 months — per a 2024 survey of 1,200 IT and security professionals. The 2025 Global State of API Security Report (Traceable AI) found that 57% of organizations suffered API-related data breaches in the past two years, and 28% of those breaches compromised sensitive data and critical systems. Meanwhile, a Treblle analysis of over 1 billion API calls found that only 15% of APIs implement rate limiting.

The problem is rarely that developers don't know the basics. It's that security and performance best practices get treated as optional enhancements rather than table stakes. This guide covers the practices that actually matter in production.

Key Takeaways

  • REST powers 83% of web services, but the gap between a functional API and a production-grade one is enormous — most organizations are not closing it.
  • Rate limiting is the single most neglected practice: only 15% of APIs implement it per Treblle's analysis of 1B+ calls. It protects both availability and your budget.
  • Use 401 for unauthenticated and 403 for unauthorized: they are semantically different, and gateways and security tools rely on the distinction for logging and alerting.
  • Never expose stack traces, internal IDs, or database errors in API responses — this is an OWASP API Security Top 10 failure mode.
  • Per the Postman 2025 State of the API report (5,700+ respondents), 82% of organizations have adopted API-first design — the remaining 18% consistently struggle with integration churn.

Common Mistakes vs. Best Practices

Most of these violations are not exotic edge cases — they show up repeatedly in code reviews, security audits, and production incidents. Here's the pattern of what separates functional APIs from well-engineered ones:

| Area | Common Mistake | Best Practice |
| --- | --- | --- |
| HTTP Methods | POST for everything (/getUsers) | GET/POST/PUT/PATCH/DELETE per semantics |
| Error codes | Always return 200, encode status in body | Use correct 4xx/5xx with structured body |
| Error body | Bare string or raw exception message | Structured JSON with code, message, request_id |
| Rate limiting | No limits (85% of APIs per Treblle) | 429 + Retry-After + rate limit headers |
| Versioning | Break existing clients on updates | URL path versioning (/v1/) |
| Pagination | Return entire collection without limit | Cursor or offset pagination with max page size |
| CORS | Access-Control-Allow-Origin: * everywhere | Allowlist specific origins; block * for auth endpoints |
| Secrets in responses | Stack traces, SQL errors, internal IDs | Opaque error codes + request_id for tracing |

Authentication: Choosing the Right Method

Authentication is not one-size-fits-all. The right choice depends on your API's consumers (internal microservices, third-party developers, browser clients), your security requirements, and your operational complexity tolerance. Here is a practical decision framework:

API Keys: Simple but Risky If Mishandled

API keys are the simplest authentication mechanism — a single secret string passed in a header. They are appropriate for server-to-server communication where the client can store secrets securely, and for public APIs where you want low integration friction.

# Always in a header — never in the URL (URLs are logged)
GET /api/v1/data
Authorization: Bearer sk_live_abc123xyz

# Wrong: key in query parameter (gets logged everywhere)
GET /api/v1/data?api_key=sk_live_abc123xyz  # ✗

# Rotate keys without downtime: support overlapping keys
GET /api/v1/data
Authorization: Bearer sk_live_newkey456     # ✓ new key works
# Old key still valid for 24h grace period

JWT Bearer Tokens: Stateless, Scalable

JWTs are the standard for stateless API authentication. The token contains signed claims — no server-side session lookup required. This makes them ideal for horizontally scaled services. The cost is revocation complexity: you cannot invalidate a JWT before it expires without a blocklist, which reintroduces statefulness.

// JWT structure: header.payload.signature
// All three parts are Base64URL-encoded

// Validate on every request — don't trust unverified claims
import jwt from 'jsonwebtoken';

function authenticate(req, res, next) {
  const token = req.headers.authorization?.split(' ')[1];
  if (!token) return res.status(401).json({ error: { code: 'MISSING_TOKEN' } });

  try {
    const payload = jwt.verify(token, process.env.JWT_SECRET, {
      algorithms: ['HS256'],   // Never accept 'none'
      issuer: 'api.example.com',
      audience: 'api-clients',
    });
    req.user = payload;
    next();
  } catch (err) {
    return res.status(401).json({ error: { code: 'INVALID_TOKEN' } });
  }
}

Never accept the none algorithm in JWT validation — this was the vector for a category of CVEs where attackers forged tokens by stripping the signature. Always specify the expected algorithm explicitly.
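
To make the header.payload.signature structure concrete, here is a minimal sketch that decodes a token without verifying it and checks the declared algorithm against an allowlist. The helper names (decodeSegment, inspectJwt, hasAllowedAlg) are hypothetical and not part of jsonwebtoken; the check only complements jwt.verify and never replaces it.

```javascript
// Decode one Base64URL segment of a JWT into a parsed JSON object.
function decodeSegment(segment) {
  const base64 = segment.replace(/-/g, '+').replace(/_/g, '/');
  const padded = base64 + '='.repeat((4 - (base64.length % 4)) % 4);
  const bytes = Uint8Array.from(atob(padded), (c) => c.charCodeAt(0));
  return JSON.parse(new TextDecoder().decode(bytes));
}

// Hypothetical helper: inspect a token WITHOUT verifying its signature.
// Anyone can read these claims, so never put secrets in a JWT payload.
function inspectJwt(token) {
  const parts = token.split('.');
  if (parts.length !== 3) throw new Error('Expected header.payload.signature');
  return {
    header: decodeSegment(parts[0]),
    payload: decodeSegment(parts[1]),
    signature: parts[2],
  };
}

// Cheap pre-check: reject tokens whose header declares an algorithm
// outside the allowlist (including "none") before doing any other work.
function hasAllowedAlg(token, allowed = ['HS256']) {
  try {
    const alg = inspectJwt(token).header.alg;
    return typeof alg === 'string' && allowed.includes(alg);
  } catch {
    return false; // malformed tokens are never acceptable
  }
}
```

Decoding is useful for debugging and for fast rejection, but only signature verification (as in the authenticate middleware above) establishes that the claims can be trusted.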

OAuth 2.0: Delegated Authorization

OAuth 2.0 is for delegated authorization: it lets third-party clients act on behalf of a user without handling the user's password. Use the Authorization Code + PKCE flow for public clients (SPAs, mobile apps) and the Client Credentials flow for machine-to-machine calls. Never use the deprecated Implicit flow for new integrations.

# OAuth 2.0 Authorization Code + PKCE flow (simplified)

# Step 1: Redirect user to authorization server
GET https://auth.example.com/oauth/authorize
  ?response_type=code
  &client_id=app_123
  &redirect_uri=https://yourapp.com/callback
  &scope=read:users write:reports
  &code_challenge=BASE64URL(SHA256(code_verifier))  # PKCE
  &code_challenge_method=S256
  &state=random_csrf_token

# Step 2: Exchange code for tokens (the token endpoint expects a form-encoded body, not JSON)
POST https://auth.example.com/oauth/token
Content-Type: application/x-www-form-urlencoded

grant_type=authorization_code
&code=auth_code_from_redirect
&redirect_uri=https://yourapp.com/callback
&client_id=app_123
&code_verifier=original_random_string   # PKCE verifier

# Step 3: Use access token
GET /api/v1/users
Authorization: Bearer eyJhbGciOiJSUzI1NiJ9...
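
Step 1 and the redirect back to your app can be sketched in a few lines. This is an illustration under assumptions: buildAuthorizeUrl and readAuthorizationCode are hypothetical helpers, and the code challenge is assumed to have been computed elsewhere as BASE64URL(SHA256(code_verifier)).

```javascript
// Build the Step 1 redirect URL. URLSearchParams handles the encoding,
// including the space-separated scope list.
function buildAuthorizeUrl({ authorizeEndpoint, clientId, redirectUri, scopes, codeChallenge, state }) {
  const url = new URL(authorizeEndpoint);
  url.search = new URLSearchParams({
    response_type: 'code',
    client_id: clientId,
    redirect_uri: redirectUri,
    scope: scopes.join(' '),
    code_challenge: codeChallenge, // precomputed from the code_verifier
    code_challenge_method: 'S256',
    state,
  }).toString();
  return url.toString();
}

// Handle the redirect back to the app: check for errors, confirm the
// state value matches what this client sent, then return the code.
function readAuthorizationCode(callbackUrl, expectedState) {
  const params = new URL(callbackUrl).searchParams;
  const error = params.get('error');
  if (error) throw new Error(`Authorization failed: ${error}`);
  if (params.get('state') !== expectedState) {
    throw new Error('State mismatch: possible CSRF, discard this response');
  }
  const code = params.get('code');
  if (!code) throw new Error('Missing authorization code');
  return code;
}
```

The state comparison is what ties the callback to the request this client actually started; the returned code is then exchanged in Step 2 together with the original code_verifier.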

Rate Limiting: The Most Neglected Practice

A Treblle analysis of over 1 billion API calls to 500,000 endpoints across 15,000 real-world APIs found that only 15% implement any form of rate limiting. This is the single largest gap between best practice and actual practice. Without rate limiting, a single misbehaving client — or attacker — can exhaust your server resources, drive up your cloud bill, or trigger downstream throttling from third-party services.

Rate Limit Response Headers

# Successful response — include rate limit state
HTTP/1.1 200 OK
X-RateLimit-Limit: 1000        # Max requests per window
X-RateLimit-Remaining: 847     # Remaining in current window
X-RateLimit-Reset: 1742169600  # Unix timestamp when window resets
Content-Type: application/json

# Rate limit exceeded
HTTP/1.1 429 Too Many Requests
Retry-After: 30                # Seconds until they can retry
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1742169600

{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "You have exceeded 1000 requests per hour.",
    "retry_after_seconds": 30,
    "request_id": "req_a1b2c3d4"
  }
}
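
On the client side, these headers are only useful if callers honor them. Here is a minimal sketch of how a client might turn a 429 response into a wait time; retryDelayMs is a hypothetical helper, and it assumes header names have already been lowercased into a plain object.

```javascript
// Work out how long to wait before retrying, in milliseconds.
// Returns null when the response carries no usable hint.
function retryDelayMs(headers, nowMs = Date.now()) {
  const retryAfter = headers['retry-after'];
  if (retryAfter !== undefined) {
    const value = String(retryAfter).trim();
    // Retry-After may be a number of seconds...
    if (/^\d+$/.test(value)) return Number(value) * 1000;
    // ...or an HTTP date.
    const dateMs = Date.parse(value);
    if (!Number.isNaN(dateMs)) return Math.max(0, dateMs - nowMs);
  }
  // Fall back to the window reset time (Unix seconds, as in the example above).
  const resetSeconds = Number(headers['x-ratelimit-reset']);
  if (Number.isFinite(resetSeconds) && resetSeconds > 0) {
    return Math.max(0, resetSeconds * 1000 - nowMs);
  }
  return null;
}
```

A caller would typically sleep for the returned delay (plus a little random jitter) before retrying, and give up after a bounded number of attempts.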

Choosing a Rate Limiting Algorithm

| Algorithm | Behavior | Best For |
| --- | --- | --- |
| Fixed Window | Count resets at fixed intervals | Simple quotas, predictable behavior |
| Sliding Window | Rolling count over the last N seconds | Prevents burst attacks at window boundaries |
| Token Bucket | Tokens refill at constant rate; burst allowed up to bucket size | APIs with legitimate burst patterns (batch uploads) |
| Leaky Bucket | Requests processed at constant rate; excess queued or dropped | Smooth traffic shaping, downstream protection |

Implement rate limiting at the reverse proxy or API gateway layer (Nginx, Kong, Cloudflare, AWS API Gateway), not in application code. This protects you even before requests reach your service, and works correctly across multiple application instances without requiring shared state in your app.
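
To show what a gateway is doing under the hood, here is a minimal in-memory sketch of the token bucket algorithm from the table. It is an illustration only: the TokenBucket class and its option names are hypothetical, and because the state lives in one process it does not replace gateway-level limiting across multiple instances.

```javascript
// Token bucket: tokens refill at a constant rate, bursts are allowed
// up to the bucket capacity, and each request spends one token.
class TokenBucket {
  constructor({ capacity, refillPerSecond, now = () => Date.now() }) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now; // injectable clock, handy for tests
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefillMs = now();
  }

  tryRemoveToken() {
    // Refill lazily based on the time elapsed since the last call.
    const nowMs = this.now();
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;

    if (this.tokens >= 1) {
      this.tokens -= 1;
      return { allowed: true, retryAfterSeconds: 0 };
    }
    // Not enough tokens: report how long until one full token is available.
    const retryAfterSeconds = Math.ceil((1 - this.tokens) / this.refillPerSecond);
    return { allowed: false, retryAfterSeconds };
  }
}
```

When allowed is false, retryAfterSeconds maps directly onto the Retry-After header of a 429 response. A real deployment would keep one bucket per client key and store the counters somewhere shared.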

Consistent Error Responses

Error responses are arguably more important to design carefully than success responses. When an integration breaks at 2 AM, the error body is all the on-call engineer has to go on. A well-structured error response answers three questions: what happened, why it happened, and what to do about it.

// The gold standard error envelope — use it for every error
// HTTP 422 Unprocessable Entity
{
  "error": {
    "code": "VALIDATION_ERROR",           // Machine-readable, stable string
    "message": "Request body is invalid.", // Human-readable, for developers
    "request_id": "req_7x9k2m4p",         // Correlates to your logs
    "documentation_url": "https://api.example.com/docs/errors#VALIDATION_ERROR",
    "details": [                           // Field-level validation errors
      {
        "field": "email",
        "code": "INVALID_FORMAT",
        "message": "Must be a valid email address.",
        "rejected_value": "not-an-email"
      },
      {
        "field": "age",
        "code": "OUT_OF_RANGE",
        "message": "Must be between 1 and 120.",
        "rejected_value": -5
      }
    ]
  }
}

// 500 error — NEVER expose internals
// HTTP 500 Internal Server Error
{
  "error": {
    "code": "INTERNAL_ERROR",
    "message": "An unexpected error occurred.",
    "request_id": "req_9z3n8v1q"   // Give this to support; maps to full trace internally
    // ✗ NEVER include: stack trace, SQL query, internal path, exception message
  }
}

Validate your error response shapes with our JSON Formatter before documenting them. Consistency across all endpoints is more important than the specific schema — if your team debates the naming for more than 15 minutes, pick one and move on.
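
One way to keep the envelope consistent is to build every error response through a single function. The sketch below assumes a hypothetical ApiError class for errors you raise deliberately; anything else is treated as an unexpected failure and collapsed into an opaque 500.

```javascript
// Build the error envelope used throughout this guide.
function errorEnvelope({ code, message, requestId, details }) {
  const error = { code, message, request_id: requestId };
  if (Array.isArray(details) && details.length > 0) error.details = details;
  return { error };
}

// Hypothetical error type for failures the API raises on purpose.
class ApiError extends Error {
  constructor(status, code, message, details) {
    super(message);
    this.status = status;
    this.code = code;
    this.details = details;
  }
}

// Map any thrown value to a safe { status, body } pair.
function toErrorResponse(err, requestId) {
  if (err instanceof ApiError) {
    return {
      status: err.status,
      body: errorEnvelope({ code: err.code, message: err.message, requestId, details: err.details }),
    };
  }
  // Unknown failure: log the full error server-side under requestId,
  // but never copy its message or stack into the response.
  return {
    status: 500,
    body: errorEnvelope({ code: 'INTERNAL_ERROR', message: 'An unexpected error occurred.', requestId }),
  };
}
```

In an Express app this logic would typically live in the final error-handling middleware, so individual routes never format errors themselves.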

HTTP Status Code Decision Tree

| Code | Use When | Common Mistake |
| --- | --- | --- |
| 200 OK | GET/PUT/PATCH succeeded | Using 200 for errors with an error body |
| 201 Created | POST created a resource | Returning 200 after creation, or omitting the Location header |
| 204 No Content | DELETE succeeded, no body | Returning 200 with empty body instead |
| 400 Bad Request | Malformed syntax, unparseable body | Using 400 for all validation errors (use 422) |
| 401 Unauthorized | Missing or invalid credentials | Using 403 when authentication is missing |
| 403 Forbidden | Authenticated but lacks permission | Using 404 to hide resource existence (valid, but only when intentional) |
| 404 Not Found | Resource does not exist | Using 400 for not-found resources |
| 409 Conflict | Duplicate create, state conflict | Using 400 for uniqueness violations |
| 422 Unprocessable Entity | Syntactically valid but semantically invalid | Using 400 for all validation failures |
| 429 Too Many Requests | Rate limit exceeded | Using 503 or returning no Retry-After |
| 503 Service Unavailable | Intentional downtime / overload | Returning 500 during planned maintenance |

API Versioning in Practice

The SmartBear API survey of 1,100+ API professionals found that 71% of teams use API versioning, but only 57% of those use semantic versioning conventions. The other 43% use ad hoc schemes — which creates ambiguity for consumers about what constitutes a breaking change.

What Counts as a Breaking Change

| Change Type | Breaking ✗ | Non-Breaking ✓ |
| --- | --- | --- |
| Fields | Removing or renaming a field | Adding optional fields |
| Types | Changing a field's type (string → int) | Adding new enum values (when clients are told to tolerate unknown values) |
| Endpoints | Removing or renaming an endpoint | Adding new endpoints |
| Auth | Requiring new scopes | Adding optional auth methods |
| Status codes | Changing a success code to a different success code | Adding new error codes |

# URL path versioning (recommended)
GET /api/v1/users/42      # stable, visible in logs and browser
GET /api/v2/users/42      # new version coexists

# Deprecation: communicate timeline in headers before removing
HTTP/1.1 200 OK
Deprecation: @1767225600                 # RFC 9745: Unix time of deprecation (2026-01-01); older drafts used "true"
Sunset: Thu, 31 Dec 2026 23:59:59 GMT
Link: </api/v2/users/42>; rel="successor-version"

# Versioning your OpenAPI spec
openapi: "3.1.0"
info:
  version: "2.4.1"   # Semver: major.minor.patch
  title: "Example API"
  # major = breaking, minor = feature, patch = bugfix
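
The field and type rows of the breaking-change table can be checked mechanically. Below is a deliberately simplified sketch: findBreakingChanges is a hypothetical helper, and it assumes each payload is described by a flat map of field name to { type, required }, which is far less than a real OpenAPI diff would cover.

```javascript
// Compare two flat field maps and list the changes that would break clients.
function findBreakingChanges(oldFields, newFields) {
  const problems = [];

  for (const [name, oldField] of Object.entries(oldFields)) {
    const newField = newFields[name];
    if (!newField) {
      problems.push(`removed or renamed field: ${name}`);
      continue;
    }
    if (newField.type !== oldField.type) {
      problems.push(`type changed for ${name}: ${oldField.type} -> ${newField.type}`);
    }
  }

  // Adding optional fields is fine; adding required ones is not.
  for (const [name, newField] of Object.entries(newFields)) {
    if (!(name in oldFields) && newField.required) {
      problems.push(`new required field: ${name}`);
    }
  }
  return problems;
}
```

A non-empty result suggests the change belongs in a new major version (/v2/) rather than in place; an empty result does not prove safety, since endpoint, auth, and status code changes are outside this sketch.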

Pagination: Offset vs. Cursor

Any endpoint returning a collection must support pagination with a configurable page size ceiling (typically 100-250 items). Never return unbounded collections — at scale, a single GET /orders with no limit can return millions of rows.

Offset Pagination

GET /api/v1/orders?page=3&per_page=25

// Response
{
  "data": [ ... ],
  "pagination": {
    "page": 3,
    "per_page": 25,
    "total_items": 1250,
    "total_pages": 50,
    "next": "/api/v1/orders?page=4&per_page=25",
    "prev": "/api/v1/orders?page=2&per_page=25"
  }
}

// The problem: if a record is inserted on page 2 while you're reading page 3,
// you'll see a duplicate. If one is deleted, you'll skip an item.
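
Enforcing the page size ceiling is a few lines of code. This sketch assumes hypothetical helper names (parsePageParams, buildPagination) and a ceiling of 100; it turns raw query strings into a safe offset and produces the pagination object shown above.

```javascript
const MAX_PER_PAGE = 100; // assumed ceiling; pick what your data can sustain

// Turn untrusted query parameters into a bounded page, size, and offset.
function parsePageParams(query, { defaultPerPage = 25, maxPerPage = MAX_PER_PAGE } = {}) {
  const page = Math.max(1, Number.parseInt(query.page, 10) || 1);
  const requested = Number.parseInt(query.per_page, 10) || defaultPerPage;
  const perPage = Math.min(maxPerPage, Math.max(1, requested));
  return { page, perPage, offset: (page - 1) * perPage };
}

// Build the "pagination" object for the response body.
function buildPagination({ page, perPage }, totalItems, basePath) {
  const totalPages = Math.max(1, Math.ceil(totalItems / perPage));
  const link = (p) => `${basePath}?page=${p}&per_page=${perPage}`;
  return {
    page,
    per_page: perPage,
    total_items: totalItems,
    total_pages: totalPages,
    next: page < totalPages ? link(page + 1) : null,
    prev: page > 1 ? link(page - 1) : null,
  };
}
```

The offset and perPage values feed straight into a LIMIT/OFFSET query, so a request for per_page=100000 costs no more than a request for 100.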

Cursor Pagination (Recommended for Live Data)

// Cursor is an opaque Base64-encoded pointer to last seen record
GET /api/v1/orders?limit=25&after=eyJpZCI6MTAwLCJ0cyI6MTcwMDAwMH0=

// Response
{
  "data": [ ... ],
  "pagination": {
    "next_cursor": "eyJpZCI6MTI1LCJ0cyI6MTcwMDAxMH0=",
    "prev_cursor": "eyJpZCI6NzYsInRzIjoxNjk5OTk1fQ==",
    "has_more": true
  }
}

// Server-side: decode cursor to get the last seen primary key
// SELECT * FROM orders WHERE id > :cursor_id ORDER BY id LIMIT :limit
// No COUNT(*) needed — much faster on large tables

Cursor pagination has no concept of "total pages" or "jump to page 50" — this is intentional. For use cases that require random access (admin dashboards, reporting), offset pagination is fine. For infinite scroll, activity feeds, or any data that changes frequently, cursor pagination is the correct choice. Use our Base64 Encoder/Decoder to inspect and debug cursor values during development.
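
Here is a minimal sketch of the cursor mechanics, using an in-memory array in place of the SQL query. The helper names (encodeCursor, decodeCursor, pageAfter) are hypothetical, and the cursor is assumed to carry only the last seen id.

```javascript
// Cursors are opaque to clients: Base64-encoded JSON of the last seen position.
function encodeCursor(position) {
  return btoa(JSON.stringify(position));
}

function decodeCursor(cursor) {
  try {
    const position = JSON.parse(atob(cursor));
    if (!Number.isInteger(position.id)) throw new Error('missing id');
    return position;
  } catch {
    throw new Error('Invalid cursor'); // map this to a 400 response
  }
}

// Keyset pagination: "rows with id greater than the last one the client saw".
function pageAfter(rows, { after, limit = 25, maxLimit = 100 } = {}) {
  const size = Math.min(Math.max(1, limit), maxLimit);
  const lastSeenId = after ? decodeCursor(after).id : -Infinity;

  // Fetch one extra row to learn whether another page exists.
  const candidates = rows
    .filter((row) => row.id > lastSeenId)
    .sort((a, b) => a.id - b.id)
    .slice(0, size + 1);
  const data = candidates.slice(0, size);

  return {
    data,
    pagination: {
      next_cursor: data.length > 0 ? encodeCursor({ id: data[data.length - 1].id }) : null,
      has_more: candidates.length > size,
    },
  };
}
```

Because each page is defined relative to the last seen id rather than a row count, deleting or inserting earlier rows between requests does not shift the next page. Treat the cursor as untrusted input: decode it defensively and sign it if it ever encodes anything sensitive.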

Security Headers and CORS

OWASP API Security Top 10 includes "Broken Object Level Authorization", "Broken Function Level Authorization", and "Security Misconfiguration" as three of the top API vulnerabilities. Proper security headers are a baseline requirement that takes under an hour to configure and protects against entire categories of attacks.

# Nginx: API-appropriate security headers
# Let the application set Content-Type; add_header would append a duplicate header
add_header X-Content-Type-Options "nosniff" always;
add_header X-Frame-Options "DENY" always;
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
add_header Referrer-Policy "no-referrer" always;

// In your API framework (Express.js example with helmet and cors)
import express from 'express';
import helmet from 'helmet';
import cors from 'cors';

const app = express();
app.use(helmet());

// CORS: scope to specific origins — never wildcard authenticated endpoints
const corsOptions = {
  origin: process.env.ALLOWED_ORIGINS?.split(',') ?? [],
  methods: ['GET', 'POST', 'PUT', 'PATCH', 'DELETE'],
  allowedHeaders: ['Content-Type', 'Authorization'],
  credentials: true,    // Allows cookies/auth headers
  maxAge: 86400,        // Cache preflight for 24h
};
app.use(cors(corsOptions));

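// ✗ NEVER for authenticated endpoints:
// Access-Control-Allow-Origin: *    (any site can read responses; browsers also reject * on credentialed requests)
// Equally risky: echoing back whatever Origin the request sent while credentials are enabled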
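
For readers who want to see what an allowlist check amounts to without the cors package, here is a small framework-free sketch. parseAllowedOrigins and corsHeadersFor are hypothetical helpers; the environment variable format (comma-separated origins) follows the example above.

```javascript
// Parse a comma-separated allowlist such as
// "https://app.example.com,https://admin.example.com".
function parseAllowedOrigins(envValue) {
  return new Set(
    (envValue ?? '')
      .split(',')
      .map((origin) => origin.trim())
      .filter(Boolean),
  );
}

// Decide which CORS headers to send for a given request Origin.
// Unknown origins get no CORS headers at all, so the browser blocks the read.
function corsHeadersFor(requestOrigin, allowedOrigins) {
  if (!requestOrigin || !allowedOrigins.has(requestOrigin)) return {};
  return {
    'Access-Control-Allow-Origin': requestOrigin, // exact match, never "*"
    'Access-Control-Allow-Credentials': 'true',
    Vary: 'Origin', // keep caches from reusing one origin's response for another
  };
}
```

The comparison is an exact string match on purpose: prefix or substring checks are a common way for lookalike domains to slip through.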

Input Validation: Defense in Depth

// Validate at the API boundary — never trust client input
// Example using Zod (TypeScript schema validation)
import { z } from 'zod';

const CreateUserSchema = z.object({
  name: z.string().trim().min(1).max(100),   // trim first so whitespace-only names fail min(1)
  email: z.string().email().toLowerCase(),
  age: z.number().int().min(1).max(120).optional(),
  role: z.enum(['user', 'admin']).default('user'),
});

app.post('/api/v1/users', async (req, res) => {
  const result = CreateUserSchema.safeParse(req.body);
  if (!result.success) {
    return res.status(422).json({
      error: {
        code: 'VALIDATION_ERROR',
        message: 'Request body contains invalid fields.',
        details: result.error.issues.map(issue => ({
          field: issue.path.join('.'),
          code: issue.code,
          message: issue.message,
        })),
        request_id: req.id,
      }
    });
  }
  // result.data is now type-safe and sanitized
  const user = await createUser(result.data);
  res.status(201).json({ data: user });
});

Performance: Caching, Compression, and Field Selection

REST has a structural advantage over GraphQL for caching: GET requests with fixed URLs can be cached at the CDN level with zero application code. This is a significant performance win that many teams leave on the table by using POST for read operations or by not setting cache headers correctly.

HTTP Caching Headers

// Public, cacheable for 1 hour (good for reference data)
HTTP/1.1 200 OK
Cache-Control: public, max-age=3600, stale-while-revalidate=600
ETag: "33a64df551425fcc55e4d42a148795d9f25f89d4"

// Private, user-specific data
HTTP/1.1 200 OK
Cache-Control: private, max-age=60
Vary: Authorization

// Conditional GET: client revalidates with ETag
GET /api/v1/products/42
If-None-Match: "33a64df551425fcc55e4d42a148795d9f25f89d4"

// Server response if unchanged (saves bandwidth entirely)
HTTP/1.1 304 Not Modified

// Disable caching for sensitive endpoints
Cache-Control: no-store, no-cache, must-revalidate
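
The conditional GET exchange above boils down to one comparison on the server. This is a rough sketch with hypothetical helpers (etagMatches, respondConditionally); it assumes the resource's ETag is already known, lowercased header names, and ETag values that contain no commas.

```javascript
// Does the client's If-None-Match header match the current ETag?
// Uses weak comparison (a leading W/ is ignored), as is usual for GET.
function etagMatches(ifNoneMatch, currentEtag) {
  if (!ifNoneMatch) return false;
  if (ifNoneMatch.trim() === '*') return true;
  const normalize = (tag) => tag.trim().replace(/^W\//, '');
  return ifNoneMatch.split(',').some((candidate) => normalize(candidate) === normalize(currentEtag));
}

// Return 304 with no body when the client's copy is still current.
function respondConditionally(requestHeaders, resource) {
  const headers = { ETag: resource.etag, 'Cache-Control': 'public, max-age=3600' };
  if (etagMatches(requestHeaders['if-none-match'], resource.etag)) {
    return { status: 304, headers, body: null };
  }
  return { status: 200, headers, body: resource.body };
}
```

The 304 path skips serialization and transfer of the body, which is where the bandwidth saving comes from.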

Field Selection and Response Shaping

// Allow clients to request only the fields they need
// Reduces payload size — critical for mobile clients and high-frequency polling
GET /api/v1/users/42?fields=id,name,email

// Response — only requested fields
{
  "id": 42,
  "name": "Alice Chen",
  "email": "alice@example.com"
  // billing_address, payment_methods, preferences — not included
}

// Also support embedding related resources to reduce round trips
GET /api/v1/orders/7?include=user,items

// Response — avoids N+1 requests
{
  "id": 7,
  "status": "shipped",
  "user": { "id": 42, "name": "Alice Chen" },
  "items": [...]
}
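
Field selection should work against an allowlist, so the fields parameter can never pull out attributes you did not mean to expose. A minimal sketch, assuming a hypothetical selectFields helper and a per-resource list of public fields:

```javascript
// Return only the requested fields, limited to those marked public.
// With no fields parameter, every public field is returned.
function selectFields(resource, fieldsParam, publicFields) {
  const requested = fieldsParam
    ? fieldsParam.split(',').map((name) => name.trim()).filter(Boolean)
    : publicFields;

  const selected = {};
  for (const name of requested) {
    // Unknown or non-public names are ignored here; rejecting them
    // with a 400 is an equally reasonable design.
    if (publicFields.includes(name) && name in resource) {
      selected[name] = resource[name];
    }
  }
  return selected;
}
```

For example, selectFields(user, 'id,name', ['id', 'name', 'email']) yields only id and name, and a request for password_hash returns nothing for that field because it is not on the public list.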

For large API responses, always enable gzip/Brotli compression at the proxy level. Brotli achieves 20-26% better compression than gzip on JSON, and virtually all modern clients support it. Use our JSON Formatter to understand your response structure and spot unnecessarily verbose fields.

Observability: Logging, Tracing, and Metrics

The Postman 2025 State of the API report found that 93% of teams struggle with API collaboration, leading to duplicated work and integration delays. A significant part of this is inadequate observability — without structured logs and request tracing, diagnosing cross-service issues becomes archaeology rather than engineering.

// Structured logging: every request should produce a log line like this
{
  "timestamp": "2026-03-17T14:32:01.456Z",
  "level": "info",
  "request_id": "req_a1b2c3d4",           // Same ID in error responses
  "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",  // OpenTelemetry trace ID (32 hex characters)
  "method": "POST",
  "path": "/api/v1/users",
  "status": 422,
  "duration_ms": 12,
  "user_id": null,                          // null = unauthenticated
  "ip": "203.0.113.42",
  "user_agent": "MyApp/2.1.0"
}

// Add X-Request-Id to every response so clients can report it
// This is the single most valuable debugging aid in distributed systems
HTTP/1.1 422 Unprocessable Entity
X-Request-Id: req_a1b2c3d4
Content-Type: application/json

Build APIs Faster with BytePane Tools

Format and validate API responses with the JSON Formatter, encode query parameters with the URL Encoder, generate and decode Base64 cursors for pagination, and decode JWT tokens to inspect claims — all free, in-browser, no setup required.

Frequently Asked Questions

Should I use URL path versioning or header versioning?

URL path versioning (/api/v1/users) is the most common approach for public APIs; Stripe and many Google APIs carry a version segment in the path. It makes the version visible in every log line, browser network tab, and support ticket. Per the SmartBear survey of 1,100+ API professionals, 71% use API versioning, and URL path is the dominant approach. Header versioning is architecturally purer but considerably harder to debug.

What is the difference between 401 and 403?

401 Unauthorized means the request lacks valid credentials — the client is not authenticated. Despite the confusing name, it means "unauthenticated." 403 Forbidden means credentials are valid but the user lacks permission for this specific resource or action. Return 401 for missing, expired, or malformed tokens. Return 403 when the authenticated user simply doesn't have access. CDNs and security tools use this distinction for logging and alerting.

How should I implement rate limiting?

Implement rate limiting at the reverse proxy or API gateway level (Nginx, Kong, Cloudflare) — not in application code — so it works across all instances. Return rate limit state in headers (X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset). Return 429 with a Retry-After header when exceeded. Per Treblle's analysis of 1 billion API calls, only 15% of APIs implement rate limiting — leaving the majority vulnerable.

What is the difference between PUT and PATCH?

PUT replaces the entire resource — omitted fields are set to null or defaults. PATCH applies a partial update — only fields in the request body are changed. PUT is idempotent; identical PUT requests always produce the same resource state. PATCH is not guaranteed idempotent (though it can be designed to be). Use PATCH for field-level updates to avoid accidentally nulling fields the client didn't intend to modify.

When should I use cursor pagination vs offset pagination?

Use cursor pagination for any data that changes frequently: activity feeds, notification lists, real-time order streams. Cursor pagination is immune to the "shifting window" problem (no duplicate or skipped items when rows are inserted/deleted mid-traversal) and avoids expensive COUNT(*) queries. Use offset pagination for stable, rarely-changing data where users need random access to arbitrary pages (admin reports, CSV exports).

What security headers should my REST API return?

At minimum: X-Content-Type-Options: nosniff, Strict-Transport-Security with a year-long max-age, scoped CORS (never * for authenticated endpoints), and Content-Type: application/json. OWASP API Security Top 10 lists Security Misconfiguration as a leading category — improper CORS is one of the most common findings in security audits.

How do I handle errors consistently across all endpoints?

Define one error envelope and enforce it everywhere: a machine-readable code, a human-readable message, a request_id for log correlation, and a details array for field-level validation failures. Never expose stack traces, SQL errors, or internal paths in production. The request_id lets support engineers find the full trace in your logging system without exposing internals to end users.
