31  API Gateway vs Middleware: Architectural Difference

The key difference is WHERE and HOW MANY TIMES you implement cross-cutting concerns such as authentication, rate limiting, CORS, and logging.

31.1 Architecture Comparison

31.1.1 Without API Gateway (Middleware in Each Service)

                    Internet/Clients
                           │
           ┌───────────────┼───────────────┐
           │               │               │
           ↓               ↓               ↓
    ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
    │  Service A   │ │  Service B   │ │  Service C   │
    │  (Imaging)   │ │  (Reports)   │ │  (Patients)  │
    ├──────────────┤ ├──────────────┤ ├──────────────┤
    │ Auth MW      │ │ Auth MW      │ │ Auth MW      │ ← Duplicated
    │ CORS MW      │ │ CORS MW      │ │ CORS MW      │ ← Duplicated
    │ Logging MW   │ │ Logging MW   │ │ Logging MW   │ ← Duplicated
    │ Rate Limit   │ │ Rate Limit   │ │ Rate Limit   │ ← Duplicated
    ├──────────────┤ ├──────────────┤ ├──────────────┤
    │ Business     │ │ Business     │ │ Business     │
    │ Logic        │ │ Logic        │ │ Logic        │
    └──────────────┘ └──────────────┘ └──────────────┘
         ↓               ↓               ↓
    ┌─────────────────────────────────────────┐
    │           Database Layer                │
    └─────────────────────────────────────────┘

Problems:
- Code duplication across services
- Each service exposed to internet
- Hard to maintain consistency
- Each service needs SSL certificates
- Different teams might implement differently

31.1.2 With API Gateway (Centralized)

                    Internet/Clients
                           │
                           ↓
                ┌────────────────────┐
                │   API Gateway      │  ← Single Entry Point
                ├────────────────────┤
                │ SSL Termination    │
                │ Authentication     │  ← Once, for all services
                │ Authorization      │
                │ Rate Limiting      │
                │ CORS               │
                │ Logging            │
                │ Request Routing    │
                │ Load Balancing     │
                │ Circuit Breaking   │
                └────────────────────┘
                           │
           ┌───────────────┼───────────────┐
           │               │               │
           ↓               ↓               ↓
    ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
    │  Service A   │ │  Service B   │ │  Service C   │
    │  (Imaging)   │ │  (Reports)   │ │  (Patients)  │
    │              │ │              │ │              │
    │ Business     │ │ Business     │ │ Business     │ ← Clean!
    │ Logic Only   │ │ Logic Only   │ │ Logic Only   │
    │              │ │              │ │              │
    └──────────────┘ └──────────────┘ └──────────────┘
         ↓               ↓               ↓
    ┌─────────────────────────────────────────┐
    │           Database Layer                │
    └─────────────────────────────────────────┘

Benefits:
- No code duplication
- Services not exposed to internet
- Centralized security
- One SSL certificate
- Consistent behavior

31.2 Key Differences

31.2.1 Single Entry Point vs Multiple Endpoints

Without Gateway:

Client needs to know:
- imaging.hospital.com/api/scan
- reports.hospital.com/api/report
- patients.hospital.com/api/patient

Each service has its own domain/IP

With Gateway:

Client only knows:
- api.hospital.com/imaging/scan
- api.hospital.com/reports/report
- api.hospital.com/patients/patient

Gateway routes internally to services
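
To make the "routes internally" step concrete, here is a minimal sketch of prefix-based routing in Python. The internal hostnames, the port, and the `resolve_upstream` helper are assumptions made for illustration; a real gateway would express this in its own routing configuration.

```python
# Hypothetical routing table: public path prefix -> internal base URL.
ROUTES = {
    "/imaging": "http://imaging-svc.internal:8000",
    "/reports": "http://reports-svc.internal:8000",
    "/patients": "http://patients-svc.internal:8000",
}


def resolve_upstream(path: str) -> str:
    """Map a public gateway path to the internal URL that should serve it."""
    for prefix, base_url in ROUTES.items():
        # Match only on a path-segment boundary so that "/imaging-old"
        # does not accidentally match the "/imaging" route.
        if path == prefix or path.startswith(prefix + "/"):
            return base_url + "/api" + path[len(prefix):]
    raise LookupError(f"no route for {path}")


print(resolve_upstream("/imaging/scan"))
# prints: http://imaging-svc.internal:8000/api/scan
```

The client only ever sees `api.hospital.com`; the mapping to internal addresses can change without any client noticing.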

31.2.2 Request Routing & Service Discovery

API Gateway handles:

                    api.hospital.com
                           │
                    API Gateway
                    (Service Discovery)
                           │
              ┌────────────┼────────────┐
              ↓            ↓            ↓
         imaging-svc   reports-svc  patients-svc
         (10.0.1.5)    (10.0.1.6)   (10.0.1.7)
              │            │            │
         Load Balance   Load Balance  Load Balance
              │            │            │
         ┌────┴───┐   ┌────┴───┐   ┌────┴───┐
         ↓    ↓   ↓   ↓    ↓   ↓   ↓    ↓   ↓
       inst1 inst2   inst1 inst2   inst1 inst2

Gateway knows:
- Which service is available
- How many instances are running
- Health status of each instance
- How to load balance
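
One way to picture that bookkeeping is a small in-memory registry that tracks instances per service and round-robins over the healthy ones. This is only a sketch: the `ServiceRegistry` and `Instance` names are hypothetical, and production gateways usually delegate this to a discovery system with active health checks.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Instance:
    address: str
    healthy: bool = True  # would be updated by a health checker (not shown)


@dataclass
class ServiceRegistry:
    services: dict[str, list[Instance]] = field(default_factory=dict)
    _counters: dict[str, count] = field(default_factory=dict)

    def register(self, name: str, address: str) -> Instance:
        """Record a new instance of a service and return its handle."""
        instance = Instance(address)
        self.services.setdefault(name, []).append(instance)
        self._counters.setdefault(name, count())
        return instance

    def pick(self, name: str) -> str:
        """Round-robin over the instances currently marked healthy."""
        healthy = [i for i in self.services.get(name, []) if i.healthy]
        if not healthy:
            raise RuntimeError(f"no healthy instance for {name}")
        turn = next(self._counters[name])
        return healthy[turn % len(healthy)].address
```

Registering two instances of `imaging-svc` and calling `pick("imaging-svc")` repeatedly alternates between them; setting `healthy = False` on one removes it from rotation on the next call.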

31.2.3 Cross-Cutting Concerns: Where They Live

┌─────────────────────────────────────────────────────────┐
│                                                         │
│  Cross-Cutting Concerns in API Gateway                 │
│                                                         │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐   │
│  │   Rate      │  │    Auth     │  │   Logging   │   │
│  │  Limiting   │  │             │  │             │   │
│  │             │  │             │  │             │   │
│  │ Global:     │  │ Global:     │  │ Global:     │   │
│  │ 1000 req/hr │  │ JWT verify  │  │ All traffic │   │
│  │ per user    │  │ API keys    │  │ centralized │   │
│  └─────────────┘  └─────────────┘  └─────────────┘   │
│                                                         │
└─────────────────────────────────────────────────────────┘
                           │
           ┌───────────────┼───────────────┐
           ↓               ↓               ↓
    Service A        Service B        Service C
    (Clean code)     (Clean code)     (Clean code)


vs


┌─────────────────────────────────────────────────────────┐
│  Service A                                              │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐   │
│  │ Rate Limit  │  │    Auth     │  │   Logging   │   │
│  └─────────────┘  └─────────────┘  └─────────────┘   │
│  Business Logic                                         │
└─────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────┐
│  Service B                                              │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐   │
│  │ Rate Limit  │  │    Auth     │  │   Logging   │   │
│  └─────────────┘  └─────────────┘  └─────────────┘   │
│  Business Logic                                         │
└─────────────────────────────────────────────────────────┘

Duplicated everywhere!
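
As an illustration of a concern that lives once at the gateway, the "1000 req/hr per user" limit from the diagram could be sketched as a fixed-window counter. The class name and the injectable `clock` parameter are assumptions for this example; real gateways typically keep these counters in a shared store such as Redis so that all gateway instances agree.

```python
import time
from collections import defaultdict


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per user in each fixed time window."""

    def __init__(self, limit: int = 1000, window_seconds: int = 3600,
                 clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        # (user_id, window_index) -> requests counted in that window.
        # Old windows are never evicted here; a real limiter would expire them.
        self.counts = defaultdict(int)

    def allow(self, user_id: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        key = (user_id, window)
        if self.counts[key] >= self.limit:
            return False  # over the limit: the gateway would answer HTTP 429
        self.counts[key] += 1
        return True
```

Because the check runs at the gateway, the limit applies across all services at once, which per-service middleware cannot do without sharing state.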

31.3 Real-World Medical AI Example

31.3.1 Scenario: Radiology AI Platform

Without Gateway:

# Service A: X-ray Analysis (Python/FastAPI)
from fastapi import FastAPI

app = FastAPI()

@app.middleware("http")
async def auth_middleware(request, call_next):
    verify_jwt(request)  # Implement auth (helper not shown)
    return await call_next(request)  # Continue to the next layer

@app.middleware("http")
async def rate_limit_middleware(request, call_next):
    check_rate_limit(request)  # Implement rate limiting (helper not shown)
    return await call_next(request)

@app.post("/analyze-xray")
async def analyze():
    # Business logic
    pass


// Service B: Report Generation (Node.js/Express)
const express = require("express");
const app = express();

app.use((req, res, next) => {
    verifyJWT(req);  // Re-implement auth in a different language!
    next();          // Continue to the next middleware
});

app.use((req, res, next) => {
    checkRateLimit(req);  // Re-implement rate limiting!
    next();
});

app.post("/generate-report", (req, res) => {
    // Business logic
});


// Service C: DICOM Storage (C#/.NET)
public class AuthMiddleware {
    // Re-implement auth AGAIN in C#!
}

public class RateLimitMiddleware {
    // Re-implement rate limiting AGAIN!
}

Problems:
- Same logic written 3 times in 3 languages
- If you change JWT algorithm, update 3 services
- Inconsistent behavior possible
- Each team maintains their own auth

With Gateway (Kong, AWS API Gateway, etc.):

# API Gateway Configuration (One place!)
# Illustrative pseudo-config: Kong, AWS API Gateway, etc. each use their own schema

routes:
  - path: /imaging/*
    service: xray-analysis-service
    plugins:
      - jwt-auth
      - rate-limiting: 100/minute
      - cors
      - logging
  
  - path: /reports/*
    service: report-generation-service
    plugins:
      - jwt-auth
      - rate-limiting: 50/minute
      - cors
      - logging
  
  - path: /dicom/*
    service: dicom-storage-service
    plugins:
      - jwt-auth
      - rate-limiting: 200/minute
      - cors
      - logging

# Service A: X-ray Analysis - CLEAN!
from fastapi import Request

@app.post("/analyze-xray")
async def analyze(request: Request):
    # JWT already verified by gateway
    # Rate limit already checked
    # Just focus on business logic
    image = await request.body()  # raw image bytes from the request
    result = ai_model.predict(image)
    return result

// Service B: Report Generation - CLEAN!
app.post("/generate-report", (req, res) => {
    // Auth already done by gateway
    // Just generate the report
    const report = generateReport(req.body);  // needs a body parser such as express.json()
    res.json(report);
});
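
The per-route plugin lists above can be thought of as wrappers applied around an otherwise clean handler. The following framework-free Python sketch shows that idea; the `Request`/`Response` shapes, the `VALID_TOKENS` set (a stand-in for real JWT verification), and all function names are assumptions, not how any particular gateway implements plugins.

```python
from typing import Callable

Request = dict    # e.g. {"path": "/imaging/scan", "headers": {...}}
Response = tuple  # (status_code, body)
Handler = Callable[[Request], Response]
Plugin = Callable[[Handler], Handler]

VALID_TOKENS = {"demo-token"}  # stand-in for real JWT verification


def require_auth(next_handler: Handler) -> Handler:
    """Reject requests without a known bearer token."""
    def handler(request: Request) -> Response:
        token = request.get("headers", {}).get("Authorization", "")
        if token.removeprefix("Bearer ") not in VALID_TOKENS:
            return 401, "unauthorized"
        return next_handler(request)
    return handler


def log_requests(log: list) -> Plugin:
    """Build a plugin that records (path, status) for every request."""
    def plugin(next_handler: Handler) -> Handler:
        def handler(request: Request) -> Response:
            status, body = next_handler(request)
            log.append((request["path"], status))
            return status, body
        return handler
    return plugin


def apply_plugins(service: Handler, plugins: list) -> Handler:
    # Wrap in reverse so the first plugin listed runs first.
    for plugin in reversed(plugins):
        service = plugin(service)
    return service


def analyze_xray(request: Request) -> Response:
    return 200, "analysis result"  # business logic only


access_log: list = []
imaging_route = apply_plugins(analyze_xray, [log_requests(access_log), require_auth])
```

Calling `imaging_route` with no `Authorization` header yields a 401 without ever reaching `analyze_xray`, while a request carrying `Bearer demo-token` reaches the business logic; both outcomes end up in `access_log`.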

31.4 When to Use Each Approach

31.4.1 Use Individual Service Middleware When:

Simple Architecture (Monolith or few services)

    Client
      │
      ↓
┌─────────────┐
│  One App    │
│  +          │
│  Middleware │
└─────────────┘

  • ✅ You have a monolith or very few services (1-3)
  • ✅ All services in the same language/framework
  • ✅ Small team maintaining everything
  • ✅ Don’t need advanced routing
  • ✅ Simple deployment

31.4.2 Use API Gateway When:

Complex Architecture (True Microservices)

         Clients
           │
      API Gateway  ← Single point of management
           │
    ┌──────┼──────┬──────┬──────┐
    ↓      ↓      ↓      ↓      ↓
   Svc1  Svc2   Svc3  Svc4   Svc5  ← Many services

  • ✅ Many microservices (5+)
  • ✅ Different languages/frameworks
  • ✅ Multiple teams
  • ✅ Need service discovery
  • ✅ Need advanced routing/load balancing
  • ✅ Want centralized security
  • ✅ Need API versioning
  • ✅ Want to hide internal architecture

31.6 Hybrid Approach (Common in Practice)

                 Internet
                    │
             ┌──────┴──────┐
             │             │
        Public API    Internal Services
             │             │
             ↓             ↓
      ┌────────────┐  ┌────────────┐
      │ API Gateway│  │ Service    │
      │            │  │ Mesh       │
      │ - Auth     │  │ (Istio)    │
      │ - Rate Lmt │  │            │
      │ - CORS     │  │ - Retry    │
      │ - SSL      │  │ - Circuit  │
      └────────────┘  └────────────┘
             │             │
        ┌────┴─────┬───────┴────┬─────┐
        ↓          ↓            ↓     ↓
     Service A  Service B  Service C  Service D
     (public)   (public)   (internal) (internal)

  • API Gateway: For external/public-facing concerns
  • Service Mesh: For internal service-to-service communication
  • Middleware: For service-specific logic
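
The circuit-breaking behavior that a gateway or mesh provides can be sketched in a few lines of Python. This is a simplified illustration under assumed names (`CircuitBreaker`, `CircuitOpenError`) and thresholds; Istio and similar tools configure this declaratively rather than in application code.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock  # injectable so the breaker can be tested
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: let this call through as a trial (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # A success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Wrapping each upstream call in `breaker.call(...)` means a failing service is skipped quickly during the cool-down instead of tying up callers with repeated timeouts.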

31.7 Summary

Aspect              Middleware in Services   API Gateway
------------------  -----------------------  -------------
Code Duplication    High                     None
Maintenance         Each service             Central
Consistency         Can vary                 Enforced
Single Entry Point  No                       Yes
Service Discovery   Manual                   Automatic
Learning Curve      Low                      Medium-High
Best For            Simple apps              Microservices

Think of it this way:

  • Middleware = A security guard at every building entrance
  • API Gateway = One security checkpoint at the campus entrance