Observability

Health Checks

Medusa v2 exposes a built-in /_health endpoint. Add a custom /health endpoint alongside it for richer diagnostics.

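// src/api/health/route.ts
import type { MedusaRequest, MedusaResponse } from "@medusajs/framework/http"

// checkDbConnection is a project helper (not shown here), assumed to resolve to true when the database is reachable.
export const GET = async (req: MedusaRequest, res: MedusaResponse) => {
  // Treat a rejected check the same as a failed one
  const db = await checkDbConnection().catch(() => false)
  // Respond with 503 when the database is down so uptime monitors register the failure
  res.status(db ? 200 : 503).json({
    status: db ? "ok" : "error",
    db: db ? "ok" : "error",
    version: process.env.npm_package_version, // only set when started through an npm script
    timestamp: new Date().toISOString(),
  })
}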
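
The route above relies on a database check that is not shown. One way to keep such a check from hanging the health endpoint is to race it against a timeout. The sketch below is a minimal illustration under assumptions: pingWithTimeout and its ping parameter are hypothetical names, and ping stands for whatever trivial query the project uses (for example a SELECT 1).

```typescript
// Hypothetical helper: resolves to true if `ping` succeeds within `timeoutMs`, false otherwise.
async function pingWithTimeout(
  ping: () => Promise<unknown>,
  timeoutMs = 2000
): Promise<boolean> {
  let timer: ReturnType<typeof setTimeout> | undefined
  // A promise that only ever rejects, once the timeout elapses
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("db ping timed out")), timeoutMs)
  })
  try {
    await Promise.race([ping(), timeout])
    return true
  } catch {
    // Either the ping failed or the timeout won the race
    return false
  } finally {
    // Clear the timer so it does not keep the event loop alive
    if (timer !== undefined) clearTimeout(timer)
  }
}
```

A checkDbConnection helper could then be a thin wrapper that passes the project's own query function as ping.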

Monitoring Stack by Phase

Tool                 | Monitors                                             | Phase
Better Uptime        | Uptime, response time, SSL expiry, /health endpoint  | 1+
Sentry (frontend)    | JS errors, Core Web Vitals, source maps              | 1+
Sentry (backend)     | Unhandled exceptions, slow queries                   | 1+
Vercel Analytics     | Page performance, real user monitoring               | 1+
Fly.io Metrics       | CPU, memory, instance restarts                       | 1+
OpenTelemetry        | Full distributed tracing                             | 3
New Relic / Datadog  | APM, DB query analysis                               | 3

Key Custom Metrics to Track

  • Download token validation rate (success/expired/exhausted/forbidden) — feed to Sentry
  • Invite acceptance rate — track in Medusa subscriber, log to console (Sentry picks it up)
  • Failed login attempts per IP — already rate-limited; log threshold hits to Sentry
  • Stripe webhook processing time — measure in webhook handler, alert if > 5s
  • R2 presigned URL generation failures — alert immediately
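
The timing metrics in this list (for example the 5s Stripe webhook threshold) can be captured with a small wrapper around the handler. The following is a sketch, not the project's actual implementation: timed, SlowHandler, and the onSlow callback are hypothetical names, and what onSlow does (log a warning, send a Sentry message) is left to the caller.

```typescript
type SlowHandler = (label: string, elapsedMs: number) => void

// Runs `fn`, measures wall-clock time, and calls `onSlow` when the threshold is exceeded.
// The measurement happens in `finally`, so failed calls are timed as well.
async function timed<T>(
  label: string,
  thresholdMs: number,
  fn: () => Promise<T>,
  onSlow: SlowHandler
): Promise<T> {
  const start = Date.now()
  try {
    return await fn()
  } finally {
    const elapsedMs = Date.now() - start
    if (elapsedMs > thresholdMs) onSlow(label, elapsedMs)
  }
}
```

In a webhook handler this might be called as timed("stripe.webhook", 5000, () => processEvent(event), reportSlow), where processEvent and reportSlow are placeholders for the project's own functions.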

Structured Logging

Use Medusa's built-in logger (this.logger_ in the examples below), not console.log. Structured logs ship to a Fly.io log drain starting in Phase 2.

// ✅ Correct
this.logger_.info("[DownloadService]: Token validated", { tokenId, customerId })
this.logger_.error("[DownloadService]: R2 presign failed", { productId, error })
 
// ❌ Wrong
console.log("token validated")
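
Depending on the logger's type signature in the Medusa version in use (worth checking, since some logger methods may accept only a single message argument), the context object may need to be folded into the message itself. The sketch below shows one way to do that; formatLog is a hypothetical helper, not part of Medusa. It also expands Error values explicitly, because JSON.stringify turns a plain Error into "{}".

```typescript
// Hypothetical helper: formats a scoped message plus context as a single log line.
function formatLog(
  scope: string,
  message: string,
  context: Record<string, unknown> = {}
): string {
  const safe: Record<string, unknown> = {}
  for (const [key, value] of Object.entries(context)) {
    // Error properties are non-enumerable, so copy the useful ones by hand
    safe[key] = value instanceof Error
      ? { name: value.name, message: value.message }
      : value
  }
  const suffix = Object.keys(safe).length > 0 ? " " + JSON.stringify(safe) : ""
  return `[${scope}]: ${message}${suffix}`
}
```

A call site would then look like this.logger_.info(formatLog("DownloadService", "Token validated", { tokenId, customerId })).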

OpenTelemetry (Phase 3)

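Medusa supports OpenTelemetry (OTel) through an instrumentation file at the project root (instrumentation.ts in a TypeScript project). Add it in Phase 3 when moving to Railway with a Redis event bus.

// instrumentation.ts (Phase 3)
import { registerOtel } from "@medusajs/medusa"
// Medusa calls register() at startup. serviceName is a placeholder; add an exporter option to ship spans to a tracing backend.
export function register() {
  registerOtel({ serviceName: "medusa-backend", instrument: { http: true, workflows: true, query: true } })
}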