Day 16: Basic logging with Winston. Success: Log API requests to console.

Lesson 16 60 min

Day 16: Basic Logging with Winston. Success: Log API requests to console.

Welcome back, engineers! Today, we're diving into a topic that might seem mundane at first glance but is absolutely foundational to building resilient, observable, and maintainable systems: logging. Specifically, we'll implement basic API request logging using Winston, a powerful and highly flexible logging library for Node.js.

You might be thinking, "Just console.log() it, right?" In a small script, sure. But for a hyperscale CRM handling 100 million requests per second, console.log() is like trying to find a needle in a haystack while blindfolded and deaf. We need structure, levels, and destinations.

Agenda for Day 16

Core Concept: Understanding the critical role of structured logging in distributed systems.
Why Winston?: A brief overview of Winston's capabilities and why it's a good fit.
Component Architecture: Where our logger fits within our CRM API.
Hands-on Implementation: Setting up Winston and creating an Express middleware to log API requests.
Verification: Confirming our logs are captured correctly.

Core Concept: Beyond `console.log()` – The Power of Observability

Flowchart diagram

In the world of ultra-high-scale systems, the ability to understand what's happening inside your black boxes is paramount. This is what we call observability. Logging is one of its three pillars (the others being metrics and tracing).

Imagine our CRM platform processing millions of customer interactions. A user reports an issue: "My lead isn't showing up in the sales pipeline." How do you debug this?

Was the request even received?
Did it hit the right endpoint?
Was the authentication successful?
Did an internal service fail?
What data was sent, and what was returned?

Without robust logging, you're flying blind.

Why Structured Logging is a Game Changer

Traditional console.log("User logged in: " + userId) generates unstructured text. While human-readable, it's terrible for machines. When you have terabytes of logs daily, you need to parse, filter, and analyze them programmatically.

Structured logging means outputting logs as machine-readable data, typically JSON.
Instead of:
[2023-10-27T10:30:00.000Z] INFO User logged in: user123

You get:
{"timestamp":"2023-10-27T10:30:00.000Z", "level":"info", "message":"User logged in", "userId":"user123", "operation":"login"}

This JSON object can be easily ingested by log aggregation tools (like Elasticsearch, Splunk, Datadog) and queried. You can quickly filter for all "login" operations for userId="user123" that occurred within a specific time window. This is indispensable for debugging, security auditing, and operational insights.

Logging Levels: Managing the Noise

Not all log messages are equally important. Winston, like most professional loggers, supports various logging levels:

error: Critical issues that prevent functionality.
warn: Potentially problematic situations.
info: General application flow, important events.
http: HTTP request/response details (often used in web frameworks).
verbose: Detailed information.
debug: Debugging information, often too noisy for production.
silly: Extremely detailed, for deep debugging.

In production, you might only log info, warn, and error to reduce volume and noise. In development, you'd use debug or verbose. This flexibility is crucial for performance and cost management at scale.

Architecture: Fitting Logging into our CRM API

Component Architecture

Our CRM API is built on Express. The most effective way to log incoming API requests is via middleware. Middleware functions in Express sit between the request being received and the route handler being executed. This allows us to intercept every request, log its details, and then pass it along.

Code

+----------------+
|  Client Request|
+-------+--------+
        |
        v
+-------+--------+
|  Express Server|
+-------+--------+
        |
        v
+----------------+
| Logging Middleware | <-- Intercepts request, logs details
+--------+-------+
         |
         v
+--------+-------+
|  Auth Middleware |
+--------+-------+
         |
         v
+--------+-------+
| Route Handler  | <-- e.g., /api/auth/signup
+--------+-------+
         |
         v
+--------+-------+
|  Response Sent |
+----------------+

Our winston.js will configure the logger, and logger.js (or a specific requestLogger.js) will implement the middleware using that configured logger.

Hands-on Implementation: Setting Up Winston

Let's get our hands dirty. We'll configure Winston to log to the console initially. This is perfect for development. Later, we'll explore file transports, daily rotation, and remote log aggregation.

1. Install Winston

First, ensure you're in your project directory (crm-api from previous lessons).

bash

npm install winston

2. Configure the Winston Logger (`src/config/winston.js`)

We'll create a central logger instance. This ensures all parts of our application use the same logging configuration.

javascript

// src/config/winston.js
const winston = require('winston');

// Define your custom log format
const logFormat = winston.format.combine(
  winston.format.timestamp({ format: 'YYYY-MM-DD HH:mm:ss' }),
  winston.format.colorize(), // Add colors to levels
  winston.format.printf(
    (info) => `${info.timestamp} ${info.level}: ${info.message}`
  )
);

// Create the logger instance
const logger = winston.createLogger({
  level: process.env.NODE_ENV === 'production' ? 'info' : 'debug', // Log 'info' and above in prod, 'debug' and above in dev
  format: logFormat,
  transports: [
    // Console transport for development and general visibility
    new winston.transports.Console()
    // In future lessons, we'll add file transports,
    // daily rotate files, and potentially remote transports (e.g., for ELK stack)
  ],
});

module.exports = logger;

Insight: Notice process.env.NODE_ENV. This is a critical pattern. We log debug in development (more verbose) and info in production (less verbose). This prevents our production logs from becoming an unmanageable torrent of data. At 100M RPS, every byte logged matters for performance and cost.

3. Create Request Logging Middleware (`src/middleware/logger.js`)

Now, let's create an Express middleware that uses our logger instance to log incoming HTTP requests.

javascript

// src/middleware/logger.js
const logger = require('../config/winston');

const requestLogger = (req, res, next) => {
  // Log basic request details
  logger.info(`[HTTP] ${req.method} ${req.originalUrl} from ${req.ip}`);

  // You can add more details here, e.g., request body (careful with sensitive data!)
  // if (Object.keys(req.body).length > 0) {
  //   logger.debug(`Request Body: ${JSON.stringify(req.body)}`);
  // }

  // Continue to the next middleware or route handler
  next();
};

module.exports = requestLogger;

Insight: We're logging req.method, req.originalUrl, and req.ip. For a real production system, you'd often add a correlation ID (also known as a request ID) to each log entry. This ID would be generated at the edge of your system and passed down through all services involved in processing that request. This allows you to trace a single request's journey across multiple microservices, which is invaluable for debugging complex distributed systems. We'll explore correlation IDs in a future lesson.

4. Integrate Middleware into `src/server.js`

Finally, we need to tell our Express app to use this middleware. We'll place it early in the middleware chain so it logs all incoming requests before they hit specific route handlers.

javascript

// src/server.js (excerpt)
const express = require('express');
const dotenv = require('dotenv');
const authRoutes = require('./routes/auth');
const logger = require('./config/winston'); // Our Winston logger
const requestLogger = require('./middleware/logger'); // Our request logging middleware

dotenv.config();

const app = express();
const PORT = process.env.PORT || 3000;

// Apply request logger middleware BEFORE any other routes or middleware
app.use(requestLogger);

// Built-in middleware for parsing JSON and urlencoded data
app.use(express.json());
app.use(express.urlencoded({ extended: true }));

// Auth routes
app.use('/api/auth', authRoutes);

// Basic health check route
app.get('/', (req, res) => {
  res.send('CRM API is running!');
});

// Start the server
app.listen(PORT, () => {
  logger.info(`Server running on port ${PORT}`); // Use our logger
});

Verification: See it in Action!

State Machine diagram

Now, when you start your server and make requests, you should see structured log messages in your console.

Start your server: node src/server.js
Open another terminal and make a curl request:

bash

curl -X POST -H "Content-Type: application/json" -d '{"username": "testuser", "password": "password123"}' http://localhost:3000/api/auth/signup

Observe your server console. You should see output similar to this:

Code

2023-10-27 10:30:00 info: Server running on port 3000
2023-10-27 10:30:05 info: [HTTP] POST /api/auth/signup from ::1

(Your timestamp, IP, and exact message might vary slightly based on your system and Winston configuration.)

Success Criteria: You can see [HTTP] POST /api/auth/signup logged to your console when you hit the signup endpoint.

Real-world Production Application at 100 Million RPS

At 100 million requests per second, logging becomes an engineering challenge in itself.

Volume: You're generating an astronomical amount of log data. You cannot log everything. Sampling or extremely selective logging becomes crucial.
Performance: Synchronous logging is a no-go. Every I/O operation blocks your request thread. Asynchronous logging and buffering are essential.
Storage: You need a highly scalable, cost-effective log aggregation system (like an ELK stack - Elasticsearch, Logstash, Kibana, or cloud-native solutions like AWS CloudWatch Logs, Google Cloud Logging, Azure Monitor).
Cost: Storing and processing massive log volumes is expensive. This drives decisions on what to log, how long to retain it, and what level of detail is truly necessary.
Security & Compliance: Logging sensitive data is a major breach risk. Strict rules, data masking, and redaction must be in place. Audit trails for critical actions (e.g., data access, configuration changes) are mandatory for compliance.

Today, we laid the groundwork for robust logging. This seemingly small step is a giant leap towards building a truly observable and resilient CRM system that can scale to any demand.

Assignment: Enhance Logging with HTTP Status Codes

Your task is to modify the requestLogger middleware to also log the HTTP status code of the response after the request has been processed. This will give you a complete picture of each API interaction.

Steps:

In src/middleware/logger.js, modify the requestLogger middleware.
Use res.on('finish', ...) to capture the response status code.
Log the status code along with the request details.

Example Output (Expected):

Code

2023-10-27 10:35:00 info: [HTTP] POST /api/auth/signup from ::1
2023-10-27 10:35:00 info: [HTTP] Response for POST /api/auth/signup: 201

Solution Hints

The res object in Express middleware emits a finish event when the response has been sent.
You can access the status code via res.statusCode within the finish event listener.
Remember to use your logger instance for all log messages.

Learning Objectives

✓ Understand why secrets (like JWT keys) must not be hardcoded in source code.
✓ Learn how environment variables and .env files manage configuration securely.
✓ Implement dotenv in a Node.js backend to load secrets into process.env.
✓ Configure Docker Compose to inject environment variables into containers.
✓ Apply best practices like .gitignore and secret validation to prevent security leaks.

💬 Discuss this topic