Service Events: Rolling Your Own Event Logging


Before AI coding assistants, building your own logging infrastructure would have been a questionable use of time. Third-party services existed for good reason—the problem space is deeper than it appears. In areas like authentication, this advice still holds. Security vulnerabilities aren’t worth the savings.

But for event tracking and application-level logging, the calculation has changed. With AI assistance, you can build a focused solution that meets your needs without inheriting the complexity and cost of a general-purpose platform. Below is an example of a service you might now consider building yourself: logging.

This article describes a DynamoDB-backed event log with automatic cleanup that provides visibility across your system at a fraction of the cost of commercial alternatives. It also establishes a consistent logging interface across your entire system, so if you outgrow DynamoDB you can switch to a service like Datadog later.

Overview

  • Custom event logging — track business events across services without third-party platforms
  • DynamoDB storage — low-cost, serverless, with automatic TTL cleanup
  • Clear separation — events for business visibility, CloudWatch for debugging
  • Admin UI access — query events without logging into AWS console
  • Easy migration path — abstracted interface allows switching to commercial services later

What This Replaces (and Doesn’t)

Service events track business-level actions: user registrations, job completions, subscription changes. These are the things you want to see in an admin dashboard or correlate across services.

This does not replace CloudWatch for:

  • Per-request logs
  • HTTP access logs
  • Full stack traces
  • High-frequency loops
  • Raw console output

Use CloudWatch for debugging. Use service events for visibility into what your application is doing.

The Logger interface below provides methods for both: debug, info, warn, and error write to CloudWatch (via console.log), while event writes structured records to DynamoDB. This keeps a consistent API across all services while separating concerns. The writeEvent function is injected, so swapping from DynamoDB to a commercial service only requires changing that implementation.

Storage Strategy

DynamoDB provides serverless storage with automatic cleanup via TTL. This keeps costs low and eliminates the need for maintenance jobs.

The schema uses a simple partition key pattern:

pk: ORIGIN#{serviceName}
sk: {timestamp}#{eventType}

This pattern optimizes for the most common query: “show me recent events for service X.” The partition key groups all events from a single service together. The sort key combines timestamp and event type, so you can read events newest first and still narrow to a specific type with a filter expression when needed.
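
As a concrete (hypothetical) example, a completed job logged by a billing-service would be stored under:

pk: ORIGIN#billing-service
sk: 2024-05-01T12:00:00.000Z#JOB_COMPLETED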

This access pattern doesn’t support efficient queries by user ID or event type across all services—those require a GSI or a scan. For applications logging < 10K events/day, scanning recent records is fast enough. As volume grows, add a GSI with userId as the partition key or switch to a purpose-built logging service.

TTL varies by event importance:

  • Regular events: 30 days
  • Errors: 90 days

Adjust these based on your retention needs and query patterns.
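
The one-time table setup is small. Below is a sketch using the AWS SDK; the table name is a placeholder and on-demand billing is an assumption, and you could just as easily define the same table in CDK or Terraform.

import {
  CreateTableCommand,
  DynamoDBClient,
  UpdateTimeToLiveCommand,
  waitUntilTableExists,
} from "@aws-sdk/client-dynamodb";

// Placeholder name; point the EVENT_LOG_TABLE env var used later at this table.
const TABLE_NAME = "service-events";

export async function createEventLogTable(client: DynamoDBClient) {
  await client.send(
    new CreateTableCommand({
      TableName: TABLE_NAME,
      BillingMode: "PAY_PER_REQUEST", // serverless, pay per request
      AttributeDefinitions: [
        { AttributeName: "pk", AttributeType: "S" },
        { AttributeName: "sk", AttributeType: "S" },
      ],
      KeySchema: [
        { AttributeName: "pk", KeyType: "HASH" },
        { AttributeName: "sk", KeyType: "RANGE" },
      ],
    }),
  );

  // TTL can only be enabled once the table is ACTIVE.
  await waitUntilTableExists({ client, maxWaitTime: 120 }, { TableName: TABLE_NAME });
  await client.send(
    new UpdateTimeToLiveCommand({
      TableName: TABLE_NAME,
      TimeToLiveSpecification: { AttributeName: "ttl", Enabled: true },
    }),
  );
}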

Event Types

You aren’t building a general-purpose logging system; you’re building it for yourself, so the event types can be bespoke. Define them as an enum to prevent arbitrary strings:

export enum EventType {
  UserRegistered = "USER_REGISTERED",
  JobStarted = "JOB_STARTED",
  JobCompleted = "JOB_COMPLETED",
  JobFailed = "JOB_FAILED",
  SubscriptionUpdated = "SUBSCRIPTION_UPDATED",
  Error = "ERROR",
}

The record structure captures event context:

export interface EventRecord {
  pk: string;
  sk: string;
  origin: string;
  event: EventType;
  userId?: string;
  createdAt: string;
  description: string;
  payload?: Record<string, unknown>;
}

Logger Interface

The logger provides a consistent interface across services:

export interface Logger {
  event(
    event: EventType,
    data: {
      description: string;
      userId?: string;
      payload?: Record<string, unknown>;
    },
  ): void;

  debug(message: string, context?: unknown): void;
  info(message: string, context?: unknown): void;
  warn(message: string, context?: unknown): void;
  error(message: string, context?: unknown): void;
}

Standard log methods (debug, info, warn, error) write to CloudWatch. The event method writes durable records to DynamoDB. This separation keeps business events distinct from debugging output.

Implementation

The logger factory creates instances configured for each service:

export function createLogger(config: {
  origin: string;
  environment: "local" | "dev" | "prod";
  writeEvent: (event: EventRecord) => void;
}): Logger {
  function now(): string {
    return new Date().toISOString();
  }

  function safeWrite(event: EventRecord) {
    try {
      config.writeEvent(event);
    } catch {
      // logging must never fail business logic
    }
  }

  return {
    event(eventType, data) {
      const record: EventRecord = {
        pk: `ORIGIN#${config.origin}`,
        sk: `${now()}#${eventType}`,
        origin: config.origin,
        event: eventType,
        createdAt: now(),
        description: data.description,
        userId: data.userId,
        payload: data.payload,
      };

      safeWrite(record);
    },

    debug(message, context) {
      if (config.environment === "prod") return;
      console.debug(`[debug] ${message}`, context);
    },

    info(message, context) {
      console.info(`[info] ${message}`, context);
    },

    warn(message, context) {
      console.warn(`[warn] ${message}`, context);
    },

    error(message, context) {
      console.error(`[error] ${message}`, context);

      // Promote errors to durable events
      safeWrite({
        pk: `ORIGIN#${config.origin}`,
        sk: `${now()}#ERROR`,
        origin: config.origin,
        event: EventType.Error,
        createdAt: now(),
        description: message,
        payload: context as Record<string, unknown>,
      });
    },
  };
}

The safeWrite wrapper ensures logging failures never break business logic. Errors are promoted to durable events automatically, giving you a queryable record of failures.

DynamoDB Writer

The writer handles persistence with automatic TTL:

import { DynamoDBClient, PutItemCommand } from "@aws-sdk/client-dynamodb";
import { marshall } from "@aws-sdk/util-dynamodb";

export function createDynamoEventWriter(config: {
  tableName: string;
  client: DynamoDBClient;
}) {
  return (event: EventRecord) => {
    const ttlSeconds =
      event.event === EventType.Error
        ? 60 * 60 * 24 * 90 // 90 days
        : 60 * 60 * 24 * 30; // 30 days

    const item = {
      ...event,
      ttl: Math.floor(Date.now() / 1000) + ttlSeconds,
    };

    // fire-and-forget: swallow failures so a dropped log write never
    // becomes an unhandled rejection in business code
    config.client.send(
      new PutItemCommand({
        TableName: config.tableName,
        Item: marshall(item),
      }),
    ).catch(() => {});
  };
}

This uses fire-and-forget writes. The DynamoDB call isn’t awaited, and any failure is swallowed by the trailing catch. Events usually arrive within milliseconds, but network issues or throttling can cause writes to be dropped silently.

For application-level logging—tracking feature usage, monitoring job progress—this trade-off is acceptable. You’re building visibility, not an audit trail. Occasional missing events don’t break anything. If a write fails, no exception propagates to your business logic.

If you need guaranteed delivery (billing events, compliance logs), add error handling, retries, and potentially a dead-letter queue. But recognize that at that point, you’re building a more complex system that might warrant using a commercial service instead.
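
If you do go down that road, the change stays inside the writer. Here is a rough sketch of an awaited variant with one retry; the retry count and delay are arbitrary, and a production version might hand persistent failures to a queue instead.

import { DynamoDBClient, PutItemCommand } from "@aws-sdk/client-dynamodb";
import { marshall } from "@aws-sdk/util-dynamodb";

export function createReliableEventWriter(config: {
  tableName: string;
  client: DynamoDBClient;
}) {
  return async (event: EventRecord): Promise<void> => {
    const ttlSeconds =
      event.event === EventType.Error
        ? 60 * 60 * 24 * 90
        : 60 * 60 * 24 * 30;

    const item = marshall({
      ...event,
      ttl: Math.floor(Date.now() / 1000) + ttlSeconds,
    });

    // One retry with a short backoff; tune or replace with a queue as needed.
    for (let attempt = 0; attempt < 2; attempt++) {
      try {
        await config.client.send(
          new PutItemCommand({ TableName: config.tableName, Item: item }),
        );
        return;
      } catch (err) {
        if (attempt === 1) throw err; // surface the final failure to the caller
        await new Promise((resolve) => setTimeout(resolve, 250));
      }
    }
  };
}

Note that the Logger’s writeEvent is typed to return void; to actually await these writes you would also widen that signature to return a Promise and await it in safeWrite.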

Lambda Integration

A helper wraps Lambda handlers with automatic logging:

export function withLogging<TEvent, TResult>(
  handler: (
    event: TEvent,
    context: {
      awsRequestId: string;
      functionName: string;
      logger: Logger;
    },
  ) => Promise<TResult>,
) {
  // Create the writer once so the DynamoDB client is reused across warm invocations
  const writeEvent = createDynamoEventWriter({
    tableName: process.env.EVENT_LOG_TABLE!,
    client: new DynamoDBClient({}),
  });

  return async (event: TEvent, context: any): Promise<TResult> => {
    const logger = createLogger({
      origin: context.functionName,
      environment: process.env.NODE_ENV as any,
      writeEvent,
    });

    return handler(event, {
      awsRequestId: context.awsRequestId,
      functionName: context.functionName,
      logger,
    });
  };
}

Usage inside a Lambda:

export const handler = withLogging(async (event: any, ctx) => {
  ctx.logger.event(EventType.JobStarted, {
    description: "Job processing started",
    userId: event.userId,
    payload: { jobId: event.jobId },
  });

  ctx.logger.info("Fetching job input", { jobId: event.jobId });

  try {
    // do work...
    ctx.logger.event(EventType.JobCompleted, {
      description: "Job completed successfully",
      userId: event.userId,
      payload: { jobId: event.jobId },
    });
  } catch (err) {
    ctx.logger.error("Job failed", err);
    throw err;
  }
});

Querying Events

Retrieve events for a specific service:

import { DynamoDBClient, QueryCommand } from "@aws-sdk/client-dynamodb";
import { marshall, unmarshall } from "@aws-sdk/util-dynamodb";

export async function getEventsForOrigin(config: {
  tableName: string;
  client: DynamoDBClient;
  origin: string;
  limit?: number;
}) {
  const res = await config.client.send(
    new QueryCommand({
      TableName: config.tableName,
      KeyConditionExpression: "pk = :pk",
      ExpressionAttributeValues: marshall({
        ":pk": `ORIGIN#${config.origin}`,
      }),
      ScanIndexForward: false, // newest first
      Limit: config.limit ?? 50,
    }),
  );

  return (res.Items ?? []).map((item) => unmarshall(item) as EventRecord);
}

For querying by user or event type, add a GSI. For small-scale applications, scanning recent records often suffices.
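
If you get there, adding the index is a one-time table update. A sketch is below; the index name (byUser) and key attributes are one possible layout, not a fixed convention.

import { DynamoDBClient, UpdateTableCommand } from "@aws-sdk/client-dynamodb";

// Adds a hypothetical "byUser" index so events can be queried per user.
export async function addUserIndex(config: {
  tableName: string;
  client: DynamoDBClient;
}) {
  await config.client.send(
    new UpdateTableCommand({
      TableName: config.tableName,
      AttributeDefinitions: [
        { AttributeName: "userId", AttributeType: "S" },
        { AttributeName: "createdAt", AttributeType: "S" },
      ],
      GlobalSecondaryIndexUpdates: [
        {
          Create: {
            IndexName: "byUser",
            KeySchema: [
              { AttributeName: "userId", KeyType: "HASH" },
              { AttributeName: "createdAt", KeyType: "RANGE" },
            ],
            Projection: { ProjectionType: "ALL" },
          },
        },
      ],
    }),
  );
}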

Admin Dashboard

Build a simple admin interface to query events without accessing the AWS console. This makes debugging and monitoring faster than switching between services. For tips on building admin tools, see A Playbook for Side Projects.
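
A single read-only endpoint is usually enough to start. The sketch below assumes an API Gateway proxy Lambda that reuses getEventsForOrigin; the route shape and any auth are up to you.

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import type { APIGatewayProxyEvent, APIGatewayProxyResult } from "aws-lambda";

const client = new DynamoDBClient({});

// GET /admin/events?origin=<service>&limit=<n> (hypothetical route)
export async function adminEventsHandler(
  event: APIGatewayProxyEvent,
): Promise<APIGatewayProxyResult> {
  const origin = event.queryStringParameters?.origin;
  if (!origin) {
    return { statusCode: 400, body: JSON.stringify({ error: "origin is required" }) };
  }

  const events = await getEventsForOrigin({
    tableName: process.env.EVENT_LOG_TABLE!,
    client,
    origin,
    limit: Number(event.queryStringParameters?.limit ?? 50),
  });

  return { statusCode: 200, body: JSON.stringify({ events }) };
}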

When to Switch

This approach works well for side projects and small applications. You’ve outgrown it when:

  • Event volume exceeds ~100K/day — DynamoDB scans become slow; you need better indexing
  • You need complex queries — filtering by multiple dimensions (user + event type + time range) requires GSIs that add cost and complexity
  • You want advanced features — alerting, anomaly detection, distributed tracing, or log aggregation across services
  • Multiple teams need access — building fine-grained access controls and different views becomes significant work

At that point, migrate to a commercial service like Datadog, New Relic, or CloudWatch Insights. The abstraction layer makes this straightforward. You’ve been using the same Logger interface everywhere:

ctx.logger.event(EventType.JobCompleted, { ... })

Swapping to a commercial service only requires changing the writeEvent implementation. Your Lambdas, API handlers, and background workers don’t change. This is the real value of the abstraction—you can start simple and upgrade later without rewriting logging calls across your codebase.
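
As a sketch of what the swap might look like, here is a writeEvent implementation that posts records to an HTTP intake instead of DynamoDB; the endpoint, auth header, and payload shape are placeholders rather than any vendor’s actual ingestion API.

// Hypothetical HTTP-backed writer; endpoint and auth header are placeholders.
export function createHttpEventWriter(config: {
  endpoint: string;
  apiKey: string;
}) {
  return (event: EventRecord): void => {
    // Fire-and-forget, mirroring the DynamoDB writer; uses global fetch (Node 18+).
    fetch(config.endpoint, {
      method: "POST",
      headers: {
        "content-type": "application/json",
        authorization: `Bearer ${config.apiKey}`,
      },
      body: JSON.stringify(event),
    }).catch(() => {});
  };
}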

Design Trade-offs

This system prioritizes simplicity over features:

  • No guaranteed delivery — fire-and-forget writes mean some events might not arrive
  • Limited query patterns — efficient queries require the right access patterns
  • No real-time streaming — events appear in DynamoDB eventually, not instantly
  • Basic retention — TTL-based cleanup is coarse; you can’t selectively retain specific events

These trade-offs work for application-level logging where occasional missing events don’t break anything critical. For billing events or compliance logging, use a more robust system.