Technical Guide

Fix Express GPT Lambdas & Cold-Start Lag

Shrinks handler bundles, shares code through Lambda Layers, and cuts p99 cold-start latency 7×.

January 15, 2025 · 6 min read

The problem

A real estate platform migrated its Express.js API to Lambda and began seeing 8-12 second cold starts, causing property searches to time out. Users reported "This site is broken" when trying to view listings. The API shipped as a 340MB deployment package, and X-Ray traces showed 85% of latency came from module initialization. During peak traffic hours, error rates hit 40% as concurrent executions ran into Lambda limits.

How AI created this issue

The team asked GPT-4 to "convert our Express app to run on Lambda". GPT generated the standard serverless-http wrapper approach:


// GPT's Express-to-Lambda conversion
const serverless = require('serverless-http');
const express = require('express');
const app = express();

// Import entire Express app as-is
require('./config/database'); // Mongoose with 50+ models
require('./config/redis');    // Redis client initialization
require('./config/elastic');  // Elasticsearch client
require('./routes');         // 200+ route files

// All middleware loaded on every cold start
app.use(require('body-parser').json({ limit: '50mb' }));
app.use(require('compression')());
app.use(require('helmet')());
app.use(require('cors')());
app.use(require('express-rate-limit')({ /* config */ }));
app.use(require('express-validator')());
// ... 30 more middleware

// GPT's suggestion: "Just wrap it!"
module.exports.handler = serverless(app, {
  binary: ['image/png', 'image/jpeg']
});

GPT treated Lambda like a traditional server, loading the entire Express application on every cold start. It didn't suggest optimizing imports, lazy loading, or breaking apart the monolithic structure. The AI's "lift and shift" approach ignored fundamental serverless constraints.
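
Everything at module scope runs during Lambda's init phase, before the first request is served, so the wrapper above pays for all 200+ route files and every client connection on each cold start. A quick way to confirm this locally is to time the require of the bundled handler; a minimal sketch, assuming a hypothetical entry point at ./dist/index.js:

// cold-start-probe.js: approximate Lambda's init phase by timing the require
console.time('module-init');
const { handler } = require('./dist/index.js'); // every top-level require executes here
console.timeEnd('module-init');
// For this app, X-Ray attributed 85% of cold-start latency to exactly this phase.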

The solution

  1. Webpack bundle analysis and optimization: Identified and eliminated 180MB of unused code (a bundle-analyzer sketch follows this list):
    
    // Before: 340MB bundle
    // After: 45MB optimized bundle
    
    // webpack.config.js
    const webpack = require('webpack');
    const nodeExternals = require('webpack-node-externals');

    module.exports = {
      target: 'node',
      mode: 'production',
      entry: './lambda-handlers/index.js',
      externals: [
        'aws-sdk', // Already in Lambda runtime
        nodeExternals({
          allowlist: ['express', 'body-parser'] // Only bundle essentials
        })
      ],
      optimization: {
        minimize: true,
        sideEffects: true, // honor package.json "sideEffects" hints so tree-shaking can drop dead modules
        usedExports: true
      },
      plugins: [
        new webpack.IgnorePlugin({
          resourceRegExp: /^\.\/locale$/,
          contextRegExp: /moment$/
        })
      ]
    };
  2. Route-specific Lambda functions: Split the monolith into focused, per-route functions:
    
    // Property search Lambda - loads only what it needs
    const middy = require('@middy/core');
    const jsonBodyParser = require('@middy/http-json-body-parser');
    const validator = require('@middy/validator');

    // Request schema enforced by @middy/validator below (abbreviated; the real schema is larger)
    const inputSchema = {
      type: 'object',
      properties: {
        body: {
          type: 'object',
          required: ['query'],
          properties: { query: { type: 'string' } }
        }
      }
    };

    // Lazy load heavy dependencies
    let elasticClient;
    const getElasticClient = async () => {
      if (!elasticClient) {
        const { Client } = require('@elastic/elasticsearch');
        elasticClient = new Client({
          node: process.env.ELASTICSEARCH_URL,
          maxRetries: 3,
          requestTimeout: 5000
        });
      }
      return elasticClient;
    };
    
    const searchHandler = async (event) => {
      const { query, filters, page = 1 } = event.body;
      
      const client = await getElasticClient();
      const results = await client.search({
        index: 'properties',
        body: {
          query: { /* ... */ },
          size: 20,
          from: (page - 1) * 20
        }
      });
      
      return {
        statusCode: 200,
        body: JSON.stringify({
          properties: results.body.hits.hits,
          total: results.body.hits.total.value
        })
      };
    };
    
    // Minimal middleware chain
    module.exports.handler = middy(searchHandler)
      .use(jsonBodyParser())
      .use(validator({ inputSchema }));
  3. Lambda Layers for shared code: Created layers for Express, shared middleware, and utilities (steps 3-5 are sketched together in the CDK example after this list)
  4. Container images for complex functions: Used Lambda container images for the ML-powered features
  5. Provisioned concurrency: Pre-warmed the critical search and listing endpoints
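
For the analysis half of step 1, webpack-bundle-analyzer makes the dead weight visible before any cutting starts. A minimal sketch layered onto the config above; the report filename is arbitrary and the plugin is only needed for a one-off audit:

// webpack.config.js: one-off analysis pass
const { BundleAnalyzerPlugin } = require('webpack-bundle-analyzer');

module.exports = {
  // ...config from step 1...
  plugins: [
    new BundleAnalyzerPlugin({
      analyzerMode: 'static',            // write an HTML treemap instead of starting a server
      reportFilename: 'bundle-report.html',
      openAnalyzer: false
    })
  ]
};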
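
Steps 3-5 can live in a single deployment definition. A minimal AWS CDK sketch in JavaScript; the stack name, asset paths (layers/shared, dist/search, ./ml-scoring), and function names are placeholders, not the team's actual setup:

// stack.js: layers, a container-image function, and provisioned concurrency
const cdk = require('aws-cdk-lib');
const lambda = require('aws-cdk-lib/aws-lambda');

const app = new cdk.App();
const stack = new cdk.Stack(app, 'PropertyApiStack');

// Step 3: Express, middleware, and utilities shipped once as a shared layer
const sharedLayer = new lambda.LayerVersion(stack, 'SharedLayer', {
  code: lambda.Code.fromAsset('layers/shared'), // expects nodejs/node_modules/ inside
  compatibleRuntimes: [lambda.Runtime.NODEJS_18_X]
});

// Step 4: ML-powered features packaged as a container image
new lambda.DockerImageFunction(stack, 'MlScoringFn', {
  code: lambda.DockerImageCode.fromImageAsset('./ml-scoring'), // builds the local Dockerfile
  memorySize: 2048,
  timeout: cdk.Duration.seconds(30)
});

// Step 5: provisioned concurrency on an alias keeps search warm during peak hours
const searchFn = new lambda.Function(stack, 'SearchFn', {
  runtime: lambda.Runtime.NODEJS_18_X,
  handler: 'index.handler',
  code: lambda.Code.fromAsset('dist/search'), // the 45MB webpack output
  memorySize: 1024,
  layers: [sharedLayer]
});
new lambda.Alias(stack, 'SearchLive', {
  aliasName: 'live',
  version: searchFn.currentVersion,
  provisionedConcurrentExecutions: 10 // pre-warmed environments; tune to peak traffic
});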

The results

  • Cold starts reduced from 8-12s to 0.8-1.5s (87% improvement)
  • Deployment package size: 340MB → 45MB (87% reduction)
  • Error rate dropped from 40% to 0.3% during peak traffic
  • Lambda costs decreased 72% through right-sized functions
  • API response time P50: 2.1s → 340ms
  • Concurrency headroom improved 8× as smaller, faster functions tied up fewer simultaneous executions

The team learned that serverless migrations require rethinking architecture, not just wrapping existing code. They now build Lambda-first, using Express only for local development. AI tools are great for syntax but lack the context to make architectural decisions about cold starts and bundle optimization.
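
That Lambda-first, Express-for-dev split is simple to wire up. A minimal sketch, assuming a hypothetical dev-server.js that re-shapes local requests into the event the search handler expects (the stub context is only there to satisfy middy):

// dev-server.js: local development only; deployed traffic invokes the handler directly
const express = require('express');
const { handler } = require('./lambda-handlers/search');

const app = express();
app.use(express.json());

app.post('/search', async (req, res) => {
  const event = {
    headers: { 'Content-Type': 'application/json' }, // lets @middy/http-json-body-parser parse the body
    body: JSON.stringify(req.body)
  };
  const context = { getRemainingTimeInMillis: () => 30000 }; // stub Lambda context
  const result = await handler(event, context);
  res.status(result.statusCode).send(result.body);
});

app.listen(3000, () => console.log('Local dev API on http://localhost:3000'));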

Ready to fix your codebase?

Let us analyze your application and resolve these issues before they impact your users.

Get Diagnostic Assessment →