The requirement came from our sales team: “Buyers are adding products to their cart, then finding out they’re out of stock at checkout. We’re losing orders.”
They weren’t wrong. Our inventory system was batch-based — stock counts updated every 15 minutes from supplier feeds. In a world where a single corporate buyer might be evaluating 200 SKUs simultaneously, a 15-minute staleness window meant they were frequently making decisions based on inaccurate data.
The ask was simple to state and complicated to build: show live inventory counts on product pages. Update them in real-time as suppliers report changes.
What followed was four months of WebSocket servers, Redis Pub/Sub architectures, connection management headaches, and one very memorable Saturday where we accidentally broadcast every stock update to every connected client simultaneously. This is what we learned.
Why WebSockets and Not Server-Sent Events or Polling
We evaluated three approaches before committing:
Polling was rejected immediately. At 50,000 concurrent users each polling every 5 seconds, the backend would face roughly 10,000 requests per second just to check for changes.
Server-Sent Events (SSE) were appealing. SSE is simpler than WebSockets — it’s a one-way HTTP stream, so there’s no handshake complexity, and it works well through HTTP/2 multiplexing. The problem: our React Native mobile app needed the same real-time data, and SSE support in React Native is inconsistent across platforms. We wanted one protocol across web and mobile.
WebSockets won by elimination. The bidirectional capability was a bonus — we used it later for the cart synchronisation feature.
Architecture: The Mental Model
Updates flow through the system in four steps:
Supplier Feed → NestJS Event Bus → Redis Pub/Sub → WebSocket Gateway → Client
1. Supplier inventory feeds publish stock updates to our NestJS backend via a webhook endpoint.
2. The backend publishes these updates to a Redis Pub/Sub channel keyed by product ID.
3. Our WebSocket gateway listens on the inventory channels and tracks which products each connected client has “subscribed” to.
4. When a message arrives for a product, the gateway pushes it only to the WebSocket connections subscribed to that product.
The critical design decision was step 3: clients don’t receive all inventory updates; they subscribe to specific product IDs. A buyer looking at industrial safety equipment shouldn’t receive updates about office supplies.
The NestJS WebSocket Gateway
NestJS has first-class WebSocket support via @WebSocketGateway. We built a gateway that handles client subscriptions and Redis message routing:
import { Injectable, Logger } from '@nestjs/common';
import {
  ConnectedSocket,
  MessageBody,
  OnGatewayDisconnect,
  OnGatewayInit,
  SubscribeMessage,
  WebSocketGateway,
  WebSocketServer,
} from '@nestjs/websockets';
// Assumes ioredis, whose psubscribe / 'pmessage' API matches the calls below
import Redis from 'ioredis';
import { Server, Socket } from 'socket.io';
// InventoryUpdate (the update payload type) is assumed to be defined elsewhere

// Maximum number of products a single client may be subscribed to
const MAX_SUBSCRIPTIONS_PER_CLIENT = 100;

@WebSocketGateway({
  cors: { origin: process.env.ALLOWED_ORIGINS?.split(',') },
  namespace: '/inventory',
})
@Injectable()
export class InventoryGateway implements OnGatewayInit, OnGatewayDisconnect {
  @WebSocketServer()
  server: Server;

  // Map of productId → Set of socket IDs subscribed to that product
  private subscriptions = new Map<string, Set<string>>();
  // Map of socketId → Set of productIds it's subscribed to
  private clientSubscriptions = new Map<string, Set<string>>();

  constructor(
    private readonly redisSubscriber: Redis,
    private readonly logger: Logger,
  ) {}

  afterInit() {
    // Subscribe to the Redis wildcard pattern for all inventory channels
    this.redisSubscriber.psubscribe('inventory:*', (err) => {
      if (err) this.logger.error('Redis psubscribe failed', err);
    });
    this.redisSubscriber.on('pmessage', (_pattern, channel, message) => {
      const productId = channel.replace('inventory:', '');
      try {
        this.broadcastToSubscribers(productId, JSON.parse(message));
      } catch (err) {
        // A malformed payload must not take down the listener for other products
        this.logger.error(`Invalid inventory message on ${channel}`, err);
      }
    });
  }

  @SubscribeMessage('subscribe')
  handleSubscribe(
    @ConnectedSocket() client: Socket,
    @MessageBody() productIds: string[],
  ) {
    // Client input is untrusted: reject anything that is not an array
    if (!Array.isArray(productIds)) return;

    let current = this.clientSubscriptions.get(client.id);
    if (!current) {
      current = new Set();
      this.clientSubscriptions.set(client.id, current);
    }

    for (const productId of productIds) {
      // Enforce the cap across all subscribe messages from this client,
      // not just within a single message
      if (current.size >= MAX_SUBSCRIPTIONS_PER_CLIENT) break;
      // Validate and sanitise incoming product IDs
      if (typeof productId !== 'string') continue;
      if (!/^[a-zA-Z0-9-]{8,36}$/.test(productId)) continue;

      if (!this.subscriptions.has(productId)) {
        this.subscriptions.set(productId, new Set());
      }
      this.subscriptions.get(productId)!.add(client.id);
      current.add(productId);
    }
  }

  handleDisconnect(client: Socket) {
    // Remove the socket from every product it was subscribed to
    const productIds = this.clientSubscriptions.get(client.id);
    if (productIds) {
      for (const productId of productIds) {
        this.subscriptions.get(productId)?.delete(client.id);
        // Drop empty sets so the map does not grow without bound
        if (this.subscriptions.get(productId)?.size === 0) {
          this.subscriptions.delete(productId);
        }
      }
    }
    this.clientSubscriptions.delete(client.id);
  }

  private broadcastToSubscribers(productId: string, update: InventoryUpdate) {
    const socketIds = this.subscriptions.get(productId);
    if (!socketIds || socketIds.size === 0) return;
    for (const socketId of socketIds) {
      this.server.to(socketId).emit('inventory:update', {
        productId,
        ...update,
      });
    }
  }
}
The subscription cap of 100 products per client is important — without it, a malicious client could subscribe to every product ID in the catalog and create a denial-of-service vector.
The Saturday Incident
Two weeks after we launched the feature, we deployed a refactor that changed how the Redis Pub/Sub channel key was constructed. The old format was inventory:{productId}. The new format was inventory:product:{productId}.
We updated the publisher. We forgot to update the subscriber.
For six hours, the gateway’s psubscribe pattern didn’t match any messages. Inventory updates accumulated in the Redis queue. When a junior engineer on the team noticed the pattern mismatch and fixed it — deploying a hotfix at 11am on Saturday — 40,000 queued messages flushed simultaneously.
Every connected client received 40,000 inventory updates in approximately 2 seconds. The browser tabs that were open on our product pages froze. Clients on mobile crashed. Our error tracking system had a brief but memorable moment where it logged 2.3 million events in a single minute.
The fix was operationally simple (rate limiting on the broadcaster, dead letter queue for failed deliveries), but the root cause was a deployment process that didn’t validate the channel key format before going live.
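One way to sketch the rate limiting mentioned above is to coalesce updates per product and release them in bounded batches, so a backlog collapses to at most one message per product. This is a minimal illustration under stated assumptions, not the broadcaster we run: `CoalescingBuffer`, `StockUpdate`, its `quantity` field, and the `maxPerFlush` parameter are hypothetical names chosen for the example.

```typescript
// Hypothetical payload shape for this sketch; the real InventoryUpdate may differ.
interface StockUpdate {
  productId: string;
  quantity: number;
}

// Keeps only the newest update per product and releases them in bounded batches.
class CoalescingBuffer {
  private pending = new Map<string, StockUpdate>();

  constructor(private readonly maxPerFlush: number) {}

  // A newer update for the same product overwrites the older one.
  push(update: StockUpdate): void {
    this.pending.set(update.productId, update);
  }

  // Removes and returns up to maxPerFlush updates, in the order products first became pending.
  flush(): StockUpdate[] {
    const batch: StockUpdate[] = [];
    for (const [productId, update] of this.pending) {
      if (batch.length >= this.maxPerFlush) break;
      batch.push(update);
      this.pending.delete(productId);
    }
    return batch;
  }

  get size(): number {
    return this.pending.size;
  }
}
```

A timer in the gateway could call `flush()` at a fixed interval and emit each returned update, which bounds the number of messages any client can receive per tick regardless of how many updates arrived.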
We added an integration test that checks end-to-end: publish a message to Redis, verify the WebSocket client receives it. It runs in CI on every deploy. We have not had a similar incident since.
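A key mismatch like this is also easier to catch when publisher and subscriber derive channel names from one shared module and a unit test asserts the round trip. The sketch below shows that idea with hypothetical helper names (`channelFor`, `productIdFromChannel`, `INVENTORY_PATTERN`); it complements the end-to-end test against a real Redis instance rather than replacing it.

```typescript
// Hypothetical shared module, imported by both the publisher and the gateway.
const CHANNEL_PREFIX = 'inventory:product:';
const INVENTORY_PATTERN = `${CHANNEL_PREFIX}*`;

// Publisher side: build the channel name for a product.
function channelFor(productId: string): string {
  return `${CHANNEL_PREFIX}${productId}`;
}

// Subscriber side: recover the product ID, or null if the channel uses another format.
function productIdFromChannel(channel: string): string | null {
  if (!channel.startsWith(CHANNEL_PREFIX)) return null;
  const productId = channel.slice(CHANNEL_PREFIX.length);
  return productId.length > 0 ? productId : null;
}
```

Because both sides import the same constants, changing the key format becomes a one-line change, and a channel in the old format is rejected explicitly instead of being silently misparsed.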
Scaling WebSocket Connections Across Multiple Instances
A WebSocket connection is stateful: it lives on a specific server instance. In a horizontally scaled deployment with four app server instances, a message emitted by instance 1 won’t reach a client connected to instance 2 by default.
We solved this using Socket.IO’s Redis adapter, which uses Redis Pub/Sub as a message bus between server instances:
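// main.ts
import { NestFactory } from '@nestjs/core';
import { createAdapter } from '@socket.io/redis-adapter';
import { createClient } from 'redis';
import { Server } from 'socket.io';
import { AppModule } from './app.module'; // path assumed
const app = await NestFactory.create(AppModule);
const httpServer = app.getHttpServer();
// The adapter must be attached to the Socket.IO server that backs the gateway
const io = new Server(httpServer);
// Pub/Sub needs two connections: one to publish, one to subscribe
const pubClient = createClient({ url: process.env.REDIS_URL });
const subClient = pubClient.duplicate();
await Promise.all([pubClient.connect(), subClient.connect()]);
io.adapter(createAdapter(pubClient, subClient));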
With the Redis adapter, when the gateway on instance 1 calls this.server.to(socketId).emit(...), the Redis adapter ensures the message reaches the correct instance even if the client is connected to instance 3.
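To make the cross-instance problem concrete, here is a toy in-memory model of the idea: every instance publishes outgoing emits to a shared bus, and each instance delivers only to the sockets it holds locally. It is a simplification for illustration and does not reflect the Socket.IO adapter’s actual internals or API; `Bus`, `Instance`, and their methods are invented for this sketch.

```typescript
type Delivery = { socketId: string; payload: string };

// Stand-in for Redis Pub/Sub: fans every message out to all registered instances.
class Bus {
  private listeners: Array<(d: Delivery) => void> = [];

  subscribe(listener: (d: Delivery) => void): void {
    this.listeners.push(listener);
  }

  publish(d: Delivery): void {
    for (const listener of this.listeners) listener(d);
  }
}

// One app server instance holding its own set of connected sockets.
class Instance {
  private inboxes = new Map<string, string[]>();

  constructor(private readonly bus?: Bus) {
    // With a bus, deliver anything addressed to a socket held by this instance.
    bus?.subscribe((d) => this.deliverLocal(d));
  }

  connect(socketId: string): void {
    this.inboxes.set(socketId, []);
  }

  // Without a bus only local sockets are reachable; with one, any instance can deliver.
  emitTo(socketId: string, payload: string): void {
    const delivery: Delivery = { socketId, payload };
    if (this.bus) this.bus.publish(delivery);
    else this.deliverLocal(delivery);
  }

  received(socketId: string): string[] {
    return this.inboxes.get(socketId) ?? [];
  }

  // Messages for sockets this instance does not hold are silently ignored.
  private deliverLocal(d: Delivery): void {
    this.inboxes.get(d.socketId)?.push(d.payload);
  }
}
```

Two `Instance` objects created without a bus cannot reach each other’s sockets, while two created with the same `Bus` can, which mirrors the difference between the default behaviour and the adapter-backed setup.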
The Client Side: React and React Native
The client implementation is straightforward. We built a custom hook that manages the WebSocket connection lifecycle:
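// useInventory.ts
import { useEffect, useRef, useState } from 'react';
import { io, Socket } from 'socket.io-client';
// InventoryUpdate (the update payload type) is assumed to be defined elsewhere

export function useInventory(productIds: string[]) {
  const [inventory, setInventory] = useState<Record<string, InventoryUpdate>>({});
  const socketRef = useRef<Socket | null>(null);

  useEffect(() => {
    if (productIds.length === 0) return;

    socketRef.current = io(`${process.env.NEXT_PUBLIC_WS_URL}/inventory`, {
      transports: ['websocket'],
      reconnectionAttempts: 5,
      reconnectionDelay: 1000,
    });

    // 'connect' also fires after a reconnect, so subscriptions are restored
    socketRef.current.on('connect', () => {
      socketRef.current?.emit('subscribe', productIds);
    });

    // Merge each update into state, keyed by product ID
    socketRef.current.on('inventory:update', (update: InventoryUpdate) => {
      setInventory(prev => ({
        ...prev,
        [update.productId]: update,
      }));
    });

    // Close the socket when the component unmounts or the ID list changes
    return () => {
      socketRef.current?.disconnect();
    };
  }, [productIds.join(',')]);

  return inventory;
}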
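The productIds.join(',') dependency is intentional. React compares each useEffect dependency with Object.is, so an array that is recreated on every render would count as changed each time and tear down the socket. Joining the IDs into a string gives a value that changes only when the IDs themselves change.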
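One limitation of a plain join is that it is order-sensitive: the same IDs in a different order produce a different string and trigger a reconnect. If that matters, a small helper can normalise the list first. `subscriptionKey` is a hypothetical name, and whether reordering occurs in practice depends on how the calling component builds the array.

```typescript
// Builds an order-insensitive, duplicate-free key for a list of product IDs.
// Product IDs are assumed not to contain commas, so the joined key is unambiguous.
function subscriptionKey(productIds: readonly string[]): string {
  return Array.from(new Set(productIds)).sort().join(',');
}
```

In the hook, the dependency array would then become `[subscriptionKey(productIds)]`.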
Impact
Three months after the feature launched:
- Cart abandonment at checkout due to out-of-stock errors: down 34%
- Add-to-cart conversion rate: up 11% (buyers are more confident when they can see live stock)
- Average session duration on product pages: up 22% (buyers spend more time evaluating when data is trustworthy)
The business case justified itself faster than almost any feature we shipped last year.
What We’d Do Differently
Build the integration test on day one. The Saturday incident was entirely preventable. Contract tests between publishers and subscribers should be mandatory for any event-driven system.
Rate limit at the gateway, not the application. We initially rate-limited inventory updates in the NestJS service layer. The right place is the WebSocket gateway — it’s closer to the connection and can shed load before it reaches the application logic.
Think about connection costs early. A WebSocket connection is cheap to hold open, but a server restart drops every connection on that instance at once. If you’re deploying frequently, reconnection storms become a real problem. We added randomised reconnect jitter to the client and a gradual rollout mechanism for server restarts.
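The reconnect jitter can be sketched as exponential backoff with a random spread, so clients dropped by the same restart do not all retry at the same moment. The function and its parameters (`baseMs`, `maxMs`, `jitter`) are illustrative assumptions, not the values we run in production; socket.io-client exposes `reconnectionDelay`, `reconnectionDelayMax`, and `randomizationFactor` options for the same purpose.

```typescript
// Delay before reconnect attempt `attempt` (0-based), with symmetric random jitter.
function reconnectDelayMs(
  attempt: number,
  baseMs = 1000,
  maxMs = 30000,
  jitter = 0.5,
  random: () => number = Math.random, // injectable for deterministic tests
): number {
  // Exponential growth, capped so long outages do not produce huge waits.
  const backoff = Math.min(maxMs, baseMs * Math.pow(2, attempt));
  // Spread the delay across [backoff * (1 - jitter), backoff * (1 + jitter)).
  const spread = (random() * 2 - 1) * jitter * backoff;
  return Math.round(backoff + spread);
}
```

With the defaults, the first retry lands somewhere between 0.5 and 1.5 seconds after the drop, so 50,000 clients spread their reconnects over a full second instead of arriving together.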
Real-time is one of those features that feels like a small addition and turns out to touch everything. Build it carefully, test the failure modes, and watch your Redis memory usage.
— Rohit Mishra