I am going to say something that will make a certain kind of developer uncomfortable: we tried GraphQL, it was the wrong choice for our context, and switching back to REST was one of the best engineering decisions we made in 2024.
I want to be careful here. GraphQL is not a bad technology. For many products — particularly developer-facing APIs, complex content graphs, and teams where frontend engineers drive the product roadmap — GraphQL is genuinely excellent. This isn’t a “GraphQL is bad” post. It’s a “we misread our context and it cost us” post.
We switched back to REST after 18 months on GraphQL. That experience taught me more about API design than the previous four years combined.
Why We Went to GraphQL
The year was 2022. We were rebuilding our frontend from a server-rendered PHP application to Next.js. Our existing REST API was a classic “designed-for-the-current-client” API — endpoints that returned exactly the data the old frontend needed, structured in ways that made sense for the old rendering model. It was deeply mismatched with what our new Next.js frontend wanted.
GraphQL’s pitch was compelling: let the client ask for exactly what it needs. No overfetching. No underfetching. A strongly-typed schema that serves as a contract between frontend and backend. Tooling like Apollo Client that would simplify data management on the frontend.
We were sold. We stood up Apollo Server in front of our NestJS services, wrote a schema, and began migrating endpoints.
The first three months felt like progress. The frontend team loved the flexibility. Data fetching logic that had required multiple REST calls collapsed into single GraphQL queries. The schema served as documentation in a way our old REST API never had.
Then the cracks started showing.
GraphQL’s promise of “no overfetching” is real. The promise of “no underfetching” is also real. What nobody told us was that eliminating both problems at the query layer simply moves them to the resolver layer — and if your resolvers aren’t carefully architected, you end up with something worse than either.
The classic problem is the N+1 query. Our product listing query looked like this:
query GetCategoryProducts($categoryId: ID!) {
  category(id: $categoryId) {
    products {
      id
      name
      supplier { # triggers N supplier queries
        name
        rating
      }
      primaryImage { # triggers N image queries
        url
        alt
      }
    }
  }
}
For a category with 48 products, this single query triggered 1 initial database query, 48 supplier queries, and 48 image queries — 97 database calls in total. Our previous REST endpoint made 3.
The solution is DataLoader, a batching utility that coalesces the N separate queries into a single batch query. We implemented it. It worked. But it required every resolver that fetched related data to be written with DataLoader in mind — which meant every new resolver required a DataLoader implementation, not just the data fetching logic.
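To make the batching idea concrete, here is a minimal sketch in TypeScript. It is a hand-rolled stand-in for the technique, not the DataLoader library and not our production code: `TinyLoader`, `fetchSuppliersByIds` and the `Supplier` shape are all hypothetical, and the sketch skips the per-request caching and key de-duplication that the real library adds.

```typescript
// A hand-rolled illustration of request batching (NOT the DataLoader library).
type BatchFn<K, V> = (keys: readonly K[]) => Promise<V[]>;

interface Pending<K, V> {
  key: K;
  resolve: (value: V) => void;
  reject: (err: unknown) => void;
}

class TinyLoader<K, V> {
  private queue: Pending<K, V>[] = [];
  private readonly batchFn: BatchFn<K, V>;

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn;
  }

  // Collect every key requested during the current synchronous pass,
  // then resolve them all from a single batch call.
  load(key: K): Promise<V> {
    return new Promise<V>((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      if (this.queue.length === 1) {
        // First key of this pass: schedule exactly one flush.
        void Promise.resolve().then(() => this.flush());
      }
    });
  }

  private async flush(): Promise<void> {
    const pending = this.queue;
    this.queue = [];
    try {
      const values = await this.batchFn(pending.map((p) => p.key));
      if (values.length !== pending.length) {
        throw new Error("batch function must return one value per key, in order");
      }
      pending.forEach((p, i) => p.resolve(values[i] as V));
    } catch (err) {
      pending.forEach((p) => p.reject(err));
    }
  }
}

// Hypothetical data access: stands in for one `WHERE id IN (...)` query.
interface Supplier {
  id: number;
  name: string;
  rating: number;
}

let supplierQueryCount = 0;

async function fetchSuppliersByIds(ids: readonly number[]): Promise<Supplier[]> {
  supplierQueryCount += 1;
  return ids.map((id) => ({ id, name: `Supplier ${id}`, rating: 4 }));
}

// What each product's `supplier` resolver would do: call load() instead of querying.
async function resolveSuppliersForProducts(supplierIds: number[]): Promise<Supplier[]> {
  const loader = new TinyLoader<number, Supplier>(fetchSuppliersByIds);
  return Promise.all(supplierIds.map((id) => loader.load(id)));
}
```

In this sketch, 48 `load()` calls made in the same pass reach `fetchSuppliersByIds` as one call with 48 keys. The catch is the same one we hit in practice: a resolver that queries directly instead of going through the loader silently brings the N+1 back.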
The cognitive overhead was real, and it wasn’t evenly distributed. Our backend engineers — who had years of experience with relational database query optimisation — found DataLoader patterns awkward. Mistakes resulted in N+1 regressions that were invisible during development (test data sets are small) and catastrophic in production (real data sets are large).
We had four N+1 incidents in 18 months. Each one took between 30 minutes and 3 hours to diagnose because nothing in our monitoring surface flagged “this GraphQL query is spawning 200 database queries” out of the box.
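The kind of signal we were missing can be sketched as a per-request query counter. This is an illustration under assumptions, not what our monitoring stack shipped with: `QueryBudget`, the threshold of 10, and the idea of calling `recordQuery()` from a database client wrapper are all hypothetical.

```typescript
// Hypothetical per-request counter. In a real service, recordQuery() would be
// called from whatever wrapper sits around the database client.
interface QueryBudgetReport {
  operationName: string;
  queryCount: number;
  overBudget: boolean;
}

class QueryBudget {
  private count = 0;
  private readonly operationName: string;
  private readonly maxQueries: number;

  constructor(operationName: string, maxQueries: number) {
    this.operationName = operationName;
    this.maxQueries = maxQueries;
  }

  recordQuery(): void {
    this.count += 1;
  }

  report(): QueryBudgetReport {
    return {
      operationName: this.operationName,
      queryCount: this.count,
      overBudget: this.count > this.maxQueries,
    };
  }
}

// Simulates the unbatched listing query: one category query, then one
// supplier query and one image query per product.
function simulateUnbatchedListing(productCount: number): QueryBudgetReport {
  const budget = new QueryBudget("GetCategoryProducts", 10);
  budget.recordQuery(); // products in the category
  for (let i = 0; i < productCount; i++) {
    budget.recordQuery(); // supplier lookup
    budget.recordQuery(); // primary image lookup
  }
  return budget.report();
}
```

Logging the report at the end of each request, or failing a test when `overBudget` is true against a realistically sized fixture, would be one way to surface an N+1 regression before production does.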
Problem 2: The Schema Became a Coordination Tax
In theory, the GraphQL schema is a contract that enables frontend and backend to work independently. In practice, it became a source of coupling that was harder to manage than the REST API it replaced.
Every time a frontend engineer needed data that wasn’t in the schema, they had to:
- Propose a schema change
- Get it reviewed and merged to the GraphQL schema
- Implement the resolver in the NestJS service
- Wait for a backend deployment
REST had the same problem — new data required new endpoints or extended response shapes — but the workflow was simpler because there was no shared artefact (the schema) that both sides had to agree on. With GraphQL, schema evolution required explicit coordination that, in practice, always blocked frontend work on backend availability.
Our frontend engineers, who had been enthusiastic adopters of GraphQL, became its most vocal critics by month 12.
Problem 3: Caching Was Hard
REST caching is a solved problem. HTTP caches understand GET /products/123 with Cache-Control: max-age=300. CDNs can cache it. Service workers can cache it. Browser caches handle it automatically.
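As a rough illustration of how little code this takes, here is a sketch of building the header for a cacheable GET response. The helper names and the 60-second `stale-while-revalidate` window are assumptions for the example, not our actual configuration.

```typescript
// Hypothetical helper for composing a Cache-Control header value.
interface CachePolicy {
  maxAgeSeconds: number;
  staleWhileRevalidateSeconds?: number;
  isPublic?: boolean; // defaults to a shared-cache-friendly "public"
}

function buildCacheControl(policy: CachePolicy): string {
  const directives: string[] = [
    policy.isPublic === false ? "private" : "public",
    `max-age=${policy.maxAgeSeconds}`,
  ];
  if (policy.staleWhileRevalidateSeconds !== undefined) {
    directives.push(`stale-while-revalidate=${policy.staleWhileRevalidateSeconds}`);
  }
  return directives.join(", ");
}

// Headers a handler for GET /products/123 might attach to its response.
function productResponseHeaders(): Record<string, string> {
  return {
    "Content-Type": "application/json",
    "Cache-Control": buildCacheControl({
      maxAgeSeconds: 300,
      staleWhileRevalidateSeconds: 60,
    }),
  };
}
```

Everything downstream (browser, CDN, service worker) can act on that one header without knowing anything about the API's internals.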
GraphQL caching is not a solved problem. GraphQL queries are typically POST requests (because queries can be arbitrarily complex and exceed URL length limits). HTTP caches don’t cache POST by default. To cache GraphQL responses effectively, you need:
- Persisted queries (to allow GET requests with a query hash)
- A custom caching layer that understands the query structure
- Cache invalidation logic that’s aware of which queries involve which types
We implemented Apollo’s persisted queries. We built a Redis-backed response cache keyed on the query hash plus variables. It worked, but it required a level of infrastructure investment that a REST API with Cache-Control headers does not.
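The cache key is the part that is easy to get subtly wrong, so here is a sketch of one way to derive it. This is an assumption-laden illustration rather than our actual implementation: the `gql:` prefix and the function names are hypothetical, and it assumes the variables are plain JSON values.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize with sorted object keys so that { a, b } and { b, a }
// produce the same string, and therefore the same cache key.
function stableStringify(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return `[${value.map((item) => stableStringify(item)).join(",")}]`;
  }
  const entries = Object.keys(value)
    .sort()
    .map((key) => `${JSON.stringify(key)}:${stableStringify(value[key] as Json)}`);
  return `{${entries.join(",")}}`;
}

// Key = persisted query hash + hash of the canonicalized variables.
function responseCacheKey(queryHash: string, variables: { [key: string]: Json }): string {
  const variablesHash = createHash("sha256")
    .update(stableStringify(variables))
    .digest("hex");
  return `gql:${queryHash}:${variablesHash}`;
}
```

Without the key sorting, two clients sending the same variables in a different order would miss each other's cache entries. Note that this only covers lookup; invalidating entries when the underlying types change is a separate problem.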
When we moved to our Turborepo monorepo structure and started sharing our @ib/api-client package, we realised that the caching complexity we’d built into the GraphQL layer was invisible to the clients consuming the API. They were getting uncached responses without knowing it.
The GraphQL tooling ecosystem has improved dramatically over the years, but it still lags behind REST for operational tasks. Schema change analysis, breaking change detection, and API contract enforcement are mature in the OpenAPI/REST world (Stoplight, Spectral, dozens of others) and less mature in the GraphQL world.
Our on-call runbooks for REST API incidents have a standard flow: check the request logs, identify the endpoint, query the database explain plan, fix. Our GraphQL incident flow had an extra step — “decode the GraphQL query from the network trace” — that consistently added 10–15 minutes to every investigation.
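In hindsight, part of that decoding step could have been automated by logging a short summary of each GraphQL request. The sketch below is a rough illustration with hypothetical names. It uses a regular expression rather than a real GraphQL parser, so it only handles operations that start with the `query`, `mutation` or `subscription` keyword (or the anonymous `{ ... }` shorthand) and would miss documents that begin with comments or fragments.

```typescript
interface GraphQLRequestSummary {
  operationType: string;
  operationName: string;
  variableNames: string[];
}

interface GraphQLRequestBody {
  query?: string;
  operationName?: string | null;
  variables?: Record<string, unknown> | null;
}

// Rough, regex-based summary of a GraphQL-over-HTTP POST body for log lines.
function summarizeGraphQLRequest(rawBody: string): GraphQLRequestSummary {
  const body = JSON.parse(rawBody) as GraphQLRequestBody;
  const match = /^\s*(query|mutation|subscription)\b\s*([A-Za-z_][A-Za-z0-9_]*)?/.exec(
    body.query ?? "",
  );
  return {
    // The `{ ... }` shorthand with no keyword is a query.
    operationType: match?.[1] ?? "query",
    operationName: body.operationName ?? match?.[2] ?? "(anonymous)",
    variableNames: Object.keys(body.variables ?? {}).sort(),
  };
}

// Example body as it might appear in a network trace.
const exampleBody = JSON.stringify({
  query:
    "query GetCategoryProducts($categoryId: ID!) { category(id: $categoryId) { products { id } } }",
  variables: { categoryId: "42" },
});
const exampleSummary = summarizeGraphQLRequest(exampleBody);
```

Emitting the operation type and name next to each request log line gives on-call engineers something closer to the "identify the endpoint" step they already know from REST.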
The Migration Back
The decision to migrate back to REST wasn’t a single meeting — it was a gradual consensus that built over the second half of 2023. By the time we formally approved the migration, the team’s enthusiasm for GraphQL had been replaced by a quiet, collective desire for simpler things.
The migration took three months and followed a strangler fig pattern:
- We implemented a new REST API layer in NestJS alongside the existing GraphQL resolvers.
- We generated an OpenAPI specification from the NestJS decorators using @nestjs/swagger.
- We auto-generated a typed @ib/api-client from the OpenAPI spec using openapi-typescript.
- We migrated frontend consumers endpoint by endpoint, with feature flags to allow gradual rollout.
- We decommissioned the Apollo Server once all consumers had been migrated.
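The per-endpoint rollout in the steps above can be sketched as a small routing helper. This is a simplified illustration, not our actual flag system: `routeByFlag`, the flag name `products.getById`, and the two fetchers are hypothetical, and a real setup would read flags from a flag service rather than a plain object.

```typescript
type AsyncFn<Args extends unknown[], R> = (...args: Args) => Promise<R>;

// Returns a function that picks the REST or legacy GraphQL implementation
// on every call, so flipping the flag takes effect without a redeploy.
function routeByFlag<Args extends unknown[], R>(
  flagName: string,
  flags: Record<string, boolean>,
  legacyGraphQL: AsyncFn<Args, R>,
  rest: AsyncFn<Args, R>,
): AsyncFn<Args, R> {
  return (...args: Args) =>
    flags[flagName] === true ? rest(...args) : legacyGraphQL(...args);
}

// Hypothetical usage with stubbed fetchers that record which path was taken.
interface Product {
  id: string;
  name: string;
}

const callLog: string[] = [];
const migrationFlags: Record<string, boolean> = { "products.getById": false };

const getProduct = routeByFlag(
  "products.getById",
  migrationFlags,
  async (id: string): Promise<Product> => {
    callLog.push("graphql");
    return { id, name: "from GraphQL" };
  },
  async (id: string): Promise<Product> => {
    callLog.push("rest");
    return { id, name: "from REST" };
  },
);
```

Because both implementations share one signature, the calling component does not change during the migration. Once a flag has been on for every consumer, the legacy branch and the flag itself can be deleted.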
The auto-generated client was the key enabler. The fear with REST APIs is that types drift between the server definition and the client implementation. With OpenAPI generation, the client types are derived from the server's spec, so they stay in sync as long as the client is regenerated whenever the spec changes. That is the same promise GraphQL makes, without the schema coordination overhead.
// Auto-generated from OpenAPI spec; never hand-write this
import { createClient } from '@ib/api-client';

const client = createClient({
  baseUrl: process.env.API_URL,
  headers: { Authorization: `Bearer ${token}` },
});

// Fully typed, matches server response types exactly
const product = await client.GET('/products/{id}', {
  params: { path: { id: productId } },
});
What GraphQL Was Right About
I don’t want to leave the impression that the experiment was a failure with no residual value. Several things GraphQL forced us to think about made us better engineers:
Schema-first design. Designing the API contract before writing the implementation is a practice we kept. We write OpenAPI specs before we write NestJS controllers now. The thinking discipline that GraphQL imposed — “what data does the client actually need?” — is valid and valuable.
Type safety between frontend and backend. The generated client from our OpenAPI spec gives us essentially the same type safety guarantee that GraphQL + graphql-codegen provides, with less infrastructure complexity.
Query complexity awareness. GraphQL forced us to think about query cost and N+1 patterns in ways that REST had let us ignore. We now think about these issues in our REST API design — specifically, we design resource representations carefully to avoid requiring multiple round trips for common UI scenarios.
Lessons for Anyone Evaluating GraphQL
If you’re evaluating GraphQL for your product, here are the questions I wish I’d asked in 2022:
Do your clients (frontend apps) have meaningfully different data requirements for the same resource? If your web app, mobile app, and third-party API consumers all need roughly the same data from a product endpoint, GraphQL’s flexibility is solving a problem you don’t have.
Is your team comfortable with the DataLoader pattern? N+1 queries are not optional to solve — they will find you in production. If your backend engineers don’t have experience with batching patterns and find them unintuitive, budget significant time for training and incident remediation.
How important is HTTP-level caching to your architecture? If CDN caching, stale-while-revalidate, and browser caching are part of your performance strategy (they should be), GraphQL’s default POST-everything model works against you.
What does your incident response workflow look like? Can your on-call engineers decode a GraphQL query from a network trace under pressure at 3am? Can your monitoring tooling surface query-level performance metrics? If not, REST’s operational simplicity may be worth more than GraphQL’s query flexibility.
For IndustryBuying, the answers to these questions, in retrospect, all pointed toward REST. We had four client apps with similar data needs. Our backend team had strong SQL skills and weak batching pattern experience. CDN caching was important to our performance strategy. Our monitoring stack was built around HTTP semantics.
Context is everything in architecture. For us, GraphQL was a genuinely good technology deployed in the wrong context. REST + OpenAPI + auto-generated clients turned out to be the right tool for how we work.
I hope this helps someone avoid the 18-month detour we took.
— Rohit Mishra