
Conquering the N+1 Problem in Spring GraphQL with DataLoader

GraphQL Guy · 6 min read

The N+1 problem can turn your elegant GraphQL API into a database nightmare. Learn how Spring GraphQL's batch loading solves this efficiently.

The N+1 Problem Revisited

Before diving into solutions, let's understand the problem clearly. Consider this query:

query {
  books {        # 1 query to get 100 books
    title
    author {     # 100 queries - one per book!
      name
    }
  }
}

With a naive implementation, fetching 100 books triggers 101 database queries:

SELECT * FROM books;                 -- 1 query
SELECT * FROM authors WHERE id = 1;  -- +100 queries
SELECT * FROM authors WHERE id = 2;
SELECT * FROM authors WHERE id = 3;
-- ... 97 more queries

That's the N+1 problem: 1 query for the list, plus N queries for related items.
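The arithmetic is easy to reproduce without a database. A minimal plain-Java simulation (hypothetical Book record, in-memory counts only) that tallies the round-trips each strategy would issue:

```java
import java.util.*;
import java.util.stream.*;

// Plain-Java simulation of the N+1 pattern: count simulated database
// round-trips for per-book author lookups vs. one batched lookup.
public class NPlusOneDemo {

    record Book(String title, String authorId) {}

    static List<Book> sampleBooks(int n) {
        List<Book> books = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            books.add(new Book("Book " + i, "a" + i));
        }
        return books;
    }

    // Naive resolver: one "query" for the list, then one per book
    static int naiveQueryCount(List<Book> books) {
        int queries = 1; // SELECT * FROM books
        for (Book book : books) {
            queries++;   // SELECT * FROM authors WHERE id = ?
        }
        return queries;
    }

    // Batched resolver: one "query" for the list, one IN (...) query for authors
    static int batchedQueryCount(List<Book> books) {
        Set<String> authorIds = books.stream()
            .map(Book::authorId)
            .collect(Collectors.toSet());
        return 1 + (authorIds.isEmpty() ? 0 : 1); // books + single IN query
    }

    public static void main(String[] args) {
        List<Book> books = sampleBooks(100);
        System.out.println("Naive:   " + naiveQueryCount(books) + " queries");   // 101
        System.out.println("Batched: " + batchedQueryCount(books) + " queries"); // 2
    }
}
```

The batched count stays at 2 no matter how many books come back, which is the whole point of the pattern.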

How Spring GraphQL Solves This

Spring GraphQL provides built-in support for batch loading through the @BatchMapping annotation and the BatchLoaderRegistry.
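The controller examples below assume a schema along these lines; the post never shows it, so treat this SDL as a sketch whose field names simply match the code:

```graphql
type Query {
  books: [Book]
  authors: [Author]
  bookById(id: ID!): Book
}

type Book {
  title: String
  author: Author
}

type Author {
  name: String
  books: [Book]
  publisher: Publisher
}

type Publisher {
  name: String
}
```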

The simplest approach uses @BatchMapping:

@Controller
public class BookController {

    private final BookRepository bookRepository;
    private final AuthorRepository authorRepository;

    public BookController(BookRepository bookRepository,
                          AuthorRepository authorRepository) {
        this.bookRepository = bookRepository;
        this.authorRepository = authorRepository;
    }

    @QueryMapping
    public List<Book> books() {
        return bookRepository.findAll();
    }

    @BatchMapping
    public Map<Book, Author> author(List<Book> books) {
        // Collect all author IDs
        Set<String> authorIds = books.stream()
            .map(Book::authorId)
            .collect(Collectors.toSet());

        // Single query for all authors
        Map<String, Author> authorsById = authorRepository
            .findAllById(authorIds)
            .stream()
            .collect(Collectors.toMap(Author::id, Function.identity()));

        // Map each book to its author
        return books.stream()
            .collect(Collectors.toMap(
                Function.identity(),
                book -> authorsById.get(book.authorId())
            ));
    }
}

Now the same GraphQL query triggers just two database queries:

SELECT * FROM books;
SELECT * FROM authors WHERE id IN (1, 2, 3, ...);

Understanding @BatchMapping

The @BatchMapping annotation:

  1. Collects all instances that need the field resolved
  2. Calls your method once with the entire batch
  3. Maps results back to individual instances
┌─────────────────────────────────────────────────────────────────┐
│                         Query Execution                         │
└─────────────────────────────────────────────────────────────────┘

books query returns: Book-1, Book-2, Book-3, ..., Book-100
                        │       │       │            │
                        └───────┴───────┴────────────┘
                                    │
                        ┌───────────▼───────────┐
                        │     @BatchMapping     │
                        │  author(List<Book>)   │
                        │                       │
                        │   Single batch call   │
                        │   with all 100 books  │
                        └───────────┬───────────┘
                                    │
                        ┌───────────▼───────────┐
                        │   Map<Book, Author>   │
                        │       returned        │
                        └───────────────────────┘

Return Types for @BatchMapping

@BatchMapping supports several return types:

// Map - each source maps to one result
@BatchMapping
public Map<Book, Author> author(List<Book> books) { ... }

// Mono<Map> - async loading
@BatchMapping
public Mono<Map<Book, Author>> author(List<Book> books) { ... }

// For collections (one-to-many relationships)
@BatchMapping
public Map<Author, List<Book>> books(List<Author> authors) { ... }

Approach 2: BatchLoaderRegistry

For more control, register batch loaders manually:

@Configuration
public class GraphQLConfig {

    // Spring Boot auto-configures a BatchLoaderRegistry bean;
    // register the named loader against it at startup.
    public GraphQLConfig(BatchLoaderRegistry registry,
                         AuthorRepository authorRepository) {
        registry.<String, Author>forName("authorLoader")
            .registerMappedBatchLoader((authorIds, env) ->
                // A mapped loader returns a Map keyed by ID,
                // so result order doesn't matter
                Mono.fromCallable(() -> authorRepository
                    .findAllById(authorIds)
                    .stream()
                    .collect(Collectors.toMap(Author::id, Function.identity()))));
    }

    // Wire a data fetcher for Book.author that defers to the named loader
    @Bean
    public RuntimeWiringConfigurer runtimeWiringConfigurer() {
        return wiringBuilder -> wiringBuilder
            .type("Book", builder -> builder
                .dataFetcher("author", environment -> {
                    Book book = environment.getSource();
                    DataLoader<String, Author> loader =
                        environment.getDataLoader("authorLoader");
                    return loader.load(book.authorId());
                }));
    }
}

Simpler Registration with Spring Boot

Spring Boot's GraphQL starter auto-configures the BatchLoaderRegistry, which allows a cleaner registration style:

@Component
public class AuthorBatchLoader {

    private final AuthorRepository authorRepository;

    public AuthorBatchLoader(AuthorRepository authorRepository) {
        this.authorRepository = authorRepository;
    }

    public List<Author> load(List<String> authorIds) {
        Map<String, Author> authorsById = authorRepository
            .findAllById(authorIds)
            .stream()
            .collect(Collectors.toMap(Author::id, Function.identity()));

        // Must return results in the same order as the input keys
        return authorIds.stream()
            .map(authorsById::get)
            .toList();
    }
}

@Configuration
public class DataLoaderConfig {

    public DataLoaderConfig(BatchLoaderRegistry registry,
                            AuthorBatchLoader authorBatchLoader) {
        // Registered by type pair, so a DataLoader<String, Author>
        // controller method argument is resolved automatically
        registry.forTypePair(String.class, Author.class)
            .registerBatchLoader((ids, env) ->
                Flux.fromIterable(authorBatchLoader.load(ids)));
    }
}

Nested Batch Loading

What about nested relationships?

query {
  books {
    title
    author {
      name
      publisher {   # Another level!
        name
      }
    }
  }
}

Just add another @BatchMapping:

@Controller
public class AuthorController {

    private final PublisherRepository publisherRepository;

    public AuthorController(PublisherRepository publisherRepository) {
        this.publisherRepository = publisherRepository;
    }

    @BatchMapping
    public Map<Author, Publisher> publisher(List<Author> authors) {
        Set<String> publisherIds = authors.stream()
            .map(Author::publisherId)
            .filter(Objects::nonNull)
            .collect(Collectors.toSet());

        Map<String, Publisher> publishersById = publisherRepository
            .findAllById(publisherIds)
            .stream()
            .collect(Collectors.toMap(Publisher::id, Function.identity()));

        return authors.stream()
            .filter(a -> a.publisherId() != null)
            .collect(Collectors.toMap(
                Function.identity(),
                a -> publishersById.get(a.publisherId())
            ));
    }
}

Spring GraphQL handles the execution order automatically.

One-to-Many Relationships

For collections, the return type changes slightly:

@Controller
public class AuthorController {

    private final AuthorRepository authorRepository;
    private final BookRepository bookRepository;

    public AuthorController(AuthorRepository authorRepository,
                            BookRepository bookRepository) {
        this.authorRepository = authorRepository;
        this.bookRepository = bookRepository;
    }

    @QueryMapping
    public List<Author> authors() {
        return authorRepository.findAll();
    }

    @BatchMapping
    public Map<Author, List<Book>> books(List<Author> authors) {
        Set<String> authorIds = authors.stream()
            .map(Author::id)
            .collect(Collectors.toSet());

        // Single query for all books by these authors
        List<Book> allBooks = bookRepository.findByAuthorIdIn(authorIds);

        // Group by author ID
        Map<String, List<Book>> booksByAuthorId = allBooks.stream()
            .collect(Collectors.groupingBy(Book::authorId));

        // Map each author to their books (empty list if none)
        return authors.stream()
            .collect(Collectors.toMap(
                Function.identity(),
                author -> booksByAuthorId.getOrDefault(author.id(), List.of())
            ));
    }
}
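The grouping step is the part that trips people up, so here it is in isolation as a plain-Java sketch (hypothetical Book record), showing why getOrDefault matters for authors with no books:

```java
import java.util.*;
import java.util.stream.*;

// One-to-many batching in miniature: group fetched books by author id,
// then use getOrDefault so authors with no books get an empty list
// instead of a null entry.
public class GroupingDemo {

    record Book(String title, String authorId) {}

    static Map<String, List<Book>> booksPerAuthor(List<String> authorIds,
                                                  List<Book> fetched) {
        Map<String, List<Book>> byAuthor = fetched.stream()
            .collect(Collectors.groupingBy(Book::authorId));

        Map<String, List<Book>> result = new HashMap<>();
        for (String id : authorIds) {
            result.put(id, byAuthor.getOrDefault(id, List.of()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> authorIds = List.of("a1", "a2", "a3"); // a3 has no books
        List<Book> fetched = List.of(
            new Book("Book One", "a1"),
            new Book("Book Two", "a1"),
            new Book("Book Three", "a2"));

        Map<String, List<Book>> result = booksPerAuthor(authorIds, fetched);
        System.out.println(result.get("a1").size()); // 2
        System.out.println(result.get("a3"));        // []
    }
}
```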

Performance Comparison

Let's measure the difference:

@SpringBootTest
class BatchLoadingPerformanceTest {

    @Autowired
    private GraphQlTester graphQlTester;

    @Test
    void measureQueryPerformance() {
        // Setup: 100 books, 20 authors
        long start = System.currentTimeMillis();

        graphQlTester.document("""
            query {
              books {
                title
                author {
                  name
                }
              }
            }
            """)
            .execute()
            .path("books").entityList(Book.class).hasSize(100);

        long duration = System.currentTimeMillis() - start;
        System.out.println("Query executed in: " + duration + "ms");
    }
}

Approach        Queries   Time
Naive (N+1)     101       ~500ms
@BatchMapping   2         ~50ms

10x improvement just by changing how we fetch data!

Caching Within a Request

DataLoader automatically caches within a request:

query {
  book1: bookById(id: "1") {
    author { name }   # Loads author "A"
  }
  book2: bookById(id: "2") {
    author { name }   # Also loads author "A" - cached!
  }
}

If both books have the same author, the author is fetched only once per request.

Important: This cache is per-request. New requests start with an empty cache.
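Under the hood, the per-request cache is just memoization keyed on the loader's key type. A toy plain-Java loader (not the real DataLoader API) makes the behavior concrete:

```java
import java.util.*;
import java.util.function.Function;

// Toy per-request memoization, mimicking DataLoader's request-scoped cache:
// repeated load() calls for the same key hit the backend only once.
public class RequestCacheDemo {

    static class RequestScopedLoader<K, V> {
        private final Map<K, V> cache = new HashMap<>();
        private final Function<K, V> backend;
        int backendCalls = 0;

        RequestScopedLoader(Function<K, V> backend) {
            this.backend = backend;
        }

        V load(K key) {
            return cache.computeIfAbsent(key, k -> {
                backendCalls++;           // simulated database hit
                return backend.apply(k);
            });
        }
    }

    public static void main(String[] args) {
        RequestScopedLoader<String, String> loader =
            new RequestScopedLoader<>(id -> "Author-" + id);

        loader.load("A"); // backend hit
        loader.load("A"); // cached - no second hit
        loader.load("B"); // backend hit

        System.out.println("Backend calls: " + loader.backendCalls); // 2
    }
}
```

A fresh loader per request is what keeps this cache from ever serving stale data across requests.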

Common Pitfalls

1. Returning Results in Wrong Order

DataLoader requires results in the same order as the input keys:

// WRONG - iterating the map's values loses the requested key order
Map<String, Author> authorsById = ...;
return new ArrayList<>(authorsById.values());

// RIGHT - stream the input keys so position i corresponds to ids.get(i)
return ids.stream()
    .map(authorsById::get)
    .toList();

(Mapped batch loaders that return a Map<K, V> sidestep this entirely, since results are matched by key rather than by position.)

2. Forgetting to Handle Nulls

Some IDs might not have corresponding data:

@BatchMapping
public Map<Book, Author> author(List<Book> books) {
    // Some books might reference a missing author
    Map<String, Author> authorsById = ...;

    // Collectors.toMap throws NullPointerException on null values,
    // so build the map by hand when an author can be absent
    Map<Book, Author> result = new HashMap<>();
    for (Book book : books) {
        result.put(book, authorsById.get(book.authorId())); // null is fine here
    }
    return result;
}
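It's worth knowing that Collectors.toMap rejects null values outright, even with a HashMap supplier, because it accumulates via Map.merge; a plain put() loop is the safe path. A self-contained demonstration:

```java
import java.util.*;
import java.util.stream.*;

// Collectors.toMap rejects null values with a NullPointerException,
// even when the map supplier is HashMap::new; plain put() accepts them.
public class NullValueDemo {

    static Map<String, String> nullSafeMap(List<String> ids, Map<String, String> lookup) {
        Map<String, String> result = new HashMap<>();
        for (String id : ids) {
            result.put(id, lookup.get(id)); // null values are fine here
        }
        return result;
    }

    static boolean toMapThrowsOnNull(List<String> ids, Map<String, String> lookup) {
        try {
            ids.stream().collect(Collectors.toMap(
                id -> id, lookup::get, (a, b) -> a, HashMap::new));
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, String> authors = Map.of("a1", "Author One");
        List<String> ids = List.of("a1", "missing");

        System.out.println(nullSafeMap(ids, authors).containsKey("missing")); // true
        System.out.println(toMapThrowsOnNull(ids, authors));                  // true
    }
}
```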

3. Not Using Indexes

Batch loading is only fast if the database query is fast:

-- Primary keys are indexed automatically, but foreign keys usually are not
CREATE INDEX idx_books_author_id ON books(author_id);

Advanced: Custom DataLoader Configuration

For fine-tuned control:

@Configuration
public class DataLoaderOptionsConfig {

    public DataLoaderOptionsConfig(BatchLoaderRegistry registry,
                                   AuthorRepository authorRepository) {
        DataLoaderOptions options = DataLoaderOptions.newOptions()
            .setBatchingEnabled(true)
            .setCachingEnabled(true)
            .setMaxBatchSize(100); // Split large key sets into batches of 100

        registry.forTypePair(String.class, Author.class)
            .withOptions(options)
            .registerMappedBatchLoader((ids, env) ->
                Mono.fromCallable(() -> authorRepository
                    .findAllById(ids)
                    .stream()
                    .collect(Collectors.toMap(Author::id, Function.identity()))));
    }
}

For even lower-level tuning, graphql-java's DataLoaderOptions also exposes a BatchLoaderScheduler hook for delaying or re-threading batch calls.

Debugging Batch Loading

Add logging to see what's happening:

@BatchMapping
public Map<Book, Author> author(List<Book> books) {
    log.debug("Batch loading authors for {} books", books.size());
    log.debug("Author IDs: {}", books.stream()
        .map(Book::authorId)
        .distinct()
        .toList());

    // ... rest of implementation
}

Or enable SQL logging:

logging:
  level:
    org.hibernate.SQL: DEBUG
    org.hibernate.type.descriptor.sql: TRACE

Summary

Pattern                   Use Case        Mechanism
Single related object     Book.author     @BatchMapping returning Map<Book, Author>
List of related objects   Author.books    @BatchMapping returning Map<Author, List<Book>>
Manual control            Complex logic   BatchLoaderRegistry

The N+1 problem is GraphQL's most common performance issue, but Spring GraphQL makes it easy to solve. Use @BatchMapping by default, and you'll have efficient queries without complex configuration.

Next: Testing Spring GraphQL - comprehensive strategies for unit and integration testing.