Streaming and Suspense in Next.js: Patterns for Fast Page Loads

Uvin Vindula · July 22, 2024 · 10 min read

TL;DR

Streaming is the single most impactful performance pattern I use in Next.js. Every project I build — from e-commerce platforms to recipe apps — uses streaming to get LCP under 1.5 seconds. The idea is simple: send the page shell instantly while expensive data fetches resolve in the background. The user sees a complete layout with loading states within milliseconds, and the real content streams in as it becomes ready. No blank white screens. No spinners blocking the entire page. This article covers the exact patterns I use in production, including real before-and-after numbers from EuroParts Lanka and FreshMart.


Why Streaming Matters for Performance

Traditional server-side rendering has a fundamental problem: the server has to finish every single data fetch before it sends a single byte to the browser. If your page needs data from three different APIs and one of them takes 800ms, the user stares at a blank screen for at least 800ms. That is the baseline — before the browser even starts parsing HTML.

Streaming flips this model. The server sends the HTML shell immediately — the navigation, the layout, the static content — and then streams in dynamic content as each data fetch resolves. The browser starts rendering the moment the first chunk arrives.

I noticed this problem acutely on EuroParts Lanka. The product listing page was fetching inventory data, pricing from a third-party API, and category filters from the database. The pricing API alone took 400-600ms on a good day. Without streaming, users saw nothing for nearly a full second. With streaming, the page shell with skeleton loaders appeared in under 200ms. The perceived performance improvement was dramatic — bounce rate on that page dropped by 23%.

Here is the mental model I use:

Without Streaming:
[Server fetches ALL data ~~~~~~~ 800ms ~~~~~~~] → [Send HTML] → [Browser renders]
                                                                 ^ User sees first pixel here

With Streaming:
[Send shell immediately] → [Browser renders shell]
                             ^ User sees first pixel here (< 100ms)
[Stream chunk 1 ~~~ 200ms] → [Browser updates]
[Stream chunk 2 ~~~~~~~ 500ms] → [Browser updates]
[Stream chunk 3 ~~~~~~~~~ 800ms] → [Browser updates]

The user sees content almost immediately. The slowest data fetch no longer dictates when the entire page appears. This is how I consistently hit LCP under 1.5 seconds across all my projects.
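To make the timeline concrete, the model can be simulated in plain TypeScript with an async generator standing in for the server. This is a sketch of the concept with illustrative durations, not Next.js internals:

```typescript
// Sketch: simulate the streaming timeline with plain promises.
// Durations are illustrative; this is not how Next.js is implemented.
const wait = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

interface Chunk {
  id: string;
  html: string;
}

async function* streamPage(): AsyncGenerator<Chunk> {
  // The shell is emitted before any data fetch completes.
  yield { id: "shell", html: "<main>layout + skeletons</main>" };

  // All data fetches start in parallel the moment rendering begins.
  const tagged = [
    wait(200).then((): Chunk => ({ id: "filters", html: "<aside>filters</aside>" })),
    wait(500).then((): Chunk => ({ id: "grid", html: "<section>products</section>" })),
    wait(800).then((): Chunk => ({ id: "pricing", html: "<div>prices</div>" })),
  ].map((promise, index) => promise.then((chunk) => ({ index, chunk })));

  // Emit each chunk in completion order, mirroring out-of-order streaming.
  const remaining = new Set(tagged.keys());
  while (remaining.size > 0) {
    const { index, chunk } = await Promise.race([...remaining].map((i) => tagged[i]));
    remaining.delete(index);
    yield chunk;
  }
}

// Collect the chunk ids in arrival order.
async function chunkOrder(): Promise<string[]> {
  const ids: string[] = [];
  for await (const { id } of streamPage()) ids.push(id);
  return ids;
}
```

Running `chunkOrder()` resolves to `["shell", "filters", "grid", "pricing"]`: the shell arrives first, and the slowest fetch arrives last without delaying anything else.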


Suspense Boundaries

React Suspense is the mechanism that makes streaming work in Next.js. When a Server Component is wrapped in a <Suspense> boundary, Next.js streams the fallback content first and then replaces it with the resolved content when the data arrives.

The key concept is granularity. You do not wrap your entire page in one Suspense boundary — that defeats the purpose. You wrap individual sections that have their own data dependencies.

tsx
// app/products/page.tsx
import { Suspense } from "react";
import { ProductGrid } from "@/components/product-grid";
import { CategoryFilters } from "@/components/category-filters";
import { PricingBanner } from "@/components/pricing-banner";
import { ProductGridSkeleton } from "@/components/skeletons/product-grid";
import { FiltersSkeleton } from "@/components/skeletons/filters";
import { BannerSkeleton } from "@/components/skeletons/banner";

export default function ProductsPage() {
  return (
    <main className="container mx-auto px-4 py-8">
      <h1 className="text-3xl font-bold mb-8">Auto Parts</h1>

      <Suspense fallback={<BannerSkeleton />}>
        <PricingBanner />
      </Suspense>

      <div className="grid grid-cols-[250px_1fr] gap-8 mt-6">
        <Suspense fallback={<FiltersSkeleton />}>
          <CategoryFilters />
        </Suspense>

        <Suspense fallback={<ProductGridSkeleton />}>
          <ProductGrid />
        </Suspense>
      </div>
    </main>
  );
}

Each section fetches its own data independently. The h1 and the layout grid render instantly because they are static. The PricingBanner, CategoryFilters, and ProductGrid each stream in as their respective data fetches complete.

The fallback components are critical. They need to match the dimensions of the final content to avoid layout shifts. I spend real time designing skeletons that have the correct height, width, and spacing. A skeleton that causes CLS when the real content swaps in defeats half the purpose of streaming.

tsx
// components/skeletons/product-grid.tsx
export function ProductGridSkeleton() {
  return (
    <div className="grid grid-cols-3 gap-6">
      {Array.from({ length: 9 }).map((_, i) => (
        <div key={i} className="animate-pulse">
          <div className="bg-gray-200 rounded-lg aspect-square" />
          <div className="mt-3 h-4 bg-gray-200 rounded w-3/4" />
          <div className="mt-2 h-4 bg-gray-200 rounded w-1/2" />
        </div>
      ))}
    </div>
  );
}
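One way to keep skeletons and real components from drifting apart is to derive both from a shared sizing map. The helper below is a hypothetical sketch; the names and Tailwind classes are my own illustration, not code from the EuroParts codebase:

```typescript
// Hypothetical sketch: one sizing map feeds both the skeleton and the
// real card, so their dimensions cannot drift apart and cause CLS.
export const productCardSizing = {
  image: "aspect-square rounded-lg",
  title: "mt-3 h-4 w-3/4 rounded",
  price: "mt-2 h-4 w-1/2 rounded",
} as const;

export type CardPart = keyof typeof productCardSizing;

// The skeleton adds pulse styling on top of the shared dimensions.
export function skeletonClasses(part: CardPart): string {
  return `${productCardSizing[part]} bg-gray-200 animate-pulse`;
}

// The real card uses the same dimensions with no placeholder styling.
export function cardClasses(part: CardPart): string {
  return productCardSizing[part];
}
```

The skeleton's `<div className={skeletonClasses("image")} />` and the card's `<div className={cardClasses("image")}>` then always occupy the same box.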

loading.tsx vs Suspense

Next.js gives you two mechanisms for loading states, and understanding when to use each one matters.

`loading.tsx` is the route-level loading UI. When you create a `loading.tsx` file in a route segment, Next.js automatically wraps the page in a Suspense boundary with that loading component as the fallback. It triggers during navigation — when the user clicks a link to that route, they see the loading state while the page data loads.

tsx
// app/products/loading.tsx
export default function ProductsLoading() {
  return (
    <main className="container mx-auto px-4 py-8">
      <div className="h-8 w-48 bg-gray-200 rounded animate-pulse mb-8" />
      <div className="grid grid-cols-3 gap-6">
        {Array.from({ length: 9 }).map((_, i) => (
          <div key={i} className="animate-pulse">
            <div className="bg-gray-200 rounded-lg aspect-square" />
            <div className="mt-3 h-4 bg-gray-200 rounded w-3/4" />
          </div>
        ))}
      </div>
    </main>
  );
}

`<Suspense>` boundaries give you fine-grained control within a page. You use them to stream individual sections independently.

Here is my rule of thumb:

  • Use loading.tsx for the initial page-level loading state during navigation. It handles the full-page skeleton.
  • Use <Suspense> boundaries within the page for independent data sections that should stream separately.

On FreshMart, the recipe detail page uses both. The loading.tsx provides a full-page skeleton during link navigation. Inside the page, individual Suspense boundaries wrap the recipe instructions (fast — from database), the nutrition calculator (medium — computed server-side), and the community reviews (slow — aggregated from multiple sources).

tsx
// app/recipes/[slug]/page.tsx
import { Suspense } from "react";
import { RecipeHeader } from "@/components/recipe-header";
import { RecipeInstructions } from "@/components/recipe-instructions";
import { NutritionPanel } from "@/components/nutrition-panel";
import { CommunityReviews } from "@/components/community-reviews";
import { NutritionSkeleton } from "@/components/skeletons/nutrition";
import { ReviewsSkeleton } from "@/components/skeletons/reviews";

interface RecipePageProps {
  params: Promise<{ slug: string }>;
}

export default async function RecipePage({ params }: RecipePageProps) {
  const { slug } = await params;

  return (
    <article className="max-w-4xl mx-auto px-4 py-8">
      <Suspense fallback={<div className="h-64 bg-gray-100 rounded-xl animate-pulse" />}>
        <RecipeHeader slug={slug} />
      </Suspense>

      <div className="grid grid-cols-[1fr_300px] gap-8 mt-8">
        <Suspense
          fallback={
            <div className="space-y-4">
              {Array.from({ length: 6 }).map((_, i) => (
                <div key={i} className="h-4 bg-gray-200 rounded" />
              ))}
            </div>
          }
        >
          <RecipeInstructions slug={slug} />
        </Suspense>

        <aside className="space-y-6">
          <Suspense fallback={<NutritionSkeleton />}>
            <NutritionPanel slug={slug} />
          </Suspense>

          <Suspense fallback={<ReviewsSkeleton />}>
            <CommunityReviews slug={slug} />
          </Suspense>
        </aside>
      </div>
    </article>
  );
}

The recipe header and instructions stream in quickly. The nutrition panel arrives next. The community reviews — which aggregate ratings from multiple sources — stream in last. The user can start reading the recipe within 300ms while the heavier sections load in the background.


Streaming Data-Heavy Pages

The real power of streaming shows up on pages with multiple slow data sources. On EuroParts Lanka, the product detail page was the most complex streaming challenge I faced. It needs data from five different sources:

  1. Product details from the database (fast — 50ms)
  2. Real-time inventory from the warehouse API (medium — 150ms)
  3. Pricing with currency conversion (slow — 400ms from third-party API)
  4. Related products (medium — 200ms with joins)
  5. Customer reviews with aggregated ratings (slow — 350ms)

Without streaming, the page took 400ms minimum because the pricing API was the bottleneck. With streaming, the product details and layout appeared in under 100ms.

tsx
// app/parts/[partNumber]/page.tsx
import { Suspense } from "react";
import { ProductDetails } from "@/components/product-details";
import { InventoryStatus } from "@/components/inventory-status";
import { PricingPanel } from "@/components/pricing-panel";
import { RelatedProducts } from "@/components/related-products";
import { CustomerReviews } from "@/components/customer-reviews";
import { ProductDetailsSkeleton } from "@/components/skeletons/product-details";
import { InventorySkeleton } from "@/components/skeletons/inventory";
import { ReviewsSkeleton } from "@/components/skeletons/reviews";
import { PricingSkeleton } from "@/components/skeletons/pricing";
import { RelatedSkeleton } from "@/components/skeletons/related";

interface PartPageProps {
  params: Promise<{ partNumber: string }>;
}

export default async function PartPage({ params }: PartPageProps) {
  const { partNumber } = await params;

  return (
    <div className="container mx-auto px-4 py-8">
      {/* Fast — renders almost immediately */}
      <Suspense fallback={<ProductDetailsSkeleton />}>
        <ProductDetails partNumber={partNumber} />
      </Suspense>

      <div className="grid grid-cols-[1fr_350px] gap-8 mt-8">
        <div className="space-y-8">
          {/* Medium — streams in second */}
          <Suspense fallback={<InventorySkeleton />}>
            <InventoryStatus partNumber={partNumber} />
          </Suspense>

          {/* Slow — streams in last */}
          <Suspense fallback={<ReviewsSkeleton />}>
            <CustomerReviews partNumber={partNumber} />
          </Suspense>
        </div>

        <aside className="space-y-6">
          {/* Slow — but isolated, does not block anything else */}
          <Suspense fallback={<PricingSkeleton />}>
            <PricingPanel partNumber={partNumber} />
          </Suspense>

          {/* Medium — streams independently */}
          <Suspense fallback={<RelatedSkeleton />}>
            <RelatedProducts partNumber={partNumber} />
          </Suspense>
        </aside>
      </div>
    </div>
  );
}

The pattern is always the same: wrap each independent data section in its own Suspense boundary. The server sends the static layout immediately, then streams each section as its data resolves. No section blocks another.


Parallel Data Fetching

Streaming and parallel data fetching are complementary but different concepts. Streaming controls when the browser receives HTML chunks. Parallel fetching controls when the server initiates data requests.

A common mistake I see is awaiting fetches one after another inside a single async Server Component, which creates a sequential waterfall:

tsx
// BAD: Sequential waterfall — each fetch waits for the previous
async function ProductPage({ id }: { id: string }) {
  const product = await getProduct(id);           // 50ms
  const inventory = await getInventory(id);       // 150ms (waits for product)
  const pricing = await getPricing(id);           // 400ms (waits for inventory)
  // Total: 600ms minimum

  return (
    <div>
      <ProductCard product={product} />
      <InventoryBadge inventory={inventory} />
      <PriceTag pricing={pricing} />
    </div>
  );
}

Even with Suspense, if you fetch sequentially inside a single component, you lose the parallel benefit. The fix is either to split the work into separate components, each with its own Suspense boundary, or to use Promise.all when all the data is needed in the same component:

tsx
// GOOD: Parallel fetching with Promise.all
async function ProductPage({ id }: { id: string }) {
  const [product, inventory, pricing] = await Promise.all([
    getProduct(id),
    getInventory(id),
    getPricing(id),
  ]);
  // Total: ~400ms (limited by slowest fetch)

  return (
    <div>
      <ProductCard product={product} />
      <InventoryBadge inventory={inventory} />
      <PriceTag pricing={pricing} />
    </div>
  );
}

But the best pattern — the one I use on every project — is separate components with separate Suspense boundaries. This gives you both parallel fetching AND progressive streaming:

tsx
// BEST: Parallel fetching + progressive streaming
function ProductPage({ id }: { id: string }) {
  return (
    <div>
      <Suspense fallback={<ProductSkeleton />}>
        <ProductCard id={id} />        {/* Fetches internally, streams at 50ms */}
      </Suspense>
      <Suspense fallback={<InventorySkeleton />}>
        <InventoryBadge id={id} />     {/* Fetches internally, streams at 150ms */}
      </Suspense>
      <Suspense fallback={<PriceSkeleton />}>
        <PriceTag id={id} />           {/* Fetches internally, streams at 400ms */}
      </Suspense>
    </div>
  );
}

Each component fetches its own data. Next.js kicks off all three fetches simultaneously on the server. As each resolves, that chunk streams to the browser. The product card appears almost instantly. The inventory badge follows. The price tag arrives last. The user never waits for the slowest fetch to see the fastest data.
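The difference between the waterfall and the parallel version is easy to verify outside React with plain promises. The timings below are simulated stand-ins for the fetches above:

```typescript
// Sketch: measure sequential vs parallel completion with simulated fetches.
// Durations mirror the illustrative numbers used in this article.
const delay = <T,>(ms: number, value: T): Promise<T> =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

const getProduct = () => delay(50, "product");
const getInventory = () => delay(150, "inventory");
const getPricing = () => delay(400, "pricing");

// Waterfall: each await blocks the next fetch from starting (~600ms total).
async function sequentialMs(): Promise<number> {
  const start = Date.now();
  await getProduct();
  await getInventory();
  await getPricing();
  return Date.now() - start;
}

// Parallel: all three start immediately; total is bounded by the slowest (~400ms).
async function parallelMs(): Promise<number> {
  const start = Date.now();
  await Promise.all([getProduct(), getInventory(), getPricing()]);
  return Date.now() - start;
}
```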


Error Boundaries with Streaming

Streaming introduces a subtle complexity with error handling. When a streamed section fails, you need to handle the error without breaking the rest of the page. React Error Boundaries work with Suspense to catch errors in individual streamed sections.

tsx
// components/streaming-error-boundary.tsx
"use client";

import { Component, type ReactNode } from "react";

interface Props {
  children: ReactNode;
  fallback: ReactNode;
}

interface State {
  hasError: boolean;
}

export class StreamingErrorBoundary extends Component<Props, State> {
  constructor(props: Props) {
    super(props);
    this.state = { hasError: false };
  }

  static getDerivedStateFromError(): State {
    return { hasError: true };
  }

  render() {
    if (this.state.hasError) {
      return this.props.fallback;
    }

    return this.props.children;
  }
}

I pair every Suspense boundary with an Error Boundary in critical sections:

tsx
<StreamingErrorBoundary
  fallback={
    <div className="p-4 bg-red-50 rounded-lg text-red-700">
      <p className="font-medium">Unable to load pricing</p>
      <p className="text-sm mt-1">Prices are temporarily unavailable. Please refresh to try again.</p>
    </div>
  }
>
  <Suspense fallback={<PricingSkeleton />}>
    <PricingPanel partNumber={partNumber} />
  </Suspense>
</StreamingErrorBoundary>

On EuroParts Lanka, the third-party pricing API occasionally times out. Without the Error Boundary, the entire page would break. With it, users see the product details, inventory, and reviews — only the pricing section shows an error message. They can still browse and add items to their cart, and the pricing usually recovers on the next page load.

Next.js also provides a file-based approach with error.tsx:

tsx
// app/products/error.tsx
"use client";

interface ErrorProps {
  error: Error & { digest?: string };
  reset: () => void;
}

export default function ProductsError({ error, reset }: ErrorProps) {
  return (
    <div className="container mx-auto px-4 py-16 text-center">
      <h2 className="text-2xl font-bold text-gray-900">Something went wrong</h2>
      <p className="mt-2 text-gray-600">{error.message}</p>
      <button
        onClick={reset}
        className="mt-4 px-6 py-2 bg-[#F7931A] text-white rounded-lg hover:bg-[#E07B0A] transition-colors"
      >
        Try again
      </button>
    </div>
  );
}

My rule: use error.tsx for route-level errors (the whole page failed). Use inline Error Boundaries for section-level errors within streamed content.


Performance Before and After

Numbers matter more than theory. Here are real metrics from two production projects where I implemented streaming patterns.

EuroParts Lanka — Product Listing Page

| Metric      | Before Streaming | After Streaming | Change |
|-------------|------------------|-----------------|--------|
| LCP         | 2.8s             | 1.2s            | -57%   |
| FCP         | 1.9s             | 0.4s            | -79%   |
| TTFB        | 0.8s             | 0.12s           | -85%   |
| CLS         | 0.15             | 0.02            | -87%   |
| Bounce Rate | 34%              | 26%             | -23%   |

The TTFB improvement is the most telling. Before streaming, the server waited for all data before sending anything. After streaming, it sends the shell in ~120ms. The browser starts rendering immediately while the data streams in.

FreshMart — Recipe Detail Page

| Metric            | Before Streaming | After Streaming | Change |
|-------------------|------------------|-----------------|--------|
| LCP               | 3.1s             | 1.4s            | -55%   |
| FCP               | 2.2s             | 0.3s            | -86%   |
| TTFB              | 1.1s             | 0.09s           | -92%   |
| CLS               | 0.22             | 0.01            | -95%   |
| Avg. Time on Page | 2m 10s           | 3m 45s          | +73%   |

The time on page increase on FreshMart was the most rewarding metric. When users see the recipe layout immediately and the content streams in progressively, they actually stay and read. Before streaming, many users bounced during the blank loading period and never came back.


My Streaming Architecture

After implementing streaming across multiple projects, I have settled on a consistent architecture pattern. Here is how I structure streaming in every Next.js application I build.

Layer 1: Route-Level Skeleton

Every route gets a loading.tsx that mirrors the final layout. This handles initial navigation transitions.

Layer 2: Section-Level Suspense

Within each page, independent data sections get their own Suspense boundaries. I group by data source — if two sections need the same fetch, they share a boundary. If they need different fetches, they get separate boundaries.

Layer 3: Skeleton Components

I maintain a /components/skeletons directory with skeleton components that match the dimensions of their corresponding real components. These are not afterthoughts — I build the skeleton first, then build the real component to match.

Layer 4: Error Isolation

Critical sections (pricing, checkout, authentication) always get Error Boundaries paired with their Suspense boundaries. Non-critical sections (reviews, recommendations) can fail gracefully with simpler fallbacks.

The Component Pattern

Every streamable component follows the same structure:

tsx
// components/product-grid.tsx
import { getProducts } from "@/lib/api/products";

export async function ProductGrid() {
  const products = await getProducts();

  return (
    <div className="grid grid-cols-3 gap-6">
      {products.map((product) => (
        <ProductCard key={product.id} product={product} />
      ))}
    </div>
  );
}

The component is async. It fetches its own data. It has no loading state logic — that is handled by the Suspense boundary wrapping it in the parent. This separation of concerns keeps components clean and makes them composable.

The Data Fetching Layer

I co-locate data fetching functions with their domain:

lib/
  api/
    products.ts     # getProducts(), getProduct(), searchProducts()
    inventory.ts    # getInventory(), checkAvailability()
    pricing.ts      # getPricing(), convertCurrency()
    reviews.ts      # getReviews(), getAverageRating()

Each function handles its own caching, error handling, and retry logic. The components just call the function and trust the data layer to handle the rest.

tsx
// lib/api/pricing.ts
import { unstable_cache } from "next/cache";

export const getPricing = unstable_cache(
  async (partNumber: string) => {
    const response = await fetch(
      `${process.env.PRICING_API_URL}/prices/${partNumber}`,
      {
        headers: { Authorization: `Bearer ${process.env.PRICING_API_KEY}` },
        signal: AbortSignal.timeout(5000),
      }
    );

    if (!response.ok) {
      throw new Error(`Pricing API returned ${response.status}`);
    }

    return response.json();
  },
  ["pricing"],
  { revalidate: 300, tags: ["pricing"] }
);

This pattern scales cleanly. When I need to add a new streamed section, I create the data function, create the async component, create the skeleton, and wrap it in Suspense. Four files, each with a single responsibility.
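The retry logic mentioned above can live in a small generic helper that any data function wraps around its fetch call. This is a hypothetical sketch of that shape, not code from the EuroParts data layer:

```typescript
// Hypothetical sketch: retry an async function with exponential backoff.
// Any data-layer function could wrap its fetch call in this helper.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

export async function withRetry<T>(
  fn: () => Promise<T>,
  { attempts = 3, baseDelayMs = 100 }: { attempts?: number; baseDelayMs?: number } = {}
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // Back off between attempts: 100ms, 200ms, 400ms, ...
      if (attempt < attempts - 1) {
        await sleep(baseDelayMs * 2 ** attempt);
      }
    }
  }
  throw lastError;
}
```

A pricing fetch could then be wrapped as `withRetry(() => fetchPrices(partNumber), { attempts: 2 })` (with `fetchPrices` standing in for whatever the data function calls internally), keeping retry policy out of the component layer.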


Key Takeaways

  1. Stream everything by default. If a section fetches data, wrap it in Suspense. The cost is near zero, and the performance gain is substantial.
  2. Skeletons are not optional. A streaming page without proper skeletons causes layout shifts. Build the skeleton first, then the real component.
  3. Separate components, separate boundaries. One Suspense boundary per data source. Never let a slow fetch block a fast one.
  4. Pair Error Boundaries with Suspense. A failed stream should not break the entire page. Isolate failures to their sections.
  5. Parallel by architecture, not by hack. When each component fetches its own data inside its own Suspense boundary, parallelism happens naturally.
  6. Measure before and after. LCP, FCP, TTFB, and CLS are the metrics that tell you if streaming is working. If your LCP did not drop significantly, your boundaries are wrong.
  7. `loading.tsx` for navigation, `<Suspense>` for sections. They solve different problems. Use both.

Streaming is not a nice-to-have optimization. On every project I ship — whether it is an auto parts marketplace or a recipe platform — it is the foundation of the rendering architecture. The patterns are simple, the results are measurable, and once you start building this way, you never go back to blocking renders.

If you want to see these patterns in action, check out my recent work or learn more about how I approach performance-first development on my services page.


*Written by Uvin Vindula — Web3 and AI engineer based between Sri Lanka and the UK. I build production-grade web applications with a focus on performance, security, and clean architecture. Follow my work at @IAMUVIN or reach out at contact@uvin.lk.*

Uvin Vindula

Web3 and AI engineer based in Sri Lanka and the UK. Author of The Rise of Bitcoin. Director of Blockchain and Software Solutions at Terra Labz. Founder of uvin.lk — Sri Lanka's Bitcoin education platform with 10,000+ learners.