---
title: "Bulk Lighthouse Testing for Large Sites · Unlighthouse"
meta:
  "og:description": "Scan large websites with thousands of pages efficiently. Configure sampling, URL filtering, and optimization strategies for bulk Lighthouse testing."
  "og:title": "Bulk Lighthouse Testing for Large Sites · Unlighthouse"
  description: "Scan large websites with thousands of pages efficiently. Configure sampling, URL filtering, and optimization strategies for bulk Lighthouse testing."
---

# **Bulk Lighthouse Testing for Large Sites**

Scan websites with thousands of pages efficiently. Unlike single-page tools like PageSpeed Insights, Unlighthouse handles large sites with smart sampling, parallel scanning, and configurable limits.

- **Automatic discovery** - Finds all pages via sitemap and crawling
- **Smart sampling** - Tests representative pages from each template
- **Parallel scanning** - Multiple Chrome instances for speed
- **Aggregated results** - Site-wide scores and insights

Unlighthouse includes smart defaults for large sites. Understanding these helps balance completeness with performance.

## [Default Large Site Configuration](#default-large-site-configuration)

These defaults optimize scanning for sites with thousands of pages:

- [**ignoreI18nPages**](https://unlighthouse.dev/api-doc/config#scanner-ignorei18npages) enabled
- [**maxRoutes**](https://unlighthouse.dev/api-doc/config#scanner-maxroutes) set to 200
- [**skipJavascript**](https://unlighthouse.dev/api-doc/config#scanner-skipjavascript) enabled
- [**samples**](https://unlighthouse.dev/api-doc/config#scanner-samples) set to 1
- [**throttling**](https://unlighthouse.dev/api-doc/config#scanner-throttle) disabled
- [**crawler**](https://unlighthouse.dev/api-doc/config#scanner-crawler) enabled
- [**dynamicSampling**](https://unlighthouse.dev/api-doc/config#scanner-dynamicsampling) set to 5

For example, when scanning a blog with thousands of posts, testing every post is usually redundant because the posts share a template and have a very similar DOM. The configuration lets you choose exactly how many posts are scanned.
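For reference, the defaults listed above can be written out as an explicit config. This is a sketch rather than a copy of Unlighthouse's internal defaults: the key names are taken from the linked API anchors (for instance, `throttle` from `#scanner-throttle`), and the values mirror the list above.

```typescript
import { defineUnlighthouseConfig } from 'unlighthouse/config'

export default defineUnlighthouseConfig({
  scanner: {
    ignoreI18nPages: true, // "ignoreI18nPages enabled"
    maxRoutes: 200, // upper bound on the number of routes scanned
    skipJavascript: true, // "skipJavascript enabled"
    samples: 1, // one Lighthouse run per URL
    throttle: false, // "throttling disabled" (key name assumed from the API anchor)
    crawler: true, // "crawler enabled"
    dynamicSampling: 5, // up to 5 URLs per route definition
  },
})
```

Writing the defaults out like this is only useful as a starting point for overriding individual values; omitting a key should leave its default in place.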

## [Manually select URLs](#manually-select-urls)

You can configure Unlighthouse to scan an explicit list of relative paths. This is useful for large, complex sites where you want exact control over which pages are tested.

See [**Manually providing URLs**](https://unlighthouse.dev/guide/guides/url-discovery#manually-providing-urls) for more information.
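A minimal sketch of such a list, assuming the top-level `urls` option described in the linked guide. The paths themselves are hypothetical placeholders.

```typescript
import { defineUnlighthouseConfig } from 'unlighthouse/config'

export default defineUnlighthouseConfig({
  // Hypothetical relative paths; replace with the pages you care about.
  urls: [
    '/',
    '/pricing',
    '/blog/launch-announcement',
  ],
})
```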

## [Provide Route Definitions (optional)](#provide-route-definitions-optional)

To make the most intelligent sampling decisions, Unlighthouse needs to know which page files are available. When running using the integration API, Unlighthouse will automatically provide this information.

When using the CLI, follow the [**providing route definitions**](https://unlighthouse.dev/guide/guides/route-definitions) guide.

Note: When no route definitions are provided, URLs are matched based on their URL fragments, e.g. `/blog/post-slug-3` will be mapped to `blog-slug`.
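To make the fallback concrete, here is a rough sketch of how a path could be reduced to a template name such as `blog-slug`. The helper `fallbackRouteName` is hypothetical and is not part of Unlighthouse's API; the real matching logic may differ, for example in how it handles single-segment or nested paths.

```typescript
// Hypothetical helper, for illustration only (not Unlighthouse's implementation).
// Treats the last path segment as a dynamic slug and keeps the parent segments.
function fallbackRouteName(path: string): string {
  const segments = path.split('/').filter(Boolean)
  // Root and single-segment paths are assumed to be static pages.
  if (segments.length <= 1)
    return segments[0] ?? 'index'
  return [...segments.slice(0, -1), 'slug'].join('-')
}

const name = fallbackRouteName('/blog/post-slug-3') // 'blog-slug'
```

Under this assumption, every post under `/blog/` collapses to the same name, which is what allows them to be sampled as one group.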

## [Exclude URL Patterns](#exclude-url-patterns)

Use `exclude` to list paths that should be skipped during scanning.

For example, if your site has a documentation section that doesn't need to be scanned:

```
import { defineUnlighthouseConfig } from 'unlighthouse/config'

export default defineUnlighthouseConfig({
  scanner: {
    exclude: [
      '/docs/*',
    ],
  },
})
```

## [Include URL Patterns](#include-url-patterns)

Use `include` to scan only the listed paths; any path not listed here is excluded.

For example, if you run a blog and only want to scan your article and author pages:

```
import { defineUnlighthouseConfig } from 'unlighthouse/config'

export default defineUnlighthouseConfig({
  scanner: {
    include: [
      '/articles/*',
      '/authors/*',
    ],
  },
})
```

## [Change Dynamic Sampling Limit](#change-dynamic-sampling-limit)

By default, at most 5 URLs are matched to each route definition.

You can change the sample limit with:

```
import { defineUnlighthouseConfig } from 'unlighthouse/config'

export default defineUnlighthouseConfig({
  scanner: {
    dynamicSampling: 20, // 20 samples per page template
  },
})
```
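The effect of the limit can be pictured with a small standalone sketch. It is an illustration under assumptions, not Unlighthouse's internals: `samplePaths` and `parentOf` are hypothetical names, and grouping by parent path stands in for matching against route definitions.

```typescript
// Hypothetical sketch: keep at most `limit` paths per group, in discovery order.
// Passing `false` disables sampling and keeps every path.
function samplePaths(
  paths: string[],
  limit: number | false,
  groupKey: (path: string) => string,
): string[] {
  if (limit === false)
    return [...paths]
  const counts = new Map<string, number>()
  const kept: string[] = []
  for (const path of paths) {
    const key = groupKey(path)
    const count = counts.get(key) ?? 0
    if (count < limit) {
      kept.push(path)
      counts.set(key, count + 1)
    }
  }
  return kept
}

// Assumed grouping: everything before the last "/" identifies the template.
const parentOf = (path: string): string =>
  path.slice(0, path.lastIndexOf('/')) || '/'
```

With a limit of 2, a list of `/blog/a`, `/blog/b`, `/blog/c` and `/about` would keep the first two blog posts and `/about`, since `/about` belongs to a different group.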

## [Disabling Sampling](#disabling-sampling)

When route definitions aren't provided, Unlighthouse falls back to a less precise form of sampling, in which URLs under the same parent path are sampled as a group.

In these cases you may want to disable sampling as follows:

```
import { defineUnlighthouseConfig } from 'unlighthouse/config'

export default defineUnlighthouseConfig({
  scanner: {
    dynamicSampling: false, // Disable sampling completely
  },
})
```
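With sampling disabled, the `maxRoutes` default of 200 listed earlier presumably still caps the scan, so a site with thousands of pages may also need that limit raised. A sketch combining the two options, where the value `1000` is an arbitrary example rather than a recommendation:

```typescript
import { defineUnlighthouseConfig } from 'unlighthouse/config'

export default defineUnlighthouseConfig({
  scanner: {
    dynamicSampling: false, // scan every discovered URL instead of a sample
    maxRoutes: 1000, // example value; assumed to still limit the total routes scanned
  },
})
```

Expect scan time to grow roughly with the number of routes, so raise the limit gradually.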
