Bulk Testing with PageSpeed Insights API

Test multiple URLs with the PageSpeed Insights API. Queue implementation, rate limit handling, and practical limits of bulk PSI testing.
Harlan Wilton · 5 min read

Test multiple URLs while respecting rate limits. Here's what works and where it breaks down.

The Challenge

PSI API limits make bulk testing complex:

Constraint | Value
URLs per request | 1
Requests per 100 seconds | 400
Requests per day | 25,000
Time per request | 10-30 seconds

Testing 1,000 URLs? The rate limit alone sets a floor of 250 seconds (1,000 requests at 4 per second). But each request takes 10-30 seconds to complete and only a handful can be in flight at once, so real-world runs land closer to 40 minutes, and often 2+ hours.

Queue Implementation (Node.js)

Use p-queue to throttle requests (npm install p-queue; the snippets assume Node 18+ for the built-in fetch):

import PQueue from 'p-queue'

const API_KEY = process.env.PSI_API_KEY

// 4 concurrent, max 4 new requests per second = 400 per 100s
const queue = new PQueue({
  concurrency: 4,
  interval: 1000,
  intervalCap: 4,
})

async function fetchPSI(url) {
  const response = await fetch(
    `https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=${encodeURIComponent(url)}&key=${API_KEY}`
  )

  if (!response.ok) {
    console.error(`Failed: ${url} (${response.status})`)
    return null
  }

  const data = await response.json()
  const audits = data.lighthouseResult.audits

  return {
    url,
    score: Math.round(data.lighthouseResult.categories.performance.score * 100),
    // Lab metrics used by the CSV export later (LCP/TBT in ms, CLS unitless)
    lcp: audits['largest-contentful-paint'].numericValue,
    cls: audits['cumulative-layout-shift'].numericValue,
    tbt: audits['total-blocking-time'].numericValue,
  }
}

const urls = [
  'https://example.com',
  'https://example.com/about',
  'https://example.com/contact',
  // ... more URLs
]

const results = await Promise.all(
  urls.map(url => queue.add(() => fetchPSI(url)))
)

console.log(results.filter(Boolean))
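
The examples above use whatever device strategy the API applies by default. The v5 endpoint also accepts a strategy query parameter (mobile or desktop); here's a minimal sketch of a variant that reuses the queue and API_KEY from above, if you want both form factors:

// Same endpoint as above, with the documented `strategy` query parameter
async function fetchPSIWithStrategy(url, strategy = 'mobile') {
  const endpoint = new URL('https://www.googleapis.com/pagespeedonline/v5/runPagespeed')
  endpoint.searchParams.set('url', url)
  endpoint.searchParams.set('key', API_KEY)
  endpoint.searchParams.set('strategy', strategy)

  const response = await fetch(endpoint)
  if (!response.ok) return null

  const data = await response.json()
  return {
    url,
    strategy,
    score: Math.round(data.lighthouseResult.categories.performance.score * 100),
  }
}

// Queue both form factors for each URL (note: this doubles quota usage)
const perDevice = await Promise.all(
  urls.flatMap(url =>
    ['mobile', 'desktop'].map(strategy =>
      queue.add(() => fetchPSIWithStrategy(url, strategy))
    )
  )
)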

Queue Implementation (Python)

The same pattern with asyncio and aiohttp, using a semaphore to cap concurrency and a short delay to space out request starts:

import asyncio
import aiohttp
import os
from asyncio import Semaphore

API_KEY = os.environ['PSI_API_KEY']
MAX_CONCURRENT = 4
RATE_LIMIT_DELAY = 0.25  # space out request starts; with 10-30s responses this stays well under 400 per 100s

semaphore = Semaphore(MAX_CONCURRENT)

async def fetch_psi(session: aiohttp.ClientSession, url: str) -> dict | None:
    async with semaphore:
        await asyncio.sleep(RATE_LIMIT_DELAY)

        params = {'url': url, 'key': API_KEY}
        async with session.get(
            'https://www.googleapis.com/pagespeedonline/v5/runPagespeed',
            params=params
        ) as response:
            if not response.ok:
                print(f'Failed: {url} ({response.status})')
                return None

            data = await response.json()
            return {
                'url': url,
                'score': round(data['lighthouseResult']['categories']['performance']['score'] * 100),
            }

async def bulk_test(urls: list[str]) -> list[dict]:
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_psi(session, url) for url in urls]
        results = await asyncio.gather(*tasks)
        return [r for r in results if r]

urls = [
    'https://example.com',
    'https://example.com/about',
    'https://example.com/contact',
]

results = asyncio.run(bulk_test(urls))
print(results)

Handling Failures

Add retry logic with exponential backoff:

async function fetchWithRetry(url, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(
      `https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=${encodeURIComponent(url)}&key=${API_KEY}`
    )

    if (response.ok) {
      return response.json()
    }

    if (response.status === 429) {
      const delay = 2 ** attempt * 1000
      console.log(`Rate limited, waiting ${delay}ms`)
      await new Promise(r => setTimeout(r, delay))
      continue
    }

    if (response.status >= 500) {
      const delay = 2 ** attempt * 1000
      console.log(`Server error, retrying in ${delay}ms`)
      await new Promise(r => setTimeout(r, delay))
      continue
    }

    // 4xx errors (except 429) - don't retry
    console.error(`Failed: ${url} (${response.status})`)
    return null
  }

  console.error(`Max retries exceeded: ${url}`)
  return null
}
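
To plug the retrying fetch into the queue, move the score extraction to the caller. A sketch, using the same queue and URL list as above:

// Replace the plain fetchPSI calls with the retrying version
const results = await Promise.all(
  urls.map(url => queue.add(async () => {
    const data = await fetchWithRetry(url)
    if (!data) return null
    return {
      url,
      score: Math.round(data.lighthouseResult.categories.performance.score * 100),
    }
  }))
)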

Storing Results

Save to JSON for later analysis:

import { writeFile } from 'node:fs/promises'

const results = await Promise.all(
  urls.map(url => queue.add(() => fetchPSI(url)))
)

const report = {
  timestamp: new Date().toISOString(),
  totalUrls: urls.length,
  successful: results.filter(Boolean).length,
  results: results.filter(Boolean),
}

await writeFile('psi-report.json', JSON.stringify(report, null, 2))

Or export to CSV:

const csv = [
  'url,score,lcp,cls,tbt',
  ...results
    .filter(Boolean)
    .map(r => `${r.url},${r.score},${r.lcp},${r.cls},${r.tbt}`)
].join('\n')

await writeFile('psi-report.csv', csv)
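
The simple join above assumes no field contains a comma. If your URLs might (query strings, for instance), a small quoting helper keeps rows intact; a sketch, not a full CSV writer:

// Wrap every field in quotes and escape embedded quotes (RFC 4180 style)
const csvField = value => `"${String(value).replace(/"/g, '""')}"`

const safeCsv = [
  ['url', 'score', 'lcp', 'cls', 'tbt'].map(csvField).join(','),
  ...results
    .filter(Boolean)
    .map(r => [r.url, r.score, r.lcp, r.cls, r.tbt].map(csvField).join(','))
].join('\n')

await writeFile('psi-report.csv', safeCsv)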

Progress Tracking

Show progress for long-running jobs:

let completed = 0
const total = urls.length

queue.on('completed', () => {
  completed++
  process.stdout.write(`\r${completed}/${total} URLs processed`)
})

queue.on('error', (error) => {
  console.error('\nQueue error:', error.message)
})

When This Breaks Down

At scale, PSI API bulk testing has fundamental problems:

Problem | Impact
No URL discovery | You provide every URL manually
Rate limit dance | Complex queue management
No historical comparison | Build your own storage
Slow feedback | 10-30s per URL
Quota exhaustion | 25k/day sounds like a lot until it isn't

The Math

Site Size | Time (optimistic) | Time (realistic)
100 URLs | 4 minutes | 15 minutes
500 URLs | 20 minutes | 1 hour
1,000 URLs | 40 minutes | 2+ hours
5,000 URLs | 3+ hours | 10+ hours

And that's assuming no retries, no 429s, and no server errors.
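
A back-of-the-envelope estimator reproduces the optimistic column and approximates the realistic one. Assumptions: 4 concurrent requests and an average of 10 s (optimistic) or 30 s (realistic) per request; retries and 429 back-off are ignored, which the realistic column pads for.

// Rough wall-clock estimate: URLs are processed `concurrency` at a time,
// and each PSI request takes roughly 10-30 seconds to complete.
function estimateMinutes(urlCount, { concurrency = 4, secondsPerRequest = 10 } = {}) {
  return Math.round((urlCount / concurrency) * secondsPerRequest / 60)
}

for (const size of [100, 500, 1000, 5000]) {
  console.log(
    `${size} URLs: ~${estimateMinutes(size)} min optimistic, ` +
    `~${estimateMinutes(size, { secondsPerRequest: 30 })}+ min realistic`
  )
}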

Skip the Queue Management

Unlighthouse crawls your sitemap, discovers all URLs automatically, manages concurrency, and stores results. One command:

npx unlighthouse --site https://your-site.com

Feature | PSI API Bulk | Unlighthouse
URL discovery | Manual | Automatic crawl
Rate limits | Your problem | None
Queue management | Build it yourself | Built-in
Historical data | Build it yourself | Built-in
Time for 500 URLs | 1+ hour | ~10 minutes

For scheduled bulk testing with historical tracking, Unlighthouse Cloud handles everything.

Try Unlighthouse Cloud