# Bulk Testing with PageSpeed Insights API
Test multiple URLs while respecting rate limits. Here's what works and where it breaks down.
## The Challenge
PSI API limits make bulk testing complex:
| Constraint | Value |
|---|---|
| URLs per request | 1 |
| Requests per minute | ~240 |
| Requests per day | 25,000 |
| Time per request | 10-30 seconds |
Testing 1,000 URLs? The ~240/min rate limit alone puts a floor of roughly 250 seconds on the run, before a single error or retry. Real-world: expect 30+ minutes.
Optimal concurrency: 5-10 parallel requests is the sweet spot. Higher concurrency triggers undocumented throttling with 500 errors after ~450 requests.
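The arithmetic here can be sketched as a quick lower-bound estimator. The defaults are assumptions taken from the figures in this section (4 workers, ~10 s best-case per request, ~240 requests/minute); your numbers will differ:

```js
// A bulk run is bounded by whichever is slower: the API rate limit
// or the concurrency ceiling (workers × seconds per request).
function estimateSeconds(urlCount, { concurrency = 4, perRequestSec = 10, reqPerMin = 240 } = {}) {
  const rateLimitBound = (urlCount / reqPerMin) * 60
  const concurrencyBound = (urlCount / concurrency) * perRequestSec
  return Math.max(rateLimitBound, concurrencyBound)
}

console.log(estimateSeconds(1000)) // 2500 seconds, about 42 minutes
```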
## Queue Implementation (Node.js)

Use `p-queue` to throttle requests:
```js
import PQueue from 'p-queue'

const API_KEY = process.env.PSI_API_KEY

// 4 concurrent, max 4 per second (stays under the ~240/min limit)
const queue = new PQueue({
  concurrency: 4,
  interval: 1000,
  intervalCap: 4,
})

async function fetchPSI(url) {
  const response = await fetch(
    `https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=${encodeURIComponent(url)}&key=${API_KEY}`
  )
  if (!response.ok) {
    console.error(`Failed: ${url} (${response.status})`)
    return null
  }
  const data = await response.json()
  const { categories, audits } = data.lighthouseResult
  return {
    url,
    score: Math.round(categories.performance.score * 100),
    lcp: audits['largest-contentful-paint'].numericValue, // ms
    cls: audits['cumulative-layout-shift'].numericValue,
    tbt: audits['total-blocking-time'].numericValue, // ms
  }
}

const urls = [
  'https://example.com',
  'https://example.com/about',
  'https://example.com/contact',
  // ... more URLs
]

const results = await Promise.all(
  urls.map(url => queue.add(() => fetchPSI(url)))
)
console.log(results.filter(Boolean))
```
## Queue Implementation (Python)
```python
import asyncio
import aiohttp
import os
from asyncio import Semaphore

API_KEY = os.environ['PSI_API_KEY']
MAX_CONCURRENT = 4
RATE_LIMIT_DELAY = 1.0  # 4 concurrent × 1 req/s each (under ~240/min limit)

semaphore = Semaphore(MAX_CONCURRENT)

async def fetch_psi(session: aiohttp.ClientSession, url: str) -> dict | None:
    async with semaphore:
        await asyncio.sleep(RATE_LIMIT_DELAY)
        params = {'url': url, 'key': API_KEY}
        async with session.get(
            'https://www.googleapis.com/pagespeedonline/v5/runPagespeed',
            params=params,
        ) as response:
            if not response.ok:
                print(f'Failed: {url} ({response.status})')
                return None
            data = await response.json()
            return {
                'url': url,
                'score': round(data['lighthouseResult']['categories']['performance']['score'] * 100),
            }

async def bulk_test(urls: list[str]) -> list[dict]:
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_psi(session, url) for url in urls]
        results = await asyncio.gather(*tasks)
        return [r for r in results if r]

urls = [
    'https://example.com',
    'https://example.com/about',
    'https://example.com/contact',
]

results = asyncio.run(bulk_test(urls))
print(results)
```
## Handling Failures
Add retry logic with exponential backoff:
```js
async function fetchWithRetry(url, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(
      `https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=${encodeURIComponent(url)}&key=${API_KEY}`
    )
    if (response.ok) {
      return response.json()
    }
    if (response.status === 429) {
      // Rate limited: back off exponentially (1s, 2s, 4s)
      const delay = 2 ** attempt * 1000
      console.log(`Rate limited, waiting ${delay}ms`)
      await new Promise(r => setTimeout(r, delay))
      continue
    }
    if (response.status >= 500) {
      const delay = 2 ** attempt * 1000
      console.log(`Server error, retrying in ${delay}ms`)
      await new Promise(r => setTimeout(r, delay))
      continue
    }
    // 4xx errors (except 429) - don't retry
    console.error(`Failed: ${url} (${response.status})`)
    return null
  }
  console.error(`Max retries exceeded: ${url}`)
  return null
}
```
## Storing Results
Save to JSON for later analysis:
```js
import { writeFile } from 'node:fs/promises'

const results = await Promise.all(
  urls.map(url => queue.add(() => fetchPSI(url)))
)

const report = {
  timestamp: new Date().toISOString(),
  totalUrls: urls.length,
  successful: results.filter(Boolean).length,
  results: results.filter(Boolean),
}

await writeFile('psi-report.json', JSON.stringify(report, null, 2))
```
Or export to CSV:
```js
const csv = [
  'url,score,lcp,cls,tbt',
  ...results
    .filter(Boolean)
    .map(r => `${r.url},${r.score},${r.lcp},${r.cls},${r.tbt}`),
].join('\n')

await writeFile('psi-report.csv', csv)
```
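One caveat with hand-rolled CSV: URLs with query strings can contain commas, which silently shift columns. A minimal RFC 4180-style quoting helper (`csvField` is a hypothetical name, not part of the snippets above):

```js
// Quote a field if it contains a comma, quote, or newline;
// double any embedded quotes (RFC 4180 style).
function csvField(value) {
  const s = String(value)
  return /[",\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s
}

console.log(csvField('https://example.com/?q=a,b')) // "https://example.com/?q=a,b" (quoted)
```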
## Progress Tracking
Show progress for long-running jobs:
```js
let completed = 0
const total = urls.length

queue.on('completed', () => {
  completed++
  process.stdout.write(`\r${completed}/${total} URLs processed`)
})

queue.on('error', (error) => {
  console.error('\nQueue error:', error.message)
})
```
## Score Variance
Scores can vary ±5 points between runs due to network conditions and server load. For accurate monitoring:
- Run each URL 3 times and take the median score (recommended practice)
- Test from a consistent location (PSI uses servers in Oregon, S. Carolina, Netherlands, or Taiwan)
- Sites without CDNs see more variance based on test server location
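The median-of-three practice needs only a small helper; wrap it around whichever fetch function you use (a sketch, assuming each run yields a numeric score):

```js
// Median of an odd-length list of scores from repeated runs
function medianScore(scores) {
  const sorted = [...scores].sort((a, b) => a - b)
  return sorted[Math.floor(sorted.length / 2)]
}

console.log(medianScore([52, 48, 55])) // 52
```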
## When This Breaks Down
At scale, PSI API bulk testing has fundamental problems:
| Problem | Impact |
|---|---|
| No URL discovery | You provide every URL manually |
| Rate limit dance | Complex queue management |
| No historical comparison | Build your own storage |
| Slow feedback | 10-30s per URL |
| Quota exhaustion | 25k/day sounds like a lot until it isn't |
| Score variance | Need multiple runs per URL for accuracy |
### The Math
| Site Size | Time (optimistic) | Time (realistic) |
|---|---|---|
| 100 URLs | 4 minutes | 15 minutes |
| 500 URLs | 20 minutes | 1 hour |
| 1,000 URLs | 40 minutes | 2+ hours |
| 5,000 URLs | 3+ hours | 10+ hours |
And that's assuming no retries, no 429s, and no server errors.
## Alternative: CrUX API
If you only need field data (Core Web Vitals from real users), the CrUX API has no daily limit — just 150 requests/minute. It's faster since there's no Lighthouse analysis overhead.
```bash
curl "https://chromeuxreport.googleapis.com/v1/records:queryRecord?key=YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}'
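The same query from Node might look like the sketch below; `cruxRequest` is a hypothetical helper mirroring the curl call, and the key is a placeholder:

```js
const CRUX_ENDPOINT = 'https://chromeuxreport.googleapis.com/v1/records:queryRecord'

// Build the endpoint URL and fetch options for a queryRecord call
function cruxRequest(url, key) {
  return {
    endpoint: `${CRUX_ENDPOINT}?key=${key}`,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ url }),
    },
  }
}

// Usage (placeholder key):
// const { endpoint, options } = cruxRequest('https://example.com', process.env.CRUX_API_KEY)
// const data = await (await fetch(endpoint, options)).json()
```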
Note: Google is discontinuing CrUX data in PSI API, recommending the CrUX API instead.
## Skip the Queue Management
Unlighthouse crawls your sitemap, discovers all URLs automatically, manages concurrency, and stores results. One command:
```bash
npx unlighthouse --site https://your-site.com
```
| Feature | PSI API Bulk | Unlighthouse |
|---|---|---|
| URL discovery | Manual | Automatic crawl |
| Rate limits | Your problem | None |
| Queue management | Build it yourself | Built-in |
| Historical data | Build it yourself | Built-in |
| Time for 500 URLs | 1+ hour | ~10 minutes |
For scheduled bulk testing with historical tracking, Unlighthouse Cloud handles everything.
Try Unlighthouse Cloud