How to Block Google Read Aloud & Other Bots
Last updated: October 28, 2025
Overview
This guide explains how to prevent unwanted bots — including Google Read Aloud, AI content fetchers, and scraping agents — from accessing your website.
You’ll learn how to:
- Configure your site's robots.txt rules.
- Use middleware to actively block requests from specific user agents.
- Strengthen your site's privacy and anti-scraping protections.
These instructions are designed for Next.js applications, but the same principles apply to other frameworks.
1. Add a Robots Policy (app/robots.js)
Your robots.js file tells web crawlers what parts of your site they’re allowed to visit.
Add the following configuration:
// app/robots.js
export default function robots() {
  return {
    rules: [
      { userAgent: 'Google-Read-Aloud', disallow: '/' },
      { userAgent: 'Googlebot', disallow: '/' },
      { userAgent: 'Bingbot', disallow: '/' },
      { userAgent: 'Slurp', disallow: '/' },
      {
        userAgent: '*',
        disallow: ['/api/', '/_next/'],
        allow: ['/robots.txt'],
      },
    ],
  }
}
🔍 What this does
- Blocks Google Read Aloud, Googlebot, Bingbot, and Yahoo Slurp entirely.
- Prevents any crawler from accessing sensitive paths like /api/ or internal Next.js routes.
- Still allows bots to access your /robots.txt file for compliance (a sample of the generated output follows below).
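For reference, once deployed this configuration should produce a /robots.txt roughly like the following. The exact directive order and capitalization can vary between Next.js versions, so treat this as an illustrative sketch rather than the literal output:

User-Agent: Google-Read-Aloud
Disallow: /

User-Agent: Googlebot
Disallow: /

User-Agent: Bingbot
Disallow: /

User-Agent: Slurp
Disallow: /

User-Agent: *
Allow: /robots.txt
Disallow: /api/
Disallow: /_next/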
2. Block Unwanted User Agents via Middleware (middleware.js)
Even with a robots.txt file, some bots ignore your rules.
Use middleware to detect and reject those requests at the edge, before they reach your pages or API routes.
// middleware.js
import { NextResponse } from 'next/server';

export function middleware(req) {
  const ua = req.headers.get('user-agent') || '';
  const res = NextResponse.next();

  // 1. Actively block unwanted Google agents
  const blockedAgents = [
    /Google-Read-Aloud/i, // Google Assistant / Read Aloud
    /GoogleOther/i,       // Secondary Google crawler
    /Google-Speech/i,     // Speech systems
    /Google-Extended/i,   // Generative AI content fetcher
  ];

  if (blockedAgents.some((pattern) => pattern.test(ua))) {
    console.warn('🚫 Blocked Google bot attempt:', ua, req.url);
    return new NextResponse('Blocked for Google Read Aloud / AI fetch.', { status: 403 });
  }

  // 2. Strengthen privacy & anti-scraping headers
  res.headers.set('X-Robots-Tag', 'noindex, noarchive, nosnippet, noai');
  res.headers.set('Cache-Control', 'no-store, no-cache, must-revalidate');
  res.headers.set('Pragma', 'no-cache');

  // 3. Security hardening
  res.headers.set('X-Content-Type-Options', 'nosniff');
  res.headers.set('Referrer-Policy', 'no-referrer');
  res.headers.set('Permissions-Policy', 'microphone=(), camera=(), geolocation=()');

  return res;
}

// 4. Apply globally
export const config = {
  matcher: [
    '/((?!_next/static|_next/image|favicon.ico|robots.txt).*)',
  ],
};
⚙ How it works
- Blocks known Google agents like Google-Read-Aloud and Google-Extended before they reach your app; the agent list is easy to extend, as shown below.
- Adds privacy headers (X-Robots-Tag with noai, etc.) to discourage content indexing or AI scraping.
- Strengthens security using policies like no-referrer and nosniff.
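The same pattern covers non-Google crawlers too. The sketch below is a drop-in replacement for the blockedAgents array in the middleware above, adding user-agent tokens that other AI and scraping bots publish (GPTBot, CCBot, ClaudeBot, Bytespider). These extra tokens are not part of the original configuration, and operators change them over time, so verify the current names against each operator's documentation before relying on them:

// Drop-in replacement for blockedAgents — also rejects common AI/scraper crawlers.
const blockedAgents = [
  /Google-Read-Aloud/i, // Google Assistant / Read Aloud
  /GoogleOther/i,       // Secondary Google crawler
  /Google-Speech/i,     // Speech systems
  /Google-Extended/i,   // Generative AI content fetcher
  /GPTBot/i,            // OpenAI crawler
  /CCBot/i,             // Common Crawl
  /ClaudeBot/i,         // Anthropic crawler
  /Bytespider/i,        // ByteDance crawler
];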
3. Verify Your Setup
After deploying, test your configuration:
Check your robots file:
Visit https://yourdomain.com/robots.txt and confirm the disallow rules appear correctly.

Inspect response headers:
Use browser DevTools → Network tab → Response Headers to confirm:
X-Robots-Tag: noindex, noarchive, nosnippet, noai
Referrer-Policy: no-referrer

Simulate blocked bots:
Use curl to verify bots are blocked (the middleware should answer with a 403 Forbidden):
curl -A "Google-Read-Aloud" https://yourdomain.com
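If you prefer to script the check, here is a minimal sketch for Node 18+ (which ships a global fetch). The file name check-block.mjs is a placeholder of my own, and yourdomain.com should be replaced with your real domain:

// check-block.mjs — hypothetical verification script (run with: node check-block.mjs)
const url = 'https://yourdomain.com'; // replace with your real domain

// Request the page while pretending to be the Read Aloud agent.
const blocked = await fetch(url, {
  headers: { 'User-Agent': 'Google-Read-Aloud' },
});
console.log('Read Aloud UA status:', blocked.status); // expect 403 from the middleware

// Request the page as a normal browser-like agent.
const allowed = await fetch(url, { headers: { 'User-Agent': 'Mozilla/5.0' } });
console.log('Normal UA status:', allowed.status); // expect 200
console.log('X-Robots-Tag:', allowed.headers.get('x-robots-tag')); // expect the noai tag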
4. Additional Notes
Verisoul's bot detection can complement these rules by verifying real human users in real time.
If your site uses server-side rendering or APIs, apply similar user-agent checks at your backend or edge layer.
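For a backend outside Next.js, the same user-agent check is straightforward to reproduce. Here is a minimal sketch using Express; the framework choice, file name, and route are my own assumptions rather than part of this guide's stack:

// server.js — hypothetical Express backend applying the same user-agent filter
const express = require('express');
const app = express();

const blockedAgents = [/Google-Read-Aloud/i, /Google-Extended/i, /GoogleOther/i];

// Reject blocked user agents before any route handler runs.
app.use((req, res, next) => {
  const ua = req.get('user-agent') || '';
  if (blockedAgents.some((pattern) => pattern.test(ua))) {
    return res.status(403).send('Blocked for Google Read Aloud / AI fetch.');
  }
  res.set('X-Robots-Tag', 'noindex, noarchive, nosnippet, noai');
  next();
});

app.get('/', (req, res) => res.send('Hello'));
app.listen(3000);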
Blocking Googlebot as shown above will also stop Google from indexing your pages. If you want to keep your SEO intact, allowlist the crawlers you still trust (such as Googlebot) and block only the agents you actually want to exclude, as in the sketch below.
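For example, a trimmed-down robots.js that keeps normal search indexing but still blocks Read Aloud and Google's generative-AI fetcher might look like this. This is a sketch assuming those are the only agents you want to exclude; adjust the rules to your own policy:

// app/robots.js — SEO-friendly variant: block only Read Aloud and the AI fetcher
export default function robots() {
  return {
    rules: [
      { userAgent: 'Google-Read-Aloud', disallow: '/' },
      { userAgent: 'Google-Extended', disallow: '/' },
      {
        userAgent: '*',
        disallow: ['/api/', '/_next/'], // keep internals hidden, allow normal crawling elsewhere
      },
    ],
  }
}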