Setting up robots.txt for your website
robots.txt tells search engine bots which parts of a website they may crawl. Proper configuration keeps technical pages, duplicates, and private sections out of search results. Keep in mind that Disallow blocks crawling, not indexing: a blocked URL can still appear in results if other sites link to it, so truly private pages also need authentication or a noindex directive.
Basic structure
User-agent: *
Disallow: /admin/
Disallow: /area51/
Disallow: /api/
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /search?
Disallow: /*?sort=
Disallow: /*?page=
Allow: /
Sitemap: https://example.ru/sitemap.xml
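One way to sanity-check rules like these is Python's standard urllib.robotparser. A quick sketch, with one caveat: the stdlib parser only does prefix matching, so wildcard rules such as /*?sort= are not evaluated the way Google or Yandex evaluate them.

```python
from urllib.robotparser import RobotFileParser

# The prefix-based subset of the rules above.
rules = """\
User-agent: *
Disallow: /admin/
Disallow: /cart/
Allow: /
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("*", "https://example.ru/admin/users"))  # False
print(rp.can_fetch("*", "https://example.ru/products/42"))  # True
```

For wildcard-aware testing, check the file against the actual crawler tooling (Search Console, Yandex Webmaster) rather than the stdlib parser.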
What to block
Required:
- Administration panels (/admin/, /wp-admin/)
- API endpoints (/api/)
- Cart, checkout, account pages
- Search results pages
- Technical pages (login, register, password-reset)
Recommended:
- URLs with filtering and sorting parameters (duplicate content)
- Pagination pages (or allow them if no canonical)
- /print/, /pdf/ versions of pages
Do not block:
- CSS and JS files: Google must fetch them to render pages correctly
- Images (if you want them indexed in Google Images)
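If assets happen to live under a blocked directory, Google's pattern syntax (* matches any sequence, $ anchors the end of the URL) lets you re-allow them explicitly. A sketch, with an illustrative directory name:

```
User-agent: Googlebot
Allow: /*.css$
Allow: /*.js$
Disallow: /assets/internal/
```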
Directives for Yandex
Yandex supports extended syntax:
User-agent: Yandex
Disallow: /search?
Disallow: /*?utm_
Clean-param: utm_source&utm_medium&utm_campaign&utm_content&utm_term
Clean-param tells Yandex which GET parameters do not change page content, so URLs differing only in those parameters are treated as one page rather than duplicates.
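The effect of Clean-param can be pictured as URL normalization: strip the listed parameters and crawl the remaining canonical URL. A minimal Python sketch (the function name and parameter set are illustrative):

```python
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

# Parameters that don't change page content (mirrors the Clean-param list above).
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "utm_content", "utm_term"}

def normalize(url: str) -> str:
    """Drop tracking parameters, keeping the rest of the query string intact."""
    parts = urlparse(url)
    query = [(k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING_PARAMS]
    return urlunparse(parts._replace(query=urlencode(query)))

print(normalize("https://example.ru/page?utm_source=ads&id=5"))
# https://example.ru/page?id=5
```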
Dynamic robots.txt in Laravel
routes/web.php:

Route::get('/robots.txt', function () {
    $content = view('robots')->render();

    return response($content, 200, ['Content-Type' => 'text/plain']);
});

resources/views/robots.blade.php:

User-agent: *
@if (app()->environment('production'))
Disallow: /admin/
Disallow: /api/
Allow: /
Sitemap: {{ url('/sitemap.xml') }}
@else
Disallow: /
@endif

Remove the static public/robots.txt that ships with Laravel; otherwise the web server serves it directly and this route never runs.
On staging and dev environments, block everything to prevent search engines from indexing the test site.
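The same environment switch works in any framework. A framework-agnostic Python sketch (function name and default sitemap URL are illustrative):

```python
def robots_txt(env: str, sitemap_url: str = "https://example.ru/sitemap.xml") -> str:
    """Permissive rules in production; block everything elsewhere."""
    if env == "production":
        return (
            "User-agent: *\n"
            "Disallow: /admin/\n"
            "Disallow: /api/\n"
            "Allow: /\n"
            f"Sitemap: {sitemap_url}\n"
        )
    # staging/dev: keep the whole test site out of search engines
    return "User-agent: *\nDisallow: /\n"

print(robots_txt("staging"))
```

Serve the result with Content-Type: text/plain, as in the Laravel route above.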
Verification
- Google Search Console → robots.txt report (the standalone Tester tool has been retired)
- curl https://example.ru/robots.txt to verify the file is served with status 200 and Content-Type: text/plain
- Ensure the file sits strictly in the domain root (https://example.ru/robots.txt, not /en/robots.txt)
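The root-location rule means the robots.txt URL depends only on scheme and host. A small helper to derive it from any page URL (the function name is illustrative):

```python
from urllib.parse import urlparse

def robots_url(page_url: str) -> str:
    """robots.txt always lives at the root of the scheme://host pair."""
    parts = urlparse(page_url)
    return f"{parts.scheme}://{parts.netloc}/robots.txt"

print(robots_url("https://example.ru/en/catalog?page=2"))
# https://example.ru/robots.txt
```

Note that each host gets its own file: https://shop.example.ru/ is governed by its own robots.txt, not the main domain's.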
Setup time: a few hours.







