What is Crawling?
Definition
Crawling is the process by which search engine bots systematically browse the web, following links to discover new and updated pages for potential inclusion in search results.
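As a rough illustration of the link-following step, here is a minimal sketch of a breadth-first crawl over a tiny in-memory link graph. The `LINKS` dictionary, the example.com URLs, and the `crawl` function are hypothetical stand-ins. A real crawler would fetch each page over HTTP and extract links from the HTML, and nothing here reflects how any particular search engine schedules its crawl.

```python
from collections import deque

# Hypothetical site: each URL maps to the links found on that page.
LINKS = {
    "https://example.com/": ["https://example.com/about", "https://example.com/blog"],
    "https://example.com/about": ["https://example.com/"],
    "https://example.com/blog": ["https://example.com/blog/post-1"],
    "https://example.com/blog/post-1": [],
    "https://example.com/orphan": [],  # no other page links to this one
}


def crawl(start, links):
    """Return URLs in the order a breadth-first crawl discovers them."""
    seen = {start}
    queue = deque([start])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        # Follow every link on the page, skipping URLs already queued or visited.
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


discovered = crawl("https://example.com/", LINKS)
```

In this sketch the `/orphan` page never appears in `discovered`, because no crawled page links to it. That is the same reason unlinked pages tend to go unnoticed when a crawler relies on links alone.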
Why crawling matters
Crawling matters because content that isn't crawled cannot be indexed or ranked. If search engine bots can't discover and access your pages, those pages are invisible to search users regardless of their quality.
Search engines allocate limited crawl resources, so making your site easy to crawl helps important pages get discovered. Technical barriers like broken links, slow page loads, or poor site architecture can prevent valuable content from being found.
Understanding crawling helps identify and fix issues that block search engines, ensuring your content has the opportunity to compete in search results.
Key concepts and types
- Crawl budget: The number of pages search engines will crawl on your site within a given timeframe.
- Crawl frequency: How often search engine bots return to check for new or updated content.
- Robots.txt: A file that provides instructions to crawlers about which pages they can or cannot access.
- XML sitemap: A file listing important pages to help crawlers discover content efficiently.
- Crawl errors: Issues that prevent bots from accessing pages, reported in search console tools.
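To make the robots.txt concept concrete, the following sketch uses Python's standard-library `urllib.robotparser` to check whether a crawler may fetch a URL. The robots.txt content, the `ExampleBot` user agent, and the example.com URLs are assumptions made up for illustration. Real search engines apply their own parsers, which may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an example site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
# Parse the rules from memory instead of fetching them over the network.
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a made-up crawler name; it falls under the "*" rules.
can_fetch_home = parser.can_fetch("ExampleBot", "https://example.com/")
can_fetch_admin = parser.can_fetch("ExampleBot", "https://example.com/admin/settings")

# Sitemap URLs declared in the file (site_maps() requires Python 3.8+).
sitemaps = parser.site_maps()
```

Here `can_fetch_home` is `True` and `can_fetch_admin` is `False`. The check only answers whether a crawler is allowed to fetch a URL. It does not decide whether that URL can appear in search results.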
Common misconceptions
- All crawled pages will be indexed
- Submitting a sitemap guarantees crawling
- Robots.txt blocks pages from appearing in search results
- Every site needs to worry about crawl budget
- Crawling happens in real-time
FAQs
How can you check if a page has been crawled?
Use Google Search Console's URL Inspection tool to see when Google last crawled a page and whether it encountered any issues during crawling.
How can you improve crawl efficiency?
Maintain a clean site architecture, fix broken links, improve page speed, submit XML sitemaps, use internal linking to connect pages, and block unnecessary pages from being crawled.
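For the sitemap step, a minimal sketch of generating sitemap XML with Python's standard library might look like the following. The `build_sitemap` helper and the example.com URLs are hypothetical. The namespace is the one defined by the sitemaps.org protocol, and optional fields such as `lastmod` are left out to keep the example short.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        # Each page gets a <url> entry containing a <loc> element.
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_sitemap([
    "https://example.com/",
    "https://example.com/blog/post-1",
])
```

The resulting string could be saved as `sitemap.xml` and submitted through a search console tool. As noted under the misconceptions above, submitting it helps discovery but does not guarantee crawling.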
Does crawl budget matter for most sites?
For most small to medium sites with under 10,000 pages, crawl budget isn't a concern. It becomes important for very large sites or sites with significant technical issues.