What is crawling?
Crawling (or spidering) is when Google or another search engine sends a bot to a web page or post and "reads" it. Do not confuse this with indexing the page. Crawling is the first step toward a search engine recognizing your page and displaying it in the search results. However, having your page crawled does not necessarily mean that it has been indexed and will be found.
Pages are crawled for a variety of reasons, including:
- An XML sitemap containing the URL was submitted to Google.
- Internal links point to the page.
- External links point to the page.
- The site sees an increase in traffic.
To help ensure that your page is crawled, you should submit an XML sitemap in Google Search Console (formerly Google Webmaster Tools) to give Google a roadmap of all your new content.
In Google Search Console you can see what has been submitted and what has been indexed.
Crawling means that Google looks at the page. Depending on whether Google thinks that the content is "new" or otherwise has something to "give to the Internet", it may schedule the page for indexing, which means that the page has the opportunity to rank.
Also, when Google crawls a page, it looks at the links on that page and schedules Googlebot to check out those pages as well. The exception is a link marked with the nofollow attribute, which Google generally does not follow.
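To make the link-following step a bit more concrete, here is a minimal sketch of how a crawler might separate followable links from nofollow links. It uses Python's standard `html.parser`; the class name `LinkCollector` and the sample HTML are made up for illustration, and Google's real crawler is of course far more involved.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects link targets, split into followable and nofollow links."""

    def __init__(self):
        super().__init__()
        self.follow = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # rel can hold several space-separated values, e.g. "nofollow sponsored"
        rel_values = (attr.get("rel") or "").lower().split()
        if "nofollow" in rel_values:
            self.nofollow.append(href)
        else:
            self.follow.append(href)


# Hypothetical page snippet
html = (
    '<a href="/about">About</a> '
    '<a href="https://example.org/ad" rel="nofollow sponsored">Ad</a>'
)
collector = LinkCollector()
collector.feed(html)
print(collector.follow)    # ['/about']
print(collector.nofollow)  # ['https://example.org/ad']
```

Only the links in `follow` would be queued for a later visit in this simplified picture.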
How can I improve the crawlability of my website?
Think of Google's crawlers as little spider robots that "crawl" into your website and look around. The easier you make it for these little guys, the better for you. Here are a few tips to make your website as inviting as possible for the Google spiders.
1. XML sitemap
First, you need a good map of your home, that is, your website. An XML sitemap shows Google exactly which pages you have and where to find them. It's like giving your visitor a map of your giant maze so they don't get lost.
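As a rough illustration of what such a map looks like, the sketch below builds a very small sitemap with Python's standard `xml.etree.ElementTree`. The helper `build_sitemap` and the example URLs are assumptions for this example; real sitemaps are usually generated by your CMS or an SEO plugin.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string that lists the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages of an example site
pages = [
    "https://example.com/",
    "https://example.com/blog/what-is-crawling",
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The resulting string would be saved as a file such as `sitemap.xml` on your server (with an XML declaration on top) and then submitted in Google Search Console.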
2. flat hierarchy
Try to keep the structure of your website as simple as possible. No deeply nested pages that take seven clicks to reach. That would be like sending someone through seven different doors just to find the bathroom.
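If you want to count the doors yourself, a breadth-first search over your internal links gives the click depth of every page. This is only a sketch: the `site` dictionary is a made-up link map, and the threshold of two clicks is an arbitrary assumption, not a Google rule.

```python
from collections import deque


def click_depths(links, start):
    """Return the minimum number of clicks from `start` to each reachable page."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site: each page maps to the pages it links to
site = {
    "/": ["/blog", "/shop"],
    "/blog": ["/blog/crawling"],
    "/shop": ["/shop/shoes"],
    "/shop/shoes": ["/shop/shoes/red"],
}
depths = click_depths(site, "/")
deep_pages = [page for page, depth in depths.items() if depth > 2]
print(deep_pages)  # ['/shop/shoes/red']
```

Pages that show up in `deep_pages` are candidates for a direct link from a higher-level page.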
3. fast loading times
4. responsive design
Your site should look good on all devices and be easy to use. The Google spiders also check how mobile-friendly your site is. Imagine your home also has a miniature version for little guests. They should feel just as comfortable as the "full-size" visitors.
5. internal linking
Make sure you have smart internal links that guide crawlers through your site. But beware of endless loops and broken links. That would be like a door in your house that leads nowhere or takes you in a circle.
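A quick way to find doors that lead nowhere is to compare your internal links with the list of pages that actually exist. The sketch below assumes you already have both as plain Python data (for example from a crawl export); the function name `audit_internal_links` and the sample data are invented for illustration.

```python
def audit_internal_links(links, all_pages, home="/"):
    """Return (broken, orphans).

    broken  -- (source, target) pairs whose target page does not exist
    orphans -- existing pages that no other page links to
    """
    broken = sorted(
        (source, target)
        for source, targets in links.items()
        for target in targets
        if target not in all_pages
    )
    linked = {target for targets in links.values() for target in targets}
    orphans = sorted(
        page for page in all_pages if page not in linked and page != home
    )
    return broken, orphans


# Hypothetical site data
all_pages = {"/", "/blog", "/contact", "/old-landing-page"}
links = {
    "/": ["/blog", "/contact"],
    "/blog": ["/", "/pricing"],  # "/pricing" does not exist
}
broken, orphans = audit_internal_links(links, all_pages)
print(broken)   # [('/blog', '/pricing')]
print(orphans)  # ['/old-landing-page']
```

Broken links should be fixed or removed, and orphaned pages should get at least one internal link if you want crawlers to find them.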
6. robots.txt
With a robots.txt file, you can tell the crawlers which areas to avoid. It's like a "do not disturb" sign on the door of your private room.
7. avoid error pages
404 error pages are like dead ends for crawlers. So try to minimize these errors or replace them with 301 redirects. It's like setting up a detour when the main road is closed.
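Detours only help if they actually end somewhere. The following sketch follows a redirect map until it reaches the final page and complains about loops or overly long chains. The `redirects` dictionary, the function name, and the limit of five hops are assumptions made for this example.

```python
def resolve_redirect(path, redirects, max_hops=5):
    """Follow 301-style redirects and return the final path.

    Raises ValueError on a redirect loop or a chain longer than max_hops.
    """
    seen = [path]
    while path in redirects:
        path = redirects[path]
        if path in seen:
            raise ValueError("redirect loop: " + " -> ".join(seen + [path]))
        seen.append(path)
        if len(seen) - 1 > max_hops:
            raise ValueError("redirect chain too long: " + " -> ".join(seen))
    return path


# Hypothetical redirect map: old path -> new path
redirects = {
    "/old-page": "/new-page",
    "/new-page": "/final-page",
}
print(resolve_redirect("/old-page", redirects))  # /final-page
print(resolve_redirect("/contact", redirects))   # /contact
```

In this example it would be cleaner to point `/old-page` straight at `/final-page`, so that crawlers need one hop instead of two.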
Typical mistakes you should avoid:
Before you roll out the red carpet for Google crawlers, let's talk briefly about the stumbling blocks you're better off avoiding.
1. clogged Robots.txt
Your Robots.txt is like the bouncer of your club. If it's too strict, no one will get in. So, check your Robots.txt and make sure you're not accidentally blocking important areas of your site.
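You can check your bouncer before opening night. The sketch below uses Python's standard `urllib.robotparser` to test a few important URLs against a set of rules. The rules and URLs are made-up examples, and Python's parser may not match Google's own robots.txt handling in every detail (wildcard patterns, for instance), so treat the result as a first check only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that accidentally blocks the whole blog
robots_txt = """\
User-agent: *
Disallow: /admin/
Disallow: /blog/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Pages you definitely want crawled (example URLs)
important_urls = [
    "https://example.com/",
    "https://example.com/blog/what-is-crawling",
    "https://example.com/shop/",
]

blocked = [
    url for url in important_urls
    if not parser.can_fetch("Googlebot", url)
]
print(blocked)  # ['https://example.com/blog/what-is-crawling']
```

Any URL that appears in `blocked` is turned away at the door, so the matching `Disallow` rule deserves a second look.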
2. too many 404 errors
Imagine inviting guests to your home and half the doors are locked or lead to nowhere. Not cool, right? Too many 404 errors can irritate crawlers and waste your crawl budget.
3. poor internal linking
It's like you're stuck in a maze with no signposts. The crawlers need a clear structure to find their way around. So link relevant pages to each other, but don't overdo it.
4. slow loading times
Slow websites are like restaurants where the food takes forever. Eventually, you lose patience. Google does, too. So, optimize loading times wherever you can.
5. endless URLs and parameters
Imagine a URL like a street address. If the address is forever long and full of weird characters, it will be hard to find. Keep the URLs clean and simple.
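One common cleanup is stripping tracking and session parameters so that the same page does not appear under many addresses. Here is a minimal sketch with Python's standard `urllib.parse`; the list in `TRACKING_PARAMS` is an assumption and should be adapted to the parameters your own site actually uses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of parameters that do not change the page content
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def clean_url(url):
    """Drop tracking parameters and the fragment, and sort what remains."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    kept.sort()
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


messy = "https://Example.com/shoes?utm_source=news&color=red&sessionid=abc123#top"
print(clean_url(messy))  # https://example.com/shoes?color=red
```

The cleaned form is the address you would want to use in internal links and in your sitemap.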
6. cloaking
Cloaking is like a trap: you show one version of the content to crawlers and another to users. Google hates that and you could be penalized for it. So, just don't do it.
7. duplicate content
That's like hanging the same painting in every room of your house. Google then doesn't know which is the "original" and might devalue all versions.
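To spot identical paintings, you can compare a fingerprint of each page's text. The sketch below groups pages whose text is the same after lowercasing and collapsing whitespace. It only catches exact copies, the `pages` data is invented, and this is in no way how Google itself detects duplicates.

```python
import hashlib
from collections import defaultdict


def find_duplicates(pages):
    """Group URLs whose normalized text is identical."""
    groups = defaultdict(list)
    for url, text in pages.items():
        # Normalize: lowercase and collapse all whitespace runs
        normalized = " ".join(text.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical page texts keyed by URL
pages = {
    "/shoes": "Red running shoes, size 42.",
    "/shoes?ref=home": "Red  running shoes,\nsize 42.",
    "/about": "About our shop.",
}
print(find_duplicates(pages))  # [['/shoes', '/shoes?ref=home']]
```

Each group that comes back is a set of URLs where you should decide which one is the "original".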
8. poor mobile optimization
If your site looks bad or loads slowly on mobile, it's like a restaurant that doesn't have room for strollers. Google wants the best experience for all users, so optimize for mobile.
Crawlability is like hospitality in a hotel. You want your guests (and Google crawlers) to feel comfortable, find their way around, and be happy to come back. Make it easy for them, and you'll be rewarded in search results.