Improve Website Traffic: Test Crawlability & Indexability | SEO Basics

Crawlability and Indexability
You’ve built a great website and made Search Engine Optimization the focus of your efforts. You created unique content, courted high-quality backlinks, and used terms suited to organic keyword searches. Now your website needs to be indexed by Google. Today, high-quality content and backlinks aren’t enough for search rankings on their own. You’ll also need to deal with two key factors that shape your SEO: crawlability and indexability.


How Does Google Crawl Websites?

Google has three core components: the crawler, the index, and the algorithm. Note, the crawler is also known as a spider or bot. Google’s crawler is named Googlebot, and it identifies itself to your server with a Googlebot user agent. The crawler’s job is to discover new web content by following links. It follows each link, then follows the links on newly discovered web pages, and so on. The crawler then brings new content back to Google’s database for cataloging, referred to as indexing.
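
To make the link-following idea concrete, here’s a minimal sketch in Python. It is not how Googlebot actually works; it simply walks a small, hypothetical site (the `LINKS` dictionary and every URL in it are made up for illustration) and records pages in the order a link-following crawler would discover them.

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
LINKS = {
    "/": ["/blog/", "/services/"],
    "/blog/": ["/blog/seo-basics/", "/"],
    "/services/": ["/contact/"],
    "/blog/seo-basics/": ["/services/"],
    "/contact/": [],
    "/old-landing-page/": ["/"],  # exists, but nothing links to it
}

def crawl(start, links):
    """Return pages in the order a link-following crawler discovers them."""
    discovered = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:  # only follow links to pages not seen yet
                seen.add(target)
                discovered.append(target)
                queue.append(target)
    return discovered

print(crawl("/", LINKS))
# ['/', '/blog/', '/services/', '/blog/seo-basics/', '/contact/']
```

Notice that `/old-landing-page/` never shows up in the result. No page links to it, so a crawler that starts at the homepage has no path to reach it.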

What Does Crawlability Mean?

Crawlability is the ease with which Google can crawl your website content and discover all site links and their destination pages without encountering dead ends. We’d like to assume that all your links lead to their destinations and your site is easily crawlable. But if the spider has trouble crawling your site, you risk a low ranking.

What Does Indexability Mean?

Indexability is the ability of Google to catalog your website content properly against relevant keyword search terms. Google can only index what it has crawled, so the crawling of your site directly affects how the search engine indexes your pages.

How to Influence Crawlability and Indexability

Crawlability is influenced by the technical and structural aspects of your website environment. Google may stop crawling if it discovers broken links, technical issues or an inefficient site layout. The goal is to create an efficient site that makes it easy for the spiders to crawl your pages. Efficient crawling starts with good internal linking, a sound site structure and no crawl errors.

Internal Linking

Your website should be a well-connected network of web pages that link to one another. Every web page needs to be reachable via a hyperlink; otherwise, you risk undiscoverable pages. Note, a crawler can’t index pages it doesn’t find.
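
One way to spot undiscoverable pages is to look for "orphans": pages that exist but receive no internal links. The sketch below assumes you already have a map of your pages and their outgoing links. The `PAGES` data and the `find_orphans` helper are hypothetical names used only for illustration.

```python
# Hypothetical site map: page -> internal links found on that page.
PAGES = {
    "/": ["/about/", "/blog/"],
    "/about/": ["/"],
    "/blog/": ["/blog/post-1/"],
    "/blog/post-1/": ["/blog/"],
    "/blog/post-2/": ["/blog/"],  # links out, but nothing links in
    "/thank-you/": [],            # no links in or out
}

def find_orphans(links, home="/"):
    """Return pages (other than the homepage) that no other page links to."""
    linked_to = {
        target
        for page, targets in links.items()
        for target in targets
        if target != page  # a page linking to itself doesn't count
    }
    return sorted(p for p in links if p != home and p not in linked_to)

print(find_orphans(PAGES))
# ['/blog/post-2/', '/thank-you/']
```

Each orphan needs at least one link from a page the crawler can already reach.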

Site Structure

As I mentioned, your site should be interconnected with relevant page-to-page links. The crawler (or user) should be able to reach any page within 1-2 clicks (3 max). Pages nested too deep make for a poor site structure. To help Google understand your content, the site structure must have link relationships around core topics that link to related sub-topics. Then, to help Google further, you’ll have sub-topics link back to the core topic, showing content priority. Failure to create link relationships leads to undiscoverable pages and content the crawlers have difficulty indexing (classifying).
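
The click-depth rule of thumb is easy to check once you have a link map. This is a rough sketch under the assumption that the homepage is the starting point. The `SITE` data and the `click_depths` helper are hypothetical, and the 3-click limit comes from the guideline above.

```python
from collections import deque

# Hypothetical link map: page -> pages it links to.
SITE = {
    "/": ["/shoes/", "/blog/"],
    "/shoes/": ["/shoes/running/"],
    "/shoes/running/": ["/shoes/running/trail/"],
    "/shoes/running/trail/": ["/shoes/running/trail/waterproof/"],
    "/shoes/running/trail/waterproof/": [],
    "/blog/": [],
}

def click_depths(links, home="/"):
    """Return the minimum number of clicks from the homepage to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

too_deep = [page for page, depth in click_depths(SITE).items() if depth > 3]
print(too_deep)
# ['/shoes/running/trail/waterproof/']
```

A page flagged here is a candidate for a direct link from a core topic page, which shortens its path from the homepage.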

Fix Crawl Errors

If the search engine follows a link but cannot access the page while crawling, there won’t be anything new for indexing. Your web server may return errors such as 500, 502 or 503 (server errors) or 404 (page not found). Note, crawl errors will show up in Google Search Console (formerly Google Webmaster Tools), so you’ll want to fix them right away.
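
If you export a list of URLs and their HTTP status codes (from server logs or a crawl report), a few lines of Python can sort the errors into groups. This is only a sketch. The `CRAWL_LOG` entries are invented, and `classify` is a hypothetical helper that follows the standard HTTP status code ranges.

```python
# Hypothetical crawl results: (URL path, HTTP status code).
CRAWL_LOG = [
    ("/", 200),
    ("/blog/", 200),
    ("/old-page/", 404),
    ("/checkout/", 500),
    ("/api/report/", 503),
]

def classify(status):
    """Map an HTTP status code to a broad category."""
    if 200 <= status < 300:
        return "ok"
    if 300 <= status < 400:
        return "redirect"
    if 400 <= status < 500:
        return "client error"  # e.g. 404: the page is missing
    if 500 <= status < 600:
        return "server error"  # e.g. 500, 502, 503: the server failed
    return "unknown"

def crawl_errors(log):
    """Return only the entries a crawler could not access."""
    return [(url, status, classify(status)) for url, status in log if status >= 400]

for url, status, kind in crawl_errors(CRAWL_LOG):
    print(f"{status} {kind}: {url}")
```

Client errors usually point to a missing or moved page, while server errors point to a problem with hosting or application code.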

Fix Broken Links

Broken links often occur when you move or rename web pages. Unless you run a sitewide search and replace to update every reference, you may unintentionally leave a dead-end link pointing at the old URL. The crawler can’t reach a page through a broken link.
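
Here’s a small sketch of how a broken internal link can be detected. It assumes a hypothetical site where `/pricing/` was renamed to `/plans/` but one page still links to the old URL. `SITE_LINKS` and `find_broken_links` are illustrative names, not part of any tool.

```python
# Hypothetical link map after renaming /pricing/ to /plans/.
SITE_LINKS = {
    "/": ["/plans/", "/about/"],
    "/about/": ["/pricing/"],  # still points at the old URL
    "/plans/": ["/"],
}

def find_broken_links(links):
    """Return (source page, target) pairs where the target page doesn't exist."""
    existing = set(links)
    return [
        (source, target)
        for source, targets in links.items()
        for target in targets
        if target not in existing
    ]

print(find_broken_links(SITE_LINKS))
# [('/about/', '/pricing/')]
```

The source page in each pair tells you where to update the link.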

Fix Redirect Loops

A redirect is a server rule that sends a user to Page B when Page A is requested. A redirect loop happens when multiple redirect rules conflict with one another. For example, redirect 1 says Page A points to Page B, and redirect 2 says Page B points to Page A. Similar to a broken link, a redirect loop may occur when moving content or renaming page URLs. The crawler can access neither Page A nor Page B.
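
The Page A / Page B loop can be caught by following each redirect and watching for a URL that repeats. This is a minimal sketch, assuming the redirect rules are available as a simple mapping. The `REDIRECTS` data, the `follow_redirects` helper and the `max_hops` limit of 10 are all assumptions made for illustration.

```python
# Hypothetical redirect rules: requested URL -> URL the server redirects to.
REDIRECTS = {
    "/page-a/": "/page-b/",
    "/page-b/": "/page-a/",    # conflicts with the rule above
    "/old-post/": "/new-post/",
}

def follow_redirects(url, redirects, max_hops=10):
    """Follow redirects from url and report 'ok', 'loop' or 'too many hops'."""
    chain = [url]
    seen = {url}
    while url in redirects:
        url = redirects[url]
        chain.append(url)
        if url in seen:  # we've been here before: a loop
            return chain, "loop"
        seen.add(url)
        if len(chain) - 1 >= max_hops:
            return chain, "too many hops"
    return chain, "ok"

print(follow_redirects("/page-a/", REDIRECTS))
# (['/page-a/', '/page-b/', '/page-a/'], 'loop')
print(follow_redirects("/old-post/", REDIRECTS))
# (['/old-post/', '/new-post/'], 'ok')
```

The returned chain shows which rules are involved, so you know which redirect to remove or repoint.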

Forms and Scripts

Crawlers can’t access content restricted behind a form. Content that sits behind a login form, or gated content that requires an email address before it displays, stays invisible to the crawler. Outdated technology or third-party scripts can also restrict the crawler and prevent indexing.

How to Improve Crawling and Indexing

I’ve mentioned that linking, structure and errors influence the crawling and indexing of your site. Now, let’s discuss how we can improve the efficiency of the crawl and create a healthy environment for indexing.

Crawler Access with Robots.txt

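Robots.txt is a utility file that lives at the root of your site and is read by Google before it crawls. It contains allow/disallow instructions that tell crawlers which parts of the site they may crawl, which helps crawl efficiency and indexing accuracy. Note, robots meta tags can also deliver per-page instructions, such as asking search engines not to index a page.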
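
To see how allow/disallow rules play out, you can test a robots.txt file with Python’s built-in `urllib.robotparser` module. The rules below are a hypothetical example, not a recommendation for your site. The blocked `/cart/` and `/search/` paths and the sitemap URL are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for mysite.com.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /search/

Sitemap: https://mysite.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse the rules without fetching anything

# Ask whether a crawler identifying as Googlebot may fetch each URL.
product_allowed = parser.can_fetch("Googlebot", "https://mysite.com/sneakers/nike-air-max/")
cart_allowed = parser.can_fetch("Googlebot", "https://mysite.com/cart/")

print(product_allowed)  # True
print(cart_allowed)     # False
```

Running a check like this before you publish a robots.txt change helps you avoid blocking pages you want crawled.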

Page Load Speed

Faster-loading websites yield a better user experience and improve the bots’ crawl rate. But note that an increased crawl rate doesn’t always mean better indexed search results. Google considers over 200 factors when determining your search engine rankings.

Sitemap.xml

A sitemap lists the important web pages of your site and tells the search engines about your content. The sitemap also gives valuable metadata, like when each page was last updated. A few ways to keep sitemaps organized and crawlable:
  • Update XML sitemap regularly *
  • Eliminate duplicate pages ***
  • Redirect pages properly (when deleting or renaming) **
  • Ensure canonicalized pages *
  • Use consistent, search engine friendly URLs
  • Use a UTF-8 encoded sitemap *
  • Check for sitemap errors regularly ***
* Feature of Yoast SEO free plugin ** Feature of Yoast SEO premium plugin *** Google Search Console
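
If you’re curious what a plugin produces behind the scenes, here’s a rough sketch of a UTF-8 encoded XML sitemap built with Python’s standard library. The page URLs and last-modified dates are made up for illustration, and `build_sitemap` is a hypothetical helper. Only the `urlset`, `url`, `loc` and `lastmod` elements and the namespace come from the sitemap protocol. The `xml_declaration` argument requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical pages: (URL, date of last update).
PAGES = [
    ("https://mysite.com/", "2024-01-15"),
    ("https://mysite.com/sneakers/nike-air-max/", "2024-01-10"),
]

def build_sitemap(pages):
    """Return a UTF-8 encoded XML sitemap (as bytes) for the given pages."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)

sitemap_xml = build_sitemap(PAGES)
print(sitemap_xml.decode("utf-8"))
```

Each `url` entry carries the page address and its last update, which is the metadata search engines read from the sitemap.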

Content Quality

Nothing attracts search engines more than authoritative, high-quality content. But, not all content is created equal. Remember that it must meet the organic keyword litmus test and provide something of value to the consumer.

Prevent Duplicate Content

Duplicate content is the same content found on multiple URLs of your website. It can occur on any site, especially ecommerce stores and blogs. A few examples:
  • mysite.com/nike-air-max/ and mysite.com/sneakers/nike-air-max/
  • mysite.com/my-cool-blog-post/ and mysite.com/my-category/my-cool-blog-post/
Google doesn’t know which URL you prefer, which may impact crawling and indexing. It’s easily fixed with the rel="canonical" tag, which tells Google your primary URL.
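
To confirm that a page declares its primary URL, you can read the canonical tag out of its HTML. The sketch below uses Python’s built-in `html.parser` module. The sample HTML is hypothetical, and `CanonicalFinder` and `find_canonical` are illustrative names. It assumes both duplicate URLs from the sneaker example serve this same markup.

```python
from html.parser import HTMLParser

# Hypothetical HTML served at both duplicate URLs.
PAGE_HTML = """
<html><head>
<title>Nike Air Max</title>
<link rel="canonical" href="https://mysite.com/sneakers/nike-air-max/">
</head><body><h1>Nike Air Max</h1></body></html>
"""

class CanonicalFinder(HTMLParser):
    """Remember the href of the last <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and (attributes.get("rel") or "").lower() == "canonical":
            self.canonical = attributes.get("href")

def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there isn't one."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical

print(find_canonical(PAGE_HTML))
# https://mysite.com/sneakers/nike-air-max/
```

If every duplicate URL reports the same canonical address, Google has a clear signal about which version to treat as the primary one.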

Crawlability Testing and Index Monitoring Tools

Google SEO Tools

Use Google’s go-to list for SEO tools. Here are popular ones:
  • PageSpeed Insights analyzes pages and suggests speed and usability improvements.
  • Mobile-Friendly Test is another great tool to show mobile performance.
  • Google Search Console provides rich insights into your site’s crawlability and indexing. It’s the place to submit your XML sitemap, examine structured data and much more.

SEO Site Audit

It’s fair to say the SEMRush Site Audit is our SEO Swiss Army Knife. This tool gives you a comprehensive overview of your site’s overall health. It runs 20 different checks that focus on how well your site can be crawled and indexed. We use this tool to run automatic weekly audits so we’re always optimizing.

Final Thoughts

Of course, no tool will make any difference if you don’t follow through on its suggestions. Remember to factor both on-page and off-page elements into your strategy. Follow these technical SEO tips and you’ll see improved website crawlability.

Eric Steiner

Eric Steiner graduated with an MFA in professional and creative writing from Western Connecticut State University in 2014. He's worked on a number of professional writing projects with clients such as Pearson Education, WatchMojo.com, and Michael Mailer Films. Giving brands a voice is his passion.
