An important part of running a website during an SEO campaign is telling search engine crawlers where to go and what to do there. Although the mechanism for doing so, a text file called robots.txt, has been in use for decades, there has been no formal standard detailing which directives crawlers will respect and which they won’t. This is partly because search engines don’t all respect the same directives, and partly because no one had compiled an official list. Google has now announced that it intends to establish a standard.
Consequently, Google has announced that it will no longer support unofficial directives placed in a website’s robots.txt file. These include widely used directives such as noindex and nofollow. Noindex prevents the crawler from listing the page as a search result, though the crawler will still read the page to find the sites it links to. It’s useful for login pages, internal search pages, and other pages that don’t provide helpful content. Nofollow tells the crawler not to follow the links on the page. This is useful if there’s a risk the page contains untrusted content, as a comment section might.
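To see whether a site is affected, you can scan its robots.txt for the directives named above. The following is a minimal Python sketch, not an official tool: the function name, the sample file, and the choice to check only for noindex and nofollow are assumptions made for illustration, and the set can be extended with any other rule you want to flag.

```python
from __future__ import annotations

# Rules this article names as no longer supported in robots.txt.
# Assumption: extend this set with any other rule you want to flag.
UNSUPPORTED = {"noindex", "nofollow"}


def find_unsupported_rules(robots_txt: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for rules in the UNSUPPORTED set."""
    hits = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue  # blank line or something that is not a rule
        field = line.split(":", 1)[0].strip().lower()
        if field in UNSUPPORTED:
            hits.append((number, line))
    return hits


# Hypothetical robots.txt, for illustration only.
SAMPLE = """\
User-agent: *
Disallow: /admin/
Noindex: /login/
Nofollow: /comments/
"""

if __name__ == "__main__":
    for number, line in find_unsupported_rules(SAMPLE):
        print(f"line {number}: {line}")
```

Any line this reports is a candidate for moving to one of the supported methods.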
Not a Major Issue
However, this isn’t the crisis it may appear to be. For years, Google has recommended that website owners not use these unofficial directives. Even before this announcement, there were situations in which Google’s crawler wouldn’t respect the noindex directive in robots.txt, and Bing’s crawler never respected it in the first place. The standard Google has proposed to the Internet Engineering Task Force simply formalizes existing practice. There are other ways to convey directives such as noindex and nofollow, and Google presented five as part of its announcement:
- Include the directives as robots meta tags in the page’s HTML, or send them in the HTTP response headers.
- Return a 404 or 410 HTTP status code. Both mean that the page does not exist, so the URL is dropped from the index once it has been crawled and processed.
- Put the page behind a password. Unless markup identifies it to Google as subscription or paywalled content, a page hidden behind a login will generally be removed from the index.
- Use disallow in robots.txt. Disallow is an official directive, so it will continue to work. A page the crawler is not allowed to fetch usually won’t have its content indexed or its links followed, although the bare URL can still be indexed if other pages link to it.
- Use the URL removal tool in Google Search Console, which removes a URL from search results temporarily.
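As an illustration of the first option, here is a small Python sketch that reads the robots meta tag from a page’s HTML using the standard library’s html.parser module. The class name, the helper function, and the sample page are hypothetical. It only shows what the tag looks like and how to check for it, and says nothing about how any particular crawler parses pages.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # The content attribute is a comma-separated list of directives.
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives declared in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical login page, for illustration only.
PAGE = """<html><head>
<meta name="robots" content="noindex, nofollow">
<title>Login</title>
</head><body></body></html>"""

if __name__ == "__main__":
    print(sorted(robots_directives(PAGE)))
```

A check like this can be run over a site’s templates to confirm that the pages you meant to keep out of the index actually carry the tag.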
The Changes Are Useful
Far from being a crisis, this is a useful development. The details of the standardization make building a website simpler. Under the new system, the robots.txt file is meant for large-scale control of where crawlers look for content: it delineates the categories of pages you don’t want crawled. Noindex and nofollow can then be set on each page as you build it, which makes it easier to apply them to a broad range of pages as they are created. It’s easier to control crawler access to a specific page from the page itself than from a separate file you have to open with every change.
What Does All This Mean for Your Website?
So, what does this mean for your website? You’ll create a robots.txt file, or edit the default one if your platform provides it, and include disallow directives that cover the broad categories of pages you don’t want crawled. Then, as you create new pages or edit old ones, you can add or remove noindex and nofollow directives in their meta tags. Keep in mind that a crawler can only read a meta tag on a page it is allowed to fetch, so a page that relies on a noindex meta tag should not also be disallowed in robots.txt. And when something changes that concerns an entire category of pages, you can open your robots.txt file, which will be much shorter and easier to navigate, and edit the allowed and disallowed paths.
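To make this concrete, here is a minimal sketch of a robots.txt that disallows two broad categories of pages, checked with Python’s standard urllib.robotparser module. The paths and the helper function are hypothetical, and Python’s parser may not match Google’s crawler in every edge case, so treat it as a quick sanity check rather than a definitive test.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt covering two broad categories of pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
Disallow: /login/
"""


def is_crawlable(path: str, user_agent: str = "*") -> bool:
    """Return True if the sample rules allow user_agent to fetch path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/blog/robots-update", "/search/results", "/login/"):
        status = "allowed" if is_crawlable(path) else "disallowed"
        print(f"{path}: {status}")
```

Pages outside the disallowed categories stay crawlable, which is what lets a crawler see any noindex or nofollow meta tags you add to them individually.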
About iRISEmedia | SEO 2019 | Digital Marketing Agency | Toronto
At iRISEmedia.com, we are a digital marketing agency specializing in social media management, influencer marketing, online reputation management and online branding. Our team helps clients manage and grow their online presence and branding to increase qualified web traffic and online leads. We serve clients in Toronto and throughout Canada and the U.S. Give us a call or contact us if you want to discuss any of your social media marketing needs or to conduct a social media audit of your brand at no charge. We would be delighted to hear from you.