Google not indexing WebHelp

Please post all questions and comments regarding Help & Manual 7 here.

Moderators: Alexander Halser, Tim Green

Helen Abbott
Posts: 26
Joined: Mon Dec 17, 2012 3:53 pm

Google not indexing WebHelp

Hi there,

Google is currently not indexing my documentation website. I've had indexing problems on and off since putting the documentation on the web in February 2015. At one point only new topics weren't being indexed, but now no topics are being indexed at all.

I'm generating WebHelp using H&M 7.0.6 with a customized skin.

I followed the instructions in this guide: https://helpman.it-authoring.com/viewto ... 36c2c03528 but that did not resolve the problem.

Let me know what other info you need for troubleshooting. Thanks in advance for your help.

Helen
Tim Green
Site Admin
Posts: 23155
Joined: Mon Jun 24, 2002 9:11 am
Location: Bruehl, Germany

Re: Google not indexing WebHelp

Hi Helen,

If you've turned off the automatic page redirects, that should solve the most important issue that can prevent indexing by Google. The other thing that can happen is that Google may be classing a copy of your documentation as duplicate content. Do you have multiple copies of the help on your web server that are accessible to Google, for example a testing version? You don't get a direct penalty for this if it is all on your own site (you would if you were duplicating content from other sites!), but Google doesn't know which version to index. The result is that all the copies can get indexed much more slowly, sometimes so slowly that it looks as though they are not getting indexed at all.

Also check the robots meta tags in the templates of your skin. The TOC, keyword index and search templates should contain this:

<meta name="robots" content="noindex, nofollow" />

The topic page template(s) should either contain no meta tag or one that explicitly tells search engines to index them and follow links in them:

<meta name="robots" content="index, follow" />
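
If you want to verify the published pages rather than the templates, a small script can read the robots meta tag out of each generated HTML file. This is only a minimal sketch using Python's standard library; the function names `robots_directives` and `is_indexable` are made up for illustration and are not part of Help & Manual.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also arrive here.
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            parts = attr_map.get("content", "").split(",")
            self.directives.extend(p.strip().lower() for p in parts if p.strip())


def robots_directives(html_text):
    """Return the robots directives found in the page, e.g. ['noindex', 'nofollow']."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    parser.close()
    return parser.directives


def is_indexable(html_text):
    """A page with no robots meta tag counts as indexable by default."""
    directives = robots_directives(html_text)
    return "noindex" not in directives and "none" not in directives
```

Run over the published output, you would expect `is_indexable` to be false for the TOC, keyword index and search pages and true for the topic pages.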

You should also check whether there are any forgotten robots.txt files in action on your website that might be actively preventing search engines from indexing the folders where your documentation is stored.
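
One way to test a robots.txt file without waiting for Google is Python's standard `urllib.robotparser` module. The sketch below is an illustration only: the helper name `googlebot_allowed`, the sample rules and the example.com URLs are assumptions, so substitute the contents of your own robots.txt and the real address of your help folder.

```python
from urllib.robotparser import RobotFileParser


def googlebot_allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical example of a forgotten rule that blocks the whole help folder.
SAMPLE_RULES = """\
User-agent: *
Disallow: /help/
"""

HELP_URL = "https://www.example.com/help/index.html"  # placeholder address
blocked = not googlebot_allowed(SAMPLE_RULES, HELP_URL)
```

With the sample rules above, `blocked` is true, which is the situation that would stop the documentation from being indexed.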
Regards,
Tim (EC Software Documentation & User Support)

Private support:
Please do not email or PM me with private support requests -- post to the forum directly.
Helen Abbott
Posts: 26
Joined: Mon Dec 17, 2012 3:53 pm

Re: Google not indexing WebHelp

Hi Tim,

Looks like Google is indexing the site now, though it's also returning my TOC, index & search pages, so I've now added <meta name="robots" content="noindex, nofollow" /> to these templates and will republish.

We do in fact want to host two versions of our documentation on our web site: the current version and a previous version. Part of the indexing problem started when we posted the second version, so what you say about multiple versions makes sense. My IT admin removed the previous version temporarily until we figure out this problem.

For the previous version, what's your advice for setting up the templates? Should they all be set to noindex, nofollow? I'd rather Google returns results for the current version.

Thanks,
Helen
Tim Green
Site Admin
Posts: 23155
Joined: Mon Jun 24, 2002 9:11 am
Location: Bruehl, Germany

Re: Google not indexing WebHelp

Hi Helen,

I would recommend that you "canonicalize" your current version to give Google and other search engines a clear message that that is the primary one. First see this page on Google for an introduction to the subject:

https://support.google.com/webmasters/a ... 6359?hl=en

The first step is to make sure that all the pages in the version you don't want indexed have the noindex, nofollow robots meta tag. So yes, that is good to do. In addition to that, you want to label the pages in the current version as "canonical", i.e. the version that you really do want to be primary and have indexed.

To do that, add a "canonical" link to your topic pages by adding this tag to the topic page template in your skin (in the head section, together with the meta tags):

<link rel="canonical" href="http://YOUR PATH GOES HERE/<%HREF_CURRENT_PAGE%>" />

You want to do this to BOTH versions of your project. The old version gets the NEW URL, so that the canonical address is the same for both versions. This gives Googlebot a clear message that you are aware of both versions and only want the one version to be indexed. When it scans the old version it compares the content and knows that the search engine reference for the newer version is correct, because that's where the canonical reference points.
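
To make the "same address in both versions" point concrete, here is a minimal Python sketch of what the tag should look like once the template variable has been resolved. The helper `canonical_link`, the base URL and the file name are hypothetical placeholders, not values from your project.

```python
from html import escape
from urllib.parse import quote


def canonical_link(current_version_base_url, page_filename):
    """Build the canonical link tag that points at the CURRENT version of a page."""
    href = current_version_base_url.rstrip("/") + "/" + quote(page_filename)
    return '<link rel="canonical" href="{}" />'.format(escape(href, quote=True))


# Assumed location of the current version; the old version uses the same value.
CURRENT_BASE = "https://www.example.com/help/current/"

tag_in_new_version = canonical_link(CURRENT_BASE, "introduction.htm")
tag_in_old_version = canonical_link(CURRENT_BASE, "introduction.htm")
```

Both variables hold the identical tag, because the old version deliberately points at the new version's URL rather than its own.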
Helen Abbott
Posts: 26
Joined: Mon Dec 17, 2012 3:53 pm

Re: Google not indexing WebHelp

Thanks Tim! Very helpful info.