prevent search engines from indexing my help?
Moderators: Alexander Halser, Tim Green
-
- Posts: 38
- Joined: Fri Jun 15, 2012 9:44 pm
prevent search engines from indexing my help?
I do not wish to make it too easy for non-clients to see my beautiful new HTML Help pages.
How do I configure H&M so that my help pages PREVENT Google and other search engines from indexing them?
Thanks,
Kevin
Re: prevent search engines from indexing my help?
For a start: put a robots.txt file into the root of your site.
http://en.wikipedia.org/wiki/Robots_exclusion_standard
http://www.mcanerin.com/en/search-engine/robots-txt.asp
Others may have better ideas.
-
- Posts: 38
- Joined: Fri Jun 15, 2012 9:44 pm
Re: prevent search engines from indexing my help?
Yes, adding robots.txt would work fine if I'm creating a website on my own.
But I use a task in H&M to create six versions of my help, in six folders on the server. So the robots.txt would have to be in all six. The problem is that H&M requires you to empty a destination folder before running Publish, so that there aren't any stray files left over. Thus, every time you run Publish, you would have to manually drop robots.txt into each of the destination folders (locally, if I then upload them to the server, or into the folders at the destination, if I upload them first).
Ideally, H&M would have some way to include robots.txt into the output as part of the tasks in Publish. And it would be extra nice if H&M took care of the FTP upload as part of the Publish tasks.
-
- Posts: 454
- Joined: Thu Nov 16, 2006 1:29 pm
- Location: London, UK
Re: prevent search engines from indexing my help?
Kevin Killion wrote: Ideally, H&M would have some way to include robots.txt into the output as part of the tasks in Publish.
Have you tried adding robots.txt as a baggage file?
-
- Posts: 38
- Joined: Fri Jun 15, 2012 9:44 pm
Re: prevent search engines from indexing my help?
The baggage file idea is a good one -- I had never read about baggage files before.
BUT ... in order to use this I'd have to create a separate domain for my help files, since there can be only one robots.txt file, at least according to this, found at the site that Hendrik recommended earlier:
You can ONLY have one robots.txt on your site and ONLY in the root directory (where your home page is):
OK: http://www.yourdomain.com/robots.txt
BAD - Won't work: http://www.yourdomain.com/subdirectory/robots.txt
If I put this at the root, my entire company site would be skipped by Google.
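To make the quoted rule concrete, here is a small Python sketch of where a crawler looks for robots.txt given any page URL. The function name `robots_url_for` and the page path are made up for illustration; the point is only that the path part of the page URL is ignored and the file is requested from the root of the host.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url_for(page_url: str) -> str:
    """Return the robots.txt URL that applies to page_url.

    Hypothetical helper: crawlers request /robots.txt from the root of
    the host, whatever directory the page itself lives in.
    """
    parts = urlsplit(page_url)
    # Keep scheme and host, replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# A page in a subdirectory still maps to the root robots.txt.
print(robots_url_for("http://www.yourdomain.com/subdirectory/page.html"))
```

So a robots.txt dropped into a help output folder would simply never be requested.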
Hmmmm. Is my only option to do just that, set up a separate domain for my help files?
Thanks.
-
- Posts: 454
- Joined: Thu Nov 16, 2006 1:29 pm
- Location: London, UK
Re: prevent search engines from indexing my help?
If you look at the examples in Hendrik's Wikipedia link I think you'll find that the robots.txt file at the root of your domain can ask search bots to exclude specific subdirectories.
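As a rough illustration of that point, the Python sketch below feeds a root-level robots.txt with one Disallow rule to the standard library's `urllib.robotparser` and checks which URLs a well-behaved bot would be allowed to fetch. The `/help/` folder name and the domain are assumptions for the example, not Kevin's actual layout; with six output folders you would add one Disallow line per folder, or put them all under a common parent.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content for the site root: block only the help
# folder (hypothetical path), leave the rest of the site crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /help/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Help pages are excluded, the company home page is not.
print(is_allowed(ROBOTS_TXT, "http://www.yourdomain.com/help/index.html"))
print(is_allowed(ROBOTS_TXT, "http://www.yourdomain.com/index.html"))
```

This only models bots that honor the file; it says nothing about crawlers that ignore robots.txt.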
It's none of my business (and anyway I'm not an expert in SEO) but you might attract more potential users to your site if you allow your help files to be indexed by search engines, especially if other reputable sites can link to your material.
- Tim Green
- Site Admin
- Posts: 23189
- Joined: Mon Jun 24, 2002 9:11 am
- Location: Bruehl, Germany
Re: prevent search engines from indexing my help?
Hi Kevin,
As Simon pointed out, you can craft your robots.txt to include/exclude specific directories on your site. In addition to this, you can turn the auto-reload function for topic files ON, because this seems to prevent Google from indexing your site as well. In your project (if you are not using a skin), or in your skin (if you are) go to Configuration > Publishing Options > WebHelp > Navigation and switch on the option for reloading the full UI if the topic is loaded without the navigation frame.
You can also add a meta robots tag to your HTML templates, telling search bots not to index your pages. Add this line:
Code:
<meta name="robots" content="noindex, nofollow" />
to all of the following templates:
- HTML Page Templates > Default
- Layout
- Table of Contents
- Keyword Index
- Full Text Search
Warning: Like traffic regulations in Bombay, robots directives are only really vague suggestions. Reputable search engines like Google and Bing will honor them; almost all of the others will just ignore them.
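If you want to check after publishing that the tag really made it into the output, something like the following Python sketch could do it. It is not part of H&M; the class and function names are invented for the example, and it only inspects an HTML string, so reading the files from your output folders is left to you.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the page asks search bots not to index it."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


page = '<html><head><meta name="robots" content="noindex, nofollow" /></head></html>'
print(has_noindex(page))
```

Running it over one topic page, the table of contents, the keyword index and the search page would cover each of the templates listed above.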
Regards,
Tim (EC Software Documentation & User Support)
Private support:
Please do not email or PM me with private support requests -- post to the forum directly.
-
- Posts: 38
- Joined: Fri Jun 15, 2012 9:44 pm
Re: prevent search engines from indexing my help?
Great, perfect! Thanks, Tim!