Thread: robots.txt

  1. #21
    msp (AlterBlog User)
    Join Date: Jul 2012
    Posts: 29

    Among other things, I want to prevent the crawling of feeds and searches. We also recently decided to remove all tags from our pages. Crawlers tend to have long memories, so they keep trying to crawl pages that no longer exist, and I want to prevent that too.

    In most of these cases the URLs refer either to another page (the page without /feed, or the searched page) or to pages that do not exist (the tag pages). I do not think the noindex directive can be used in these cases.

    In any case, the noindex directive seems impractical, as it appears to involve a lot of manual HTML coding.
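
    For reference, the noindex directive mentioned here is normally a robots meta tag in the head of each page. Below is a minimal Python sketch of how a crawler-side check might detect it. The sample HTML and the names `RobotsMetaParser` and `has_noindex` are illustrative assumptions, not taken from the blog or from any plugin.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from any <meta name="robots"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values may be None.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for part in attr_map.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html_text):
    """Return True if the page carries a robots meta tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noindex" in parser.directives


# Assumed sample page, for illustration only.
sample_page = """<html><head>
<meta name="robots" content="noindex, follow">
</head><body>Tag archive</body></html>"""

print(has_noindex(sample_page))
print(has_noindex("<html><head></head><body>Post</body></html>"))
```

    The check only works on a page that is actually served, which is why it cannot help for tag pages that no longer exist.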

  2. #22
    alemoppo (AlterVista Staff)
    Join Date: Feb 2010
    Location: IT
    Posts: 679

    Check the Yoast SEO options: for example, try disabling the indexing of tags under SEO -> Search Appearance -> Taxonomies -> Tags.

    Bye!

  3. #23
    msp (AlterBlog User)
    Join Date: Jul 2012
    Posts: 29

    That only works if you have tags but do not want them to be indexed. It does not work if you no longer have tags and do not want the crawlers to keep looking for those nonexistent tag pages.

  4. #24
    msp (AlterBlog User)
    Join Date: Jul 2012
    Posts: 29

    What if I use the .htaccess file to do a rewrite that redirects the crawlers to the robots.txt in the blog.pianetadonna.it/mysite/ directory? Do you think that could work?
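
    Background for this idea: crawlers request robots.txt from the root of the host, not from the directory a page lives in. A small Python sketch of that lookup follows. The https scheme and the tag slug in the sample URL are assumptions, and `robots_txt_url` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url):
    """Return the robots.txt URL a standards-following crawler would request."""
    parts = urlsplit(page_url)
    # Path, query and fragment are dropped; only scheme and host are kept.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Assumed example URL under the blog directory mentioned in this thread.
print(robots_txt_url("https://blog.pianetadonna.it/mysite/tag/example/"))
```

    So a robots.txt placed in the /mysite/ directory is never requested on its own; a rewrite would have to act on the request for the root-level file.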

  5. #25
    alemoppo (AlterVista Staff)
    Join Date: Feb 2010
    Location: IT
    Posts: 679

    The .htaccess file is generated automatically and is restored with each update.

    In any case, that rule would have to be written in the root .htaccess, which you don't have access to.

    Bye!

  6. #26
    msp (AlterBlog User)
    Join Date: Jul 2012
    Posts: 29

    Any other suggestions then?
    Last edited by msp; 02-19-2020 at 11:41 AM.

  7. #27
    alemoppo (AlterVista Staff)
    Join Date: Feb 2010
    Location: IT
    Posts: 679

    I can't see why this solution would not work: with that setting the crawlers will not index the tags, and if you don't have tags, the crawlers will not index them either.

    Bye!

  8. #28
    msp (AlterBlog User)
    Join Date: Jul 2012
    Posts: 29

    From what I have read, the Yoast solution is focused on the XML sitemap.

    To start with, we do not use an XML sitemap for the tags, so using the Yoast SEO plugin to exclude them from a sitemap is not really useful.

    Second, even without a sitemap, all kinds of crawlers still try to crawl the tag pages. Since we removed the tags, we no longer have any tag pages, so crawlers try to crawl nonexistent pages over and over again. We would like to prevent that.

    I do not think we can achieve that by using or not using a sitemap. So unless the Yoast SEO plugin uses some other trick to prevent indexing that I am not aware of, I do not think it can solve our problem.


    So, summarizing:

    The most straightforward solution seemed to be excluding the tags in the robots.txt file, as this is the standard (if maybe a bit outdated) way of directing crawlers. But you already pointed out that this is not an option, as the setup of the pianetadonna platform prevents the robots.txt file from working properly.
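
    To illustrate what such an exclusion would look like if the file were served from the host root, here is a Python sketch using the standard-library `urllib.robotparser`. The rules, the /tag/ and /feed/ paths and the example.com host are assumptions for illustration; the real paths depend on the permalink setup.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules; the paths are assumptions, not the blog's real layout.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tag/
Disallow: /feed/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url):
    """True if a crawler honouring these rules may fetch the URL."""
    return parser.can_fetch("*", url)


print(is_crawlable("https://example.com/tag/recipes/"))
print(is_crawlable("https://example.com/2020/02/some-post/"))
```

    Plain prefix rules like these would not cover per-post feed URLs that merely end in /feed; that would need pattern rules, which not every parser supports.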

    Alternatively, one could think about a noindex tag. If you want to place this noindex tag in the header of the individual pages, you run into the problem that the tag pages do not exist anymore. So that is a no-go.

    That is why we started thinking about doing a rewrite using the .htaccess file. But you pointed out that, again due to the setup of the pianetadonna platform, this will not work, as the rewrite has to be done in the .htaccess file in the root of the pianetadonna platform. And as you pointed out, we do not have access to that.

    Any other suggestions?
