The message from the search console is related to the link I provided earlier: https://blog.pianetadonna.it/msp/zup...nale-tavola-2/
Regarding the general use of the robots.txt: the page you want me to read (https://support.google.com/webmaster.../6062608?hl=en) states:
"What is robots.txt used for? robots.txt is used primarily to manage crawler traffic to your site."
This seems to support my suggestion that the robots.txt can be used to direct the search engine indexing crawlers.
I think you are quite an expert on this subject. For less experienced people I would suggest reading this guide: https://yoast.com/ultimate-guide-robots-txt/. As Google, for obvious reasons, is very unclear about its indexing, this text provides some useful insights.
Thanx,
Gert
I hope this one will do the trick:
http://tinyurl.com/uznl9mj
Thank you.
The URL https://blog.pianetadonna.it/msp/zup...nale-tavola-2/ does not seem to be an article (I see a 301 redirect to an image).
I don't see any reference to the "robots.txt" file on the screen; the URL simply cannot be indexed because the article does not exist.
Do you have any redirects set by plugins like "Redirection"?
Bye!
Hi,
The problem I am trying to solve has to do with images and attachments. For now I would like to concentrate on the images. As I said, in the Search Console I have thousands of crawl anomalies with the message "blocked by robots.txt". However, as we established earlier, I cannot have a working robots.txt in a subdirectory of the pianetadonna platform.
http://tinyurl.com/qwyr9fx
http://tinyurl.com/r8n4e4u
http://tinyurl.com/wz2dw94
http://tinyurl.com/saztr2s
http://tinyurl.com/t7c3vaz
And of course I have redirected URLs. None of these redirections has anything to do with images. But even if they did, that would not explain the "blocked by robots.txt" message I am getting.
Thanx,
Gert
Hello, this is an incorrect message from the Google Search Console: the URL inspection tool only provides information regarding indexing for web pages and not images.
This behavior occurs on any site, even with other hosting providers.
It was confirmed unofficially on Twitter by a Google employee: https://twitter.com/JohnMu/status/1129304751160610816
Bye!
I am officially impressed that you came up with an obscure message like that.
While it does not solve the problem, it at least sets my mind at ease.
Thanx,
Gert
That leaves us with my questions regarding the working (or rather not working) of the robots.txt on the pianetadonna platform.
We established that the sites on the pianetadonna platform cannot use a robots.txt of their own, because the robots.txt must be placed in the root directory of the site. For the sites on the pianetadonna platform the root is the blog.pianetadonna.it directory. Therefore a robots.txt placed in the blog.pianetadonna.it/mysite.it directory will not work.
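To make the rule above concrete, here is a minimal sketch of how the robots.txt location follows from a page URL: crawlers request the file only from the root of the host, so the path of the page is discarded. The function name `robots_txt_url` and the example page path are hypothetical, chosen only for illustration; the code uses nothing beyond Python's standard `urllib.parse`.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt location that applies to page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host; drop the path, query and fragment,
    # because robots.txt is only looked up at the root of the host.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page of a site living in a subdirectory of the platform.
print(robots_txt_url("https://blog.pianetadonna.it/mysite.it/some-post/"))
```

For any page under blog.pianetadonna.it/mysite.it/ the result points at the platform root, which is why a file placed in the subdirectory is never consulted.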
Do you know if there is any other way for me to manage search engine crawler traffic?
Exactly, you can't use the robots.txt file.
According to this page, if you want to hide the content from the search engine, you have to use the noindex directives:
"To keep a web page out of Google, you should use noindex directives."
Normally, you don't need to use the robots.txt file. What did you want to do in particular?
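For illustration, a noindex directive is usually given as a `<meta name="robots" content="noindex">` tag in the head of the page (it can also be sent as an `X-Robots-Tag: noindex` HTTP header). The sketch below checks whether a piece of HTML carries such a meta tag. It is only a rough example: the class name, the function name and the sample HTML are made up, and it looks at the generic `robots` meta tag only.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether a <meta name="robots"> tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
        if "noindex" in directives:
            self.noindex = True


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


# Made-up page head used only to exercise the check.
sample = '<html><head><meta name="robots" content="noindex, follow"></head><body></body></html>'
print(has_noindex(sample))
```

Unlike robots.txt, this directive lives in the page itself, so it does not depend on having access to the root of the host.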
Bye!