Comment by cj

8 days ago

Not having a robots.txt is fine as long as it's a 404. If it's a 403, you'll be de-indexed.

I have a feeling there's more to the story than what's in the blog post.

If there's one thing I know about Google search, it's that there's never one behavior you can rely on. De-indexed? It's been years since Google started drawing a clear distinction between allowing Googlebot to crawl a page and that page's presence in the index. Last time I needed to make a page disappear from the index, I learned that crawl permission had nothing to do with whether the page stayed indexed. In fact, disallowing it in robots.txt was the worst thing I could do, since it stopped the bot from ever showing up to see my new "noindex" meta tags, which are now the only reliable way to make a page drop out of the index.
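To make that concrete, here's a minimal sketch (hypothetical routes and handler, not anyone's actual setup) of the behavior described above: robots.txt 404s harmlessly, while the page you want de-indexed stays crawlable but carries a noindex signal via the `X-Robots-Tag` header and/or a meta tag.

```python
def respond(path: str):
    """Return (status, headers, body) for a request path -- illustrative only."""
    if path == "/robots.txt":
        # No robots.txt at all: a 404 here is fine; a 403, per the comment
        # above, reportedly gets you de-indexed.
        return 404, {}, "not found"
    if path == "/old-page":
        # Page we want dropped from the index: it must stay crawlable (200)
        # so the bot can actually see the noindex directive, here sent as
        # a response header...
        headers = {"X-Robots-Tag": "noindex"}
        # ...and duplicated as a meta tag in the markup.
        body = '<html><head><meta name="robots" content="noindex"></head></html>'
        return 200, headers, body
    # Everything else: ordinary indexable page.
    return 200, {}, "<html><body>hello</body></html>"
```

The key point the sketch encodes: blocking `/old-page` in robots.txt would prevent the bot from ever fetching it, so it would never see the noindex signal at all.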

Having a shortcut like 403ing the robots.txt would actually be useful. LOL

I’d say you’re right, because I have a custom 404 page returned by the robots.txt route and I’m well indexed by Google.