503s Can Help Prevent Pages Dropping From the Index Due to Technical Issues
One user described seeing pages drop from the index after a technical issue took their website down for around 14 hours. John suggests that the best way to safeguard your site against outages like this is to have a 503 rule ready for when things go wrong. That way, Google will see that the issue is temporary and will come back later to check whether it has been resolved. Returning a 404 or another error status code means Google could interpret the outage as pages being removed permanently, which is why some pages drop out of the index so quickly when a site is temporarily down.
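As a minimal sketch of a "503 rule" using only Python's standard library (the handler name, port, and one-hour retry hint are illustrative, not anything John prescribed), every request during an outage can be answered with a 503 and a Retry-After header instead of a 404:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical maintenance handler: while the site is down, every URL
# answers 503 ("Service Unavailable") rather than 404, so crawlers
# treat the outage as temporary and come back later.
class MaintenanceHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(503)
        # Hint that the outage should be over in roughly one hour.
        self.send_header("Retry-After", "3600")
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.end_headers()
        self.wfile.write(b"<h1>Down for maintenance - back soon</h1>")

def run_maintenance_server(port: int = 8080) -> None:
    HTTPServer(("", port), MaintenanceHandler).serve_forever()

# To enable during an outage: run_maintenance_server(8080)
```

In practice this logic usually lives in the web server or CDN configuration rather than a separate process, but the key point is the same: a temporary outage should answer 503, not 404.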
Google Treats Permanently 503’ing Robots.txt as an Error & Eventually Crawls the Site Normally
If a robots.txt file returns a 503 for an extended period, Google will treat this as a permanent error and crawl the site normally to see what can be discovered.
Google Doesn’t Crawl Any URLs From Hostname When Robots.txt Temporarily 503s
If Google encounters a 503 when crawling a robots.txt file, it will temporarily not crawl any URLs on that hostname.
Google Checks Status Code Pages Before Attempting to Render
Google checks the status code of a page before doing anything else, such as rendering its content. This helps it identify which pages can be indexed and which it shouldn't render. For example, if your page returns a 404, Google won't render anything from it.
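The "status code first, rendering second" order described here can be sketched as follows. The function names and thresholds are illustrative assumptions, since Google's actual pipeline is not public:

```python
from typing import Optional

def render(html: str) -> str:
    # Stand-in for the expensive rendering step (DOM construction,
    # JavaScript execution, and so on).
    return html.strip()

def process_page(status_code: int, html: str) -> Optional[str]:
    # The status code is checked before any rendering work happens.
    if status_code == 404:
        return None           # nothing from a 404 body is rendered or indexed
    if 500 <= status_code < 600:
        return None           # server errors are retried later, not rendered
    if 200 <= status_code < 300:
        return render(html)   # only successful responses reach the renderer
    return None               # redirects etc. are resolved separately
```

Checking the cheap signal (the status code) before the expensive one (rendering) is why an error page is dropped or skipped without its content ever being processed.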
Signals Are Retained for 4xx or 5xx Error Pages That Return to the Index After Being Dropped
If your pages returned a 4xx or 5xx error for a while and were dropped from the index, but become available again after a month or so, for example, Google can return them to the search results in the same state they were in before. They won't have to start ranking again from nothing.
Google Can Periodically Try to Recrawl 5xx Error Pages
If a page returns a server error for as long as a week, Google can treat this in a similar way to a 404: it will reduce crawling of that page and remove it from the index, but will still access the page every now and again to see whether the content is available again. If so, the page will be reindexed.
Avoid Serving a 503 Error for Days in a Row to Keep Site in Search
If a site serves a 503 error for several days in a row, Google may start to assume that the site is completely gone rather than temporarily unavailable, and will start removing pages from search.
Google Won’t Necessarily Follow the Date or Time Set in Retry-After Header
It is good practice to use the Retry-After HTTP header alongside a 503 status code, but Google doesn't always follow it, as many sites set it generically, and it may retry your site sooner than you've specified.
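For reference, Retry-After accepts either a delay in seconds or an absolute HTTP-date. A small sketch of producing both forms with the standard library (the one-hour window is an illustrative value):

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Form 1 - delay-seconds: "retry in one hour".
retry_after_seconds = str(3600)

# Form 2 - an HTTP-date in IMF-fixdate format,
# e.g. "Sun, 06 Nov 1994 08:49:37 GMT".
retry_at = datetime.now(timezone.utc) + timedelta(hours=1)
retry_after_date = format_datetime(retry_at, usegmt=True)
```

Either value is then sent as the header, e.g. `Retry-After: 3600`. Since Google may ignore it anyway, treat it as a hint for well-behaved clients rather than a guarantee.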