Wayback, archive.today and Google Cache for site recovery
The Wayback Machine is the best-known web archive, but it is not the only one. Sometimes archive.today or Google Cache can recover what Wayback missed.
The Wayback Machine handles roughly 80% of recovery jobs. The remaining 20% are the cases where Wayback has no usable capture and you still need the site. That is where the secondary sources come in.
Wayback Machine — primary
- Best depth — captures going back 25+ years
- Calendar view that shows snapshot frequency
- You can save a page right now if the site is still up
- Weak spots: nothing behind a login, and pages blocked by a robots.txt Disallow rule may be missing or hidden
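Before downloading anything, it helps to check whether Wayback holds a capture of a given page at all. The sketch below assumes the public availability endpoint at archive.org/wayback/available and its JSON shape (an "archived_snapshots" object with a "closest" entry). The function names are illustrative, not part of any official client.

```python
import json
from urllib.parse import urlencode

# Public availability endpoint (assumed stable; check the Internet Archive docs).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build a query URL; timestamp is an optional YYYYMMDD[hhmmss] string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(response_text):
    """Return (timestamp, snapshot_url) for the closest capture, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]
```

To use it, fetch availability_url("example.com", "20150101") with any HTTP client and pass the response body to closest_snapshot. A None result is the signal to try the other sources for that page.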
archive.today (also archive.ph)
- Saves on user request and keeps a snapshot of the page exactly as the browser rendered it
- Ignores robots.txt — sometimes has what Wayback won't show
- Especially useful for articles readers have manually archived
- Limitation: only pages someone explicitly saved
Google Cache
- Freshest version — often only weeks old
- Useful when the site went down recently and Google hasn't dropped it from the index yet
- Downside: Google removes pages from cache 2-4 weeks after the original is gone
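- Google wound down public cache access in 2024: the "Cached" link was removed from search results and the cache:URL operator has since been retired as well, so check whether it still returns anything before relying on it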
Bing Cache + Yandex
- Bing keeps its cache longer than Google
- Yandex "saved copy" in search results — often fresher than Wayback for Russian-language sites
Recovery strategy
Don't pick one source — work through all of them in order:
- Wayback Machine — the bulk of the pages
- Google Cache — freshest versions of homepage and priority pages
- archive.today — pages missing from Wayback, picked off one by one
- Yandex/Bing — fill remaining gaps, especially for Russian-language content
Real coverage usually comes from combining 2-3 sources, not relying on one.
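The ordered pass above can be sketched as a small planning step. This is a minimal illustration, assuming you have already collected, for each source, the set of page paths it holds. The function name, source labels, and sample paths are all hypothetical. It assigns each page to the first source in the list that has it, and it does not model the preference for fresher cache copies of priority pages.

```python
def plan_recovery(wanted, sources):
    """Assign each wanted page to the first source that holds it.

    wanted:  list of page paths to recover.
    sources: list of (name, set_of_paths) pairs, highest priority first.
    Returns (plan, missing), where plan maps path -> source name and
    missing lists the paths no source could supply.
    """
    plan = {}
    for path in wanted:
        for name, available in sources:
            if path in available:
                plan[path] = name
                break
    missing = [path for path in wanted if path not in plan]
    return plan, missing


# Hypothetical inventory for a four-page site.
wanted = ["/", "/about", "/blog/post-1", "/contact"]
sources = [
    ("wayback", {"/", "/about"}),
    ("google_cache", {"/"}),
    ("archive.today", {"/blog/post-1"}),
    ("yandex_bing", set()),
]
plan, missing = plan_recovery(wanted, sources)
```

In this sample, Wayback covers two pages, archive.today fills one gap, and "/contact" ends up in missing, which is the list to chase manually.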