5 Ways to Use the Wayback Machine for SEO

Sometimes a simple tool can give you incredibly powerful insights.

The Wayback Machine is one such tool.

The Wayback Machine captures historical snapshots of web pages and stores them in its public archive. Anyone can use it to view previous versions of individual pages or entire sites.

Here are five smart ways you can use the Wayback Machine for SEO.

1. Find old URLs from old versions of the site

One of the most useful ways to use the Wayback Machine is to find historical URLs that were never redirected.

The Wayback Machine collects information about your site over time, so it may hold URL data from more than 10 years ago.

This is especially important for sites that have been around for a long time. The stakeholders who ran the site years ago may have changed companies or roles, and they may not have followed SEO best practices during site migrations.

Here the Wayback Machine can be a lifesaver. You can quickly find old URLs that were never redirected to their live equivalents.

For example, the “headphones” page from Bose (http://www.bose.com/products/headphones/), captured in 2003, was never redirected:

With the Wayback Machine, it is easy to discover the main content pages from previous versions of the site. You can then find URLs in need of redirects that you probably would never have discovered otherwise.

Want to take this to the next level? Read Patrick Stox’s article on using the Wayback Machine API to find historical redirects. By querying the API, you can export old URLs in bulk, which can be more efficient for larger sites.
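As a rough illustration of the bulk export described above, here is a minimal Python sketch that queries the Wayback Machine's CDX endpoint for every captured URL under a domain. The helper names (build_cdx_url, parse_cdx_rows, fetch_old_urls) and the choice of parameters are assumptions made for this example and are not taken from the article mentioned above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_url(domain, limit=1000):
    """Build a CDX query for captured URLs under a domain (hypothetical helper)."""
    params = {
        "url": f"{domain}/*",        # wildcard: every path under the domain
        "output": "json",            # rows come back as a JSON list of lists
        "fl": "original",            # return only the originally captured URL
        "collapse": "urlkey",        # one row per unique URL
        "filter": "statuscode:200",  # captures that returned 200 when archived
        "limit": str(limit),
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def parse_cdx_rows(rows):
    """The first row of the JSON output is a header; the rest are data rows."""
    if not rows:
        return []
    return sorted({row[0] for row in rows[1:]})


def fetch_old_urls(domain, limit=1000):
    """Download and parse the list of archived URLs (makes a network request)."""
    with urlopen(build_cdx_url(domain, limit), timeout=30) as response:
        return parse_cdx_rows(json.load(response))
```

Calling fetch_old_urls("example.com") would return a deduplicated list of archived URLs, which you could then crawl to see which ones return a 404 on the live site and still need redirects.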

2. Find previous page content

Site content changes over time. This happens for a variety of reasons (e.g., SEO, CRO, site migrations or highlighting different aspects of the product). Any change to content carries an inherent risk, especially if the change is significant.

This is where the Wayback Machine comes into play.

If you notice significant ranking losses after changing content, you can check the Wayback Machine to view previous versions of the affected pages. Restoring the content to its original version can help you recover lost visibility.

For example, the NYMag article “Best Pillows for Neck Pain” has lost organic visibility since mid-2020 for terms like “neck support pillow.” This has resulted in a loss of organic traffic over time.

Comparing the current page to the early 2020 version, we can see that the content has since changed. The 2020 version included a chiropractor quote from the American Chiropractic Association at the top and kept the products above the fold.

However, in the current version, they’ve added more content to the top, pushed the products below the fold and moved the American Chiropractic Association quote to the bottom of the page.

Although this may not be the only reason for the lower rankings, looking at the content from the period of peak rankings may help them test restoring some of it to see if this improves visibility.
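To locate the archived copy closest to a date when rankings were strong, one option is the Wayback Machine's availability endpoint. The sketch below is illustrative only; the helper names are assumptions for this example, and the date you pass in would come from your own ranking data.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def availability_url(page_url, timestamp):
    """timestamp is YYYYMMDD, or longer up to YYYYMMDDhhmmss."""
    query = urlencode({"url": page_url, "timestamp": timestamp})
    return "https://archive.org/wayback/available?" + query


def closest_snapshot(payload):
    """Return (timestamp, snapshot_url) for the closest capture, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]


def find_snapshot(page_url, timestamp):
    """Ask the Wayback Machine for the capture nearest to the timestamp."""
    with urlopen(availability_url(page_url, timestamp), timeout=30) as response:
        return closest_snapshot(json.load(response))
```

The returned snapshot URL can be opened in a browser and compared side by side with the live page.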

3. Find old robots.txt files

Another great use of the Wayback Machine is to check how the robots.txt file has changed from previous versions. This can be especially useful during a site migration if your robots.txt file has changed and you don’t have a copy of the original file.

Fortunately, the Wayback Machine crawls a lot of robots.txt files. Just look at the number of times IBM’s robots.txt file was crawled in 2012:

With this, you can analyze how the robots.txt file has changed over time. For example, IBM’s robots.txt file looks completely different than it did before. Here is the file in 2012:

Looking at the site’s current robots.txt file, you can see that the directives have changed:

Using the Wayback Machine can be a very effective way to find old versions of your robots.txt file. This is especially useful if information is lost during a site migration.
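Once you have an archived robots.txt file and the current one, comparing their Disallow rules can be automated. The following is a simplified sketch: the parser handles only User-agent, Allow and Disallow lines, the function name is hypothetical, and the sample files in the usage note are invented for illustration, not taken from any real site.

```python
def disallow_rules(robots_txt):
    """Collect (user-agent, path) pairs for every non-empty Disallow line."""
    rules = set()
    agents = []       # user-agents of the group currently being read
    in_rules = False  # True once the current group has seen a rule line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value:
                rules.update((agent, value) for agent in agents)
    return rules
```

For example, with old_file = "User-agent: *\nDisallow: /tmp/\n" and new_file = "User-agent: *\nDisallow: /search/\n", the expression disallow_rules(old_file) - disallow_rules(new_file) gives the rules that were dropped, and reversing the operands gives the rules that were added.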

4. See what sections competitors add to their pages

Sites in competitive spaces routinely add or update content. For high priority keywords, your competitors are more likely to make frequent updates to their pages to try to improve their visibility. It can be difficult to track these changes.

Fortunately, the Wayback Machine allows you to understand the types of updates competitors are making to their content.

For example, we can use the Wayback Machine to look at Serious Eats’ best cast iron skillet page as it appeared on June 27, 2021:

Looking at the page today, we can immediately see that they made some dramatic changes to the page:

By reviewing the current page, we can see that they have:

  • Added an “Editor’s Note” at the top of the article
  • Moved the “Winners” section to the top of the page
  • Added internal links at the top of the first paragraph
  • Made the “Winners” section more prominent
  • Added an FAQ section

This is very valuable information when conducting a competitive analysis. These changes can now inform the content strategy we apply to our own page.

Spotting differences in content can be difficult because it requires manual review. However, you can use tools like Diffchecker to detect content changes more easily.
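If you prefer to script the comparison, Python's standard difflib module offers a similar line-by-line diff. This is a minimal sketch that assumes you have already extracted the text (or just the headings) of the archived and current pages into two strings; the function name is an assumption for this example.

```python
import difflib


def changed_lines(old_text, new_text):
    """Return only the lines that were added ('+ ') or removed ('- ')."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    return [line for line in diff if line.startswith(("+ ", "- "))]
```

Feeding it one heading per line is often enough to reveal added or removed sections, such as a new FAQ block, without reading the full diff of the body copy.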

5. How often competitors update content

Use the Wayback Machine to determine how often competitors update content.

This is especially useful if you’re in a SERP landscape where content freshness is important to visibility.

For example, consider CNET’s highly ranked page for the best Android phone of 2022. At the top of the article, you can see a timestamp showing when the article was last updated:

Since technology is so fast-moving and products change so often, freshness is probably important for terms like “best android phones.” Therefore, we may want to research how often we need to update our content to stay competitive.

Using the Wayback Machine, we can create a timeline of how frequently CNET updates these articles. Starting from the timestamp shown on the page, we can look for the most recent Wayback Machine capture that precedes that date. For example, to find the update that happened before March 5, 2022, we can look up what the page looked like on March 2, 2022.
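The lookup step described above can be sketched in Python. This example assumes you already have a sorted list of capture timestamps in the Wayback Machine's 14-digit YYYYMMDDhhmmss format; the function name is hypothetical.

```python
from bisect import bisect_left


def latest_capture_before(timestamps, cutoff):
    """Return the last capture strictly before the cutoff date, or None.

    timestamps: sorted 14-digit Wayback strings (YYYYMMDDhhmmss).
    cutoff: a YYYYMMDD string; it sorts before any capture on that day.
    """
    index = bisect_left(timestamps, cutoff)
    return timestamps[index - 1] if index > 0 else None
```

Opening that capture shows the "last updated" timestamp the page displayed at the time, which becomes the cutoff for the next lookup.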

By iterating this process, we can develop a timeline of how often CNET updates this page:

Based on the data, it’s safe to say that CNET updates this article on a monthly basis. We may want to apply the same refresh frequency to our content to stay competitive with CNET.
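Once the update dates are collected, the refresh cadence is simple arithmetic. Here is a small sketch; the function names are assumptions, and apart from March 5, 2022, the dates in the usage note are placeholders rather than CNET's real update history.

```python
from datetime import date
from statistics import median


def update_intervals(update_dates):
    """Days between consecutive updates, after sorting and deduplicating."""
    ordered = sorted(set(update_dates))
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def typical_interval_days(update_dates):
    """Median gap in days, or None when fewer than two dates are known."""
    gaps = update_intervals(update_dates)
    return median(gaps) if gaps else None
```

Passing in something like [date(2022, 1, 4), date(2022, 2, 2), date(2022, 3, 5)] yields gaps of roughly a month. The median is used here rather than the mean so that a single long pause does not skew the estimate.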

Back to the road

In a world where the web is constantly changing, the Wayback Machine is invaluable.

You can use this tool in many ways to recover lost information and gain insights into the direction of competitors’ strategies.

Make sure the Wayback Machine is in your SEO toolkit.

The opinions expressed in this article are those of the guest author and not necessarily those of Search Engine Land.

About the author

Chris Long is Vice President of Marketing at Go Fish Digital. Chris works with unique issues and advanced search situations to help his clients improve organic traffic through a deep understanding of Google’s algorithm and web technology. Chris is a contributor to Moz, Search Engine Land, and The Next Web. He is also a speaker at industry conferences such as SMX East and State Of Search. You can connect with him on Twitter and LinkedIn.
