No-code backlink data collection

Introduction

This tutorial explains how to create a backlink profile for each domain selected for the core website dataset. For this pilot report, we used 42 domains. This process must be completed for each of the 42 websites.

Depending on your skill level, this task could take 10 to 30 minutes per domain. We recommend you create a master sheet at the beginning to keep track of the data sheets for each listed domain. This ensures tables aren’t confused or lost.

Date of collection Domain Link to data sheet
2022-09-11 astutenews https://docs.google.com/spreadsheets/d/e/2PACX-1vTD25TXOy4e-6ZHhHJzQ2m8cX6zSp2OQ--rJ81DFuQAj_l13QpzkNjHkyFaougmq489kTfRCl4nBd48/pubhtml?gid=1571247936&single=true

When the tables are completed, it is time for the aggregated data analysis. This can be found in a separate as yet unpublished section. The backlinks in this process come from BuzzSumo, but you can use backlinks from other sources.

Step One: Data collection

Use the backlink search function from Buzzsumo to find links to each site in your list of domains. The domains chosen for this first round fit the criteria outlined in the classification section, an evidence-based categorization algorithm. The domain search returns websites that have backlinked to any URL in the domain in the past year.

Step one - backlink analysis methodology - 12 September 2022.mp4

We selected “export CSV,” and BuzzSumo exported the backlinks. The column headings in the sheet were as follows: title, URL, evergreen_score, total_engagement, total_facebook_engagement, twitter_shares, pinterest_shares, total_reddit_engagement, published_date, article_types, video, article_amplifiers, author_name, num_words, thumbnail, facebook_comments, facebook_shares, facebook_likes, num_linking_domains, wow_count, love_count, haha_count, haha_count, sad_count, and angry_count.

Step Two: Data cleaning and analysis

For this analysis, we opened the data files in Google Sheets. The formula is written for Google Sheets and may not work in other programs. Once the data loads, add a column between columns “URL” and “evergreen_score,” also known as Columns B and C. Highlight Column C, right-click, and select “insert 1 column left.”

2022.07.23___docs.google.com___Backlinks_astutenews.com_-_Google_Sheets (3).webm

In the newly inserted column that now sits in the C column space, insert the formula:

=REGEXREPLACE(REGEXREPLACE(B:B,"(http(s)?://)?(www.)?",""),"/.*","")

The listed formula extracts the domain from each backlinking URL—press enter. Google Sheets should offer to auto-fill the column. Accept autofill so that the domain name listed for all URLs is extracted.

If the autofill option does not appear after entering the formula and pressing enter, use the mouse to drag the blue dot in the lower right-hand corner of the selected cell. Click the blue dot, hold the click, and drag downward until you have dragged the mouse through all cells that need to contain the formula.

Step Three: Pivot table

Next, click “insert.” Then, select “Pivot Table” from the menu. A new page for the pivot table has been created. Switch to the new pivot table sheet to calculate counts for each domain found in the URLs from the backlink data file. Name the new column with the domains in it “domain.”