Thursday, 8 October 2020

Custom Extraction Methods & Screaming Frog

It should not need saying that a website is essential these days for any business to succeed. With literally billions of potential customers online, businesses simply can't afford to miss out. In 2018, global 'e-retail' sales totalled something in the region of $2.8 trillion. The projected figure for 2021 is nearer to $4.8 trillion.

Competition for online business is fierce, as you might expect. The companies with the best SEO strategies are clearly going to win. Those who don't invest in SEO, or who neglect their websites, will suffer - it's as simple as that.

You might have excellent SEO and a great website. But how do you give your business the edge over your competition? There are many tools and tricks out there to boost your chances and help you maximise opportunities, and custom extraction is one.

What is Custom Extraction?

First of all, let's think back to the average office about fifty years ago. Immediately, you'd note the absence of computers in most offices, and where they were present they weren't exactly 'user friendly' or powerful - compared with today's models. When data was required, it would mostly be left to humans to gather. It was often a painstaking, laborious, and thankless task. And then there is the fact of 'human error'. Yes, reports were made, and frequently made well. But it took time and effort that often exceeded its usefulness. As time went on, and more data became available, it became less worthwhile. The internet wasn't around at the time, so you couldn't dive in to extract the data you required.

Jump forward to today, and we have information at our fingertips 24 hours a day. In fact, we have vast, overwhelming amounts of information and data. In a sense, we are faced with a similar problem to our counterparts of half a century ago: sifting through masses of data to find anything of value is not a good use of our time. And people still make mistakes at times. This is where custom extraction can be a useful technique, and Screaming Frog is our favourite tool for it. Specialist software scours websites and grabs the data you need. It then saves it in a file for you to analyse and present in a range of formats.

And this can save you resources, time, and money.

How does it work?

If you're not familiar with the technical terms, it can all get a bit confusing. Basically, all websites are built using a markup language called HTML. Other languages, such as CSS and JavaScript, are used in the design and presentation, but HTML is the basic framework that all web browsers recognise. If you were to look at the actual code for a website, you might be amazed at just how much is there. If you were to search through it by hand to find a specific bit of information, it could take a while.

Using a 'web scraper' you can find that information quickly. These scrapers are also sometimes referred to as 'bots', 'spiders', or 'spiderbots', and are the same type of tool used by search engines to scour the billions of pages that make up the web in order to index relevant sites for any particular search.

In terms of custom extraction, the program can search a website, or even something like a Google SERP (search engine results page), and extract the data you need far faster than any manual check. This holds even if the website has a very large number of URLs (possibly over a million, although a crawl of that size takes correspondingly longer).
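To make the idea concrete, here is a minimal sketch in Python of what a scraper does with a page once it has the HTML: it walks through the markup and keeps only the pieces it was asked for, in this case the page title and any main headings. The sample page and the class and function names are hypothetical, and this is only an illustration of the principle, not how Screaming Frog itself is implemented.

```python
from html.parser import HTMLParser


class TitleAndHeadingParser(HTMLParser):
    """Collects the text of the <title> and of every <h1> in an HTML document."""

    def __init__(self):
        super().__init__()
        self._current = None  # the tracked tag we are currently inside, if any
        self.title = ""
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1"):
            self._current = tag
            if tag == "h1":
                self.headings.append("")  # start collecting a new heading

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.headings[-1] += data


def extract_title_and_headings(html):
    """Return (title, [h1 texts]) for the given HTML string."""
    parser = TitleAndHeadingParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), [h.strip() for h in parser.headings]


# Hypothetical page source, standing in for a downloaded web page.
SAMPLE_HTML = """
<html>
  <head><title>Acme TVs | 55 inch OLED</title></head>
  <body>
    <h1>55 inch OLED TV</h1>
    <p>Lots of other markup that we do not care about.</p>
  </body>
</html>
"""

title, headings = extract_title_and_headings(SAMPLE_HTML)
print(title)
print(headings)
```

A crawler simply repeats this kind of step for every URL it discovers, which is why the job scales so much better than reading the source by hand.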

Web data extraction is growing in popularity as businesses are waking up to its importance. Custom data extraction takes this a step further by speeding up the process and automating it. You can download software like Screaming Frog to perform this for you, set the required parameters, and wait for the results.

And these results can seriously boost your company's performance, or replace a time-consuming method with a fast, efficient data export.

An example of custom extraction for stock checks

A lot of large online stores sell not only their own products but also those of small or large suppliers, which may not have high-tech logistics or even a stock list at all. This can cause a problem when customers on your site buy a product that you then cannot fulfil.

So how can we fix this?

You need a little code knowledge, specifically HTML and CSS, but it really is quite simple. The code or method you use might differ slightly between websites because of the structure or validity of the code, or because the content is behind a login wall.

You will need a tool like XPather to inspect the code on the page you want to extract.

For example:

A TV site might have some items listed as "out of stock", "limited stock", "due back in October", etc., along with the product code/identifier on the same page. We can use a tool like XPather to work out the path to each of those values, whether as a CSS path or an XPath, and then extract the text or the inner HTML at that path.

It will look something like this:



These two lines of code point to the exact location, within the page's HTML source, of the values that you as a visitor can see. We can enter them into the "custom extraction" tool within Screaming Frog and set the bot running. You can define a crawl speed, but be careful: crawling too fast is unfair to the site you are crawling. That topic would need to be covered in another post.
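As a rough illustration of what such path expressions look like and what they return, here is a minimal sketch in Python. The markup, the class names, and the two XPath expressions are all hypothetical; they are not the ones from the site described above. Python's built-in ElementTree only supports a subset of XPath and needs well-formed markup, so the expressions start with a dot. In a crawler you would more typically write something like //span[@class='sku'].

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed product markup. Real pages are usually messier.
PRODUCT_HTML = """
<div class="product">
  <span class="sku">TV-55-OLED</span>
  <p class="stock-status">Out of stock</p>
</div>
"""

# Hypothetical expressions of the kind an XPath inspector might suggest:
# one for the product code, one for the stock status.
SKU_XPATH = ".//span[@class='sku']"
STOCK_XPATH = ".//p[@class='stock-status']"

root = ET.fromstring(PRODUCT_HTML.strip())

# Each expression selects one element; we keep only its text content.
sku = root.find(SKU_XPATH).text.strip()
stock = root.find(STOCK_XPATH).text.strip()

print(sku, "->", stock)
```

Each expression becomes one column in the eventual export: one for the product code and one for its stock status.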

Image: An example of the custom extraction being added in Screaming Frog

Once Screaming Frog has finished crawling the site (you can check its progress in the custom extraction tab) you can export the data to a spreadsheet. In this case it would be one column of product codes, with each product's stock status in the next column. We can then use this information to do a mass export/import on the eCommerce site to hide out-of-stock items or un-hide items that are back in stock. This was a lifesaver for one of our clients, who at the time were checking stock manually or only noticing a problem after orders had been placed.
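The spreadsheet step can also be automated. The sketch below, in Python, reads a two-column export and splits the product codes into those to hide and those to show. The column names, the sample rows, and the rule for what counts as unavailable are all assumptions made for illustration; a real export and a real store's import format will differ.

```python
import csv
import io

# Hypothetical export: the column names and status wording are assumptions.
EXPORT_CSV = """product_code,stock_status
TV-55-OLED,Out of stock
TV-43-LED,In stock
TV-65-QLED,Limited stock
TV-32-HD,Due back in October
"""

# Status prefixes that we assume mean the item cannot be bought right now.
UNAVAILABLE_PREFIXES = ("out of stock", "due back")


def split_by_availability(csv_text):
    """Return (codes_to_hide, codes_to_show) from a two-column export."""
    to_hide, to_show = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        status = row["stock_status"].strip().lower()
        if status.startswith(UNAVAILABLE_PREFIXES):
            to_hide.append(row["product_code"])
        else:
            to_show.append(row["product_code"])
    return to_hide, to_show


to_hide, to_show = split_by_availability(EXPORT_CSV)
print("Hide:", to_hide)
print("Show:", to_show)
```

Whether "limited stock" items should stay visible is a business decision; here they are left on sale, and changing that only means adding another prefix to the tuple.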

What are the benefits of Custom Extraction?

Here are a few examples of how this software - and the data it produces - can help:

Website / SEO audits

Your website can hide many technical errors that the human eye may not see, a problem that is even more prevalent on large sites with thousands of pages. With these tools you can quickly gather all of the data you need, making it easier to spot technical SEO errors.


Keyword research

Find the most suitable keywords for your business or product and see which keywords your competitors are using. Compile a chart to see which are ranking highest.


Competitor monitoring

Keep a regular eye on your competitors through an automated system, giving you information on their finances, funding, pricing, new developments, technology, and so on. The 'web scraping' software can also reach content that is usually only available by downloading a file or creating an account, provided you have legitimate access to it.

Monitoring news sites

Some businesses operate in markets that are affected by events around the globe. This software can automatically gather data that helps them take timely and appropriate action - possibly before their rivals.

Locating a target audience

Custom extraction tools can search social media to discover general trends and to see which products are in demand. They can also gather feedback from existing customers to gauge satisfaction levels. Both methods help to direct future marketing campaigns, seize opportunities, and maintain customer loyalty.

Brand image and awareness

Image is everything. The World Wide Web gives virtually everyone the opportunity to express their opinion on anything, including your business. Through web data extraction you can find out exactly what people think. Then you can use this data to maintain that image, or take steps to improve it, tracking changes through new reports.

Track Stock Volumes

By performing searches on your own pages, you can keep track of stock levels. One of the most frustrating aspects of online shopping is the message 'out of stock'. This can help eCommerce companies massively if they distribute suppliers' products and do not have up-to-date stock lists from those suppliers. (We have actually used this for a client, and it worked a treat during the busy Covid period!)

Keep your website updated

In cases of 'time-sensitive' events, like Black Friday, you need to quickly locate instances where wording and details need to be adjusted. Data extraction software keeps track of these locations to ensure they are updated.
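As a simple sketch of the idea, the Python below scans the text of each crawled page for time-sensitive phrases and reports where they appear. The page texts, URLs, and phrases are hypothetical examples, and a crawler's own search or extraction features would normally do this job during the crawl itself.

```python
import re

# Hypothetical crawl result: URL -> visible page text.
PAGES = {
    "/": "Welcome to our store. Black Friday deals end Monday!",
    "/tvs": "Browse our full range of televisions.",
    "/delivery": "Order by 20 December for Christmas delivery.",
}

# Phrases we assume are time-sensitive; adjust these for each campaign.
PATTERN = re.compile(r"black friday|christmas delivery", re.IGNORECASE)


def find_time_sensitive(pages):
    """Map each URL to the time-sensitive phrases found in its text."""
    hits = {}
    for url, text in pages.items():
        found = PATTERN.findall(text)
        if found:
            hits[url] = found
    return hits


for url, phrases in find_time_sensitive(PAGES).items():
    print(url, phrases)
```

The result is a short list of pages to revisit once the event has passed, rather than a site-wide manual check.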

Pricing strategy

Keep abreast of fluctuations in the prices of services and goods offered by your competitors. Analyse their products, rates, and features.

The benefits and advantages stretch way beyond this list, but it serves as an example of how useful custom extraction can be. Installing this software makes good sense for any business that is serious about SEO.

Based in the North East of England, UK Web GeekZ are innovators in the search engine optimisation industry and use tactics that work time and time again. Give us a call if you would like to get more traffic or simply a new website design.


5-7 Amity House, Coniscliffe Road, Darlington, County Durham, DL3 7EE
