Scrape product features on Google Shopping to structure SEO content

Used as a tool, Google is an excellent way to organize specific SEO tasks. Google Shopping is even more so, because it gives us access to unparalleled granularity of information about consumer products.

I won’t hide it any longer: this article (and the tool that results from it) is directly inspired by Erlé Alberton’s talk, which I mentioned in my previous article.

Google Shopping is better suited than regular Search to transactional queries, since that is its very purpose, and it gives us access to exhaustive generic technical characteristics.

For a product category such as TVs, Google Shopping can give us the following information: features (smart TV, etc.), HDTV formats, display types, screen sizes, resolutions, and so on.

This information can be used to structure our content (think of each characteristic as a paragraph) or simply to include the terms in our editorial content.
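
As a minimal sketch of that idea, assume the characteristics have already been collected into a dictionary. The sample values are illustrative (borrowed from the TV example), and `build_outline` is a hypothetical helper, not part of the tool described below:

```python
def build_outline(category, characteristics):
    """Return a plain-text outline: one heading per characteristic group."""
    lines = [f"Guide: {category}"]
    for group, values in characteristics.items():
        # Each group (e.g. "Screen sizes") becomes a paragraph heading.
        lines.append(f"- {group}")
        # Each value is a term worth covering in that paragraph.
        lines.append("    terms: " + ", ".join(values))
    return "\n".join(lines)


# Illustrative sample data, not real scraped output.
tv_characteristics = {
    "Features": ["smart tv"],
    "Screen sizes": ["32 inches", "55 inches"],
}

print(build_outline("TVs", tv_characteristics))
```

Each group becomes a candidate paragraph, and the values under it are the terms that paragraph could cover.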

How the scraping tool works

Start by getting the archive “Google_shopping.zip”. You can then unzip it on your desktop, which makes it quicker to access.

Before launching the tool, make sure you have the following libraries:

beautifulsoup

requests

re

pandas

excel

tkinter
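
A quick way to check the list above before launching the tool could look like the sketch below. The import names are assumptions: BeautifulSoup is normally imported as `bs4`, and I read "excel" as an Excel writer such as `openpyxl`, which may not be what the tool actually uses. `re` and `tkinter` normally ship with Python itself.

```python
import importlib.util

# Import names are assumptions: "beautifulsoup" is imported as bs4, and
# "excel" is assumed to mean an Excel writer such as openpyxl.
REQUIRED = ["bs4", "requests", "re", "pandas", "openpyxl", "tkinter"]


def missing_modules(names):
    """Return the names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Anything reported as missing can then be installed with your usual package manager.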

Once your libraries are installed, simply run the following commands:

cd Desktop/google_shopping

python filter.py

Still in the Terminal, enter your search. In our example, enter: tv.

A window opens, containing all the characteristics related to the product range.
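
To give an idea of how such characteristics might be collected, here is a rough sketch. It is not the tool's actual code: the HTML snippet and its `filter-group` class are invented for illustration (Google's real markup is different and changes over time), and the standard library's `html.parser` stands in for BeautifulSoup so the example runs without any installation.

```python
from html.parser import HTMLParser

# Invented markup for illustration only; this is not Google Shopping's HTML.
SAMPLE_HTML = """
<div class="filter-group"><h3>Brand</h3><a>Samsung</a><a>LG</a></div>
<div class="filter-group"><h3>Resolution</h3><a>4K</a><a>1080p</a></div>
"""


class FilterParser(HTMLParser):
    """Collect {group title: [values]} from the invented markup above."""

    def __init__(self):
        super().__init__()
        self.groups = {}
        self._tag = None      # tag whose text is currently being read
        self._current = None  # title of the group being filled

    def handle_starttag(self, tag, attrs):
        if tag in ("h3", "a"):
            self._tag = tag

    def handle_endtag(self, tag):
        self._tag = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._tag == "h3":
            # A heading starts a new group of characteristics.
            self._current = text
            self.groups[text] = []
        elif self._tag == "a" and self._current is not None:
            # A link inside the group is one selectable value.
            self.groups[self._current].append(text)


def parse_filters(html):
    """Parse the markup and return the collected groups."""
    parser = FilterParser()
    parser.feed(html)
    return parser.groups


print(parse_filters(SAMPLE_HTML))
```

The resulting dictionary (group title mapped to its values) is the kind of structure a window like this one could display as filters.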

You have two choices:

  1. Click “Search” right away.
  2. Filter further. For example, keep only the features of Samsung TVs, then narrow down again to 4K HDTV formats only, and so on. When you are done, click “Search”.

In the “data” subfolder, an Excel file has been created containing two tabs: “Products” and “Characteristics”.
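
For readers curious about how a two-tab workbook like this can be produced, here is a minimal sketch with pandas. It is an assumption about the approach, not the tool's actual code: the column names, the sample rows and the `data/tv.xlsx` path are all hypothetical.

```python
from pathlib import Path


def build_sheets(products, characteristics):
    """Return {sheet name: list of row dicts} for the two tabs."""
    char_rows = [
        {"Characteristic": group, "Value": value}
        for group, values in characteristics.items()
        for value in values
    ]
    return {"Products": list(products), "Characteristics": char_rows}


def write_workbook(sheets, path):
    """Write each sheet to its own tab (needs pandas and an engine such as openpyxl)."""
    import pandas as pd  # imported here so build_sheets works without pandas

    Path(path).parent.mkdir(parents=True, exist_ok=True)
    with pd.ExcelWriter(path) as writer:
        for name, rows in sheets.items():
            pd.DataFrame(rows).to_excel(writer, sheet_name=name, index=False)


# Illustrative rows only, not real scraped data.
sheets = build_sheets(
    products=[
        {"Label": "Example TV", "Price": 499.0, "Rating": 4.5, "Merchant": "Example Shop"}
    ],
    characteristics={"Brand": ["Samsung"], "Resolution": ["4K"]},
)
# write_workbook(sheets, "data/tv.xlsx")  # uncomment to create the file
```

Keeping the row building separate from the Excel writing makes it easy to inspect the data before anything is written to disk.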

The first tab, Products, contains the product information itself: product labels, prices, ratings and associated merchants. In my opinion, this is useful marketing information for checking that the prices at which you sell your products are consistent with the market, and that your product naming is consistent too. It can also help you, for example, to organize your listing pages differently, by grouping the relevant products together if you have them in your catalog. Finally, knowing the associated merchants gives you a view of your competitors but, in theory, you already knew them. 🙂
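
One simple way to run that price check, sketched under the assumption that you have pulled the prices for a given product out of the Products tab (the figures below are made up, and `price_gap` is a hypothetical helper):

```python
from statistics import median


def price_gap(own_price, market_prices):
    """Return own_price's relative distance from the market median (0.10 = 10% above)."""
    reference = median(market_prices)
    return (own_price - reference) / reference


# Illustrative prices for one product, as they might appear in the Products tab.
market = [479.0, 499.0, 529.0]
gap = price_gap(549.0, market)
print(f"{gap:+.1%} versus the market median")
```

The median is used rather than the mean so that one merchant with an extreme price does not distort the comparison.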

The second tab, Characteristics, contains the generic information (the left-hand column on Shopping) matching your filters. Nothing could be simpler.
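
These characteristics can then be compared with an existing page to spot the terms it does not mention yet. The sketch below is a deliberately naive version of that check (plain case-insensitive substring matching, with a made-up draft and term list):

```python
def missing_terms(text, terms):
    """Return the terms that do not appear in the text (case-insensitive)."""
    lowered = text.lower()
    return [term for term in terms if term.lower() not in lowered]


# Illustrative draft and terms, not real scraped output.
draft = "Our guide compares Smart TV models with 4K resolution."
terms = ["smart tv", "4K", "OLED"]
print(missing_terms(draft, terms))
```

Substring matching will miss plurals and spelling variants, so treat the result as a checklist to review rather than a verdict.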

Feel free to give feedback on the tool in the comments, or directly to Pierre, who developed it.
