Appearing in Google's Top Stories carousel is a great way to get exposure for news content and boost organic traffic. However, Google does not provide much data (clicks, impressions, etc.) about how your articles perform in this carousel.
From a technical perspective, the Top Stories carousel is an AI-powered search engine results page (SERP) feature that displays useful and timely articles from a broad range of high-quality, trustworthy news providers.
import requests   # HTTP client used to call the SerpApi endpoint
import pandas as pd   # DataFrame handling for the carousel data
To scrape the Top Stories carousel with SerpApi, you need to create a free SerpApi account, which comes with 100 free searches per month.
After registering, copy your API key and paste it into the script.
SerpApi also provides a really handy feature, the SerpApi Playground, where you can adjust the parameters to your needs.
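Hard-coding the key works for a quick test, but it is safer to keep it out of the notebook. A minimal sketch, assuming you set an environment variable first (the name SERPAPI_KEY is my choice, not something SerpApi requires):
import os

# Read the SerpApi key from an environment variable so it never ends up
# in the notebook or in version control (set SERPAPI_KEY beforehand).
API_KEY = os.environ.get("SERPAPI_KEY", "API KEY")
You can then pass API_KEY in the params dictionary below instead of the literal placeholder.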
This example uses the following parameters:
params = {
    "engine": "google",
    "q": "nfl predictions",
    # SerpApi expects a single "location" parameter for geo-targeting;
    # "location_requested" and "location_used" only appear in the response.
    "location": "New York, NY, United States",
    "google_domain": "google.com",
    "hl": "en",
    "gl": "us",
    "device": "mobile",
    "api_key": "API KEY"
}
The variable "response" stores the JSON response from the API. Using the "requests" library and its get function, we can send the request with the parameters attached as a query string.
response = requests.get("https://serpapi.com/search.json", params=params).json()
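Before pulling the carousel out of the response, it is worth checking that the request succeeded and that this particular SERP actually contains a Top Stories block; a minimal sketch, assuming SerpApi's usual behaviour of returning an "error" key when a search fails:
# Stop early if SerpApi reported a problem or if there is no
# Top Stories carousel on this SERP.
if "error" in response:
    raise RuntimeError(f"SerpApi error: {response['error']}")
if "top_stories" not in response:
    raise RuntimeError("No Top Stories carousel returned for this query/device.")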
The first step is to filter the full API response, which contains data for the entire SERP (search engine results page), and store only the "Top Stories" information in a variable.
As you may know, on a mobile device there is more than one news carousel (Top Stories, a closely related topic, and "Also in the news"). That is why a second variable holds only the "Top Stories" carousel, loaded into a pandas DataFrame.
# Keep only the Top Stories block and load its carousel items into a DataFrame.
top_t = response['top_stories']
carousel = pd.DataFrame(top_t['carousel'])

# Annotate every row with the query, device and crawl date from the API metadata.
kwrd = response['search_parameters']['q']
device = response['search_parameters']['device']
date = response["search_metadata"]['processed_at']
carousel['Keyword'] = kwrd
carousel['Device'] = device
carousel['Date of check'] = date
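For rank tracking it can also help to record where each story sits in the carousel. A small addition, assuming the API returns the items in the order they appear on the SERP (the "Position" column name is my choice):
# Add a 1-based position, relying on the items arriving in carousel order.
carousel['Position'] = carousel.index + 1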
The following is a standard piece of code for exporting a DataFrame to a Google Sheets file. This is also a great opportunity to connect the sheet to Looker Studio and visualize the data.
# Authenticate the Colab user and authorize gspread with those credentials.
from google.colab import auth
auth.authenticate_user()

import gspread
from google.auth import default
from gspread_dataframe import set_with_dataframe

creds, _ = default()
gc = gspread.authorize(creds)

# Create a new spreadsheet and write the DataFrame to its first sheet.
doc_name = 'Top Stories'
sh = gc.create(doc_name)
worksheet = sh.sheet1
set_with_dataframe(worksheet, carousel)
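If you plan to run the script on a schedule and chart the history in Looker Studio, you will probably want to append to one sheet instead of creating a new file every time. A rough sketch, assuming the spreadsheet from the first run already exists under the same name:
# On later runs, reuse the existing spreadsheet and append the new rows
# below the data that is already there.
try:
    sh = gc.open(doc_name)
except gspread.SpreadsheetNotFound:
    sh = gc.create(doc_name)

worksheet = sh.sheet1
# Convert values to plain strings so they are JSON-serializable for the Sheets API.
worksheet.append_rows(carousel.astype(str).values.tolist())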
That is how your exported DataFrame should look, with all the needed values from the API.
You can find this simple script on GitHub, or get in touch with me on X or LinkedIn.