SEO Keyword Impact Analysis with Ahrefs and Python
SEO specialists are regularly asked the same question: how much organic traffic could a new page, template, or vertical generate? The answer is never exact, but it can be estimated quickly with a repeatable keyword-level model.
This workflow uses an Ahrefs export, average monthly search volume, current keyword positions, and a CTR curve to estimate current traffic and model the upside under ranking-improvement scenarios. It is intentionally simple, transparent, and easy to explain to stakeholders.
What you need to run the script
- An export of the Organic Keywords report from Ahrefs for the URL or subfolder you want to analyse.
- The export should be saved in UTF-8 format.
- Access to Google Colab if you want to run the workflow in the browser rather than locally.
What the script does
At a high level, the script calculates current traffic based on search volume and estimated CTR for each ranking position. It then applies one or more ranking-improvement scenarios and recalculates the traffic with the adjusted positions.
- Estimate current traffic from keyword positions.
- Simulate ranking improvements with scenario-based position changes.
- Export both the raw keyword table and a summary table to CSV.
- Visualise the estimated gains by keyword difficulty category.
In Google Colab, a slider can be used to change the expected improvement in rankings for scenario one and scenario two.
Libraries used
import pandas as pd
import numpy as np
from google.colab import files
- Pandas for data cleaning, transformation, and export.
- NumPy for conditional logic and vectorised calculations.
- google.colab.files to upload the Ahrefs export directly into the notebook environment.
Step 1: Upload the Ahrefs export
First, upload the exported keyword file from Ahrefs into your Colab session.
uploaded = files.upload()  # returns a dict keyed by the uploaded filename
After that, reference the uploaded filename inside the script and remove any columns that are not relevant to the analysis.
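A minimal sketch of the load-and-trim step follows. The column names (Keyword, Position, Volume, KD, CPC) are assumptions about the export header and should be checked against your own file, and load_keywords is a hypothetical helper, not part of the original script. An in-memory CSV with made-up rows stands in for the uploaded file so the snippet runs anywhere.

```python
import io

import pandas as pd

# Assumed column names; check them against the header row of your own export.
KEEP_COLUMNS = ["Keyword", "Position", "Volume", "KD", "CPC"]


def load_keywords(source):
    """Read an Ahrefs-style CSV and keep only the columns the model needs."""
    df = pd.read_csv(source)
    return df[KEEP_COLUMNS].copy()


# Tiny in-memory stand-in for the uploaded file (made-up rows).
sample_csv = io.StringIO(
    "Keyword,Position,Volume,KD,CPC,URL\n"
    "blue widgets,4,1000,15,0.80,https://example.com/a\n"
    "buy widgets,12,500,45,1.20,https://example.com/b\n"
    "widget reviews,,300,70,0.50,https://example.com/c\n"
)

df2 = load_keywords(sample_csv)
print(df2)
```

In Colab, the filename you uploaded would be passed to load_keywords in place of sample_csv.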
Step 2: Define your ranking-improvement scenarios
The improvement values should be set near the top of the script and stored in variables such as scenario_one and scenario_two.
For example, if you are benchmarking against a competitor, one scenario might model what happens if you improve by a single ranking position and overtake them.
scenario_one = 1
scenario_two = 3
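In Colab, the same two variables can be exposed as sliders by adding form annotations. The sketch below assumes a slider range of 1 to 10 positions, which is an illustrative choice rather than a value from the original workflow.

```python
# The "#@param" comments are Colab form annotations. Outside Colab they are
# ordinary comments, and the values below are used exactly as written.
# The 1-10 range is an assumption; adjust it to the scenarios you want to test.
scenario_one = 1  #@param {type:"slider", min:1, max:10, step:1}
scenario_two = 3  #@param {type:"slider", min:1, max:10, step:1}

print(scenario_one, scenario_two)
```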
Keywords without a valid ranking position should be removed from the model. In most Ahrefs exports these are terms ranking beyond the top 100 or queries that are not meaningfully relevant.
df2 = df2.dropna(subset=["Position"])
Step 3: Calculate current organic traffic
Current traffic is estimated by mapping each keyword position to a CTR and multiplying that CTR by average monthly search volume. The logic is simple, which is one of its strengths: anyone with SEO experience can inspect the assumptions.
In the original workflow, the CTR assumptions are based on the Advanced Web Ranking Google Organic CTR History Study for:
- Device: Desktop
- Market: United States
- Keyword type: Non-branded
This is an important caveat. Real CTR varies heavily based on SERP features, brand presence, intent, geography, and device mix, so the output should be treated as directional rather than literal.
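The mapping from position to CTR to traffic might look like the sketch below. The CTR values are placeholders chosen for illustration and are not the Advanced Web Ranking figures; the flat rate for positions beyond page one, the estimate_traffic helper, and the current_traffic column name are likewise assumptions.

```python
import pandas as pd

# Placeholder CTR curve (share of searches that click, by position).
# These numbers are illustrative only and are NOT the AWR study figures.
CTR_BY_POSITION = {
    1: 0.30, 2: 0.15, 3: 0.10, 4: 0.07, 5: 0.05,
    6: 0.04, 7: 0.03, 8: 0.025, 9: 0.02, 10: 0.015,
}
CTR_BEYOND_PAGE_ONE = 0.005  # assumed flat rate for positions 11 and beyond


def estimate_traffic(positions, volumes):
    """Return estimated monthly visits: CTR(position) * search volume.

    Expects positions with no missing values (drop them beforehand).
    """
    ctr = positions.round().astype(int).map(CTR_BY_POSITION)
    ctr = ctr.fillna(CTR_BEYOND_PAGE_ONE)
    return ctr * volumes


# Made-up keywords for demonstration.
df2 = pd.DataFrame({
    "Keyword": ["blue widgets", "buy widgets", "widget shop"],
    "Position": [4, 12, 1],
    "Volume": [1000, 500, 200],
})
df2["current_traffic"] = estimate_traffic(df2["Position"], df2["Volume"])
print(df2)
```

Swapping in the real CTR curve only requires replacing the dictionary; the rest of the calculation stays the same.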
Step 4: Calculate expected traffic for scenario one
Once scenario_one is set, the script adjusts the ranking positions and recalculates the expected traffic under the improved rankings.
Any position that would fall below 1 is automatically capped at position 1, since there is no rank above the first result.
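The position adjustment and the cap at position 1 can be sketched as follows, using made-up positions. Traffic for the scenario would then be re-estimated from the adjusted positions with the same CTR lookup used for current traffic.

```python
import pandas as pd

scenario_one = 1  # number of positions gained in this scenario

# Made-up current positions for demonstration.
positions = pd.Series([1, 2, 5, 12], name="Position")

# Subtract the improvement, then floor the result at position 1.
improved = (positions - scenario_one).clip(lower=1)
print(improved.tolist())  # [1, 1, 4, 11]
```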
Step 5: Calculate expected traffic for scenario two
Scenario two follows the same logic as scenario one, but applies a different position improvement and stores the result in a separate output column such as exp_traffic2.
This makes it easy to compare conservative and ambitious traffic uplift cases side by side.
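One way to put the two cases side by side is to total each traffic column and subtract the current total. The figures below are made up, and exp_traffic1 and current_traffic are assumed column names chosen to match exp_traffic2.

```python
import pandas as pd

# Made-up traffic estimates, for illustration only.
df2 = pd.DataFrame({
    "current_traffic": [70.0, 2.5, 60.0],
    "exp_traffic1": [100.0, 2.5, 60.0],
    "exp_traffic2": [300.0, 2.5, 60.0],
})

totals = df2[["current_traffic", "exp_traffic1", "exp_traffic2"]].sum()
uplift = totals - totals["current_traffic"]  # absolute gain versus today

comparison = pd.DataFrame({"total": totals, "uplift": uplift})
print(comparison)
```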
Step 6: Segment keywords by difficulty
Keeping CPC and KD in the output makes the model more useful. In this workflow, keywords are grouped into three broad difficulty buckets:
- Easy: KD below 20
- Medium: KD from 20 to 60
- Hard: KD above 60
This basic segmentation helps frame where the upside is coming from and whether the forecast depends too heavily on difficult queries.
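The bucketing above can be sketched with NumPy's conditional selection. Treating KD values of exactly 20 and exactly 60 as Medium is an assumption, as is the category column name; the keywords are made up.

```python
import numpy as np
import pandas as pd

# Made-up keywords covering each band and both boundaries.
df2 = pd.DataFrame({
    "Keyword": ["a", "b", "c", "d", "e"],
    "KD": [5, 20, 45, 60, 80],
})

# Conditions are checked in order; anything matching neither is Medium.
conditions = [df2["KD"] < 20, df2["KD"] > 60]
labels = ["Easy", "Hard"]
df2["category"] = np.select(conditions, labels, default="Medium")

print(df2)
```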
Step 7: Build the summary analysis
A pivot table can be used to summarise current traffic, scenario-one traffic, and scenario-two traffic by keyword category. Adding a total row makes the final output easier to interpret for non-technical stakeholders.
At this stage it also makes sense to remove columns that are no longer needed in the reporting layer, such as raw positions or intermediate scenario fields.
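A minimal sketch of the summary step, assuming the column names used here (category, current_traffic, exp_traffic1, exp_traffic2) and made-up numbers. The margins option in pandas adds the total row.

```python
import pandas as pd

# Made-up keyword-level estimates.
df2 = pd.DataFrame({
    "category": ["Easy", "Easy", "Medium", "Hard"],
    "current_traffic": [70.0, 10.0, 2.5, 60.0],
    "exp_traffic1": [100.0, 15.0, 2.5, 60.0],
    "exp_traffic2": [300.0, 30.0, 5.0, 60.0],
})

summary = pd.pivot_table(
    df2,
    index="category",
    values=["current_traffic", "exp_traffic1", "exp_traffic2"],
    aggfunc="sum",
    margins=True,          # append a row that sums all categories
    margins_name="Total",
)
print(summary)
```

Because only the listed value columns are passed to the pivot, raw positions and intermediate fields are left out of the reporting table automatically.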
Step 8: Visualise the result
The final output can be turned into a bar chart comparing current traffic with the two uplift scenarios, broken down by keyword difficulty category.
Even a simple chart makes the analysis much easier to present, especially when the goal is to compare expected value across sections, competitors, or potential new landing pages.
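A grouped bar chart of this kind could be drawn as in the sketch below. It assumes matplotlib is installed (it is available in Colab by default) and uses a made-up summary table with assumed column names; the axis labels and title are illustrative.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; remove this line in Colab to see the plot inline

import pandas as pd

# Made-up summary table: one row per difficulty category.
summary = pd.DataFrame(
    {
        "current_traffic": [80.0, 2.5, 60.0],
        "exp_traffic1": [115.0, 2.5, 60.0],
        "exp_traffic2": [330.0, 5.0, 60.0],
    },
    index=pd.Index(["Easy", "Medium", "Hard"], name="category"),
)

# One group of bars per category, one bar per traffic column.
ax = summary.plot(kind="bar", rot=0)
ax.set_xlabel("Keyword difficulty category")
ax.set_ylabel("Estimated monthly visits")
ax.set_title("Current vs. projected organic traffic")
ax.figure.tight_layout()
```

Calling ax.figure.savefig with a filename would write the chart to an image file for a reporting deck.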
Step 9: Export the outputs
The workflow ends by exporting both the raw keyword-level data and the pivot-table summary in CSV format so they can be reused elsewhere.
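The export step might be sketched as below. The export_outputs helper and both filenames are hypothetical, the data is made up, and a temporary directory is used only so the example cleans up after itself.

```python
import tempfile
from pathlib import Path

import pandas as pd


def export_outputs(raw, summary, out_dir):
    """Write the keyword-level table and the summary table as UTF-8 CSV files."""
    out_dir = Path(out_dir)
    raw_path = out_dir / "keyword_data.csv"         # hypothetical filename
    summary_path = out_dir / "traffic_summary.csv"  # hypothetical filename
    raw.to_csv(raw_path, index=False, encoding="utf-8")
    summary.to_csv(summary_path, encoding="utf-8")  # keep the category index
    return raw_path, summary_path


# Made-up data for demonstration.
raw = pd.DataFrame({
    "Keyword": ["blue widgets", "widget shop", "buy widgets"],
    "category": ["Easy", "Easy", "Hard"],
    "current_traffic": [70.0, 10.0, 60.0],
})
summary = raw.groupby("category")[["current_traffic"]].sum()

with tempfile.TemporaryDirectory() as tmp:
    raw_path, summary_path = export_outputs(raw, summary, tmp)
    # Read the summary back to confirm the file round-trips cleanly.
    round_trip = pd.read_csv(summary_path, index_col="category")

print(round_trip)
```

In Colab, passing each saved path to files.download from google.colab should then send the CSV to your browser.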
Recommended outputs
- Raw keyword data with current and projected traffic
- Pivot-table summary by keyword category
- Optional chart export for reporting decks
Conclusion
This type of keyword impact analysis is deliberately rough, but still useful. CTR shifts with intent, SERP features, market, and device mix, so the forecast should never be treated as a promise. Used properly, though, it gives you a defensible way to estimate upside and prioritise the keywords that matter most.