Where Does Ahrefs Get Its Backlink Data? The Real Source, Explained

By SM Mehedi Hasan

Where Does Ahrefs Get Its Backlink Data?

Ahrefs gets its backlink data from AhrefsBot, its own web crawler that has scanned the internet 24/7 since 2013. It does not buy or borrow link data from Google or any third party.

Every backlink you see comes from Ahrefs crawling and storing pages itself.

 

If you have ever wondered where Ahrefs gets its backlink data, the honest answer is simpler than most people expect. There is no secret partnership with Google. There is no shared link feed.

Ahrefs built its own crawler, index, and infrastructure, which powers everything in the Backlinks report.

 

Let me break down exactly how that works, what the numbers look like in 2026, and a few things most articles on this topic quietly get wrong.

Where does Ahrefs get its backlink data from?

Ahrefs gets its backlink data from AhrefsBot, a proprietary web crawler that visits and re-visits web pages around the clock.

When AhrefsBot lands on a page, it records every link on it, including the anchor text, the target URL, and whether the link is followed or nofollowed.

 

Here is the part worth sitting with for a second. Ahrefs is not pulling backlinks from Google’s index, Bing, or any external provider.

The company has stated plainly that its link data is collected independently and is not affiliated with other search engines.

 

So when you compare backlink counts between Ahrefs and another tool, and they disagree, that is not a bug. Different crawlers see different slices of the web.

What is AhrefsBot and how does it work?

AhrefsBot is the web crawler that powers Ahrefs’ backlink database and also feeds Yep, the independent search engine Ahrefs runs. Think of it as a tireless reader that opens billions of pages, notes every link, and reports back.

 

Most people picture a crawler as something that grabs a page once and moves on. But AhrefsBot keeps coming back. It re-crawls pages it already knows to catch new links, removed links, and changes to existing ones.

 

The bot behaves like a polite guest, not an intruder:

 

  • It reads and obeys your robots.txt rules, both allow and disallow.

  • It respects crawl-delay settings, so it does not hammer your server.

  • It automatically slows down when a site returns 4xx or 5xx errors.

  • It caches frequently used assets, such as images and CSS, to reduce bandwidth usage.

Cloudflare officially lists AhrefsBot as a verified “good” bot, which matters if you have ever worried about whether to block it.

 

Pro tip

Want AhrefsBot to find your new links faster? Make sure your robots.txt isn’t accidentally blocking the AhrefsBot user agent, and keep your XML sitemap clean. The crawler reads sitemaps and follows the URLs listed in them.
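One way to check for an accidental block is to run your robots.txt rules through Python's standard-library parser. Treat this as a rough sketch: the rules and the example.com URLs below are made up for illustration, and `urllib.robotparser` only approximates how a well-behaved crawler reads the file. It is not AhrefsBot's own logic.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; paste in your own file's text instead.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/

User-agent: AhrefsBot
Crawl-delay: 5
Disallow: /drafts/
"""


def _parse(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def ahrefsbot_allowed(robots_txt: str, url: str) -> bool:
    """True if the rules let the AhrefsBot user agent fetch this URL."""
    return _parse(robots_txt).can_fetch("AhrefsBot", url)


def ahrefsbot_crawl_delay(robots_txt: str):
    """Crawl-delay (in seconds) that applies to AhrefsBot, or None if unset."""
    return _parse(robots_txt).crawl_delay("AhrefsBot")


print(ahrefsbot_allowed(ROBOTS_TXT, "https://example.com/blog/new-post"))
print(ahrefsbot_allowed(ROBOTS_TXT, "https://example.com/drafts/unfinished"))
print(ahrefsbot_crawl_delay(ROBOTS_TXT))
```

If the first check comes back False for a page you expect to earn or pass links, that rule is worth a second look.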

How does AhrefsBot collect backlinks step by step?

  1. Crawl a page. AhrefsBot requests the HTML of a URL it has discovered, either through an existing link, a sitemap, or an IndexNow ping.

  2. Extract every link. It pulls all outbound links from that page, along with details such as anchor text, link type, and HTTP status.

  3. Follow the trail. Each new link becomes a new URL to crawl, so the bot spreads across the web like a reader following footnotes.

  4. Store the data. Everything lands in Ahrefs’ massive key-value database, which holds roughly 170 trillion rows.

  5. Re-crawl and update. High-value pages get visited again and again, so the link data stays current.

And here is why that order matters to you. A backlink only appears in Ahrefs after the bot crawls the page that contains it. No crawl, no data. That single fact explains most “why can’t I see my link” confusion.
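To make the crawl, extract, and follow steps concrete, here is a toy version in Python. It is a sketch under heavy assumptions, not how AhrefsBot is built: the "web" is an in-memory dictionary of made-up pages, and the `Link`, `LinkExtractor`, and `crawl` names are my own. What it does show is the core idea that a link only becomes a record once the page containing it has been parsed.

```python
from collections import deque
from dataclasses import dataclass
from html.parser import HTMLParser
from urllib.parse import urljoin


@dataclass
class Link:
    target_url: str
    anchor_text: str
    nofollow: bool


class LinkExtractor(HTMLParser):
    """Collects every <a href> on a page with its anchor text and rel."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None      # set while we are inside an <a> tag
        self._nofollow = False
        self._text = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "a" and attr.get("href"):
            self._href = urljoin(self.base_url, attr["href"])
            self._nofollow = "nofollow" in (attr.get("rel") or "").lower().split()
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            anchor = " ".join("".join(self._text).split())
            self.links.append(Link(self._href, anchor, self._nofollow))
            self._href = None


def crawl(pages, seed):
    """Breadth-first walk over a toy web given as {url: html}."""
    seen, queue, records = {seed}, deque([seed]), []
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:       # not in the toy web, so nothing to extract
            continue
        extractor = LinkExtractor(url)
        extractor.feed(html)
        extractor.close()
        for link in extractor.links:
            records.append((url, link))            # "store the data"
            if link.target_url not in seen:        # "follow the trail"
                seen.add(link.target_url)
                queue.append(link.target_url)
    return records


# Made-up pages for illustration only.
PAGES = {
    "https://blog.example/post": (
        '<p>See <a href="https://shop.example/">best budget laptops</a> and '
        '<a rel="nofollow" href="/about">about us</a>.</p>'
    ),
    "https://blog.example/about": '<a href="https://blog.example/post">latest post</a>',
}

records = crawl(PAGES, "https://blog.example/post")
for source, link in records:
    status = "nofollow" if link.nofollow else "followed"
    print(source, "->", link.target_url, repr(link.anchor_text), status)
```

Notice that the link on the about page only shows up because the crawl reached that page. Remove the page from `PAGES` and the record never exists, which is the "no crawl, no data" rule in miniature.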

How big is Ahrefs' backlink index in 2026?


Ahrefs runs what it calls the world’s largest index of live backlinks, and the 2026 numbers are genuinely hard to picture. The scale is the whole point, because a small index would miss most of the links pointing at any given site.

Here are the current figures straight from Ahrefs’ own data page:


| Metric | 2026 Figure |
| --- | --- |
| Pages in backlink index | 493.9 billion |
| External backlink records (historical) | 35 trillion |
| Internal backlinks | 28.7 trillion |
| Referring domains (post-vetting) | 500.4 million |
| Pages crawled per minute | 5 million |
| Rows in key-value database | 170 trillion |

To feed all of that, Ahrefs built its own infrastructure rather than renting cloud servers.

The setup runs on hundreds of thousands of CPU cores and petabytes of storage, powered in part by Yep1, a supercomputer that ranks among the 50 fastest in the world.

But raw size is only half the story. Freshness is the other half.
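A quick back-of-envelope check helps put those figures in perspective. The calculation below uses only two numbers quoted above (5 million pages per minute and 493.9 billion pages) and makes a deliberately naive assumption of my own: that every page is visited exactly once at a constant rate. Real crawling is prioritized, so read the result as a rough order of magnitude only.

```python
PAGES_PER_MINUTE = 5_000_000   # crawl rate quoted above
PAGES_IN_INDEX = 493.9e9       # pages in the backlink index quoted above

# Naive assumption: constant rate, every page visited exactly once.
pages_per_day = PAGES_PER_MINUTE * 60 * 24
days_for_one_pass = PAGES_IN_INDEX / pages_per_day

print(f"{pages_per_day:,} pages per day")                           # 7,200,000,000 pages per day
print(f"{days_for_one_pass:.1f} days for one pass over the index")  # 68.6 days for one pass over the index
```

That works out to a little over two months for a single pass, which is at least in the same ballpark as the roughly two-month full refresh described in the next section.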

How often does Ahrefs update its backlink data?

Ahrefs updates its backlink index with fresh data every 15 to 30 minutes.

New links discovered by the crawler are added to the live database almost continuously, which is why the dashboard numbers shift even when you have not changed anything.

There is a longer cycle underneath that, though. A complete refresh of the entire internet’s backlinks takes roughly two months. During that window, some pages are re-crawled 60 times while others are visited only once.

Why the gap? Crawl priority. Ahrefs decides how often to visit a page based on its strength:

  • High Domain Rating (DR) sites get crawled more often and more deeply.

  • High URL Rating (UR) pages get re-checked frequently for new links.

  • Low-authority pages with thousands of URLs may only get partially crawled.

So a link from a strong, frequently updated site shows up quickly. A link buried on a weak, rarely crawled page can take weeks to appear.
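If it helps to picture crawl priority, think of a queue where stronger pages jump the line. The sketch below is purely illustrative. Ahrefs does not publish its scheduling formula, so the `priority_score` weights, the `CrawlJob` class, and the sample pages are all invented for this example.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class CrawlJob:
    priority: float                    # lower value pops first
    url: str = field(compare=False)


def priority_score(dr: int, ur: int) -> float:
    """Hypothetical score with invented weights, NOT Ahrefs' real formula."""
    return 0.6 * dr + 0.4 * ur


def build_queue(pages):
    """pages: iterable of (url, dr, ur). Returns a heap of CrawlJob items."""
    heap = []
    for url, dr, ur in pages:
        # heapq is a min-heap, so negate the score to pop strong pages first.
        heapq.heappush(heap, CrawlJob(-priority_score(dr, ur), url))
    return heap


# Made-up pages: (url, Domain Rating, URL Rating)
PAGES = [
    ("https://weak-blog.example/guest-post", 12, 3),
    ("https://news.example/story", 80, 45),
    ("https://mid-blog.example/roundup", 40, 20),
]

queue = build_queue(PAGES)
crawl_order = [heapq.heappop(queue).url for _ in range(len(PAGES))]
print(crawl_order)
```

In this toy model the news story is crawled first and the guest post last, which mirrors the pattern described above even though the real scoring is unknown.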

In My Experience

Honestly, when I first started tracking a client’s link-building campaign, I expected new backlinks to appear in Ahrefs within a day or two.

They did not. A guest post we placed on a mid-authority blog took almost 11 days to be registered.

I checked the page myself using the AhrefsBot status checker, and the issue was clear: the blog had a low DR and thousands of thin pages, so the crawler simply hadn’t gotten back to it.

Once I understood crawl priority was tied to DR and UR, the delays stopped feeling random.

The thing that surprised me most was how fast links from strong sites appeared by contrast.

A mention from a DR 80 news site landed in the report the same afternoon. Same campaign, wildly different timing, all because of where the link lived.

Does Ahrefs use clickstream data for backlinks?

No. Ahrefs does not use clickstream data for backlinks. This is the single biggest mix-up I see repeated across other articles, and it is worth clearing up directly.

Clickstream data, the anonymized browsing behavior collected from real users, feeds Ahrefs’ keyword and traffic estimates. It helps the tool guess search volumes and clicks. It has nothing to do with the backlink index.

Backlinks come from only one place: AhrefsBot crawling pages. Keyword and search volume data come from a blend of Google Keyword Planner, Google Trends, Search Console signals, and clickstream data.

Most people assume one giant data pipeline powers the whole tool. But Ahrefs actually runs separate systems for separate jobs, and conflating them leads to bad conclusions about how reliable the link data is.

| Data type | Where it comes from |
| --- | --- |
| Backlinks | AhrefsBot crawler only |
| Keyword volumes | Keyword Planner + clickstream + GSC |
| Traffic estimates | Clickstream + ranking data |

AhrefsBot vs AhrefsSiteAudit: which one powers backlinks?

AhrefsBot powers the backlink database. AhrefsSiteAudit does not. They are two separate crawlers with two separate jobs, and confusing them leads people to block the wrong one.

| Crawler | What it does |
| --- | --- |
| AhrefsBot | Builds the global backlink and content index |
| AhrefsSiteAudit | Crawls your own site for technical SEO issues |

If you block AhrefsBot in robots.txt, your site stops contributing to the link index, and your own backlinks may take longer to surface.

If you block AhrefsSiteAudit, you only lose the ability to run a site audit. They are not interchangeable.

A quick note on setting this up for clients

I once watched a developer block “Ahrefs” with a blanket rule, thinking it would protect server resources.

It actually cut the site out of audit reports while leaving the heavier crawler partly active, exactly the opposite of what they wanted. Specificity in robots.txt matters here.
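Here is what a specific rule set can look like, with a quick way to verify it before it goes live. This is a sketch with assumptions: the robots.txt text is a hypothetical example that blocks site audits while leaving the backlink crawler alone, and Python's `urllib.robotparser` stands in for a generic compliant parser. It may not match either Ahrefs crawler in every edge case.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block the audit crawler, leave the backlink crawler alone.
SPECIFIC_RULES = """\
User-agent: AhrefsSiteAudit
Disallow: /

User-agent: AhrefsBot
Allow: /
"""


def allowed_agents(robots_txt, url, agents=("AhrefsBot", "AhrefsSiteAudit")):
    """Report, per user agent, whether the rules permit fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


print(allowed_agents(SPECIFIC_RULES, "https://example.com/pricing"))
# {'AhrefsBot': True, 'AhrefsSiteAudit': False}
```

Naming each crawler in its own group makes the intent readable at a glance, and a check like this catches a rule that does the opposite of what you meant.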

Why isn't your new backlink showing in Ahrefs yet?

Your new backlink is most likely not showing yet because AhrefsBot has not crawled (or re-crawled) the page that contains it. Ahrefs cannot report a link it has not seen, and crawl timing depends largely on that page’s authority.

A few practical causes:

  • The linking page sits on a low-DR domain that is rarely crawled.

  • The linking page is new, and AhrefsBot has not yet discovered it.

  • The site is accidentally blocking AhrefsBot in the robots.txt file.

  • The link lives deep in a large, low-authority site with limited crawl budget.

So the fix is usually patience plus a nudge. You can speed discovery by getting the linking page indexed, building a few internal links to it, or making sure nothing is blocking the crawler.

A real workflow example: tracing where a backlink came from

Let me walk through the full flow, the way it actually happens, from the web page to your Ahrefs report.

  • Input: A blogger publishes an article and links to your homepage with the anchor “best budget laptops.”

  • Process: AhrefsBot, on its regular crawl schedule, requests that blog page. It extracts your link, the anchor text, the followed status, and the HTTP response code.

  • Output: That link record is written into Ahrefs’ key-value database during the next 15 to 30 minute refresh window.

  • Result: You open Site Explorer, check the Backlinks report, and the new link appears with its anchor text and DR. The whole chain ran without Google ever being involved.

That is the entire journey. A crawler saw a page, read a link, stored it, and showed it to you.
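If you think in data structures, the "stored it" step can be pictured as a key-value write. The sketch below is my own simplification. The `LinkRecord` fields, the `(source, target)` key, and the example URLs are assumptions for illustration, since Ahrefs' real schema is not public.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class LinkRecord:
    anchor_text: str
    followed: bool
    http_status: int
    first_seen: date
    last_seen: date


def upsert(store, source_url, target_url, anchor_text, followed, http_status, seen_on):
    """Insert a new link record, or refresh an existing one on re-crawl."""
    key = (source_url, target_url)     # hypothetical key: one record per link
    record = store.get(key)
    if record is None:
        store[key] = LinkRecord(anchor_text, followed, http_status, seen_on, seen_on)
    else:
        record.anchor_text = anchor_text
        record.followed = followed
        record.http_status = http_status
        record.last_seen = seen_on     # first_seen stays untouched
    return store[key]


store = {}
source = "https://blog.example/laptops"
target = "https://yoursite.example/"

# First crawl discovers the link; a later re-crawl confirms it is still there.
upsert(store, source, target, "best budget laptops", True, 200, date(2026, 1, 5))
record = upsert(store, source, target, "best budget laptops", True, 200, date(2026, 1, 19))

print(len(store), record.first_seen, record.last_seen)
```

The re-crawl updates the existing record instead of creating a second one, which is the general idea behind the "first seen" and "last seen" dates you see in backlink reports.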

Common pitfalls people run into

Beginners make the same handful of mistakes when they try to understand Ahrefs’ backlink data. Here are the ones I see most, and why they happen.

  • Assuming Ahrefs uses Google’s links. It does not. People assume this because Google is the link authority, but Ahrefs crawls independently. That is why counts differ across tools.

  • Blaming the tool for missing links. A missing link usually means the page was not crawled yet, not that Ahrefs is broken. Crawl priority is the real reason.

  • Mixing up clickstream and crawl data. Folks read that Ahrefs uses clickstream and assume it applies to backlinks. It only applies to keywords and traffic estimates.

  • Blocking the crawler by accident. A sloppy robots.txt rule can hide your own site from the index, then people wonder why their backlinks lag.

  • Comparing counts at different timestamps. Two tools pulled at different moments will never match. Compare active links over the same window, not raw totals.

Pro tip

Before you trust any backlink comparison between tools, check the “last seen” dates and filter for live links only. Comparing a fresh Ahrefs pull against a stale export from another tool tells you nothing useful.
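A like-for-like comparison can be sketched in a few lines of Python. Everything here is hypothetical: the row layout `(source_url, target_url, last_seen, is_live)` is an assumed export shape, not a real export format from Ahrefs or any other tool, and the sample rows are invented.

```python
from datetime import date


def live_links(rows, window_start, window_end):
    """Keep only live links last seen inside the comparison window."""
    return {
        (source, target)
        for source, target, last_seen, is_live in rows
        if is_live and window_start <= last_seen <= window_end
    }


def compare(tool_a_rows, tool_b_rows, window_start, window_end):
    """Split links into those both tools saw and those only one tool saw."""
    a = live_links(tool_a_rows, window_start, window_end)
    b = live_links(tool_b_rows, window_start, window_end)
    return {"both": a & b, "only_a": a - b, "only_b": b - a}


# Made-up exports: (source_url, target_url, last_seen, is_live)
TOOL_A_ROWS = [
    ("https://a.example/post", "https://yoursite.example/", date(2026, 1, 20), True),
    ("https://b.example/list", "https://yoursite.example/", date(2026, 1, 18), True),
    ("https://c.example/old", "https://yoursite.example/", date(2025, 6, 2), False),
]
TOOL_B_ROWS = [
    ("https://a.example/post", "https://yoursite.example/", date(2026, 1, 15), True),
    ("https://d.example/news", "https://yoursite.example/", date(2026, 1, 19), True),
]

result = compare(TOOL_A_ROWS, TOOL_B_ROWS, date(2026, 1, 1), date(2026, 1, 31))
print({name: len(links) for name, links in result.items()})
```

Once dead and out-of-window links are filtered out, what remains is the part of the gap that actually reflects different crawl coverage.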

Why this matters for your SEO in 2026

Understanding the source of Ahrefs’ backlink data changes how you read every report.

Once you know the data comes from a crawler with priorities, you stop treating delays, gaps, and mismatches like errors and start treating them as predictable behavior.

 

It also helps you set expectations. New links from strong sites appear fast. Links from weak pages crawl slowly. Neither is a flaw, and neither means your link building failed.

 

And as AI search and answer engines reshape how visibility works, Ahrefs has pushed the same crawling muscle into tracking brand mentions across AI Overviews and chat tools.

The crawler that built the link index is now doing double duty, a sign that the underlying data engine is becoming more central, not less.

Frequently Asked Questions

Does Ahrefs get its backlink data from Google?

No. Ahrefs collects all backlink data through its own crawler, AhrefsBot, and is not affiliated with Google or any other search engine.

How long does a new backlink take to show up in Ahrefs?

Usually, a few days to a few weeks. Timing depends on the linking page’s Domain Rating and URL Rating, since stronger pages get crawled more often.

Is AhrefsBot safe to allow on my site?

Yes. Cloudflare verifies AhrefsBot as a good bot. It obeys robots.txt, respects crawl-delay, and slows down automatically if your server struggles.

How complete is Ahrefs’ backlink data?

It is among the most complete in the industry because of the index size, but no crawler sees every link. Some backlinks on rarely crawled pages will be missed.

Why do backlink counts differ between Ahrefs and other tools?

Each tool runs its own crawler and updates at different times. Different crawlers cover different parts of the web, so totals rarely match exactly.
