Bot traffic is a normal part of running a website. If you do not manage it, it can distort your reports, increase your action volume, and make it harder to understand real user behavior.
In this article, we’ll explain what bot traffic is, how it affects your data, and what you can do about it.
What bot traffic is
Bot traffic comes from automated programs rather than real people.
Not all bots are harmful. Some support useful services, while others are designed to misuse your site or data. AI crawlers and agents are also becoming a more common source of automated traffic.
Legitimate bots
Legitimate bots support services that website owners often rely on, such as:
- search engine crawlers, such as Googlebot and Bingbot, that index your content for search results
- uptime and performance monitoring services
- SEO audit and archiving services
- compliance scanners used by consent management platforms
Legitimate bots usually follow the instructions in a robots.txt file. This is a simple text file placed at the root of your domain, for example, yoursite.com/robots.txt, that tells bots which pages or sections they can crawl.
Following robots.txt is voluntary and not technically enforced, but trusted crawlers such as Googlebot generally respect it. It won’t stop malicious bots, but it is a simple and widely used way to guide legitimate bots.
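For example, a minimal robots.txt might look like this (the paths and the bot name are illustrative):

```
# Let all crawlers in, but keep them out of private sections
User-agent: *
Disallow: /admin/
Disallow: /cart/

# Ask one specific crawler to stay away entirely
User-agent: ExampleBot
Disallow: /
```

Well-behaved crawlers read this file before crawling and skip the disallowed paths; malicious bots simply ignore it.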
Malicious bots
Malicious bots ignore your site’s rules or try to abuse it.
Examples of such bots include:
- content scrapers
- form spammers
- ad fraud bots
- credential stuffing tools
AI crawlers and agents
AI crawlers and agents don’t fit neatly into either category. Some behave like legitimate crawlers, while others can generate automated traffic that may appear in your reports.
How bot traffic affects your data
Bot traffic can affect your reports in several ways:
- Page views, sessions, and actions may appear higher than they really are.
- Engagement metrics such as bounce rate, time on page, and scroll depth may become less reliable.
- Conversion data may be affected, including form submissions, funnel completions, and goals.
- Campaign data may become less reliable when visits, clicks, and tagged sessions are inflated.
If bot traffic is not identified and filtered, your reports may not reflect real user behavior.
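To illustrate the distortion, here is a short sketch with made-up numbers showing how bot sessions dilute a conversion rate. Bots rarely convert, so they inflate the denominator without adding conversions:

```python
# Hypothetical numbers: 1,000 real sessions produce 50 conversions.
real_sessions = 1000
conversions = 50

true_rate = conversions / real_sessions  # 5%

# 500 bot sessions reach the site and trigger tracking, but never convert.
bot_sessions = 500
reported_rate = conversions / (real_sessions + bot_sessions)  # ~3.33%

print(f"True conversion rate:     {true_rate:.2%}")
print(f"Reported conversion rate: {reported_rate:.2%}")
```

In this example, unfiltered bot traffic makes a 5% conversion rate look like roughly 3.33%, which can mislead budget and optimization decisions.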
How Piwik PRO manages bot traffic
Piwik PRO includes built-in bot filtering that can exclude many known crawlers from your analytics reports. This means common bots, such as major search engine crawlers, are usually not counted as long as bot filtering is turned on.
If you still see suspicious traffic in your reports, you can take a few extra steps:
1. Set up custom crawler filters
Exclude additional bots by adding their user agent strings in your site- or account-level settings. You can do this in Menu > Administration > Data collection > Filters > Add crawlers. Read more
2. Use global site settings
Check that bot filtering is turned on in your settings. Go to Menu > Administration > Account > Global site & app settings. Read more
Bot filtering improves data quality in your reports, but it doesn’t stop bots from reaching your website. If bots are causing heavy traffic, fraud, or other issues, you need to block them before they reach your site.
How to spot suspicious traffic
If you notice unusual activity in your reports, look for patterns such as:
- sudden spikes in sessions or page views
- very low or zero engagement
- unusual locations or device distributions
- high traffic from one internet service provider (ISP), IP range, or user agent
- unexpected referrer sources
You can use segments to separate suspicious traffic from other traffic and see how it affects your data.
For example, you can group traffic by user agent, internet service provider, or behavior. This can help you understand the problem and share useful details with your IT or security team.
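As a rough sketch of this kind of triage outside the UI, suppose you export session data with a user agent, page-view count, and time on site per session (the columns and values here are assumptions for illustration, not a Piwik PRO export format):

```python
from collections import defaultdict

# Hypothetical exported rows: (user_agent, pages_viewed, seconds_on_site)
sessions = [
    ("Mozilla/5.0 (Windows NT 10.0)", 4, 120),
    ("Mozilla/5.0 (Windows NT 10.0)", 2, 45),
    ("curl/8.4.0", 1, 0),
    ("curl/8.4.0", 1, 0),
    ("curl/8.4.0", 1, 0),
    ("python-requests/2.31", 1, 0),
]

# Group sessions by user agent.
by_agent = defaultdict(list)
for agent, pages, seconds in sessions:
    by_agent[agent].append((pages, seconds))

# Flag agents whose traffic is all zero-engagement
# (one page, no time on site) -- a common bot pattern.
suspicious = [
    agent
    for agent, visits in by_agent.items()
    if all(pages <= 1 and seconds == 0 for pages, seconds in visits)
]

print(suspicious)  # agents worth reviewing with your IT or security team
```

The same grouping logic can be expressed as a segment in your reports; the script only makes the reasoning explicit.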
Bot filtering is only part of bot traffic management
Bot filtering in Analytics helps keep your data more accurate, but it does not address all bot-related issues. To manage bot traffic more effectively, you can also use:
| Method | What it does | Owner |
|---|---|---|
| robots.txt | Tells well-behaved crawlers which pages or sections to avoid. Simple to set up, but voluntary and not effective against malicious bots. | Web team/site administrator |
| WAF / CDN (for example, Cloudflare, Akamai, AWS) | Blocks or challenges suspicious traffic before it reaches your site. This may include rate limiting, IP blocklists, or JavaScript challenges. | Security / IT |
| Bot management platforms | Detects more advanced bots, including bots that try to behave like real users. | Security / IT |
| Analytics (Piwik PRO) | Filters known bots from reports, helps you isolate suspicious traffic, and gives you data you can share with other teams for follow-up action. | Analytics / data |
Keep in mind that Analytics filtering is reactive. It works after traffic has already reached your site and triggered data collection.
If bot traffic is causing major issues, the most effective fix usually happens earlier, at the WAF, CDN, or bot management layer.
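For illustration, the rate limiting that a WAF or CDN applies can be sketched as a token bucket. This is a simplified model of the technique, not any vendor's implementation:

```python
import time

class TokenBucket:
    """Allow at most `rate` requests per second, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens in proportion to the time elapsed since the last request.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # over the limit: block or challenge this client

bucket = TokenBucket(rate=1.0, capacity=5)
results = [bucket.allow() for _ in range(10)]  # a burst of 10 rapid requests
print(results.count(True))  # roughly the first 5 pass; the rest are rejected
```

A real WAF applies this kind of check per IP address or per client fingerprint, before the request ever reaches your site or triggers analytics tracking.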
How to manage AI-driven traffic
AI-driven traffic is a newer type of automated traffic. Some teams choose to exclude it to keep reports focused on human journeys, while others monitor it separately to understand how AI systems interact with their content. In Piwik PRO, you can take either approach, depending on your measurement strategy.
The table below shows recommended actions, who should handle them, and where to apply them.
| Action | Who | Where |
|---|---|---|
| Turn on built-in bot filtering | Analytics team | Global site settings |
| Add custom crawler exclusions | Analytics team | Site or account settings |
| Review and monitor suspicious traffic | Analytics team | Piwik PRO reports |
| Share identified bot patterns with IT | Analytics team and IT | WAF / CDN controls |
| Set up bot protection before traffic reaches your site | IT / Security | WAF, CDN, or bot management platform |
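If you decide to monitor AI-driven traffic separately, one low-tech approach is to match user agent strings against known AI crawler tokens. The token list below is illustrative and changes over time, so verify it against each vendor's documentation:

```python
# Illustrative user agent tokens for AI crawlers; check vendor docs for current names.
AI_CRAWLER_TOKENS = ["GPTBot", "ClaudeBot", "PerplexityBot", "CCBot"]

def is_ai_crawler(user_agent: str) -> bool:
    """Return True if the user agent string contains a known AI crawler token."""
    return any(token.lower() in user_agent.lower() for token in AI_CRAWLER_TOKENS)

print(is_ai_crawler("Mozilla/5.0 AppleWebKit/537.36 (compatible; GPTBot/1.2)"))  # True
print(is_ai_crawler("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))                # False
```

The same token list can feed a custom crawler filter or a segment, so the analytics and security teams work from one shared definition of AI-driven traffic.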