Introduction
DDoS attacks have long been a thorn in the side of network operators—but AMS-IX faced a particularly unusual challenge. Unlike the massive volumetric attacks that make headlines, the attacks targeting their management network were low-bandwidth but high-flow, exploiting vulnerabilities in session tables, firewall logging, and internal routing. Even small bursts of traffic could cascade into network-wide disruptions, impacting office connectivity, VPN users, internal services, and DNS resolution.
This deep dive explores the full journey of how AMS-IX, in an implementation led by Stavros Konstantaras, built an automated, resilient DDoS mitigation system from scratch. In this detailed technical analysis, you’ll learn:
- How a few megabits per second of traffic brought the network to its knees, and the chain reactions that caused the collapse.
- The architecture of AMS-IX’s management network, including routers, firewalls, and the spine-leaf management fabric, and how traffic flows under normal and attack conditions.
- The challenges and limitations of manual mitigation, including firewall tuning, session table management, and security team coordination.
- How AMS-IX designed a fully automated mitigation pipeline, combining FastNetMon detection, Python/BERT/BGP orchestration, and NAVAS scrubbing.
- Testing strategies, lessons learned, and results, including real-world attacks that were mitigated automatically without human intervention.
- Future improvements, including router upgrades, migration from NetFlow to IPFIX, smarter mitigation algorithms, and IPv6 strategies.
This deep dive is written for network engineers, NOC teams, and technical decision-makers, providing a full account of the architecture, workflows, and operational considerations that went into protecting a critical Internet exchange.
Special thanks to Stavros Konstantaras, Senior Network Engineer at AMS-IX, for designing, implementing, and stress-testing this system, and for sharing his experience with us. We also want to thank AMS-IX for enabling this case study and providing a detailed real-world example of complex network defence.
The Problem: When Small Attacks Cause Big Collapses
The saga began when a few megabits of UDP traffic targeting AMS-IX’s public DNS servers brought the management network to a halt. VPN users were disconnected, internal email and messaging stopped, NAT and DNS transit were disrupted, and internal services became unreachable. Oddly, the production network—the one carrying customer traffic—remained unaffected.
Stavros and the team quickly realised this wasn’t a typical volumetric DDoS attack. Very modest traffic volumes of a few Mbps caused cascading failures in the admin network that culminated in sudden disruption of office connectivity, leaving the entire team without access to internal and external resources for several minutes at a time.
Step by step, the investigation revealed the chain reactions:
Traffic volumes appeared harmless: Graphs showed only a few megabits per second reaching the internal servers.
Firewalls became the bottleneck: CPU and session tables maxed out because each DNS query, even a valid one, consumed firewall state. The CPU overload timestamps matched the attack timestamps.
Cascading failures ensued: Overloaded firewalls triggered LACP drops, OSPF session failures, and lost default gateways, leading to internal applications spiking syslog traffic. This created a feedback loop, compounding the problem.
NetFlow overhead worsened the situation: Enabling NetFlow on the Palo Alto firewalls added CPU load, accelerating the collapse.
The team tried built-in firewall mitigations, including zone protection and session limits, but nothing prevented the downtime. Manual intervention proved too slow—the network needed a fully automated detection and mitigation system to stop the attacks in real time.
Understanding the Network Architecture
To better illustrate the problem and the solution, it is useful to understand exactly how traffic flows through the AMS-IX management network. The architecture looks rather straightforward:
Border routers: Two Cisco ASR1001 routers handled initial packet inspection and forwarding.
Firewall clusters: Two clusters of Palo Alto 3050 firewalls in active-passive mode managed security inspection and session tracking.
Management layer: Dell switches running Pluribus in a spine-leaf fabric connected the network internally.
Internal services: DNS, HTTP, mail, and other internal services resided behind the firewalls in the DMZ.
Traffic flow into the network looks as follows:
- Packet arrives from transit providers and hits a border router.
- Border router verifies the packet and forwards it to the management fabric.
- The firewall inspects the payload, establishes a session, and forwards it to the internal DNS server.
- DNS response is returned through the same path.
Yet this “simple” flow hid a problem: firewalls logged sessions for every packet, creating a massive state burden. Even small attack bursts quickly consumed session tables, triggering the chain reaction described above.
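The arithmetic behind this failure mode is worth spelling out. A back-of-envelope sketch (all figures below are illustrative assumptions, not AMS-IX's measured values) shows how a few Mbps of small UDP queries can saturate a stateful firewall:

```python
# Back-of-envelope: why a few Mbps of DNS queries exhausts a session table.
# Every number here is an illustrative assumption, not a measured AMS-IX value.

ATTACK_MBPS = 5                 # modest attack bandwidth
QUERY_BYTES = 100               # small UDP DNS query on the wire
SESSION_TABLE_SIZE = 250_000    # assumed firewall session capacity
UDP_TIMEOUT_S = 30              # assumed UDP session idle timeout

bits_per_query = QUERY_BYTES * 8
queries_per_second = ATTACK_MBPS * 1_000_000 // bits_per_query

# Each spoofed query opens a fresh session that lingers until the timeout,
# so steady-state occupancy is rate * timeout.
steady_state_sessions = queries_per_second * UDP_TIMEOUT_S
seconds_to_fill = SESSION_TABLE_SIZE / queries_per_second

print(f"{queries_per_second:,} new sessions/s")             # 6,250 new sessions/s
print(f"steady state: {steady_state_sessions:,} sessions")  # 187,500 sessions
print(f"table full in ~{seconds_to_fill:.0f} s")            # ~40 s
```

With these assumptions, a 5 Mbps stream opens thousands of sessions per second and fills a quarter-million-entry table in well under a minute, long before any bandwidth graph looks alarming.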
Early Mitigation Attempts and Limitations
The first attempts to mitigate the attacks were purely manual:
- Firewall tuning (Palo Alto zone protection, session limits)
- Manual activation of scrubbing services from NBIP/NAVAS
- Offloading some public services to the cloud to absorb part of the traffic
Despite these efforts, attacks continued to cause downtime. The flow rate was too high for the firewall to handle, and manual mitigation took too long—by the time an engineer reacted, firewalls were already maxed out. The need for automation became clear.
Designing the Automated DDoS Mitigation Pipeline
The solution combined three key elements: detection (the brain), mitigation (the shield), and orchestration (the glue).
1. The Brain: FastNetMon Detection
FastNetMon provided:
- Reliable, automated detection of flow-based DDoS attacks
- Support for multiple sampling methods (NetFlow, sFlow, IPFIX)
- Integration with custom scripts to trigger mitigation
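FastNetMon Community is driven by a flat key = value configuration file (typically /etc/fastnetmon.conf). A minimal sketch of the options relevant to this setup follows; the thresholds and paths are illustrative assumptions, not AMS-IX's production values:

```ini
# /etc/fastnetmon.conf — illustrative fragment, not AMS-IX's real config

# Accept flow telemetry exported by the border routers
netflow = on
netflow_port = 2055

# Ban decisions driven by packet and flow rates rather than bandwidth,
# matching the low-bandwidth / high-flow attack profile
ban_for_pps = on
ban_for_flows = on
threshold_pps = 20000
threshold_flows = 3500

# Hand off every ban/unban event to the orchestration script
enable_ban = on
notify_script_path = /usr/local/bin/notify_about_attack.sh
```

Tuning the flow and pps thresholds, rather than bandwidth thresholds, is what lets the detector catch attacks that look tiny on a traffic graph.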
2. The Shield: NAVAS Scrubbing
NAVAS served as a scrubbing centre:
- During an attack, traffic to the affected prefixes is redirected to NAVAS for cleaning
- All diverted traffic, good and bad, passes through the scrubber; only clean traffic is returned
- An existing contract and infrastructure allowed AMS-IX to use it without extra cost
3. The Glue: BGP Orchestration with Python and BERT
Automation relied on a custom pipeline:
- Traffic sampling: Border routers send flow data to FastNetMon via the peering fabric.
- Attack detection: FastNetMon detects an attack and triggers notify_about_attack.sh.
- Python orchestration: Custom script determines which prefixes are under attack and configures BERT.
- BGP signalling: BERT communicates with routers to advertise prefixes to NAVAS for scrubbing.
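The pipeline above can be sketched as a small handler behind the notify script. FastNetMon invokes its notify script with, roughly, the attacked IP, the traffic direction, the packet rate, and an action ("ban" or "unban"); everything else here — the prefix list and the announce/withdraw hooks standing in for BERT — is a simplified assumption, not AMS-IX's actual code:

```python
#!/usr/bin/env python3
"""Sketch of a FastNetMon notify-script handler (simplified, hypothetical).

Assumed invocation, mirroring FastNetMon's notify interface:
    notify_about_attack.py <ip> <direction> <pps> <action>
"""
import ipaddress
import sys

# Management prefixes eligible for diversion to the scrubbing centre
# (example ranges from RFC 5737/3849, not AMS-IX's real address space).
SCRUBBABLE_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8:100::/48"),
]

def prefix_under_attack(target_ip: str):
    """Return the configured prefix covering the attacked host, if any."""
    addr = ipaddress.ip_address(target_ip)
    for net in SCRUBBABLE_PREFIXES:
        if addr in net:
            return net
    return None

def announce_to_scrubbing(prefix) -> None:
    # Placeholder hook: in the real pipeline, BERT originates a BGP
    # announcement that steers this prefix toward NAVAS.
    print(f"announce {prefix} -> scrubbing")

def withdraw_from_scrubbing(prefix) -> None:
    # Placeholder hook: BERT withdraws the diversion once the attack ends.
    print(f"withdraw {prefix}")

def main(argv) -> None:
    target_ip, direction, pps, action = argv[1:5]
    prefix = prefix_under_attack(target_ip)
    if prefix is None:
        return  # attacked host is outside the scrubbable ranges
    if action == "ban":
        announce_to_scrubbing(prefix)
    elif action == "unban":
        withdraw_from_scrubbing(prefix)

if __name__ == "__main__" and len(sys.argv) >= 5:
    main(sys.argv)
```

The key design point survives the simplification: only the covering prefix of the attacked host is diverted, so the rest of the management network keeps its normal path.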
Testing the System
To ensure reliability, Stavros and the team:
- Launched internal DDoS tests using a virtual machine hosted several hops away to mimic real-world conditions
- Generated millions of packets for both IPv4 and IPv6 DNS queries
- Measured reaction time from detection to mitigation, achieving full mitigation in ~45 seconds for IPv4 traffic
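The team's load-generation tooling is not public, but producing this kind of traffic needs nothing exotic. A minimal sketch using only the Python standard library, hand-crafting the DNS wire format per RFC 1035 (the target address and query name are placeholders; IPv6 would use an AF_INET6 socket and AAAA queries):

```python
import socket
import struct

def build_dns_query(qname: str, txid: int = 0x1234) -> bytes:
    """Build a minimal DNS A-record query by hand (RFC 1035 wire format)."""
    header = struct.pack(
        ">HHHHHH",
        txid,    # transaction ID
        0x0100,  # flags: standard query, recursion desired
        1, 0, 0, 0,  # QDCOUNT=1, ANCOUNT/NSCOUNT/ARCOUNT=0
    )
    question = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in qname.split(".")
    ) + b"\x00"                            # root label terminator
    question += struct.pack(">HH", 1, 1)   # QTYPE=A, QCLASS=IN
    return header + question

def flood(target: str, count: int, qname: str = "example.com") -> None:
    """Send `count` identical queries as fast as the socket allows."""
    packet = build_dns_query(qname)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    for _ in range(count):
        sock.sendto(packet, (target, 53))
```

Run only against infrastructure you own; even this naive single-threaded loop generates the high-flow, low-bandwidth pattern that caused the original outages.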
Summary and Results
The story paints a picture of a painful reality many engineers can relate to: even small amounts of UDP DNS traffic—just a few megabits per second—were enough to bring down the network. Firewalls hit session limits, CPUs spiked, and internal services failed. Engineers scrambled with manual mitigation steps, often arriving too late, leading to repeated periods of downtime and operational disruption.
The solution, meticulously designed and implemented by Stavros and his team, turned this around. FastNetMon became the brain of the operation, automatically detecting suspicious traffic flows and signalling exactly which prefixes were under attack. Python scripts and BERT handled the orchestration, sending only the affected subnets to NAVAS for scrubbing, while clean traffic continued uninterrupted.
The solution was clean, simple, and effective. During internal testing, millions of attack packets were sent at once, yet the system achieved full mitigation in about 45 seconds. Later, in live conditions, multiple DDoS attempts were fully mitigated without the NOC even noticing—engineers no longer had to spend time and resources responding manually.
In short, what had once been painful disruption is now a seamless, automated defence. The management network remains stable under attack, firewalls no longer collapse, and services stay online. The solution not only restored reliability but also gave the engineering team confidence and peace of mind: the network can now reliably withstand the kind of relentless, bot-driven attacks that had previously been so damaging.
This case illustrates how deep technical understanding, combined with automation and carefully integrated tools, can transform a reactive, fragile network into a resilient, self-defending system. It is a blueprint for engineering teams facing similar DDoS challenges: identify the weak points, automate detection, and integrate mitigation tightly with routing and scrubbing mechanisms.
About FastNetMon
FastNetMon is a leading solution for network security, offering advanced DDoS detection and mitigation. With real-time analytics and rapid response capabilities, FastNetMon helps organisations protect their infrastructure from evolving cyber threats. For more information, visit https://fastnetmon.com

