Learning DPDK: Eliminating NIC Receive Drops at 100GbE

FastNetMon



June 17, 2026


This post is a repost of a technical blog post originally published by Denys Haryachyy, shared here with permission as part of ongoing research and engineering work around FastNetMon’s inline traffic processing capabilities.

The article examines the underlying performance mechanics needed to run DPDK-based packet processing at 100GbE without packet loss, and how subtle operating system and interrupt behaviour can directly impact stability under extreme load. It is closely related to our R&D project, code-named FastACL, which builds on these same principles at the system level to deliver a VPP-based inline DDoS filtering engine.

Learning DPDK: Eliminating NIC Receive Drops at 100GbE

TL;DR — everything that matters, in one paragraph. A DPDK poll-mode application can lose packets at 100GbE even when the CPU has spare cycles — because a hardware IRQ landing on a busy-poll worker core stalls it for a few microseconds, and at 100+ Mpps that’s enough to overflow the NIC’s RX descriptor ring before the worker can drain it. The fix is isolation, not more CPU. Pinning the NIC’s completion IRQs off the worker cores (onto a housekeeping core) cut the receive-miss rate from 1 in 2,275 packets to 1 in 16,000,000 — about a 7,000× reduction. To take the residual to essentially zero, isolate the worker cores from the kernel entirely with boot parameters: isolcpus, nohz_full, rcu_nocbs, irqaffinity=0, and processor.max_cstate=1. None of this costs throughput — it just stops anything from interrupting the cores that poll the NIC.

A DPDK worker is a tight loop that does nothing but poll the NIC and process packets. It has no slack: if the OS steals it for even a few microseconds, the NIC keeps filling the RX ring with no one draining it. At low rates that’s invisible; at 100GbE the ring overflows and you get silent drops.

Figure 1: By default a NIC completion IRQ can land on a busy-poll worker core and preempt it for microseconds, long enough to overflow the RX ring during a burst, so `rx_missed` rises.

Figure 2: Pin the NIC IRQs to a housekeeping core (core 0) and the workers poll uninterrupted; `rx_missed` drops to ≈ 0.

The Symptom: `rx_missed` Under Load

The tell is the NIC’s rx_missed (a.k.a. rx_missed_errors / PHY discards) counter rising under load while CPU utilization shows headroom. The packets never reach the application — the NIC dropped them because the RX descriptor ring was full when they arrived. More workers won’t help; the cores aren’t saturated. Something is interrupting them.
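One way to watch for this symptom, assuming a bifurcated driver such as mlx5 where the kernel netdev stays visible to `ethtool`, is to sum the drop-style counters from `ethtool -S` and compare two samples. The helper below is only a sketch: the function name `sum_rx_drops` is hypothetical, and the name pattern (`miss`, `discard`, `out_of_buffer`) is an assumption, since counter names vary by driver and firmware and some drivers report both per-queue and aggregate values.

```shell
# Sum drop-style RX counters from `ethtool -S` output read on stdin.
# ASSUMPTION: the name pattern below is a guess; adjust it to your driver.
sum_rx_drops() {
  awk -F: '
    $1 ~ /rx/ && $1 ~ /(miss|discard|out_of_buffer)/ { total += $2 }
    END { printf "%.0f\n", total }
  '
}

# Example (the interface name enp1s0f0 is hypothetical):
#   before=$(ethtool -S enp1s0f0 | sum_rx_drops)
#   sleep 10
#   after=$(ethtool -S enp1s0f0 | sum_rx_drops)
#   echo "new RX drops in 10 s: $((after - before))"
```

A delta that keeps growing while the worker cores still show headroom matches the pattern described above.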

The Cause: Hardware IRQs on Poll Cores

Even with a bifurcated or poll-mode driver, the NIC still raises completion/async interrupts, and by default the kernel is free to deliver them to any core — including the ones running your DPDK workers. On the box behind these numbers, the mlx5 completion IRQs defaulted onto cores 11–13, right on top of busy-poll workers (Figure 1).

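Each IRQ preempts the poll loop for only a few microseconds, but stalls can stretch when several interrupts or other kernel work land back to back. Do the math: at ~30 Mpps per queue, a single 100 µs stall lets **~3,000 packets** arrive with nobody draining them, and even an empty 8,192-entry RX ring fills in about 273 µs. During a burst the ring is already partly full, so the real headroom is much smaller. One stray interrupt at the wrong moment is a ring overflow.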
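The stall arithmetic can be reproduced with a small sketch. The function names are hypothetical; the only inputs are the per-queue rate, the stall length, and the ring size quoted in the text. Because 1 Mpps is one packet per microsecond, the units cancel without any scaling.

```shell
# Packets that arrive on one queue while its worker is stalled.
#   $1 = per-queue rate in Mpps, $2 = stall length in microseconds
stall_packets() {
  awk -v mpps="$1" -v us="$2" 'BEGIN { printf "%.0f\n", mpps * us }'
}

# Microseconds for an empty RX ring to fill at a given rate.
#   $1 = ring size in descriptors, $2 = per-queue rate in Mpps
ring_fill_us() {
  awk -v ring="$1" -v mpps="$2" 'BEGIN { printf "%.0f\n", ring / mpps }'
}
```

With the figures above, `stall_packets 30 100` prints 3000 and `ring_fill_us 8192 30` prints 273, so the margin depends on how full the ring already is when the stall begins.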

Pinning IRQs Off the Workers

The first and biggest win is to keep NIC IRQs away from worker cores. Steer every device IRQ to a housekeeping core (core 0) by writing its smp_affinity:

    # Send every mlx5 IRQ to core 0 (mask 0x1); run as root.
    # IRQ numbers are the first column of /proc/interrupts.
    for irq in $(awk '/mlx5/ { sub(/:$/, "", $1); print $1 }' /proc/interrupts); do
      echo 1 > /proc/irq/$irq/smp_affinity 2>/dev/null
    done
    # and stop irqbalance from moving them back
    systemctl stop irqbalance

The effect on this hardware:

State	rx-miss rate	Drop rate
Before IRQ pin	1 / 2,275 pkts	0.044 %
After IRQ pin	1 / 16,000,000 pkts	0.0000062 %

That’s a ~7,000× reduction from one change — and irqbalance must be stopped, or it will quietly reassign the IRQs back onto the workers a minute later.
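To confirm the pinning held, it can help to list where each NIC IRQ is currently allowed to run. The sketch below assumes the driver name appears in `/proc/interrupts` (as it does for mlx5 completion vectors); the function names `irqs_for` and `show_irq_affinity` are hypothetical.

```shell
# Print the IRQ numbers whose /proc/interrupts line mentions a driver name.
#   $1 = name to match (e.g. mlx5)
#   $2 = file to read (defaults to /proc/interrupts)
irqs_for() {
  awk -v pat="$1" '$0 ~ pat { sub(/:$/, "", $1); print $1 }' "${2:-/proc/interrupts}"
}

# Show which CPUs each matching IRQ may currently be delivered to.
show_irq_affinity() {
  for irq in $(irqs_for "$1"); do
    printf '%s -> CPUs %s\n' "$irq" "$(cat "/proc/irq/$irq/smp_affinity_list" 2>/dev/null)"
  done
}
```

Running `show_irq_affinity mlx5` a few minutes after the change should still report CPU 0 for every line; anything else suggests that irqbalance or another tool has moved the IRQs back.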

Full Isolation: `isolcpus`, `nohz_full`, `rcu_nocbs`

To take the residual misses to essentially zero, remove the worker cores from the kernel’s reach at boot. Append to the kernel command line (/etc/default/grub, then update-grub and reboot):

    isolcpus=1-32 nohz_full=1-32 rcu_nocbs=1-32 irqaffinity=0 processor.max_cstate=1

Each one closes a different interruption source:

isolcpus=1-32: keep the scheduler from placing any other task on the worker cores.
nohz_full=1-32: stop the periodic scheduler tick on those cores while a single task is runnable (with CONFIG_HZ=1000 that removes a timer interrupt every millisecond).
rcu_nocbs=1-32: move RCU callback processing off the worker cores onto housekeeping cores.
irqaffinity=0: default all IRQs to core 0 from boot, before userspace even starts.
processor.max_cstate=1: forbid deep C-states, so a core never takes 50–200 µs to wake from idle. Where the intel_idle driver is in use, intel_idle.max_cstate may also need to be set.

Use the same core range as your DPDK workers. Together these leave the poll loop as essentially the only thing running on a worker core, which is the entire point.
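After the reboot it is worth checking that the parameters actually reached the kernel. The following is a minimal sketch, assuming a Linux system that exposes `/proc/cmdline` and the `isolated` and `nohz_full` files under `/sys/devices/system/cpu`; the function names `has_boot_param` and `check_isolation` are hypothetical.

```shell
# Succeed if a kernel command line contains the given parameter name.
#   $1 = parameter name (e.g. isolcpus)
#   $2 = command line text (defaults to the running kernel's /proc/cmdline)
has_boot_param() {
  cmdline=${2:-$(cat /proc/cmdline)}
  case " $cmdline " in
    *" $1="*|*" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report each isolation parameter, then the CPU ranges the kernel applied.
check_isolation() {
  for p in isolcpus nohz_full rcu_nocbs irqaffinity processor.max_cstate; do
    if has_boot_param "$p"; then echo "ok:      $p"; else echo "missing: $p"; fi
  done
  # The nohz_full file may be absent on kernels built without NO_HZ_FULL.
  echo "isolated:  $(cat /sys/devices/system/cpu/isolated 2>/dev/null)"
  echo "nohz_full: $(cat /sys/devices/system/cpu/nohz_full 2>/dev/null)"
}
```

Calling `check_isolation` should print `ok` for all five parameters and show the same range (1-32 in this example) on the last two lines.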

Summary

NIC drops at 100GbE are usually interruption, not CPU saturation — rx_missed rises with cores to spare.
A hardware IRQ stalls a busy-poll worker for µs; at 30 Mpps/queue that overflows the RX ring.
Pin NIC IRQs to a housekeeping core (smp_affinity) and stop irqbalance — here a ~7,000× drop reduction.
Isolate the worker cores at boot: isolcpus, nohz_full, rcu_nocbs, irqaffinity=0, processor.max_cstate=1.
None of it costs throughput — it removes everything that competes with the poll loop.

