How Cloudflare Killed a Critical Linux Kernel Bug Across 330 Cities

On April 29, 2026, a Linux kernel privilege-escalation vulnerability surfaced publicly under the name “Copy Fail” (CVE-2026-31431). The flaw lets an unprivileged attacker inject shellcode into a setuid-root binary and gain full root access with no special permissions. It was a serious threat, but Cloudflare’s response was swift and precise.
By the time disclosure hit, Cloudflare’s teams had already begun assessing exposure across the company’s global fleet, which spans 330 cities. Within hours, they confirmed no customer data was at risk and no services were disrupted.
Understanding the flaw helps explain why the response mattered so much. The bug lives inside the algif_aead kernel module. That module handles Authenticated Encryption with Associated Data (AEAD) ciphers via the Linux kernel’s AF_ALG socket family. A 2017 optimization introduced an in-place crypto operation without proper boundary enforcement. As a result, calling recvmsg() triggers a 4-byte out-of-bounds write into the page cache.
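To make the AF_ALG interface concrete, here is a minimal sketch of a probe that checks whether an unprivileged process can reach the kernel's AEAD socket type at all. It is not exploit code and performs no crypto operation. The function name afalg_aead_available and the default algorithm "gcm(aes)" are illustrative choices, not anything taken from Cloudflare's tooling. Note that on many systems the bind itself can prompt the kernel to load the relevant module, so treat this as something to run on a test host.

```python
import socket


def afalg_aead_available(algorithm: str = "gcm(aes)") -> bool:
    """Return True if an AF_ALG socket of type "aead" can be bound to `algorithm`.

    Returns False when the platform has no AF_ALG support, when the kernel
    refuses the socket or bind call, or when the algorithm is unknown.
    """
    family = getattr(socket, "AF_ALG", None)  # only defined on Linux builds
    if family is None:
        return False
    try:
        with socket.socket(family, socket.SOCK_SEQPACKET, 0) as sock:
            # AF_ALG addresses are (type, name) tuples; "aead" selects the
            # AEAD socket type discussed above.
            sock.bind(("aead", algorithm))
        return True
    except OSError:
        return False


if __name__ == "__main__":
    print("AF_ALG aead reachable:", afalg_aead_available())
```

A result of False after a mitigation is applied, where it was True before, is one rough way to see that the socket path is closed for a given process.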
By combining splice() system calls with crafted sendmsg() parameters, an attacker can target a specific file. The default target is /usr/bin/su, a setuid-root binary on virtually every Linux distribution. The injected shellcode then executes with root privileges. The upstream fix, commit a664bf3d603d, reverts the 2017 optimization entirely.
Cloudflare runs a custom Linux kernel based on Long-Term Support (LTS) versions. Builds update roughly every week through automated pipelines and staged rollouts. When a CVE goes public, the fix has typically been in LTS releases for weeks already. At disclosure time, most infrastructure ran the 6.12 LTS line, with a subset transitioning to 6.18.
The team’s first move was validating detection. Cloudflare’s servers run behavioral detection that monitors process execution patterns continuously, without relying on known vulnerability signatures. When engineers internally validated the exploit, the platform flagged the full execution chain within minutes. No signature update, no rule change, no human intervention required. That coverage existed before the team wrote a single vulnerability-specific rule.
Simultaneously, security began proactive threat hunting. The team searched 48 hours of fleet-wide logs for any trace of exploitation before public disclosure. The exploit leaves a distinctive kernel log signature. Access logs were pulled, binaries were validated against cryptographic hashes, and network connections were audited. Everything came back clean.
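The binary-validation step can be sketched in a few lines. This is an assumption-laden illustration, not Cloudflare's hunting tooling: the helper names sha256_of and find_mismatches are hypothetical, and the known-good digests are assumed to come from a trusted source such as the package manager's database. An ordinary file read is normally served from the page cache, so a hash computed this way should reflect a page-cache modification even if the bytes on disk are untouched.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(known_good):
    """Compare files against expected digests.

    `known_good` maps a file path to its expected hex SHA-256 digest.
    Returns the paths whose digest differs or that cannot be read.
    """
    mismatches = []
    for path, expected in known_good.items():
        try:
            actual = sha256_of(path)
        except OSError:
            mismatches.append(path)  # unreadable or missing counts as suspect
            continue
        if actual != expected.lower():
            mismatches.append(path)
    return mismatches
```

An empty list from find_mismatches corresponds to the "everything came back clean" outcome for the files that were checked.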
Engineering then pursued two parallel mitigation paths. Removing the algif_aead module was the simplest fix, but it was too disruptive because internal services legitimately depended on the kernel crypto API. Instead, the team turned to bpf-lsm, a tool Cloudflare had already built for exactly this scenario. The approach denies the socket_bind LSM hook for any binary not on an explicit allow-list. This blocks the exploit path while leaving the module loaded for legitimate users.
Before enforcing, the team used prometheus-ebpf-exporter to map AF_ALG usage per binary across hundreds of thousands of servers. Results confirmed one known internal service was the sole legitimate user. Only then did enforcement roll out, first in visibility-only mode and then in blocking mode.
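The allow-list logic is easier to follow as code. What follows is a user-space model of the decision only, written in Python for readability. It is not a BPF program and not Cloudflare's policy. The function name, the example binary path, and the choice of EPERM as the denial code are all assumptions made for illustration. The one fixed value is AF_ALG, which is address family 38 on Linux.

```python
import errno

AF_ALG = 38  # Linux address family number for the kernel crypto API sockets

# Hypothetical allow-list: the real list of permitted binaries is not public.
ALLOWED_BINARIES = frozenset({"/usr/local/bin/example-crypto-service"})


def socket_bind_decision(address_family, binary_path, allow_list=ALLOWED_BINARIES):
    """Model the verdict an LSM hook on bind might return.

    Returns 0 to allow the bind, or a negative errno value to deny it,
    mirroring the convention kernel hooks use.
    """
    if address_family != AF_ALG:
        return 0  # unrelated socket families are never touched
    if binary_path in allow_list:
        return 0  # known legitimate user of the crypto API
    return -errno.EPERM  # everything else is refused


if __name__ == "__main__":
    print(socket_bind_decision(AF_ALG, "/usr/bin/su"))
```

The point of the design is its narrowness: only binds on one socket family are affected, and only for callers outside the list, so the module can stay loaded.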
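The usage-mapping step amounts to summing per-binary counters across hosts and comparing the result with the intended allow-list. The sketch below assumes the observations have already been collected as (host, binary, count) tuples. That shape, the function names, and the sample values are hypothetical and do not represent the exporter's actual output format.

```python
from collections import Counter


def usage_by_binary(observations):
    """Sum AF_ALG usage counts per binary across all hosts.

    `observations` is an iterable of (host, binary, count) tuples.
    """
    totals = Counter()
    for _host, binary, count in observations:
        totals[binary] += count
    return totals


def unexpected_binaries(totals, allow_list):
    """Return, sorted, the binaries with non-zero usage that are not allowed."""
    return sorted(
        binary
        for binary, count in totals.items()
        if count > 0 and binary not in allow_list
    )


if __name__ == "__main__":
    sample = [
        ("host-a", "/usr/local/bin/example-crypto-service", 12),
        ("host-b", "/usr/local/bin/example-crypto-service", 7),
    ]
    totals = usage_by_binary(sample)
    print(unexpected_binaries(totals, {"/usr/local/bin/example-crypto-service"}))
```

An empty result is the signal that blocking can be switched on without breaking a legitimate caller.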
The timeline was tight but deliberate. By the evening of April 30, the bpf-lsm mitigation was live fleet-wide. A previously-vulnerable test node confirmed the exploit no longer worked. By May 4, reboot automation resumed at normal pace with a fully patched kernel. Remaining machines updated through standard reboot cycles.
Cloudflare’s post-incident review flagged three improvement areas. First, better visibility into which production services depend on specific kernel APIs. Second, faster bpf-lsm deployment pipelines with richer logging. Third, a proactive audit of unused kernel modules to reduce attack surface permanently.
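A first pass at the third item, the module audit, could start from /proc/modules, where each line lists a module's name, size, reference count, and dependents. The sketch below parses that text and lists modules that nothing currently references. The function names are hypothetical, and a zero reference count is only a snapshot: a module used briefly by short-lived sockets can show zero between uses, so the output is a list of candidates to investigate, not a removal list.

```python
def parse_proc_modules(text):
    """Parse the contents of /proc/modules into a list of dicts."""
    modules = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        used_by = fields[3]
        # The fourth field is "-" or a comma-terminated list of dependents.
        dependents = [] if used_by == "-" else [d for d in used_by.split(",") if d]
        modules.append(
            {
                "name": fields[0],
                "size": int(fields[1]),
                "refcount": int(fields[2]),
                "used_by": dependents,
            }
        )
    return modules


def idle_candidates(modules):
    """Return, sorted, the names of modules with no references or dependents."""
    return sorted(
        m["name"] for m in modules if m["refcount"] == 0 and not m["used_by"]
    )


if __name__ == "__main__":
    with open("/proc/modules") as handle:
        print(idle_candidates(parse_proc_modules(handle.read())))
```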
The response to the “Copy Fail” privilege-escalation bug demonstrated that responsible disclosure, in-kernel visibility tooling, and runtime mitigation primitives like eBPF pay off when it matters most.






