The bug was likely discovered by studying the Linux kernel source code. Exploitation requires creating a massive, deep directory structure (roughly 1 million nested directories, with a total path length of over 1 gigabyte) to trigger an out-of-bounds write. That write can then be transformed into an arbitrary read and write of kernel memory, allowing full privilege escalation.
Qualys claims it's an extremely reliable attack that can be performed in about three minutes. The biggest hurdle is that it requires about 5GB of memory and 1 million inodes.
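For a sense of where the odd-looking offset in step 3 of the walkthrough below ("-2GB-10B") could come from, here is a small arithmetic sketch. It rests on two assumptions that the summary itself does not spell out: that a 2GB size is reinterpreted as a signed 32-bit integer, and that the "//deleted" string is written together with its terminating NUL byte. The helper name `as_signed_32` is made up for illustration.

```python
import ctypes

GIB = 1 << 30  # assumption: "GB" in the write-up means 2**30 bytes


def as_signed_32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    return ctypes.c_int32(value & 0xFFFFFFFF).value


# Assumption: the string is stored with a trailing NUL,
# so 9 visible characters occupy 10 bytes.
marker = b"//deleted\0"

# Assumption: a 2GB size is handled as a signed 32-bit int and wraps negative.
size_as_int = as_signed_32(2 * GIB)

# Writing the marker just "before the end" of a buffer whose size has wrapped
# lands this many bytes relative to the start of the buffer.
offset = size_as_int - len(marker)

print(f"2GB as signed 32-bit int: {size_as_int}")
print(f"marker length in bytes:   {len(marker)}")
print(f"resulting offset:         {offset}")
```

Under those assumptions the result is 2GB plus 10 bytes below the start of the buffer, which matches the figure quoted in step 3. This only illustrates the arithmetic and is not a reproduction of the kernel code path.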
Ars Technica reports that most Linux distros are rolling out patches as we speak.
How the exploit works:
1/ We mkdir() a deep directory structure (roughly 1M nested directories) whose total path length exceeds 1GB, we bind-mount it in an unprivileged user namespace, and rmdir() it.
2/ We create a thread that vmalloc()ates a small eBPF program (via BPF_PROG_LOAD), and we block this thread (via userfaultfd or FUSE) after our eBPF program has been validated by the kernel eBPF verifier but before it is JIT-compiled by the kernel.
3/ We open() /proc/self/mountinfo in our unprivileged user namespace and start read()ing the long path of our bind-mounted directory, thereby writing the string "//deleted" to an offset of exactly -2GB-10B below the beginning of a vmalloc()ated buffer.
4/ We arrange for this "//deleted" string to overwrite an instruction of our validated eBPF program (and therefore nullify the security checks of the kernel eBPF verifier) and transform this uncontrolled out-of-bounds write into an information disclosure and into a limited but controlled out-of-bounds write.
5/ We transform this limited out-of-bounds write into an arbitrary read and write of kernel memory by reusing Manfred Paul's beautiful btf and map_push_elem techniques from: