Yet another strange job scheduler bug
Summary
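A deep dive into a job scheduler bug on Linux in which coalesced SIGCHLD signals delay the reaping of child processes. The article explains the observed behavior, the root cause (signal coalescing: SIGCHLD is a standard signal, so several children exiting while one SIGCHLD is already pending produce only a single delivery), and a proposed fix: call wait4 with WNOHANG in a loop until no exited children remain, so that one delivery reaps every child that has exited.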
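The proposed fix can be sketched roughly as follows. This is a minimal illustration in Python, not the scheduler's actual code: the function name `reap_all_children` is hypothetical, and the sketch assumes a Unix system where `os.wait4` (Python's wrapper around the `wait4` system call) is available.

```python
import os


def reap_all_children():
    """Reap every child that has already exited, without blocking.

    Hypothetical helper for illustration. Returns a list of
    (pid, raw_wait_status) tuples, one per reaped child.
    """
    reaped = []
    while True:
        try:
            # pid == -1 means "any child"; WNOHANG makes the call
            # return immediately instead of blocking.
            pid, status, _rusage = os.wait4(-1, os.WNOHANG)
        except ChildProcessError:
            # ECHILD: this process has no children left at all.
            break
        if pid == 0:
            # Children exist, but none of them has exited yet.
            break
        reaped.append((pid, status))
    return reaped
```

A SIGCHLD handler registered with `signal.signal(signal.SIGCHLD, ...)` could call this function on each delivery. Because the loop keeps going until `wait4` reports nothing left to reap, it does not matter how many exits were folded into that one signal.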