Have you ever wanted to send a signal that normally produces a core file, only to find that the process has one of those annoying signal handlers set up to catch it? The nerve of that application, trying to handle signals intelligently! I actually have a real need to remove a process’s signal handler, which I’ll describe shortly. Normally it is a bad idea to remove another process’s signal handler, and under normal circumstances I do not suggest following the procedure I am about to describe.
I have been struggling with a production issue at work involving a process that has been less than cooperative. You see, I have a Java process which goes crazy and starts consuming CPU cycles. When you run strace against the process, the only system call you will see is sched_yield(). The Java thread is most likely stuck on a spinlock in user space, and the process or thread which owns the lock has died or something else has gone wrong. All my runaway process cares about is checking for its lock and yielding execution back to the kernel so it can schedule another task. Of course, it just gets the CPU again and continues to pound it.
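To make that failure mode concrete, here is a minimal sketch of a yielding spinlock in C. It is only an illustration of the pattern, not the JVM's actual locking code; the function name spin_lock_yield and the max_yields cap are made up for this example (a real spinlock would loop forever, which is exactly the problem).

```c
#include <sched.h>
#include <stdatomic.h>

/*
 * Try to take a user-space spinlock, giving up the CPU between attempts.
 * Returns the number of sched_yield() calls made before the lock was
 * acquired, or -1 if max_yields attempts passed without getting it.
 * (max_yields is a hypothetical cap added so the sketch terminates.)
 */
long spin_lock_yield(atomic_flag *lock, long max_yields)
{
    long yields = 0;

    /* test-and-set returns true while somebody else holds the lock */
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire)) {
        if (yields == max_yields)
            return -1;
        sched_yield();   /* the only system call strace would show */
        yields++;
    }
    return yields;
}
```

If the owner of the lock never releases it, every pass through the loop is one more sched_yield() and nothing else, which matches what strace reports for the stuck process.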
My company pays a lot of money for support, and we have had a case open for eight months now. The problem is that we are unable to gather sufficient data for their level 2 and level 3 support teams. They would like a javacore to be generated, which can be done by sending signal 3 (SIGQUIT) to the Java process. In addition to a javacore, they recommend sending signal 11 (SIGSEGV) to the process to prompt the generation of a normal binary core file. Either one would be invaluable for the support team in ascertaining what is going wrong. Unfortunately, it seems that once the process is stuck in this tight sched_yield() loop, any signals we send to it are ignored. In short, that is my problem.
During my Linux Kernel Internals training with Red Hat, I had the idea of writing a kernel module to strip the signal handler from the Java process so I can finally generate that elusive core file. The kernel module sets up an entry under /proc named stripsignal_pid. If you read the entry, it gives you a quick one-liner about using the interface. To use the module, you write a process ID into that file, and that process’s SIGABRT signal handler will be reset to SIG_DFL. At that point, if you send SIGABRT to the process, it will write out its core file.
Download the source along with a helper test program here: stripsignal.tar.gz.
But if all you are interested in is reviewing the short source code, then feel free to browse the stripsignal.c source online.
I tell you what, the best thing I got out of the class was familiarizing myself with the source code and learning some new Emacs tricks for navigating large source code projects. My next writeup will be about my experiences with the GNU GLOBAL tagging system.