Last week I was dealing with a bug that caused a kernel panic at boot time. It showed up only on the amd64 platform, though. The problem can be quickly summed up as a “null pointer dereference”. It happened when a sleeping thread woke up. At that point, the linked list of all CPUs was traversed, and for every CPU I looked at the thread that its pc_curthread pointer pointed to. The pc_curthread pointer is part of struct pcpu, which holds per-CPU information, such as a pointer to the thread currently running on that particular CPU. The problem was that it was sometimes a null pointer. So as a bug fix, I just added a test for the null pointer.
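To make the fix a bit more concrete, here is a minimal sketch of that kind of traversal with the null check in place. It is not the actual kernel code: the structures are simplified stand-ins, the pc_next link field and the find_cpu_running function are hypothetical names I made up for illustration, and only pc_curthread comes from the description above.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's thread structure (illustration only). */
struct thread {
    int td_tid;                   /* thread id */
};

/* Simplified stand-in for struct pcpu (illustration only). */
struct pcpu {
    struct thread *pc_curthread;  /* thread running on this CPU; may be NULL */
    struct pcpu   *pc_next;       /* hypothetical link to the next CPU */
    int            pc_cpuid;      /* id of this CPU */
};

/*
 * Walk the list of CPUs and return the id of the CPU whose current
 * thread has the given tid, or -1 if no CPU is running it.
 */
int
find_cpu_running(const struct pcpu *head, int tid)
{
    const struct pcpu *pc;

    for (pc = head; pc != NULL; pc = pc->pc_next) {
        const struct thread *td = pc->pc_curthread;

        /* The fix: skip CPUs that have no current thread yet. */
        if (td == NULL)
            continue;
        if (td->td_tid == tid)
            return pc->pc_cpuid;
    }
    return -1;
}
```

Without the td == NULL check, the loop would read td->td_tid through a null pointer for any CPU whose pc_curthread has not been set.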
The reason it took me so long to find the source of this bug is that in the FreeBSD kernel there must be some memory mapped at virtual address zero, so dereferencing a null pointer does not cause a panic right away. (It is probably also important to note that I was only trying to read from that memory. Maybe if I had tried to write something there, I would have been killed immediately.) The kernel panicked later, once the error had propagated further into other kernel interfaces.
What I am working on now
Now I am working on per-process CPU accounting information, so that the percentage column in the top command output shows a reasonable value. This is done differently in the 4BSD scheduler and in the ULE scheduler. Currently it looks like I will follow the way ULE does it.