Today I stumbled upon a strange problem with my Cyrus Imapd. Suddenly new connections just stalled and timed out. The daemon was still running and listening on the configured ports. I connected to the imapd with telnet (143/tcp) and got the correct banner, but connections made with Thunderbird and other clients timed out while Thunderbird was saying "sending authentication data".
My troubleshooting procedure was as follows:
Restart the daemon: Problem still exists.
Check the logfiles: Nothing interesting in the logs.
Telnet to the Imapd: Everything seems to work.
Check more logfiles and restart the service another time: Just in case I missed something before. After this step I knew that something was a bit more screwed up than I first thought.
Trace an Imapd process while connecting to it:
This step revealed the cause of the problem. Take a look at the end of the strace output:
open("/dev/random", O_RDONLY|O_LARGEFILE) = 11
read(11,
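A trace like this can be captured by attaching strace to the running process. The following is only a sketch: the function names, the log path and the PID argument are my own placeholders, and attaching to a process that belongs to another user normally requires root.

```shell
# Attach strace to a running process and write its system calls to a log file.
# Assumption: the caller passes the PID of the process to inspect.
trace_pid() {
    pid=$1
    log=${2:-/tmp/strace.$pid.log}
    # -f follows child processes, -o writes the trace to a file, -p attaches to a PID
    strace -f -o "$log" -p "$pid"
}

# Return success if a saved trace shows the process opening /dev/random.
# Matches both open("/dev/random", ...) and openat(AT_FDCWD, "/dev/random", ...).
opens_dev_random() {
    grep -q 'open[a-z]*(.*"/dev/random"' "$1"
}
```

With a trace saved by trace_pid, running opens_dev_random on the log file tells you quickly whether the process touches the blocking device at all. A read(...) call left unfinished at the very end of the log is the sign that the process is stuck there.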
To calculate hashes for the secure authentication (I use only CRAM-MD5 and DIGEST-MD5), the imapd opened /dev/random, which, in contrast to /dev/urandom, is a blocking device. That means that if you read from /dev/random while the entropy pool is empty, the process waits until new entropy becomes available, which on an idle server can take practically forever, and at this point the TCP connections timed out.
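The blocking behaviour can be checked by hand with a read that gives up after a few seconds. This is a minimal sketch, assuming the coreutils timeout and head commands are installed; the function name and the default values are placeholders of mine. Note that newer kernels may no longer block on /dev/random, so the result depends on the kernel version.

```shell
# Try to read a few bytes from a device and give up after a timeout.
# Assumption: probe_read, the 16 byte default and the 5 second default are illustrative.
probe_read() {
    device=$1
    nbytes=${2:-16}
    seconds=${3:-5}
    # timeout kills head if the read has not finished in time
    if timeout "$seconds" head -c "$nbytes" "$device" >/dev/null 2>&1; then
        echo "OK: read from $device completed"
    else
        echo "FAILED: read from $device failed or did not finish within ${seconds}s"
        return 1
    fi
}
```

On a server in the state described here, probe_read /dev/random would be expected to fail after the timeout, while probe_read /dev/urandom should return immediately.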
To verify that the entropy pool was drained, I checked /proc/sys/kernel/random/entropy_avail, which showed me that no entropy was available:
cat /proc/sys/kernel/random/entropy_avail
0
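To catch this earlier next time, the same check could be wrapped in a small helper, for example for a cron job or a monitoring script. This is only a sketch: the function name and the threshold of 200 bits are assumptions of mine, not values recommended by the kernel.

```shell
# Compare the kernel's entropy estimate against a minimum.
# Assumption: check_entropy and the default threshold of 200 are illustrative.
# Returns 0 if enough entropy is available, 1 if it is low, 2 if the file is unreadable.
check_entropy() {
    file=${1:-/proc/sys/kernel/random/entropy_avail}
    threshold=${2:-200}
    avail=$(cat "$file" 2>/dev/null) || return 2
    if [ "$avail" -lt "$threshold" ]; then
        echo "LOW: $avail bits available (threshold $threshold)"
        return 1
    fi
    echo "OK: $avail bits available"
}
```

Called without arguments it reads the real entropy_avail file; the optional first argument only exists so the function can be tried against a test file.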
Now I had to fix this problem, but how can you produce enough entropy on a colocated server without hardware random number generator support? The Linux kernel mainly gets its entropy from keyboard and mouse input, which doesn't happen on a rack-mounted server, so I checked the entropy hint from Robert Connolly, which was discussed on the hlfs mailing list some time ago, for other solutions. I needed a quick fix, because all connections to the Imapd had been timing out for about 40 minutes. I decided to give rng-tools a try.
I downloaded the tools and compiled the rngd daemon. This daemon is meant to use /dev/hwrandom to feed entropy from a hardware random number generator to the kernel (/dev/random), but I don't have compatible hardware, so I decided to use the daemon to feed some entropy from /dev/urandom (which is non-blocking) to /dev/random:
rngd -r /dev/urandom -o /dev/random
This may not be the best approach, but entropy_avail instantly showed non-zero values and connecting to the Imapd worked again. The problem with this solution is that the entropy from /dev/urandom may not be truly random (only pseudo-random), but I will dive deeper into this topic another time; for now it is important that the service is functional again.
I needed approximately one hour to troubleshoot and fix this problem. Most of that time, about 45 minutes, went into finding the root of the problem; implementing the fix (or maybe it's only a workaround) was rather quick.