FreeBSD On Splunk: Stuck!

I broke my Splunk server by accidentally deleting the Linux libc6 libraries, and it ended up being easier to install FreeBSD than to fix it. I wasn’t going to reinstall Linux…with Splunk available for FreeBSD, why would I do that? ;)

The install went fairly well once I had installed the compat6x port. Splunk says that the software works with FreeBSD 6.0 “or higher”. That’s a white lie: it doesn’t work natively on 7.0 (yet).

Anyway, I got it running, started configuring it, and all seemed sweet. I had it index a bit of data and whatnot, all good. When I started adding in the sources from my other servers, things went weird. OK, there was about 600M of logs, but I blacklisted well over half of that. I figured I might go over the 500M limit of the free licence while importing everything, but oh well. The server then kept churning at 100% CPU usage for about 5 hours. That took me past midnight, which allowed me to see how much data had been indexed for the day. Apparently it indexed 8G, which is really weird when there isn’t 8G of log files to begin with.

I tried fine-tuning the blacklist to remove more files and limiting the timestamp information that was getting collected, but I couldn’t make Splunk finish indexing. I knew it was indexing because that’s the only thing other than searching that really ramps up the CPU usage in Splunk…it’s a fairly single-minded application. I watched the splunkd log file for a while and couldn’t see anything too wrong. I wound up editing splunk/etc/system/log.cfg and changing category.FileInputTracker=WARN to INFO, like I’d done before, to see what phantom files it was indexing. It turns out there were 2 files it was getting stuck on. One was the original Debian installer log, which is about 2300 lines or so. The other was the syslog.0 file, which was about 5000 lines. This is what it looks like in splunkd.log (the CRC on the splunkd.log file itself keeps changing because of all the info being pumped out to it):

01-19-2009 06:26:17.906 INFO  FileInputTracker - Computing CRC for seekPtr=5d188000 filename=/mnt/spanky_log/syslog.0
01-19-2009 06:26:17.911 INFO  FileInputTracker - Computing CRC for seekPtr=939298 filename=/usr/local/splunk/splunk/var/log/splunk/splunkd.log
01-19-2009 06:26:17.982 INFO  FileInputTracker - Computing CRC for seekPtr=5d190000 filename=/mnt/spanky_log/syslog.0
01-19-2009 06:26:17.988 INFO  FileInputTracker - Computing CRC for seekPtr=93939d filename=/usr/local/splunk/splunk/var/log/splunk/splunkd.log
01-19-2009 06:26:18.060 INFO  FileInputTracker - Computing CRC for seekPtr=5d198000 filename=/mnt/spanky_log/syslog.0
01-19-2009 06:26:18.068 INFO  FileInputTracker - Computing CRC for seekPtr=9394a2 filename=/usr/local/splunk/splunk/var/log/splunk/splunkd.log
01-19-2009 06:26:18.139 INFO  FileInputTracker - Computing CRC for seekPtr=5d1a0000 filename=/mnt/spanky_log/syslog.0
01-19-2009 06:26:18.145 INFO  FileInputTracker - Computing CRC for seekPtr=9395a7 filename=/usr/local/splunk/splunk/var/log/splunk/splunkd.log
01-19-2009 06:26:18.221 INFO  FileInputTracker - Computing CRC for seekPtr=5d1a8000 filename=/mnt/spanky_log/syslog.0
01-19-2009 06:26:18.226 INFO  FileInputTracker - Computing CRC for seekPtr=9396ac filename=/usr/local/splunk/splunk/var/log/splunk/splunkd.log
01-19-2009 06:26:18.295 INFO  FileInputTracker - Computing CRC for seekPtr=5d1b0000 filename=/mnt/spanky_log/syslog.0
01-19-2009 06:26:18.300 INFO  FileInputTracker - Computing CRC for seekPtr=9397b1 filename=/usr/local/splunk/splunk/var/log/splunk/splunkd.log
01-19-2009 06:26:18.371 INFO  FileInputTracker - Computing CRC for seekPtr=5d1b8000 filename=/mnt/spanky_log/syslog.0
01-19-2009 06:26:18.376 INFO  FileInputTracker - Computing CRC for seekPtr=9398b6 filename=/usr/local/splunk/splunk/var/log/splunk/splunkd.log
01-19-2009 06:26:18.446 INFO  FileInputTracker - Computing CRC for seekPtr=5d1c0000 filename=/mnt/spanky_log/syslog.0
01-19-2009 06:26:18.451 INFO  FileInputTracker - Computing CRC for seekPtr=9399bb filename=/usr/local/splunk/splunk/var/log/splunk/splunkd.log
01-19-2009 06:26:18.523 INFO  FileInputTracker - Computing CRC for seekPtr=5d1c8000 filename=/mnt/spanky_log/syslog.0
01-19-2009 06:26:18.529 INFO  FileInputTracker - Computing CRC for seekPtr=939ac0 filename=/usr/local/splunk/splunk/var/log/splunk/splunkd.log
01-19-2009 06:26:18.599 INFO  FileInputTracker - Computing CRC for seekPtr=5d1d0000 filename=/mnt/spanky_log/syslog.0
01-19-2009 06:26:18.604 INFO  FileInputTracker - Computing CRC for seekPtr=939bc5 filename=/usr/local/splunk/splunk/var/log/splunk/splunkd.log
01-19-2009 06:26:18.686 INFO  FileInputTracker - Computing CRC for seekPtr=5d1d8000 filename=/mnt/spanky_log/syslog.0
01-19-2009 06:26:18.696 INFO  FileInputTracker - Computing CRC for seekPtr=939cca filename=/usr/local/splunk/splunk/var/log/splunk/splunkd.log
01-19-2009 06:26:18.784 INFO  FileInputTracker - Computing CRC for seekPtr=5d1e0000 filename=/mnt/spanky_log/syslog.0
01-19-2009 06:26:18.790 INFO  FileInputTracker - Computing CRC for seekPtr=939dcf filename=/usr/local/splunk/splunk/var/log/splunk/splunkd.log

And it did that pretty much ad infinitum. The seekPtr DID go up, but it never completed the file. How do I know it wasn’t just taking its time? Because it did a whole bunch of files larger than 5000 lines each in about 4 seconds. An hour for the syslog file didn’t make sense. I deleted the syslog.0 file, so I don’t know what the deal was there. I did keep the Debian installer log file that did the same thing, but I’m not in touch with any of the developers, so it’ll probably just sit on my HDD. At least on this blog Google will pick it up for other people to find! By the way, don’t forget to change log.cfg and set category.FileInputTracker back to WARN, or else you will fill up your splunkd log file with self-replicating entries!
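If you would rather not eyeball thousands of those lines, here is a rough Python sketch for summarising them. It is my own helper, not anything that ships with Splunk, and it assumes the log lines look exactly like the excerpt above and that seekPtr is a hexadecimal byte offset (the digits certainly look like hex, but that reading is an assumption). If that reading is right, 5d188000 works out to roughly 1.5 GB, which is far more than a 5000-line syslog should hold. The function names and the sizes dictionary are made up for illustration.

```python
import re

# Pattern for the FileInputTracker lines shown in the excerpt above.
# Assumption: seekPtr is a hexadecimal byte offset into the file.
LINE_RE = re.compile(
    r"FileInputTracker - Computing CRC for seekPtr=([0-9a-fA-F]+) filename=(\S+)"
)


def summarize_seek_ptrs(lines):
    """Return {filename: (crc_count, lowest_offset, highest_offset)}."""
    stats = {}
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # not a FileInputTracker CRC line
        offset = int(match.group(1), 16)
        name = match.group(2)
        count, low, high = stats.get(name, (0, offset, offset))
        stats[name] = (count + 1, min(low, offset), max(high, offset))
    return stats


def suspicious_files(stats, sizes):
    """Names whose highest seekPtr exceeds the known on-disk size.

    sizes is a hypothetical {filename: size_in_bytes} mapping that you
    build yourself, for example with os.path.getsize().
    """
    return sorted(
        name
        for name, (_, _, high) in stats.items()
        if name in sizes and high > sizes[name]
    )
```

Feed summarize_seek_ptrs() the lines of splunkd.log (an open file object works), then pass the result to suspicious_files() along with the real file sizes. Anything it returns is a candidate for the blacklist.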

Oh yeah, another note: in FreeBSD 7.0, splunkd will not show up with the correct CPU usage. It shows 1 to 9% on mine, but the system says there is 99% in user. If you run:

top -IS

You will see only the processes using CPU. I know splunkd isn’t displayed properly because it’s the only process displayed as running when there is 99% user!

EDIT 26th-Jan: After running Splunk over the weekend and watching the issues, I’ve discovered that it DOES come out of the loop eventually; it just re-reads the file many times. Case in point: I had a test Icecast server turned on, but it wasn’t doing anything, so the /var/log/icecast/stats.log file was 606 bytes…Splunk had indexed 202 MEGABYTES of it (found by looking at the index stats dashboard plugin). I’m also exceeding my quota many times over because of these problems that crop up…the easiest way I can see to avoid them is still to keep an eye on the index and delete the offending files as you spot them.
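To put a number on that, here is a quick back-of-the-envelope calculation in Python. It assumes the dashboard’s “MB” means 10^6 bytes and that every pass indexed the whole 606-byte file; neither is confirmed, so treat the result as a rough order of magnitude only.

```python
def estimated_passes(indexed_bytes, file_bytes):
    """Rough count of full re-reads needed to account for the indexed volume."""
    return indexed_bytes // file_bytes


file_bytes = 606                # size of stats.log on disk
indexed_bytes = 202 * 10**6     # 202 MB, assuming decimal megabytes

print(estimated_passes(indexed_bytes, file_bytes))  # 333333
```

So on those assumptions Splunk read the same tiny file somewhere around a third of a million times.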
