I’ve been working on a logging standards document for our team. A lot of desktop-software developers are contributing software to the Web now, and it is making me take a step back and re-explain some things I consider basics. I did some Googling for inspiration and, I have to say, there aren’t many coherent bodies of information on what makes logging “good,” especially from an operations point of view. So I’m going to share some chunks of my thoughts here, and I would love to hear feedback.
You get a lot of opinions around logging, including some very negative ones held by developers: “Never log! Just attach a debugger! It has a performance hit! It will fill up disks!” But to an operations person, logs are the lifeblood of figuring out what is going on with a complex system. So without further ado, for your review…
Logging is often an afterthought in code. But what you log and when and how you log it is critical to later support of the product. You will find that good logging not only helps operations and support staff resolve issues quickly, but helps you root-cause problems when they are found in development (or when you are pulled in to figure out a production problem!). “Attach a debugger” is often not possible if it’s a customer site or production server, and even in an internal development environment as systems grow larger and more complex, logs can help diagnose intermittent problems and issues with external dependencies very effectively. Here are some logging best practices devised over years of supporting production applications.
Consider using a logging framework to help you implement these. Log4j is a full-featured and popular logging package that has been ported to .NET (log4net) and about a billion other languages, and it gives you a lot of this functionality for free. If you use a framework, then logging correctly is quick and easy. You don’t have to use a framework, but if you try to implement a nontrivial set of the best practices below, you’ll probably be sorry you didn’t.
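To give a feel for how little code a framework asks of you, here is a minimal sketch using Python’s standard `logging` module as a stand-in for Log4j (the idea is the same, the API is not). The helper `build_logger` and the logger name `nifarm.demo` are made up for illustration.

```python
import io
import logging


def build_logger(name, stream, level=logging.INFO):
    """Return a logger that writes timestamped, levelled lines to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.handlers.clear()   # avoid duplicate handlers if called twice
    logger.propagate = False  # keep the demo output in one place
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(handler)
    return logger


# An in-memory buffer stands in for a real log destination.
buffer = io.StringIO()
log = build_logger("nifarm.demo", buffer)

log.debug("not written: below the configured level")
log.info("service started")
log.error("could not reach dependency %s", "db01")
```

Timestamps, severity levels and level filtering all come from the framework; the application code is just the three `log.*` calls at the bottom.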
The Log File
- Give the log a meaningful name, ideally containing the name of the product and/or component that’s logging to it and its intent. “nifarm_error.log” for example is obviously the error log for NIFarm. “my.log” is… Who knows.
- To keep log filenames compatible across Windows and UNIX, use all lower case and avoid spaces and other special characters.
- Logs should use a .log suffix to distinguish themselves from everything else on the system (not .txt, .xml, etc.). They can then be found easily and mapped to something appropriate for their often-large size. (Note that .log needs to be the final suffix, after anything else in the name such as the datetime stamp recommended below.)
- Logs targeted at a systems environment should never delete or overwrite themselves. They should always append and never lose information. Let operations worry about log file deletion and disk space – do tell them about the log files so they know to handle it though. All systems-centric software, from Apache on up, logs append-only by default.
- Logs targeted at a desktop environment should log by default, but use size-restricted logging so that the log size does not grow without bound.
- Logs should roll so that no single file grows without bound. A good best practice is to roll daily and add a .YYYYMMDD(.log) suffix to the log name so that a directory full of logs is easily navigable. The DailyRollingFileAppender in Log4j does this automatically.
- Logs should always have a configurable location. Applications that write into their own program directory are a security risk. Systems people prefer to make logs (and temp files and other stuff like that) write to a specific disk/disk location away from the installed product to the point where they could even set the program’s directory/disk to be read only.
- Put all your logs together. Don’t scatter them throughout an installation where they’re hard to find and manage (if you make their locations configurable per above, you get this for free, but the default locations shouldn’t be scattered).
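As a rough sketch of how the file rules above fit together, here is one way to wire them up with Python’s standard `logging.handlers.TimedRotatingFileHandler` (not Log4j’s appender, just the nearest stdlib equivalent). The helper names, the `APP_LOG_DIR` environment variable and the temp-directory fallback are all assumptions made for illustration.

```python
import logging
import os
import re
import tempfile
from logging.handlers import TimedRotatingFileHandler


def log_filename(product, intent):
    """Build a lower-case, space-free name such as 'nifarm_error.log'."""
    raw = f"{product}_{intent}".lower()
    safe = re.sub(r"[^a-z0-9_]+", "_", raw).strip("_")
    return f"{safe}.log"


def rolled_name(default_name):
    """Turn 'x.log.20240131' (the handler's default) into 'x.20240131.log'."""
    base, date = default_name.rsplit(".", 1)
    stem, ext = os.path.splitext(base)
    return f"{stem}.{date}{ext}"


def open_log(product, intent, log_dir=None):
    """Return a logger that appends to a daily-rolled file in one directory."""
    # Configurable location: explicit argument first, then the hypothetical
    # APP_LOG_DIR environment variable, then a fallback under the temp dir.
    log_dir = (
        log_dir
        or os.environ.get("APP_LOG_DIR")
        or os.path.join(tempfile.gettempdir(), "logs")
    )
    os.makedirs(log_dir, exist_ok=True)
    path = os.path.join(log_dir, log_filename(product, intent))

    # The handler opens the file in append mode by default; backupCount=0
    # means rolled files are never deleted, so cleanup is left to operations.
    handler = TimedRotatingFileHandler(
        path, when="midnight", backupCount=0, encoding="utf-8"
    )
    handler.suffix = "%Y%m%d"    # YYYYMMDD date stamp on rolled files
    handler.namer = rolled_name  # keep .log as the final suffix
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

    logger = logging.getLogger(f"{product}.{intent}".lower())
    logger.setLevel(logging.INFO)
    if not logger.handlers:      # do not stack handlers on repeat calls
        logger.addHandler(handler)
    return logger
```

Calling `open_log("NIFarm", "error", log_dir="/some/log/dir")` would write to `nifarm_error.log` in that directory and roll it at midnight. For the desktop case, `logging.handlers.RotatingFileHandler` with `maxBytes` and a nonzero `backupCount` is the size-capped counterpart.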