I know that many people are inclined to ignore warnings in log files. But I am responsible for an environment where we run applications that are more or less business-critical. So I need to take warnings seriously, especially if their wording is not really clear. The consequence is that any unnecessary warning causes operational problems. Because who can tell me, “written in blood” so to speak, that I can absolutely always and forever ignore this warning? Only in that case could I consider adding an exception to the log monitoring system and, of course, to the system documentation, the operations manual, etc. So for me, and from my consulting past I know that many customers think the same, this is not just a small nuisance but a real issue.
On the other hand, I have had many discussions with people who told me that I could just ignore this or that entry. In many cases it turned out, after some discussion, that the log level had actually been chosen badly and INFO would have been more appropriate. In that respect the semantics of the commonly used log levels deserve a closer look. Here are two good links (link 1, link 2) for definitions. When I first read them, my initial thought was that I might have overreached with the first paragraph of this post. But looking a bit closer at the example from link 1 about WARN, I think my concerns are still valid.
So what can be done? The reality is that you rarely have the ability to get a log statement changed. So you do need a scalable approach to deal with log messages that you consciously choose to ignore. It involves primarily two things: Firstly, you need to document why the decision was made that a given log message is not critical. Secondly, there should be an automated link to your log file monitoring system that configures an exception in it. Depending on your business, this whole area might also be regulated, so the legal side may very well play a role as well. But that is outside the scope of this post.
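To make the two points a bit more concrete, here is a minimal sketch in Python of how the documentation and the monitoring exception could come from a single source. Everything in it is an assumption for illustration: the IgnoreDecision record, its fields, the sample log message and the review date are all made up, and a real monitoring system would have its own format for exceptions.

```python
import re
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class IgnoreDecision:
    """One documented decision to ignore a log message (hypothetical fields)."""
    pattern: str       # regular expression matching the log message
    reason: str        # why the message is considered non-critical
    approved_by: str   # who signed off on the decision
    review_by: date    # when the decision has to be re-examined


# Hypothetical example entry; the message text is invented.
DECISIONS = [
    IgnoreDecision(
        pattern=r"WARN .*Connection pool nearly exhausted",
        reason="Pool is sized for peak load; seen only during the nightly batch.",
        approved_by="ops-team",
        review_by=date(2030, 1, 1),
    ),
]


def active_patterns(decisions, today):
    """Compile only those exceptions whose review date has not passed."""
    return [re.compile(d.pattern) for d in decisions if d.review_by >= today]


def needs_attention(line, patterns):
    """Return True unless a documented exception covers the log line."""
    return not any(p.search(line) for p in patterns)
```

The same list can be rendered into the operations manual and fed to the monitoring configuration, so the two cannot drift apart. The review date is my own addition here; it is one way to avoid an exception living on "forever" without anybody looking at it again.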
I know this post is not really actionable, but still wanted to share my thoughts.
For a while now I have been contemplating various aspects of logging and related areas. Some of them have found their way into this post. I look purely at application logging, leaving out the underlying layers. In particular, those comprise the operating system and other software that from the application’s perspective can be considered infrastructure (e.g. databases, middleware).
There is a bunch of different groups that are affected by the logging aspects of an application. They all have specific requirements and some of those aspects will be looked at now.
- Developer: Yes, the developer also has some demands on logging. For me two things are particularly relevant: While writing code I don’t want to be distracted from the actual logic. And later I don’t want to wade through lots of boilerplate code.
- Application administrator: This person knows the internals of the application quite well and helps users when they run into problems with the application; often he or she also serves as a liaison with the system administrator. They must be able to quickly find out whether a quirk comes from a real problem within the application, is the result of some external problem, or perhaps stems from “wrong” usage.
- Operations: These guys have to ensure that the entire IT landscape is running smoothly. All too often stuff is thrown at them that has been designed with very little thought given to daily usage. They must be given the possibility to quickly see whether everything is ok or things need to be escalated to application support. In particular, this requires integration into system management tools. Those usually work with JMX, SNMP and log file monitoring.
There is certainly a lot more to say here, but this should give a first overview and make clear that a variety of requirements needs to be fulfilled.
Logging vs. Monitoring vs. Management
One way to look at the above line is that it describes a hierarchy of aspects that build upon each other: A set of log entries allows me to monitor my application; and monitoring is the basis for management. So logging is indeed a very important step towards a smooth, efficient and compliant operation of my organization. The more you move towards the higher-level facets, the more important it is to abstract from the single, raw “event” and see the bigger picture. What also becomes important is the correlation of events. Perhaps my application becomes less stable whenever database response times exceed a certain threshold. And Cloud Computing will certainly add something here, too.
Most developers think of logging as an unloved necessity. But why is that? In my post about Asynchronous Communication I made the point that poor tooling does not make a pattern bad, it is just poor tooling. Likewise, I think many people simply don’t have the proper tools for logging. Last year, while developing a small system management component, I conducted a small experiment on myself. Instead of hard-coding log messages I went the extra mile and wrote my own message catalog. The result is that my (implicit) workflow has changed. Whenever an additional log statement is needed, I now only have to do the following:
- Check the message catalog for a message I can re-use; if yes I’m done already (and perhaps have to wire in some parameters).
- If a new message is needed I need to decide on a new key. This should be done with some confidence that I don’t need to change it later (although this would already be much easier than plain text).
- Also a message and log level need to be chosen for the key. Those can both be done in a pretty quick-and-dirty approach, since changing them later in the catalog file is easy.
At first this may sound more complex than just putting in a plain text statement. But when putting everything directly into the code, all these things must be chosen with careful consideration because changing them later is much more effort. This distracts me a great deal, since coming up with a good log message and the appropriate log level is often far from trivial. While my initial rationale for the message catalog had mainly been automated log file watching, this ease of development proved to be the real “killer” for me.
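For illustration, here is a minimal sketch of the message catalog idea in Python. It is not the catalog I actually wrote; the keys, messages and the log_message helper are made up for this example, and a real catalog would more likely live in a separate file than in a dictionary in the code.

```python
import logging

# Hypothetical catalog: key -> (log level, message template).
# Wording and level can be changed here without touching the calling code.
CATALOG = {
    "DB_CONN_LOST": (logging.ERROR, "Lost connection to database {name}"),
    "CACHE_MISS_RATE": (logging.INFO, "Cache miss rate is {rate}%"),
}


def log_message(logger, key, **params):
    """Look up the key in the catalog and log it at the level defined there."""
    level, template = CATALOG[key]
    text = template.format(**params)
    # The key is written in front of the text so that log file monitoring
    # can match on the stable key instead of on the free text.
    logger.log(level, "%s %s", key, text)
    return level, text
```

In the code itself only something like log_message(logger, "DB_CONN_LOST", name="orders") remains. The decision about wording and level is deferred to the catalog, which is exactly what keeps me in the flow while writing the actual logic.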