The risks associated with mismanaged logs are measured in financial costs, penalties for non-compliance, lost opportunities, and missed indicators of compromise that would otherwise have been detected. This article discusses current trends in the log life-cycle across today's complex computing environments.
The log generators (the source systems) we’d consider part of a logging program include on-premises servers and resources, and cloud servers and resources. More specifically, they include workflows, processes, and observations (both technical and non-technical).
It used to be the case that systems would log only to themselves. This may still be the case for small organisations and home environments, organisations with low-maturity logging programs, and even where the system/data owner simply doesn’t know how or where to send logs. The latter is especially common with cloud solutions.
More mature organisations (in terms of their logging programs) will likely default to the target state of having each capable system direct its logs to a central log repository solution. This could include using native log collection and transport solutions such as Windows Event Forwarding and Syslog. The central location may also be more than just a repository: a SIEM where the logs are processed and assessed for indicators of compromise (or any other negative scenario).
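As a minimal sketch of what directing logs to a central repository can look like, the snippet below forwards an application’s logs to a Syslog collector using Python’s standard library. The collector hostname and application name are assumptions for illustration, not a prescribed setup.

```python
import logging
import logging.handlers

# Hypothetical central collector; replace with your repository's address.
COLLECTOR_HOST = "logs.example.internal"
COLLECTOR_PORT = 514  # standard Syslog UDP port

# Send this application's logs to the central collector rather than
# (or in addition to) keeping them only on the local system.
handler = logging.handlers.SysLogHandler(address=(COLLECTOR_HOST, COLLECTOR_PORT))
handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))

logger = logging.getLogger("billing-service")  # hypothetical application name
logger.setLevel(logging.INFO)
logger.addHandler(handler)

logger.info("Nightly invoice run completed")
```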
The most mature organisations (again, in terms of their logging programs) will go to far greater lengths to ensure the log life-cycle contributes as positively as possible to the organisation’s success. The remainder of this article focuses on what that could look like.
Consider these log life-cycle commandments for a mature logging program:
- All systems are assessed for their inclusion in the logging program. That is, the system/data owners are consulted on how their systems and information should or could be included. A cost-benefit analysis could be used to determine the viability of inclusion, a risk assessment could be used to determine the risk and cost of non-inclusion, and an impact assessment could be used to determine the extent of inclusion.
- Included systems are configured to send their logs to a data lake or a log proxy/filter solution. A data lake is generally used as a target for unstructured log data (in its original form), made available for a period during which other systems can retrieve logs for processing. A proxy or filter is a system that receives logs in their original form and optionally filters them before sending them on to their final destination, or otherwise to the next phase of their life-cycle. Regarding filtering, the logs could have fields removed, altered, added, or enhanced/enriched before being forwarded on. It is also common for logs to be forwarded to multiple locations (i.e. a single log could be sent to multiple next-stage solutions). For example, the log proxy/filter might receive a server log that is then forwarded to a) Azure Log Analytics, and b) Elasticsearch (see the proxy/filter sketch after this list). It is also common for logs to be tagged at this point. Tagging helps identify the owner of the logs, the system/environment they originated in, and their significance. The tagging could happen in the first phase, but most often happens in this phase because the original system may not have tagging capabilities.
- Extending point 2 above, the filtering capability is important. This phase of the log life-cycle assists with transforming logs into a format suitable for downstream processing by the SIEM. For example, it might be necessary to convert a log from Syslog format to JSON before the SIEM will ingest it (a format-conversion sketch also follows the list).
- A SIEM solution must be able to either a) retrieve logs from a data lake, or b) receive logs from the log proxy/filter. In both cases, the processing that happens on the SIEM is the same. The SIEM should accept the logs in the appropriate format, process them by comparing the logs to threat feeds and other indicators of compromise (considering both automated and manual rules), and finally alert the responsible role (a SIEM-matching sketch follows the list as well). The responsible role is whoever accepts the incident report (usually via a support ticket system) and then actions the issue to a point where the system/data owner is satisfied. It’s generally a good idea to keep the support ticket system external to the SIEM; this allows the SIEM to be replaced at any point without losing the history of incidents.
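To make the proxy/filter phase above concrete, here is a minimal sketch of a filter, enrich/tag, and fan-out step. The destination functions and tag values are hypothetical placeholders rather than any specific product’s API.

```python
import json
from datetime import datetime, timezone

def send_to_log_analytics(record: dict) -> None:
    # Placeholder for a real Azure Log Analytics ingestion call.
    print("-> Log Analytics:", json.dumps(record))

def send_to_elasticsearch(record: dict) -> None:
    # Placeholder for a real Elasticsearch index call.
    print("-> Elasticsearch:", json.dumps(record))

def filter_and_forward(record: dict) -> None:
    """Drop noise, enrich/tag, then fan out to every next-stage solution."""
    # Filter: drop debug-level noise before it reaches paid storage.
    if record.get("severity") == "debug":
        return

    # Enrich/tag: identify the owner, originating environment, and ingest time.
    record.update({
        "owner": "platform-team",        # hypothetical tag values
        "environment": "on-prem-dc1",
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    })

    # Fan out: the same log goes to multiple next-stage solutions.
    for destination in (send_to_log_analytics, send_to_elasticsearch):
        destination(record)

filter_and_forward({"severity": "warning", "message": "Disk usage at 91%"})
```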
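As a sketch of the format-conversion role described in point 3, the snippet below converts an RFC 3164-style Syslog line into a JSON document a SIEM could ingest. The regular expression is a simplified assumption; real Syslog variants need more robust parsing.

```python
import json
import re

# Simplified pattern for an RFC 3164-style line: "<PRI>Mmm dd hh:mm:ss host tag: message"
SYSLOG_PATTERN = re.compile(
    r"<(?P<pri>\d+)>(?P<timestamp>\w{3}\s+\d+\s[\d:]+)\s(?P<host>\S+)\s(?P<tag>[^:]+):\s(?P<message>.*)"
)

def syslog_to_json(line: str) -> str:
    """Convert a Syslog line into the JSON shape the SIEM expects."""
    match = SYSLOG_PATTERN.match(line)
    if match is None:
        raise ValueError(f"Unrecognised Syslog line: {line!r}")
    fields = match.groupdict()
    fields["severity"] = int(fields.pop("pri")) % 8  # severity is the low 3 bits of PRI
    return json.dumps(fields)

print(syslog_to_json("<34>Oct 11 22:14:15 web01 sshd: Failed password for root"))
```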
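Finally, as a rough sketch of the SIEM-side processing in the last point, the snippet below compares incoming logs against a set of threat-feed indicators and hands matches to an external ticketing function. The threat-feed values and the create_ticket helper are hypothetical stand-ins for real integrations.

```python
from typing import Iterable

# Hypothetical indicators of compromise pulled from a threat feed.
THREAT_FEED_IPS = {"203.0.113.50", "198.51.100.7"}

def create_ticket(summary: str, details: dict) -> None:
    # Placeholder for the external support ticket system integration;
    # keeping it outside the SIEM preserves incident history if the SIEM is replaced.
    print(f"TICKET: {summary} | {details}")

def assess(records: Iterable[dict]) -> None:
    """Compare each log to known indicators and alert the responsible role."""
    for record in records:
        if record.get("source_ip") in THREAT_FEED_IPS:
            create_ticket(
                summary=f"Traffic from known-bad IP {record['source_ip']}",
                details=record,
            )

assess([
    {"source_ip": "203.0.113.50", "message": "Inbound connection accepted"},
    {"source_ip": "192.0.2.10", "message": "Routine health check"},
])
```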
At this point you can see the differences between an immature logging program and a mature one. It can be expensive, time-consuming, and confusing to design and implement a well-considered logging program. But the benefits come when an incident or opportunity arises and the organisation is well positioned to respond.