I Heart Logs, But Still Have to Live With Them: A Log Integration Guide
Jay Kreps, CEO of Confluent, is what you might call a “log evangelist.” He wrote a book about the importance of logs in 2014 – “I Heart Logs.” The book explains how logs work in distributed systems and what kinds of use cases exist for them.
Kreps is correct – logs are a critical part of IT systems management. Log integration is another crucial component – read on to learn how log integration works as well as some real-life use cases for it.
How Does Log Integration Work?
Logs record what happened and when within a system. In the event of a crash, the log is the authoritative source from which all other persistent structures can be restored.
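To make that concrete, here is a minimal sketch (all names are illustrative, not from the book) of what "restoring from the log" means: each entry records what happened and when, and replaying the entries in order rebuilds the derived state after a crash.

```python
# A tiny append-only log of (timestamp, operation, key, value) entries.
# Entry contents are made up for illustration.
log = [
    ("2024-01-01T10:00:00Z", "set", "balance", 100),
    ("2024-01-01T10:05:00Z", "set", "balance", 80),
    ("2024-01-01T10:07:00Z", "set", "status", "active"),
]

def replay(entries):
    """Rebuild a key-value store by applying log entries in order."""
    state = {}
    for timestamp, op, key, value in entries:
        if op == "set":
            state[key] = value
    return state

print(replay(log))  # {'balance': 80, 'status': 'active'}
```

Because the log is the source of truth, any derived structure (a cache, an index, a table) can be thrown away and rebuilt by replaying.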
Nowadays, logs are everywhere – event data encompasses a wide variety of information, including financial information, RFID tags, and IoT information streams. People need to be able to make sense of this information, which is where log integration comes in.
Kreps notes in “I Heart Logs” that the log is the answer: by putting all of the organization’s information into a central log, you enable real-time subscription. The book’s author explains that a data source could be an application that logs events or a database table that logs modifications, while subscribers could be any kind of data system, such as a database or a cache.
You need to extract log information from its origins and then load it into a distributed streaming platform. Subscribers then consume information from the distributed streaming platform.
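The flow above can be sketched as follows. This is a hedged, in-memory illustration of the pattern only: in practice the central log would be a distributed streaming platform such as Apache Kafka, and the class and record names here are invented for the example.

```python
class CentralLog:
    """An append-only log that any number of subscribers can read from."""

    def __init__(self):
        self.entries = []

    def append(self, record):
        self.entries.append(record)

    def read_from(self, offset):
        return self.entries[offset:]


class Subscriber:
    """A consumer that tracks its own position in the log."""

    def __init__(self, log):
        self.log = log
        self.offset = 0

    def poll(self):
        new = self.log.read_from(self.offset)
        self.offset += len(new)
        return new


log = CentralLog()
db_subscriber = Subscriber(log)     # e.g. a database keeping itself in sync
cache_subscriber = Subscriber(log)  # e.g. a cache doing the same

# Sources extract events and load them into the central log.
log.append({"event": "user_signup", "user": "alice"})
log.append({"event": "row_update", "table": "orders", "id": 42})

print(db_subscriber.poll())     # both records
print(cache_subscriber.poll())  # the same records, consumed independently
```

The key design point is that each subscriber keeps its own offset, so consumers proceed at their own pace without coordinating with each other or with the sources.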
Real-Life Uses for Log Integration
Kreps also cites some real-life uses for log integration in “I Heart Logs.”
The first example he brings up is from the field of finance. Finance professionals rely upon real-time streaming of information to make decisions. Log integration gives them the data they need.
Another instance of log integration at work comes from the search engine giant Google. Several years ago, Google rebuilt its web crawling, processing, and indexing pipeline atop a stream processing system. Google undertook this process to enable better, faster search results for its users.
Kreps also brings up the example of stateful real-time processing. Stateful real-time processing allows for the enrichment of an event stream (as an example, we’ll use a stream of clicks enriched with information about the user who clicked). This kind of processing normally requires the processor to maintain some kind of state; logs make that practical by converting streams into tables co-located with our processing. The log also provides a mechanism for handling fault tolerance for these tables.
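A minimal sketch of that stream-to-table idea, with illustrative names and data (not from the book): a changelog of user-profile updates is materialized into a local table, which is then used to enrich the click stream. Replaying the same changelog is also what restores the table after a failure.

```python
# A changelog stream of user-profile updates; later entries win.
user_changelog = [
    {"user_id": 1, "name": "Alice", "country": "US"},
    {"user_id": 2, "name": "Bob", "country": "DE"},
    {"user_id": 1, "name": "Alice", "country": "CA"},  # Alice moved
]

def materialize(changelog):
    """Fold a changelog stream into a table keyed by user_id."""
    table = {}
    for record in changelog:
        table[record["user_id"]] = record
    return table

def enrich(clicks, table):
    """Join each click with the co-located user table."""
    for click in clicks:
        user = table.get(click["user_id"], {})
        yield {**click,
               "user_name": user.get("name"),
               "country": user.get("country")}

users = materialize(user_changelog)
clicks = [{"user_id": 1, "page": "/home"},
          {"user_id": 2, "page": "/pricing"}]
for event in enrich(clicks, users):
    print(event)
```

Because the table is just a fold over the changelog, the processor never needs a remote lookup per click, and fault tolerance reduces to replaying the log.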
“I Heart Logs” is a compact yet compelling treatise on the importance of logs. Logs enable log integration, which in turn underpins many important functions – most importantly, getting more value out of your information. To learn more about this year’s data trends, check out our eBook.