I’ve been having a lot of conversations with security professionals about big data security analytics. In some cases, I present to a large audience; in others, I’m on the phone with a single CISO.
While big data security analytics content varies from discussion to discussion, I consistently come across a lot of misunderstanding around the topic as a whole. This is understandable since “big data” is really a marketing term that the industry has all but co-opted. Worse yet, security vendors have glued the mystery of “big data,” the misconceptions of security analytics, and marketing hype together. No wonder security professionals remain confused!
Based upon my experience, I believe it’s time to add some clarity to big data security analytics. Here is my FAQ in pursuit of this objective:
- What is big data? Okay, this question is important to establish a baseline of understanding across the industry. Here’s the definition I use:
In information technology, big data is defined as a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The challenges include capture, processing, storage, search, sharing, analysis, and visualization.
There is the “4 Vs” corollary to this definition (i.e., volume, velocity, variety, veracity) but I think the definition above is sufficient for starters.
- What is big data security analytics?
Add the words “information security” (or “cybersecurity” if you like) before the term “data sets” in the definition above. Security and IT operations tools spit out an avalanche of data like logs, events, packets, flow data, asset data, configuration data, and an assortment of other things on a daily basis. Security professionals need to be able to access and analyze this data in real time in order to mitigate risk, detect incidents, and respond to breaches. These tasks have come to the point where they are “difficult to process using on-hand data management tools or traditional (security) data processing applications.”
- What is a security analytic?
A very good question--here’s my own attempt at a definition. First, security analysis is the examination of a multitude of phenomena for the purpose of detecting and/or responding to security incidents capable of impacting the confidentiality, integrity, or availability of IT assets.
I would then define a security analytic as: A deduction based upon the results of interactions of multiple simultaneous security phenomena.
Big data security analytics technologies allow us to capture more data and perform multi-variable security analytics. In the past we relied on simple security analytics to help us trigger a response. For example: “Trigger a security alarm when someone has 3 failed log-in attempts on a critical system.” This is effective but too simple, and it produces way too many false positives. With big data security analytics, we can generate security analytics that get much deeper: “Trigger a security alert when someone has 3 failed log-in attempts on a critical system when this activity is executed after hours from an employee device, the employee’s job responsibility is such that he or she should not be logging into this system, and the physical security system indicates that the employee is not in the building.” This is the kind of stuff that companies like Click Security, Lancope, and Solera Networks are working on.
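To make the contrast concrete, here is a minimal Python sketch of the two rules quoted above. It is only an illustration, not how any of the vendors mentioned implement their products. The `LoginContext` fields, the function names, and the 08:00 to 18:00 business-hours window are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import time

# Hypothetical business-hours window; a real deployment would pull this from policy.
BUSINESS_START = time(8, 0)
BUSINESS_END = time(18, 0)


@dataclass
class LoginContext:
    """Illustrative bundle of the signals named in the two example rules."""
    failed_attempts: int
    system_is_critical: bool
    attempt_time: time
    from_employee_device: bool
    role_permits_access: bool       # does the job responsibility justify this log-in?
    badge_shows_in_building: bool   # signal from the physical security system


def is_after_hours(t: time) -> bool:
    """True when the attempt falls outside the assumed business-hours window."""
    return not (BUSINESS_START <= t < BUSINESS_END)


def simple_alert(ctx: LoginContext) -> bool:
    """The old single-variable rule: 3 failed log-ins on a critical system."""
    return ctx.system_is_critical and ctx.failed_attempts >= 3


def multi_variable_alert(ctx: LoginContext) -> bool:
    """The deeper rule: the simple condition plus every contextual signal."""
    return (
        simple_alert(ctx)
        and is_after_hours(ctx.attempt_time)
        and ctx.from_employee_device
        and not ctx.role_permits_access
        and not ctx.badge_shows_in_building
    )


# A forgetful administrator at 10:00 trips the simple rule but not the deeper one.
benign = LoginContext(3, True, time(10, 0), True, True, True)
# Same failures at 23:30, wrong role, employee not in the building: both rules fire.
suspicious = LoginContext(3, True, time(23, 30), True, False, False)

print(simple_alert(benign), multi_variable_alert(benign))
print(simple_alert(suspicious), multi_variable_alert(suspicious))
```

The point of the sketch is that each extra variable comes from a different data source (identity, HR role data, device inventory, badge readers), which is why the deeper rule needs far more data than the simple one.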
- Do big data security analytics require Hadoop?
Easy answer – no. Hadoop technologies are certainly built into some big data security analytics solutions from vendors like IBM and RSA, but there is no requirement for Hadoop per se. Lots of vendors have developed their own data repositories (in lieu of Hadoop) that collect, store, and analyze security data.
In the future, it is likely that Hadoop and other big data technologies will find their way into big data security analytics solutions but there are plenty of leading big data security analytics solutions that don’t use or integrate with Hadoop at this time.
- Isn’t big data security analytics only good for analysis of massive amounts of historical data?
This is certainly one of the primary use cases but there are others as well. Many big data security analytics solutions are built using “stream processing” to accommodate the high I/O rate needed to process massive amounts of security data. In simple terms, stream processing distributes the processing load over a number of distributed nodes. Each node can provide local security analytics and the nodes combine to form a computing grid for more global security data analysis value.
Big data security analytics built using this type of stream processing and grid architecture are designed for instant event detection and forensics. ESG calls this model “real-time big data security analytics solutions.” ESG calls big data security analytics designed for historical use “asymmetric big data security analytics solutions.”
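As a rough illustration of the stream processing idea described above, the Python sketch below spreads events across several in-memory "nodes," lets each node run a local analytic, and then merges the nodes' counts for a global view. Everything here is an assumption for illustration: the `Node` class, the round-robin `dispatch` function, the event dictionaries, and the threshold of 3. Real products distribute work across separate machines and often partition by a key rather than round-robin.

```python
from collections import Counter


class Node:
    """One hypothetical processing node keeping its own failed-login counts."""

    def __init__(self):
        self.failed_logins = Counter()

    def process(self, event):
        # Local analytic: count failed log-ins per user seen by this node only.
        if event["type"] == "failed_login":
            self.failed_logins[event["user"]] += 1

    def local_alerts(self, threshold=3):
        return {u for u, n in self.failed_logins.items() if n >= threshold}


def dispatch(events, nodes):
    """Spread the processing load round-robin over the available nodes."""
    for i, event in enumerate(events):
        nodes[i % len(nodes)].process(event)


def global_alerts(nodes, threshold=3):
    """Combine every node's counts to get the grid-wide picture."""
    total = Counter()
    for node in nodes:
        total.update(node.failed_logins)
    return {u for u, n in total.items() if n >= threshold}


nodes = [Node(), Node()]
events = [{"type": "failed_login", "user": "alice"} for _ in range(4)]
dispatch(events, nodes)

# Each node saw only 2 of the 4 failures, so neither raises a local alert,
# but the combined view crosses the threshold.
print([node.local_alerts() for node in nodes])  # [set(), set()]
print(global_alerts(nodes))                     # {'alice'}
```

The example shows why the combined grid view matters: a pattern that is invisible to any single node can still surface once the nodes' results are merged.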
- Is SIEM considered big data security analytics?
This is a tough one because the true answer is “it depends.” In the past, SIEM solutions were built for perimeter security management and/or compliance. Throw terabytes of data and complex queries at these platforms and they lack the appropriate scale, processing power, and usability needed. However, some vendors (e.g., ArcSight, eIQ, IBM, and LogRhythm) have added the distributed stream processing capabilities described above. These newer SIEM systems have the scale and capacity for big data security analytics as well as traditional SIEM functions.
- Isn’t big data security analytics for big companies with lots of security skills and resources?
Yes, those are the types of organizations on the leading edge but I would argue that all medium to large organizations need this type of security intelligence. Big companies will likely buy products and solutions while smaller companies will reach out to service providers like Arbor Networks (PacketLoop), Dell/SecureWorks, or the new SAIC spin-out Leidos. The best products and services will bake in intelligent algorithms, intuitive visualization, and process automation.
- How do I get started with big data security analytics? My suggestion is to download open source tools like BigSnarf, PacketPig, or Sqrrl.
This isn’t an exhaustive list but I’ve hit the major areas. Hopefully, this will help security professionals move beyond the hype and start to understand how big data security analytics can deliver real value.