So, what’s a data guy doing at a security conference? Three things come to mind:
- Security is increasingly about using massive volumes of disparate data to model user or application access to sensitive info, then identify and investigate anomalous behavior.
- The concept of an enterprise data hub or data lake is particularly appealing to attackers (external or internal) as it concentrates valuable info in one place.
- Big data often starts as an experiment, and its security and governance models are still relatively immature, a problem compounded by rapid innovation and updates.
Most people I met at the show were talking about the first topic. The traditional security vendors are eager to paint themselves as “next gen” with big data analytics to find the subtle patterns that may indicate a problem. Frankly, they use the concept extremely loosely; one claimed that simply counting applications and devices into the hundreds was a big data approach. Still, the combination of machine learning and advanced analytics on many data sources to find the baseline, the context, and the worrisome exception is pretty solid, particularly when built on Hadoop or NoSQL databases to handle the load. The main variation on the theme was which layer of infrastructure they targeted, with network and applications being the most popular.
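To make the "baseline, then worrisome exception" idea concrete, here is a minimal sketch in Python. It is not any vendor's method: it assumes a hypothetical per-user history of daily access counts and flags a user whose count today sits far from their own baseline, using a simple z-score as a stand-in for the machine learning the vendors describe. The names `history`, `today`, and `threshold` are illustrative only.

```python
from statistics import mean, stdev

def find_anomalies(history, today, threshold=3.0):
    """Flag users whose access count today deviates sharply from their own baseline.

    history: dict of user -> list of past daily access counts (hypothetical data)
    today:   dict of user -> access count observed today
    """
    flagged = []
    for user, counts in history.items():
        if len(counts) < 2:
            continue  # not enough data to form a baseline
        baseline = mean(counts)
        spread = stdev(counts)
        observed = today.get(user, 0)
        if spread == 0:
            # A perfectly flat baseline: any change at all is an exception.
            if observed != baseline:
                flagged.append(user)
            continue
        z_score = (observed - baseline) / spread
        if abs(z_score) > threshold:
            flagged.append(user)
    return flagged

# Made-up example data, purely for illustration.
history = {"alice": [10, 12, 11, 9, 10], "bob": [3, 4, 2, 3, 4]}
today = {"alice": 11, "bob": 40}
print(find_anomalies(history, today))
```

A real deployment would add the context the vendors talk about (time of day, device, which data was touched) and would run over far more sources than a single count per user.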
A few were starting to think about the security of a big data repository: who should have access, how that access should be controlled, how the data could be masked or tokenized, and the like. This hits an important gap in the market, as the vendors rushing to bring out the fastest, most user-friendly big data and analytics tools haven’t necessarily thought about the enterprise implications and requirements. I expect to see this change as big data moves into widespread production and IT operations teams think beyond the data science analysts to evaluate inherent risks like data protection and security. By the way, saying it’s a test-bed or sandbox doesn’t mean the data is any less sensitive.
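For readers who haven't seen masking or tokenization up close, here is a rough sketch in Python using only the standard library. It is an assumption-laden illustration, not a product's approach: `mask` hides all but the last few characters, and `tokenize` uses a keyed hash (HMAC-SHA256) as a simple deterministic stand-in for tokenization. Commercial tokenization typically relies on a token vault or format-preserving encryption instead, and the key shown here is a placeholder.

```python
import hashlib
import hmac

def mask(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with asterisks."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]

def tokenize(value: str, key: bytes) -> str:
    """Return a deterministic, non-reversible token for `value`.

    The same input and key always give the same token, so analysts can
    still join and count on the column without seeing the raw value.
    """
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

# Placeholder key for illustration only; a real key belongs in a secrets manager.
demo_key = b"replace-with-a-managed-secret"
card = "4111111111111111"  # a well-known test card number, not real data

print(mask(card))
print(tokenize(card, demo_key))
```

The point for a sandbox is that analysts can work with the masked or tokenized columns while the raw values stay out of the experiment entirely.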
Last, the sheer pace of innovation, the number of new connections, and the rate of updates, both proprietary and open source, will make it even harder to ensure that big data environments are secure. With components ranging from storage to servers to databases to analytics to applications, and each of these pushing out new code monthly, someone needs to figure out the challenge of building and maintaining a secure technology stack.
More to come, but nice to see the market taking notice of the impacts of big data on security and security on big data.