ESG's Dave Gruber talks with D.J. Sampath of Armorblox about Natural Language Processing.
Dave: Hi, we're back talking email security with D.J. Sampath, CEO and Founder of Armorblox. Welcome, D.J.
D.J.: Thanks for having me.
Dave: I want to talk about natural language processing. At Armorblox, I know you've invested heavily in these techniques to solve some of these problems, but let me set this up. There are many out in the world today that are talking about machine learning. Few understand, when people talk machine learning, exactly how machine learning techniques are applied to solve problems.
Now, we're going to talk about natural language processing and deep learning. Can you help our audiences understand a little bit, first the technology, the language that's being used to describe the way these problems are being solved? And then let's talk specifically about what Armorblox is doing.
D.J.: I like to think about this as a Renaissance moment. A very similar thing happened to machine classification and image classification about 10 years ago. It's part of the reason why your Apple Face ID sort of works automatically right now. You know, it's the same level of maturity that's happened in understanding textual data.
Stanford calls it the General Language Understanding Evaluation score, where 100% indicates machines being able to completely understand textual information. Humans score 87%, and for the longest time the machines were in the low 30s and 40s, but about two years ago they started approaching the 80s. They are within striking distance of human capability right now. At Armorblox, we are bringing this technology, this signal, to the security stack.
We're helping security organizations understand textual information. By the way, 70% of all data inside enterprises is largely textual, whether it's emails, documents, spreadsheets, what have you. We're in the business of helping organizations understand that data and then protecting them.
Dave: That's great. So, when you understand that data, then I assume you can look for patterns of what's expected, expected language, expected content versus what might be abnormal.
D.J.: Precisely. And that's where the machine learning part of the technology comes into play. We understand what's normal, we construct those baselines and then we detect what's abnormal when it goes beyond what is expected, you know, by two orders of magnitude and we say, "Hey, that's unusual." We take that information, provide the context using statistical information and that's part of what we built as a natural language understanding platform.
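To make the baseline idea concrete, here is a minimal sketch in Python. It is an illustration only, not Armorblox's implementation: the feature (past invoice amounts from one sender), the function names, and the use of a median as the baseline are all assumptions. It flags a value that exceeds the baseline by two orders of magnitude, the threshold mentioned above.

```python
from statistics import median


def build_baseline(history):
    """Summarize past observations with their median (hypothetical choice of baseline)."""
    if not history:
        raise ValueError("need at least one past observation")
    return median(history)


def is_abnormal(value, baseline, orders_of_magnitude=2):
    """Return True when value is at least 10**orders_of_magnitude times the baseline."""
    return value >= baseline * 10 ** orders_of_magnitude


# Hypothetical history: invoice amounts previously seen from one sender.
past_invoice_amounts = [1200.0, 950.0, 1100.0, 1300.0]
baseline = build_baseline(past_invoice_amounts)

print(is_abnormal(1250.0, baseline))    # a typical amount
print(is_abnormal(180000.0, baseline))  # far beyond the baseline
```

A real system would presumably track many such features per sender and recipient; this sketch shows only the "construct a baseline, then compare" step for a single number.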
Dave: That's great. So, just to be perfectly clear, so natural language processing and machine learning are used together.
D.J.: Correct.
Dave: In this case, where the natural language processing piece of this equation is the actual language understanding.
D.J.: Right.
Dave: And then once we understand those patterns or we understand what the content actually is, we can then apply machine learning on top of that and then we can look for normal and abnormal and other patterns of behavior and then apply security controls? Is that the concept?
D.J.: You're spot on. And just to add one last piece to that: say that I now work for you, and you normally send me W2s, but you've never sent me a 1040, you know, a tax form. We now have the capacity and the capability to recognize and distinguish those two types of documents, to the extent where when a machine or an algorithm looks at a W2, it says, "Hey, that's associated with payroll."
Dave: That's very interesting.
D.J.: For a 1040, it says that's taxes. And so, being able to do that is truly a game-changer when it comes to security.
Dave: Wow, it sounds like it. So, it's not just the body content that we're talking about, it's also potentially what's enclosed in the attachment.
D.J.: Correct.
Dave: And then matching those up and looking for patterns accordingly there.
D.J.: That's right.
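The W2 versus 1040 example can be sketched in a few lines of Python. This is a toy under stated assumptions, not how Armorblox does it: the keyword lists stand in for a trained language model, and `classify`, `SenderHistory`, and the example address are hypothetical names introduced only for illustration. The sketch labels a document's category and then checks whether this sender has ever sent that category before.

```python
from collections import defaultdict

# Hypothetical keyword lists standing in for a real language-understanding model.
CATEGORY_KEYWORDS = {
    "payroll": ("w-2", "w2", "wage and tax statement"),
    "taxes": ("form 1040", "individual income tax return"),
}


def classify(text):
    """Return the first category whose keywords appear in the text, else 'unknown'."""
    lowered = text.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return "unknown"


class SenderHistory:
    """Remembers which document categories each sender has sent before."""

    def __init__(self):
        self._seen = defaultdict(set)

    def record(self, sender, text):
        """Add the category of this document to the sender's history."""
        self._seen[sender].add(classify(text))

    def check(self, sender, text):
        """Return (category, unusual); unusual means this sender never sent it before."""
        category = classify(text)
        return category, category not in self._seen[sender]


history = SenderHistory()
history.record("dave@example.com", "Attached is your W-2 Wage and Tax Statement.")

print(history.check("dave@example.com", "Your W2 for this year is attached."))
print(history.check("dave@example.com", "Please complete Form 1040 and send it back."))
```

The same check would apply to attachment text as well as the message body, since both are reduced to a category before being compared against the sender's history.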
Dave: That's terrific. So, attackers are smart people, right? So, they pick up on these things over time. What makes this, I won't call it attacker-proof, but different from other security controls that the attackers have been able to game us on?
D.J.: No, I think it's always going to be an escalation game, right? Every single time you come up with a security control that protects something, you know, you've always seen attackers sort of step up their game. And the challenge with some of these types of technologies is that you are now likely to see model inversion attacks. People are going to try to game the machine learning models.
We're likely to see escalation of those types of attacks and we'll just have to stay on top of our game to protect against those types as well.
Dave: Yeah. These things they all sound game-changing. Where does Armorblox fit in from a solution? Is this a layered control? Is this a replacement control?
D.J.: The security controls have to be layered. There is no one control that's going to solve for all pain points. So, Armorblox is a layered control. It's a defense in depth. You know, we believe in it and it's exactly, it's part of that philosophy and it provides a new control that does not exist, has not existed inside of enterprises.
Dave: Got it. So, as people think about applying your technology, they should think about this as a net add to their existing security controls.
D.J.: That's correct.
Dave: Wonderful. Wonderful. Well, D.J., this has been enormously informative and it feels like this could be potentially a breaking point in what we're seeing today in email security controls, especially when it comes to business email compromise and some of the really tough phishing attacks that we have. So, thank you so much for being part of the interview and we'll look forward to seeing more from Armorblox in helping us solve some of these really tough problems.
D.J.: Absolutely. Thank you so much for having me, Dave.
Dave: Great. Thanks.