ESG's Mike Leone and Scott Sinclair discuss the impact of AI on storage infrastructure.
Scott: Hello. I'm joined today by our resident artificial intelligence and machine learning guru, Mike Leone. Mike, thank you for joining us. One question I really want to dig into: AI and machine learning, these emerging workloads, seem to be the story everywhere. It feels like everyone has an AI-based startup these days. And because data is so central to these workloads, what are some of the storage-related trends that storage administrators need to be thinking about when supporting AI-based workloads?
Mike: Yeah, that's a great question. A lot of times you're so focused on the data that you forget the foundational piece of the infrastructure underneath it is the storage, right? So first and foremost, we're seeing a big focus on the object storage side. You need to provide some context around your data, especially once you start factoring in unstructured data. You need to apply metadata to it so you can classify it and search it. That's the first aspect. Then it comes down to capacity and scalability requirements. When you're training, you obviously need massive data sets to accurately train a model. Then you shift to deployment, right, A/B testing and moving into production, and that's where those models start generating their own data, and a lot of it. So you need to provide not just raw capacity but the ability to scale easily and efficiently to meet those demands.
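[Editor's note: Mike's point about attaching metadata to objects so they can be classified and searched can be sketched with a toy in-memory object store. The `ObjectStore` class below is purely illustrative, not a real product API; production object stores such as Amazon S3 attach user metadata to objects in a similar key/value fashion.]

```python
# Minimal sketch of metadata-driven classification and search in an
# object store. ObjectStore is a hypothetical stand-in for a real
# object storage service that supports user-defined metadata.

class ObjectStore:
    def __init__(self):
        self._objects = {}  # key -> (data, metadata)

    def put(self, key, data, metadata=None):
        """Store an object along with descriptive metadata."""
        self._objects[key] = (data, dict(metadata or {}))

    def search(self, **criteria):
        """Return keys whose metadata matches every given criterion."""
        return [
            key
            for key, (_, meta) in self._objects.items()
            if all(meta.get(k) == v for k, v in criteria.items())
        ]

store = ObjectStore()
store.put("img/cat001.jpg", b"...", {"label": "cat", "split": "train"})
store.put("img/dog001.jpg", b"...", {"label": "dog", "split": "train"})
store.put("img/cat002.jpg", b"...", {"label": "cat", "split": "test"})

# Pull only the training images labeled "cat" for the next epoch.
print(store.search(label="cat", split="train"))  # → ['img/cat001.jpg']
```

Without metadata, answering "give me all the training cats" would mean scanning and parsing every object; with it, the query is a simple filter, which is the "context" Mike is describing.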
Scott: Okay, so you hit on almost everything. We need massive scalability, and you also have to figure out how to handle the performance. But I think context is one of the most fascinating things you said, because it's really about infrastructure that can understand our data. So those are some of the major trends. Anything else, or are those the top three?
Mike: Yeah, there are a couple of other things around performance and, I'd say, cost. On the performance side, the last thing you want to do is make a massive investment in GPU-based processing infrastructure and then let those GPUs sit idle with wasted cycles and wait time. Ensuring that your storage infrastructure is performant enough to keep up with the speed of those GPUs is essential. You don't want wasted cycles or inefficiencies, and that's not just on the storage side, that's across the whole stack. And that leads into the cost angle, right? The number one challenge organizations face right now is the cost of the infrastructure required to support AI and ML, and it really starts with the storage infrastructure: making sure you have a solid foundational piece that you don't have to worry about anymore. So start with storage and move on from there.
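[Editor's note: one common pattern behind Mike's "keep the GPUs fed" point is background prefetching: batches are read from storage on a separate thread into a bounded queue so the compute step rarely stalls on I/O. The sketch below is illustrative; `load_batch` and `train_step` are stand-ins, not a real framework API.]

```python
# Illustrative sketch of the prefetching pattern: a producer thread
# reads batches from (slow) storage into a bounded queue while the
# consumer runs compute, so the accelerator is not left waiting on I/O.
import queue
import threading

def load_batch(i):
    # Stand-in for reading batch i from storage.
    return list(range(i * 4, i * 4 + 4))

def train_step(batch):
    # Stand-in for a GPU compute step.
    return sum(batch)

def prefetch(num_batches, depth=8):
    q = queue.Queue(maxsize=depth)  # bounded queue caps memory use

    def producer():
        for i in range(num_batches):
            q.put(load_batch(i))   # blocks if the queue is full
        q.put(None)                # sentinel: no more batches

    threading.Thread(target=producer, daemon=True).start()
    results = []
    while (batch := q.get()) is not None:
        results.append(train_step(batch))
    return results

print(prefetch(3))  # → [6, 22, 38]
```

The bounded queue is the key design choice: deep enough to hide storage latency, shallow enough that prefetched data doesn't exhaust memory. If storage can't refill the queue as fast as the GPU drains it, the GPU waits, which is exactly the wasted investment Mike warns about.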
Scott: I think that's great advice. Thank you, Mike. This is incredibly insightful. Thank you very much.