Time to Stop Creating Aggregation Points. Everywhere.

I complained recently that while servers can scale out ad infinitum these days, we still keep funneling down and down until we are forced to traverse ONE single RAID controller to get to the 8000 disks holding the data behind it. It makes no sense. Sooner or later, we need to scale RAID controllers linearly just like we do servers – after all, each is really the same thing running different applications.

The same holds true in networking. To get from point A to point B, we have to traverse a zillion ports up to a CORE (think RAID controller) and back out again to what we really want, which sits on some other port somewhere else. Why not just have everything connected to everything and take a direct path?

It’s not that crazy.

When you cram down many into one, you force that one to become a super “thing.” You have to deal with scheduling and caching and queue depths and buffering, etc., just to mitigate the inevitable performance problem you get when you throttle many things down to one. That means that by default you need a lot of extra horsepower, technology, intelligence, and COST at that “choke point.”
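A quick back-of-the-envelope sketch shows why. All of the numbers below are hypothetical round figures I picked for illustration, not anyone's spec sheet:

```python
# Back-of-the-envelope oversubscription at a single choke point.
# All figures are hypothetical round numbers, not vendor specs.

DISK_COUNT = 8000            # the "8000 disks" from above
DISK_MBPS = 150              # assumed sustained throughput per disk
CONTROLLER_MBPS = 6000       # assumed bandwidth of the one controller

aggregate = DISK_COUNT * DISK_MBPS
ratio = aggregate / CONTROLLER_MBPS

print(f"Aggregate disk bandwidth: {aggregate / 1000:.0f} GB/s")
print(f"Controller bandwidth:     {CONTROLLER_MBPS / 1000:.0f} GB/s")
print(f"Oversubscription ratio:   {ratio:.0f}:1")
# ~200:1 -- all the caching, queueing, and scheduling in the world
# only hides that ratio; it can't remove it.
```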

What if networks (or RAID controllers) were horizontal instead of edge/core vertical? What if we didn’t even have switches? What if the server did its own switching and used a big mesh fabric to create a transient, direct-connected tunnel from point A to whatever point B is – another node, storage, whatever? You wouldn’t need buffering or queueing; you would just open the pipe and rock and roll.

This is the beauty (potentially) of what true scale-out storage and SDN networking (the Cisco ACI/Plexxi/Rockport Networks way) bring to the party in the new-wave data center.

What if there were a massive interconnect (think a virtual ring) between every “node,” no matter what those nodes were? And what if that interconnect were so massive that any node that needed to communicate with another could instantly set up a dedicated path from A to B to C – complete its task, and close the path?
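As a sketch of that lifecycle, here is a toy model of the idea: a fabric stands up a dedicated path on demand and tears it down the moment the task completes. Everything here (Fabric, open_path) is hypothetical, a thought experiment in code rather than any vendor's API:

```python
from contextlib import contextmanager

# Toy model of a transient, dedicated path across a full mesh.
# Fabric and open_path are hypothetical names, not a real product API.

class Fabric:
    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.active_paths = set()

    @contextmanager
    def open_path(self, *hops):
        """Stand up a dedicated A->B->C path, then tear it down."""
        assert all(h in self.nodes for h in hops)
        path = tuple(hops)
        self.active_paths.add(path)          # "open the pipe"
        try:
            yield path                       # caller rocks and rolls
        finally:
            self.active_paths.discard(path)  # close the path

fabric = Fabric(["A", "B", "C"])
with fabric.open_path("A", "B", "C") as p:
    print("dedicated path up:", " -> ".join(p))
print("active paths after:", fabric.active_paths)  # empty again
```

The point of the sketch is the shape of the operation: no standing queues, no shared buffers, just a path that exists exactly as long as the task does.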

It would mean I really wouldn’t worry too much about protocols anymore (at least network ones). Or layers.

Then, purely theoretically, we would let all the switching happen at the source, and we really wouldn’t need anything but glorified junction boxes to connect things to. That’s what Rockport is attempting to do (let the server use some of its zillion MIPS or ZIPS to run software that switches). Plexxi has a massive fat backbone that can stand up and tear down transient pipes between nodes; theoretically, you could never be more than 3 hops away from what you want, even in a 10,000 node system, with latency way below merchant silicon. Cisco is a weird one, as they LOVE the edge/core model, but their ACI initiative is a preemptive strike that could very well fit into this picture.
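Just to gut-check that 3-hop claim: graph theory’s Moore bound gives the theoretical maximum number of nodes reachable within k hops for a given number of fabric ports (degree) per node. A quick sketch – purely theoretical, since real topologies need some margin over this ceiling:

```python
# Sanity check on "3 hops in a 10,000 node system" via the Moore bound:
# with node degree d and diameter k, a network can contain at most
# 1 + d * sum((d-1)**i for i in range(k)) nodes. This is an upper
# bound; practical fabrics need a somewhat higher degree.

def moore_bound(degree, diameter):
    return 1 + degree * sum((degree - 1) ** i for i in range(diameter))

for d in range(2, 40):
    if moore_bound(d, 3) >= 10_000:
        print(f"degree {d}: up to {moore_bound(d, 3)} nodes within 3 hops")
        break
```

That lands at roughly 22 fabric ports per node – in theory, enough to keep 10,000 nodes within 3 hops of each other. Not crazy at all.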

There are a few scale-out storage players already, but more will have to follow. If Microsoft has its way, the concept of scale-out direct-attached storage could very well mean an end to monolithic RAID systems as we know them.

Yeah, yeah, I know – this is gross speculation but inevitably things that should happen do – sometimes it takes a decade or four.
