Mixing your own big data solution often seems like a bit of alchemy. A venerated wizard seeks out rare ingredients, carefully measures the quantities, lets it bubble for a few months in a POC, then with an "abracadabra," finds some poor users to taste the potion. If they turn green and keel over, it's back to the old cauldron. A widespread shortage of skills makes the whole exercise all the more fraught. Now Oracle wants to change the recipe.
Today, Oracle announced its own Big Data SQL for Hadoop and NoSQL as a way to demystify querying big data, using familiar SQL to join traditional relational data with data held in newer, emerging stores. The basic idea is to bring all data together under a common approach to SQL analysis. To achieve this, a few development requirements were defined:
- Full SQL (not a limited set of functions) must be supported on Hadoop and NoSQL
- No changes to either applications or Hadoop
- Common access to all enterprise data through understood metadata
- No compromises on performance
Initial indications are very positive. By collecting the external Hive metadata into the Oracle catalog, the "unstructured" becomes a known quantity, and the combination of SmartScan filtering, storage indexing, and caching delivers the expected speed. The complete SQL feature set means existing applications and queries should work without rewrites or workarounds for missing functions, whether the data lives in a traditional RDBMS, Hadoop, or even NoSQL stores. This new capability should keep Oracle in a very strong, central position even as new data platforms gain traction with customers of all stripes. Presto!
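To make the approach concrete, here is a hypothetical sketch of what this looks like in practice: a Hive table is surfaced in the Oracle catalog as an external table, after which plain SQL can join it against ordinary relational data. All table, column, and directory names here are illustrative assumptions, not taken from Oracle's announcement, and the exact access-parameter syntax may differ by release.

```sql
-- Hypothetical: expose a Hive table (logs.web_logs) to Oracle as an
-- external table so its Hive metadata lives in the Oracle catalog.
CREATE TABLE web_logs (
  session_id   VARCHAR2(64),
  customer_id  NUMBER,
  page_url     VARCHAR2(4000),
  visit_time   TIMESTAMP
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_HIVE
  DEFAULT DIRECTORY default_dir
);

-- Ordinary, full SQL can then join Hadoop-resident data with a
-- traditional relational table (customers) in a single statement.
SELECT c.customer_name, COUNT(*) AS page_views
FROM   customers c
JOIN   web_logs  w ON w.customer_id = c.customer_id
GROUP BY c.customer_name;
```

The point of the sketch is the second statement: once the metadata is in the catalog, the query engine treats the Hadoop data like any other table, so no application changes are needed.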
You can read more about the announcement here: http://www.oracle.com/us/corporate/pressrelease/big-data-sql-071514