Cloud Dataproc is a highly available, cloud-native Hadoop and Spark service that gives organizations a cost-effective, high-performance solution that is easy to deploy, scale, and manage. Dataproc nodes can be deployed in less than 90 seconds and can be customized and resized to match the resources each job requires. Clusters access data stored in Google Cloud Storage (GCS) and can be used in conjunction with Google’s other big data solutions, such as BigQuery, Dataflow, and TensorFlow, to deliver a single platform for data processing, analytics, and machine learning.
ESG’s economic analysis revealed that an effective deployment of Dataproc can provide significant cost, administration, and agility benefits when compared with on-prem Hadoop and Spark deployments. Additionally, ESG found that the flexibility and cost structure of Dataproc provide savings and benefits even when compared with the Amazon Web Services Elastic MapReduce (EMR) managed Hadoop service. ESG found that Dataproc provided its customers with significant savings and benefits in the following categories:
- Hardware investment – The drastically reduced need for onsite hardware results in lower upfront hardware costs, while the simplified scalability of a Dataproc environment eliminates the need to over-purchase hardware.
- Simplified administration – ESG validated that administration, maintenance, and operation of a Dataproc environment are easier, faster, and more effective than managing Hadoop on-prem or with EMR.
- Business agility – Customers reported that Dataproc improved their business agility, allowing them to harness the value of their data to better serve existing customers and open new revenue streams.