BlueData Taps AWS Cloud for Hybrid Big Data-as-a-Service
BlueData Software Inc. tapped the Amazon Web Services Inc. (AWS) cloud to add support for hybrid architectures to the latest release of its Big-Data-as-a-Service (BDaaS) software.
With the new offering, organizations can run Big Data analytics workloads both on-premises and in the AWS cloud using a common self-service UI and administrative console.
While noting that legacy Apache Hadoop and other Big Data workloads run almost exclusively in on-premises deployments, BlueData said hybrid implementations leveraging public clouds are becoming increasingly viable and popular options. It referenced research from Wikibon indicating that Big Data spending in the public cloud is expected to grow from 5 percent of the overall market in 2015 to 24 percent by 2026.
Research firm Gartner Inc. also noted the increasing encroachment of the cloud in Big Data analytics in its recent report on the "Data Management Solutions for Analytics" market.
"Expectations are now turning to the cloud as an alternative deployment option, because of its flexibility, agility and operational pricing models," Gartner said. "As the use of a combined cloud and on-premises hybrid is quickly becoming the norm, so organizations expect vendors to support them in enabling such deployments."
As if on cue, BlueData today said it's providing exactly such support in the spring release of its BlueData EPIC platform, a self-service, on-demand offering.
"BlueData offers the first and only BDaaS platform that supports this hybrid model, leveraging the inherent infrastructure portability and flexibility of Docker containers," the company said in a statement. "With BlueData, data science teams can spin up instant Hadoop and Spark clusters either on-premises or in the public cloud from a 'single pane of glass.'
"They can iterate quickly and fail fast -- using their tools of choice and focusing on insights instead of the infrastructure. They can easily share their data, models, and code in secure multi-tenant environments powered by fully managed and embedded Docker containers. They can achieve faster time-to-market and lower TCO for their data pipelines, while ensuring enterprise-class IT governance and control."
The hybrid approach -- enabled by BlueData's recently announced support of the AWS cloud -- lets organizations support data science teams no matter what underlying infrastructure is used, offering self-service analytics in elastic and secure environments, BlueData said.
"Now, our customers can easily use the power of Amazon Web Services (and other public clouds in the future) as an extension to their own Big Data infrastructure," BlueData's Anant Chintamaneni said in a blog post. "Conversely, our customers can tap into on-premises data from their Big Data deployments on the AWS public cloud. They can provide self-service, elastic, and secure environments for Big Data analytics whether in an enterprise data center, in the public cloud, or some combination of the two."
In addition to a common UI, the EPIC platform introduces other features to support the unique requirements of hybrid deployments, including:
- Unified multi-tenant security model: BlueData provides enterprise-class security and access controls for multi-tenant Big Data deployments. These same capabilities now extend to AWS and hybrid implementations -- including authentication, Kerberos security, LDAP / AD user management, and integration with AWS Identity and Access Management to control access to Amazon resources (such as Amazon EC2 and S3).
- Workload portability: Leveraging the inherent portability of Docker containers, the same Docker-based application images in the BlueData EPIC App Store can be used with any infrastructure. This flexibility makes it easy to deploy identical Big Data environments -- either in the public cloud or on-premises -- for each stage of the software development lifecycle (for example, dev/test, QA, and production) or for backup and disaster recovery.
- Hybrid data access: BlueData reportedly provides the only BDaaS solution with the ability to tap into on-premises storage (for example, a Hadoop data lake) from clusters running in the public cloud -- as well as the ability to securely access cloud storage (for example, Amazon S3) from an on-premises deployment. This flexibility allows enterprises to deploy their Big Data workloads based on where the data is generated or stored, minimizing data duplication and data transfer costs.
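The idea behind the last item, placing a workload where most of its data already lives, can be illustrated with a minimal Python sketch. This is not BlueData's API or algorithm: the `Dataset` class, the `choose_site` function, the site labels and the dataset names and sizes are all hypothetical, and the only cost considered is the volume of data that would have to be read across sites.

```python
from dataclasses import dataclass

# Hypothetical site labels for illustration only.
SITES = ("on_prem", "aws")


@dataclass(frozen=True)
class Dataset:
    name: str
    location: str   # one of SITES
    size_gb: float


def choose_site(datasets):
    """Pick the site that already holds the most input data.

    Returns (site, remote_gb), where remote_gb is the volume that would
    still have to be read from the other site.
    """
    totals = {site: 0.0 for site in SITES}
    for d in datasets:
        if d.location not in totals:
            raise ValueError(f"unknown location: {d.location!r}")
        totals[d.location] += d.size_gb
    # On a tie, max() keeps the first site listed in SITES.
    site = max(totals, key=totals.get)
    remote_gb = sum(totals.values()) - totals[site]
    return site, remote_gb


# Made-up example inputs.
inputs = [
    Dataset("clickstream", "aws", 800.0),
    Dataset("crm_export", "on_prem", 120.0),
]
site, remote_gb = choose_site(inputs)
print(f"run on {site}; {remote_gb} GB read remotely")
```

A real placement decision would also weigh factors the article does not quantify, such as transfer pricing, network latency and governance constraints.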
The Santa Clara, Calif.-based company said it would demonstrate the BlueData EPIC spring release at next week's Strata+Hadoop World conference in San Jose, Calif.
David Ramel is an editor and writer for 1105 Media.