Databricks Now on Google Cloud

Databricks, the data analytics provider founded by the creators of the Apache Spark analytics engine, announced a new partnership with Google Cloud to enable the deployment of its namesake data engineering solution on another leading cloud platform.

With this move, the San Francisco-based company has reportedly "scored a hat trick" or "won the triple crown" (pick your metaphor), because its solution is now available on Google Cloud, as well as Amazon Web Services (AWS) and Microsoft Azure. Databricks is billing its solution as "the only unified data platform across all three clouds." The Google Cloud implementation is available on Google Cloud Marketplace.

This comes on the heels of news that the company has raised $1 billion in Series G funding. The funding round was led by a new investor, Franklin Templeton, which joins strategic investors AWS, CapitalG and Salesforce Ventures, among others. Microsoft led the list of existing investors participating in the round.

Databricks users will be able to create a data "lakehouse" (which combines the capabilities of a data lake and a data warehouse) on Google Cloud's elastic network, one that supports data engineering, data science, machine learning (ML) and analytics. Databricks now integrates with Google BigQuery's open platform and leverages the Google Kubernetes Engine (GKE), the companies said in a statement, enabling its users to deploy Databricks in a fully containerized cloud environment for the first time.

"Built on a modern lakehouse architecture in the cloud, Databricks helps organizations eliminate the cost and complexity that is inherent in legacy data architectures," said Ali Ghodsi, CEO and co-founder of Databricks, in a statement, "so that data teams can collaborate and innovate faster. This lakehouse paradigm is what's fueling our growth, and it's great to see how excited our investors are to be a part of it."

The new integrations between Databricks and Google Cloud include:

  • Tight integration of Databricks with Google Cloud's analytics solutions, which makes it easier to extend "AI-driven insights" across data lakes, data warehouses and multiple business intelligence tools.
  • Prebuilt connectors for integrating Databricks with BigQuery, Google Cloud Storage, Looker and Pub/Sub.
  • Fast and scalable model training with Google Cloud's AI Platform using the data workflows created in Databricks, and simplified deployment of models built in Databricks using AI Platform Prediction.

Both Databricks and Google have long employed strategies with strong support for open source, and with this announcement, they threw a spotlight on "a commitment to open innovation and open source software."

"Under this new partnership, the two companies will continue to support the open source community, encourage open innovation and collaboration, making it easier for joint customers to build on open-source technologies," they said.

Last year, Databricks contributed its open source MLflow machine learning platform for managing the lifecycle of ML models to the Linux Foundation.

Other vendors, whose partnerships with the two companies form a Databricks/Google Cloud joint ecosystem, have committed to ensuring "seamless integrations" with Databricks on Google Cloud. They include Accenture, Cognizant, Collibra, Confluent, Deloitte, Fishtown Analytics, Fivetran, Immuta, Informatica, Infoworks, Insight, MongoDB, Privacera, Qlik, SADA, SoftServe, Slalom, Tableau, TCS and Trifacta.

Databricks was founded by the creators of the Spark research project at UC Berkeley that later became Apache Spark. The company's namesake unified analytics platform is powered by the Spark big data distributed processing engine. Data science teams use that platform to collaborate with data engineering and lines of business to build data products.

About the Author

John K. Waters is the editor in chief of a number of sites, with a focus on high-end development, AI and future tech. He's been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he's written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS.

