AWS Launches No-Code ETL Tool with Glue DataBrew
Amazon Web Services (AWS) this week announced the launch of Glue DataBrew, a tool that lets organizations prepare their data for machine learning projects using a simple point-and-click interface -- with no coding required.
Glue DataBrew is an extension of AWS' original Glue product, first introduced in 2017. Glue was originally designed to automate the extract, transform and load (ETL) tasks related to preparing data ahead of a machine learning project. Typically, that process could take months, according to AWS. Glue promised to complete it in minutes.
Glue DataBrew, which AWS touts as a "visual data preparation tool," simplifies ETL even further by helping organizations "clean and normalize data up to 80% faster" using its visual interface, according to the announcement.
The tool can access data from various AWS resources, including Amazon Redshift, Amazon Relational Database Service (RDS), Simple Storage Service (S3) and the Glue metadata store. It also works with data stores that are accessible through the Java Database Connectivity (JDBC) API.
Organizations can use the Glue DataBrew console to quickly organize, combine and manage their data. The tool comes with more than 250 "transformations" that automate common data cleanup tasks -- for instance, detecting anomalies, correcting non-standard formatting or removing invalid characters.
Users can automatically apply whatever transformations they choose to future data, as well. "Each transformation is automatically added as a step to build a recipe," AWS said in its announcement. "You can then save, publish, and version recipes, and automate the data preparation tasks by applying recipes on all incoming data."
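The recipe idea can be illustrated with a minimal plain-Python sketch. This is not the DataBrew API or its recipe format; the step functions, the `apply_recipe` helper and the sample rows are all hypothetical, chosen only to show how an ordered list of transformations might be saved once and reapplied to later data.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]
Step = Callable[[Row], Row]


def strip_whitespace(row: Row) -> Row:
    # Trim leading and trailing whitespace from every field.
    return {key: value.strip() for key, value in row.items()}


def uppercase_state(row: Row) -> Row:
    # Normalize the (hypothetical) "state" column to upper case.
    return {**row, "state": row["state"].upper()}


def remove_invalid_chars(row: Row) -> Row:
    # Keep only digits in the (hypothetical) "zip" column.
    digits = "".join(ch for ch in row["zip"] if ch.isdigit())
    return {**row, "zip": digits}


# A "recipe" here is simply an ordered list of transformation steps.
recipe: List[Step] = [strip_whitespace, uppercase_state, remove_invalid_chars]


def apply_recipe(steps: List[Step], rows: List[Row]) -> List[Row]:
    # Run every step, in order, on every row.
    cleaned = []
    for row in rows:
        for step in steps:
            row = step(row)
        cleaned.append(row)
    return cleaned


first_batch = [{"state": " wa ", "zip": "98-101"}]
later_batch = [{"state": "or", "zip": " 97201# "}]

print(apply_recipe(recipe, first_batch))  # [{'state': 'WA', 'zip': '98101'}]
print(apply_recipe(recipe, later_batch))  # [{'state': 'OR', 'zip': '97201'}]
```

Because the recipe is stored separately from the data, the same steps run unchanged on the later batch, which is the behavior AWS describes for incoming data.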
Glue DataBrew also provides a "lineage" map that visually tracks the transformations that have been applied to a given set of data. "In this way, you can understand how data flows and what are the changes," AWS said. "This information is called data lineage and can help you find the root cause in case of errors in your output."
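As a rough illustration of what lineage tracking involves, the sketch below records each step's name and row counts as data passes through it. It is an assumption-laden toy, not how DataBrew builds its lineage map; `run_with_lineage` and the three sample steps are hypothetical names.

```python
from typing import Callable, Dict, List, Tuple

# Each step pairs a human-readable name with a function over a list of values.
Step = Tuple[str, Callable[[List[str]], List[str]]]


def run_with_lineage(values: List[str], steps: List[Step]):
    # Apply each step in order and log what went in and what came out.
    lineage: List[Dict[str, object]] = []
    current = list(values)
    for name, transform in steps:
        result = transform(current)
        lineage.append(
            {"step": name, "rows_in": len(current), "rows_out": len(result)}
        )
        current = result
    return current, lineage


steps: List[Step] = [
    ("trim whitespace", lambda vals: [v.strip() for v in vals]),
    ("drop empty values", lambda vals: [v for v in vals if v]),
    ("lowercase", lambda vals: [v.lower() for v in vals]),
]

cleaned, lineage = run_with_lineage([" Alice ", "", "BOB"], steps)
print(cleaned)  # ['alice', 'bob']
for entry in lineage:
    print(entry)
```

In this toy log, the "drop empty values" entry shows three rows in and two rows out, the kind of record that helps trace an unexpected change in output back to the step that caused it.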
Glue DataBrew is available in the following AWS regions: US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), Europe (Frankfurt), Asia Pacific (Tokyo) and Asia Pacific (Sydney). Users pay as they use the service, with "no upfront commitment," according to AWS.
Gladys Rama is the senior site producer for Redmondmag.com, RCPmag.com and MCPmag.com.