PolyBase – How SQL Server does Big Data!

Developer Support

In this post, Senior App Dev Manager Nayan Patel introduces SQL Server PolyBase and shares some insights from a recent engagement using the technology.

I recently helped a customer complete a successful PoC of SQL Server 2016’s PolyBase feature.  They had invested a significant amount of budget and resources in standing up a Hadoop environment to house massive amounts of unstructured and semi-structured data. However, they had been unable to extract meaningful insights and value from it, largely because of the ETL processes and complex MapReduce jobs required.   As part of the PoC, the SQL team was able to join data from their SQL data warehouse with data in their Hadoop environment using PolyBase and run T-SQL queries over the combined data.

In today’s business environment, data is varied and distributed.  It ranges from structured data, like customer records stored in relational databases, to semi-structured data, like streaming data feeds in Hadoop and cloud blob storage.   To gain insights, however, that data usually has to be moved around or transformed into a single format, which is time-consuming and expensive. PolyBase enables users to run queries on external data in Hadoop, or to import and export data from Azure Blob Storage, using standard T-SQL.  It minimizes data movement by pushing query computation down to the external data source, generating the Hadoop jobs automatically and returning only the results.
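As a rough illustration of what this looks like in practice, the sketch below defines an external table over files in HDFS and queries it with ordinary T-SQL. Every name here is hypothetical and not taken from the engagement: the host `namenode.example.com`, the ports, the `/data/telemetry/` path, the `HadoopCluster`, `CsvFormat`, and `dbo.CarTelemetry` objects, and the column layout. The `hadoop connectivity` value of 7 is also an assumption, since the right value depends on the Hadoop distribution in use.

```sql
-- Assumption: PolyBase is installed on this SQL Server 2016 instance.
-- The connectivity value depends on the Hadoop distribution; 7 is a placeholder.
-- SQL Server must be restarted after this setting changes.
EXEC sp_configure @configname = 'hadoop connectivity', @configvalue = 7;
RECONFIGURE;

-- Point SQL Server at the Hadoop cluster (hypothetical host and ports).
-- RESOURCE_MANAGER_LOCATION is what allows computation to be pushed down to Hadoop.
CREATE EXTERNAL DATA SOURCE HadoopCluster
WITH (
    TYPE = HADOOP,
    LOCATION = 'hdfs://namenode.example.com:8020',
    RESOURCE_MANAGER_LOCATION = 'namenode.example.com:8050'
);

-- Describe how the files are laid out (assumed here: comma-delimited text).
CREATE EXTERNAL FILE FORMAT CsvFormat
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = ',')
);

-- Expose an HDFS directory as a table; the data itself stays in Hadoop.
CREATE EXTERNAL TABLE dbo.CarTelemetry (
    CustomerId INT       NOT NULL,
    Speed      FLOAT     NOT NULL,
    RecordedAt DATETIME2 NOT NULL
)
WITH (
    LOCATION    = '/data/telemetry/',
    DATA_SOURCE = HadoopCluster,
    FILE_FORMAT = CsvFormat
);

-- Query it like any other table. The hint asks PolyBase to push the
-- filter and aggregation to Hadoop; without it, the optimizer decides.
SELECT CustomerId, AVG(Speed) AS AvgSpeed
FROM dbo.CarTelemetry
WHERE Speed > 65
GROUP BY CustomerId
OPTION (FORCE EXTERNALPUSHDOWN);
```

The external table holds only metadata, so no rows are copied into SQL Server until a query asks for them.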


Some of the benefits of using PolyBase include:

  • Directly query Hadoop from SQL Server using T-SQL queries
  • No knowledge of Hadoop or MapReduce required
  • No additional software is needed in the user's Hadoop or Azure environment
  • Optimal performance using distributed query processing
  • Provides seamless integration with SQL BI tools
  • Organizations can reduce time to insight. For example, an auto insurance company can easily combine car telemetry data with customer records to deliver personalized insurance quotes
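The auto insurance scenario in the last bullet might be sketched as a single join between a local table and an external one. This assumes an external table named `dbo.CarTelemetry` has already been defined over the Hadoop data and that a regular table `dbo.Customers` exists in the warehouse; both names and all columns are hypothetical.

```sql
-- dbo.Customers: ordinary SQL Server table (hypothetical).
-- dbo.CarTelemetry: PolyBase external table over Hadoop data (hypothetical).
SELECT
    c.CustomerId,
    c.FullName,
    AVG(t.Speed) AS AvgSpeed,
    COUNT(*)     AS Readings
FROM dbo.Customers AS c
JOIN dbo.CarTelemetry AS t
    ON t.CustomerId = c.CustomerId
GROUP BY c.CustomerId, c.FullName
ORDER BY AvgSpeed DESC;
```

Because the result is a normal rowset, it can feed existing SQL BI tools without any Hadoop-specific connector.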

Learn more about PolyBase here.

Premier Support for Developers provides strategic technology guidance, critical support coverage, and a range of essential services to help teams optimize development lifecycles and improve software quality.  Contact your Application Development Manager (ADM) or email us to learn more about what we can do for you.

