{"id":198,"date":"2020-08-06T16:21:56","date_gmt":"2020-08-06T23:21:56","guid":{"rendered":"https:\/\/devblogs.microsoft.com\/azure-sql\/?p=198"},"modified":"2020-08-24T14:08:08","modified_gmt":"2020-08-24T21:08:08","slug":"developing-in-the-cloud-with-sql-server-big-data-clusters-getting-started","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/azure-sql\/developing-in-the-cloud-with-sql-server-big-data-clusters-getting-started\/","title":{"rendered":"Developing in the cloud with SQL Server Big Data Clusters: Getting Started"},"content":{"rendered":"<p>When it comes to innovation, we have realized that the world\u2019s most valuable resource is no longer oil, but data. One of the biggest challenges today is integrating disparate data sources from many different places. If you&#8217;re looking into building big data analytics solutions on Kubernetes, that&#8217;s where Big Data Clusters (BDC) comes in.<\/p>\n<p><a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/big-data-cluster\/big-data-cluster-overview?view=sql-server-ver15#:~:text=A%20SQL%20Server%20big%20data,it%20with%20your%20relational%20data.\">SQL Server Big Data Clusters (BDC)<\/a> is a cloud-native, platform-agnostic, open data platform for analytics at any scale, orchestrated by Kubernetes. It unites SQL Server with Apache Spark to deliver an integrated data analytics and machine learning experience.<\/p>\n<p>Now, you may be thinking you misunderstood what you just read. You didn\u2019t, so let me be clear: with Big Data Clusters <strong>you run SQL Server in Kubernetes, along with Apache Spark, to get the best of both platforms<\/strong>. You\u2019re not dreaming, and yes, this opens a huge universe of opportunities for developers. Read on!<\/p>\n<h2>SQL Server on Kubernetes. With Apache Spark. 
And more.<\/h2>\n<p>Kubernetes is the most popular open-source container orchestrator today. It has gained traction incredibly quickly and keeps growing every day. The community makes it even stronger: the whole cloud-native ecosystem helps resolve the challenges around security, compliance, networking, monitoring, and more that come with running Kubernetes.<\/p>\n<p>Many developers run their workloads on Kubernetes using a microservices-based architecture or, as it is better described today, a right-sized services-based architecture, since some customers have found overloaded microservices increasingly hard to maintain in enterprise environments. For these developers, the question becomes how to consume big data from different data sources effectively and efficiently.<\/p>\n<div>This challenge can be addressed by deploying SQL Server Big Data Clusters (BDC) on various flavors of Kubernetes, for example Azure Kubernetes Service (AKS) or Red Hat OpenShift. Once you have deployed BDC in your Kubernetes cluster, you can work with your high-volume big data using T-SQL or Spark. Your application will be able to consume data from external sources such as SQL Server, Oracle, Teradata, and MongoDB, as well as any big data stored in HDFS. 
Also, since your SQL Server instance sits in the cluster, there are dramatically fewer concerns around latency and throughput. BDC is not only a big data analytics platform but also an integrated AI and machine learning platform: it enables AI and machine learning tasks on the data stored in HDFS and SQL Server.<\/div>\n<p>Your modern cloud-native application not only runs platform-agnostically and takes advantage of the natural benefits of Kubernetes (high availability, scalability, and full support from its ecosystem), but also gains the capability to work on big data analytics and consume machine learning models.<\/p>\n<h2>Developing applications with BDC<\/h2>\n<p>What\u2019s in it for a developer, you may be asking yourself. This post and the ones that follow will answer this question. To start, let\u2019s focus on how to deploy your very first HelloWorld cloud-native application in BDC. We\u2019ll begin by creating the BDC cluster and getting access to it, then explore the following key tasks in the rest of this blog series:<\/p>\n<ul>\n<li><strong>Develop and deploy the application<\/strong>: the <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/big-data-cluster\/deploy-install-azdata?view=sql-server-ver15\">azdata utility<\/a> can create an app skeleton for you to develop, and it also handles <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/big-data-cluster\/big-data-cluster-create-apps?view=sql-server-ver15\">application deployment<\/a> in BDC.<\/li>\n<li><strong>Run and test the application<\/strong>: the azdata utility provides commands to help you run and test the application in BDC.<\/li>\n<li><strong>Consume the application<\/strong>: You\u2019ll be able to obtain an endpoint with the azdata utility, along with a JWT access token, and since Swagger is integrated, you\u2019ll be able to <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/big-data-cluster\/big-data-cluster-consume-apps?view=sql-server-ver15\">test and run the 
application<\/a> with Swagger UI or Postman.<\/li>\n<li><strong>Monitor the application<\/strong>: with <strong>Grafana<\/strong> integrated into BDC, you\u2019ll be able to monitor your applications using various metrics.<\/li>\n<\/ul>\n<p>To wrap up, the whole process looks like the following diagram:<\/p>\n<p><img decoding=\"async\" class=\"aligncenter wp-image-203 size-full\" src=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2020\/06\/Walk-through-Diagram.png\" alt=\"Image Walk through Diagram\" width=\"1377\" height=\"920\" srcset=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2020\/06\/Walk-through-Diagram.png 1377w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2020\/06\/Walk-through-Diagram-300x200.png 300w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2020\/06\/Walk-through-Diagram-1024x684.png 1024w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2020\/06\/Walk-through-Diagram-768x513.png 768w\" sizes=\"(max-width: 1377px) 100vw, 1377px\" \/><\/p>\n<h2><strong>Deploy and connect to BDC cluster<\/strong><\/h2>\n<p>There are a couple of ways to deploy BDC in your cluster. You need to <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/big-data-cluster\/deploy-big-data-tools?view=sql-server-linux-ver15\">install SQL Server 2019 big data tools<\/a> beforehand; see the SQL Server documentation on how to <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/big-data-cluster\/deploy-get-started?view=sql-server-linux-ver15\">get started with Big Data Clusters<\/a>.<\/p>\n<p>In this article, we\u2019re using <a href=\"https:\/\/azure.microsoft.com\/en-gb\/services\/kubernetes-service\/\">Azure Kubernetes Service (AKS)<\/a> as our preferred deployment platform. 
Once the cluster is set up, the next step is to gain access to it, which you can do with the <strong>az aks get-credentials<\/strong> command as follows:<\/p>\n<pre class=\"prettyprint\">az aks get-credentials -n &lt;the name of the AKS cluster where you deployed BDC&gt; -g &lt;your resource group&gt;<\/pre>\n<p>The output of the above command looks as follows:<\/p>\n<p><img decoding=\"async\" class=\"alignnone wp-image-206 size-full\" src=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2020\/06\/access-to-cluster-e1592584054614.png\" alt=\"Image access to cluster\" width=\"1362\" height=\"125\" srcset=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2020\/06\/access-to-cluster-e1592584054614.png 1362w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2020\/06\/access-to-cluster-e1592584054614-300x28.png 300w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2020\/06\/access-to-cluster-e1592584054614-1024x94.png 1024w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2020\/06\/access-to-cluster-e1592584054614-768x70.png 768w\" sizes=\"(max-width: 1362px) 100vw, 1362px\" \/><\/p>\n<p>In addition, you need to log in to the BDC <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/big-data-cluster\/concept-controller?view=sql-server-ver15\">controller endpoint<\/a> and set your active <strong>azdata context<\/strong> to deploy or interact with BDC. Here, the <strong>azdata context<\/strong> defines your privilege and access control settings, that is, your access level and permissions on a specific namespace in your current cluster. 
To learn more about authentication and authorization in BDC, please check <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/big-data-cluster\/concept-security?view=sql-server-ver15\">this<\/a> documentation.<\/p>\n<p>There are two ways to log in, both using the <strong>azdata login<\/strong> command. The following is an example of logging in interactively, where you are prompted for your namespace, which in this case is the name of your BDC cluster, in addition to the username and password you chose when you created your BDC:<\/p>\n<pre class=\"prettyprint\">azdata login<\/pre>\n<p>The output of the above command looks as follows:<\/p>\n<p><img decoding=\"async\" class=\"alignnone wp-image-208 size-full\" src=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2020\/06\/azdata-login-e1592584667214.png\" alt=\"Image azdata login\" width=\"1732\" height=\"359\" srcset=\"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2020\/06\/azdata-login-e1592584667214.png 1732w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2020\/06\/azdata-login-e1592584667214-300x62.png 300w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2020\/06\/azdata-login-e1592584667214-1024x212.png 1024w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2020\/06\/azdata-login-e1592584667214-768x159.png 768w, https:\/\/devblogs.microsoft.com\/azure-sql\/wp-content\/uploads\/sites\/56\/2020\/06\/azdata-login-e1592584667214-1536x318.png 1536w\" sizes=\"(max-width: 1732px) 100vw, 1732px\" \/><\/p>\n<p>Alternatively, to log in non-interactively, you can specify the authentication method, such as Basic or Active Directory (AD) with an explicit principal, together with the controller endpoint. The command looks as follows:<\/p>\n<pre class=\"prettyprint\">azdata login --auth basic --username &lt;your BDC username&gt; --endpoint 
https:\/\/&lt;ip or domain name&gt;:30080<\/pre>\n<p>Please check <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/big-data-cluster\/reference-azdata?view=sql-server-ver15#azdata-login\">here<\/a> to learn how to use basic or Active Directory authentication to log in.<\/p>\n<p><strong>Next Steps<\/strong><\/p>\n<p>In the next article, we\u2019ll walk through <a href=\"https:\/\/devblogs.microsoft.com\/azure-sql\/developing-in-the-cloud-with-sql-server-big-data-clusters-develop-and-deploy-apps\">how to develop and deploy apps to SQL Server Big Data Clusters (BDC)<\/a>. Stay tuned!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>When it comes to innovation, we realized that the world\u2019s most valuable resource is no longer oil, but data. One of the biggest challenges we\u2019re having today is how to integrate disparate data sources from many different places. If you&#8217;re looking into how to work on Big Data Analytics solutions in Kubernetes, that&#8217;s where Big [&hellip;]<\/p>\n","protected":false},"author":32492,"featured_media":346,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1,283],"tags":[279,284,278,277],"class_list":["post-198","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-azure-sql","category-bdc","tag-aks","tag-bdc","tag-kubernetes","tag-sql-server-big-data-clusters"],"acf":[],"blog_post_summary":"<p>When it comes to innovation, we realized that the world\u2019s most valuable resource is no longer oil, but data. One of the biggest challenges we\u2019re having today is how to integrate disparate data sources from many different places. 
If you&#8217;re looking into how to work on Big Data Analytics solutions in Kubernetes, that&#8217;s where Big [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/posts\/198","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/users\/32492"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/comments?post=198"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/posts\/198\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/media\/346"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/media?parent=198"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/categories?post=198"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/azure-sql\/wp-json\/wp\/v2\/tags?post=198"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}