{"id":5871,"date":"2016-03-22T09:15:00","date_gmt":"2016-03-22T16:15:00","guid":{"rendered":"https:\/\/blogs.msdn.microsoft.com\/visualstudio\/?p=5871"},"modified":"2019-03-19T23:30:38","modified_gmt":"2019-03-20T06:30:38","slug":"introducing-r-tools-for-visual-studio-3","status":"publish","type":"post","link":"https:\/\/devblogs.microsoft.com\/visualstudio\/introducing-r-tools-for-visual-studio-3\/","title":{"rendered":"Introducing R Tools for Visual Studio"},"content":{"rendered":"<p><a href=\"https:\/\/www.r-project.org\/\">R is a programming language<\/a> that is widely used by data scientists, and developers seeking a more powerful tool to work with data. While data scientists use R to write programs, their work product is rarely the program itself. Instead, they produce reports or presentations from the results generated by their R program to help influence or drive business decisions.<\/p>\n<p>R Tools for Visual Studio (RTVS), currently available as a Public Preview release, is a new tool from Microsoft for creating R programs using Visual Studio. RTVS is free, and <a href=\"https:\/\/github.com\/microsoft\/rtvs\">Open Sourced under the MIT license<\/a>. It can be <a href=\"https:\/\/microsoft.github.io\/RTVS-docs\/installation.html\">downloaded by following the instructions here<\/a>, and you can <a href=\"https:\/\/microsoft.github.io\/RTVS-docs\/\">read our documentation here<\/a>.<\/p>\n<p>If you prefer videos, here is a walkthrough of some of the top features of RTVS:<\/p>\n<p><iframe src=\"https:\/\/www.youtube.com\/embed\/KPS0ytrt9SA\" width=\"600\" height=\"315\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\">\n<\/iframe><\/p>\n<h2>A Quick Tour of R<\/h2>\n<p>R is a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Strong_and_weak_typing\">strong<\/a>, <a href=\"http:\/\/c2.com\/cgi\/wiki?DynamicTyping\">dynamically typed<\/a>, interpreted language that draws a lot of inspiration from other languages. 
It is a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Functional_programming\">functional language<\/a> that heavily draws from <a href=\"https:\/\/en.wikipedia.org\/wiki\/Scheme_(programming_language)\">Scheme<\/a> and <a href=\"https:\/\/en.wikipedia.org\/wiki\/S_(programming_language)\">S<\/a>. It is beyond the scope of this blog post to discuss the semantics of the language, but I strongly encourage you to read these two freely available online books for a deep introduction to the language:<\/p>\n<ol>\n<li><a href=\"http:\/\/adv-r.had.co.nz\/\">Advanced R<\/a><\/li>\n<li><a href=\"http:\/\/www.burns-stat.com\/documents\/books\/the-r-inferno\/\">The R Inferno<\/a><\/li>\n<\/ol>\n<p>The remainder of this blog post is a quick tour of R, its libraries, and RTVS with the goal of inspiring you to learn more about the language, its libraries, and how it can be a useful addition to your toolbox for analyzing data.<\/p>\n<p>The quickest way to get started with R is through its <a href=\"https:\/\/en.wikipedia.org\/wiki\/Read%E2%80%93eval%E2%80%93print_loop\">Read-Eval-Print Loop (REPL)<\/a>, which lets you send commands <em>interactively<\/em> to the R interpreter. In RTVS, we surface the R REPL through the R Interactive Window.<\/p>\n<p>As you can see, you can type 3 + 4 and have the result immediately computed by R; no compilation step necessary:<\/p>\n<pre><code>3 + 4<\/code><\/pre>\n<pre><code>[1] 7<\/code><\/pre>\n<p>R\u2019s strength is working with data. Therefore, it\u2019s not surprising that the most heavily used data structure in R is the <a href=\"http:\/\/www.r-tutor.com\/r-introduction\/data-frame\">R dataframe<\/a>, which is a convenient way of working with tabular datasets. There are many ways of getting data into an R dataframe, but perhaps the easiest is to read it from a URI. 
Below, you\u2019re reading a CSV file containing data about locations of airports in the United States from GitHub:<\/p>\n<pre><code>usa_airports &lt;- read.csv(\"https:\/\/raw.githubusercontent.com\/jflam\/VSBlogPost\/master\/usa_airports.dat\", stringsAsFactors = TRUE)<\/code><\/pre>\n<p>In R, you assign variables using the <code>&lt;-<\/code> operator, and you invoke functions using parentheses. So in the code above, you\u2019re invoking the <code>read.csv()<\/code> R library function, passing in the URI to the CSV file.<\/p>\n<p>You can get help on any R library function by using the <code>?<\/code> operator from the REPL. For example, to get help on the <code>read.csv<\/code> function, just type <code>?read.csv<\/code> in the REPL.<\/p>\n<p>Next, you\u2019re using another R function, <code>head()<\/code>, to display the first 6 rows of the dataframe:<\/p>\n<pre><code>head(usa_airports)<\/code><\/pre>\n<pre><code>     X   ID                            name         city       country\r\n1  318 6891           Putnam County Airport  Greencastle United States\r\n2 1104 6890      Dowagiac Municipal Airport     Dowagiac United States\r\n3 1121 6889     Cambridge Municipal Airport    Cambridge United States\r\n4 1470 6885  Door County Cherryland Airport Sturgeon Bay United States\r\n5 1507 6884    Shoestring Aviation Airfield Stewartstown United States\r\n6 1617 6883 Eastern Oregon Regional Airport    Pendleton United States\r\n  IATA_FAA ICAO      lat        lon altitude timezone DST\r\n1      4I7  \\N 39.63356  -86.81381      842       -5   U\r\n2      C91  \\N 41.99293  -86.12801      748       -5   U\r\n3      CDI  \\N 39.97503  -81.57758      799       -5   U\r\n4      SUE  \\N 44.84367  -87.42156      725       -6   U\r\n5      0P2  \\N 39.79482  -76.64719     1000       -5   U\r\n6      PDT KPDT 45.69500 -118.84139     1497       -8   A\r\n               Region\r\n1    America\/New_York\r\n2    America\/New_York\r\n3    America\/New_York\r\n4     
America\/Chicago\r\n5    America\/New_York\r\n6 America\/Los_Angeles<\/code><\/pre>\n<p>The <code>head<\/code> function is fairly primitive, as it just generates text-based output. That\u2019s not surprising since <a href=\"https:\/\/en.wikipedia.org\/wiki\/R_(programming_language)\">R has been around since 1993<\/a>. Surely we can do better in 2016?<\/p>\n<p>As it turns out, we can. There are a lot of libraries in R that bind the R programming language to the most powerful hardware-accelerated rendering platform on the planet: HTML. In R, this is accomplished through a set of Open Source libraries known as <a href=\"http:\/\/www.htmlwidgets.org\/\">htmlwidgets for R<\/a>. Below is the same dataframe rendered using the DataTable widget. We generate an HTML page that contains all of the data from the usa_airports dataframe, and open up a browser window using the default browser that shows an interactive table containing the data. The data really is interactive; try typing \u201cSeattle\u201d into the search box to see it filter the data to only airports in Seattle in real time, or click on column headings to sort by that column.<\/p>\n<pre><code>library(DT)\r\ndatatable(usa_airports[,c(\"name\", \"city\", \"country\", \"IATA_FAA\", \"lat\", \"lon\", \"altitude\")])<\/code><\/pre>\n<p>(to get to the interactive table, please click on the image below)<\/p>\n<p><a href=\"https:\/\/rawgit.com\/jflam\/VSBlogPost\/master\/post.html\"><img decoding=\"async\" title=\"\" src=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2016\/03\/3-21-2016-RTVS-Preview-Database1.png\" alt=\"dataframe rendered using the DataTable widget\" width=\"983\" height=\"635\" \/><\/a><\/p>\n<p>If you prefer to manipulate your data programmatically, you can easily do so as well. A popular library for manipulating data is the <a href=\"https:\/\/cran.rstudio.com\/web\/packages\/dplyr\/vignettes\/introduction.html\">dplyr library by Hadley Wickham<\/a>. 
Let\u2019s say that you want to generate a list of airports located in New York City. You can do this easily via the base R <code>subset<\/code> function (the snippet also loads <code>dplyr<\/code>, which the examples that follow rely on):<\/p>\n<pre><code>library(dplyr)\r\nnew_york_airports &lt;- subset(usa_airports, city == \"New York\")\r\ndatatable(new_york_airports[,c(\"name\", \"city\", \"country\", \"IATA_FAA\", \"lat\", \"lon\", \"altitude\")])<\/code><\/pre>\n<p>(to get to the interactive table, please click on the image below)<\/p>\n<p><a href=\"https:\/\/rawgit.com\/jflam\/VSBlogPost\/master\/post.html\"><img decoding=\"async\" title=\"\" src=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2016\/03\/3-21-2016-RTVS-Preview-Database2.png\" alt=\"Example of interactive table \" width=\"974\" height=\"652\" \/><\/a><\/p>\n<p>You can also do more sophisticated filtering: e.g., select all the airports in NYC below 25 feet elevation, ordering the rows by altitude and selecting only the name, altitude, latitude and longitude of the airport:<\/p>\n<pre><code>low_nyc &lt;- \r\n    usa_airports %&gt;% \r\n    filter(city == \"New York\" &amp; altitude &lt; 25) %&gt;% \r\n    arrange(altitude) %&gt;% \r\n    select(name, altitude, lat, lon)\r\ndatatable(low_nyc)<\/code><\/pre>\n<p>(to get to the interactive table, please click on the image below)<\/p>\n<p><a href=\"https:\/\/rawgit.com\/jflam\/VSBlogPost\/master\/post.html\"><img decoding=\"async\" title=\"\" src=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2016\/03\/3-21-2016-RTVS-Preview-Database3.png\" alt=\"Table with ordering\" width=\"970\" height=\"422\" \/><\/a><\/p>\n<p>Here, you see a more sophisticated use of R syntax via the <code>%&gt;%<\/code> or \u201cpipe\u201d operator. This operator lets you naturally compose operations and read them from left to right. 
So in the above example, you take the <code>usa_airports<\/code> dataframe, keep only the rows where the condition <code>city == \"New York\" &amp; altitude &lt; 25<\/code> holds true, sort those rows by the <code>altitude<\/code> column, and select only the columns <code>name<\/code>, <code>altitude<\/code>, <code>lat<\/code>, and <code>lon<\/code>. The resulting dataset is stored in the <code>low_nyc<\/code> variable.<\/p>\n<p>If you\u2019re curious about the implementation of the pipe operator, see the <a href=\"https:\/\/github.com\/smbache\/magrittr\">magrittr<\/a> package, as well as this excellent blog post on how <a href=\"http:\/\/www.r-statistics.com\/2014\/08\/simpler-r-coding-with-pipes-the-present-and-future-of-the-magrittr-package\/\">magrittr was influenced by the forward pipe operator from F#<\/a>.<\/p>\n<h2>Plotting data on maps<\/h2>\n<p>Once you have your dataset, you can plot it on an interactive map. The <a href=\"http:\/\/www.htmlwidgets.org\/showcase_leaflet.html\">leaflet HtmlWidget<\/a> is an excellent library for generating interactive maps. 
In the code fragment below, you take the dataframe of low-altitude New York City airports that you generated via <code>dplyr<\/code> in the previous step and, using the now-familiar pipe operator, send it to the <code>leaflet<\/code> library. You ask it to generate map tiles and plot circles on them, using the <code>lon<\/code> and <code>lat<\/code> columns for the positions of the circles and the <code>name<\/code> column for the popup that appears when the user clicks on a circle.<\/p>\n<pre><code>library(leaflet)\r\nmap &lt;- \r\n    low_nyc %&gt;% \r\n    leaflet() %&gt;% \r\n    addTiles() %&gt;% \r\n    addCircles(~lon, ~lat, popup = ~name, radius = 200, color=\"blue\", opacity = 0.8)\r\nmap<\/code><\/pre>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/4\/2019\/06\/3-21-2016-RTVS-Preview-VisualStudio.png\"><img decoding=\"async\" title=\"\" src=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2016\/03\/3-21-2016-RTVS-Preview-Map.png\" alt=\"Map showing low altitude New York City airports\" width=\"1180\" height=\"649\" \/><\/a><\/p>\n<h2>Wrapping up the tour<\/h2>\n<p>There is a lot more to learn about R than I have time or space for in this blog post. However, hopefully what I\u2019ve done is whet your appetite to learn more about R. 
There are many, many things that I haven\u2019t covered in this blog post, so I\u2019ve included a bunch of resources below to help you better understand R and its libraries.<\/p>\n<h2>Introduction to the R Programming Language<\/h2>\n<ol>\n<li><a href=\"https:\/\/cran.r-project.org\/doc\/manuals\/R-intro.pdf\">An Introduction to R<\/a>: written by <a href=\"https:\/\/twitter.com\/revodavid?lang=en\">David Smith<\/a>, who currently works at Microsoft on the R team.<\/li>\n<li><a href=\"https:\/\/www.edx.org\/course\/introduction-r-programming-microsoft-dat204x-1\">Introduction to R Programming<\/a>: a free online class created by Microsoft to help you learn R.<\/li>\n<\/ol>\n<h2>Key R Libraries<\/h2>\n<ol>\n<li><a href=\"https:\/\/cran.rstudio.com\/web\/packages\/dplyr\/vignettes\/introduction.html\">dplyr<\/a> is the data manipulation \u201cd plyer\u201d library that is a key tool for helping you quickly manipulate your data into a form that you can analyze.<\/li>\n<li><a href=\"http:\/\/ggplot2.org\/\">ggplot2<\/a> is a plotting library by Hadley Wickham that builds on the <a href=\"https:\/\/rawgit.com\/jflam\/VSBlogPost\/master\/post.html\">grammar of graphics<\/a> ideas.<\/li>\n<li><a href=\"http:\/\/ggvis.rstudio.com\/\">ggvis<\/a> is a plotting library that generates plots on an HTML canvas, using the same <a href=\"http:\/\/vita.had.co.nz\/papers\/layered-grammar.pdf\">grammar of graphics<\/a> semantics as ggplot2.<\/li>\n<li><a href=\"https:\/\/cran.r-project.org\/web\/packages\/RODBC\/index.html\">RODBC<\/a> lets you read data from an ODBC-compliant database like SQL Server.<\/li>\n<\/ol>\n<h2>Microsoft R products<\/h2>\n<p>Microsoft has a deep commitment to R, and provides a full-stack R solution for your applications, complete with tooling, runtimes and libraries.<\/p>\n<ol>\n<li><a href=\"https:\/\/www.visualstudio.com\/en-us\/features\/rtvs-vs.aspx\/\">R Tools for Visual Studio<\/a> is Microsoft\u2019s free, Open Source tooling for R development in 
Visual Studio.<\/li>\n<li><a href=\"https:\/\/mran.revolutionanalytics.com\/open\/\">Microsoft R Open<\/a> is Microsoft\u2019s cross-platform (Windows, OS X, Linux) distribution of R. It combines integration with <a href=\"https:\/\/software.intel.com\/en-us\/intel-mkl\">Intel\u2019s Math Kernel Library<\/a> for <a href=\"https:\/\/mran.revolutionanalytics.com\/rro\/#intelmkl1\">accelerated linear algebra computations<\/a>, as well as integration with the <a href=\"https:\/\/cran.r-project.org\/web\/packages\/checkpoint\/index.html\">checkpoint package<\/a> to ensure that users of your R programs will be guaranteed to be able to run your R program <a href=\"https:\/\/mran.revolutionanalytics.com\/rro\/#reproducibility\">using the same version of the R libraries<\/a> that you used to create it.<\/li>\n<li><a href=\"https:\/\/www.microsoft.com\/en-us\/server-cloud\/products\/r-server\/\">Microsoft R Server<\/a> is Microsoft\u2019s libraries for accelerated R computation on datasets that don\u2019t fit in system memory. It builds on top of the benefits of Microsoft R Open, and adds<\/li>\n<\/ol>\n<h2>One more thing \u2026<\/h2>\n<p>We\u2019ve talked about a bunch of things in this brief blog post. However, perhaps the coolest thing about this blog post is \u2026 <em>I wrote it in Visual Studio<\/em>. 
The document was written in <a href=\"http:\/\/rmarkdown.rstudio.com\/\">RMarkdown<\/a>, a dialect of the popular <a href=\"https:\/\/daringfireball.net\/projects\/markdown\/\">Markdown markup language<\/a>, which supports embedding executable R code snippets within it.<\/p>\n<p>If you want to look at the source code for it, you can <a href=\"https:\/\/github.com\/jflam\/VSBlogPost\">get it at my Github<\/a>.<\/p>\n<p><a href=\"https:\/\/devblogs.microsoft.com\/wp-content\/uploads\/sites\/4\/2019\/06\/3-21-2016-RTVS-Preview-VisualStudio.png\"><img decoding=\"async\" title=\"\" src=\"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-content\/uploads\/sites\/4\/2016\/03\/3-21-2016-RTVS-Preview-VisualStudio.png\" alt=\"RTVS in Visual Studio\" width=\"1310\" height=\"927\" \/><\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>R is a programming language that is widely used by data scientists, and developers seeking a more powerful tool to work with data. While data scientists use R to write programs, their work product is rarely the program itself. Instead, they produce reports or presentations from the results generated by their R program to help [&hellip;]<\/p>\n","protected":false},"author":383,"featured_media":255385,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[1085,1195,561,155],"tags":[237,242,547,137,172,585,357,12],"class_list":["post-5871","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-cloud","category-cross-platform","category-open-source","category-visual-studio","tag-net","tag-azure","tag-f","tag-html","tag-python","tag-r","tag-sql","tag-visual-studio"],"acf":[],"blog_post_summary":"<p>R is a programming language that is widely used by data scientists, and developers seeking a more powerful tool to work with data. 
While data scientists use R to write programs, their work product is rarely the program itself. Instead, they produce reports or presentations from the results generated by their R program to help [&hellip;]<\/p>\n","_links":{"self":[{"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/posts\/5871","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/users\/383"}],"replies":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/comments?post=5871"}],"version-history":[{"count":0,"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/posts\/5871\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/media\/255385"}],"wp:attachment":[{"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/media?parent=5871"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/categories?post=5871"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/devblogs.microsoft.com\/visualstudio\/wp-json\/wp\/v2\/tags?post=5871"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}