Monday, August 29, 2016

R Integration in SQL Server 2016

R is a popular programming language that is used in data science to analyse data. R has capabilities for data analytics. Data analytics needs special data structures such as arrays, vectors, matrix. Traditional RDBMS products do not have capabilities for data analytics because they do not have such data structures stated above. 

The latest release of MS SQL Server 2016 has R integration so that data processing and data analytics can be done in the same data management platform. 

This blog post highlights R integration in SQL Server 2016 by using a simple example. 

Let's assume you have a requirement to generate sequence of numbers. There are various methods you can do in SQL Server using T-SQL.  The typical solution would be using a WHILE loop and few control variables which is mentioned below;

There is an interesting  msdn forum question which has many alternative ways to achieve this. It is worth to read the discussion. 

The same thing can be accomplished using R in a very simple way. R has a function named seq, and below code snippet shows R seq to generate a sequence of numbers from 1 to 10. You can use RStudio to use full R functionalities. 

> seq(1, 10)
 [1]  1  2  3  4  5  6  7  8  9 10

Now, lets see how we can integrate the above simple R function with SQL Server 2016. First you need to enable R integration in SQL Server 2016. 

I had to restart SQL Server service after executing the above code which should not be the case. Below is the version of the SQL Server that I'm using for this demo. 

Microsoft SQL Server 2016 (RTM) - 13.0.1601.5 (X64)

I had to restart SQL Server Launchpad service too which is a new service introduced in MS SQL Server 2016. 

Then you can use sp_execute_external_script programming interface to call an external script written in R or in other language which recognized by SQL Server

Below is the output of the above code. 

There are many complex data analytics work you can do in R and the same thing you can do inside SQL Server by combing the data stored in SQL Server. 


No comments:

Post a Comment

How to interpret Disk Latency

I was analyzing IO stats in one of our SQL Servers and noticed that IO latency (Read / Write) are very high. As a rule of thumb, we know tha...