Tuesday, March 7, 2017

SQL Server errors are too generic

Below is an actual error captured in SQL Server Error log during an incident. 

SQL Server must shut down in order to recover a database (database ID 1). The database is either a user database that could not be shut down or a system database. 
Restart SQL Server. If the database fails to recover after another startup, repair or restore the database.

This error is not quite accurate because it says SQL Server must shut down in order to recover a database which is database id 1. In SQL Server, database id 1 means "master" database. The next statement says the database is either a user database or system database. In this particular case, it can not be a user database. 

The error message is quite generic. It appears that SQL Server does not change the error messages depending on the database id in this case, maybe it could be due to a performance impact. 

Monday, February 27, 2017

Understanding SQL Server backups

One of my friends and colleagues (Jayaruwan alias JB) recently asked me a question regarding SQL Server backup. The question was, during the backup operation, what will happen to the data changes? Are those data recorded in the backup? If so till what point?

I knew for a fact that the data changes during the backup operation will get written to the backup. To demo this, I did the following test.

You need to open two query windows in SSMS. In this demo, I used SQL Server 2016 and bit large sample database named, AdventureWorksDW2016CTP3.

To run the demo yourself, you can follow the below steps.
Open two query windows in SSMS and copy and paste the below code in each query window.


--Window 1

--part 1
USE AdventureworksDW2016CTP3;
CREATE TABLE backuptest
    Id int identity(1,1) primary key,
    Test_Desc varchar(1000)

--part 2
DECLARE @msg3 binary(128);
SET @msg3=CAST('Kick off the full backup of AdventureworksDW2016CTP3' AS binary(128));

INSERT INTO backuptest (Test_Desc)
SELECT command + ' ' + CAST(percent_complete AS varchar) + ' ' + CONVERT(varchar, GETDATE() ,121) FROM sys.dm_exec_requests
WHERE context_info=@msg3;

--part 3
USE AdventureworksDW2016CTP3
SELECT * FROM backuptest

--part 4
USE master;


Copy and paste the below code snippet in the 2nd Window.


--Window 2

--part 1
USE master;
DECLARE @msg3 binary(128);
SET @msg3=CAST('Kick off the full backup of AdventureworksDW2016CTP3' AS binary(128));

PRINT 'Backup Database Started at : ' + CONVERT(varchar, GETDATE() ,121);
BACKUP DATABASE AdventureworksDW2016CTP3 TO DISK='C:\temp\AdventureWorksDW2016CTP3_full.bak'
PRINT 'Backup Database finished at : ' + CONVERT(varchar, GETDATE() ,121);

--part 2
RESTORE DATABASE AdventureworksDW2016CTP3 FROM DISK='C:\temp\AdventureWorksDW2016CTP3_full.bak'


Execute the Part 1 and part 2 queries in Window 1.
Immediately move to the window 2 and execute the part 1. Watch the execution of the second Window, which is BACKUP DATABASE statement and once it's  completed, you need to stop the execution of Query Window 1.

You need to copy and paste the result of the backup operation which is similar to the following.

Backup Database Started at : 2017-02-27 22:21:58.270
Processed 186680 pages for database 'AdventureworksDW2016CTP3', file 'AdventureWorksDW2014_Data' on file 1.
Processed 39 pages for database 'AdventureworksDW2016CTP3', file 'AdventureWorksDW2014_Log' on file 1.
BACKUP DATABASE successfully processed 186719 pages in 14.817 seconds (98.450 MB/sec).
Backup Database finished at : 2017-02-27 22:22:13.200

It has the information of what time the backup started and what time it finished.

Now, you can check the data in backuptest table by executing the part 3 of the query in Window 1. See below figure for sample data set;




You can notice there are many rows inserted along with the percent completed of the backup statement which executed in parallel to the BACKUP DATABASE statement in Query Window 2. For later reference, copy and paste the result set to an Excel sheet. After carefully observing the data set, you need to change the database to master in Query Window 1 (part 4)


At this point, you've a full backup file in the local directory and we know that while the backup operation in progress we inserted the data to a table in Query Window 1.


Now switch back to the Query Window 2 and restore the database using the part 2. Upon completion of the restore, observe the data in backuptest table using part 3 of the Query in Window 1.


You can notice, most of the data changes happened during the backup process is still there. See the below figure;



The last entry recorded in the table is “BACKUP DATABASE 99.9967 2017-02-27 22:22:13.060”. At that point, database backup statement is completed 99% and the time is 22:22:13:060. We also have the backup statement completed time which is, “2017-02-27 22:22:13.200”. So actually it has missed 140 milliseconds worth of data.


You can do this demo yourself and get more understanding.


Monday, February 20, 2017

SQLSaturday582 - Melbourne

It was an honor to speak at SQLSaturday at Melbourne on 11th Feb, 2017. My session was about SQL Server Statistics and New Cardinality Estimator model. However, I could not go through some of the demos due to time restriction. I received much positive feedback so I'm finally happy about the session overall. 

You can find the presentation artifacts here.

Thanks, everyone who attended my session.

Friday, December 16, 2016

Dedicated Administrator Connection (DAC)

The dedicated administrator connection (DAC) allows an administrator to connect to an SQL Server instance to run diagnostic functions or T-SQL statements even if the server is not responding to normal connections. This feature was introduced in SQL Server 2012 and by default this option is disabled. 

You can enable DAC using below two methods;

Using SSMS
  1. Open 

  2. In object explorer, connect to the SQL Server instance that you want to enable 

  3. Right click on Server instance and select 

  4. In 
    View Facets
     window select 
    Server Configuration
     using the drop down box. See below figure

Using T-SQL

sp_configure 'show advanced options',1;
sp_configure 'remote admin connections', 1; 

How to connect via DAC
Once you enabled the DAC for an SQL Server instance, it is pretty easy to connect to the Server via DAC. You just need to mention ADMIN: in front of the server name that your connecting. See below figure:

Note that you can establish a single successful DAC connection to a SQL Server. If you try another DAC connection attempt, you get the following error message;

"A connection was successfully established with the server, but then an error occurred during the pre-login handshake. (provider: TCP Provider, error: 0 - The specified network name is no longer available.) (Microsoft SQL Server, Error: 64)"

There is a possibility that you will get the above error when you try to login to a server via DAC to perform some urgent troubleshooting task. That means someone else has already logged into the server using remote DAC and you may immediately do to panic mode. In such cases, it is very important to know who is using DAC connection. 

Use the below T-SQL code to get the information of already established DAC connection. 

SELECT CASE WHEN ses.session_id= @@SPID THEN 'It''s me! ' ELSE '' END + 
COALESCE(ses.login_name, '???') AS WhosGotTheDAC, 
FROM   sys.endpoints AS en 
JOIN sys.dm_exec_sessions ses 
ON en.endpoint_id = ses.endpoint_id 
WHERE  en.NAME = 'Dedicated Admin Connection'

If your using the server name when connecting, make sure SQL Server Browser service is running. Otherwise you will receive the most common log on error which is "A Network-related or instance-specific error occurred…" which basically says it cannot find the server. 

General guideline

It is always a good practice to enable

AC in all production servers. This option is extremely useful when you want to troubleshoot server issues. 


Friday, December 2, 2016

Persistent Log Buffer with SQL Server 2016 SP1

You can get significant performance gain by just moving the existing applications to SQL Server 2016. There are several case studies to prove this story. You can even get more performance gain (2X - 4X) with combination of SQL Server 2016 SP1 and Windows Server 2016 (not yet released). This performance gain achieved through Persistent (non-volatile) Log Buffer technique. This performance enhancement has been released to SQL Server with SP1. SQL Server utilizes Windows Server 2016 direct memory access (DAX). With this new feature SQL Server can write transaction log data in memory access speed over disk IO speed earlier. With the new type of memory which is persistent, the moments SQL Server writes to log cache (memory area for t-log), the data is persistent. 

For more details:

Tuesday, November 29, 2016

How SQL Server runs on Linux?

I wrote this blog post after Microsoft announced that SQL Server is going to support Linux. I had many questions in mind on how MS is going to design SQL Server to run on Linux. The major question I had was, Is it complete re-write to work on Linux kernel? Now Microsoft has released some details on, how they have designed SQL Server to run on Linux. It was not a complete rewrite of SQL Server instead Microsoft has used a virtualization technique called Drawbridge to run SQL Server on Linux. Drawbridge contains Windows OS Library versions which consists of a user-mode kernel of Windows NT as a process called picoprocess. Picoprocess is a container service and it communicates with Linux OS. 

​For more details refer below links;


Friday, November 18, 2016

SQL Server on Mac

I'm super excited to hear about the news, the first release of CTP (Community Technology Preview) of SQL Server on Linux at Microsoft connect() 2016 on 11/16/2016. This is definitely an unforgettable day for SQL Server enthusiasts around the world and I could not wait to install it on my Mac and give it a try. I thought of writing a blog post about the installation steps that I used to setup SQL Server on Mac. 

However there are few NEW things that you want to know before jump into the installation. As SQL Server DBAs, I believe many are not familiar much with Linux OS as well as new development methods like Docker containers. The CTP released for Linux is available as a Docker container and there are some prerequisites that you need to set up before the SQL Server. It is same for Windows as well. The CTP version of SQL Server is named as SQL Server v.Next. 

​Install Docker on Mac

What is Docker? Docker site defines it as: "Docker containers wrap a piece of software in a complete filesystem that contains everything needed to run: code, runtime, system tools, system libraries – anything that can be installed on a server. This guarantees that the software will always run the same, regardless of its environment."

Installing Docker as easy as other program in Mac. You just need to download the Docker.dmg and follow the instructions. Below are the screenshots that I took during my installation. 

Once the installation is complete, you can see the Docker icon in Taskbar of the Mac. You can also see the status as "Docker is running". I presume this as a thin virtualization layer which you can host what is call, containers. You can view additional settings of Docker by getting its Preferences (like Properties Window in Windows OS).

By default Docker for Mac has 2GB memory allocated. However, to run SQL Server you need 4GB of minimum memory and by clicking on Preferences you can configure the memory allocation as mentioned below;

After that click on "Apply & Restart" button to take the changes in effect. You can notice that the status will change to "Docker restarting..." after a while it shows as "Docker is running".

With that you have completed the infrastructure necessary to run the SQL Server docker container. 

You also need to familiar with some Docker terminal commands for some administrative work. Later part of the blog has those infomation.  

Download and Install SQL Server Docker Container for Mac

For this task, we need to open Mac Terminal Window and enter the following commands. 

1. Pull the Docker image from Docker hub. 

    sudo is required since this needs admin privileges. 

    sudo docker pull microsoft/mssql-server-linux

2. Run the Docker image.

docker run –e 'ACCEPT_EULA=Y' –e 'SA_PASSWORD=P@$$w0rd!' -p 1433:1433 -d microsoft/mssql-server-linux

Note: You can give any password for the SA account. 

Once completed;

Note: Sometimes copy and paste commands to Terminal Window does weird things to formatting. After copying, hyphens and quotation marks changes to different characters. If that happens you will get errors stating some characters are NOT VALID. DO NOT panic, just delete them and re-type the correct character in the Terminal Window itself. Then all should work without any issues.

Once you come to this level, you have installed SQL Server and its running as Docker image. Now you need to know few Docker commands to see the status of Docker. Again you need to run the those commands in Mac Terminal.

Docker commands

1. docker ps

List of containers which are running at the moment.

2. docker ps -a

To list all containers that are attached to the Docker. Below screenshot shows the output of this command. You can see, the STATUS column is "Exited", meaning the SQL Server image is not running at the moment. So you need to start the SQL Server docker container.

3. docker start <container id>

You can see that now, after starting the particular container id for SQL Server docker image, the SQL Server is running. 

4. docker stop <container id>

If you want to stop the docker container. I believe this is like you stop SQL Server service in Windows machine. 

What About SQL Server Client Tools?

The most favorite tool for SQL Server DBAs is the SQL Server Management Studio (SSMS). The next question is how to install SSMS in Linux. Unfortunately native build of SSMS for Linux is NOT YET available as of today. However you can use some other command line tools such as Powershell or SQLCMD. I've installed SQL-CLI in Mac so that I can connect to SQL Server on Mac. 

Install SQL-CLI on Mac

First you need to install node.js on your Mac using the link mentioned under References. It's a dmg package and you can install it same way you did in Docker. Its super easy. After that you need to run below command in Terminal to install the sql-cli tool. 

npm install -g sql-cli

See below screenshot for the output. 

With that you have a client tool (command line) to connect to the SQL Server on Mac. 

If you just type MSSQL in the terminal, you will get a login failure that is because you've not specified the credentials. Let's see how you can connect to the SQL Server instance with proper credentials. 

Connecting to SQL Server on Mac

Type the below command in Terminal. 

> mssql -u sa -p 'P@$$w0rd!'

Wow!. Isn't it nice? Now I've just connected to SQL Server on Mac for the first time. Now it's time to keep getting busy with it.

Executing SQL Commands

1. SQL Server Version

2. List all the databases

Still I've not created any user databases so that you can see only system databases. Nothing new here. 

2. List of all the table in msdb DB

Now you can play with regular work that you do with SQL Server on daily basis. 



Friday, October 21, 2016

Short notes on Statistics

  • Recompile would be the 1st step to influence to optimize to recompile the code again.
  • Compilation is the most important stage
  • Cost based optimizer == need to evaluate a query reasonably quickly
  • When writing a query, there are many ways to write it but do not focus to that first. Focus the result.
  • WHERE msal * 12 > literal -> Bad
  • WHERE msal > literal / 12 -> Good
  • All about statistics / estimation -> Query performance
  • Most of the cases the issue is not because of out of date stats or indexes, it could be parameter sniffing
  • EXEC <proc name> <parameters> -> uses the existing plan
  • EXEC <proc name> <parameters> WITH RECOMPILE -> generates a new plan
  • Using above two, you can identify the parameter sensitiveness (sniffing)
  • Update stats invalidate the plan
  • Estimates vs Actual -> If it is different, it may not be the stat problem always. It may be parameter sniffing.
  • Selectivity
    • Optimizer loves highly selective predicates
    • Low no.of rows -> high selectivity
    • High no.of rows -> low selectivity
  • Statistics -> Summarized info about the data distribution of table columns
  • DBCC AUTOPILOT -> Undocumented
  • Hypothetical Indexes -> just the index structure.
    • What if analysis using AUTOPILOT with >= SQL Server 2008
  • sp_helpstats ‘<table name>’, ‘all’
  • Histogram -> 200 steps + 1 row for the null if the column allows null
    • SQL Server 7.0 had 300 rows
    • When the histogram is large, it increases the compilation time because histogram is not like an index
    • EQ_ROWS – Equal rows for the index key
    • Rows * All density = Avg no.of rows returns for that column or the combination of cols
    • If the table is huge, the data distribution present in histogram is not quite accurate
    • Step compression -> When building the histogram, if it finds the value approximately similar to each adjacent steps, then the algorithm will compress those steps and create 1 step
    • RANGE_ROWS -> values between two HI_KEYs and excluding the HI_KEYs at both ends
  • Exec sp_autostats ‘<table name>’
  • Sys.stats -> shows all the stats including index stats
  • Sys.dm_db_stats_properties -> gives lot of details that you can use to update stats more effectively programmatically.
  • Histogram -> direct hit
  • Stats will update on Index Rebuild but not on Index re-org.
  • Entire stats structure stored in db as a BLOB (Header, Density Vector and Histogram)
  • Partitioned table uses table level stats which has 200 steps by each by each partition
  • Online partition level index rebuild in SQL Server 2014, but when MS introduced partitioning? It was in SQL Server 2005. So MS took 9 years to get online partition level index rebuild
  • Tuple cardinality -> used to estimate distinct cardinality
  • Optimization | Compilation | Execution
    • Local variable values are not known at optimization time
  • Parameters and literals can be sniffed -> uses histogram
  • Variables cannot be sniffed -> density vector
  • UDTATE STATS with any % will not be parallelized but FULL SCAN can be parallelized.
  • Dynamic auto updating threshold
    • For large tables to have 20% change , you need to wait long time to update stats.
  • SQL Server does not understand the correlation between columns
  • Calculation direction in query plan is from right to left.
  • Problem is always with monster tables
  • Is 200 steps enough for such tables?
  • Even the table is partitioned, still the histogram is @ table level
  • Sp_recompile for a table is an expensive operation. It needs Sch-M lock on the table
  • Filtered stats – consider creating filtered stats, even daily basis to tackle and solve the estimates problems due to skewed data. So that you will get better execution plans
  • QUERYTRACEON (2353) -> Additional info.

Monday, August 29, 2016

Powershell supports Linux

Microsoft this week announced that PowerShell is now open sourced and available on Linux, the start of a development process that aims to enable users to manage any platform from anywhere, on any device.

R Integration in SQL Server 2016

R is a popular programming language that is used in data science to analyse data. R has capabilities for data analytics. Data analytics needs special data structures such as arrays, vectors, matrix. Traditional RDBMS products do not have capabilities for data analytics because they do not have such data structures stated above. 

The latest release of MS SQL Server 2016 has R integration so that data processing and data analytics can be done in the same data management platform. 

This blog post highlights R integration in SQL Server 2016 by using a simple example. 

Let's assume you have a requirement to generate sequence of numbers. There are various methods you can do in SQL Server using T-SQL.  The typical solution would be using a WHILE loop and few control variables which is mentioned below;

There is an interesting  msdn forum question which has many alternative ways to achieve this. It is worth to read the discussion. 

The same thing can be accomplished using R in a very simple way. R has a function named seq, and below code snippet shows R seq to generate a sequence of numbers from 1 to 10. You can use RStudio to use full R functionalities. 

> seq(1, 10)
 [1]  1  2  3  4  5  6  7  8  9 10

Now, lets see how we can integrate the above simple R function with SQL Server 2016. First you need to enable R integration in SQL Server 2016. 

I had to restart SQL Server service after executing the above code which should not be the case. Below is the version of the SQL Server that I'm using for this demo. 

Microsoft SQL Server 2016 (RTM) - 13.0.1601.5 (X64)

I had to restart SQL Server Launchpad service too which is a new service introduced in MS SQL Server 2016. 

Then you can use sp_execute_external_script programming interface to call an external script written in R or in other language which recognized by SQL Server

Below is the output of the above code. 

There are many complex data analytics work you can do in R and the same thing you can do inside SQL Server by combing the data stored in SQL Server.