Thursday, November 26, 2009

How to create a procedure in all databases in a server?

I had a situation where I wanted to create a stored procedure in all databases for a specified SQL server.

I created a below code for that purpose.

Thursday, October 29, 2009

SQL Azure Database

SQL Azure Database is Microsoft's cloud-based relational database system which is based on Microsoft SQL Server. Earlier it was called SQL Data Services. SQL Azure CTP 2 is now available for MSDN subscribers.


SQL Azure Team Blog

Wednesday, October 28, 2009

View Jobs Schedule Information

The code portion given below is useful when you need to get the schedule information of SQL Server jobs. There are several system tables exist in msdb database to retrieve jobs related information. But in this scenario I wanted to see the jobs schedule description which displays in jobs properties UI.

The description is not stored in any system tables as is rather it will do a manipulation by using other values. 

It will use the system SP, sp_get_schedule_description   to prepare description. 

create table #temp_jobschedule
job_id uniqueidentifier
,schedule_id int
,schedule_name sysname
,[enabled] int
,freq_type int
,freq_interval int
,freq_subday_type int
,freq_subday_interval int
,freq_relative_interval int
,freq_recurrence_factor int
,active_start_date int
,active_end_date int
,active_start_time int
,active_end_time int
,date_created datetime
,schedule_description nvarchar(4000)
,next_run_date int
,next_run_time int
,schedule_uid uniqueidentifier
,job_count int


declare @Job_Id uniqueidentifier
declare @Last_Job_Id uniqueidentifier

select top 1 @Job_Id=J.job_id
from msdb..sysjobs j
where j.enabled=1
and j.category_id not in (10,15,13)
order by J.job_id

select top 1 @Last_Job_Id=J.job_id
from msdb..sysjobs j
where j.enabled=1
and j.category_id not in (10,15,13)
order by J.job_id desc

while @Job_Id is not null

insert into #temp_jobschedule (schedule_id,schedule_name,[enabled],freq_type,freq_interval,freq_subday_type,freq_subday_interval,freq_relative_interval,freq_recurrence_factor
exec sp_executesql N'exec msdb..sp_help_jobschedule @job_id=@JOB_ID,@include_description=1',N'@JOB_ID uniqueidentifier', @JOB_ID=@Job_Id

select top 1 @Job_Id=J.job_id
from msdb..sysjobs j
where j.[enabled]=1
and j.category_id not in (10,15,13)
and J.job_id>@Job_Id
order by J.job_id

if @Job_Id=@Last_Job_Id
set @Job_Id=null


update #temp_jobschedule set job_id=jsch.job_id
from msdb.dbo.sysjobschedules jsch
where #temp_jobschedule.schedule_id=jsch.schedule_id

from #temp_jobschedule jsch
inner join msdb.dbo.sysjobs j
on jsch.job_id=j.job_id
inner join msdb.dbo.sysjobservers jsrv
on jsch.job_id=jsrv.job_id

Thursday, October 15, 2009

Undocumented DBCC Commands

I found this article while searching about DBCC commands. This is really interesting one specially for the people who wants to know SQL Server internals. This article applies for SQL Server 7 and 2000 but I have used some DBCC commands in SQL Server 2005 too. (E.g: DBCC IND, DBCC PAGE)

Since these are undocumented onces you have to use at your risk and no support available from Microsoft.

Useful Undocumented DBCC Commands

Wednesday, October 14, 2009

Using Backslash (“\”) in SSIS Expressions

I wanted build an expression to make a folder name dynamically. (Basically expressions are anyway dynamic in nature)

So I simply used the “\” as end of my expression as stated below;

@[User::Differential_backup_path] + @[User::database_name] + "Differential\"

It gave an error;

Then I tried various methods including CHAR(47) which is an ASCII value for the backslash.

After spending some time on research in google, I found that you need to use two backslashes. (“\\”)

So the correct expression would be;

@[User::Differential_backup_path] + @[User::database_name] + "Differential\\"

Monday, October 12, 2009

Identify out of synch subscriptions

One of the main tasks of DBA is to monitor replication status. There may be situations where subscriptions may out of synch due to various reasons. If this happened, DEV and QA will face issues and they will start communicate with DBAs to verify the replication status. If you’ve not setup proper monitoring mechanism to see replication staus this is the usual story.

However if you have already setup monitoring mechanism the DBAs will identify the issues upfront without making panic on DEVs and QAs.

This blog is on how to identify out of synch subscriptions especially in transactional replication.

Thursday, October 8, 2009

Policy-based Management in SQL Server 2008

Policy-base Management is a new feature in SQL Server 2008 that helps SQL Server administration. It allows Database Administrators to manage SQL Server instances by intent through clearly defined policies, thus reducing the potential for administrative errors. The policy-based framework implements the policies behind the scene with a Policy Engine, SQL Server Agent jobs, SQLCLR, DDL triggers and Service Broker. Policies can be applied or evaluated against a single server or a group of servers, thus improving the scalability of monitoring and administration.

I found couple of interesting articles on this subject and I'm posting them here for you're reference. 

Tuesday, October 6, 2009

Inserting multiple rows in single INSERT

Do you know there is a new T-SQL command in SQL Server 2008 which facilitates you to insert several rows in single INSERT statement? This is really a cool feature to me. As a DBA I want to write many data patches to populate data specially for lookup data. Let me say, you want to insert several records into a table. You will have to write several INSERT statements to accomplish this task in prior versions of SQL Server. Now in SQL Server 2008 it is very easy. So I thought of posting this blog for your reference. Read the full blog to see this cool command. 

I want to write a data patch to populate the table, dbo.CountryRegion.

IF OBJECT_ID('dbo.CountryRegion','U') IS NOT NULL
DROP TABLE dbo.CountryRegion

CREATE TABLE dbo.CountryRegion
CountryRegionCode char(2)
,[Name] varchar(100)

Sunday, October 4, 2009

Tables with no Clustered Index

Most of the time it is a good practice to have a Clustered Index on each table. However I've seen numerous instances where there are tables without Clustered Index. The following code will help 
you to identify the list of tables which do not have Clustered Index.

Sunday, September 27, 2009

Query best practices

In this blog I'm stating several best practices when writing SQL queries in MS SQL Server. They are;
  • Choose the most appropriate data type
  • Write correctly formed queries
  • Return only the columns/rows needed
  • Fully qualified object names
  • Avoid long actions in triggers
  • Avoid expensive operators such as NOT LIKE
  • Minimize cursor use
  • Limits query and index hints
  • Avoid implicit or explicit functions in WHERE clause

Monday, September 21, 2009

Name begins with sp_

If your stored procedures names begin with the sp_ prefix and is not in the master database, you will see SP:CacheMiss before the cache hit for each execution even if the stored procedure call is owner qualified. The reason for this is that the sp_ prefix tells SQL Server that the stored procedure is a system stored procedure. System stored procedures have different name resolution rules. With system stored procedures, SQL Server will look first in the master database if the call is not database qualified. Then will look in the current database. To avoid this additional work, do not use stored procedure names that begin with sp_

Saturday, September 19, 2009

Loop through all the databases in a server by using SSIS

As a database developer or administrator you may need to loop through all the databases in a server and perform various tasks for each database. There are various methods you can perform this in T-SQL. However here I’m presenting a simple solution by using an SSIS package.Let me go through the solution and explain how it is implemented.

Below are the steps;

  1. Start SQL Server Business Intelligence Development Studio.
  2. Create an Integration Project. Name the directory and project name.
  3. Rename the package to EachDatabase.dtsx by using solution explorer.

  1. Create an OLE DB connection in Connection Manager to the target server. Set the RetainSameConnection property to True in connection manager's property window.
  2. Add two Execute SQL Tasks, Script Task and For Loop Container to the Control Flow section.
  3. Connect them as shown below;

  1. I have given meaningful names for each component.
  2. You may be getting errors and warnings in Execute SQL Tasks and ignore them for the movement.
  3. You need to add four variables to the package. Scope of the variables will be different according to the place where they use.
  4. See the diagram below for the variables.

  1. Max_ID and Min_ID variables have the package scope while “database_name” and “loop” are having For Loop Container scope.
  2. Now let’s configure Execute SQL Tasks, For Loop Container and Script Task to get the final output.
Configure “Get all databases” Execute SQL Task

Change the properties mentioned below.
  1. Name, Description (You can give any name)
  2. ResultSet: Single row
  3. ConnectionType: OLE DB
  4. Connection: Select the connection you’ve created in the beginning
  5. SQLSourceType: Direct input
  6. SQL Statement: Add the SQL statement mentioned below;

  1. Set the variables “Result Set”. This is the place where you initialize Max_ID and Min_ID variables. Max_ID and Min_ID are not the database id rather it is an identity value of the temp table.

Configure For Loop Container

  1. Double click on the container and you will get the For Loop Editor as shown below;

  1. Set the For Loop Properties like above.

Configure “Get database name” Execute SQL Task

  1. Double click on “Get database name” Execute SQL Task.
  2. Set the property values;
    1. ResultSet: Single row
    2. ConnectionType: OLE DB
    3. Connection: (The connection you’ve created)
    4. SQLSourceType: Direct input
    5. SQLStatement: [select dbname from #databases where id=?]

  1. Set the parameter mapping. Variable “loop” will be the input and by using that value you get the database name.

  1. Setting Result Set. Assigning database name to the variable, “User::database_name

Configure “Display database name” Script Task

  1. Double click on the Script Task.
  2. Select Script tab in the left pan.
  3. Enter User::database_name in ReadWriteVariables section.
  4. Click on Design Script button.
  5. Add the VB.NET code mentioned below;

Dim db_name As String

db_name = Dts.Variables("User::database_name").Value.ToString()


  1. Save and exit.

Now the package is all set for the final execution. Start the package and see its execution. Database name will be displayed in a message box and when you click on “OK” the next one will be displayed.

The advantage of this solution is to define the databases which you are going to loop through from the beginning.

I have used SQL Server 2005 BIDS (Business Intelligence Development Manager) to demonstrate the solution.

Wednesday, September 9, 2009

More about ORDER BY clause

Every one knows the purpose and the use of ORDER BY clause in SQL Server, but most may not know how ORDER BY works, in-side SQL Server. The implementation of T-SQL is based on set theory. Each table is considered as a set. Let’s take an example of Customer and Order tables. They are actually considered as two sets. The records of the table are identical to members or elements in sets. The T-SQL statements will not work as sequential language execution. Consider the T-SQL statement below.

use AdventureWorks

SELECT TOP 100 SH.CustomerID,SH.SalesOrderNumber,SD.OrderQty,SD.UnitPrice FROM Sales.SalesOrderHeader SH
INNER JOIN Sales.SalesOrderDetail SD
ON SH.SalesOrderID=SD.SalesOrderID

When you execute the query above, it will not execute sequentially within the SQL Server. Even though the SELECT statement is appeared first it will not execute as the first statement. Instead it will execute at a later part of the query (The execution order of a T-SQL query is beyond the scope of the article). During execution, T-SQL query will go through several virtual tables. The final result set of the query also displays as a virtual table.

But when you include the ORDER BY clause in your query the final result will not be a table, instead it returns as a cursor output. Because, a set does not have predetermined order of its members. It is a logical collection of members and the order of the members should not matter. In other words you can not perform an ORDER BY in a set simply by using set concepts. This is the reason the SQL Server uses cursors for this operation. ORDER BY is a costly operation to SQL Server (The use of cursors is anyway a costly operation).

Consider the T-SQL query below.

USE AdventureWorks

SELECT SalesOrderID,OrderQty,ProductID,UnitPrice FROM Sales.SalesOrderDetail

Look at the execution plan of this query.

You may wonder why the execution plan does not contain the SORT operator. The obvious reason is SalesOrderID is the clustered index of the table, Sales.SalesOrderDetail which means the data is physically ordered by using the key of SalesOrederID column. So there will not be any additional cost for this sorting.

Let’s look at the query below;


USE AdventureWorks

SELECT SalesOrderID,OrderQty,ProductID,UnitPrice FROM Sales.SalesOrderDetail
ORDER BY UnitPrice
Look at the execution plan of this query.

Now you can see the SORTing is happening here and it consumes 71% percent from the total cost of the query.
Consider this query;

USE AdventureWorks

SELECT SalesOrderID,OrderQty,ProductID,UnitPrice FROM
(SELECT SalesOrderID,OrderQty,ProductID,UnitPrice FROM Sales.SalesOrderDetail
ORDER BY UnitPrice) A

I have used ORDER BY clause withing the sub-query, let’s execute this and see the result.

What do you expect as the result of this query?

Think before it executes.

Below is the output;

Msg 1033, Level 15, State 1, Line 3
The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified.

Were you able to figure out why we are getting this error?

As I already explained, the query will execute by creating several virtual tables, keeping the intermediate results and final result also a virtual table. But ORDER BY clause works quite differently. It returns a cursor. So the sub-query is returning a cursor result while outer query is implementing a virtual table. This is the reason why you can not use the ORDER BY clause in sub-queries, views, inline functions, etc (as mentioned in the error message).

The error message gives a hint to resolve the issue.

Consider the query below;


USE AdventureWorks

SELECT SalesOrderID,OrderQty,ProductID,UnitPrice FROM
(SELECT TOP 100 PERCENT SalesOrderID,OrderQty,ProductID,UnitPrice FROM Sales.SalesOrderDetail
ORDER BY UnitPrice) A

Now you can see the result of the query and the execution plan.

The difference of query 3 & 4 is the TOP 100 PERCENT statement. When you use the TOP operator (which is not a relational operator rather it is a T-SQL specific) the ORDER BY clause will not use a cursor when implementing. It uses a virtual table thus it prevents the above error and also the cost is very low with compared to the query 2.


As a rule of thumb it is a good practice to avoid ORDER BY clause unless you really need the rows sorted. Use ORDER BY clause in sub-queries with TOP 100 PERCENT if you really need it. A query with ORDER BY clause can not be used as a table expression that is, a view, inline table-valued function, sub query, derived table, or common table expression (CTE).

SQL version used for this article is, Microsoft SQL Server 2005 - 9.00.3073.00 (Intel X86) Copyright (c) 1988-2005 Microsoft Corporation Enterprise Edition on Windows NT 5.2 (Build 3790: Service Pack 2)

Thursday, September 3, 2009

Understanding INNER JOIN in detail

The objective of this article is dig into detail of how INNER joins produce the result set. The INNER join is the common join type used in many cases. Even though it is used very often I have seen most people are not certain how it produces the result set when changing the join condition.

Before go into the article contents, I need to create the two tables below to illustrate the examples used here.
Table 1
CREATE TABLE #InnerJoinTest1    
 InnerJoinTest1_PK INT IDENTITY(1,1)     
Table 2
CREATE TABLE #InnerJoinTest2    
 InnerJoinTest2_PK int    
Since the tables above are created in tempdb, the database currently used in your query editor will not matter. Then I use the queries below to populate the created tables with some sample data.
To keep the explanation easy I used only 5 records in each table. However once you understand the theory behind it you can map it to larger tables as well. The concept is similar despite of the number records in the tables.
Populate table 1
INSERT INTO #InnerJoinTest1    
Populate table 2
INSERT INTO #InnerJoinTest2    
SELECT * FROM #InnerJoinTest1
Once you have executed all the queries above successfully, everything is set for our discussion.
The Scenario
I will present you all the queries which involve an INNER JOIN in advance so that you can manipulate and decide how many records each query will return. Finally you can compare your answers with the result set that actual query execution produces by the SQL Server.
Query 1
SELECT a.*,b.* FROM #InnerJoinTest1 a    
INNER JOIN #InnerJoinTest2 b    
ON a.InnerJoinTest1_PK=b.InnerJoinTest2_PK
Query 2
SELECT a.*,b.* FROM #InnerJoinTest1 a    
INNER JOIN #InnerJoinTest2 b    
ON a.InnerJoinTest1_PK>b.InnerJoinTest2_PK
Query 3
SELECT a.*,b.* FROM #InnerJoinTest1 a    
INNER JOIN #InnerJoinTest2 b    
ON a.InnerJoinTest1_PK>=b.InnerJoinTest2_PK
Query 4
SELECT b.*,a.* FROM #InnerJoinTest1 a    
INNER JOIN #InnerJoinTest2 b    
ON a.InnerJoinTest1_PK
ORDER BY a.InnerJoinTest1_PK
Query 5
SELECT b.*,a.* FROM #InnerJoinTest1 a    
INNER JOIN #InnerJoinTest2 b    
ON a.InnerJoinTest1_PK<=b.InnerJoinTest2_PK    
ORDER BY a.InnerJoinTest1_PK
Query 6
SELECT a.*,b.* FROM #InnerJoinTest1 a    
INNER JOIN #InnerJoinTest2 b    
ON a.InnerJoinTest1_PK<>b.InnerJoinTest2_PK
Take a piece of paper and manually determine the result set of each query will return.
Cartesian Product
First of all, I would like to mention that I will consider table 1 as set 1 and table 2 as set 2. (according to the set theory).
Let's produce the Cartesian product of the two sets. The Cartesian product of the two sets is given below.

You can produce the Cartesian product when you join every row of one table to every row of another table. You can also get one by joining every row of a table to every row of itself.
It is very important to understand how the result set is produced for the Cartesian product from the queries above. The Cartesian product will give you the maximum possible number of combinations that can be created from the given two sets. The number of records in the Cartesian product can be computed by using the following formula.
Number of records = No.of records of table 1 X No.of record of table2
In our case it is 25. (5x5)
Let's look at the first query. This kind of INNER JOIN is the most popular and often you may be able to determine the result set it produces. This INNER JOIN with the equal condition will give you the matching records of both sets. If you look at the above Cartesian product, how many matching records you will be able to find?

I have highlighted the matching instances in the above Cartesian product. So that the first query will give you the highlighted records as the output.
Then consider the second query.
In this case you need to find all the possible record combinations where the first table value is greater than the second table values.
The image below shows the Cartesian product and I have highlighted the record combination for query 2.

Execute query 2 and compare the result set with the highlighted records shown above. The actual query execution result should match with the above highlighted result set.
Applying the same theory for rest of the queries, you should be able to figure out the result set of each one.
Let's take query 3. Again I'm using the Cartesian product to determine the result set. This query will produce the records where InnerJoinTest1_PK is greater than or equal to the InnerJoinTest2_PK.

It is quite easier now to determine the result set if you consider the Cartesian product of two tables.
Let's look at the next two queries. (Query 4 &5) The Cartesian product below is shown the result set of both queries 4 and 5.
The green highlighted records are the output of query 4 where as query 5 will give you the combination of green highlighted and yellow highlighted records. Because the difference of query 4 and 5 is the '=' operator. So that the query 5 will output equal values of both columns in addition to the output of the query 4. The query 4 will produce the records where InnerJoinTest1_PK is less than the InnerJoinTest2_PK of the table 2.

The order of results from the actual execution of this query may be different with the Cartesian product. But it is not really a matter for our discussion. You use simply an ORDER BY clause to order the result set as you desire.
Finally consider the last query. (Query 6) This query will output the records where both columns are not equal. Again look at out most important Cartesian product shown below.
The green highlighted records show where both columns are not equal. Theoretically this result set is similar to Cartesian product - intersection. The intersection is the INNER JOIN result with equal operator. (Output of the query 1)

I assume you all got the better understanding over the different types of INNER JOIN presented in this article and how the result set is determined without actually executing the query. This understanding is very important when it comes to development or the administration work. The article is open for further discussions.

Understanding Skewed Data in SQL Server

My latest article. Understanding Skewed Data in SQL Server Introduction I re...