Monday, August 29, 2016
Thursday, August 25, 2016
It is certain that many are excited and welcome the Microsoft official announcement of SQL Server is going to support Linux platform. Specially the open source community must be thrilled with the news.
However, still there is not much information available to public about how SQL Server on Linux will looks like. As per Microsoft the initial release will be available on mid of 2017.
If we look at the history of SQL Server, the very early released versions were on UNIX based operating system known as OS/2 which is jointly developed by IBM and Microsoft. As per wikipedia the kernel type of OS/2 is mentioned as hybrid. That was 1988/89 period.
The SQL Server versions 1.0, 1.1, and 4.2 were all based on OS/2 platform. Microsoft had separately developed the version 4.2 to support for their first version of Windows NT OS but with the same features as in the version that runs on OS/2.
Historically, the SQL Server has born in UNIX based platform and later it has been ported to support only on Windows after ending the agreement with Sybase.
After 25 years later, Microsoft has decided to make SQL Server available on Linux platform which is a good move.
source: Microsoft SQL Server - History
In my opinion, the main challenge would be to develop an abstraction layer to support Linux platform. The architecture of Windows and Linux are distinctly different. Both has preemptive scheduling but for the process model, Linux has very unique implementation of threads. SQL Server's process/threads model that runs in Windows is, single process multi-threading, meaning you can see only a single process for an SQL Server instance while internally it has thread pool to service for client requests. Consequently, thread is the unit of work at OS level to accomplish SQL Server requests in Windows.
Conversely, in Linux has multi processes and there is no concept called threads as in Windows OS. Linux calls fork() system call to create a child process from the parent process and then it is finished or killed. In a nutshell, in Linux, everything is done by in-terms of processes.
Bridging this architectural difference in SQL Server probably the most challenging part. I'm no expert in OS but if you think in software engineering point of view, this is the image I get about the abstraction layer.
SQL Server has it's own user mode operating system known as SQLOS which is non-preemptive scheduler to interact with the Windows OS to service better for special needs of SQL Server. The SQLOS was introduced in SQL Server 2005 code name Yukon.
To support Linux platform, the SQLOS component may need lot of additions or else there could be a separate component like SQLOS to support Linux platform. We do not know that information yet.
What features will be in SQL Server on Linux?
It is unclear what features will be available in SQL Server version that runs on Linux. To provide full features set as in SQL Server 2016 would be a real challenge. Another question is, whether it will include the BI tools? As per the announcement blog, it clearly states "We are bringing the core relational database capabilities to preview today" What are the core relational database capabilities? I presume the following capabilities should definitely need be addressed. They are; scheduling, memory management, I/O, exception handling, ACID support, etc. How about high availability options like clustering and AlwaysOn? I predict there will be some sort of high availability option in the first release on Linux. How about the Page size, is it 8K or different? By default Linux page size is 4K (4096 Bytes), however SQL Server in Windows supports non-configurable page size which is 8K(8192 Bytes). Consequently, SQL Server on Linux has to support is default page size of 4K. However unlike in Windows, Linux page size is configurable but it is not certain how feasible and the consequences of doing it. I just checked the page size in Mac OS and it is 4K too. See below figure.
Will there be In-Memory OLTP engine with SQL Server on Linux? In-Memory OLTP, code name Hekaton was first introduced in SQL Server 2014 and it further developed in SQL Server 2016. In-Memory OLTP is an enormously important feature in future RDBMS products. However it is unclear yet, the first release of SQL Server on Linux will have this feature or not.
What this means to DBAs?
Skill set required for future SQL Server DBAs are going to be expanded. They need to learn Linux OS commands as well as Python and Powershell for scripting tasks. In future companies will going to have a mix of Linux and Windows based SQL Server installations most probably. Licensing cost will decide which platform version of SQL Server is going to be the bigger portion. In a nutshell, future SQL Server DBAs will be cross-platform DBAs. It's challenging isn't it?
How about the Certifications?
Most likely, a new certification path will emerge to cater for the new trends/skills of SQL Server. To become a successful DBA, you might need to be qualified for SQL Server for both Linux and Windows platforms as well as cloud platforms.
How this will impact to other RDBMS products?
My sense is, this will impact directly to MySQL. MS SQL Server is a proven enterprise level data platform. If anyone has a chance to get that into Linux OS, then it is highly unlikely someone will choose MySQL besides SQL Server providing that SQL Server Developer edition is free.
I'm really excited and looking forward to get some hands on with SQL Server on Linux in very near future.
Thursday, August 18, 2016
- How many files (data / log) we need to create? Is it based on logical cores or physical CPUs?
- How to decide the initial size of the tempdb?
- Do we need to enable autogrowth or not? If we do, then, is it for all the files or just one file?
- The size of autogrowth?
chunks less than 64MB and up to 64MB = 4 VLFs
chunks larger than 64MB and up to 1GB = 8 VLFs
chunks larger than 1GB = 16 VLFs
- Do we need to keep all data files and log file in the same drive?
- Changes in SQL Server 2016
Friday, August 12, 2016
Monday, August 8, 2016
- Read Uncommitted (the lowest level, high concurrency, less blocking, more currency related issues)
- Read Committed (default)
- Repeatable Read
- Serializable (the highest level, less concurrency, higher blocking, no concurrency related issues)
- Read Committed Snapshot
- Snapshot isolation level
Wednesday, August 3, 2016
I was analyzing IO stats in one of our SQL Servers and noticed that IO latency (Read / Write) are very high. As a rule of thumb, we know tha...
I wanted build an expression to make a folder name dynamically. (Basically expressions are anyway dynamic in nature) So I simply used the ...
When querying sys.dm_exec_requests dmv, you can see the waiting requests and the resource those requests are waiting for. Sample of such val...
Sharding is a data partitioning technique which is popular in NoSQL databases like MongoDB and Cassandra. You can scale out the data set h...