Tuesday, 15 February 2011

When Is Data Written To Disc?

When you run a query in SQL, the first thing that happens, is SQL checks to see if the data pages that it needs to modify are currently in the Buffer Cache or not. It does this by reading an in-memory hash table, that records the dbid-fileno-pageno of each page that is in the cache at that time. You can see for yourself what pages are currently in your cache, by sys.dm_os_buffer_descriptors. If the pages are not in the cache, then it reads them in from disc.

Once all the required pages are in cache, SQL will begin an Implicit Transaction (if an Explicit Transaction is not already open for the statement). It will then perform the required operations on the page(s). If this was a DML operation, it will also mark them as being dirty. (This just means that they have changed in memory, without being written to disc). It also writes the transaction records to the log cache.

When the transaction (either implicit or explicit) commits, all transaction records, up to and including the commit, will be written to the log. There is a caveat here however; It is not just the log records from this specific transaction that are written. In order to make the transaction durable, SQL will also have to flush other log records that are intermingled with this transaction. It is also worthy of note, that SQL never writes individual log records, it always writes in blocks of between 512 bytes and 60 kb. This is called the "Log Block" and it's size is managed dynamically, internally.

This process is called write-ahead logging, or WAL, and ensures database consistency, but can cause a bottleneck, because SQL caps the amount of transactions that can be "in-flight". (This means that the Log Manager has issued a write, but an acknowledgement that the write has completed). The cap is different depending on your version of SQL, but in 2008, it is 3840KB, regardless of whether you are using a 32-bit or 64-bit version.

If this is causing a bottleneck, then you will notice lots of WRITELOG waits, when you query sys.dm_os_wait_stats. If this is the case, then you should try to optimize your transaction log. Details can be found in this post.

Once the log records have been committed to disc, then SQL is free to flush the "Dirty Pages" to disc. It does not do this straight away however. These dirty data pages are written to disc when the LazyWritter kicks in, or when a Checkpoint occurs.

A checkpoint occurs in a number of circumstances. Firstly, and most obviously, if you manually issue a checkpoint, when a backup starts, when the server is shut down (gracefully), or when an ALTER DATABASE statement is run. Also, if you are in Simple Recovery Model, a checkpoint will occur when your log is 70% full, or if you are in Full Recovery Model, a Checkpoint will run periodically, depending on your setting for recoveryinterval, which allows you to specify a target recovery time.

The LazyWriter process is a SQLOS thread, and it's role in life is to keep the Buffer Pool clean, with enough space to add new pages. It does this by periodically scanning the buffer cache and keeping track of how often a page is referenced. If it decides that it is no longer useful to keep a page in the cache, (using an algorithm that tracks how often a page is used and even what type of page it is) it checks to see if the page is "Dirty", if it is it writes it to the data file. It then marks the page as being "free" in a linked list. When a worker thread needs a page, it will choose one from this list.

Find my book, Pro SQL Server Administration on Amazon -

America

United Kingdom

No comments:

Post a Comment