Dec 13

2011

The Anatomy of Vardecimals

SQL Server - Internals , SQL Server - OrcaMDF

In this post I’ll do a deep dive into how vardecimals are stored on disk. For a general introduction to what they are and how/when to use them, see this post.

Dec 12

2011

Determining If Vardecimal Is Enabled For a Table Without Using OBJECTPROPERTY

SQL Server - Internals

Determining whether vardecimal is enabled for a given table is usually done by using the OBJECTPROPERTY function like so:

Dec 09

2011

Presenting at the 2012 Spring SQL Server Connections

Conferences and Presenting

Las Vegas feels like home to me, ever since going there the first time at age 7 and looking at all those ever so banned-for-me slot machines. I’ve since gone there more times than I can count, including my 21st birthday – and what a difference that age makes in a city like Las Vegas. I’ve been to the CES conference and DevConnections (2007), and I’ve played a hand of poker or two, eventually staying there for a month to participate in the World Series of Poker (finished #822 out of 8,773).

Dec 05

2011

SQLBits 9 Session Evaluation

Conferences and Presenting

I’d heard many rumors about the excellent SQLBits session evaluation results that speakers are sent. Knowing the SSRS geeks on the SQLBits team, I’d expect nothing short of data & graph filled reports – and I’m glad to say they didn’t disappoint!

Nov 28

2011

Automated Testing of OrcaMDF Against Multiple SQL Server Versions

.NET , SQL Server - OrcaMDF

Since I released OrcaMDF Studio, I’ve become aware of some base table differences between SQL Server 2008 and 2005. These differences cause OrcaMDF to fail since it’s coded against 2008 R2 and expects that format.

Nov 25

2011

OrcaMDF Studio Release + Feature Recap

SQL Server - OrcaMDF

Just about two and a half months have passed since I last posted an OrcaMDF feature recap. Since then I’ve been busy attending three of the top SQL Server conferences – SQLBits, PASS and SQL Rally. It’s been excellent chatting about OrcaMDF and getting some feedback on where to take it, thanks to all of you!

Nov 17

2011

Optimizing Single Instance Amazon S3 Delete Performance

.NET , Amazon Web Services

Once you’ve made a mess and you’ve now got millions of objects you need to delete, how do you do that as fast as possible?

Nov 10

2011

OrcaMDF Now Exposes Metadata Through System DMVs

SQL Server - OrcaMDF

I’m sitting here on the train in Denmark, on the final leg home from SQLRally Nordic. During my presentation based on my OrcaMDF work, I implicitly announced that OrcaMDF now exposes metadata – thought I might as well share it here too. Other than expanding the core engine support in OrcaMDF, one of the main features I’ve wanted to implement was a way for OrcaMDF to expose metadata about your database. How do you list the tables, indexes, columns, etc. from your database?

Nov 07

2011

Pushing the Limits of Amazon S3 Upload Performance

.NET , Amazon Web Services

Recently I’ve been working on a project where I’ve got millions of relatively small objects, sized between 5kb and 500kb, and they all have to be uploaded to S3. Naturally, doing a synchronous upload of each object, one by one, just doesn’t cut it. We need to upload the objects in parallel to achieve acceptable performance. But what are the optimal parameters when it comes to the number of simultaneous upload threads? Does it depend on the object size? How much of a difference does HTTPS over HTTP make? Let me share what I discovered during my testing.
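To make the idea of parallel uploads concrete, here is a minimal sketch in Python (the post itself is about .NET). It assumes a hypothetical `upload_object` function standing in for a real S3 PUT, and the `max_workers` value is an arbitrary placeholder, not a measured optimum.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_object(key, data):
    """Hypothetical stand-in for a real S3 PUT request.

    It only reports how many bytes it was handed; a real implementation
    would send the data over HTTP or HTTPS.
    """
    return len(data)


def upload_all(objects, max_workers=8):
    """Upload (key, data) pairs on a thread pool; returns {key: bytes sent}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {key: pool.submit(upload_object, key, data)
                   for key, data in objects}
        # result() blocks until the upload finishes and re-raises any error.
        return {key: future.result() for key, future in futures.items()}


# Twenty small dummy objects of 5,000 bytes each.
objects = [(f"object-{i}", b"x" * 5_000) for i in range(20)]
results = upload_all(objects, max_workers=4)
```

The number of worker threads is the knob the questions above are about; the sketch only shows where that knob sits.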

Oct 26

2011

How to Set Up and Serve Private Content Using S3 and Amazon CloudFront

.NET , Amazon Web Services

Imagine the scenario – you’ve got customers all over the world all requesting binary files from you. To speed up your delivery, you want to utilize a CDN. Furthermore, all of the files need to be protected on a specific user session level. Basically, you need to grant access to the specific file when a given user logs in – it’s not enough just to have a “hidden” URL or a URL with an infinitely sharable policy in the query string.

Oct 24

2011

OrcaMDF Now Supports Databases With Multiple Data Files

SQL Server - OrcaMDF

One of the latest features I’ve added to OrcaMDF is support for databases with multiple data files. This required relatively few parsing changes; it was mostly a matter of fixing bugs in code that wasn’t hit previously, due to only working with single file databases. It did, however, require some major refactoring to move away from MdfFile being the primary entrypoint to now using the Database class, which encapsulates a variable number of DataFiles.

Oct 20

2011

What Do Airlines Use for Primary Keys?

SQL Server

On my way home from the PASS Summit in Seattle, I had a layover in Amsterdam before continuing on to Copenhagen. For various reasons, we were about one and a half hours delayed, and I arrived in AMS at 9:30, my CPH flight departing at 9:35. As you’ve probably guessed, I missed my flight.

Oct 13

2011

Live Blogging From the 2011 PASS Summit Day 2 Keynote

Conferences and Presenting

It’s 8:15 AM and I’m back at the bloggers table, ready for the day 2 keynote. The format will be the same as yesterday.

Oct 12

2011

Live Blogging From the 2011 PASS Summit Day 1 Keynote

Conferences and Presenting

I’ve found my seat at the Bloggers Table, ready for the beginning of the keynote in just 12 minutes. I’ll be doing my best keeping up with all of the exciting news about to be spilled out.

Oct 09

2011

Live Blogging From the PASS Keynotes

Conferences and Presenting

I’m glad to announce that I’m one of the lucky few who’ve been appointed a seat at the coveted bloggers table during the PASS Summit keynotes. This means I’ll have an excellent view of the keynotes, and I’ll be live blogging and tweeting from both.

Oct 07

2011

Keeping Track of Time While Presenting

Conferences and Presenting

One of the most important aspects of presenting is to stay within the allotted time slot. You should aim to finish exactly on time, neither running over nor finishing too early.

Oct 05

2011

Reflections on SQLBits 9 and Getting the Most From Conferences

Conferences and Presenting

After getting back to Denmark from an excellent SQLBits 9, I’ve had a chance to reflect a bit on my experiences. Not just on SQLBits itself, but the very concept of attending conferences and how to get the most out of it.

Oct 03

2011

SQLBits Presentation Materials

Conferences and Presenting

I’ve got a blog post coming up tomorrow on general SQLBits reflections (‘twas awesome), for now I’ll just post my session materials as promised :)

Sep 10

2011

OrcaMDF Feature Recap

SQL Server - OrcaMDF

Time flies – it’s been about four months since I originally introduced my pet project, OrcaMDF. Since then, quite a lot has happened and OrcaMDF is somewhat more capable than when it started out. As a result I thought I’d provide a recap of what OrcaMDF is currently capable of, as well as what my plans are for the future.

Aug 23

2011

2011 Fall Presentation Schedule

Conferences and Presenting

My fall schedule is starting to fall into place, and boy, is it a busy one. I am absolutely thrilled that I’ll be going to the three major SQL Server conferences this fall: SQLBits, PASS Summit and PASS SQLRally Nordic. While I’m thrilled that I’ll be attending, I’m even more thrilled, humbled and excited over the fact that I’ll be presenting at all three.

Aug 04

2011

Presenting a Free Preview of my SQL Server Storage Engine and MDF File Internals Training Day

Conferences and Presenting

In collaboration with Trifork, I’ll be giving a free preview of my SQL Server Storage Engine and MDF File Internals training day at SQLBits.

Aug 03

2011

SQLBits 9 Agenda Published

Conferences and Presenting

The agenda for SQLBits 9 has been published (though it’s still provisional). It’s looking really, really good. Especially so when you consider the price of the event – you’ve got until the 26th of August to get the early bird price of £375 for two complete days of conference – PLUS a whole day of full day training sessions.

Aug 01

2011

How Not to Reinvent Indexes

SQL Server

In a moment of weakness I pledged to make an absolute fool of myself for this month’s Meme Monday. I wish I could say that this happened 20 years ago, when I was but a young grasshopper. To my disgrace, this happened fewer years ago than I’d like to admit.

Jul 18

2011

What is the Size of the LOB Pointer for (MAX) Types Like Varchar, Varbinary, Etc?

SQL Server - Internals

LOB types like varchar(MAX), nvarchar(MAX), varbinary(MAX) and xml suffer from split personality disorder. SQL Server may store values in-row or off-row depending on the size of the value, the available space in the record and the table settings. Because of this, it’s no easy task to predict the size of the pointer left in the record itself. You might even say… It depends.

Jul 16

2011

The Garbage Null Bitmap and Why You Can't Rely on It Solely

SQL Server - Internals

While adding some extra sparse column tests to the OrcaMDF test suite, I discovered a bug in my parsing of records. While the problem was simple enough, it took me a while to debug. Running the test, it worked about 40% of the time while failing the remaining 60% of the time. As I hadn’t picked up on this pattern I happily fixed (or so I thought) the bug, ran my test and verified that it was working. Shortly after the test failed – without me having changed any code. After having the first few strands of hair turn grey, I noticed the pattern and subsequently fixed the bug.

Jul 16

2011

The 8 Byte Record That Was 9 Bytes While Making No Sense

SQL Server - Internals

Warning: this is a case of “select is (most likely) not broken”; it’s just not working as I’d expect. It may very well be that I’m just overlooking something, in which case I hope someone will correct me :)

Jul 15

2011

The Null Bitmap is Not Always Present in Data Records

SQL Server - Internals

While implementing sparse column support for OrcaMDF, I ran into a special condition that caught me by surprise – a data record with no null bitmap. Even Paul Randal mentioned that the null bitmap would always be present in data records in his A SQL Server DBA myth a day: (6/30) three null bitmap myths post.

Jul 15

2011

Sparse Column Storage & the Sparse Vector

SQL Server - Internals

In this post I’ll be looking at the internal storage mechanism that supports sparse columns. For an introduction to what sparse columns are and when they ought to be used, take a look here.

Jul 15

2011

Identifying Complex Columns in Records

SQL Server - Internals

For an introduction to the anatomy of records, I suggest you read this post by Paul Randal.

Jul 13

2011

Creating a Type Aware Parser for the sys.system_internals_partition_columns.ti Field

SQL Server - Internals

Based on my findings exploring the sys.system_internals_partition_columns.ti field, I needed a parser that could extract the scale, precision, max_length as well as the max_inrow_length fields from it. The tricky part is that those values are stored differently for each individual type, on top of the fact that some types have hardcoded defaults that are not stored in the ti field, even though there’s space for them.

Jul 13

2011

Exploring the sys.system_internals_partition_columns.ti Field

SQL Server - Internals

Running sp_helptext on the sys.system_internals_partition_columns system view reveals the following internal query:

Jul 12

2011

SQLBits 9 Voting & Registration Open

Conferences and Presenting

The registration for SQLBits 9 is now open. Equally important, the voting for sessions has also opened. Simply log in and take a look at the list of sessions to vote for your preferred sessions.

Jul 11

2011

Converting Between Base 2, 10 and 16 in T-SQL

SQL Server - Tricks

There are many numeral systems, the most common ones in computer science being binary (base 2), decimal (base 10) and hexadecimal (base 16). All numbers can be expressed in any of these systems, and you may now and then need to convert between them.
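As a quick illustration of such conversions, here is a minimal sketch in Python rather than T-SQL. The helper name `to_base` is made up for this example, and it only handles non-negative integers and bases 2 through 16.

```python
DIGITS = "0123456789ABCDEF"


def to_base(number, base):
    """Render a non-negative integer as a string in the given base (2-16)."""
    if not 2 <= base <= 16:
        raise ValueError("base must be between 2 and 16")
    if number < 0:
        raise ValueError("only non-negative integers are supported")
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        # The remainder is the next least significant digit.
        number, remainder = divmod(number, base)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))


value = 2011
binary = to_base(value, 2)        # "11111011011"
hexadecimal = to_base(value, 16)  # "7DB"

# Python's built-in int() parses a string in a given base, so it
# serves as the conversion back to decimal.
round_trip = int(hexadecimal, 16)
```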

Jul 06

2011

Determining the Uniquifier Column Ordinal for Clustered and Nonclustered Indexes

SQL Server - Internals

Lately I’ve been working on nonclustered index parsing. One of my test cases proved to be somewhat more tricky than I’d anticipated, namely the parsing of nonclustered indexes for non-unique clustered tables. Working with non-unique clustered indexes, we’ll have to take care of uniquifiers when necessary.

Jun 30

2011

Presenting a Precon at SQLBits

Conferences and Presenting

I’m honored to announce that I’ll be presenting my SQL Server Storage Engine and MDF File Internals precon at SQLBits on September 29th. It’s an amazing bunch of precon sessions and presenters; given the lineup of speakers, I didn’t think I’d be speaking at SQLBits, much less presenting a precon.

Jun 21

2011

Copying a SQL Server Database File That's in Use Using Volume Shadow Copy

.NET

When working on OrcaMDF I usually set up a test database, force a checkpoint and then perform my tests on the MDF file. Problem is, you can’t open the MDF file for reading, nor copy it, as long as the database is online in SQL Server. I could shut down SQL Server temporarily while copying the file, but that quickly becomes quite a hassle.

Jun 16

2011

Getting Bit by Datetime Rounding or Why 23:59:59.999 < '23:59:59.999'

SQL Server - Internals

Earlier today I was doing some ad-hoc querying to retrieve some numbers for the month of May. Not giving it deeper thought, I made a simple query like this:

Jun 14

2011

Avoiding Regressions in OrcaMDF by System Testing

.NET , SQL Server - OrcaMDF , Testing

As I continue to add new features & support for new data structures in OrcaMDF, the risk of regressions increases. Especially so as I’m developing in a largely unknown field, given that I can’t plan for structures and relations that I do not yet know about. To reduce the risk of regressions, testing is an obvious need.

Jun 09

2011

Anatomy of a Forwarded Record & the Back Pointer

SQL Server - Internals

Earlier this week I provided some details on the forwarding stub that’s left behind when a heap record is forwarded. In this post I’ll look at the second part of a forwarded record – the actual record to which the forwarding stub points.

Jun 07

2011

Anatomy of a Forwarded Record & the Forwarding Stub

SQL Server - Internals

A forwarded record occurs whenever a record in a heap increases in size and it no longer fits on the page. Instead of causing a page split, as would happen had the table not been a heap, the record is moved onto another page with enough free space, or onto a newly allocated page. Forwarded records can wreak havoc on your performance due to fragmentation, but I’ll not cover that here as many other more skilled people have already done so.

Jun 01

2011

Using Fiddler to Automatically Download Streamed MP3s

.NET

Eric Lawrence’s Fiddler has many uses. I use it every day for debugging our client/server interaction, caching behavior, etc. What many don’t realize is that Fiddler is also an excellent platform for scripting, enabling you to modify requests and responses as they go out and come back. I made a quick script to automatically download streamed MP3 files as they were played, naming them automatically from the ID3 information contained in them.

May 31

2011

Saving Space by Storing Decimal Values in Integer Data Types

SQL Server - Internals

I recently stumbled upon a question on Stack Overflow on how best to reduce their data size as it’s growing out of hand. As the original author hasn’t replied back yet (as of writing this post), I’m making some assumptions on the scenario – so take it as an abstract scenario. The basic scenario is that they have a number of measuring stations, each one of those containing a lot of equipment reporting back to a SQL Server in a schema like the following abstract:

May 24

2011

Avoiding Page Splits By Splitting Pages

SQL Server - Optimization

Continuing my review of my old database designs, I stumbled upon yet another mind numbing design decision. Back then, I’d just recently learned about the whole page split problem and how you should always use sequentially valued clustered keys.

May 19

2011

Checking Which Database is Stored in a Detached MDF File

.NET , SQL Server - OrcaMDF

Inspired by this question on StackOverflow, I’ve made a quick script to demonstrate how this might be done using OrcaMDF.

May 19

2011

Bridging the Gap Between Smallint and Int

SQL Server - Optimization

Being a proponent of carefully choosing your data types, I’ve often longed for the mediumint data type that MySQL has. Both smallint and int are signed data types, meaning their ranges are –32,768 to 32,767 for smallint and –2,147,483,648 to 2,147,483,647 for int. For most relational db schemas, positive identity values are used, meaning we’re looking at a possible 32,767 vs 2,147,483,647 values for smallint vs int. That’s a humongous difference, and it comes at a storage cost as well – 2 vs 4 bytes per column. If only there was something in between…
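The ranges above follow directly from the byte widths. As a small sketch in Python (the helper name `signed_range` is made up for this example), the same formula also shows what a hypothetical 3-byte signed integer, the width of MySQL’s mediumint, would offer:

```python
def signed_range(num_bytes):
    """Return (min, max) for a two's complement integer of the given width."""
    bits = num_bytes * 8
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for name, size in (("smallint", 2), ("3-byte integer", 3), ("int", 4)):
    low, high = signed_range(size)
    print(f"{name:>14}: {low:,} to {high:,} ({size} bytes)")
```

A 3-byte value tops out at 8,388,607, which sits between the 32,767 of smallint and the 2,147,483,647 of int.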

May 19

2011

Reverse Engineering SQL Server Page Headers

.NET , SQL Server - Internals

One of the first challenges I faced when starting out the development of OrcaMDF was parsing page headers. We all know that pages are basically split in two parts, the 96 byte header and the 8096 byte body of remaining bytes. Much has been written about headers and Paul Randal (b|t) has a great post describing the contents of the header as well. However, though the contents have been described, I’ve been completely unable to find any kind of details on the storage format. What data types are the individual fields, and what’s the order? Oh well, we’ve always got DBCC PAGE.
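The one part of the layout that is stated here, the split into a 96 byte header and an 8096 byte body, can be sketched as follows. This is Python rather than .NET, the function name `split_page` is made up, and the field layout inside the header is deliberately left uninterpreted.

```python
PAGE_SIZE = 8192    # 96 byte header + 8096 byte body
HEADER_SIZE = 96


def split_page(page):
    """Split a raw page into its (header, body) byte ranges."""
    if len(page) != PAGE_SIZE:
        raise ValueError(f"expected {PAGE_SIZE} bytes, got {len(page)}")
    return page[:HEADER_SIZE], page[HEADER_SIZE:]


# A zero-filled dummy page stands in for one read from an MDF file.
page = bytes(PAGE_SIZE)
header, body = split_page(page)
```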

May 17

2011

Wasted Bytes Add Up, Consider Your Data Types Carefully

SQL Server - Optimization

I recently had a look at the statistics storage of a system I designed some time ago. As is usually the case, back when I made it, I neither expected nor planned for a large amount of data, and yet somehow that table currently has about 750m rows in it.

May 12

2011

Reading Bits in OrcaMDF

SQL Server - Internals , SQL Server - OrcaMDF

Bits are stored very differently from other fixed length data types in SQL Server. Usually all fixed length columns will be present, one after the other, in the fixed data part of a record. As the smallest unit of data we can write to disk is a byte, the naïve approach to storing bits would be to use a whole byte for each bit. It would be very simple to parse as it would follow the usual scheme, but it would also waste quite some space.
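To see how much space the naïve approach wastes, here is a minimal Python sketch of the obvious alternative: packing up to eight bit values into each byte. The function names and the bit ordering (first value in the least significant bit) are assumptions made for this example, not a description of the exact on-disk layout.

```python
def pack_bits(values):
    """Pack booleans into bytes, eight per byte, lowest bit first."""
    packed = bytearray((len(values) + 7) // 8)
    for index, value in enumerate(values):
        if value:
            packed[index // 8] |= 1 << (index % 8)
    return bytes(packed)


def unpack_bits(data, count):
    """Read back the first `count` booleans from packed bytes."""
    return [bool(data[index // 8] & (1 << (index % 8)))
            for index in range(count)]


# Nine bit values: the naive scheme would need nine bytes for these.
flags = [True, False, True, True, False, False, False, True, True]
packed = pack_bits(flags)        # two bytes
restored = unpack_bits(packed, len(flags))
```

With packing, nine bit values fit in two bytes instead of nine, at the cost of a parser that has to track which bit within which byte belongs to which column.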

May 10

2011

Parsing Dates in OrcaMDF

SQL Server - Internals , SQL Server - OrcaMDF

There are several different date related data types in SQL Server. Currently OrcaMDF supports the three most common types: date, datetime & smalldatetime.

May 05

2011

Implementing Data Types in OrcaMDF

SQL Server - Internals , SQL Server - OrcaMDF

Implementing parsing support for SQL Server data types in OrcaMDF is a simple matter of implementing the ISqlType interface:

May 03

2011

Introducing OrcaMDF

SQL Server - Internals , SQL Server - OrcaMDF

I’ve been spamming Twitter the last couple of days with progress on my pet project, OrcaMDF. But what is OrcaMDF really?

Miracle Open World 2011

I was invited to speak at MOW2011 for the SQL Server track. Last year I got good reviews for my presentation on Dissecting PDF Documents, a deep dive into the file format of PDF files. Wanting to stay in the same groove, I decided to take a look at the MDF format as it’s somewhat closer to SQL Server DBAs than PDF files. Having almost worn my SQL Server 2008 Internals book down from reading, I’ve always been interested in the internals, though I still felt like I was lacking a lot of knowledge.

A parser is born

For my demos at MOW I wanted to at least read the records from a data page, just like the output from DBCC Page. The basic page format is well documented, and it’s not the first time I’ve taken a deeper look at pages. Surprisingly quickly, I had record parsing from data pages functioning using a hardcoded schema. Parsing a single page is fun, but really, I’d like to get all the data from a table. Restricting myself to just consider clustered tables made it simpler as it’d just be a matter of following the linked list of pages from start to end. However, that meant I’d have to parse the header as well. There’s some good information out there on the anatomy of pages, but everything I could find had a distinct lack of information on the actual header structure and field types.
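The idea of following the linked list of pages from start to end can be sketched like this in Python. Everything here is hypothetical: pages are plain dictionaries keyed by a made-up page id, with a `next` entry standing in for the next-page pointer that a real parser would read from the page header.

```python
def read_chain(pages, first_page_id):
    """Collect records by walking next-page links until the chain ends."""
    records = []
    visited = set()
    page_id = first_page_id
    while page_id is not None:
        if page_id in visited:
            # Guard against a corrupt chain that loops back on itself.
            raise ValueError(f"page {page_id} visited twice")
        visited.add(page_id)
        page = pages[page_id]
        records.extend(page["records"])
        page_id = page["next"]
    return records


# Three hypothetical pages, deliberately not stored in chain order.
pages = {
    17: {"records": ["c", "d"], "next": 42},
    5: {"records": ["a", "b"], "next": 17},
    42: {"records": ["e"], "next": None},
}
all_records = read_chain(pages, first_page_id=5)
```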

May 03

2011

My PASS Summit 2011 Abstracts

Conferences and Presenting

Last year I went on a marathon conference trip, starting out in LA at Adobe Max, continuing on to QCon in San Francisco before finally ending up in lovely Seattle for the PASS Summit.

Apr 28

2011

Converting Page Pointers Into a Humanly Readable Format

SQL Server - Internals

I often like to spend my weekends perusing the sys.system_internals_allocation_units table, looking for the remnants of Frodo and his crew. In sys.system_internals_allocation_units there are several references to relevant pages:

Apr 26

2011

Watch Out For Implicit Uniqueidentifier Conversions

SQL Server - Optimization

I recently gave a presentation on the topic of GUID usage at Miracle Open World. After finishing off my last slide and opening to questions, one of the attendees told a story of how an implicit GUID conversion had resulted in index scans instead of index seeks.

Apr 17

2011

42 Weeks Later

Miscellaneous

Apr 17

2011

Miracle Open World 2011 Followup

Conferences and Presenting

Last week I attended the Miracle Open World 2011 conference. As in the previous years, the 80% content, 80% social moniker held true to its reputation. My legs and arms are still sore from carrying four-man rubber rings to the top of the four-story waterslide – not the typical aftermath of a conference. I met a lot of new awesome people and managed to pull off a couple of presentations with decent success as well, judging from the responses I’ve had so far. I’ve included the slides & demo code for the presentations below.