improve.dk
Just another mindless drone looking for the perfect stack
posts - 215, comments - 453

Seemingly Random Exceptions Thrown From CachedPathData.ValidatePath

Written on Tuesday, January 31, 2012 by Mark S. Rasmussen in Development: .NET
No comments | Write first comment

Several times a day I’d get an error report email noting that the following exception had occurred in our ASP.NET 4.0 application:

System.Web.HttpException (0x80004005)
  at System.Web.CachedPathData.ValidatePath(String physicalPath)
  at System.Web.HttpApplication.PipelineStepManager.ValidateHelper(HttpContext context)

The 80004005 code is a red herring – it’s used for lots of different errors and doesn’t really indicate what’s wrong. Besides that, there’s no message on the exception, so I was at a loss for what might’ve caused it.

We have several 100k’s of visitors each day and I only got 5-10 of these exceptions a day, so it wasn’t critical. Even so, I don’t like exceptions being thrown without reason. After much digging (and cursing at the combination of our error logging and gmail trimming the affected URL’s), I discovered the cause.

All of the URL’s had an extra %20 at the end – caused by others linking incorrectly to our site.

After a short bit of Googling, I found the new RelaxedUrlToFileSystemMapping httpRuntime attribute in .NET 4.0. And sure enough, setting it to false (or letting it have it’s default false value), an exception is thrown when I add %20 to the URL. Once set to true, everything works as expected.

Though I got the problem solved, I would’ve appreciated a more descriptive exception being thrown.

No comments | Write first comment
 

Presenting at SQLBits X

Written on Thursday, January 26, 2012 by Mark S. Rasmussen in SQL Server: SQLBits, Presenting
No comments | Write first comment

SQLBits X is coming up soon, and by the looks of it, it’ll feature a full house of excited attendees:

image

If you haven’t been before, just take a look at these ten reasons, provided by Simon Sabin. Not convinced yet? Take a look at 10 more reasons, provided by Jonathan Allen (though written for SQLBits 7, they’re just as applicable for SQLBits X).

Revealing the Magic

To my big surprise, I’ve got another chance at speaking at SQLBits – I must’ve done something right after all :)

I will once again be presenting a session based on my work with OrcaMDF. Here’s the abstract:

Think SQL Server is magical? You're right! However, there's some sense to the magic, and that's what I'll show you in this level 500 deep dive session. Through my work in creating OrcaMDF, an open source parser for SQL Server databases, I've learned a lot of internal details for the SQL Server database file format. In this session, I will walk you through the internal storage format of MDF files, how we might go about parsing a complete database ourselves, using nothing but a hex editor. I will cover how SQL Server stores its own internal metadata about objects, how it knows where to find your data on disk, and once it finds it, how to read it. Using the knowledge from this session, you'll find it much easier to predict performance characteristics of queries since you'll know what needs to be done.

The basis of the session is the same as the one I gave last year. However, based on the feedback I got, as well as the work I’ve done with OrcaMDF since then, I’ll be readjusting the content a bit. I’ll not be covering disaster recovery, just as I’ll trim some sections to leave room for some new stuff like compression internals.

Just as last time, the session is meant to inspire, not to teach. There’ll be far too much content to understand everything during the session. My hope is to reveal how SQL Server is really governed by a relatively small set of rules – and once you know those, you’ve got a powerful tool to add to your existing arsenal.

No comments | Write first comment
 

The Anatomy of Row & Page Compressed Integers

Written on Monday, January 30, 2012 by Mark S. Rasmussen in SQL Server: Internals, SQL Server: OrcaMDF, SQL Server: Compression
No comments | Write first comment

While working on row compression support for OrcaMDF, I ran into some challenges when trying to parse integers. Contrary to normal non-compressed integer storage, these are all variable width – meaning an integer with a value of 50 will only take up one byte, instead of the usual four. That wasn’t new though, seeing as vardecimals are also stored as variable width. What is different however is the way the numbers are stored on disk. Note that while I was only implementing row compression, the integer compression used in page compression is exactly the same, so this goes for both types of compression.

Tinyint

Tinyint is pretty much identical to the usual tinyint storage. The only exception being that a value of 0 will take up no bytes when row compressed, where as in non-compressed storage it’ll store the value 0x0, taking up a single byte. All of the integer types are the same regarding 0-values – the value is indicated by the compression row metadata and thus requires no actual value stored.

Smallint

Let’s start out by looking at a normal non-compressed smallint, for the values –2, –1, 1 and 2. As mentioned before, 0 isn’t interesting as nothing is stored. Note that all of these values are represented exactly as they’re stored on disk – in this case they’re stored in little endian.

-2	=	0xFEFF
-1	=	0xFFFF
1	=	0x0100
2	=	0x0200

Starting with the values 1 and 2, they’re very straightforward. Simply convert it into decimal and you’ve got the actual number. –1 however, is somewhat different. The value 0xFFFF converted to decimal is 65.535 – the maximum value we can store in an unsigned two-byte integer. The SQL Server range for a smallint is –32.768 to 32.767.

Calculating the actual values relies on what’s called integer overflows. Take a look at the following C# snippet:

unchecked
{
	Console.WriteLine(0 + (short)32767);
	Console.WriteLine(0 + (short)32768);
	Console.WriteLine(0 + (short)32769);
	// ...
	Console.WriteLine(0 + (short)65534);
	Console.WriteLine(0 + (short)65535);
}

The output is as follows:

32767
-32768
-32767
-2
-1

If we start with the value 0 and add the maximum value for a signed short, 32.767, we obviously end up with just that – 32.767. However, if we add 32.768, which is outside the range of the short, we rollover and end up with the smallest possible short value. Since these are constant numbers, the compiler won’t allow the overflow – unless we encapsulate our code in an unchecked {} section.

You may have heard of the fabled sign bit. Basically it’s the highest order bit that’s being used to designate whether a number is positive or negative. As special sounding as it is, it should be obvious from the above that the sign bit isn’t special in any way – though it can be queried to determine the sign of a given number. Take a look at what happens to the sign bit when we overflow:

32.767	=	0b0111111111111111
-32.768	=	0b1000000000000000
-32.767	=	0b1000000000000001

For the number to become large enough for it to cause an overflow, the high order “sign bit” needs to be set. It isn’t magical in any way, it’s simply used to cause the overflow.

OK, so that’s some background information on how normal non-compressed integers are stored. Now let’s have a look at how those same smallint values are stored in a row compressed table:

-2	=	0x7E
-1	=	0x7F
1	=	0x81
2	=	0x82

Let’s try and convert those directly to decimal, as we did before:

-2	=	0x7E	=	126
-1	=	0x7F	=	127
1	=	0x81	=	129
2	=	0x82	=	130

Obviously, these are not stored the same way. The immediate difference is that we’re now only using a single byte – due to the variable-width storage nature. When parsing these values, we should simply look at the number of byte stored. If it’s using a single byte, we know it’s in the 0 to 255 range (for tinyints) or –128 to 127 range for smallints. Smallints in that range will be stored using a single signed byte.

If we use the same methodology as  before, we obviously get the wrong results.1 <> 0 + 129. The trick in this case is to treat the stored values as unsigned integers, and then minimum value as the offset. That is, instead of using 0 as the offset, we’ll use the signed 1-byte minimum value of –128 as the offset:

-2	=	0x7E	=	-128 + 126
-1	=	0x7F	=	-128 + 127
1	=	0x81	=	-128 + 129
2	=	0x82	=	-128 + 130

Aha, so that must mean we’ll need to store two bytes as soon as we exceed the signed 1-byte range, right? Right!

image

One extremely important difference is that the non-compressed values will always be stored in little endian on disk, whereas the row compressed integers are stored using big endian! So not only do they use different offset values, they also use different endianness. The end result is the same, but the calculations involved are dramatically different.

Int & bigint

Once I figured out the endianness and number scheme of the row-compressed integer values, int and bigint were straightforward to implement. As with the other types, they’re still variable width so you may have a 5-byte bigint as well as a 1-byte int. Here’s the main parsing code for my SqlBigInt type implementation:

switch (value.Length)
{
	case 0:
		return 0;

	case 1:
		return (long)(-128 + value[0]);

	case 2:
		return (long)(-32768 + BitConverter.ToUInt16(new[] { value[1], value[0] }, 0));

	case 3:
		return (long)(-8388608 + BitConverter.ToUInt32(new byte[] { value[2], value[1], value[0], 0 }, 0));

	case 4:
		return (long)(-2147483648 + BitConverter.ToUInt32(new[] { value[3], value[2], value[1], value[0] }, 0));

	case 5:
		return (long)(-549755813888 + BitConverter.ToInt64(new byte[] { value[4], value[3], value[2], value[1], value[0], 0, 0, 0 }, 0));

	case 6:
		return (long)(-140737488355328 + BitConverter.ToInt64(new byte[] { value[5], value[4], value[3], value[2], value[1], value[0], 0, 0 }, 0));

	case 7:
		return (long)(-36028797018963968 + BitConverter.ToInt64(new byte[] { value[6], value[5], value[4], value[3], value[2], value[1], value[0], 0 }, 0));

	case 8:
		return (long)(-9223372036854775808 + BitConverter.ToInt64(new[] { value[7], value[6], value[5], value[4], value[3], value[2], value[1], value[0] }, 0));

	default:
		throw new ArgumentException("Invalid value length: " + value.Length);
}

The value variable is a byte array containing the bytes as stored on disk. If the length is 0, nothing is stored and hence we know it has a value of 0. For each of the remaining valid lengths, it’s simply a matter of using the smallest representable number as the offset and then adding the stored value onto it.

For non-compressed values we can use the BitConverter class directly as it expects the input value to be in system endianness – and for most Intel and AMD systems, that’ll be little endian (which means OrcaMDF won’t run on a big endian system!). However, as the compressed values are stored in big endian, I have to remap the input array into little endian format, as well as pad the 0-bytes so it matches up with the short, int and long sizes.

For the shorts and ints I’m reading unsigned values in, as that’s really what I’m interested in. This works since int + uint is coerced into a long value. I can’t do the same for the long’s since there’s no data type larger than a long. For the maximum long value of 9.223.372.036.854.775.807, what’s actually stored on disk is 0xFFFFFFFFFFFFFFFF. Parsing that as a signed long value using BitConverter results in the value –1 due to the overflow. Wrong as that may be, it all works out in the end due to an extra negative overflow:

-9.223.372.036.854.775.808 + 0xFFFFFFFFFFFFFF =>
-9.223.372.036.854.775.808 + -1 =
9.223.372.036.854.775.807

Conclusion

As usual, I’ve had a lot of fun trying to figure out how the bytes on disk ended up as the values I saw when performing a SELECT query. It doesn’t take long to realize that while the excellent Internals book really takes you far, there’s so much more to dive into.

No comments | Write first comment
 

Broadcom NICs Considered Harmful?

Written on Wednesday, January 25, 2012 by Mark S. Rasmussen in SQL Server: Misc
No comments | Write first comment

Every night at around 2AM I get an email from my best friend, confirming that she’s OK. It usually looks something like this:

JOB RUN:    'Backup.Daily' was run on 04-08-2011 at 02:00:00
DURATION:    0 hours, 5 minutes, 57 seconds
STATUS:     Succeeded
MESSAGES:    The job succeeded.  The Job was invoked by Schedule 9 (Backup.Daily (Mon-Sat)).  The last step to run was step 1 (Daily).

Just the other night, I got one that didn’t look right:

DURATION: 4 hours, 3 minutes, 32 seconds

Looking at the event log revealed a symptom of the problem:

SQL Server has encountered 2 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [J:\XXX.mdf] in database [XXX] (150).  The OS file handle is 0x0000000000003294.  The offset of the latest long I/O is: 0x00000033da0000

Our databases were the same, the workload was the same. The only teeny, tiny little thing that had changed was that I’d moved all of the data files + backup drive onto a new SAN. Obviously, that’s gotta be the problem.

Broadcom, how I loathe thee!

Through some help on #sqlhelp and Database Administrators, I managed to find the root cause as well as to fix it. For a full walkthrough of the process, please see my post on Database Administrators.

Mark Storey-Smith ended up giving the vital clue – a link to a post by Kyle Brandt which I clearly remember reading earlier on, but didn’t suspect was applicable to my situation. I ended up disabling jumbo frames, large send offload (LSO) and TCP connection offload (TOE), and lo and behold, everything was running smoothly. By enabling each of the features individually I pinpointed the issue to the Broadcom TOE feature on the NICs. Once I enabled TOE, my IO requests were stalling. As soon as I disabled TOE, everything ran smoothly.

image

After disabling TOE on both NICs, my backups went from looking like this:

image

To this:

image

At this point the backup timing was back on track and the event log was all green. I did use the same Broadcom NICs with TOE enabled for the previous SAN, so obviously something must have triggered the issue, whether it’s a problem between the new switches, the drivers, cables, I have no idea. All I know is that I’m apparently not the first to suffer similar issues with Broadcom NICs and I know for sure that I’ll get Intels in my next servers.

No comments | Write first comment
 

Presenting at Miracle Open World 2012

Written on Monday, January 23, 2012 by Mark S. Rasmussen in SQL Server: Miracle Open World, Presenting
No comments | Write first comment

Having presented at Miracle Open World back to back in 2010 and 2011, I’m excited to announce I’ll also be present in 2012! Not only will I be presenting, I’ll be presenting a full day track.

I’ve been continually improving my original internals training day that I presented at SQLBits. At MOW, I’ll be presenting the evolved session, which now also includes compression internals. As the allotted time at MOW is slightly shorter than at SQLBits, and I have slightly more material, expect to bring coffee :)

For more details on the session, please see the description at the MOW website.

No comments | Write first comment