improve.dk
Just another mindless drone looking for the perfect stack
posts - 227, comments - 489

The 8 byte record that was 9 bytes while making no sense

Written on July 16, 2011 by Mark S. Rasmussen in SQL Server: Internals

Warning: this is a select is (most likely) not broken, it’s just not working as I’d expect. It may very well be that I’m just overlooking something, in which case I hope someone will correct me :)

I’ve previously blogged about how sparse-column-only table records didn’t have a null bitmap, nor did they store the usual column count, except for the number of variable length columns. In my effort to test OrcaMDF, I added the following SQL code as the setup for a test:

CREATE TABLE ScanAllNullSparse
(
	A int SPARSE,
	B int SPARSE
)
INSERT INTO ScanAllNullSparse DEFAULT VALUES

Dumping out the resulting record yields the following:

image

And this is where things start to get weird. The status bits (red) are all off, meaning there’s neither a null bitmap nor variable length columns in this record. The next two (blue) bytes indicate the end offset of the fixed length portion – right after those two very bytes, since we don’t have any fixed length data.

At this point I’m not too sure what to expect next – after all, in the previous blog post I showed how the column count wasn’t stored in all-sparse tables. Also, the status bits indicate that there’s no null bitmap. But what is the green 0x0100 (decimal value 1) bytes then? The only value I can see them possible indicating is the number of variable length columns. But why would that be present when the status bits indicate there are no such columns? Oh well, if that’s the case, then the next two (pink) bytes must be the offset array entry for the variable length column – having a value of 8 indicates that the variable length column has no value.

But wait, if the variable length column doesn’t have a value, then what is that last (orange/brownish) 0x00 byte doing at the very end? That’s beyond the offset marked in the (assumedly) variable length offset array… And if the pink bytes really is the variable length offset array – should it not indicate a complex column for the sparse vector? (though it would make sense for it not to do so, if it weren’t stored in the record).

I can still parse this by just taking some precautions, but I still don’t understand what’s going on. Any suggestions?

It’s not just DBCC PAGE

To clear DBCC PAGE of any suspicion I amended my original test by inserting two extra rows with DEFAULT VALUES. The resulting offset table looks like this:

image

As can be seen, the storage engine allocates 9 bytes for all three rows (though we can only verify the first two). Thus it’s not just DBCC PAGE that reads the records as being 9 bytes, so does the storage engine. This just strengthens the case that SQL Server knows best, now if only I could figure out why :)

Feedback

Gravatar

Paul White wrote on 7/16/2011 5:41 AM

Hi,

The green and pink data above is the metadata for an extra variable-length column at the end of the normal data record.

This extra system column is the sparse vector (which might also be followed by versioning information). In your example, there is no sparse data so the sparse vector contains zero elements.

The detailed structure of the sparse vector is contained in SQL Server 2008 Internals (Kalen Delaney, Microsoft Press) pp406-408.

Hope that helps,

Paul
Gravatar

Mark S. Rasmussen wrote on 7/16/2011 1:21 PM

Hi Paul,

I'm afraid it's not that simple. I've parsed the sparse vector in a previous post - improve.dk/...

When the sparse vector is present, the record will have the VARIABLE_COLUMNS bit set in the status bits. Even though the column is not present in sys.columns/sysrscols, it still acts like a normal varlength column.

That being said, the record may have a column entry for the sparse vector, but it can be empty, if none of the sparse columns have ever had a non-null value.

That last case almost makes sense - the green 0100 identifies the one variable length sparse vector column. The pink 0800 defines the valid zero-length value offset as right after the offset value itself. But that still leaves out the trailing 0x00 byte at the end - what is that doing?
Gravatar

Ewald Cress wrote on 7/17/2011 1:01 AM

This probably isn't a very helpful comment, but worth considering as an option...

What if your analysis is 100% correct up to the question about the trailing 0x00, and the explanation for the 0x00 is simply that the DBCC PAGE code incorrectly calculates the row length as 9 instead of 8? It is after all just a pretty printer and not core storage engine code, so edge case bugs can escape being noticed or fixed.
Gravatar

Mark S. Rasmussen wrote on 7/17/2011 4:48 AM

Well, I can't rule the possibility that DBCC PAGE may not handle the scenario correctly. That being said, through my career I've usually been wrong whenever I've called out select being wrong, so I'm rather adamant towards doing that in this case.

I don't have an explanation, as of yet, for the structure but I'm still thinking it's an edge case I'm just not understanding correctly yet - after all, I've had quite a few of those thus far :)
Gravatar

Ewald Cress wrote on 7/17/2011 9:32 PM

Fair enough - it's always the worst kind of cop-out to point fingers elsewhere. I'm just slightly concerned for your mental health if you find a lot more edge case implementations as you dig deeper!

Seriously though, I applaud what you're doing, and I love these posts. Very much the spirit of Ken Henderson, sans source code access.
Gravatar

SwePeso wrote on 7/19/2011 6:24 AM

Does it have something to do with COLUMN_SET?

Add this to your table and try again

", cs xml column_set FOR ALL_SPARSE_COLUMNS"

Gravatar

Mark S. Rasmussen wrote on 7/19/2011 2:33 PM

That's interesting, AFAIK the COLUMN_SET isn't stored physically in any way in the record, it's pure metadata. However, adding a third column defined as the ALL_SPARSE_COLUMNS COLUMN_SET does change the output to the following (still 9 bytes in length):

00000400 01000000 00

I can't see this being the same layout as before - in that case we've now got a zero-offset variable length column, which isn't valid AFAIK, still with an extra trailing byte.

If I insert a row with any value like so:
INSERT INTO ScanAllNullSparse (A) VALUES (5)

Then, with or without the column set, the record looks perfectly fine & normal, weighing in at 20 bytes, all bytes accounted for:

20000400 01001480 05000100 01000c00 05000000
Gravatar

Jeremy Carroll wrote on 4/8/2012 6:51 PM

9B limit is for the FORWARDING_STUB, which is 9B. Even a clustered index has this rule b/c it can be converted into a heap.
Gravatar

Mark S. Rasmussen wrote on 4/9/2012 4:39 PM

Jeremy,

Very interesting! I can see how this makes sense for a heap as we'd want to make sure there's always room for the FORWARDING_STUB to avoid having to move unrelated records to make space. And I assume the rule is there for clustered indexes so we won't have to change all of the (potentially < 9 bytes) record structures themselves during the heap conversion.

So in essence a bit of space is wasted to make conversions between indexes and heaps faster, as well as to ensure heap simplicity and speed. And seeing as <9 byte records are probably pretty much non existent in most systems, it seems like a good trade off.

Post Comment

Name  
Email
Url
Comment
Please add 8 and 2 and type the answer here: