We’re used to using transactions when dealing with the database layer. Transactions ensure we can perform multiple queries as one atomic event: either they all succeed or they all fail, obeying the rules of ACIDity. Until Vista, performing transactional file operations hasn’t been possible.
Transactional NTFS (or TxF) is available from Vista and onwards, which means Server 2008 is also capable. XP and Server 2003 do not support TxF, and there are currently no plans to add TxF support to systems prior to Vista.
So what is the benefit of using TxF? The benefit is that we can now perform ACID operations at the file level, meaning we can perform several file operations (whether that be moves, deletions, creations etc.) and make sure all of them are committed atomically. It also provides isolation from/for other processes, so once a transaction has been started, we will always see a consistent view of a file until we have committed the transaction, even if the file has been modified elsewhere in the meantime. Surendra Verma has a great video on Channel 9 explaining TxF. Jon Cargille and Christian Allred have another video on Channel 9 that goes even more in-depth on the inner workings of TxF and the Vista KTM.
Why hasn’t TxF gotten more momentum? Most likely because it’s not part of the BCL! To utilize TxF we have to call Win32 API functions, which is a big step away from lazily utilizing database transactions by just wrapping our code inside of a TransactionScope.
Using TxF is actually quite simple once we’ve made a couple of necessary managed wrapper classes. Let me introduce KtmTransactionHandle:
The KtmTransactionHandle represents a specific transaction going on inside of the KTM. In the code, there are references for further reading on the specific functions, most of them stemming from MSDN. Note that the CreateTransactionHandle method assumes there’s already a current transaction; if there is not, an exception will be thrown.
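A minimal sketch of what such a handle class could look like, assuming .NET Framework on Windows Vista or later. The IKernelTransaction COM interface and its GUID come from the Windows SDK; the class layout, the overload taking a Transaction, and the exception message are assumptions made for illustration, not necessarily the original listing.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Transactions;
using Microsoft.Win32.SafeHandles;

// COM interface implemented by a DTC transaction object that is backed by the KTM.
[ComImport]
[Guid("79427A2B-F895-40e0-BE79-B57DC82ED231")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
internal interface IKernelTransaction
{
    void GetHandle([Out] out IntPtr handle);
}

// Owns a native KTM transaction handle and closes it when disposed.
public sealed class KtmTransactionHandle : SafeHandleZeroOrMinusOneIsInvalid
{
    private KtmTransactionHandle(IntPtr handle) : base(true)
    {
        SetHandle(handle);
    }

    // Assumes an ambient transaction exists; throws if there is none.
    public static KtmTransactionHandle CreateTransactionHandle()
    {
        if (Transaction.Current == null)
            throw new InvalidOperationException(
                "Cannot create a KTM handle without an ambient transaction.");

        return CreateTransactionHandle(Transaction.Current);
    }

    // Hypothetical overload for callers that already hold a Transaction.
    public static KtmTransactionHandle CreateTransactionHandle(Transaction transaction)
    {
        // The DTC view of the transaction can be queried for IKernelTransaction,
        // which hands out the underlying KTM handle.
        var kernelTx = (IKernelTransaction)TransactionInterop.GetDtcTransaction(transaction);
        IntPtr ktmHandle;
        kernelTx.GetHandle(out ktmHandle);
        return new KtmTransactionHandle(ktmHandle);
    }

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool CloseHandle(IntPtr handle);

    protected override bool ReleaseHandle()
    {
        return CloseHandle(handle);
    }
}
```

Deriving from SafeHandleZeroOrMinusOneIsInvalid means the handle can be wrapped in a using block and is released even if a later call throws.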
The second class we need is called TransactedFile. I basically made this to be used as a direct replacement of System.IO.File. It does not include all of the functionality of System.IO.File, but it does have the two most important methods, Open and Delete. Most of the other functions are just wrappers of these two, so they can easily be added later on.
The primary two API functions used are DeleteFileTransactedW and CreateFileTransactedW. Note that these functions are the ‘W’ versions, accepting Unicode paths for the files. To send the strings as null-terminated Unicode, we have to add the MarshalAs(UnmanagedType.LPWStr) attribute to the ‘path’ parameter.
The BCL FileMode, FileShare and FileAccess all have native counterparts. The constant values are in the Microsoft.Win32.NativeMethods class, but unfortunately it’s internal so we’ll have to make our own. There are three helper functions for translating between the managed and native versions of FileMode, FileShare and FileAccess.
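The three translation helpers could be sketched as follows. The constant values are the documented Win32 ones; the class name NativeFileFlags and the ToNative overloads are hypothetical names chosen for this example.

```csharp
using System;
using System.IO;

internal static class NativeFileFlags
{
    // dwDesiredAccess
    public const uint GENERIC_READ = 0x80000000;
    public const uint GENERIC_WRITE = 0x40000000;

    // dwShareMode
    public const uint FILE_SHARE_READ = 0x1;
    public const uint FILE_SHARE_WRITE = 0x2;
    public const uint FILE_SHARE_DELETE = 0x4;

    // dwCreationDisposition
    public const uint CREATE_NEW = 1;
    public const uint CREATE_ALWAYS = 2;
    public const uint OPEN_EXISTING = 3;
    public const uint OPEN_ALWAYS = 4;
    public const uint TRUNCATE_EXISTING = 5;

    public static uint ToNative(FileMode mode)
    {
        switch (mode)
        {
            case FileMode.CreateNew: return CREATE_NEW;
            case FileMode.Create: return CREATE_ALWAYS;
            case FileMode.Open: return OPEN_EXISTING;
            case FileMode.OpenOrCreate: return OPEN_ALWAYS;
            case FileMode.Truncate: return TRUNCATE_EXISTING;
            // Append has no native disposition of its own: open or create the
            // file here and leave seeking to the end to the caller.
            case FileMode.Append: return OPEN_ALWAYS;
            default: throw new ArgumentOutOfRangeException("mode");
        }
    }

    public static uint ToNative(FileAccess access)
    {
        uint result = 0;
        if ((access & FileAccess.Read) != 0) result |= GENERIC_READ;
        if ((access & FileAccess.Write) != 0) result |= GENERIC_WRITE;
        return result;
    }

    public static uint ToNative(FileShare share)
    {
        // FileShare.Inheritable has no dwShareMode counterpart and is ignored.
        uint result = 0;
        if ((share & FileShare.Read) != 0) result |= FILE_SHARE_READ;
        if ((share & FileShare.Write) != 0) result |= FILE_SHARE_WRITE;
        if ((share & FileShare.Delete) != 0) result |= FILE_SHARE_DELETE;
        return result;
    }
}
```

For instance, FileAccess.ReadWrite becomes GENERIC_READ | GENERIC_WRITE, and FileShare.None becomes 0.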
The Open and Delete methods both try to obtain a KTM transaction handle as their first action. If a current transaction does not exist, they will throw an exception since KtmTransactionHandle assumes one exists. We could modify these to perform either a transacted or a non-transacted operation, depending on the availability of a current transaction, but in this case we’re explicitly assuming a transaction will be available.
Next, the Delete operation will attempt to delete the file using the DeleteFileTransactedW function, passing in the KTM transaction handle. The Open function first tries to obtain a SafeFileHandle for the file, which is basically a wrapper class around a normal file handle. Using the SafeFileHandle, we can create a new FileStream, passing in the file handle as a parameter.
Using these two classes, we can now perform transactional file operations:
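Here is a sketch of how this could fit together, assuming .NET Framework on Windows Vista or later with the MS DTC service available. So that the example compiles on its own, it carries a compact stand-in for TransactedFile with the handle acquisition inlined in a private helper; the helper GetKtmHandle, the Demo class and its ReplaceFile method are hypothetical names, and FileMode.Append is deliberately left out.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;
using System.Transactions;
using Microsoft.Win32.SafeHandles;

[ComImport]
[Guid("79427A2B-F895-40e0-BE79-B57DC82ED231")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
internal interface IKernelTransaction
{
    void GetHandle([Out] out IntPtr handle);
}

// Compact stand-in for TransactedFile (Open and Delete only).
public static class TransactedFile
{
    [DllImport("kernel32.dll", SetLastError = true, ExactSpelling = true)]
    private static extern SafeFileHandle CreateFileTransactedW(
        [MarshalAs(UnmanagedType.LPWStr)] string path,
        uint desiredAccess, uint shareMode, IntPtr securityAttributes,
        uint creationDisposition, uint flagsAndAttributes, IntPtr templateFile,
        IntPtr transaction, IntPtr miniVersion, IntPtr extendedParameter);

    [DllImport("kernel32.dll", SetLastError = true, ExactSpelling = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool DeleteFileTransactedW(
        [MarshalAs(UnmanagedType.LPWStr)] string path, IntPtr transaction);

    [DllImport("kernel32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool CloseHandle(IntPtr handle);

    // Inlined stand-in for KtmTransactionHandle: requires an ambient transaction.
    private static IntPtr GetKtmHandle()
    {
        if (Transaction.Current == null)
            throw new InvalidOperationException("An ambient transaction is required.");
        var kernelTx = (IKernelTransaction)TransactionInterop.GetDtcTransaction(Transaction.Current);
        IntPtr handle;
        kernelTx.GetHandle(out handle);
        return handle;
    }

    public static void Delete(string path)
    {
        IntPtr tx = GetKtmHandle();
        try
        {
            if (!DeleteFileTransactedW(path, tx))
                throw new Win32Exception(Marshal.GetLastWin32Error());
        }
        finally { CloseHandle(tx); }
    }

    public static FileStream Open(string path, FileMode mode, FileAccess access, FileShare share)
    {
        if (mode == FileMode.Append)
            throw new NotSupportedException("Append is left out of this sketch.");
        uint nativeAccess = 0;
        if ((access & FileAccess.Read) != 0) nativeAccess |= 0x80000000;  // GENERIC_READ
        if ((access & FileAccess.Write) != 0) nativeAccess |= 0x40000000; // GENERIC_WRITE
        // FileMode values 1..5 and FileShare Read/Write/Delete have the same
        // numeric values as their native counterparts, so a cast is enough here.
        uint nativeShare = (uint)share & 0x7;
        IntPtr tx = GetKtmHandle();
        try
        {
            SafeFileHandle file = CreateFileTransactedW(path, nativeAccess, nativeShare,
                IntPtr.Zero, (uint)mode, 0x80 /* FILE_ATTRIBUTE_NORMAL */, IntPtr.Zero,
                tx, IntPtr.Zero, IntPtr.Zero);
            if (file.IsInvalid)
                throw new Win32Exception(Marshal.GetLastWin32Error());
            return new FileStream(file, access);
        }
        finally { CloseHandle(tx); }
    }
}

public static class Demo
{
    // Writes newPath and deletes oldPath; both steps commit together or not at all.
    public static void ReplaceFile(string oldPath, string newPath, string contents)
    {
        using (var scope = new TransactionScope())
        {
            using (FileStream fs = TransactedFile.Open(newPath, FileMode.Create, FileAccess.Write, FileShare.None))
            using (var writer = new StreamWriter(fs))
            {
                writer.Write(contents);
            }
            TransactedFile.Delete(oldPath);

            // Without Complete(), disposing the scope rolls both operations back.
            scope.Complete();
        }
    }
}
```

If anything inside the scope throws before Complete() is called, neither the new file nor the deletion becomes visible to other processes.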
Note that the KTM transaction is able to participate in a distributed transaction using the MS DTC service. That means we can perform both database and file operations inside of a transaction scope and have all of them performed ACIDically.
Using transactions comes at a cost - performance. Since the system has to guarantee the ACID properties are respected, there will be administrative overhead as well as the possibility of extra disk activity. Whenever we modify an existing file, the original file is left untouched until the new file has been written to disk. Otherwise we might risk destroying the original file if the computer were to crash halfway through a write procedure.
There are of course limitations in TxF, as there are in all good things. Most notably, it’ll only work for local volumes; you can’t use TxF on file shares as it’s not supported by the CIFS/SMB protocols.
Mark S. Rasmussen