How the Active Directory – Data Store Really Works (Inside NTDS.dit) – Part 1

Like me, you have probably asked yourself many times: what is inside NTDS.dit? (Most experienced Active Directory admins know that NTDS.dit is the database, the physical on-disk store that Active Directory uses to store information. Most of you have probably come into contact with NTDS.dit during backup and restore scenarios.)

Long story short: I wasn’t satisfied not knowing, and I still wasn’t after reading the following article (which I actually think isn’t that bad, and is probably the most detailed publicly available information on the subject):
[1] http://technet.microsoft.com/en-us/library/cc772829(WS.10).aspx

So, together with a very good friend of mine, Stanimir Stoyanov (Microsoft Visual C# MVP), I decided to build a tool that could read NTDS.dit and decode its internals. That started a journey that has given us invaluable knowledge about this part of Active Directory. This is the first in a series of articles that will describe what’s really inside NTDS.dit and how Active Directory works at the database layer.

The illustration below has been presented in various documentation since Active Directory was initially released over 10 years ago; a similar illustration is also available in [1]. (However, this research project has shown it to be inaccurate in some respects, namely in the way the DRA/REPL communicates with the DBLayer.)

Table 1: DSA Components (Simplified for the DBLayer)

Component | Description
Ntdsa.dll – Directory System Agent | The DSA, which runs as Ntdsa.dll on each domain controller, provides the interfaces through which directory clients and other directory servers gain access to the directory database (the DBLayer). In addition, the DSA enforces directory semantics, maintains the schema, guarantees object identity, and enforces data types on attributes.
Esent.dll – Extensible Storage Engine (ESE) APIs | The Extensible Storage Engine (ESE) is an advanced indexed and sequential access method (ISAM) storage technology. ESE enables applications to store and retrieve data from tables using indexed or sequential cursor navigation. It supports denormalized schemas including wide tables with numerous sparse columns, multi-valued columns, and sparse and rich indexes. It enables applications to enjoy a consistent data state using transacted data update and retrieval. ESE was formerly known as Joint Engine Technology (JET) Blue. The DBLayer uses the ESE APIs documented here: http://msdn.microsoft.com/en-us/library/windows/desktop/gg269259(v=exchg.10).aspx
NTDS.dit | The physical on-disk file that represents the ESE/JET Blue database that holds the information store for the given DSA/Active Directory domain controller.

Data Store Physical Structure / Inside NTDS.dit – Tables

Finally, we can start looking into the internal structure of NTDS.dit. But first, let’s take a look at what has been revealed before. The illustration below is from [1] and is accurate except for the white box that represents the tables within the database: the tables shown do exist (except for the “sd_table” on Windows 2000 DSAs, marked with *), but there are more tables that aren’t mentioned in that example.

So it’s about time to reveal the real table structure of an NTDS.dit database file, using the tool we produced to discover it:

Table 2: NTDS.DIT – Tables

Table | Description | Minimum DSA Version
Datatable | Contains all objects and phantoms [2:1], represented as rows (1 object/phantom = 1 row in the table), from any instantiated naming context (NC) held as either writable or read-only by the Directory System Agent (DSA) hosting the database. The columns represent every [2:3] attribute present in the schema except linked attributes [2:2]. | Windows 2000 Server. Note: Windows Server 2008 R2 added a column to support the “is-Recycled” state.
Hiddentable | Contains one row but several columns that define the state of the database, as well as the DNT [2:4] (reference) of the NTDSA-Settings object that represents this DSA (used for finding config information specific to this domain controller). | Windows 2000 Server. Note: Windows Server 2003 introduced additional state columns such as backupexpiration_col.
Link_table | Contains link-pair references (DNT, DNT), the link base (link id >> 1) and possibly a binary blob (in the case of the DN-binary and DN-string syntaxes). | Windows 2000 Server. Note: Windows Server 2008 R2 added a column to support deactivated links for the recycle bin.
Sd_table | Contains single-instance-stored SDs (Security Descriptors). Before Windows Server 2003 these were stored in the ntSecurityDescriptor attribute in the “datatable”; objects now reference a row in the “sd_table” instead. If more than one object has exactly the same security defined (Security Descriptor), those objects reference the same row in the “sd_table”, hence the single-instance storage, which reduces the space needed to store Security Descriptors. | Windows Server 2003
Sdpropcounttable | Used by the Security Descriptor Propagation Daemon (SDProp), which is responsible for Security Descriptor inheritance down the tree within the local database. |
Sdproptable | Used by the Security Descriptor Propagation Daemon (SDProp), which is responsible for Security Descriptor inheritance down the tree within the local database. | Windows 2000 Server
Quota_rebuild_progress_table | Contains temporary information during a quota tracking rebuild, for the Active Directory quota feature introduced in Windows Server 2003. This allows the daemon to keep track of processed objects. | Windows Server 2003
Quota_table | Contains quota tracking information for the Active Directory quota feature introduced in Windows Server 2003. Quota tracking is per naming context (NC) and per security principal, identified by its SID. | Windows Server 2003
MSysObjects | ESE internals, out of scope for this article | N/A
MSysObjectsShadow | ESE internals, out of scope for this article | N/A
MSysUnicodeFixupVer2 | ESE internals, out of scope for this article | N/A

[2:1] Phantoms are references to objects hosted outside the given database (NTDS.DIT) and the given Directory System Agent (DSA).

[2:2] From Windows Server 2003 onward, the attribute “ntSecurityDescriptor” is stored in the “sd_table” rather than in the “datatable”.

[2:3] Some columns don’t reflect attributes; they are pre-defined in the NTDS.dit template database generated by Microsoft (those are needed for state internal to the DSA).

[2:4] DNT stands for Distinguished Name Tag.
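
The “link base (link id >> 1)” stored in the link_table can be illustrated with a minimal Python sketch. It assumes the usual schema convention that a forward link has an even linkID and its back link has the next odd value; the function names are made up for this illustration and are not part of any Microsoft API.

```python
def link_base(link_id: int) -> int:
    """Drop the lowest bit, so a forward link and its back link share one base."""
    return link_id >> 1


def is_forward_link(link_id: int) -> bool:
    """Assumption: forward links carry even linkIDs, back links odd ones."""
    return link_id % 2 == 0


# member (linkID 2) and memberOf (linkID 3) form one link pair.
for name, link_id in (("member", 2), ("memberOf", 3)):
    kind = "forward" if is_forward_link(link_id) else "back"
    print(name, "base:", link_base(link_id), "direction:", kind)
```

Because both halves of a pair collapse to the same base, one row in the link_table is enough to answer both the forward and the back-link question for a pair of DNTs.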

In the next article – we will take a deep-dive into the content and the structure of the “datatable” also known as the object-store.

How install from media (IFM) really works (Part 2)

This is the second and last blog post in a series covering how the “Install from media” feature really works. (If someone really cares about the differences for ADAM/AD LDS I can point those out too, just send me an e-mail.) It’s an in-depth, very technical series that explains what happens under the hood, and this second post explains the changes to the feature that were introduced with Windows Server 2008. (Most of the changes were made to support RODCs, as you will notice if you continue to read.)

Note: This article covers the Windows Server 2003 Install From Media (IFM) functionality plus the changes made in Windows Server 2008 and later. It doesn’t go through the functionality that has been left unchanged since Windows Server 2003 again, so I recommend reading part 1 first: How install from media (IFM) really works (Part 1)

Background

Install from media was first introduced in Windows Server 2003 as a way to improve the installation experience for newly promoted domain controllers, mainly in branch offices (or sites with slow links where the initial replication could take significant time to complete). It is also an important component in many disaster recovery plans I have designed for various customers over the years, as it is a fast and efficient way to re-install a domain controller and get it back in sync (that’s the proper way to handle a failing replica/domain controller in most cases). Most of the changes in Windows Server 2008 and later make install from media (IFM) support the new type of DC, the Read-Only Domain Controller; this variant is sometimes referred to as rifm. There have also been some improvements in the ability to produce IFM media without taking a regular backup.

What does Install from media (IFM) consist of

Install from media (IFM) contains three important things.

  • NTDS.DIT (Active Directory database) as it was at the time the IFM was generated. Regardless of Windows Server 2003, Windows Server 2008 or later, the NTDS.dit is pretty much unchanged until DCPROMO makes a lot of changes on the domain controller being promoted that makes use of the database: it will change the DSA reference and update related “instance specific” information in the hidden table. The exception is rifm, that is, Read-Only Domain Controller install from media.
  • SYSVOL (SYSVOL GPT Storage)
  • Registry (contains the SYSKEY used to decrypt the PEK, the Password Encryption Key, which ensures that the protection of sensitive information stored in the Active Directory database, such as password hashes, is unique to each instance of the database, read: each domain controller). Note: This doesn’t apply to RODCs.

Sourcing install from media (IFM) using System State and VSS

Sourcing the media used by IFM differs between Windows Server 2003 (all versions) and Windows Server 2008 and later; the difference is the technology used to gather the required information. Windows Server 2008 and later use VSS, with the VSS writer for NTDS (Active Directory Domain Services) and the Registry VSS writer, to source the information required to construct an IFM. Note: the Registry doesn’t apply to RODCs.

Table 1: VSS Writers used by install from media

Name

Description

Guid

Registry Writer The registry writer is responsible for the Windows registry. Beginning with Windows Vista and Windows Server 2008, the registry writer now performs in-place backups and restores of the registry. On versions of Windows prior to Windows Vista, the registry writer used an intermediate repository file (sometimes called a “spit file”) to store registry data.

In Windows Vista and later, the registry writer does not report user hives.


The writer ID for the registry writer is AFBAB4A2-367D-4D15-A586-71DBB18F8485.
NTDS Writer Beginning with Windows Server 2003, this writer reports the NTDS database file (ntds.dit) and the associated log files. These files are required to restore the Active Directory correctly.

There is only one ntds.dit file per domain controller, and it is reported in the writer metadata as in the following example:

<DATABASE_FILES path="C:\Windows\NTDS"
                filespec="ntds.dit"
                filespecBackupType="3855"/>

Here is an example that shows how to list components in the writer’s metadata:

<BACKUP_LOCATIONS>
  <DATABASE logicalPath="C:_Windows_NTDS"
            componentName="ntds"
            caption="" restoreMetadata="no"
            notifyOnBackupComplete="no"
            selectable="no"
            selectableForRestore="no"
            componentFlags="3">
    <DATABASE_FILES path="C:\Windows\NTDS"
                    filespec="ntds.dit"
                    filespecBackupType="3855"/>
    <DATABASE_LOGFILES path="C:\Windows\NTDS"
                       filespec="edb*.log"
                       filespecBackupType="3855"/>
    <DATABASE_LOGFILES path="C:\Windows\NTDS"
                       filespec="edb.chk"
                       filespecBackupType="3855"/>
  </DATABASE>
</BACKUP_LOCATIONS>

At backup time, the writer sets the backup expiration time in the writer’s backup metadata. Requesters should retrieve this metadata by using IVssComponent::GetBackupMetadata to determine whether the database has expired. Expired databases cannot be restored.

If the computer that contains the NTDS database is a domain controller, the backup application should always perform a system state backup across all volumes containing critical system state information. At restore time, the application should first restart the computer in Directory Services Restore Mode and then perform a system state restore.


The writer ID for this writer is B2014C9E-8711-4C5C-A5A9-3CF384484757.
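
To make the NTDS writer metadata a bit more tangible, here is a small Python sketch that parses the example component listing from the table above and prints the files the writer reports. The XML is the sample from the documentation, written with straight quotes and with backslashes in the paths (an assumption about how the original sample looked); the function name is invented for this illustration, and a real requester would obtain the metadata through the VSS APIs rather than from a string.

```python
import xml.etree.ElementTree as ET

# Sample writer metadata, copied from the example in the table above.
WRITER_METADATA = r"""
<BACKUP_LOCATIONS>
  <DATABASE logicalPath="C:_Windows_NTDS" componentName="ntds" caption=""
            restoreMetadata="no" notifyOnBackupComplete="no" selectable="no"
            selectableForRestore="no" componentFlags="3">
    <DATABASE_FILES path="C:\Windows\NTDS" filespec="ntds.dit" filespecBackupType="3855"/>
    <DATABASE_LOGFILES path="C:\Windows\NTDS" filespec="edb*.log" filespecBackupType="3855"/>
    <DATABASE_LOGFILES path="C:\Windows\NTDS" filespec="edb.chk" filespecBackupType="3855"/>
  </DATABASE>
</BACKUP_LOCATIONS>
"""


def list_reported_files(xml_text):
    """Return (element name, path, filespec) for every file entry of each DATABASE component."""
    root = ET.fromstring(xml_text)
    files = []
    for database in root.iter("DATABASE"):
        for entry in database:
            files.append((entry.tag, entry.get("path"), entry.get("filespec")))
    return files


for tag, path, filespec in list_reported_files(WRITER_METADATA):
    print(f"{tag}: {path}\\{filespec}")
```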

Sourcing install from media (IFM) in Windows Server 2008 and later with NTDSUTIL

Windows Server 2008 introduces a new context in the NTDSUTIL command-line tool that gives us a built-in way to produce install from media, instead of having to perform and restore a backup as in Windows Server 2003.

The new context is named “IFM” and is designed to produce install from media (IFM) for the following cases.

Table 2: NTDSUTIL IFM options

Name: Full IFM
Notes: If the SYSVOL tree is located on the same volume as the database, its log files and/or the registry, it will still be included in the snapshot but not copied into the IFM media.
Notes: The NTDS VSS writer is invoked, and as a result the ‘state_col’ in the ‘hiddentable’ of the ntds.dit database is changed to 4, which means the database has the status “backed up”. This flag is only set if the NTDS VSS writer is used, not for legacy backups.
Source: This command can only be performed on a full/writable DC.
Destination: This IFM media can only be used to promote full DCs (technically, unless instanceType is changed recursively on all NCs).

Name: Full IFM with SYSVOL
Notes: Same as above, plus: the full SYSVOL tree will be copied to the IFM media, except DfsrPrivate folders.
Source: This command can only be performed on a full/writable DC.
Destination: This IFM media can only be used to promote full DCs (technically, unless instanceType is changed recursively on all NCs).

Name: RODC IFM
Notes: The NTDS VSS writer is passed a “special” secrets flag in this case, which deletes all columns in the database that contain a secret attribute, a hidden attribute or an attribute that has been marked as “filtered attribute set” in the schema.
Notes: The PEK (Password Encryption Key), which is stored in “pekList”, is not marked as secret (it can’t be, as it’s the master key used to protect secret attributes/columns) but is instead marked as a ‘hidden attribute’. This means that at this stage we have also cleared out the master key for decrypting any other secrets in the DB; those secrets have just been cleared as well, so this makes it doubly safe.
Notes: NTDSUTIL will remove link values for linked attributes that are marked as “filtered attribute set” in the schema (this is not done by the NTDS VSS writer), and if the command is performed on a full/writable DC, all objects including NC heads will recursively have their instanceType changed, clearing the 0x4 write flag.
Notes: In Windows Server 2008 R2 a check is performed against the domain functional level (DFL), and the command fails if the DFL is below Windows Server 2003 with “Can’t produce RODC IFM media for down-level instances”. In my opinion this is a mistake in the product, as RODCs require Windows Server 2003 forest functional level (FFL), not DFL. Furthermore, there are no issues producing RODC IFM media while the DFL is, for example, Windows 2000 Native, then raising the FFL to Windows Server 2003 and introducing an RODC using media created before the FFL was Windows Server 2003. Experienced administrators can actually bypass this check, but I won’t include those steps here.
Source: This command can be performed on full/writable DCs as well as Read-Only DCs.
Destination: This IFM media can only be used to promote RODCs (technically, unless instanceType is changed recursively on all NCs).

Name: RODC IFM with SYSVOL
Notes: Same as above, plus: the full SYSVOL tree will be copied to the IFM media, except DfsrPrivate folders.
Source: This command can be performed on full/writable DCs as well as Read-Only DCs.
Destination: This IFM media can only be used to promote RODCs (technically, unless instanceType is changed recursively on all NCs).
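
As a rough illustration of how the four options map onto the NTDSUTIL “IFM” context, the following Python sketch builds the argument list for a non-interactive ntdsutil call. The `create full`, `create sysvol full`, `create rodc` and `create sysvol rodc` subcommands are the ones the IFM context offers; the helper name, the option keys and the target folder are made up for this example, and the sketch only builds the command without running it.

```python
# Maps (media kind, include SYSVOL?) to the subcommand of the ntdsutil "ifm" context.
IFM_CREATE_COMMANDS = {
    ("full", False): "create full",
    ("full", True): "create sysvol full",
    ("rodc", False): "create rodc",
    ("rodc", True): "create sysvol rodc",
}


def build_ifm_command(kind, with_sysvol, target_dir):
    """Build the argument list for a scripted ntdsutil IFM run.

    kind is "full" or "rodc"; target_dir is a hypothetical output folder.
    """
    create = IFM_CREATE_COMMANDS[(kind, with_sysvol)]
    return [
        "ntdsutil",
        "activate instance ntds",
        "ifm",
        f"{create} {target_dir}",
        "quit",
        "quit",
    ]


print(build_ifm_command("rodc", True, r"C:\IFM"))
```

On a domain controller the resulting list could be handed to `subprocess.run`, which passes each multi-word element to ntdsutil as a single quoted argument.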

Once the snapshot has been taken by the VSS writers, the snapshotted volumes are mounted. There will be one mount point entry for each drive that contains one of the following:

  • NTDS.dit – Active Directory database
  • Log files – Active Directory database log files (technically never included in or copied to the IFM media, but needed by the NTDS writer itself)
  • Registry – The SYSTEM and SECURITY hives (those are not copied for Read-Only Domain Controller IFM, or rifm)
  • SYSVOL – SYSVOL is only included if requested (Full IFM with SYSVOL or RODC IFM with SYSVOL)

That means that if all four of the above are located on the C: drive, only one mount point, for the C: drive, will be mounted. Once one or more mount points have been created, the data listed above is copied over to the following structure:

Table 3: IFM on disk structure

Folder Name | Content
Active Directory | NTDS.dit – Active Directory database
Registry | SYSTEM and SECURITY – registry hives (except for RODCs)
SYSVOL | The full SYSVOL tree – only if requested in NTDSUTIL
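
The on-disk structure in Table 3 is easy to sanity-check before handing a folder to DCPROMO. The sketch below is only an illustration: the folder and file names are taken from the table, while the function name and its parameters are invented for this example.

```python
from pathlib import Path


def missing_ifm_items(media_root, rodc=False, with_sysvol=False):
    """Return the expected IFM items (relative paths) that are absent under media_root."""
    root = Path(media_root)
    expected = [root / "Active Directory" / "ntds.dit"]
    if not rodc:
        # Registry hives are only part of media meant for full DCs.
        expected += [root / "Registry" / "SYSTEM", root / "Registry" / "SECURITY"]
    if with_sysvol:
        expected.append(root / "SYSVOL")
    return [p.relative_to(root).as_posix() for p in expected if not p.exists()]


if __name__ == "__main__":
    # Hypothetical media folder; prints whatever is missing for full DC media.
    print(missing_ifm_items(r"C:\IFM"))
```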


Conversions taking place in DCPROMO for Read-Only Domain Controllers.

If a Read-Only Domain Controller (RODC) is being promoted using RODC media generated from a full DC, the following conversions have to take place.

  1. The values of the attribute msDS-hasMasterNCs are moved into msDS-hasFullReplicaNCs.
  2. The binary portion of msDsHasInstantiatedNCs is changed from indicating that writable NCs are instantiated to indicating that non-writable NCs are instantiated.
  3. Update msds-NCType:
    1. The Schema NC is updated to contain: NCT_SPECIAL_SECRET_PROCESSING
    2. The Domain NC, the Configuration NC and any hosted NDNCs are updated to contain: NCT_SPECIAL_SECRET_PROCESSING | NCT_FILTERED_ATTRIBUTE_SET
    3. Any partial NCs (if the DC the IFM was sourced from was a full DC and GC) are updated to contain: NCT_FILTERED_ATTRIBUTE_SET

Note: Those conversions are only necessary when an RODC is being promoted from IFM media that was converted from a full DC.

Preventing an invalid database from being used by IFM

Several checks take place during a DCPROMO IFM to determine whether the database used during IFM is valid according to a few rules.

  1. Preventing a Read-Only Domain Controller (RODC) promotion using non-converted IFM media from a full DC: This is prevented by looking at the instanceType on the domain NC head. If it contains the WRITE flag and a promotion of an RODC is in progress, the promotion fails with: The Install-From-Media promotion of a Read-Only DC cannot start because the specified source database is not allowed. Only databases from other RODCs can be used for IFM promotion of a RODC. (8200)
  2. Preventing a full/writable Domain Controller (DC) promotion using RODC IFM media: This is prevented by looking at the instanceType on the domain NC head. If it doesn’t contain the WRITE flag and a promotion of a full/writable DC is in progress, the promotion fails with an error similar to the one above.
  3. If the schema version in the IFM media database used to promote the DC is different from the one in the local machine’s schema.ini, the builds of the source operating system and the operating system the IFM media is being used on are considered a mismatch, and the promotion will fail.
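
The first two checks boil down to testing a single bit. Here is a minimal Python sketch of that logic, assuming the 0x4 write flag mentioned earlier and the usual instanceType values for a domain NC head (5 on a writable DC, 1 on an RODC); the function name and return strings are made up, and DCPROMO’s real implementation is not public.

```python
IT_WRITE = 0x4  # instanceType write flag, as referenced in the text


def check_ifm_database(domain_nc_instance_type, promoting_rodc):
    """Mimic checks 1 and 2: the media type must match the promotion type."""
    writable = bool(domain_nc_instance_type & IT_WRITE)
    if promoting_rodc and writable:
        return "rejected: RODC promotion requires media without the WRITE flag"
    if not promoting_rodc and not writable:
        return "rejected: full DC promotion requires media with the WRITE flag"
    return "accepted"


# Full DC media (instanceType 5) offered to an RODC promotion is refused.
print(check_ifm_database(5, promoting_rodc=True))
# RODC media (instanceType 1) offered to an RODC promotion passes.
print(check_ifm_database(1, promoting_rodc=True))
```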

Preventing secrets from sneaking into a Read-Only Domain Controller (RODC) being promoted using IFM

The NTDSUTIL tool makes sure that IFM media produced for Read-Only Domain Controllers (RODCs) is completely stripped of secret attributes, hidden attributes and attributes that carry the “Filtered Attribute Set” flag. In fact, the columns representing those attributes in the ‘datatable’ of NTDS.dit are completely removed, and any linked attributes that carry the “Filtered Attribute Set” flag have their rows deleted from the ‘link_table’.

Note: Those are all physically deleted and don’t end up in the Deleted Objects container in either the IsDeleted or IsRecycled state.

Even so, DCPROMO performs an additional check and ensures that no secrets are present in the NTDS.dit while an RODC is being promoted from IFM: if there are columns in the database representing any secrets (as mentioned above), those will be deleted.

As a final safeguard in case any secrets are still left in the DB, the USN at promotion time is recorded. As in every database, the last USN allocated to the database before it was IFM’ed is stored in the ‘usn_col’ column of the ‘hiddentable’.

(Once the DB is fully initialized and accepts updates again, new USNs are allocated and this is reflected in the ‘usn_col’.) However, before the Directory System Agent (DSA) accepts updates again, the value in ‘usn_col’ is copied to a new column named ‘usnatrifm’. This column preserves the USN from before the promotion of the DC using the IFM media, and stays the same as it was when the IFM media was produced for the entire lifetime of the database (until the DC is demoted).

This allows replication of secrets to compare ‘usnatrifm’ with the metadata of the attribute containing the secret being replicated in. If the USN in that metadata is higher than the value in the ‘usnatrifm’ column, the current secret in the database is considered not cached (that is, not valid) and will be replicated in, overwriting the old secret that made it in with the IFM.

The reason ‘usnatrifm’ has to remain for the entire lifetime of the DC is that secret caching happens on demand: a secret for the user account ‘ChristofferA’ may have made it in with the IFM, and ‘ChristofferA’ may move to the branch where the RODC is placed and authenticate 5 years after it was promoted.