Active Directory by Design
Your education on AD begins with the components that deliver its services. This month: the store.
Now that Windows 2000 is on the shelves and many machines
are coming pre-loaded, we’ll see a strong, continuous
push by Microsoft to induce businesses to move over to
the latest and greatest. Most prudent companies will take
small steps and continue to use the existing Windows
NT 4.x services. Essentially, the only thing that will
change is that the workstations and servers will be running
Win2K code. However, many of us will continue to use
the older, still-supported NetBIOS-based services, at least until our
support staffs can get their arms around the new services
and architecture to determine if and how they want to
migrate to a complete Win2K environment.
While that issue works its way through your organization’s
pipeline, your best strategy to be ready for the big move
is to learn as much as you can before you’re called on
to take the slings and arrows.
Squarely at the center of the multitude of services available
in Win2K is Active Directory, which acts as the repository
for information. Recently, I’ve discussed directory services
in general; now I want to talk about AD specifically and
in detail. Obviously, AD is a complex and broad subject,
so this discussion will take place over the next several
months.
I want to start with an overview of the components and
services that are combined to deliver AD services in the
context of the terms that have been standardized in X.500.
AD isn’t an X.500 directory, but it does follow the general
X.500 modular architecture. It’s useful to use X.500 terms
when discussing the various directory service components
as a reference point when you later compare specific competing
products to each other. This month I give an overview
of AD Directory System Agent (DSA) components or—as it’s
referred to in AD—the store.
Acronyms
AD--Active Directory
DC--domain controller
DIB--Directory Information Base
DIT--Directory Information Tree
DSA--Directory System Agent
ESE--Extensible Storage Engine
FSMO--flexible single master operations
GC--global catalog
MMC--Microsoft Management Console
RPC--remote procedure call
The AD Database
The underlying foundation of any directory, or any application
for that matter, is the database, which is referred to
in X.500 as the Directory Information Base (DIB). AD refers
to the DIB as the store; it’s based upon the Exchange
JET storage engine. Microsoft is quick to point out that
it’s not the same JET engine that has been used in past
releases of Exchange (or Access, for that matter), but
a new version called the Extensible Storage Engine (ESE),
which shares its design with the JET engine, but not its code.
The main characteristics of the database are that it’s
transaction-based, and it employs log and checkpoint files
to verify the transactions. Microsoft has claimed success
in tests with up to 1.5 million objects in the store.
The largest store I’ve heard about in a real organization
is just under 500,000 objects.
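The log-and-checkpoint behavior described above can be illustrated with a short sketch. The Python below is purely conceptual (the class and all names are my own invention, not ESE’s design or code): a transaction is written to a log before it’s applied, a checkpoint truncates the log, and recovery replays logged transactions over the last safe snapshot.

```python
# Conceptual sketch of transaction logging with checkpoints.
# This is NOT the ESE implementation; all names are hypothetical.

class SimpleStore:
    def __init__(self):
        self.data = {}   # committed state (the "database file")
        self.log = []    # transaction records not yet checkpointed

    def commit(self, txn):
        """Write the transaction to the log first, then apply it."""
        self.log.append(dict(txn))   # the log record survives a crash
        self.data.update(txn)        # then update the store itself

    def checkpoint(self):
        """A checkpoint means logged transactions are safely on disk,
        so the log can be truncated."""
        self.log.clear()

    def recover(self, data_snapshot, log_records):
        """After a crash, replay logged transactions over the last
        checkpointed snapshot to rebuild current state."""
        self.data = dict(data_snapshot)
        for record in log_records:
            self.data.update(record)

store = SimpleStore()
store.commit({"cn=JaneDoe": {"objectClass": "user"}})
store.commit({"cn=Sales": {"objectClass": "group"}})
# Simulate recovery: replay the log over an empty snapshot.
recovered = SimpleStore()
recovered.recover({}, store.log)
assert recovered.data == store.data
```

The key design point the sketch captures is ordering: the log record is durable before the database change, so a crash mid-write can always be repaired by replaying the log.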
As you can imagine, backing up these files is critical
to the successful restoration of an AD. This process should
certainly be in the design documents. The backup process
becomes more complex as you add new domains. Remember,
each domain is only a partition of the directory store,
and at least one DC from each domain must be backed up.
The process is similar to the method for making sure that
a distributed Exchange messaging system is properly backed
up. I’ll cover this topic in more detail in a future column.
Unlike the SAM database, which was part of the registry,
the AD store is located in the NTDS.DIT file on an NTFS
partition on the domain controllers. AD’s distributed
store can be replicated on many different servers within
the same domain and also partitioned into many different
sections through different domains. Essentially, a domain
is a partition of the AD store. AD doesn’t support the
creation of DIB partitions smaller than the domain boundary.
Between the AD DSA and the DIB is the database layer,
which provides an access point into the actual ESE database.
In common Microsoft fashion, the database layer is an
abstraction layer between the data store and the applications
that are interested in the information in the store. This
layer can be accessed by the DSA or through other APIs
such as MAPI. It forms the Directory Information Tree
(DIT) structure for the DSA from the flat information
held in the directory store.
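To see what “forming a DIT from flat information” means, consider this conceptual sketch (hypothetical names and record layout; not AD code) that builds a hierarchical tree from flat records keyed by distinguished name:

```python
# Illustrative sketch (not AD code): building a hierarchical
# Directory Information Tree (DIT) view from flat records, the way
# a database layer abstracts a flat store for the DSA.

def build_dit(flat_records):
    """flat_records: dict mapping a distinguished name to attributes.
    Returns a nested dict keyed by each DN component."""
    tree = {}
    for dn, attrs in flat_records.items():
        # "cn=Jane,ou=Sales,dc=example" -> walk down from the root
        parts = list(reversed(dn.split(",")))
        node = tree
        for part in parts:
            node = node.setdefault(part, {})
        node["_attrs"] = attrs
    return tree

flat = {
    "dc=example": {"objectClass": "domain"},
    "ou=Sales,dc=example": {"objectClass": "organizationalUnit"},
    "cn=Jane,ou=Sales,dc=example": {"objectClass": "user"},
}
dit = build_dit(flat)
# Jane's entry now hangs off the Sales OU under the domain root.
assert dit["dc=example"]["ou=Sales"]["cn=Jane"]["_attrs"]["objectClass"] == "user"
```

The store itself stays flat; only the view handed to consumers is tree-shaped, which is exactly the abstraction the database layer provides.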
The DIB is the actual repository for the data of which
the directory is composed. However, data isn’t useful
until it can be organized and accessed by interested agents,
such as applications and users. The real core of this
responsibility and of the directory itself is the Directory
System Agent. The DSA in AD terminology is a Domain Controller
(DC); it’s contained in the NTDSA.DLL that runs on DCs.
This DLL is installed and uninstalled with dcpromo,
the Active Directory Installation Wizard, which finally
lets us promote and demote DCs without having to reinstall
the entire operating system (one of the great new features
of Win2K). The DSA is responsible for providing updates
to other DCs, and these updates are organized by type.
The DSA is further broken down into several functions,
which are referred to as roles within the domain controller.
Operation: FSMO
Although all DSAs working together are designed to provide
multi-master replication of attributes and objects within
the directory, some of the information in the DIB can’t
afford to have replication conflicts. Therefore, some
of the DSA functions are still single master-based. These
are called flexible single master operations (FSMO), and
Microsoft has defined five specific operations that rely
upon them:
- Schema Master—The schema
is the actual structure of the DIB or store. This determines
where the attributes will be located in the database.
The schema for a given directory service must be consistent
throughout the entire forest. There can be only one
FSMO schema master within the entire forest of the directory.
- Relative ID Master—Each
domain uses this master to assign unique security IDs
to user, group, or computer objects. This FSMO is also
used when an object is moved from one domain to another,
to ensure that the RID remains unique across the forest.
- PDC Emulator Master—This
FSMO provides compatibility with NT 4.x authentication
requests in a mixed-mode environment. This operation
is what allows AD to look just like a PDC to down-level
clients. It’s also used in a pure native-mode environment
to resolve password inconsistencies. As with its NT
4.x PDC counterpart, there can be only one PDC
Emulator Master in each domain.
- Domain Naming Master—This
FSMO is responsible for managing the addition and deletion
of the domain names and controlling the uniqueness of
those names. If the administrator’s client machine can’t
connect to the DC assigned to this FSMO role, you won’t
be able to add or remove a domain. There’s only one
Domain Naming Master FSMO within a forest.
- Infrastructure Master—One
FSMO in each domain is responsible for maintaining group-to-user
references between domains as they’re updated in the
natural course of administration. Microsoft advises that
if you have multiple DCs in a domain, you shouldn’t place a
Global Catalog on the Infrastructure Master, or the group-to-user
resolution process won’t work properly.
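The value of keeping some operations single-master is easiest to see with the RID Master. Here’s a conceptual sketch (hypothetical code, not the actual AD algorithm) of one authority handing out non-overlapping RID pools, so DCs can create objects with unique IDs without ever conflicting:

```python
# Conceptual single-master RID allocation (hypothetical, not AD code).
# One RID Master hands out non-overlapping pools, so each DC can mint
# IDs locally without consulting the others.

class RidMaster:
    def __init__(self, pool_size=500):
        self.next_rid = 1000   # arbitrary illustrative starting point
        self.pool_size = pool_size

    def allocate_pool(self):
        """Give a DC an exclusive, contiguous range of RIDs."""
        start = self.next_rid
        self.next_rid += self.pool_size
        return range(start, start + self.pool_size)

master = RidMaster()
dc1_pool = master.allocate_pool()
dc2_pool = master.allocate_pool()
# The pools never overlap, so IDs built from them stay unique.
assert set(dc1_pool).isdisjoint(dc2_pool)
```

If two masters could allocate pools independently, two DCs could hand out the same ID; that’s exactly the replication conflict the single-master design rules out.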
I know you’ve been hearing a lot about the multi-master
replication model for AD. However, as you can see, several
components still use the master/slave database model.
As you can imagine, this has a fundamental impact on where
you place these machines and on the type of machines you
use to support these services. You can’t count on multiple
instances of the FSMOs to create redundancies in the system.
These machines are best placed in locations that are physically
close (this side of routers) to the administrators who
are interested in them, and they also need to be closely
monitored for availability.
All FSMOs are determined automatically during the installation
process. However, as the network grows and the location
of these servers becomes less than optimal, you can transfer
these FSMO roles to other machines. Just make sure you’re
moving them to machines you’re confident will remain operational.
Treat them as you’d treat a PDC in NT 4.x.
These singular roles are normally transferred through
AD MMC snap-ins, but the roles can be seized from downed
machines if necessary. Again, this is similar to an NT
4.x BDC being promoted when a PDC fails. You seize these
roles with ntdsutil, a command-line utility included with
Windows 2000 Server. That utility is also used for other
database management tasks such as moving, compacting,
repairing, and checking the integrity of the AD data store.
As with the BDC-to-PDC analogy, use caution when taking
this step. If the original FSMO comes back online, unpredictable
results can occur, and they won’t be pretty. If you have
an FSMO machine that goes down and you want to seize the
role for another machine, make sure you demote the original
DC component before bringing it back online. If you still
want it to be the FSMO for that role, then you need to
run dcpromo again to make it a DC and transfer the role
using the appropriate snap-in. This should remind you
of the reintroduction of a repaired PDC into a system
that has had a BDC promoted while the original PDC was
under repair.
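For reference, seizing a role is an interactive ntdsutil session along these lines. This is an illustrative sketch (the server name is a placeholder, and I’m showing only the PDC Emulator seizure); review the tool’s built-in help before trying this against production DCs:

```
C:\> ntdsutil
ntdsutil: roles
fsmo maintenance: connections
server connections: connect to server <some-healthy-dc>
server connections: quit
fsmo maintenance: seize PDC
fsmo maintenance: quit
ntdsutil: quit
```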
DC Roles
There are also generic roles that DCs play in AD that
are critical to the communication and synchronization
of the data store. Every DC automatically participates
in intra-site replication through remote procedure calls
(RPCs). One DC within each site (a site is a group of
domain controllers within one or more subnets connected
by fast links) is configured as a bridgehead server.
These servers act as a replication focal point to other
islands of self-contained fast connectivity. Replication
is a topic I’ll cover at length in a future column.
The other major role DCs may play is in holding a copy
of the Global Catalog (GC). This is an assigned role,
and any number of DCs within a domain may hold a copy
of the GC. The placement of GCs is determined largely
by the characteristics of the physical network. The GC
is an index file that contains all of the objects in the
entire directory forest. This index information is stored
in the NTDS.DIT file along with the other directory information.
However, it doesn’t contain all of the attributes in the
forest, only the ones Microsoft has placed in there by
default. As the administrator, you can add other attributes
if you want them included in the index.
The default set of attributes in the GC can’t be removed;
services use them to locate other services they need
to function properly. This index file also allows users
and applications to locate directory objects without any
knowledge of the domains and across a discontiguous namespace.
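Conceptually, the GC behaves like a forest-wide index that keeps every object but only a subset of each object’s attributes. This sketch (hypothetical names, not AD code) shows the idea:

```python
# Illustrative sketch (hypothetical, not AD code): a global-catalog-style
# partial index covering every object in the "forest" but only a
# chosen subset of each object's attributes.

INDEXED_ATTRS = {"name", "mail"}   # analogous to the GC's default attribute set

def build_gc(domains):
    """domains: {domain_name: {dn: full_attribute_dict}}.
    Returns one forest-wide index keeping only indexed attributes."""
    catalog = {}
    for domain, objects in domains.items():
        for dn, attrs in objects.items():
            catalog[dn] = {k: v for k, v in attrs.items() if k in INDEXED_ATTRS}
    return catalog

forest = {
    "sales.example.com": {
        "cn=Jane": {"name": "Jane", "mail": "jane@example.com", "dept": "Sales"},
    },
    "hr.example.com": {
        "cn=Raj": {"name": "Raj", "mail": "raj@example.com", "dept": "HR"},
    },
}
gc = build_gc(forest)
# Users in any domain are findable, but non-indexed attributes are absent.
assert gc["cn=Raj"]["mail"] == "raj@example.com"
assert "dept" not in gc["cn=Jane"]
```

A lookup against this index succeeds without knowing which domain holds the object, which is the whole point of the GC in a discontiguous namespace.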
The DSA in a directory implementation is where most of
the work takes place to keep the system properly converged
and available to provide accurate and useful information
to requestors. Microsoft has taken the X.500 DSA model
and created a more modular implementation, using DCs as
DSAs. The most important thing to consider with this DC
model is that it isn’t a complete multi-master database
model. Some DCs are chosen to play singular (or nearly
singular) roles within the system. To ensure operational
availability, take care in where you physically locate
these machines and in the hardware you choose to run
them on.