In-Depth
Taming Kerberos
If you’re using Windows 2000 Server or above, chances are you’re
also using Kerberos authentication. It’s time to get to know this
three-pronged protocol and learn how to troubleshoot it.
Guard dogs are great if you’re protecting a house or a military base.
But when it comes to securing nodes on a computer network, guard dogs
don’t seem to work very well. Try installing German shepherds on your
users’ desktops and you’ll soon see what I mean. However, there is one
dog you can install for computer security: Kerberos.
As everyone working with Windows 2000 or above should know by now, the Kerberos
version 5 authentication protocol is the default for network authentication
on computers with Windows 2000 Server and Windows Server 2003. The Kerberos
protocol was developed at the Massachusetts Institute of Technology (MIT), and is named after Cerberus, the three-headed
fire-breathing dog guarding the gate to Hades. Remembering that makes
it easy to remember that Kerberos is a three-pronged authentication scheme
consisting of:
- Client. The system/user making the request
- Server. The system that offers a service to systems whose
identity can be confirmed
- Key Distribution Center (KDC). The third-party intermediary
between the client and the server, which vouches for the identity of
a client. In the Windows environment, the KDC is a domain controller
running Active Directory (it could be a Unix-based KDC also).
Kerberos has been heralded as—and proven to be—an improvement over the
previous standard, NT Lan Manager (NTLM). It’s more flexible and efficient
than NTLM and more secure. Benefits include more efficient authentication
to servers; mutual identification between client and server; delegated
authentication; simplified trust management; and improved interoperability
with other networks using Kerberos v.5 for authentication. Given the number
of articles that have already been written on how Kerberos works, I’m
not going to spend a lot of time on the topic.
Ten-Minute Tour
Windows 2000 and newer provide support for MIT Kerberos v.5 authentication,
as defined in IETF RFC 1510. The Kerberos protocol is composed of three sub-protocols. The
sub-protocol in which the KDC gives the client a logon session key and
a TGT (Ticket Granting Ticket) is called the Authentication Service (AS)
Exchange. The sub-protocol in which the KDC distributes a service session
key and ticket for the service is called the Ticket-Granting Service (TGS)
Exchange. The sub-protocol in which the client presents the ticket for
admission to a service is called the Client-Server (CS) Exchange.
The chain of communication involved in a Kerberos authentication session
goes like this:
Authentication Service Exchange. User Helen, at a Windows 2000 Professional workstation,
logs onto a Windows 2003 network. The Kerberos client running on Helen’s
workstation converts her password to an encryption key and saves the result
in a program variable. The Kerberos client sends a message to the KDC
Server, of type KRB_AS_REQ (Kerberos Authentication Server Request). This
message has two parts:
- An identification of the user, Helen, and the service for which she
is requesting credentials, the TGS; and
- Pre-authentication data, intended to prove that Helen knows her password.
This is simply an authenticator encrypted with Helen’s master key. The
master key is generated by running Helen’s password through a one-way
function (OWF).
The KDC, upon receipt of KRB_AS_REQ from Helen, looks up the user in
the AD, gets her master key, decrypts the pre-authentication data and
evaluates the time stamp inside. If the time stamp passes the test, the
KDC can be assured that the pre-authentication data was encrypted with
Helen’s master key and isn’t merely a captured replay. Finally, once the
KDC has verified Helen’s identity, it will create credentials that the
client program on her workstation can present to the TGS. The credentials
are created and deployed as follows:
- A brand-new logon session key, encrypted with Helen’s master key,
is created.
- A second copy of the logon session key and Helen’s authorization
data, in a TGT, encrypted with the KDC’s own master key, is created.
- Next, the KDC sends these credentials back to the client by replying
with a message of type KRB_AS_REP (Kerberos Authentication Response).
- When the client receives the reply, it decrypts the logon session
key via application of Helen’s master key. The session key is then stored
in the client workstation’s ticket cache. The TGT is extracted from
the message, and stored in the cache, as well.
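The Authentication Service Exchange above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Windows implements it: PBKDF2 stands in for the one-way function, the `seal`/`unseal` helpers are a hypothetical, insecure stand-in for real Kerberos encryption, and the account name, salt and password are invented for illustration.

```python
import hashlib
import hmac
import os
import time

MAX_CLOCK_SKEW = 5 * 60  # seconds; the 5-minute default tolerance


def master_key(password: str, salt: str) -> bytes:
    """One-way function (OWF) stand-in. PBKDF2 is an assumption for illustration."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt.encode(), 10_000)


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Derive a pseudo-random byte stream from the key (toy construction)."""
    blocks = []
    counter = 0
    while 32 * len(blocks) < length:
        blocks.append(hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest())
        counter += 1
    return b"".join(blocks)[:length]


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Toy 'encryption': XOR with a keystream, plus an HMAC tag. NOT secure."""
    nonce = os.urandom(16)
    body = bytes(p ^ k for p, k in zip(plaintext, _keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + tag + body


def unseal(key: bytes, blob: bytes) -> bytes:
    """Reverse seal(); raises ValueError if the key is wrong or the data changed."""
    nonce, tag, body = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered message")
    return bytes(c ^ k for c, k in zip(body, _keystream(key, nonce, len(body))))


def kdc_as_exchange(accounts, kdc_key, user, preauth, now):
    """KDC side: check the KRB_AS_REQ pre-authentication, build the KRB_AS_REP parts."""
    user_key = accounts[user]                      # master key looked up in the directory
    stamp = float(unseal(user_key, preauth).decode())
    if abs(now - stamp) > MAX_CLOCK_SKEW:          # stale time stamp: possible replay
        raise ValueError("clock skew too great")
    session_key = os.urandom(32)                   # brand-new logon session key
    for_client = seal(user_key, session_key)       # copy 1: under the user's master key
    tgt = seal(kdc_key, user.encode() + b"|" + session_key)  # copy 2: inside the TGT
    return for_client, tgt


# Client side: Helen proves knowledge of her password by sealing the current time.
now = time.time()
helen_key = master_key("example-password", "EXAMPLE.COMhelen")
kdc_key = os.urandom(32)
preauth = seal(helen_key, str(now).encode())
for_client, tgt = kdc_as_exchange({"helen": helen_key}, kdc_key, "helen", preauth, now)
logon_session_key = unseal(helen_key, for_client)  # cached alongside the opaque TGT
```

Note that Helen can recover the logon session key but cannot read the TGT, because only the KDC holds the key that sealed it.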
Ticket-Granting Service Exchange. At this stage, the Kerberos
client running on Helen’s workstation requests credentials
to access the target server, Bob, by sending a message of type KRB_TGS_REQ
(Kerberos Ticket-Granting Service Request) to the KDC. This message consists
of the following components:
- Identity of the target service for which the client is requesting
credentials.
- Authenticator encrypted with the user’s logon session key.
- TGT acquired from the AS Exchange.
The KDC decrypts the TGT with its master key, and extracts Helen’s logon
session key. Helen’s logon session key is used to decrypt Helen’s authenticator.
If Helen’s authenticator passes the test, the KDC invents a new session
key for Helen to share with Bob. Two copies of this new session key are
sent back to Helen in a single message, encrypted as follows:
- One copy is encrypted using Helen’s logon session key.
- The second copy is encrypted using the target server’s master key,
in a ticket along with Helen’s authorization data.
Helen decrypts the target server session key, using her logon session key, and stores
the session key in her cache, along with the target server ticket.
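The message flow of the Ticket-Granting Service Exchange can be modeled as follows. This sketch deliberately skips real cryptography: the hypothetical `Sealed` class simply remembers which key "locked" a payload and refuses to open it with any other key. The field names and the sample authorization data are assumptions made for illustration.

```python
import hmac
import os
import time
from dataclasses import dataclass

MAX_CLOCK_SKEW = 300  # seconds


@dataclass(frozen=True)
class Sealed:
    """Stand-in for ciphertext: records the locking key instead of encrypting."""
    key: bytes
    payload: dict

    def open(self, key: bytes) -> dict:
        if not hmac.compare_digest(key, self.key):
            raise PermissionError("cannot decrypt: wrong key")
        return self.payload


def kdc_tgs_exchange(kdc_key, service_keys, request, now):
    """Handle a KRB_TGS_REQ; return the two sealed copies of the new session key."""
    tgt = request["tgt"].open(kdc_key)                 # only the KDC can read the TGT
    logon_key = tgt["session_key"]
    authenticator = request["authenticator"].open(logon_key)
    if authenticator["user"] != tgt["user"]:
        raise PermissionError("authenticator does not match the TGT")
    if abs(now - authenticator["timestamp"]) > MAX_CLOCK_SKEW:
        raise PermissionError("clock skew too great")
    service = request["service"]
    new_key = os.urandom(32)                           # session key for Helen and Bob
    for_client = Sealed(logon_key, {"service": service, "session_key": new_key})
    ticket = Sealed(service_keys[service], {           # readable only by the target server
        "user": tgt["user"],
        "auth_data": tgt["auth_data"],
        "session_key": new_key,
    })
    return for_client, ticket


now = time.time()
kdc_key, bob_key, logon_key = os.urandom(32), os.urandom(32), os.urandom(32)
tgt = Sealed(kdc_key, {"user": "helen", "auth_data": ["Domain Users"], "session_key": logon_key})
request = {
    "service": "bob",
    "authenticator": Sealed(logon_key, {"user": "helen", "timestamp": now}),
    "tgt": tgt,
}
for_client, ticket = kdc_tgs_exchange(kdc_key, {"bob": bob_key}, request, now)
helen_copy = for_client.open(logon_key)["session_key"]  # Helen caches this with the ticket
bob_copy = ticket.open(bob_key)["session_key"]          # Bob recovers the same key later
```

The point of the design is that Helen carries the ticket to Bob without being able to read or alter it, while both ends still arrive at the same session key.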
Client-Server Exchange. Helen’s Kerberos client is now
ready to be authenticated by the target server, Bob. Helen’s client sends
Bob a message of type KRB_AP_REQ (Kerberos Application Request). This
message contains:
- An authenticator encrypted with the session key for Bob;
- The ticket for sessions with Bob, encrypted with Bob’s master key; and
- A flag indicating whether the client requests mutual authentication.
Bob decrypts the ticket, and extracts Helen’s authorization data and
session key. Bob uses the session key to decrypt Helen’s authenticator,
and evaluates the time stamp. If the authenticator passes the test, Bob
looks for a mutual authentication flag. If this flag is set, Bob uses
the session key to encrypt the time from Helen’s authenticator and returns
the result to Helen in a message of type KRB_AP_REP (Kerberos Application
Reply). Helen decrypts the reply with the session key. If the time stamp
it contains is identical to the one in the authenticator that she sent Bob,
the client is assured that the server is genuine, and the connection proceeds.
Tickets, Please
If you didn’t follow all this precisely, don’t worry too much; but it
is important that you understand a couple of points. The key one is that
virtually the entire Kerberos protocol is devoted to acquiring and using
tickets. Again—virtually the entire Kerberos protocol is devoted to
acquiring and using tickets.
Once our user Helen has been authenticated by the server, she’s given
a ticket. Think of the ticket as dangling from a keychain on the belt
of her Windows Explorer. Wherever she or any ticket-holder chooses to
explore, the ticket provides entry—or not.
In the same way a plane ticket is printed for a particular seat on a particular
flight at a specified time, a Kerberos ticket is “printed” with the user’s
Domain UserName and the time the ticket’s valid.
While it was Helen’s logon password that got her the ticket, when she
wants resources her client presents this ticket rather than sending a password.
This method allows for stronger encryption on the ticket than a password
allows. For example, the ticket encryption includes time-based elements
that make a ticket difficult to reuse if it is intercepted.
Ticket Granting Tickets and Session Tickets have different purposes.
The idea is that once a user logs on that user gets a master ticket, a.k.a.
Ticket Granting Ticket. If that user wants resources from a different
server, the user gets a second Session Ticket, valid only for a limited
time for a particular purpose.
As long as Helen remains logged on, the tickets are renewed automatically.
The default ticket lifetimes are controlled at the domain level by using
domain policy. The defaults, which should be noted, are:
- MaxServiceTicketAge: 10 hours
- MaxTicketAge: 10 hours
- MaxRenewAge: 7 days
- MaxClockSkew: 5 minutes
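To make the interaction of these defaults concrete, here is a minimal sketch that applies them to a single ticket. The `Ticket` class and its field names are hypothetical, and the checks are a simplification of what a KDC actually enforces.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Default domain policy values listed above.
MAX_SERVICE_TICKET_AGE = timedelta(hours=10)
MAX_TICKET_AGE = timedelta(hours=10)
MAX_RENEW_AGE = timedelta(days=7)
MAX_CLOCK_SKEW = timedelta(minutes=5)


@dataclass
class Ticket:
    user: str
    first_issued: datetime   # when the original ticket was granted
    last_renewed: datetime   # when it was most recently issued or renewed
    lifetime: timedelta = MAX_TICKET_AGE

    def expires_at(self) -> datetime:
        return self.last_renewed + self.lifetime

    def is_expired(self, now: datetime) -> bool:
        return now > self.expires_at()

    def can_renew(self, now: datetime) -> bool:
        # Renewal is only possible inside the renewal window of the original ticket.
        return now <= self.first_issued + MAX_RENEW_AGE


def clock_skew_ok(client_time: datetime, kdc_time: datetime) -> bool:
    """True if the two clocks are within the tolerated skew."""
    return abs(client_time - kdc_time) <= MAX_CLOCK_SKEW


logon = datetime(2004, 3, 1, 8, 0)            # illustrative logon time
tgt = Ticket("helen", first_issued=logon, last_renewed=logon)
print(tgt.expires_at())                        # ten hours after logon
print(tgt.can_renew(logon + timedelta(days=3)))
```

A ticket that has expired but is still inside the renewal window can be renewed without asking Helen for her password again; once the window closes, a fresh logon is needed.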
Troubleshooting Kerberos
In an ideal world, Kerberos would hum along, happily issuing, processing
and renewing tickets and handling authentication without fail. And it’s
safe to say that’s what happens 99.9 percent of the time. When it doesn’t,
authentication and access to resources start to fail. Helen, who suddenly
starts having problems accessing resources, picks up the phone and calls
Rock, her friendly system administrator. Rock must diagnose the problem
and fix it, while being snarled at by a three-headed dog and grumpy users
clamoring for attention and solutions.
Thankfully, for Rock, troubleshooting Kerberos isn’t very different from
the troubleshooting procedures used for other protocols. With a few exceptions,
it’s basically a process of looking for places where the system for the
acquisition and use of tickets is impaired.
Enabling Event Logging
Event Logs are an excellent place to look, but Kerberos Event Logging
is turned off by default in Windows, which seems odd since logging gives
Rock the capability of tracing detailed Kerberos events through the event
log mechanism. This information in turn can be used to troubleshoot Kerberos.
The procedure involves a Registry hack and is fairly straightforward (see
“Enabling Kerberos Event Logging on a Specific
Computer”). Personally, I recommend doing this on all machines on
your network, as the information you accumulate in the system log can
save you considerable time and effort (but don’t forget the standard warnings
about all the horrible things that can, and probably will, happen to your
servers if you start fiddling with the Registry without knowing what you’re
doing). You can find any Kerberos-related events in the system log.
Enabling Kerberos Event Logging on a Specific Computer
1. Start Registry Editor.
2. Add the following Registry value:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa\Kerberos\Parameters
Registry Value: LogLevel
Value Type: REG_DWORD
Value Data: 0x1
If the Parameters subkey doesn’t exist, create it.
Note: Remove this Registry value when it is no longer needed, so that
performance isn’t degraded on the computer. Removing the value is also
how you disable Kerberos event logging on a specific computer.
3. Quit Registry Editor, then restart the computer.
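If the same change has to be made on several machines, one option is to generate a .reg file rather than editing each Registry by hand. The sketch below only builds the file's text; the function name is hypothetical, and saving and importing the file (and testing it on a non-production machine first) is left to the administrator.

```python
KEY_PATH = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa\Kerberos\Parameters"


def kerberos_logging_reg(enable: bool = True) -> str:
    """Return .reg file text that sets (or removes) the LogLevel value."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY_PATH}]",
    ]
    if enable:
        lines.append('"LogLevel"=dword:00000001')  # REG_DWORD 0x1 turns logging on
    else:
        lines.append('"LogLevel"=-')               # "-" deletes the value again
    return "\r\n".join(lines) + "\r\n"


print(kerberos_logging_reg())
```

Calling `kerberos_logging_reg(enable=False)` produces the matching file that removes the value once troubleshooting is finished.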
Step-by-Step Troubleshooting
- When Helen’s problem is relayed to Rock, there are a few basic things
he should do. He should make sure that Kerberos is actually supposed
to be in use. Neither Telnet nor FTP uses Kerberos, so puzzling over
their steadfast refusal to do so is fruitless.
- Since everything is all about ticket getting, Rock should do a simple
check to assure that the KDC Service has started on the domain controllers.
If it hasn’t, he should start the service. If it has, another thing
to check is that time synchronization, typically controlled through
the W32Time (Windows Time) service, is operational and that time is
synchronized across the network. Information regarding this can typically
be found in the System log in Event Viewer. Happily, not much else is
required regarding client time synchronization, since the built-in Windows
Time Service on XP and Win2K Pro synchronizes automatically with the
PDC Emulator. This eliminates the need to create a special logon script
for XP and Win2K Pro clients.
- Rock should also confirm DNS records
if there are connectivity problems with the KDC. At login, clients need
to contact their DCs. DNS provides the IP address of the DCs. Hence,
he should first check that Helen’s TCP/IP settings are correct, using
“ipconfig /all” to make sure the client can query the DNS server.
- If that doesn’t shed light on the problem, try examining the SRV
records on the DNS server. Windows 2003 clients use DNS SRV records
to locate DCs; in particular, they attempt to resolve the
_ldap._tcp.dc._msdcs SRV records. Win2K and above DCs also publish SRV records
for the _kerberos and _kpasswd services. The list of published SRV records
can be found on a DC in the following file:
%Windir%\System32\Config\Netlogon.dns
- What if Helen’s Kerberos problem is seemingly inconsistent? For example,
many operation authentications succeed, but operations that include
GPO applications don’t work at all? In that case, Rock may want to review
Helen’s group memberships to see if the problem lies there.
- Size could also matter. By design, the Kerberos token has a fixed
size. If Helen’s a member of a group either directly or by membership
in another group, the security ID (SID) for that group is added to her
token. For a SID to be added to the user’s token, it must be communicated
by using the Kerberos token. If the required SID information exceeds
the size of the token, authentication fails. The exact number varies,
but the limit is approximately 70 to 80 groups. If Helen belongs to
a large number of security groups, that alone may be the source of the
problem.
This problem may not be immediately obvious,
since NTLM authentication will continue to succeed. The Kerberos authentication
problem, then, isn’t obvious until Helen accesses operations that include
GPO application and these simply don’t work at all. There are several
workarounds covered in KnowledgeBase
280830, “Kerberos Authentication May Not Work If User Is a Member
of Many Groups.”
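Because nested groups count toward the total, the number that matters is Helen's effective membership, not just the groups she is added to directly. The following sketch shows one way to count it from an exported membership map. The data layout, group names and function names are hypothetical, and the threshold is simply the lower end of the approximate 70-to-80 figure above.

```python
GROUP_WARNING_THRESHOLD = 70  # lower end of the approximate limit quoted above


def effective_groups(principal, member_of):
    """Return every group a principal belongs to, directly or through nesting.

    member_of maps a user or group name to the groups it is a direct member of.
    """
    seen = set()
    pending = list(member_of.get(principal, []))
    while pending:
        group = pending.pop()
        if group in seen:          # already counted; also guards against nesting loops
            continue
        seen.add(group)
        pending.extend(member_of.get(group, []))
    return seen


def may_exceed_token(principal, member_of, threshold=GROUP_WARNING_THRESHOLD):
    """Flag accounts whose effective group count reaches the warning threshold."""
    return len(effective_groups(principal, member_of)) >= threshold


member_of = {
    "helen": ["Sales", "Staff"],
    "Sales": ["Staff", "AllUsers"],
    "Staff": ["AllUsers"],
}
print(sorted(effective_groups("helen", member_of)))
print(may_exceed_token("helen", member_of))
```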
Errors and Their Meanings
One of the first steps mentioned in troubleshooting Kerberos was enabling
event logging, since Event Viewer is an indispensable tool in
the arsenal. All error messages listed in Table 1 appear in Event Viewer.
Understanding what they mean provides plenty of helpful hints for
troubleshooting Kerberos authentication problems.
Table 1. Kerberos Error Messages and Meanings
| Code | Summary | What it Means |
|---|---|---|
| 0x6 | Client not found in Kerberos database | The KDC couldn't translate the client principal name from the KDC request into an account in the AD. To troubleshoot this error, check whether the client account exists in AD, whether it has expired, and whether AD replication is functioning correctly. |
| 0x7 | Server not found in Kerberos database | The KDC couldn't translate the server principal name from the KDC request into an account in the AD. To troubleshoot this error, check whether the server account exists in AD, whether it has expired, and whether AD replication is functioning correctly. |
| 0x9 | The client or server has a null key | Keys should never be null (blank). Even null passwords generate keys, because the password is concatenated with other elements to form the key. If a client sees this error, the administrator should reset the password on the account. |
| 0xE | KDC has no support for this encryption type | The client tried to use an encryption type that the KDC doesn't support, for any of the following reasons: The client's account doesn't have a key of the appropriate encryption type; the KDC account doesn't have a key of the appropriate encryption type; or the requested server account doesn't have a key of the appropriate encryption type. The type may not be recognized at all, for example, if a new type is introduced. This happens most frequently with Massachusetts Institute of Technology (MIT) compatibility, where an account may not yet have an MIT-compatible key. Generally, a password change must occur for the MIT-compatible key to be available. |
| 0x17 | Password has expired | This error can be caused by conflicting credentials. Have the user log off and then log on again to resolve the issue. |
| 0x18 | Pre-authentication information was invalid | This indicates failure to obtain a ticket, possibly due to the client providing the wrong password. |
| 0x19 | Additional pre-authentication | The client didn't send pre-authentication, or didn't send the appropriate type of pre-authentication, to receive a ticket. The client will retry with the appropriate kind of pre-authentication (the KDC returns the pre-authentication type in the error). Many Kerberos implementations will start off without pre-authenticated data and only add it in a subsequent request when they see this error. In this case, this error can safely be ignored. |
| 0x1A | Requested server and ticket do not match | This error will occur when a server receives a ticket destined for another server. This can be caused by DNS problems. |
| 0x1F | Integrity check on decrypted field failed | This error indicates that there's a problem with the hash included in a Kerberos message. This could be caused by a hacker attack. |
| 0x20 | Ticket has expired | This is not a real error; it just indicates that a ticket's lifetime has ended and that the Kerberos client should obtain a new ticket. |
| 0x22 | Session request is a replay | This error indicates that the same authenticator was used twice. This can be caused by a hacker attack. |
| 0x25 | Clock skew too great | There is a time discrepancy between client and server or client and KDC. To resolve this issue, synchronize time between the client and the server. |
| 0x26 | Incorrect net address | Session tickets include the addresses from which they're valid. This error can occur if the address sending the ticket is different from the valid address in the ticket. A possible cause could be an IP address change invalidating any existing cached tickets. Another possible cause is when a ticket's passed through a proxy server or NAT. The client is unaware of the address scheme used by the proxy server, so unless the program caused the client to request a proxy server ticket with the proxy server's source address, the ticket could be invalid. |
| 0x29 | Message stream modified | This indicates that the server was unable to decrypt the ticket sent by a client, so the server doesn't know the secret key used to encrypt the ticket, or the client got the ticket from a KDC that didn't know the server's key. This can be tested by determining if the server can obtain a ticket to itself, or if anybody else can locate the server. The secure channel used by NTLM is also an indicator of the validity of the password on local machine accounts. Put another way, it means that the checksum used to verify the data packet didn't match what was expected, which would imply a corrupted data stream or possible attack. |
| 0x3C | Generic error | A generic error that may be a memory allocation failure. Checking the event logs may be useful. |
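When working through a long event log, it can help to turn the codes in Table 1 into a small lookup. The sketch below copies only the codes and summaries from the table; the function name and the output format are assumptions for illustration.

```python
# Codes and summaries as listed in Table 1.
KERBEROS_ERRORS = {
    0x06: "Client not found in Kerberos database",
    0x07: "Server not found in Kerberos database",
    0x09: "The client or server has a null key",
    0x0E: "KDC has no support for this encryption type",
    0x17: "Password has expired",
    0x18: "Pre-authentication information was invalid",
    0x19: "Additional pre-authentication",
    0x1A: "Requested server and ticket do not match",
    0x1F: "Integrity check on decrypted field failed",
    0x20: "Ticket has expired",
    0x22: "Session request is a replay",
    0x25: "Clock skew too great",
    0x26: "Incorrect net address",
    0x29: "Message stream modified",
    0x3C: "Generic error",
}


def describe(code) -> str:
    """Accept an int or a hex string such as '0x25' and return a readable summary."""
    value = int(code, 16) if isinstance(code, str) else code
    summary = KERBEROS_ERRORS.get(value, "not listed in Table 1")
    return f"0x{value:X}: {summary}"


for seen_in_log in ("0x25", "0x19", "0x7"):
    print(describe(seen_in_log))
```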
Troubleshooting tools
Microsoft offers a number of tools to troubleshoot Kerberos, spread across
the resource kit, the support tools, and the platform SDK. Most are command-prompt
tools, and all of them are valuable, especially Kerbtray (see Table 2).
Table 2. Kerberos Troubleshooting Tools
| Tool (Location) | Comments |
|---|---|
| mytoken.exe (Platform SDK) | Command-prompt tool to display the content of a user's access token, including the user's rights and group memberships. |
| klist.exe (Resource Kit) | Command-prompt tool to look at the local Kerberos ticket cache. Klist can also be used to purge tickets. |
| Kerbtray (Resource Kit) | GUI tool that displays the content of the local Kerberos ticket cache. |
| Netdiag (Support Tools) | Netdiag helps isolate networking and connectivity problems by providing a series of tests to determine the state of your network client. One of the netdiag tests is the Kerberos test (netdiag was named nettest before Windows 2000 Beta 3). To run the Kerberos test, type "netdiag /test:Kerberos" at the command prompt. |
| Replication monitor (Support Tools) | Using replication monitor, an administrator can check not just the replication traffic, but also the number of AS and TGS requests as well as the FSMO roles. |
| Network monitor (Server CD) | Network monitor doesn't come out of the box with a parser for the Kerberos protocol. A special Kerberos parser DLL is, however, available from Microsoft. |
| Nltest (Resource Kit) | Administers domain and user accounts. Useful to test and discover information about trust relationships (see Table 3). |
| Netdom (Support Tools) | Versatile tool that can manage domains and trust relationships from the command prompt (see Table 4). In Windows 2000 it allows you to move a workstation or member server to a new domain; rename or reset a workstation or member server; verify or reset and synchronize time within a domain and resynchronize out-of-sync domain controllers; and that's only the beginning. |
| Setspn (Resource Kit) | This command-line tool allows management of the Service Principal Names (SPN) directory property for an AD service account. SPNs are used to locate a target principal name for running a service. SetSpn allows you to view the current SPNs, reset the host SPNs, and add or delete supplemental SPNs (see Table 5). |
Table 3. NLTEST Switches
| Nltest | Action |
|---|---|
| Nltest /trusted_domains | Discover trusted domains |
| Nltest /dclist: | Discover a domain's DCs |
| Nltest /whowill: | Find out whether a domain has a DC available that can authenticate a particular user |
| Nltest /finduser: | Find the trusted domain for a user |
Table 4. NETDOM Switches
| Netdom | Action |
|---|---|
| Netdom TRUST /d: /ADD /Ud: /Pd: /Uo:<> /Po: | Creates a trust relationship from a trusting domain to a trusted domain using the given accounts and passwords. Adding the switch /TWOWAY after the /ADD switch creates a bidirectional trust relationship. |
| Netdom TRUST /d: /ADD /PT: /REALM | Creates a trust relationship from a trusting domain to a non-Windows 2000 Kerberos realm; sets the trust password. To make it a bidirectional trust add /TWOWAY; to make it a transitive trust add /TRANS:yes. |
| Netdom TRUST /d: /TRANS:yes | Makes a trust relationship between a trusting domain and trusted domain transitive. |
| Netdom TRUST /d: /REMOVE | Removes a trust relationship between the trusting domain and the trusted domain. |
| Netdom TRUST /d: /VERIFY | Verifies the trust relationship between a trusting domain and the trusted domain. |
| Netdom TRUST /d: /Ud: /RESET | Resets the secure channel between a trusting domain and the trusted domain. |
| Netdom TRUST /d: /VERIFY /KERBEROS | Verifies Kerberos authentication (referrals) between the local workstation and a Kerberos service in a verified domain. |
Table 5. SETSPN Switches
| Setspn | Action |
|---|---|
| Setspn -l | Lists all the SPNs linked to a server. |
| Setspn -r | Resets the default SPNs for a server. |
| Setspn -a | Adds an SPN for a server; an SPN usually has the following format: /. |
| Setspn -d | Deletes an SPN for a server. |
Taming the Beast
It would be satisfying to tell you that this article contains all the
information you’ll ever need to troubleshoot any conceivable Kerberos
problem you might encounter. It doesn’t. However, troubleshooting Kerberos
problems need not be an arcane art, and the above should at least point
Rock, you and your colleagues in the right direction when the inevitable
growl comes from the machine. Now where is that rolled-up newspaper?