ADSI (LDAP) distinguished name and canonical name escaping
Robbie Mosaic
2010-01-13
ADSI (Active Directory Service Interfaces) uses LDAP (Lightweight Directory Access Protocol) to access objects in Active Directory. An object name, such as a user name or a group name, is represented hierarchically as a distinguished name, like “CN=Domain Users,OU=UserAccounts,DC=fareast,DC=corp,DC=microsoft,DC=com”. Because the protocol allows any Unicode character in an attribute value, such as “Domain Users+=/”, escaping must be applied when converting a value from its bare form to its form in the distinguished name.
http://www-03.ibm.com/systems/i/software/ldap/underdn.html
Figure 1: Link to the IBM page about LDAP distinguished name escaping
IBM provides relatively complete information about LDAP distinguished name escaping (Figure 1). Another page also provides information (Ref 1). Basically, we need to escape these characters: “,;#+=<>”, plus double quote, backslash, and space. The hash (“#”) and the space do not always need escaping: a hash requires escaping only when it is the first character, and a space only when it is the first or the last character.
There are two ways to escape a character: precede it with a backslash (“\”), or replace it with a backslash followed by the two-digit hexadecimal code of the character in UTF-8 (a character that takes several bytes in UTF-8 gets one such pair per byte). Either way can be chosen. Escaped example:
O=\#Sue\2C Grabbit\\ and\" Runn\ ,C=GB
If the above escaped attribute is unescaped, it is (in Visual Basic notation):
"#Sue, Grabbit\ and"" Runn "
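The escaping rules can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the names `escape_dn_value`, `ALWAYS_ESCAPE`, and `use_hex` are hypothetical, and the sketch also escapes the backslash itself because it is the escape character.

```python
# Characters that are always escaped in an attribute value. The backslash is
# included because it is the escape character itself.
ALWAYS_ESCAPE = set(',;+=<>"\\')


def _hex_escape(ch: str) -> str:
    """Return ch as backslash + two hex digits for each of its UTF-8 bytes."""
    return ''.join('\\%02X' % b for b in ch.encode('utf-8'))


def escape_dn_value(value: str, use_hex: bool = False) -> str:
    """Escape a bare attribute value for use inside a distinguished name."""
    last = len(value) - 1
    out = []
    for i, ch in enumerate(value):
        needs_escape = (
            ch in ALWAYS_ESCAPE
            or (ch == '#' and i == 0)          # hash only when leading
            or (ch == ' ' and i in (0, last))  # space only when leading/trailing
        )
        if not needs_escape:
            out.append(ch)
        elif use_hex:
            out.append(_hex_escape(ch))
        else:
            out.append('\\' + ch)
    return ''.join(out)


print(escape_dn_value('Domain Users+=/'))  # Domain Users\+\=/
```

The function escapes a single attribute value only; the separators between “CN=...”, “OU=...” and so on are added by the caller after each value has been escaped.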
There is also a name format other than the distinguished name, called the “canonical name”. A sample string is “centro23dn.local/Users/Domain Users”. You can use the TranslateName() Windows API to convert between the distinguished name and the canonical name, using the EXTENDED_NAME_FORMAT enumeration values NameFullyQualifiedDN and NameCanonical. In the canonical name format, although I found no documentation for it, I observed that none of the characters mentioned above are escaped; only the backslash (“\”) and the forward slash (“/”) are. As far as I saw, each of them is escaped by preceding it with a backslash. Escaped example:
centro23dn.local/Users/Domain\/\\ Users #+=,;<>"
If the above escaped name is unescaped, its last component is:
Domain/\ Users #+=,;<>"
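The canonical name behavior I observed can be sketched as follows. Since the format is undocumented, this is an assumption based on observation only; the function names `escape_canonical_component` and `split_canonical_name` and the sample name are hypothetical.

```python
def escape_canonical_component(name: str) -> str:
    """Escape one path component: backslash first, then forward slash."""
    return name.replace('\\', '\\\\').replace('/', '\\/')


def split_canonical_name(canonical: str) -> list:
    """Split a canonical name on unescaped slashes and unescape each part."""
    parts, current = [], []
    i = 0
    while i < len(canonical):
        ch = canonical[i]
        if ch == '\\' and i + 1 < len(canonical):
            # An escaped character: keep the next character literally.
            current.append(canonical[i + 1])
            i += 2
        elif ch == '/':
            # An unescaped slash separates two components.
            parts.append(''.join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append(''.join(current))
    return parts


# Hypothetical object name containing both special characters.
escaped = 'example.local/Users/' + escape_canonical_component('A/B\\C')
print(split_canonical_name(escaped))
```

The backslash has to be replaced before the slash; otherwise the backslashes added for the slashes would themselves be doubled.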
Notes on unescaping: when you unescape a distinguished name, do not expect it to have been escaped exactly by the rules stated above; tolerate some variations. One is a hash (“#”) preceded by a backslash even though it is not the first character. Another is a space preceded by a backslash even though it is neither the first nor the last character. Besides these, I have not found other cases that need to be considered.
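A tolerant unescaping routine for a distinguished name value might look like the sketch below. It is an illustration under my own assumptions: the name `unescape_dn_value` is hypothetical, and the routine simply accepts a backslash before any character, which covers the escaped hash and space in any position.

```python
HEX_DIGITS = set('0123456789abcdefABCDEF')


def unescape_dn_value(escaped: str) -> str:
    """Unescape one attribute value taken from a distinguished name."""
    out = bytearray()  # collect UTF-8 bytes so multi-byte hex escapes work
    i, n = 0, len(escaped)
    while i < n:
        ch = escaped[i]
        if ch == '\\' and i + 1 < n:
            pair = escaped[i + 1:i + 3]
            if len(pair) == 2 and all(c in HEX_DIGITS for c in pair):
                # Backslash + two hex digits: one UTF-8 byte.
                out.append(int(pair, 16))
                i += 3
            else:
                # Backslash + any other character: take it literally,
                # wherever it appears (tolerates "\#" and "\ " mid-string).
                out += escaped[i + 1].encode('utf-8')
                i += 2
        else:
            out += ch.encode('utf-8')
            i += 1
    return out.decode('utf-8')


print(unescape_dn_value('a\\#b\\ c'))  # a#b c
```

Hex escapes that do not form valid UTF-8 raise UnicodeDecodeError in this sketch; a caller that must accept arbitrary input would need to decide how to handle that case.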
Another place where escaping is needed is the Active Directory search filter. Searching for “Search Filter Syntax” in MSDN leads you to the page describing the syntax. RFC 2254 specifies an escaping scheme; both the RFC and MSDN list the characters in a table:
ASCII | Escape Sequence
Character | Substitute
-----------+----------------
* | \2a
( | \28
) | \29
\ | \5c
NUL | \00
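The table translates directly into a lookup. The following is a minimal sketch; the function name `escape_filter_value` and the sample group name are hypothetical.

```python
# Substitutions taken from the table above.
FILTER_ESCAPES = {
    '*': '\\2a',
    '(': '\\28',
    ')': '\\29',
    '\\': '\\5c',
    '\0': '\\00',
}


def escape_filter_value(value: str) -> str:
    """Escape a value before placing it inside a search filter."""
    return ''.join(FILTER_ESCAPES.get(ch, ch) for ch in value)


# Hypothetical group name used as a filter value.
print('(cn=' + escape_filter_value('Sales (EMEA)*') + ')')  # (cn=Sales \28EMEA\29\2a)
```

Only the value is escaped, never the whole filter, because the parentheses and asterisks that belong to the filter syntax must stay as they are.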