Partial report on the December, 2000 IETF meeting in San Diego.

The recent IETF meeting was one of the most interesting I have ever attended.  The Internet seems to have reached an inflection point.  There appears to be a willingness to question some of the presumptions upon which some parts of the Internet have been constructed.  This is not to say that people are now questioning the fundamental architectural aspects of the net.  Rather there seems to be a sense that we are building increasingly elaborate mechanisms when we ought to be finding more elegant solutions.

Domain Name System

It is very clear that the DNS is entering a stage of rapid evolution.  I've attended a lot of IETF meetings over the last dozen years and in none of them was DNS as intensely discussed or subject to proposals for such radical revision as it was in San Diego.

There were many different meetings in which the domain name system was discussed.  There are several forces that are applying pressure to DNS:

DNS as it was originally conceived was a fairly lightweight system; most queries could be resolved in a minimal number of packet round trips.  Several of the new extensions to DNS make the name resolution process substantially more complex, more expensive, and slower.

There was a significant undercurrent of feeling that perhaps the best way to proceed would be to avoid adding complexity to DNS and instead layer upon DNS a number of directory mechanisms.

Internationalization - The Non-radical Approaches

One of the few notions that everyone in the IETF seems to agree upon is that the Unicode character set can adequately represent the world's written languages in a digital form.  I have heard some vague statements that this assumption may not be universally held.  This could be a crack in the foundation upon which internationalized DNS is being constructed.

The first point of debate is whether to internationalize DNS at all.  One point of view holds that we ought not to; instead we ought to keep DNS as it is and push internationalization up a level of abstraction into directory services and search engines.  This is not a widely accepted point of view.

Another point of view is that we have to change the DNS protocols, or at least the packet structure, so that they may directly carry Unicode.  My sense is that almost everyone believes that this would be the best approach, but it is made very difficult in the short term, and perhaps impossible in any term, by the installed base of DNS resolvers and other DNS-aware software.  There are those who advocate this approach; they believe we should cut the Gordian knot now.

A third approach is to use a presentation layer that mediates between the internationalized representation that is seen by humans and the existing DNS hostname character set.  (It is important to note that this approach does not prejudice our ability to fully internationalize DNS at a later date.)

The big issue in this third approach is how to encode Unicode into the rather limited "hostname" character set.  The general term for this encoding is "ACE" (ASCII Compatible Encoding).  There are several ACE algorithms under discussion within the IETF and there are many technical considerations to be balanced.  It is not an easy choice.

Because a DNS label is limited to 63 characters in length, and it may often take rather many "hostname" characters to represent an internationalized string, we will almost always end up having to place more stringent length limits on internationalized names than we do on today's domain names.  For some languages that typically use long words, for example Thai, the length restrictions may significantly restrict the ability to represent words of that language.  One can readily understand why the ACE approach is not the preferred one for speakers of such languages.
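The length pressure can be made concrete with a small sketch.  It assumes Punycode as a stand-in for the ACE algorithm and "xn--" as the marker prefix; both were settled on only after this meeting and are not necessarily among the candidates discussed in San Diego.  The function names are hypothetical.

```python
MAX_LABEL_OCTETS = 63  # DNS limit on the length of a single label


def ace_length(label: str, prefix: str = "xn--") -> int:
    """Length of a label once ACE encoded.

    Assumption: Punycode (Python's built-in "punycode" codec) stands in
    for whichever ACE algorithm is eventually chosen.
    """
    encoded = label.encode("punycode").decode("ascii")
    return len(prefix) + len(encoded)


def fits_in_label(label: str) -> bool:
    """True if the ACE form of the label still fits in one DNS label."""
    return ace_length(label) <= MAX_LABEL_OCTETS


if __name__ == "__main__":
    for text in ("bücher", "ภาษาไทย" * 3):
        # characters typed by the user, hostname characters needed, fits?
        print(len(text), ace_length(text), fits_in_label(text))
```

The encoded form is always longer than the string the user typed, so the effective limit for an internationalized label is lower than 63 characters, and how much lower depends on the script.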

Unless a user is equipped with a sufficiently powerful presentation layer - something that is absent from virtually all currently deployed software - ACE encoded domain names will appear to be a sequence of nearly random characters.

I believe that most of the ACE algorithms being considered are such that they do not permit the encoding of a name that would otherwise fit within the hostname character set.  This is an important characteristic because otherwise there could readily be multiple distinct, but equally valid representations for the same name - this would, quite understandably, make trademark people very unhappy.

The ACE concept makes use of a recognizable prefix or suffix to allow software to distinguish a name that is ACE encoded from one that is not.  These trigger sequences are designed to minimize conflict with existing DNS names, but there is a small chance that some already allocated names may have to be revoked.
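A minimal sketch of how the trigger sequence and the "never re-encode a plain hostname" rule fit together.  It assumes Punycode as the ACE and "xn--" as the prefix, neither of which had been announced at the time of the meeting; the helper names are hypothetical.

```python
import string

# Characters permitted in a traditional hostname label.
HOSTNAME_CHARS = set(string.ascii_letters + string.digits + "-")

# Assumption: the prefix that was standardized later; it was undisclosed
# at the time of the meeting.
ACE_PREFIX = "xn--"


def is_ace(label: str) -> bool:
    """True if the label carries the ACE trigger sequence."""
    return label.lower().startswith(ACE_PREFIX)


def to_ace(label: str) -> str:
    """Encode a label, leaving plain hostname labels untouched.

    Refusing to encode names that already fit the hostname character set
    keeps each name down to a single valid representation.
    """
    if set(label) <= HOSTNAME_CHARS:
        return label
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


def from_ace(label: str) -> str:
    """Decode an ACE label for display; other labels pass through."""
    if not is_ace(label):
        return label
    return label[len(ACE_PREFIX):].encode("ascii").decode("punycode")
```

Note that an already registered name that happens to begin with the prefix would be misread as ACE by `is_ace`, which is the conflict with existing names described above.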

There is an odd issue associated with this prefix or suffix.  The issue is this:  Given the land-rush mentality towards domain names, when the suffix/prefix string is announced it is not unlikely that there will be a massive crush of registrations by speculators, particularly those who are already well equipped with technical means to generate a massive number of registration requests in a very short time.  As a consequence, the IETF is planning on holding the particular suffix/prefix string sequence under tight wraps until as late a time as possible.

Another issue with ACE encoding is how to provide for future extensions or versions of the ACE.

A few people mentioned the operational issues of ACE encoding.  For example, when attempting to diagnose a DNS problem the troubleshooter would most likely have to directly work with ACE encoded names - errors and frustration, not to mention the delay, would be the likely result.

There is considerable nervousness within the IETF at the prospect of uncontrolled internationalization of DNS.  DNS packets are found nearly everywhere on the Internet.  There is probably not a piece of equipment on the Internet that does not carry DNS packets or have DNS code built into it.

Another issue with internationalized DNS is the canonicalization of international character sets.  This is apparently a fairly arcane art: it ranges from the fairly simple, such as mapping all forms of spaces (full spaces, half spaces, em spaces, en spaces, etc.) into one form, to the complex, such as characters formed by the sequence-independent composition of other, more basic characters.  This process is called "nameprep".
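To give a feel for what such canonicalization involves, here is a rough approximation built on Unicode normalization form NFKC plus case folding from Python's standard library.  This is an assumption made for illustration only; it is not the nameprep procedure itself, and the function name is hypothetical.

```python
import unicodedata


def canonicalize(label: str) -> str:
    """Approximate canonical form of a label (NOT the real nameprep).

    NFKC maps compatibility variants (for example the many space
    characters) to one form and composes base characters with their
    combining marks regardless of the order the marks were typed in.
    """
    folded = unicodedata.normalize("NFKC", label).casefold()
    # Case folding can in principle undo normalization, so normalize again.
    return unicodedata.normalize("NFKC", folded)


if __name__ == "__main__":
    # Em space and ideographic (full-width) space both become U+0020.
    print(canonicalize("a\u2003b") == canonicalize("a\u3000b"))
    # Two combining marks in either order yield the same canonical string.
    print(canonicalize("a\u0323\u0301") == canonicalize("a\u0301\u0323"))
```

Two names that a human would consider identical must canonicalize to the same string before they are ACE encoded; otherwise they would end up as distinct entries in DNS.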

As is true of most software, currently deployed DNS software should not be presumed to be fail-safe in the face of packets containing new formats.  In other words we can expect the deployment of internationalized DNS to bring with it some degree of software failure.

There are some aspects of DNS that would make such failure potentially far more troublesome than other kinds of network software failures.

First off, there are many devices in the net that violate (or at least lean on) the end-to-end principle in one way or another - these are things like firewalls, network address translators (NATs), "transparent" web caches, content management systems, etc.  Because these devices are "infrastructure" devices, the software failure caused by internationalized DNS will not be readily apparent to end-users - they will simply perceive an infinite version of the famous "world wide wait."  However, the operators of those infrastructure devices, assuming they notice that something is acting badly, will frequently be left scratching their heads wondering why their equipment is misbehaving.  These operators may have difficulty isolating the cause of the problem.

Secondly, DNS relies very heavily on cached data.  Many DNS records have cache lifetimes measured in days, sometimes even in weeks, or longer.  This means that if something goes very wrong with an internationalized DNS experiment there is no "emergency off" switch.  Instead, the remnants of the experiment could hang around in DNS caches for as long as several weeks.

Internationalization - The Radical Approach

Internationalized DNS is a Gordian Knot. And John Klensin may have an Alexandrian solution: http://www.ietf.org/internet-drafts/draft-klensin-i18n-newclass-00.txt

John Klensin is chair of the IAB and his ideas deserve much consideration.  And when he feels that a radical change might be needed, one should stop and listen.

The premise is that sometimes a small change results in more ancillary trouble than a big change.  How can this be?  As I mentioned above, one of the main impediments to "doing it right" for internationalized DNS is the installed base.  If we make small changes to DNS we may well spend the next decade or two firefighting the side effects.  However, if we make a radical change, one so large that it can only be used by new software, then we have a clear evolutionary demarcation between the old and the new, and there would be few direct interoperability issues.

What has been proposed is to establish a new DNS "class".  What is a "class"?  In the domain name system data types are divided into classes.  It is best to think of each class as entirely distinct.  To date there have only been three classes, and only one, class "Internet" (or "IN"), is really used in practice.

It can be reasonably safely presumed that all extant DNS implementations check the class and react reasonably gracefully when they encounter data in a class they do not comprehend.
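To show where the class sits on the wire, the sketch below builds the question section of a DNS query, in which a 16-bit class field follows the name and the 16-bit type, and then performs the kind of class check presumed above.  The value used for the new class is an arbitrary placeholder, not one taken from the draft, and the function names are hypothetical.

```python
import struct

CLASS_IN = 1  # "Internet", the only class in widespread use
CLASS_CH = 3  # Chaos
CLASS_HS = 4  # Hesiod
HYPOTHETICAL_NEW_CLASS = 0xFF00  # placeholder value, purely for illustration

TYPE_A = 1  # host address record


def encode_question(name: str, qtype: int, qclass: int) -> bytes:
    """Encode a question: length-prefixed labels, a zero octet, type, class."""
    wire = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        wire += bytes([len(raw)]) + raw
    return wire + b"\x00" + struct.pack("!HH", qtype, qclass)


def question_class(question: bytes) -> int:
    """Skip over the (uncompressed) name and return the class field."""
    offset = 0
    while question[offset] != 0:
        offset += 1 + question[offset]
    _qtype, qclass = struct.unpack_from("!HH", question, offset + 1)
    return qclass


def legacy_server_understands(question: bytes, known=(CLASS_IN,)) -> bool:
    """Model of old software: data in an unknown class is declined."""
    return question_class(question) in known
```

Because the check happens before the name is interpreted, old software never has to make sense of names encoded under the new class's rules; it simply declines the query.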

When we join that presumption with the fact that a new DNS class opens the door to an entirely new way of encoding domain names, we find that we may have a way to bypass what may be a large number of small problems and instead undergo a single generational replacement (not painless, but at least finite).

At the same time, it is hoped that we can break the chain of forces that have driven us into many of the problems we have had with DNS.  The main link in that chain is the notion that DNS is a directory service.  Hence Klensin's second proposal - the creation of true Internet directory services layered on top of DNS.

An Internet Directory

As I mentioned previously there are many who feel that it is not necessary to internationalize DNS.  Instead we would keep DNS as it is and layer upon it any number of directory services.  While this is probably, in theory, the best of all possible solutions, it is one that is exceedingly difficult.

First of all, the requirements for such directory services are not amenable to ready specification.  It may turn out that we need an open ended number of distinct such systems each with different characteristics.

Second, there's a lot of momentum behind the belief that DNS is a directory system.  And as long as we have this belief people will fight to have semantically meaningful names in the language of their choice.

But the Internet is nothing if not a place for new ideas.  And so we have one, again from John Klensin: http://www.ietf.org/internet-drafts/draft-klensin-dns-role-00.txt

This is not a protocol proposal or indeed anything that could ever lay claim to the adjective "specific".  Instead it is a feeler to explore whether we can cut through several of the DNS problems that lie before us by invoking the adage that "every problem in computer science can be solved by adding another layer of indirection."


Updated January 13, 2001 - Copyright © 2001 by Karl Auerbach, All Rights Reserved