CaveBear Blog: Thoughts on whois and privacy

April 2, 2003

Thoughts on whois and privacy

It is time for ICANN/IANA to squarely face the question of privacy in the DNS whois database.

Various people whose judgment I value [M. Mueller, B. Fausett] have suggested that ICANN/IANA may finally get to the issue of privacy.

The ICANN Board is establishing a "President's Standing Committee on Privacy" (why the committee is possessed by ICANN's "president" and not the Board is something we can deal with at another time and another place).

Privacy is a hard question. It is a matter that pervades all aspects of information handling. It would be entirely inappropriate, and ultimately futile, to try to deal with privacy as an after-the-fact adjustment to the existing DNS whois system. It is necessary to examine the most fundamental questions - such as what reasons, if any, justify there being a whois database at all.

This note contains thoughts on how we might try to deal with these questions in a principled way.

We need a framework to structure our thoughts as we try to answer the question of whether there ought to continue to be a whois database in its present form. Fortunately, much work was done on privacy frameworks in the United States during the 1970s and later in Europe. Today we have had more than two decades of experience with the principles that came out of that work. Those principles have been found to be sound.

These principles are not absolutes - privacy is a balance between competing rights. Nor is the balance fixed for all places and times. Privacy is affected by cultural and social values that vary with time and place.

Since privacy is contextual, for the purposes of this note I am using a contemporary Euro-American point of view.

Many of the privacy principles in the various privacy frameworks are concerned with letting the data subject know that his/her data is being collected and ensuring that the data subject can check the data for accuracy. I would suggest that we come back to these principles later, after we deal with the ultimate question of whether whois data should be disseminated at all.

For purposes of this discussion there is one privacy principle that stands out from the rest:

Principle: Personally identifiable information should be used only for those purposes for which it was collected.

Because whois is a running system that has evolved from the days when the internet was largely a friendly club of techies who knew one another, we do not have the opportunity to clearly comprehend the purposes for which personal information was included in whois when whois began. Early documents, such as the 1974 ARPAnet Directory, reflect the collegial nature of the net in those days. Whois in those days was very much like a club's membership list.

Because the history of whois is not of much help, we are forced to look at the uses to which whois data is put today and ask which uses are an essential part of the purposes for which that information was disclosed and which uses are simply excrescences.

The approach I will use to answer that question is to ask: what is the understanding of the purpose of the information disclosure in the mind of the person when he/she acquires a domain name? I will infer a more expansive reading of that purpose when the person is engaged in an informed, arm's-length transaction; I will infer a less expansive reading when the person is typically less informed and has little, or no, negotiating power beyond simply walking away.

Personally identifiable information in whois is obtained directly from the data subjects as the result of the data subject acquiring a domain name. The disclosure is made because the DNS registrar demands that information as part of the price of obtaining a domain name.

Do we care whether this disclosure of personally identifiable information is a voluntary act? Certainly in the gross sense, the disclosure is voluntary - the person could choose not to obtain a domain name and thus not have to make the disclosure of his/her information. However, we ought not to focus our inquiry on the crude question of whether the disclosure is voluntary. Rather, we should try to comprehend the forces that drive a person to feel that they must accept the proposition that registration of a domain name requires the disclosure of their personally identifiable information.

Were the internet a trivial bit of fluff with very little relevance to the ability of a person to act as a meaningful part of the social fabric, then I would have no trouble concluding that those who disclose private information are doing so as part of a fair bargain for communications services.

However, the internet is increasingly becoming a utility, a necessary part of daily life. A domain name is increasingly becoming an important part of establishing an empowered role on the internet and in society. A person who wishes to establish a presence on the net beyond the extremely limited presence of an e-mail address or an ISP-hosted "home page" is virtually compelled to obtain his/her own domain name.

Moreover, many aspects of DNS registration contracts, including the obligation to disclose personal information, are not negotiable. Contract terms are established industry-wide by ICANN for the vast bulk of DNS registrations. The exceptions are the country-code TLDs, which are frequently available only to residents or citizens of the country associated with the ccTLD. In addition, ICANN's reluctance to create new general TLDs has further limited the diversity of contract choices available to those who wish to obtain a domain name.

In other words, the rules under which a person parts with his/her personally identifiable information are almost never subject to negotiation - they are a take-it-or-leave-it proposition. And because of the social utility of having a domain name, the person is strongly compelled to accept these terms.

Thus, when we come to the question of the purpose for which DNS information is obtained, we ought to take the narrow perspective; we ought to look at the minimal set of uses that are necessary to enable the DNS registrar to successfully deliver the service for which the data subject has parted with his/her personal information.

In that context, the use for which the information is disclosed is to give the registrar enough information to contact the person for purposes of consummating the registration (including billing for charges incurred) and for periodically renewing the registration.

If, as I suggest, the data subject's intention is for the private information to be used only to achieve the registration of a domain name, then, by logical extension, the disclosure is not intended to benefit third party trademark holders or anti-spam advocates.

My conclusion therefore is that when people part with their personally identifiable information during the acquisition of a domain name, their expectation is that such disclosure is solely to facilitate the acquisition and periodic renewals. It is equally part of my conclusion that the disclosure is not intended to benefit trademark owners or anyone else.

The broader conclusion that I draw is that, consistent with the privacy principle enunciated previously, personally identifiable information disclosed as part of the acquisition of a domain name ought to be used exclusively to accomplish and maintain the registration of the domain name. And further, that it would be contrary to the intended purpose of the information to disclose it to any third parties without the data subject's express and informed consent.

In other words, the whois data, for purposes of privacy, is for the use of registrars for the sole purpose of servicing the data subjects in their role as customers of those registrars.

So, where does this leave trademark holders and anti-spam folks? Certainly these people have the need to track down those who are impinging on their legitimate rights. But why should that interest automatically supersede the data subject's interest in the privacy of information that he/she disclosed for the sole purpose of acquiring a domain name? The answer is simple: it doesn't.

As I mentioned, trademark people who feel that their marks have been violated and anti-spammers who believe that they have been abused do have rights. But these are not rights to access whois data; rather, these are rights to invoke processes that may result in a controlled and limited opening of that whois data.

Thus, for example, the trademark owner who believes his/her mark has been abused should be required to demonstrate that there is reason to believe that a particular accused domain name is the source of that abuse. After successfully making this demonstration, whois may be opened for the limited purpose of permitting the trademark owner to confront the person in control of that domain name.

There are many, particularly those who obtain free and unlimited whois access under today's regime of zero-privacy, who will complain that being forced to make a preliminary showing before obtaining access to whois will be too slow and expensive. My answer to the matter of speed is simply that access to whois data based on nothing more than a mere accusation is an invitation to abuse. I believe that a magisterial process is necessary to determine whether the putative injured party has something more substantial than a bald accusation. My answer to the matter of cost is simply to build recovery of costs into the remedy for successful vindication of a trademark owner's rights. In fairness, however, the accused should recover his/her costs should the trademark owner's accusation fail.

So, in summary, it is my belief, based on established principles of privacy, that the existing whois system should be terminated. It should be replaced by systems of records that exist as private data between a name registrant and the registrar and which are used solely to promote the relationship between the registrant and the registrar. (I am intentionally avoiding delving into the split personality manifested by the ICANN-mandated system in which a cloud of front-office registrars envelops a back end database "registry" operator.)

In parallel to this closed whois system there would need to be established a fast and inexpensive magisterial process. Anyone who believes that their rights are being violated would be required to make a minimal demonstration that such a belief is supported by a reasonable amount of concrete evidence. Upon making such a showing, the requested whois records would be disclosed, but only for the limited purpose of further processes to resolve the dispute. I am not here dealing with questions of the nature of that magisterial process. I am not dealing with questions such as whether the data subject has the right to receive notice and the right to present a rebuttal. However, whatever the process, it is necessary that the accusing party fully identify itself, and the fact that such a process occurred and the name of the accusing individual ought, as a matter of fairness, to be available to the data subject.
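For readers who think more easily in code, here is a rough Python sketch of the arrangement just described: private records, a preliminary showing, a purpose-limited disclosure, and a log that the data subject can inspect. Every name in it (ClosedWhois, DisclosureRequest, LimitedDisclosure, the magistrate callable) is hypothetical and invented for illustration. The magistrate in particular is only a placeholder for a human judgment whose nature this note deliberately leaves open.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DisclosureRequest:
    requester: str   # the accusing party, fully identified
    domain: str      # the accused domain name
    evidence: tuple  # concrete evidence offered in support of the accusation


@dataclass(frozen=True)
class LimitedDisclosure:
    domain: str
    registrant_contact: str
    permitted_purpose: str  # the only use the requester may make of the data


@dataclass
class ClosedWhois:
    # Private records held between registrant and registrar: domain -> contact.
    records: dict
    # Every request, granted or refused, is recorded for the data subject.
    audit_log: list = field(default_factory=list)

    def request_disclosure(self, request, magistrate):
        """Open one record only if the requester is identified and the showing succeeds.

        `magistrate` is a hypothetical stand-in for the human review step:
        any callable that takes the request and returns True or False.
        """
        if not request.requester.strip():
            # An unidentified accuser is refused before any review takes place.
            granted = False
        else:
            granted = request.domain in self.records and bool(magistrate(request))
        self.audit_log.append((request.domain, request.requester, granted))
        if not granted:
            return None
        return LimitedDisclosure(
            domain=request.domain,
            registrant_contact=self.records[request.domain],
            permitted_purpose="resolving the stated dispute",
        )

    def requests_about(self, domain):
        """What the data subject may see: who asked about the name, and the outcome."""
        return [(who, granted) for d, who, granted in self.audit_log if d == domain]
```

In this sketch a bare accusation with no evidence would be turned away by any reasonable magistrate, yet it would still appear in the log, so the registrant can learn who has been asking about the name.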

This magisterial process necessarily involves humans and human judgments - it thus has real costs. Because the value of the system comes from its mere existence as well as from specific events, the question of equitably distributing the costs is complex and beyond the scope of this note.

Posted by karl at April 2, 2003 1:23 AM