Thanks for posting this.
The FTC analysis was an interesting
experiment - but be careful not to jump to too many conclusions.
For example, use of port 43 WHOIS data is
often the result of a two-phase search:
(1) Phase 1: find websites that are real,
i.e. qualify the lead.
(2) Phase 2: run a WHOIS search against the domain
name associated with the website.
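For readers unfamiliar with what a "port 43" lookup involves, here is a
minimal sketch in Python of a single WHOIS query for one domain name. The
function names are made up for illustration, and the default server
(whois.verisign-grs.com, the registry WHOIS server for .com) is an
assumption; other TLDs and registrars use their own servers.

```python
import socket


def build_whois_query(domain: str) -> bytes:
    """Build the request: the domain name followed by CRLF."""
    return domain.strip().encode("ascii") + b"\r\n"


def whois_lookup(domain: str,
                 server: str = "whois.verisign-grs.com",  # assumed server
                 port: int = 43,
                 timeout: float = 10.0) -> str:
    """Send one WHOIS query over TCP and return the raw text reply."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(build_whois_query(domain))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

Calling whois_lookup("example.com") would return the raw reply as text.
What the reply contains, and how it is laid out, varies from server to
server.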
Just creating a random domain name and
setting up WHOIS contact data will not necessarily pick up this usage unless
the website is established and real in the first place. There are other
techniques available as well, but they often leave a trace. The process
above can be done reasonably anonymously.
Registrars could provide data
on WHOIS usage by IP address, and this could show the amount of data
mining going on (after removing IP addresses from registrars checking WHOIS
for transfer authorisation purposes). That is, if WHOIS were being used as
intended, the number of queries would be close to the number of unique IP
addresses, but there are often high peaks from a few IP addresses.
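The comparison described above could be sketched roughly as follows in
Python. This is only an illustration: the input format (one source IP
address per logged query), the function name, and the "peak_factor"
threshold for calling an address a heavy user are all assumptions, not
anything a registrar is known to use.

```python
from collections import Counter
from statistics import median


def summarize_whois_usage(query_ips, registrar_ips=frozenset(), peak_factor=10):
    """Summarize WHOIS queries per source IP.

    query_ips     -- iterable of IP strings, one entry per logged query
    registrar_ips -- IPs to exclude (e.g. registrars checking transfers)
    peak_factor   -- assumed threshold: an IP is "heavy" if it made at
                     least peak_factor times the median query count
    """
    counts = Counter(ip for ip in query_ips if ip not in registrar_ips)
    total = sum(counts.values())
    unique = len(counts)
    # Close to 1.0 when most addresses make a single query.
    ratio = total / unique if unique else 0.0
    baseline = median(counts.values()) if counts else 0
    heavy = {ip: n for ip, n in counts.items()
             if baseline and n >= peak_factor * baseline}
    return {
        "total_queries": total,
        "unique_ips": unique,
        "queries_per_ip": ratio,
        "heavy_ips": heavy,
    }


# Made-up log using documentation-range addresses.
sample_log = (["192.0.2.1"] * 50
              + ["192.0.2.2", "192.0.2.3", "192.0.2.4"]
              + ["198.51.100.7"] * 20)
summary = summarize_whois_usage(sample_log, registrar_ips={"198.51.100.7"})
```

In this made-up log the queries-per-IP ratio is well above 1 and a single
address accounts for nearly all of the traffic, which is the kind of peak
that would suggest bulk querying rather than ordinary lookups.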
Note that what was picked up in the analysis
below is that when a real website is established, email addresses found on
that website are used.
Steve, interesting to read the Security and Stability
Advisory Committee recommendation on Whois. In relation to privacy you
state: "it is widely believed that Whois data is a source of e-mail
addresses for the distribution of spam". This may be a wide belief but
empirical evidence from the US Federal Trade Commission tells us
otherwise. See the last sentence of the note below in
To find out which
fields spammers consider most fertile for harvesting, investigators "seeded"
175 different locations on the Internet with 250 new, undercover email
addresses. The locations included web pages, newsgroups, chat rooms, message
boards, and online directories for web pages, instant message users, domain
names, resumes, and dating services. During the six weeks after the
postings, the accounts received 3,349 spam emails. The investigators found that:
- 86 percent of the addresses posted to web pages
received spam. It didn't matter where the addresses were posted on the
page: if the address had the "@" sign in it, it drew spam.
- 86 percent of the addresses posted to newsgroups received spam.
- Chat rooms are virtual magnets for harvesting
software. One address posted in a chat room received spam nine minutes
after it first was used.
Addresses posted in other areas on the Internet received
less spam, the investigators found. Half the addresses posted on free
personal web page services received spam, as did 27 percent of addresses
posted to message boards and nine percent of addresses listed in email
service directories. Addresses posted in instant message service user
profiles, "Whois" domain name registries, online resume
services, and online dating services did not receive any spam during the six
weeks of the investigation.