ktmatu
 

Sensitive Information in Log Files

Every action you take on the Internet is logged somewhere. These records don't necessarily reveal directly who you are or where you live, but they can nonetheless contain a lot of sensitive information about you, your behavior and your interests. In this note we'll show what kind of information is stored in a web server's logs when you surf the web, and how to minimize the amount of information you reveal about yourself.

The three most common ways to arrive at a new web page are links on other pages, links on search engine result pages, and bookmarks. Let's see what kind of information is logged by the web server at http://ktmatu.com when a user lands on this website using each of these methods:


ml10pc66.uta.fi - - [03/Jun/2001:19:49:12 +0300] "GET /info/ HTTP/1.0" 200 4487 "http://www.example.com/privacy-links/" "Lynx/2.8.3rel.1 libwww-FM/2.14"
ml10pc66.uta.fi - - [03/Jun/2001:19:49:12 +0300] "GET /info/ HTTP/1.0" 200 4487 "http://www.google.com/search?q=log+privacy" "Lynx/2.8.3rel.1 libwww-FM/2.14"
ml10pc66.uta.fi - - [03/Jun/2001:19:49:12 +0300] "GET /info/ HTTP/1.0" 200 4487 "-" "Lynx/2.8.3rel.1 libwww-FM/2.14"

Here we can see that someone at ml10pc66.uta.fi has requested the web page http://ktmatu.com/info/ using the Lynx browser. In the first case, the user clicked a link at http://www.example.com/privacy-links/. In the second case, the user searched for the phrase "log privacy" with the Google search engine. In the last case, the referrer is missing ("-"). This usually indicates that the URL was typed directly into the address field or selected from a bookmark file. Many log analysis programs, like Relax, make it easy to analyze which keywords were used in search engines.
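To make the fields concrete, here is a minimal Python sketch that splits one of the log lines above into its parts and pulls the search phrase out of the referrer. The regular expression and the function names parse_log_line and search_phrase are our own illustration, not part of Relax or any other tool, and the sketch assumes that the search engine puts the phrase in a query parameter named "q", as in the example line.

```python
import re
from urllib.parse import urlparse, parse_qs

# Pattern for the log format used in the example lines:
# host ident user [time] "request" status size "referrer" "user agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it doesn't match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def search_phrase(referrer):
    """Return the value of the 'q' parameter of a referrer URL, or None."""
    if referrer in ("", "-"):
        return None  # no referrer was sent
    values = parse_qs(urlparse(referrer).query).get("q")
    return values[0] if values else None


line = ('ml10pc66.uta.fi - - [03/Jun/2001:19:49:12 +0300] "GET /info/ HTTP/1.0" '
        '200 4487 "http://www.google.com/search?q=log+privacy" '
        '"Lynx/2.8.3rel.1 libwww-FM/2.14"')

entry = parse_log_line(line)
print(entry["host"])                       # ml10pc66.uta.fi
print(search_phrase(entry["referrer"]))    # log privacy
```

A webmaster needs nothing more than this to connect a computer name to the words its user typed into a search engine.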

In many cases only the IP address, like 153.1.14.166, appears in the logs instead of the more readable ml10pc66.uta.fi. The conversion is, however, an easy task with a tool like lrdns.
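The conversion itself is an ordinary reverse DNS lookup. The sketch below is not how lrdns works internally; it only shows the idea with Python's standard socket module. The function name hostname_for and its lookup parameter are our own, and the parameter exists only so that the lookup can be replaced with a stand-in when no network is available.

```python
import socket


def hostname_for(ip, lookup=socket.gethostbyaddr):
    """Return the host name for an IP address, or the address itself if none is found."""
    try:
        # gethostbyaddr returns (hostname, alias list, address list)
        name, _aliases, _addresses = lookup(ip)
        return name
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError
        return ip
```

Calling hostname_for("153.1.14.166") performs a real DNS query, so the answer depends on what the name server says at the time of the call.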

Usually the computer's name doesn't reveal much identifiable information, especially if your ISP is big (spider-fra-tc014.proxy.aol.com) and its IP addresses are allocated dynamically. The ISP can, however, connect this information directly to you, although ISPs generally don't reveal it unless you have been involved in illegal activities.

The broad geographic location of your computer is in many cases directly readable from the logs. A log entry like dialup-123.saintlouis1.example.net indicates that you quite likely live somewhere around St. Louis, Missouri, USA.

Sometimes computers are even named after their users, like mr-smith.example.net. In that case we can directly see the interests of Mr Smith. Obviously this exposes Mr Smith to many risks. Some evil-minded webmaster could even blackmail him.

Even more cryptic computer names like ml10pc66.uta.fi can reveal a lot about the user. Somebody who knows how the network is arranged at the University of Tampere, Finland, knows the exact location of this computer and its possible user.

To hide the computer's name we can try to use a proxy server provided by the ISP. In many cases this is even mandatory.


www-proxy.uta.fi - - [03/Jun/2001:19:49:12 +0300] "GET /info/ HTTP/1.0" 200 4487 "http://www.example.com/privacy-links/" "Lynx/2.8.3rel.1 libwww-FM/2.14"
www-proxy.uta.fi - - [03/Jun/2001:19:49:12 +0300] "GET /info/ HTTP/1.0" 200 4487 "http://www.google.com/search?q=log+privacy" "Lynx/2.8.3rel.1 libwww-FM/2.14"
www-proxy.uta.fi - - [03/Jun/2001:19:49:12 +0300] "GET /info/ HTTP/1.0" 200 4487 "-" "Lynx/2.8.3rel.1 libwww-FM/2.14"

Now the folks at ktmatu can only see that some user from the uta.fi domain has accessed the web site. All other sensitive information, like the referrer, is still there. There are also so-called transparent proxies. These servers don't hide the user's computer.

Proxy servers log all requests in much the same way as web servers do. If the proxy server is located at your employer's facilities, it's obviously not a good idea to use their computer to visit sites containing material that could put you in an awkward position. Your boss can see every page you have visited and analyze what kind of things you have been searching for in search engines!

Junkbuster is a free proxy server that is usually installed on the local PC. It can be configured to hide the referrer, remove cookies and unwanted banner ads, and misidentify the browser. It can also be chained to other proxy servers.
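The header part of this filtering can be pictured with a short Python sketch. It is not Junkbuster's code or configuration; the function name scrub_headers is hypothetical, and the request headers are assumed to arrive as a simple dictionary. The fake browser string is the one that appears in the log lines of this note.

```python
# The replacement browser string; the same one shows up in the example log lines.
FAKE_AGENT = "Mozilla/3.01Gold (Macintosh; I; 68K)"

# Header names (lower case) that are dropped from the outgoing request.
# User-Agent is dropped here too and then set again with the fake value.
DROPPED = {"referer", "cookie", "user-agent"}


def scrub_headers(headers, fake_agent=FAKE_AGENT):
    """Return a copy of the headers without Referer and Cookie and with a fake User-Agent."""
    cleaned = {name: value for name, value in headers.items()
               if name.lower() not in DROPPED}
    cleaned["User-Agent"] = fake_agent
    return cleaned


request_headers = {
    "Host": "ktmatu.com",
    "Referer": "http://www.google.com/search?q=log+privacy",
    "User-Agent": "Lynx/2.8.3rel.1 libwww-FM/2.14",
    "Cookie": "id=12345",
}
print(scrub_headers(request_headers))
```

Note that the header is spelled "Referer" in HTTP itself, which is why the code looks for that spelling.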


www-proxy.uta.fi - - [03/Jun/2001:19:49:12 +0300] "GET /info/ HTTP/1.0" 200 4487 "-" "Mozilla/3.01Gold (Macintosh; I; 68K)"
www-proxy.uta.fi - - [03/Jun/2001:19:49:12 +0300] "GET /info/ HTTP/1.0" 200 4487 "-" "Mozilla/3.01Gold (Macintosh; I; 68K)"
www-proxy.uta.fi - - [03/Jun/2001:19:49:12 +0300] "GET /info/ HTTP/1.0" 200 4487 "-" "Mozilla/3.01Gold (Macintosh; I; 68K)"

Now we can see that the referrer information has been stripped away. This means that ktmatu's webmaster cannot know which page you came from or which keywords you used in search engines. Stripping referrers and falsifying the browser can cause some problems, however. Some sites check referrers to prevent other sites from linking directly to their content, usually images and other graphics. Other sites may redirect you to Mac-optimized pages because the browser field states that you are using a Macintosh.
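The referrer check that causes the first problem can be sketched in a few lines of Python. This is only an assumed example of how such a site might decide whether to serve an image; the function name allow_image_request, the host list (including www.ktmatu.com) and the allow_missing switch are our own inventions for illustration.

```python
from urllib.parse import urlparse


def allow_image_request(referrer,
                        own_hosts=("ktmatu.com", "www.ktmatu.com"),
                        allow_missing=False):
    """Decide whether to serve an image, based only on the referrer."""
    if not referrer or referrer == "-":
        # No referrer at all: a strict site refuses, a lenient one serves.
        return allow_missing
    # Serve the image only when the linking page is on one of our own hosts.
    return urlparse(referrer).hostname in own_hosts


print(allow_image_request("http://ktmatu.com/info/"))                 # True
print(allow_image_request("http://www.example.com/privacy-links/"))   # False
print(allow_image_request("-"))                                       # False
```

A site that is strict in this way will refuse to show its images to anyone whose proxy strips the referrer, even when the request comes from the site's own pages.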

Many web site traffic counter services, like HitBox.com and TheCounter.com, gather a huge amount of information about your computer. Even the list of installed browser plugins ends up in their logs. Junkbuster makes it easy to stop this leak.

Sometimes it is very useful to be able to hide your real domain name, especially if it is directly connected to your organization's name. Visiting the web pages of a competitor or an acquisition target is better done anonymously. This can be done with external proxy servers like those provided by Anonymizer. Anonymizer not only hides the organization, referrer and browser, but also encrypts URLs so that even the ISP or employer cannot log which pages you have requested.


proxy20.anonymizer.com - - [03/Jun/2001:19:49:12 +0300] "GET /info/ HTTP/1.0" 200 4487 "-" "Mozilla/4.0  (TuringOS; Turing Machine; 0.0)"
proxy20.anonymizer.com - - [03/Jun/2001:19:49:12 +0300] "GET /info/ HTTP/1.0" 200 4487 "-" "Mozilla/4.0  (TuringOS; Turing Machine; 0.0)"
proxy20.anonymizer.com - - [03/Jun/2001:19:49:12 +0300] "GET /info/ HTTP/1.0" 200 4487 "-" "Mozilla/4.0  (TuringOS; Turing Machine; 0.0)"

As we can see, there is not much identifiable or sensitive information left.

 