Privacy Alerts - Data Mining

Data mining: turning America into one giant database

Have you ever wondered how they generate those "Books we think you'd like" lists on Amazon? It seems uncanny that they usually match your preferences so well, right?

Have you noticed the new "web history" feature on Gmail? It allows you to manage your browser history of searches and pages visited. If you are signed into Gmail while you are browsing, it logs all your visits. While you must "okay" the service, this is a peek into what's going on at Google: they still have this history whether or not you "okay" your management of it. This is not unique to Google. Assume that the other search engine/email services (Yahoo!, MSN, and AOL) use similar methods. Your clickstream is worth money, and it has research value that a lot of people would love to get their hands on.

Well, both of those occurrences are the result of a practice called data mining (also known as knowledge discovery in databases, or KDD). On the web, it grew out of tracking your information and search history while you are logged in to your account. Data mining is a fast-growing part of internet technology. Look for it to make up a greater and greater part of marketing in the future.
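To make the idea concrete, here is a minimal sketch of how a "you might also like" list could be built from purchase histories. It simply counts which items tend to be bought by the same customers. The data, the `recommend` function, and the counting approach are all illustrative assumptions; this is not a description of how Amazon or any other company actually does it.

```python
from collections import Counter

# Hypothetical purchase histories: each set is one customer's purchases.
purchases = [
    {"gardening basics", "composting at home", "bird watching"},
    {"gardening basics", "composting at home"},
    {"gardening basics", "bird watching"},
    {"composting at home", "sourdough baking"},
    {"gardening basics", "composting at home"},
]

def recommend(owned, histories, top_n=2):
    """Suggest the items that most often appear alongside ones already owned."""
    counts = Counter()
    for basket in histories:
        if basket & owned:                 # this customer shares an item with us
            counts.update(basket - owned)  # count what else they bought
    return [item for item, _ in counts.most_common(top_n)]

suggestions = recommend({"gardening basics"}, purchases)
print(suggestions)
```

The more purchase and browsing history a company holds, the better these simple counts become at guessing your tastes, which is why your history is valuable to them.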

So who has exactly what information on you?

Well, it varies from service to service. For example, a website you visit will have a lot less information on you than your internet service provider (ISP). Makes sense, right? Even for the same type of service, companies may have different privacy policies or user agreements. For example, many smaller ISPs make a selling point of not selling your clickstream (unlike their national-sized competitors) to ensure customer privacy. So it also varies from company to company.

Generally, here is what is available to each type of company:

Internet Service Provider:

They have your clickstream and can sell a "sanitized" version of it. They also have a record of your IP address, but this is only really a concern if law enforcement or a national security agency should try to wiretap you, or if you do something illegal online.
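What "sanitized" means in practice differs from company to company and is not spelled out here. As a rough sketch only, assume it means replacing your IP address with a scrambled stand-in and keeping only the name of each site you visited. The `SALT` value, the record layout, and the `sanitize` function below are hypothetical.

```python
import hashlib
from urllib.parse import urlsplit

SALT = b"example-salt"  # hypothetical secret value kept by the ISP

def sanitize(record):
    """Replace the IP with a salted hash and keep only the site's hostname."""
    ip, timestamp, url = record
    pseudonym = hashlib.sha256(SALT + ip.encode()).hexdigest()[:12]
    return (pseudonym, timestamp, urlsplit(url).hostname)

# Made-up clickstream records: (IP address, time, page visited).
raw = [
    ("203.0.113.7", "2007-12-19T14:40", "http://www.example.com/books?q=privacy"),
    ("203.0.113.7", "2007-12-19T14:42", "http://shop.example.org/cart"),
]
clean = [sanitize(r) for r in raw]
for row in clean:
    print(row)
```

Notice that both cleaned records still carry the same stand-in ID, so the visits can still be tied together as one person's trail even though the IP address is gone. That is one reason "sanitized" deserves its quotation marks.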

Browser:

The browser company (e.g. Microsoft for Internet Explorer) doesn't have access to your surfing information, unless "error" reports get sent to the browser company when you encounter a problem with a website. The browser does keep a cache on your computer, and although it doesn't get sent to the browser company, it may cause privacy problems if other people use your computer.

An important note: should spyware be installed on your computer, either maliciously or via a browser toolbar, your browsing history may be recorded.

Search Engine:

A search engine company has a non-identifying record of the keywords that a user searched. There is a huge industry built up around companies analyzing keywords used for product/information searches.

Search Engine + email account:

If you are signed into your email account while you are using the same company's search engine (Yahoo!, MSN, Google, and AOL), then they have a record of your email address and your searches. They also have your clickstream information while you are signed in.

Website:

A website will have a visitor's IP address.
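As an illustration, most web servers write one line to a log file for every page requested, and the visitor's IP address is the first thing on that line. The sketch below pulls the pieces out of a single made-up entry written in the widely used Common Log Format; the address, page name, and numbers are invented for the example.

```python
import re

# One made-up log line in Common Log Format.
line = ('203.0.113.7 - - [19/Dec/2007:14:40:00 -0500] '
        '"GET /data-mining.html HTTP/1.1" 200 5120')

# Fields: IP, identity, user, [time], "request", status code, response size.
pattern = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)$'
)

visit = pattern.match(line).groupdict()
print(visit["ip"], visit["time"], visit["request"])
```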

Website + website account:

A website that has a member or user section will have whatever information you gave when you signed up, plus a record of what you do on the site while you are logged in.

So, there is a ton of information out there, and it's being collected on you. If that concerns you, take steps to protect yourself. Here are some ideas.

  • Choose a smaller or regional ISP that won't sell your clickstream rather than a nationwide ISP.
     
  • Although it may affect access to some websites, you can change the privacy settings on your browser so that it does not accept cookies. Cookies are small text files that a website stores on your computer so it can recognize you on later visits.
     
  • Some search engines don't use cookies, don't collect search terms, and delete access logs after a certain period. Check out: http://www.scroogle.org/cgi-bin/scraper.htm
     
  • Don't sign in to your browser-based email account while you browse. Once you sign in, you stay signed in until you sign out yourself or a long period of time passes and it signs you out automatically.
     
  • Don't sign in to the website (e.g. Amazon) if you are looking up sensitive or really private materials.
     
  • An important side note: anything you view at work is accessible to others and is technically the property of your employer. (It's a bad idea to look up sexual harassment books on your Amazon account at work. Keep it at home.)
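For the curious, the cookie mechanism mentioned in the list above can be sketched in a few lines using Python's standard `http.cookies` module. The cookie name `visitor_id` and its value are hypothetical; real sites choose their own names and usually store a long random ID.

```python
from http.cookies import SimpleCookie

# First visit: the site asks your browser to store a small label.
outgoing = SimpleCookie()
outgoing["visitor_id"] = "abc123"      # hypothetical name and value
outgoing["visitor_id"]["path"] = "/"
header = outgoing.output()             # the line the site sends to the browser
print(header)

# Later visit: the browser sends the label back and the site reads it.
incoming = SimpleCookie()
incoming.load("visitor_id=abc123")
returning_visitor = incoming["visitor_id"].value
print(returning_visitor)
```

If your browser refuses cookies, the second step never happens, so the site cannot use this label to connect one visit with the next.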
