AOL Just Did the Unthinkable – Boycott AOL?

Blogging, Business, Software, Technology August 6th, 2006

(Updated)
Thank you, Google, for resisting the DOJ’s effort to obtain user search data. You put up a good fight to protect our privacy, and you won. Too bad it was all in vain.

AOL, in blatant violation of its users’ privacy, just released a log of three months’ worth of searches by 650,000 users. Not to the DOJ, but for open download by anyone. The claim:

“This collection is distributed for non-commercial research use only. Any application of this collection for commercial purposes is STRICTLY PROHIBITED”

Prohibited. Yeah, right. As if they could control it. The data is supposedly “anonymized”, which in AOL-speak means the screen name is replaced by a unique user number. Anyone even a little familiar with data mining knows what this means: all of a user’s searches stay linked together. Obviously, some commenters on the AOL blog have already put two and two together, “outing” certain users whose identity was easy to find based on their search patterns. I don’t even want to think what data mining pros will do with this.
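To make the point concrete, here is a minimal Python sketch of why swapping the screen name for a stable number is not real anonymization. The rows, the field layout, and the helper names `profile_by_user` and `users_mentioning` are all made up for illustration and are not taken from the AOL file; the only assumption carried over is that every query from one user keeps the same number.

```python
from collections import defaultdict

# Hypothetical, hand-written rows in the shape (user_number, query).
# None of this comes from the real AOL file.
LOG = [
    (1001, "jane example springfield"),
    (1001, "dentist near elm street springfield"),
    (1001, "how to hide debt from spouse"),
    (2002, "cheap flights to boston"),
    (2002, "weather boston"),
]


def profile_by_user(rows):
    """Group every query under the stable number that replaced the screen name."""
    profiles = defaultdict(list)
    for user_number, query in rows:
        profiles[user_number].append(query)
    return dict(profiles)


def users_mentioning(profiles, needle):
    """Return the user numbers whose query history contains the given text."""
    return sorted(
        user
        for user, queries in profiles.items()
        if any(needle in query for query in queries)
    )


profiles = profile_by_user(LOG)

# One "ego search" is enough to tie every other query under that number to a name.
for user in users_mentioning(profiles, "jane example"):
    print(user, profiles[user])
```

Nothing clever is going on here, and that is the problem: a plain group-by on the user number rebuilds a full search history, and a single self-identifying query attaches a name to all of it.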

AOL, you betrayed your users. If they are smart at all, they will boycott your services.

Update #1 (8/6): I’m going out on a limb here with this prediction: once they realize the magnitude of what they did (or, if they don’t, because of the PR nightmare), AOL will apologize, the finger-pointing will start, and heads will roll. They will remove the download link, but not before everyone who wanted the data has obtained it.

Update #2 (8/6): TechCrunch further elaborates on the “utter stupidity” of this move by AOL:

“The data includes personal names, addresses, social security numbers and everything else someone might type into a search box. The most serious problem is the fact that many people often search on their own name, or those of their friends and family, to see what information is available about them on the net. Combine these ego searches with porn queries and you have a serious embarrassment. Combine them with “buy ecstasy” and you have evidence of a crime. Combine it with an address, social security number, etc., and you have an identity theft waiting to happen. The possibilities are endless.”

Update #3 (8/6): The download link leads to a blank page. Perhaps AOL execs are waking up… I wish all my predictions (see the first update above) would materialize this fast. I wonder if there will be a black market for the “limited edition” downloaded dataset… eBay, anyone?

Update #4 (8/6): Dennis ponders the possible ramifications, partly based on our Skype IM conversation:

  1. Zoli estimates maybe 1,500-2,000 downloads by the time AOL woke up to what they’d done. What’s the real number?
  2. How long was the file in the wild?
  3. Could illicit copies end up on eBay?
  4. Could market data derived from the file end up on eBay or as part of a market intelligence offering? Almost certainly the second if not the first.
  5. What will be the impact on AOL’s stock price?
  6. Might shorters speculate on the impact?
  7. What about a class action lawsuit? For once I think there are decent grounds for one of the ambulance chasers to send out its hit squad – they may even get what they need from the file.

  8. Will AOL be able to track who got the file?
  9. What is the potential for wholesale identity theft among those 650,000 AOL users?

Update #5 (8/6): The last thing I expected was to find myself deleting comments; but this situation forced me to. A commenter provided a link to his site where he put up the file for anyone to download. I know the cat is out of the bag, and there will be several other sites, but at least I don’t want to actively promote making a bad situation even worse. Since I can’t edit comments, my only choice was to delete it.

Update #6 (8/7): ZDNet agrees: “People will be boycotting the company because of their blatent disregard for the privacy of users.”
The news is out on InfoWorld, as well as in mainstream news media all the way to Korea.

Update #7 (8/7): AOL responded by email to John Battelle, also quoted at SiliconBeat. The summary: Man, did we screw up.

Related posts:

  • AOL Releases Search Logs from 500,000 Users
  • AOL Research exposes data; we’ve got a little sick feeling
  • AOL releases search data on 500k users… and then tries to take it back
  • More AOHell
  • Forget The Government, AOL Exposes Search Queries To Everyone
  • AOL Gate: Search Query Data Scandal
  • You never had privacy anyway
  • What do people search AOL for? Now we know
  • AOL Shared Private Search Queries
  • AOL Search Data Launches World’s Biggest Experiment On Privacy Invasion
  • No, no, no, you weren’t supposed to tear down THAT wall

Technorati : AOL, DOJ, Google, boycott, data mining, privacy, search data, search logs



Reader's Comments

  1. Anonymous | August 6th, 2006 at 6:21 pm

    “Blatant violence”? Did you mean “Blatant violation”?

  2. Anonymous | August 6th, 2006 at 6:34 pm

    Oh, thanks for catching it, corrected

  3. Anonymous | August 6th, 2006 at 8:04 pm

    Reality check:

    1. Anonymized search logs are perfectly legal to distribute, and in compliance with all rules.

    2. Anonymized query logs are widely used in research, but researchers have to license them specifically from Google, Yahoo, etc., by working for or with them.

    3. Query logs are an invaluable commodity for the research community. Without them, all web research is like trying to design something without having any requirements to cater to (tailoring clothes without measurements!).

    4. Big players like Google and Yahoo DO NOT share this with the research community, using it internally to maintain their monopoly.

    Regarding comparisons with Google:

    1. Technically, the “Google Trends” service is also an exposed view of their search logs.

    2. Google Research also released internal data on the same day, for the same reason as AOL Research. Maybe you should write a blog post worshipping Google about this.

    3. Google not only studies what you search for, it also studies every bit of your email at Gmail, your calendar, and your IM conversations. Anyone who knows how to de-anonymize these search logs can also use AdSense to de-anonymize your email (using impression statistics).

    I myself am totally against AOL and its spammy habit of shoving its CDs at everyone. However, I’m also against ignorant Google fanaticism, hence this comment. Imagine the advances in search technology that this data will bring. Why should a few kilobytes of data hurt?

  4. Anonymous | August 6th, 2006 at 9:15 pm

    Yes, I think about 5,000 AOL heads will roll over this.

  5. Anonymous | August 6th, 2006 at 9:41 pm

    5000 + 1 ?

  6. Meltin' Posts | August 6th, 2006 at 10:26 pm

    AOL releases private search data

    AOL just released information about 20 million web queries from 650,000 users. They just changed usernames into random strings, but they kept the user-data association. TechCrunch makes the privacy implications very clear.
    Blogs are buzzing, AOL users are gett…

  7. The Paradigm Shift | August 6th, 2006 at 11:49 pm

    AOL Releases Google’s most prized Keyword List… Google is gonna get mega spammed.

    I’m shocked that AOL released this data,

  8. Castro's Blog | August 7th, 2006 at 12:52 am

    AOL Releases User Data

    TechCrunch is reporting that AOL has made available the search histories of 650,000 of their users. The user account name is replaced with an ID number, but as Michael Arrington correctly points out, there is often enough information in search

  9. greg hughes - dot - net | August 7th, 2006 at 2:27 am

    AOL screws the pooch – or at least about 650,000 of their own users

  10. Dan Appleman: Kibitzing and Commentary | August 7th, 2006 at 3:58 am

    Stunning Privacy Breach by AOL

    While most reports have commented on personally identifiable information in the queries, there’s a greater risk of identification due to the ability to link “questionable” queries to requests to government web sites.

  11. Anonymous | August 7th, 2006 at 4:07 am

    The real threat to privacy isn’t so much the personal information as the presence of timestamps. These potentially allow any query, and thus any user, to be traced back to an IP address, especially if government-owned sites are involved.

    See details at Stunning Privacy Breach by AOL.

  12. Agylen | August 7th, 2006 at 4:14 am

    Boycott AOL

    Zoli Erdos: “AOL, in blatant violation of its users privacy just released the log of 3 month’s worth of searches by 650,000 users. Not to the DOJ, but for open download by anyone.”
    Luckily, I’ve never used AOL for search. Almost…

  13. fredshouse | August 7th, 2006 at 4:22 am

    AOL discloses 650,000 AOL users’ search data

    Well this isn’t going to help AOL’s image. Over the weekend, AOL researchers posted a 400MB+ tarball of the raw search query data of some 650K AOL users over the period from March 1, 2006 to May 30, 2006. While…

  14. Anonymous | August 7th, 2006 at 6:08 am

    Apart from any ethical issues, AOL has breached its contract with its users. This disclosure contradicts AOL’s own privacy policy, which names search data as being part of a user’s network information, says that a user’s network information will only be disclosed as described in the privacy policy, and makes no mention of just publishing the data for public research. (There is a mention of using the data for researching use of the AOL network, but that’s not the same as letting the whole world do that).

    See:

    about.aol.com/aolnetwork/aol_pp

    Sean (www.prompt-communications.com)

  15. Anonymous | August 7th, 2006 at 7:57 am

    1) There will not be a black market for this data on eBay. Predictably, it is already mirrored and torrented.

    2) Searching for drugs or words with drug connotations is not a crime. I doubt there is even probable cause for any police force to get a warrant. At this point, I could type “buy ecstasy” and this page might be the top hit.

    3) Google has released 6 DVDs worth of 5-gram search terms. They will not give a hoot about this “large” dataset.

    4) AOL still has users?

  16. Christoph's Blog | August 7th, 2006 at 8:01 am

    Ver

  17. לינמגזין | August 7th, 2006 at 8:03 am

    AOL: No to the US Department of Justice, yes to everyone else

    In an unprecedented move, AOL has decided to release its subscribers’ activity log to the general public. The use of the information,

  18. Anonymous | August 7th, 2006 at 8:35 am

    time to short AOL stock!

  19. Anonymous | August 7th, 2006 at 8:54 am

    What Google did was release information informing everyone of how often everything is searched for – not who searched for it. There is absolutely no user-specific information available from Google’s (soon to be) published data, and it should come as no surprise if “porn” tops the list of words searched for. On the other hand, maybe some names are searched for often and it shouldn’t be too much of a surprise to see some in the list, but that doesn’t mean that Steve Jobs or Bill Gates are typing their own names into Google.

    Hey, even with the AOL data there’s still an amount of deniability, but it’s appalling that anyone should be put in the position of having to deny anything. And the idea of a “unique” user ID means that at some point, somewhere, AOL probably has a file that says how that corresponds to the user that did the search. If that file ever gets out, then all anonymity has been completely lost.

  20. Anonymous | August 7th, 2006 at 10:13 am

    It might well be that searching for drugs on AOL is not a crime, but after reading this: www.twopercentco.com/rants/archives/2006/08/drop_the_sudafe.html , I wouldn’t be so sure.

  21. Tech Recipes Blog | August 7th, 2006 at 10:51 am

    Don’t Blame Just AOL — The Bloggers are at Fault Too!

    AOL released a large database of searches that includes 20 million web queries from 650,000 AOL users. Even though they changed the AOL username to a random ID number, they did not filter the results in any other manner. Unfortunately, people’…

  22. IP Democracy | August 7th, 2006 at 10:54 am

    AOL’s Appallingly Bone-Headed Move

    Something happened over the weekend that I’m at a complete loss to explain. AOL released a list of over 20 million searches by 500,000 users. The online giant apparently did this for “research” purposes, although a key battle was won…

  23. Jimmy Daniels | August 7th, 2006 at 11:35 am

    AOL Releases Searchs From 500,000 Users

    Remember the big hubbub of the Government trying to get search data from Google and Microsoft last year? Well, apparently no one at AOL does, they just released search data from 500,000 users, they removed the AOL username, but just changed it to a ran…

  24. Anonymous | August 7th, 2006 at 11:43 am

    It was BAIT, you morons…

  25. Anonymous | August 7th, 2006 at 1:08 pm

    AOL will instead scapegoat the people who mirrored the data. They will use their PR team, dupe a government agency into denouncing the “dangerous” linkers, file a lawsuit, and drive the media to vilify as hackers anyone who dared mention *their* bungle.

    It’s been done before: corphq.livejournal.com/60599.html

    Kill the messenger, and all you get is quiet.

    As long as the public falls for this sort of distraction tactic, they will deserve the world of corporate secrecy and cover ups they get.

  26. Anonymous | August 7th, 2006 at 5:02 pm

    The information in the database is _not_ anonymized. There are unique IDs associated with each query, making it very possible to relate the identity of any given person in this database to their search queries. The table that relates the anonymous IDs to the users isn’t out in public – yet. But I guarantee that information exists somewhere. You better believe it can be subpoenaed if a law enforcement agency that gets its hands on this database decides to go on a fishing expedition. I hope no innocent people on this list were doing research on questionable subjects, or else they can say hello to “probable cause”.

  27. Anonymous | August 7th, 2006 at 5:30 pm

    Zoli – yawn. I knew it was coming all along because I am a Precog :-)

    read more at

    dealarchitect.typepad.com/deal_architect/2006/08/the_intention_e.html

  28. Anonymous | August 7th, 2006 at 10:18 pm

    Reporting about it is good. Distributing it to the public is bad–as bad as what AOL did. In fact, it is what AOL did. Yes, the data’s already out there, but that doesn’t mean the users no longer have rights. Personally, I say boycott everyone who intentionally distributes this data. (e.g., Slashdot)

  29. Anonymous | August 8th, 2006 at 2:37 am

    Agree, in fact that’s why for the very first time I had to delete two comments – they were pointing to mirror sites.

  30. Anonymous | August 8th, 2006 at 3:22 am

    Hell, AOL won’t be able to stop it. By now, pretty much everyone on the internet has seen it.

    www.aolsearchdatabase.com if you haven’t

  31. DP's Security Bits | August 8th, 2006 at 5:22 am

    AOL apologizes for privacy leak

    America Online posted a file containing three months of anonymized search queries of 658,000 users…

  32. Anonymous | August 10th, 2006 at 4:19 am

    I wonder if you can now search AOL Search and find the search logs online? How ironic would that be? The search engine actually showing you where to find something that they don’t want you to find…

  33. Anonymous | September 4th, 2006 at 3:43 am

    These bastards should be shot for this; no one in their position should make a mistake like this.

  34. Anonymous | September 22nd, 2006 at 8:12 pm

    counter-reality check:

    1. not in my country they aren’t (the Netherlands), unless you told the user you would *and* had business needs in keeping the data in the first place

    2. the govt. asked for them and AOL complied. all products of American govt. research are in the public domain, last time i heard. is this AOL’s not-so-subtle way of making sure the govt. complies with that?

    3. this is not a few kilobytes but a basic violation of trust. AOL deserves to die over this.

  35. Anonymous | October 10th, 2006 at 8:06 am

    AOL users are too dumb to boycott