Fun With AOL Search Data

I am having tons of fun slicing and dicing the AOL search data. AOL customers’ lost privacy is my gained entertainment!

At some point I’ll probably suck it into a SQL database. Until then I’m left with grep, sort, cut, uniq and wc. Here are some of the highlights from my (very unscientific) experiments:

  • Busiest AOLer: ID 71845 made a whopping 279,430 queries. Probably a hacked account, as the 2nd place winner has only 8k queries.
  • Longest Query: 522 characters of Spanish or Portuguese text from ID 7372603 (“fazendo” suggests Portuguese). Anyone care to translate? (Look for “fazendo”)
  • Number of queries containing both “cancel” and “aol”: 1,321
  • Top 5 “sucks” queries:
    1. yoko ono sucks (34)
    2. life sucks (27)
    3. survivor sucks (22)
    4. aol sucks (19)
    5. wife sucks (18)
  • Number of queries containing social security numbers: 190
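
For anyone who would rather not chain grep, sort, cut, uniq and wc, here is a rough Python sketch of the same kind of tallies. It is not the pipeline used for the numbers above. It assumes the data files are tab-separated with the user ID in the first column and the query text in the second, plus a header row starting with "AnonID". The function name `summarize` and the matching rules (plain substring match for "cancel" and "aol", whole-word match for "sucks") are my own guesses at reasonable definitions, so the counts it produces may differ from the ones listed.

```python
from collections import Counter


def summarize(lines, top_n=5):
    """Tally a few statistics from tab-separated search log lines.

    Assumed layout (not confirmed by the post): column 0 is the user ID,
    column 1 is the query text, and the header row starts with "AnonID".
    """
    per_user = Counter()   # number of rows per user ID
    sucks = Counter()      # queries that contain the word "sucks"
    cancel_aol = 0         # queries mentioning both "cancel" and "aol"
    longest_id, longest_query = None, ""

    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or fields[0] == "AnonID":
            continue  # skip the header and malformed rows
        anon_id, query = fields[0], fields[1].lower()

        per_user[anon_id] += 1
        if len(query) > len(longest_query):
            longest_id, longest_query = anon_id, query
        if "cancel" in query and "aol" in query:
            cancel_aol += 1
        if "sucks" in query.split():
            sucks[query] += 1

    return {
        "busiest": per_user.most_common(1),
        "longest": (longest_id, len(longest_query)),
        "cancel_and_aol": cancel_aol,
        "top_sucks": sucks.most_common(top_n),
    }
```

To try it, open one of the data files (with `errors="replace"` in case of odd bytes) and pass the file object straight to `summarize`. Note that this counts rows, so a query that appears on several rows is counted several times.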

Apparently Yoko Ono sucks more than life itself. Wow. That’s something to be proud of. The SSN issue is truly scary. Some of those queries also contain full names, addresses, and driver’s license numbers. If AOL thought their “AnonID” was protecting people’s privacy, they were quite naive.
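
For the curious, a count like the SSN one can be approximated with a simple pattern match. The sketch below is an assumption about what counts as a social security number (three digits, two digits, four digits, separated by hyphens), not the exact pattern behind the 190 figure. The names `SSN_PATTERN`, `looks_like_ssn` and `count_ssn_queries` are made up for illustration. It will miss numbers typed without hyphens and will flag any other number that happens to have the same shape.

```python
import re

# Assumed shape: 3 digits, 2 digits, 4 digits, hyphen-separated,
# and not embedded inside a longer run of digits.
SSN_PATTERN = re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)")


def looks_like_ssn(query):
    """Return True if the query contains something shaped like an SSN."""
    return SSN_PATTERN.search(query) is not None


def count_ssn_queries(queries):
    """Count how many queries contain an SSN-shaped number."""
    return sum(1 for q in queries if looks_like_ssn(q))
```

A pattern this loose is only a first pass; anything it flags still needs a human look before calling it a real SSN.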


