"... excepting prepositions and conjunctions, the most commonly used word in the 17.15 million separate searches was 'free.' If something isn't free, it better at least be 'new,' as that was the next-most common word. Excluding proper nouns, the next most popular words were 'lyrics,' 'county,' 'school,' 'city,' 'home,' 'state,' 'pictures,' 'music,' 'sale,' 'beach,' 'high,' 'map,' 'center' and 'sex.'"
A Text Mining and Media Measurement blog from Glenn Fannick, a Director of Product Development Management with Dow Jones & Co.
Wednesday, August 16, 2006
2.27 gigabytes of AOL data provide a treasure trove for data mining
A Lee Gomes article on WSJ.com (sub.) demonstrates a fun use of data mining, drawing on the recently released AOL search files to show what people are searching for on the Web.
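For anyone curious what this kind of count looks like in practice, here is a minimal Python sketch. It is not the method used in the article, just an illustration of tallying words across queries while skipping prepositions and conjunctions. The stop list, the `top_words` helper, and the sample queries are all made up for the example and do not come from the AOL files.

```python
from collections import Counter

# Hypothetical, deliberately tiny stop list (articles, prepositions, conjunctions).
# A real analysis would need a much fuller list.
STOP_WORDS = {
    "a", "an", "the", "and", "or", "but",
    "of", "in", "on", "for", "to", "at", "by", "with", "from",
}


def top_words(queries, n=5):
    """Return the n most frequent non-stop words across all queries."""
    counts = Counter()
    for query in queries:
        # Lowercase and split on whitespace; real query logs would need
        # more careful tokenizing (punctuation, URLs, and so on).
        for word in query.lower().split():
            if word not in STOP_WORDS:
                counts[word] += 1
    return counts.most_common(n)


# Made-up sample queries, not taken from the AOL data.
sample_queries = [
    "free music lyrics",
    "new free ringtones",
    "map of the county",
    "free pictures of the beach",
]

print(top_words(sample_queries, 3))
```

Excluding proper nouns, as the article also does, is a harder problem that a simple stop list like this one would not handle.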
anyone know where I can get this 2.27 gb of data?
AOL is no longer distributing it, not surprisingly. But chunks of it are swirling around the Web, and it's been mirrored in at least one place. News.com has an interesting article about the more interesting groups of searches.