LA Times writer Brendan Buhler took TV newswriters to task Sunday for their overuse of the phrase "get a handle." OK. I hadn't noticed such a growing phenomenon, but no matter.
He backed up his claim by running a LexisNexis search on the phrase over five years and said its use is rising every year. ("It was in 3,504 stories in 2004, nearly 700 more than 2000.")
To me, this is manufacturing truth where none exists through a quick, careless use of text mining. I have two concerns:
1) What was the context of these references? I searched "get a handle" in Factiva and found several mentions of that phrase in an oft-cited direct quote ("Firefighters were able to get a handle on this early on," said Capt. Jason Neuman of the California Department of Forestry and Fire Protection.) Does that make the phrase more common, or is it just a function of the phrase being replicated by the distribution of AP wire copy?
2) Did Mr. Buhler account for any changes in the universe of publications and/or documents over that time period? The number of mentions in one year versus another needs to be compared to the total documents in each year. When I ran the "get a handle" search in Factiva's top 50 U.S. Newspapers (a more controlled group) and then compared it to all documents each year in that group, I found the rate of mentions of the phrase rather flat year on year.
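To make the second point concrete, here is a minimal sketch of the normalization I have in mind: divide each year's mentions by the total documents searched that year. The function name `mention_rate` and every number below are hypothetical placeholders made up for illustration; they are not results from LexisNexis or Factiva.

```python
def mention_rate(mentions, total_docs, per=10_000):
    """Return mentions per `per` documents for each year.

    mentions   -- dict of year -> number of stories containing the phrase
    total_docs -- dict of year -> total stories in the same search universe
    """
    rates = {}
    for year, hits in mentions.items():
        rates[year] = per * hits / total_docs[year]
    return rates


# Placeholder counts, invented for illustration only (not real search results).
mentions = {2000: 200, 2004: 300}
total_docs = {2000: 100_000, 2004: 150_000}

rates = mention_rate(mentions, total_docs)
for year in sorted(rates):
    print(year, rates[year])
```

In this made-up case the raw count rises by half while the rate per 10,000 documents does not move at all, which is the pattern a growing database can produce on its own.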
Ah. Lies, damn lies and statistics.
Monday, October 10, 2005
Lies, damn lies and text mining statistics