Friday, February 29, 2008

Idiom Crisis Continues

The Onion reports on the ongoing US idiom shortage. Since the report came out, the US Secretary of State has reminded Canada of its obligations under NAFTA to share our natural idiom resources, but the Maritimes and the far north, which have traditionally been the largest suppliers of such locutions, have been reluctant to reduce stocks and are reportedly hoping for higher world commodity prices. "It's not just the English-speaking countries," said Wally McGonigle, a spokesman for the guild of PEI idiom producers. "We've got English teachers all over the world banging down the door for this stuff. I'll be damned if we just open up our stocks and let a bunch of easterners get rich selling them to the States."

Tuesday, February 19, 2008

New 360-million-word American corpus

Mark Davies and the folks at BYU have just released their long-awaited BYU Corpus of American English (360+ million words, 1990-2007). This is the first large-scale balanced corpus of American English that is freely available. It is similar in design to the British National Corpus but over three times bigger. There was a time when the American National Corpus was going to fill this role, but in the past seven years, it has only managed to cobble together a meager 22 million words.

There are a number of changes I'd like made to the interface, such as adding the ability to search by word family on top of the currently implemented lemma and word searches, but overall, it's a lovely gift to the world. Thanks, Mark!

Tuesday, February 12, 2008


From the BNC:

the ... tendency of: 212
a ... tendency of: 15
the ... tendency for: 142
a ... tendency for: 257

I can't say I see a difference in meaning, but "a tendency of" does sound odd. I wonder why.
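
To put a number on how lopsided those counts are, here is a rough sketch in Python. The counts are the BNC figures above; the helper names (share_of, odds_ratio) are just made up for illustration, and the odds ratio is my own choice of summary, not anything the BNC interface reports.

```python
# Counts copied from the BNC figures above: (article, preposition) -> hits.
counts = {
    ("the", "of"): 212,
    ("a", "of"): 15,
    ("the", "for"): 142,
    ("a", "for"): 257,
}


def share_of(article, counts):
    """Fraction of '<article> ... tendency of/for' hits that use 'of'."""
    of_hits = counts[(article, "of")]
    for_hits = counts[(article, "for")]
    return of_hits / (of_hits + for_hits)


def odds_ratio(counts):
    """Odds of 'of' (vs. 'for') after 'the', divided by the same odds after 'a'."""
    the_odds = counts[("the", "of")] / counts[("the", "for")]
    a_odds = counts[("a", "of")] / counts[("a", "for")]
    return the_odds / a_odds


for article in ("the", "a"):
    print(f"{article}: {share_of(article, counts):.1%} of hits use 'of'")
print(f"odds ratio: {odds_ratio(counts):.1f}")
```

By this count, "of" accounts for roughly 60% of the hits after "the" but only about 6% after "a". These are raw pattern matches, so some of the hits may not be the construction in question at all.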