Last year, I blogged about the first English Proficiency Index here. Education First has now released its second index, and they're getting some good publicity out of it (I'm linking to them). The same problems as last year remain: self-selection of participants and the need for a computer with internet access to take the test. There's also no data on test reliability and no validity arguments. Still, with 1.7 million participants over three years in more than 50 countries, you wouldn't want to dismiss the results completely either.
Monday, October 29, 2012
Thursday, October 25, 2012
Another new and potentially useful site:
Extensive Reading Central is a not-for-profit organization dedicated to developing an Extensive Reading and Extensive Listening approach to foreign and second language learning. It was started by Dr. Rob Waring of Notre Dame Seishin University, Okayama, Japan and Dr. Charles Browne of Meiji Gakuin University, Tokyo, Japan as a free service to the EFL community.
Tutela.ca (“Service”) is a Canadian not-for-profit online repository and community for ESL (English as a Second Language) and FSL (French as a Second Language) professionals registered to use the Service (“Users”). As a repository, the Service provides Users with access to ESL and FSL materials including classroom materials, lesson plans, assessment information, and reusable learning objects. As a community, the Service enables Users to share materials, discover new approaches, locate solutions, and network including through the use of the online meeting and webinar conferencing capabilities of the Service (“Conferencing”). The Service is supported by funding from Citizenship and Immigration Canada (“CIC”) and is owned and operated by Citadel Rock Online Communities Inc. (“Citadel”). (my bold)
I'm not really sure how this works. Tutela itself is not for profit, but Citadel is a privately owned corporation, for profit as far as I can tell. It has received a number of grants from the federal government. Just so as you know...
Saturday, October 20, 2012
As I wrote yesterday, there are some strange tagging decisions concerning determinatives in this corpus. It seems, though, as I added in an update, that these are largely the fault of the Part-of-Speech Tagging guidelines for the Penn Treebank Project.
The problems, though, are not limited to determinatives. Subordinators are also affected. The words that, whether, and if (e.g., They told me that it was OK and They asked me whether/if it was OK) are tagged as _ADP_ (short for ADPOSITION, a more inclusive term for prepositions). The guidelines say:
"We make no explicit distinction between prepositions and subordinating conjunctions. (The distinction is not lost, however - a preposition is an IN that precedes a noun phrase or a prepositional phrase, and a subordinate conjunction is an IN that precedes a clause). The preposition "to" has its own special tag TO."
This makes good sense for words like because, after, and since, which have been treated as "subordinating conjunctions" but really are prepositions and function as the heads of preposition phrases. It doesn't work, though, with that, whether, and if, which function as markers of subordination and not heads. Consider the difference between these two clauses:
Friday, October 19, 2012
As Ben Zimmer blogged yesterday, there's a new and improved version of the Google Ngram viewer. The "improved" bit has a number of elements, but one is POS tagging. This is a wonderful thing, and I'm inordinately happy about it. Unfortunately, there are some very odd quirks to deal with.
Thursday, October 18, 2012
After 20 years of teaching English, I've only recently become consciously aware that my students may be applying the complementation pattern of one word to its derivationally related partner. That is, they regularly say and write things like *it influenced on the situation, using a prepositional phrase complement headed by on where the verb should really just take a direct object. Notice, though, that the noun influence licenses the on PP complement that the students are using (e.g., it had no influence on the outcome).