Education First's English Proficiency Index

Last year, I blogged about the first English Proficiency Index here. Education First has now released their second index, and they're getting some good publicity out of it (I'm linking to them). The same problems as last year occur: self-selection of participants and the need for a computer with internet access to do the test. There's also no data on test reliability and no validity arguments. Still, with 1.7 million participants over three years in more than 50 countries, you wouldn't want to completely dismiss the results either.

Another new and potentially useful site:
Extensive Reading Central is a not-for-profit organization dedicated to developing an Extensive Reading and Extensive Listening approach to foreign and second language learning. It was started by Dr. Rob Waring of Notre Dame Seishin University, Okayama, Japan and Dr. Charles Browne of Meiji Gakuin University, Tokyo, Japan as a free service to the EFL community.


Tutela has been in beta for about a year but had its official launch during the TESL Canada conference. From their terms of use: (“Service”) is a Canadian not-for-profit online repository and community for ESL (English as a Second Language) and FSL (French as a Second Language) professionals registered to use the Service (“Users”). As a repository, the Service provides Users with access to ESL and FSL materials including classroom materials, lesson plans, assessment information, and reusable learning objects. As a community, the Service enables Users to share materials, discover new approaches, locate solutions, and network including through the use of the online meeting and webinar conferencing capabilities of the Service (“Conferencing”). The Service is supported by funding from Citizenship and Immigration Canada (“CIC”) and is owned and operated by Citadel Rock Online Communities Inc. (“Citadel”). (my bold)
I'm not really sure how this works. Tutela itself is not for profit, but Citadel is a privately owned corporation, for profit as far as I can tell. It has received a number of grants from the federal government. Just so as you know...

More Google Ngrams 2.0 POS tagging

As I wrote yesterday, there are some strange tagging decisions concerning determinatives in this corpus. It seems, though, as I added in an update, that these are largely the fault of the Part-of-Speech Tagging Guidelines for the Penn Treebank Project.

The problems, though, are not limited to determinatives. Subordinators are also affected. The words that, whether, and if (e.g., They told me that it was OK and They asked me whether/if it was OK) are tagged as _ADP_ (short for ADPOSITION, a more inclusive term for prepositions). The guidelines say:
"We make no explicit distinction between prepositions and subordinating conjunctions. (The distinction is not lost, however - a preposition is an IN that precedes a noun phrase or a prepositional phrase, and a subordinate conjunction is an IN that precedes a clause).
The preposition "to" has its own special tag TO."
This makes good sense for words like because, after, and since, which have been treated as "subordinating conjunctions" but really are prepositions and function as the heads of preposition phrases. It doesn't work, though, with that, whether, and if, which function as markers of subordination and not heads. Consider the difference between these two clauses:

Language Learner Literature Award Winners

Somehow I missed the announcement, but the 2012 LLL Award Winners for books published in 2011 have been announced.

Google Ngrams 2.0 and POS tagging

As Ben Zimmer blogged yesterday, there's a new and improved version of the Google Ngram viewer. The "improved" bit has a number of elements, but one is POS tagging. This is a wonderful thing, and I'm inordinately happy about it. Unfortunately, there are some very odd quirks to deal with.

Impacting (on) complement choice

After 20 years of teaching English, I've only recently become consciously aware that my students may be applying the complementation pattern of one word to its derivationally related partner. That is, they regularly say and write things like *it influenced on the situation using a prepositional phrase complement headed by on where the verb should really just take a direct object. Notice, though, that the noun influence licenses the on PP complement that the students are using (e.g., it had no influence on the outcome).

An adverb caught modifying a noun

This morning, CBC Toronto's Metro Morning program was being broadcast remotely from the new arts and culture hub of Regent Park. Shortly before 6:00, I believe, the host, Matt Galloway, said something along the lines of "We're here at the opening officially of the new arts and culture hub." I'm going from memory, but I'm certain about the opening officially of....

What we have here is an adverb, officially, postmodifying a noun, opening. Most grammars will tell you adverbs aren't supposed to do that. It even sounds vaguely obscene. But I tell you, it's the truth.

Now, obviously, opening is deverbal, but you could say the official opening of... with the expected modification by an adjective, and you couldn't say *the officially opening of... with the adverb before the noun. More importantly, an example like *the running quickly of the race seems highly unlikely, so being deverbal isn't the whole deal.

You may think that officially must be modifying something else in the sentence, but since this is between the noun and the modifying of phrase, it's clearly internal to the noun phrase and the most obvious thing for it to be modifying in there is the head noun. When they post the podcast tomorrow, I'll try to find the exact sentence. I think you'll find that there's really nothing else in the sentence that officially could possibly have been modifying.

[Update: the podcast is now available, but it's only 25 minutes of highlights, and apparently interesting grammatical features don't qualify. A Google search, though, turns up a number of relevant instances elsewhere.]

"Common" idioms

In the new issue of TESOL Journal is an article by Fatima Alali and Norbert Schmitt that discusses teaching idioms. "Because many idioms occur relatively rarely, each idiom’s frequency was checked in the British National Corpus to make sure it was relatively common" (p. 158). I've listed below the first 10 idioms used along with their frequencies in the BNC and COCA. For each corpus, I present the frequency per million words in the genre where the individual idiom appears most frequently. Sometimes this was spoken, but it varied a lot from idiom to idiom. With verb-based idioms, the frequency includes all forms of the verb.

[pay] the piper
[make] no bones about it
[mend] fences
life and limb
ivory tower
down in the dumps
[dwell] on the past
[cast] a long shadow
run of the mill
off the hook
Note. All frequencies are per million words.
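
For anyone who wants to run the same kind of check, here is a minimal sketch of the arithmetic: normalize each genre's raw count to a per-million rate, then keep the genre with the highest rate. The function names and the counts in the example are invented for illustration; they are not BNC or COCA figures.

```python
def per_million(count, corpus_size):
    """Normalize a raw count to occurrences per million words."""
    return count * 1_000_000 / corpus_size


def peak_genre(genre_counts):
    """Return (genre, rate) for the genre with the highest per-million rate.

    genre_counts maps a genre name to (raw count, genre size in words).
    """
    rates = {g: per_million(c, n) for g, (c, n) in genre_counts.items()}
    best = max(rates, key=rates.get)
    return best, rates[best]


# Invented counts, for illustration only (not real corpus data).
example = {
    "spoken": (12, 10_000_000),
    "fiction": (30, 16_000_000),
    "academic": (3, 15_000_000),
}
print(peak_genre(example))
```

Normalizing first matters because the genre sections of a corpus differ in size, so raw counts alone can't be compared.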

Do these strike you as "common"?

The paper concludes, "formulaic language is an important component of discourse," a point with which I take no issue, but this is not the same as saying that particular formulas are important. The study makes a number of good points, but the focus, the teaching of idioms, is one that needs a rethink, not so much in how it's done but in why it is done at all.

Past posts on idioms here.

Mark Davies' new academic word lists

Mark Davies, over at Brigham Young University, has developed some fantastic corpus-based resources. His most recent project, along with Dee Gardner, is a set of academic word lists (notice the plural).

In the comparison they provide with Coxhead's 2000 AWL, they claim,
our word lists provide better coverage of academic English. The 570 "word families" in the AWL cover 7.2% of the words in the COCA academic texts, but the top 570 word families in our list cover 14.0% -- nearly twice as much. In a "neutral" corpus -- the 32 million words of academic and semi-academic texts in the British National Corpus -- the AWL covers 7.1% and our list covers 14.0% -- again nearly twice as much.
I haven't had an opportunity to look at these carefully, but at first glance, this seems like a very unfair comparison. It seems that part of the way they have achieved the very high coverage rate is to include some very frequent words, words like between, low, need, difference, use. 
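
To make "coverage" concrete, here is a rough sketch of how such a figure can be computed: the percentage of running words (tokens) in a text that appear on a given list. This is my own simplification, not Davies and Gardner's or Coxhead's procedure. It matches exact word forms rather than word families, and the sample sentence and word list are made up.

```python
def coverage(tokens, word_list):
    """Percentage of tokens that appear on word_list (case-insensitive)."""
    if not tokens:
        return 0.0
    listed = {w.lower() for w in word_list}
    hits = sum(1 for t in tokens if t.lower() in listed)
    return 100 * hits / len(tokens)


# Made-up example: 4 of the 10 tokens are on the list.
sample = "the analysis of the data shows a need for analysis".split()
print(coverage(sample, {"analysis", "data", "need"}))
```

Because every token counts, a handful of very frequent words on a list can raise its coverage far more than hundreds of rarer ones.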

Coxhead's list is built on top of West's General Service List, which is to say that it only includes words not already listed in the roughly 2,000 words of the GSL. As a result, more very frequent words are excluded (although others, such as area, which are very common but were excluded from the GSL because they had significant semantic overlap with another word, give the AWL an undeserved coverage boost). The approach taken by Davies and Gardner doesn't have any frequency ceiling at all. Rather, they have chosen to consider any word that occurs at least 1.5 times more frequently in the academic subcorpus of the COCA than in the other sections of the corpus. This captures words that have something of an academic proclivity, but a number of those words are decidedly everyday vocabulary.
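
As I understand the criterion, it amounts to a ratio test on normalized frequencies. The sketch below is my reading of it, not the authors' code: I'm assuming "1.5 times more frequently" means a per-million ratio of at least 1.5, and the counts and corpus sizes in the examples are invented.

```python
def per_million(count, corpus_size):
    """Normalize a raw count to occurrences per million words."""
    return count * 1_000_000 / corpus_size


def is_academic(acad_count, acad_size, other_count, other_size, threshold=1.5):
    """True if the academic rate is at least `threshold` times the rate elsewhere.

    Assumption: the criterion is a simple ratio of per-million rates.
    """
    acad_rate = per_million(acad_count, acad_size)
    other_rate = per_million(other_count, other_size)
    if other_rate == 0:
        # The word never occurs outside the academic section.
        return acad_rate > 0
    return acad_rate / other_rate >= threshold


# Invented counts: a very frequent word can pass the test easily.
print(is_academic(90_000, 100_000_000, 160_000, 400_000_000))  # ratio 2.25
print(is_academic(50_000, 100_000_000, 180_000, 400_000_000))  # ratio about 1.11
```

Nothing in a test like this looks at how frequent the word is overall, which is why everyday words can qualify.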

On the other hand, the new lists distinguish between lexical categories. That is to say that the noun use of a word may be considered academic while its verb use is not. Similarly, the lists bring word families together while still distinguishing between different members. Thus, under the headword move, neither the noun nor the verb move is academic, but movement is.

This is quite a different approach, and will take some time to evaluate. I'm looking forward to trying though.

PS, the lists have been published minus every fifth word as a sort of embargo until a paper describing the lists can be published.

Grammatically speaking and number

I have often been critical of the "Grammatically Speaking" columns put out occasionally by TESOL, but the most recent is, I think, clear, accurate, and interesting. It looks at the question of why we say zero degrees instead of zero degree, something I've brought up here before.

Schmitt points out that the terms plural and singular may be misleading, and suggests singular and nonsingular. This, I think, is a useful approach.

As for his brain teaser, these usually strike me as fairly obvious, but this time, I have no idea what the issue is. Any insight?

Look at the two example sentences below. Explain what grammatical holdover they illustrate and suggest how a teacher might teach this to a language class.
  1. Here's why working at home is both a curse and a blessing.
  2. In particular, Biden cited the billions of dollars in government financial support for U.S. automakers during the recession as an example of the differing approaches between the parties.

Effectiveness of LINC programs

It's rare to find a study looking at overall program effectiveness. But Citizenship and Immigration Canada has conducted just such a study of their Language Instruction for Newcomers to Canada (LINC) program, and the results are interesting. They look at a variety of elements including costs, intake type, and provision of daycare. What interested me most, though, was the language learning outcomes, as displayed in the graph below.

2012 Language Learner Literature Awards Finalists

The Extensive Reading Foundation today announced the finalists for the 2012 Language Learner Literature awards. They are reproduced below the jump.

Some of these look well worth adding to the library. In particular, I'd like to check out Arman's Journey by Phillip Prowse and Solo Saxophone by Jeremy Harmer. Why we need yet another graded reader version of Call of the Wild, A Christmas Carol, and The Great Gatsby, though, I'm not sure. Oxford, Penguin, and Heinle all have versions of Call of the Wild, for example, and I'm sure there are other publishers with their own simplifications.

Thneedvillians' Guide to Zen of Pronunciation

If you are reasonably familiar with stuff like martial arts and flower arrangement, you may have heard about the three stages of mastering the art: Follow, Break, and Separate. Yes, it all sounds vague like most eastern ideas tend to do, but it is quite simple, really.

First, try learning what is given to you as it is: obey your master. Only then comes a stage where you realise, 'Hey, but this doesn't quite work with my body/in this context/etc.' - that's when you break the rules your master gave you, to suit your needs.

Then you live a comfortable art life for a long while - until you suddenly look at yourself doing the art from a distant, objective point of view. You are struck by the realisation that it is not you doing the art: you and the art are one, and the one just happens - and it is fun. No particular, prescribed methods need to exist for you or anyone. You are separate from the whole you-art thing, while making it happen by doing whatever is needed. Now that's mastery.

Wow. Do I sound like a Zen monk now?

So, while Brett made a fair point for stage 2 (or even 3?), I think teaching the basic, often exaggerated, pronunciation is the right thing to do for stage 1. This is especially the case with θ/ð because English uses other sounds close to them. I will show you what I mean:

The above is a portion of the IPA chart, where all the humanly possible speech sounds are represented. I have circled the relevant fricatives English adopts. Isn't it insane? English uses eight consecutive humanly possible fricatives! Compare, say, Japanese, where you start with the leftmost ɸ and then swiftly and discreetly jump to s (with a lot less air friction, I might add). 

So this is what learners of English face: learn to distinguish between f and θ, between θ and s, and so on, in such cramped space. What little difference there is needs to be duly magnified and presented to them. 

In fact, it is not just for learners: English speakers in general do need to do an exaggerated pronunciation of the right kind from time to time. Here is someone making her point (taken from a BBC programme 'Question Time'):

And here is a boy from the film 'The Wall':
'... in THE classroom ...'
I could go on, but I think I have made my point. Finally, though, I do admit that there is such a thing as too much exaggeration. Here is Britney Spears doing an L:
I could think of reasons for doing that, but in general, I will not recommend that. I do show it to my students though, because it's fun. And now I have done it to you. I hope you had as much fun; welcome to stage 3. 

Spacing around punctuation

For years, my students wrote their assignments on paper with a pen or pencil. These days, I ask that almost everything be submitted electronically. One upshot of this is that I've learned that students either don't know or don't care about spacing around punctuation marks. Many students--typically, they tend to be writers of non-roman scripts--put no space after sentence-final punctuation, commas, semi-colons, or colons. Others put a space before these marks. I've seen open parentheses with a trailing space but no leading space and the mirror image on the other end. When everything was written by hand, these inconsistencies weren't obvious, but now they're stark.

The thing that frustrates me is that use tends to be inconsistent. The same student will write period space here, but space period there. If it were simply transfer from the first language (Japanese and Chinese, for example, don't employ spaces), then you'd expect it to be consistent. You'd also think it would be quite easy to overcome, but despite my best efforts, many students continue the behaviour. The message I get is that they simply don't care. While I fight against believing this, it does get disheartening. If they can't even learn to space correctly around punctuation (if I can't even figure out how to get them to do this), then why should I believe it will be different with other aspects of their language?
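
Since everything now arrives electronically, the patterns described above are at least easy to flag automatically. Here is a rough sketch using regular expressions. The check names and the sample sentence are my own, and the patterns are deliberately crude: the "no space after" check will also flag things like e.g. and web addresses.

```python
import re

# Each check pairs a label with a pattern for one spacing problem.
CHECKS = [
    ("space before mark", re.compile(r"\s[.,;:!?]")),
    ("no space after mark", re.compile(r"[.,;:!?](?=[A-Za-z])")),
    ("space after open parenthesis", re.compile(r"\(\s")),
    ("space before close parenthesis", re.compile(r"\s\)")),
]


def spacing_problems(text):
    """Return a sorted list of (position, label) for each problem found."""
    found = []
    for label, pattern in CHECKS:
        for match in pattern.finditer(text):
            found.append((match.start(), label))
    return sorted(found)


sample = "I like tea ,but not coffee.Do you ( really ) agree?"
for position, label in spacing_problems(sample):
    print(position, label)
```

A report like this might at least show a student that the same mark is spaced two different ways in a single assignment.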

That's why they call it "krazy"

On this package of "skin guard" instant crazy glue, there is copy explaining that it features "delayed adhesion to skin." The warning, however, states that it "bonds to skin instantly." So, although the adhesion to skin is instant, it is a delayed instant?

When you say "thneed"

Yesterday, we went to see The Lorax. I must say I felt rather conflicted about the plot. A. O. Scott captures what was wrong with the movie well when he writes in the New York Times,
"Despite its soft environmentalist message “The Lorax” is an example of what it pretends to oppose. Its relationship to Dr. Seuss’s book is precisely that of the synthetic trees that line the streets of Thneedville to the organic Truffulas they have displaced. The movie is a noisy, useless piece of junk, reverse-engineered into something resembling popular art in accordance with the reigning imperatives of marketing and brand extension."
Despite watching with unease, I got goosebumps at the end, perhaps best described as plastic goosebumps, leaving me feeling manipulated.

So what does this have to do with English? Not much except that at the beginning, when the townsfolk of Thneedville come out to sing the town anthem, there's a close up of tongues extending well beyond their respective teeth, flapping in the wind as their owners draw out the initial th /θ/ of Thneedville.

Learners of English often have trouble with this sound and are often told to do as the Thneedvillians do, to stick out their tongue. Of course, it is possible to make the /θ/ sound with your tongue stuck out, but that's not how we usually do it, and it's probably not good pronunciation advice.

Speaking of pronunciation, Ron Thomson of Brock U is conducting a survey about teaching it.
 "We are seeking participants who are English language teachers, Speech Language Pathologists, or others with a university degree in an unrelated field to complete a survey examining beliefs and practices about second language pronunciation learning and teaching. The survey will require approximately 30 minutes of your time. After completing the survey, you may choose to enter a draw for ONE of FIVE $50 gift cards to To take the survey, click on the following link: or email for further information."

The myth of FANBOYS redux

The very first post I ever published on this blog was about the myth of FANBOYS, and it is still the most popular I've ever written. Now an expanded version of this post is available through the TESL Canada Journal.

Open access publishing

I'm on the editorial advisory board of the TESOL Journal, a newish journal that is freely available to anyone with a TESOL membership and which otherwise sells articles at the price of $US 35 for 24 hours of online access (+ $4.55 tax). The EAB will be meeting in March at the TESOL convention. Unfortunately, though, I will not be able to attend. Nevertheless, I have put forward a number of proposals for the board to consider:

Proposal 1:
Open access (OA) publishing is a growing trend with more and more schools adopting policies to provide open access to faculty-produced research. In 2009, for example, MIT adopted an open access policy under which faculty grant the school "nonexclusive permission to make available his or her scholarly articles and to exercise the copyright in those articles for the purpose of open dissemination" (source). Harvard has a similar policy, as do many other schools.

Currently, my understanding is that TJ is published by Wiley-Blackwell and that copyright of the contents is held by TESOL. According to MIT's website, Wiley-Blackwell's licensing agreement with its authors is incompatible with MIT's OA policy and, I would assume, those of other schools. To publish in TJ, then, authors would be required to opt out of their respective schools' OA policies. 

The reality is that more and more authors are refusing to do so. You are no doubt aware of the growing boycott of Elsevier over OA among other issues. And an individual example close to our own field is that of Kai von Fintel, editor of the journal Semantics and Pragmatics.

Given this situation, I propose that the TJ board pass a resolution supporting the rights of authors to freely post at least their final manuscript (postprint, after peer review, before typesetting) in open access repositories without any embargo (such as having to wait for 24 months before making the OA version available).

Proposal 2:
When an author writes a book, typically they retain copyright. When an author publishes a journal article, typically they give up copyright. 

Given that, I propose that rather than requiring authors to vest copyright with TESOL, we allow them to retain copyright. 

Proposal 3:
Copyright is based on the premise that reproduction and distribution should be disallowed as the default option. Other licensing options exist, though, and allow for more flexible distribution options.
Given that, I propose that TJ be published under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License.

Proposal 4:
Science should be reproducible, and often the only way to make it so is to make the study's raw data available. With online publishing, technical barriers to sharing raw data have mostly been removed. 

Given this situation, I propose that TJ strongly encourage authors to make their raw data available through TJ for other researchers to use; that, if they choose not to do so, they provide a written explanation; and that this information be considered in deciding whether to accept or reject a paper.

iBooks Author straightjacket

I downloaded iBooks Author, considering using it to publish teaching materials. The license agreement, however, is rather a barrier. Particularly this section:

B. Distribution of your Work. As a condition of this License and provided you are in compliance with its terms, your Work may be distributed as follows: (i) if your Work is provided for free (at no charge), you may distribute the Work by any available means;
(ii) if your Work is provided for a fee (including as part of any subscription-based product or service), you may only distribute the Work through Apple and such distribution is subject to the following limitations and conditions: (a) you will be required to enter into a separate written agreement with Apple (or an Apple affiliate or subsidiary) before any commercial distribution of your Work may take place; and (b) Apple may determine for any reason and in its sole discretion not to select your Work for distribution. 

So, if you just plan to make your work freely available (like this blog), that's fine. But if you want to sell it at all, then you've got to create it, submit it to Apple, and hope that they will sell it for you. If they refuse, well, you can always give it away.

The exercise is not the game

In football, coaches will put cones on the ground and ask you to dribble the ball around them. This is supposed to improve your accuracy and fluency, but nobody believes that the purpose of this drill is to get better at dribbling around cones. Everybody understands that the purpose is a transfer of skills to a similar but different situation in a real football game.

Things are not so clear when it comes to the teaching of writing. It's pretty typical for writing textbooks and writing teachers to make claims like: "There are two ways of organizing a compare/contrast essay: the common traits method or the similarities/differences method."  Some of them might admit that there are many ways but then present "two of the most common" or some hedge to that effect. What students typically understand from this is: this is how you play the game.

Here's what I tell my students they really mean: When you're practicing to be a better writer, sometimes following a formula or copying a structure is a useful exercise. This simplifies things for you by allowing you to focus on certain elements and ignore others. Don't confuse the exercise with the game though. It's not common for academics, journalists, bloggers, or other self-directed writers to produce five-paragraph compare/contrast essays using "the similarities/differences method." Neither should this be your goal.

Economist wants to teach English

The "Johnson" blog at The Economist is looking for ideas about how the newspaper can make itself more useful to English (language?) teachers.

How do you even write a book like that?

The Sisters Brothers by Patrick deWitt is a fantastic read. Or at least it is so far. The incongruity between the character of the brothers, two hired killers in the 1850s, and the formality of their dialogue is jarring but somehow fully appropriate.
'What's the matter?' asked Charlie, leaning up on his elbow beside the fire.
'A horse.'
'Where is the rider?'
'There is no rider that I can see.'
'If the rider appears, you may wake me.' He turned and fell back asleep. (pp. 76-77)