Monday, April 25, 2016

levee, dyke, embankment

Embankment station District Circle roundel

I'm often told (by Brits) that Americans are prudes when it comes to language. And I can often demonstrate the error or hypocrisy in their claim. Tidbit/titbit is one I've covered here so far. Another one is levee, which an Englishman informed me is used because Americans don’t want to say (AmE) dike/(BrE) dyke for a built-up bank to prevent the overflow of water.

So let me count out my objections to his claim:

1. Levee has been used in North America since the 18th century. (Orig. AmE) dyke has only been slang for (or a hyponym of) 'lesbian' since the 20th century. So, Americans definitely didn't start saying levee to avoid association with lesbians.

2. If you're an American like me, you primarily know levee from Don McLean's song American Pie, where it is a convenient rhyme for Chevy. (This is at the top of the 'Lynne's most hated songs' list. I hope I haven't earwormed you with it. My day is ruined.) I think of it as a Louisiana thing (it is used all the way up the Mississippi River), and that's where it came into English, from French. (That part of the continent came into the US with the 1803 Louisiana Purchase.) It may be more common now that people have heard it more in the news because of extreme weather in the Gulf States, but I still think of it as a vaguely regional term, rather than pan-American.

3. I only really knew the word dike from the story of the Little Dutch Boy. Where/when I grew up, I'd've called it a little dam (because we weren't put off by the homophony with damn either!) But note the spelling. The main American spelling of this thing is dike, whereas the 'lesbian' sense is usually spelled/spelt dyke, which Merriam-Webster lists as 'chiefly British variant of dike'. So, in printed form, at least, the 'taboo' sense and the 'built up bank by the river' geographic (or is it architectural?) sense are a bit more linked. Living in the 'gay capital of Britain' near a place called Devil's Dyke, I can tell you that the British are aware of and amused by the punning potential. In that sense, though, it tends to be for a natural feature, not an artificially built-up place by a river.

4. It’s not like the British are freely going about saying dike for the meaning 'levee'. They tend to prefer the word embankment for such things. Who are the prudes now?

(The green (more BrE) bits above were added after first posting.)

Monday, April 18, 2016

more on polite words and maths

It's been too long since I've posted here. And it will be a bit longer still. I'm currently in the US doing a tour of dictionary archives as part of my research for some on-going projects. Since I have limited time here, I'm working in the evenings to prep for the days in the archives.

But I did want to let you know about some podcasts that I don't think I've mentioned here on the blog.

Helen Zaltzman, for her Allusionist podcast, interviewed Rachele de Felice and me about our research on please and we had such a good time talking that we just kept on doing it. So, on her following podcast, she included some of our discussion about thank you.

Click here for Allusionist 33: Please

Click here for Allusionist 34: Continental (including thank you)

For more of me talking about polite words, click on the 'politeness' tag at the bottom of this post.

And this was longer ago, but I also appeared on the Relatively Prime podcast talking (again) about maths. Click here for that. 

Upcoming talks:
The Boring Conference, 7 May, London (sold out, sorry! but thrilled to be boring enough to be chosen for it)
Society for Editors and Proofreaders 27th annual conference, 11 Sept, Birmingham.

Another post will come soon-ish!

Monday, March 21, 2016

hay fever and allergies

I suffer. I do. At this point, the pollen people tell me it's alder trees. But it's always something.
Alder catkins, via Wikipedia

I complained about this on Facebook last night with the status "Hay fever? Already?" and this led a former (British) student, now working in New York, to ask:
They don't really say that here do they? More just 'allergies' in general.
I grew up with hay fever in Upstate New York, and much of my family suffers, so I'm used to hearing the phrase in American English. But, of course, I had to look it up.

I found on the Corpus of Global Web-Based English more mentions of hay fever in Britain than America and more of allergies in North America than in Britain. But allergies wins overall in both countries. Of course, allergies can refer to more than just pollen allergies, so that's not totally surprising.

(The darker the blue in these tables the more a phrase is associated with a particular country in this corpus. The raw numbers can't be directly compared because the sizes of the sub-corpora for each country differ, but the US and GB sets are roughly the same size.)

But looking at Google Books gives a different story:

This shows hay fever as peaking earlier in the US (around the 1940s) and later in the UK (1970s), but not more common in BrE than in AmE. It also shows the rise of allergies--earlier in the US than in the UK. I feel like I use allergies a lot these days because I'm never really sure what I'm sneezing out. But I do seem to be sneezing for most of the year.

So, it looks like the US is leading a change to allergies over hay fever, but this little exercise does demonstrate that a lot depends on the make-up of the data you're using.

If it's not hay fever I have, then perhaps it is THE DREADED LURGY

Friday, March 11, 2016

good morning

Being a parent has opened my eyes to differences I probably wouldn't have otherwise noticed. Not so much because of interactions with my English child, but because of the situations in which I see English parents. I have already noted the well done/good job divide, which was very apparent at preschool level. Nowadays, I have to interact with other parents while taking Grover to school (in BrE, I'm doing the school run).

In the 500 meters/metres between our house and the school, we face a constant stream of parents (known and slightly known) heading in the other direction. (Yes, we're always among the last to arrive. Neither G nor I are morning people.) And, minus conversation between Grover and me about who has the smallest hands in her class, here's approximately how the school run went:

Evie's dad*:  Good morning.
Me:  Hello!
Rosie's dad: Morning!
Me: HELLo!
Somebody's (BrE) mum: G'morning!
Me: helloooooo
Me: Hello!
Teacher at the gate: Morning!
*These people may have actual names. I may even know some of them. But your own name shrivels in relevance when you are a parent.

I said the only hellos and everyone else said a variation on good morning. I've two things to say about that:

  1. Hello originated in the US in the early 19th century, and though the British use it plenty (--as adverb, mostly AmE) these days, I wonder if in Britain it may retain a tinge (just a [AmE] smidgen! a tiny, tiny, tiny bit!) more of its etymological link with surprise. Oh, hello! Hallo, halloa, hullo were British, but came a bit later than hello in AmE--first OED cite is by Charles Dickens--a year before he started travel(l)ing in the US. Hello only really got going as a greeting after the invention of the telephone, and that spread its use to the UK and elsewhere. For more on its forms and etymology, see the Online Etymology Dictionary.

  2. I feel like, where I'm from (western NY state), one only really says good morning right after someone gets out of bed. It's something you say to people who are still in their pajamas/pyjamas, before they've had their coffee. When it's directed at me by members of my family (for it's only usually your family who sees you in your (AmE) pj's/(BrE) jim-jams), one hears a good dose of sarcasm, as in "Isn't it nice of you to join the waking world three hours after the rest of us got up?".  I might be able to imagine a telemarketer saying good morning to me on the phone, and I see people using it to start the day on social media, but I doubt I'd hear it much from colleagues or people I pass on the street.

    I tweeted about this this morning, and I've had some Americans agree that good morning is something you say only to people with noticeable (orig. AmE) bedhead (from Arizona, New Mexico, [?] Sussex), and others not (all in the midwest: Illinois, Iowa, Missouri). I was willing to bet there would be regional variation in this--but Midwest wasn't a region I was betting on. (I lived in central Illinois for five years, and I don't recall feeling affronted or surprised by many people's good mornings, but I was a (AmE) grad student, so maybe I only got up in the afternoons.)  Many aspects of manners are more 'British-like' in the US South, and in areas where there's a lot of Spanish, there might be (what linguists call) interference from buenos dias. But since the people agreeing with me come from very Spanish-influenced areas, perhaps not. The New Mexico tweeter summed up how I'd react:

I started this post when it was still morning, but now it's not, so I've moved on to thinking about good day. If I hear it in my head, it's in a sort of brusque RP accent. Good day, old chaps!  But when I look for it with punctuation on either side in the Corpus of Contemporary American English, I find it occurs at a 4-times-greater rate than in the British National Corpus. (The Corpus of Historical American English tells us it's been dying out since the 19th century. Perhaps hello is to blame--though good day is used for both 'hello' and 'goodbye'.) This is a lesson for those who insist that such-and-such a word is "used by Americans/Britons because I can hear the accent in my head". Your head is unreliable.  (This was the subject of an online debate I had recently--which I'll probably blog about soon.) Our preconceptions about our language can be a lot stronger than our factual knowledge about it.

I'll leave you with this, which is now stuck in my head, and which my mother used to sing in some perverse effort to make me less grumpy in the morning. You can imagine how well that worked on teenage me.

Tuesday, February 16, 2016

lengthy, hefty

Did you know that lengthy is not only an Americanism, but a much-protested one? Early on in its life, lots of American patriots used the word; John Adams seems to have coined it, and Thomas Jefferson, Benjamin Franklin and (though English) Thomas Paine all used it. But here's what they thought of it in the Philadelphia magazine The Port folio (1801):
 [Lengthy] is a vicious, fugitive, scoundrel and True American word. It should be hooted by every elegant English scholar, and proscribed from every page.
Port folio, though published in the US, was "remarkable chiefly for close adhesion to established English ideas" [Henry Adams]. The authors complained that if lengthy makes sense, then so must breadthy, but since no one's saying breadthy, that shows how ridiculous lengthy is.

They didn't like it in England either (from the OED):
1793   Brit. Critic Nov. 286   We shall, at all times, with pleasure, receive from our transatlantic brethren real improvements of our common mother-tongue: but we shall hardly be induced to admit such phrases as that at p. 93—‘more lengthy’, for longer, or more diffuse.
At some point in the 19th century, the British (and everyone else) seem to have stopped minding it. While some still note it as an Americanism, some authors use it without comment:

From the OED
Nowadays, it seems to be used by the British even more than by Americans (from GloWBE):

None of the style guides on my shelf even mention it, except for Fowler's (3rd edn, by Robert Burchfield, 1996), which says "not a person in a thousand would regard it as anything other than an ordinary English word." To quote their definition of it, it is not simply a synonym for long but 'often with reproachful implication, prolix, tedious'. It was a useful word, so people used it.

I was thinking about the 'we don't have breadthy' anti-lengthy argument. We don't. But we do have weighty, which goes back to the 1500s. It doesn't just mean heavy (for "languages abhor absolute synonyms just as nature abhors a vacuum"--Cruse 1986:270); it has additional implications, usually of importance or seriousness. One suspects that the authors of the Port folio complaint noticed weighty but decided to (orig. AmE) keep it under their hats.

And then there's hefty, which the OED considers to be 'originally dialectal and US'. I like the word hefty and the noun heft to mean 'weight', which the OED marks as 'dial. & U.S.'. They seem slightly onomatopoetic to me. I can imagine exhaling 'hft' as I lift something with heft.

Again, according to the web-English corpus GloWBE, the 'American' adjective hefty gets more hits in Britain (1,954) than in America (1,366) in corpora of about 387 million words each. The noun heft is a bit more common in the US (224 v 200). What's remarkable about all that is that the word hefty is first cited in 1867, more than 100 years after the first use of lengthy. By the turn of the 20th century, English writers are using hefty, and no one's commenting on it as being an Americanism as they did for lengthy. Did acceptance of lengthy make hefty non-controversial? I don't know, but I found it interesting.
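
Because the two sub-corpora are about the same size, the raw hits can be compared directly, but turning them into a rate per million words makes the comparison explicit. Here is a minimal sketch in Python, assuming the round figure of 387 million words for each sub-corpus; the name per_million is only illustrative and has nothing to do with the GloWBE interface.

```python
# Approximate size of each GloWBE sub-corpus in words (the round figure quoted above).
CORPUS_SIZE = 387_000_000

# Raw hit counts quoted in the post, as (US, GB).
hits = {
    "hefty": (1366, 1954),
    "heft": (224, 200),
}

def per_million(count, corpus_size=CORPUS_SIZE):
    """Convert a raw hit count into occurrences per million words."""
    return count / corpus_size * 1_000_000

for word, (us, gb) in hits.items():
    print(f"{word}: US {per_million(us):.2f} per million, GB {per_million(gb):.2f} per million")
```

On these figures hefty comes out at roughly 3.5 per million words in the American data and 5 per million in the British, while heft is well under one per million in both.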

Still, there's no heighty and no breadthy. Go on. Start using them. I dare you. 

Thursday, February 11, 2016

double contractions

In the last post, I looked at of instead of have after modal verbs--as in should of gone and might of known--in contrast to the more standard spelling of the contraction 've: should've gone, might've known.  As we saw there, the of spelling was more prevalent in British online writing than American.

I promised then to look at what happens after negation. Here are the options (sticking with contracted have):
could not 've      could not of
couldn't 've       couldn't of
Again, I'm looking for these in the GloWBE corpus of English from the web. When I search for the of variants, I have to specifically search for a verb after the of in order to block out things like of course or of necessity, where the of isn't standing for have.

The full not versions in the first row of the table offer no surprises. Just as with the modals, there are more of spellings in the British than in the American (126 v 86).

The double-contracted versions in the bottom row get a bit more attention because I've been wanting to investigate the prevalence of double contractions, like n't've and 'd've. I use them quite a bit in writing and often get comments on them, so I've wondered if they're a more American thing. It's important here to remember that we're talking about writing, not speech. I'm not wondering if people say couldn't've--they do. I'm wondering whether they're (orig. AmE) ok with writing it.

First, the expected news: the of variants are more common in BrE, just as they were in the non-negated data. 85 American occurrences v 170 BrE. Here's the top of the results table:

As you can see, some verbs show greater numbers with AmE, but this is to be expected because the numbers are small and because some of the verbs are used more in AmE than BrE--like figured, which is cut off the table. What's most important is the fact that the British total is twice as high as the American.

Is that just because BrE uses the present perfect (the reason for the have/'ve/of in these verb strings) more than AmE does? If that were so, we'd expect for the 've form to be more typical of British too, but that's not the case:

The tables in the previous post make this case more strongly, since here we have the complication of whether people avoid writing double contractions. To test this a bit further, I've looked for another double contraction: 'd've, as in If I knew you were coming, I'd've baked a cake.

This table is a bit confusing because I searched for *'d 've. The 'd is supposed to be separated from the word before in the corpus, but obviously that didn't happen all the time. So, the first line includes all the I'd'ves and other things, and the lower lines are other items that hadn't been input in the corpus in the right way and aren't included in the first line. It looks like the British part of the corpus suffers a bit more from bad coding of double contractions. So, looking at the 'total' line at the bottom, there are more AmE double contractions, but not that many more: 67 versus 60.

Looking again at whether of is used instead of 've, it's still more British (59 total) than American (26 total) after 'd. Here's the top of the list:

So, it's not looking like British writers avoid double contractions all that much more than American writers--unless writing of instead of 've is part of an avoidance strategy. 

I found it interesting that in the sheet music pictured above (and in more than one version of it), it has been printed with a space before the 've. That's another solution--and perhaps that was more common in earlier days? The corpus would not distinguish between the space-ful version and the space-less one.

And on that note:

Thursday, February 04, 2016

might of, would of, could of, should of

A few years ago, The Telegraph ran an article about Americanisms on the BBC--or rather, an article about complaints about Americanisms on the BBC:
Nick Seaton, Campaign for Real Education, said: “It is not a surprise that a few expressions have crept in but the BBC should be setting an example for people and not indulging any slopping Americanised slang.”
(Tangent: I had to look up slopping, which doesn't seem to be used much as an adjective. Is he using the British slang 'dressing in an informal manner' or the American slang for 'gushing; speaking or writing effusively'? Or is slopping here being used as a euphemistic substitution for another word that ends in -ing?)

But (of course!) half of the 'Americanisms' in their closing list of 'Americanisms that have annoyed BBC listeners' weren't Americanisms. One (face up) was first (to the OED's knowledge) used by Daniel Defoe, the Englishman. Another (a big ask) is an Australianism. But one that really bothered me was this:
  • 'It might of been' instead of 'It might have been'
 Three reasons it bothered me:
  1. Shouldn't it might of been be corrected to it might've been rather than it might have been? That is, of is a misspelling of the similar-sounding 've here. Might've is a perfectly good contraction in BrE as well as AmE. Is the complaint that people should say have because they shouldn't be contracting verbs on the BBC, or are they complaining about spelling 've wrong?
  2. We're talking about broadcast television and radio, which are spoken media. You can't see the spelling of what the presenters are saying. So how do they know the presenters said might of and not might've?  Of course, they could have seen it on the (orig. NAmE) closed-captioning/subtitles. But BBC subtitles usually make so little sense that I can't believe anyone would take them as an accurate record of what's been said. (Here's a Daily Mail collection of 'BBC subtitle blunders'.)
  3. I read of instead of 've a lot in my British students' essays. A lot. There's no reason to think they're getting it from American influence, because they'd have to read it and they probably don't get the chance to read a lot of misspel{ed/t} American English. The American books or news they read will have (we hope) been proofread. I suspect that errors like this aren't learn{ed/t} from exposure at all: they are re-invented by people who have misinterpreted what they've heard or who have a phonetic approach to spelling, sounding out the words in their minds as they write.
This particular Telegraph list is one of the things that I mock when I go around giving my How America Saved the English Language talk.  But so far, when I've talked about it, I've just said those three things about it. I have never looked up the numbers for who writes of and who writes 've after a modal verb. I think I've been afraid to, in case it just proved the Telegraph right that it's a very American thing.

I need not have feared! Not only was I right that I see it a lot in the UK, I was also right to feel that I probably see it more in the UK, because --you know what?-- the British spell (at least this bit of English) worse than Americans.

Here are the numbers from the Corpus of Global Web-Based English. The numbers stand for how many times these variations occur within about 387 million words of text from the open internet.

non-standard of     American    British
might of                 392        672
would of                 926       1634
could of                 458        821
should of                442        683

standard 've        American    British
might've                 506        277
would've                4921       3121
could've                2379       1502
should've               1685       1140

I've put the higher number in each row in blue bold in my table in order to reflect how it shows up in GloWBE. The blue-bold indicates that those numbers showed up in the darkest blue in the GloWBE search results, like the GB column here:

(The Canadian numbers are distracting--they're not based on as much text as GB and US.)

The darker the blue on GloWBE, the more a phrase is associated with a particular country. So, it's not just that the of versions are found in BrE--it could be said (if we want to be a bit hyperbolic) that they are BrE, as opposed to AmE.

In both countries, the 've version is generally used more than the misspelling (the one exception in the tables is British might, where of outnumbers 've). Nevertheless, the American numbers were darkest blue for these spellings--indicating the correct spellings are more "American" in some way--though note that the British 've versions are just one shade of blue lighter--the difference is not as stark as in the previous table.
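
One way to put the two tables side by side is to ask, for each modal, what share of the contracted spellings are of rather than 've. The sketch below does that in Python using only the counts given above. It deliberately ignores the uncontracted have forms, which aren't in the tables, and of_share is just an illustrative name.

```python
# Counts from the two GloWBE tables above: modal -> spelling -> (US, GB).
counts = {
    "might":  {"of": (392, 672),  "ve": (506, 277)},
    "would":  {"of": (926, 1634), "ve": (4921, 3121)},
    "could":  {"of": (458, 821),  "ve": (2379, 1502)},
    "should": {"of": (442, 683),  "ve": (1685, 1140)},
}

def of_share(of_count, ve_count):
    """Percentage of the two spellings (of + 've) that are spelled 'of'."""
    return 100 * of_count / (of_count + ve_count)

for modal, c in counts.items():
    us = of_share(c["of"][0], c["ve"][0])
    gb = of_share(c["of"][1], c["ve"][1])
    print(f"{modal:<7} US {us:5.1f}%   GB {gb:5.1f}%")
```

For every modal the British share comes out higher than the American one. With might, for instance, it is about 44% of in the American data against about 71% in the British.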

The moral of this story  

It looks like the BBC complainers and the Telegraph writer assumed MODAL+of was an Americanism because they disapprove of it. But remember, kids:

Not liking something is not enough to make it an Americanism.

Coulda, shoulda, woulda

When I discovered these facts, I immediately tweeted the would of (etc.) table to the world, and one correspondent asked if the American way of misspelling would've isn't woulda. The answer is: no, not really. Americans might spell it that way if they're trying to mimic a particular accent or very casual speech (I coulda been a contenda!). It's like when people spell God as Gawd--not because they think that's how to spell an almighty name, but because they're trying to represent a certain pronunciation of it. No one accidentally writes theological texts with Gawd in them. But people do write would of in formal text 'accidentally'--because they don't know better, not because they're trying to represent someone's non-standard pronunciation. In the Corpus of Contemporary American English, 75% of the instances of coulda occur in the Fiction sub-corpus; authors use it when they're writing dialog(ue) to make it sound authentic. 

But you do get coulda, shoulda and woulda in an AmE expression, which accounts for about 10% of the coulda data. I think of it as shoulda, coulda, woulda, but there does seem to be some disagreement about the order of the parts:

The phrase can be used to mean something like "I (or you, etc.) could have done it, should have done it, would have done it --but I didn't, so maybe I shouldn't worry about it too much now". (A distant relative of the BrE use of never mind.) Sometimes it's used to accuse someone of not putting in enough effort--all talk, no action. 

The English singer Beverley Knight had a UK top-ten single called Shoulda Woulda Coulda, which  may have had a hand in populari{s/z}ing the phrase in BrE (though it's still primarily used in the US).

Another shoulda that's coming up in the GloWBE data is If you like it then you shoulda put a ring on it. And I can't hear that now without thinking of Stephen Merchant, so on this note, good night!

Postscript, 5 Feb 2016: @49suns pointed out that I haven't weeded out possible noise from things like She could of course play the harmonica. Good point. British people do write could of course (etc) more than Americans do because they use commas less. Americans would be more likely to write could, of course, play the harmonica--and with the commas it wouldn't be caught by the search software. As well as of course, there's of necessity and other things 'noising up' the data.

I'm not going to re-do all the tables because I've posted this now and many have commented on it. But the good news (for this post) is that the conclusion about of is pretty much the same if we limit the search to modal + of + verb; it's still more frequently British--especially when preceding been, the case that was complained about in The Telegraph. Here's a sample.

An interesting case at the bottom is should of known, which reverses the pattern. This is just because should [have] known--often in should [have] known better -- is a much more common phrase in AmE than in BrE. Searching should * known, we get:

Looking more closely at that group, I found that 6 of the 21 American should of knowns were from song lyrics (none of the UK ones were), and one was using it as an example in telling people that they shouldn't write should of.

The online interface doesn't like me searching for modal+of+verb, so I've had to search for *ould+of+verb, leaving out might and in the post I also left out must.  But having re-searched those, I can tell you: still dark blue in British, not in American.

The other thing I haven't done, which someone (or someones) else has suggested, is look at what happens after negation. That is a lot more complicated, since there are more variations to consider (since both the n't and the have can be contracted). I'm really interested in that, so I'm going to write a separate post on it next week. Till then!