a 'foreign spelling' test for GloWBE corpus

In blogging, I rely a lot on the Global Web-Based English corpus, GloWBE, which has millions of words of internet data categori{z/s}ed by the country of the website. It's divided into excerpts from 'blogs' (which includes comments on blogs) and 'general', which includes all sorts of things, even some blogs.

It's an invaluable tool for judging whether a word or phrasing is used in a particular place. But national borders are very weak on the internet, and commenters comment on all kinds of things from all kinds of places. And there are even people like me who are blogging in the 'wrong' country for their dialect (and I have run into some of my own writing in the corpus!). So, how can we know how much of the data that's in the 'US' category is actually by Americans and so forth?

This is a problem that has struck me as I've tried to use GloWBE data to research politeness markers. (I'm giving a paper on please in GloWBE next week.) So, I came up with a little test for foreignness in UK and US--though it won't work for other countries, as you'll see.

The test is based on the -or/-our spelling distinction, and the question is: how much of the data for each of these two countries has the 'wrong' spelling for the country? This works because for the (AmE) thirty-some words that have this spelling variation (except glamo(u)r, which is a funny one that I'll blog about at some point):
  • the 'correctness' of each spelling is long established in each country -- unlike -ise/-ize which is a more recent deviation
  • there's little variation within the country -- unlike, say f(o)etus, where people of different professions might spell it differently, or theat{er/re}, where proper names of US establishments often use the UK spelling
  • they are generally high-frequency words and therefore 'easy' to spell -- unlike paraly{s/z}e or mollus{c/k}, etc.
Here is the result, showing how many of each spelling were found on 'US' and 'GB' (as it is named on GloWBE) data in the corpus:


                  GB our    GB or    US or    US our
hono(u)r           10376     2985    17145      2437
humo(u)r            8903     1632    10461      1560
neighbo(u)r         5364     1029     8128      1037
odo(u)r              673      276     1224       119
tumo(u)r            2546      658     2269       244
vigo(u)r             910      201      745       197
totals             28772     6781    39972      5594
'foreign' rate                19%                12%


In other words, 19% of the -o(u)r words I searched for in the GB corpus had the AmE -or spelling and 12% of the US -o(u)r words had -our spellings. A few of these may just be by people who can't spell or who are putting on airs by using another spelling system, but they're probably a very few. The percentages for the individual words range from 8.9% (odo(u)r) to 20.9% (vigo(u)r) for the US data and 15.5% (humo(u)r) to 29.1% (odo(u)r) for the GB data. It's important to use a number of words because the data will be skewed if it's just a word that Americans use more than British people (or vice versa), regardless of spelling. We can see that happening a bit with odo(u)r.
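For anyone who wants to check the sums, here is a minimal sketch of the calculation in Python. The counts are copied from the table; the names (COUNTS, foreign_rate and so on) are just labels I've made up for illustration, and the 'foreign' rate is assumed to be the 'wrong' spellings as a share of all the -o(u)r tokens for that country.

```python
# Counts copied from the table: word -> (GB -our, GB -or, US -or, US -our)
COUNTS = {
    "hono(u)r":    (10376, 2985, 17145, 2437),
    "humo(u)r":    (8903, 1632, 10461, 1560),
    "neighbo(u)r": (5364, 1029, 8128, 1037),
    "odo(u)r":     (673, 276, 1224, 119),
    "tumo(u)r":    (2546, 658, 2269, 244),
    "vigo(u)r":    (910, 201, 745, 197),
}


def foreign_rate(foreign, native):
    """Share of all tokens (foreign + native) that use the 'foreign' spelling."""
    return foreign / (foreign + native)


# Column totals across all six words.
gb_our, gb_or, us_or, us_our = (sum(col) for col in zip(*COUNTS.values()))

gb_rate = foreign_rate(gb_or, gb_our)  # -or spellings on GB sites
us_rate = foreign_rate(us_our, us_or)  # -our spellings on US sites

# Per-word rates, to see how far individual words stray from the overall figure.
gb_by_word = {w: foreign_rate(c[1], c[0]) for w, c in COUNTS.items()}
us_by_word = {w: foreign_rate(c[3], c[2]) for w, c in COUNTS.items()}

print(f"GB 'foreign' rate: {gb_rate:.1%}")
print(f"US 'foreign' rate: {us_rate:.1%}")
for word in COUNTS:
    print(f"{word:12} GB {gb_by_word[word]:.1%}  US {us_by_word[word]:.1%}")
```

Pooling the counts before dividing (rather than averaging the six per-word percentages) means the high-frequency words count for more, which seems right if the question is about how much of the text is 'foreign'.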

(I'd originally included colo(u)r in the list, which made the total difference more stark: 24% to 14%. I took it out because I suspected that the use of color in HTML coding might be skewing the result.)

So this can lead us to the hypotheses:
About 12% of US blog data is not written by Americans.
About 19% of UK blog data is written by people using American English spelling.
We cannot say that about 12% of the US data were written by British people, or even that 81% of the British data were written by British people, since -our spellings are used in the rest of the anglophone world. Half of the vigours written on British sites might have been written by Australians or Canadians, for all we know. The -or is more a marker of Americanness than -our is a marker of Britishness. But we also can't say that the people spelling -or are Americans; we can only assume they are people whose education in English spelling used American English standards. So, half of the -or spellings might have been written by people in the Philippines (where American spelling is used). It's unlikely that it's that many, but I've phrased the hypotheses to allow for these possibilities.

Why is the number of American spellings on British sites larger than the number of British spellings on American sites? Well, it might just be because there are nearly five times as many Americans on the internet as there are Brits. (The US has 86.9% internet penetration on a population of around 318 million, so that's over 276 million internet users. The UK has 89.8% penetration in a population of about 63 million, so that's nearly 57 million internet users. Source.) Of course, there are a lot more countries involved here (and I'm not going to go do all that adding-up at the moment), but that's a reasonable step toward(s) explaining the difference, and one that doesn't involve running around screaming 'the sky is falling on British English'!
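The back-of-the-envelope sum behind 'nearly five times' can be sketched like this, using only the rounded penetration and population figures quoted above (the variable names are mine, for illustration):

```python
# Rounded figures quoted in the paragraph above.
us_population = 318e6
us_penetration = 0.869
uk_population = 63e6
uk_penetration = 0.898

us_users = us_population * us_penetration  # a little over 276 million
uk_users = uk_population * uk_penetration  # a little under 57 million
ratio = us_users / uk_users                # US internet users per UK internet user

print(f"US internet users: {us_users / 1e6:.1f} million")
print(f"UK internet users: {uk_users / 1e6:.1f} million")
print(f"Ratio: {ratio:.2f}")
```

This only compares the two countries with each other; it says nothing about the other countries whose writers turn up in either category.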

So, what this means is that if we look for differences between American and British Englishes on GloWBE and see that a form that's used in Britain is used 10% as much in America, we can't conclude that the Britishism is gaining traction in the US, because there's a fair likelihood that the people who used the Britishism weren't American. If we see one with 40% use in the US, however, we can aver that it's well on its way to being established there.

Anyhow, I'm glad I decided to explain that all in a blog post, as it makes it clearer in my mind for explaining it in about 10 seconds as I fly by that slide in my presentation next week. If you see any flaws in my thinking or math(s), please let me know!

In other news (aka shameless self-promotion):
The Odditorium people have made a podcast of my Catalyst Club talk about little words (especially the). It's a bit odd without the visuals, but they do call themselves 'oddpodcast', after all.



I'm speaking at two conferences in the next two weeks, plus have a few public speaking engagements in the future. Follow the links for more info. If you're nearby, come say (orig. AmE in this form) hello!

23 July (with Rachele de Felice): The politics of please in British and American English. Corpus Linguistics conference, University of Lancaster.

31 July Separated by a Common Politeness Marker: please in British and American English. International Pragmatics Association Conference, Antwerp.  
17 Sept How America Saved the English Language.  The Bedford Culture Club (Horsham, W Sussex). 

Further ahead, titles yet to be confirmed:
27 Sept  Sunday Assembly Brighton.  
27 Nov (Thanksgiving dinner--a day late)  English-Speaking Union, Chichester.

known them (to) and help them (to)

Yesterday, The Syntactician was asking me questions about semantic terminology in relation to particular uses of the verb know, as one does. And so, as one does, I looked for know in the indices of various books about verbs that I have, hoping to find a term that would suit her particular purposes. In doing so, I came across something that was completely new to me in F. R. Palmer's A linguistic study of the English verb (1965):


In case you can't read the photo, it says that you can 'help someone do something' or you can 'help someone to do something'.  So far, so familiar to me.

But then it goes on to say that know has the same pattern with  
(1)  Have you ever known them come on time?
and
(2)  Have you ever known them to come on time?

Now, if I have ever seen sentences of type (1) in the wild, I must have assumed them to have typos, because if I want to know someone/thing + verb, I must have the to-infinitive form of the verb. Yes to (2), no to (1). Absolutely, no question.

So, I turned to the (English) Syntactician, who said that yes, (1) is good in her BrE, "but old-fashioned". I then went onto Twitter to proclaim my ignorance/learning/disbelief, and many English people (many of whom are probably not terribly old-fashioned) replied to say "Yes, that's fine. I can say that."  No US people replied to say they could say it, and now that I look in Algeo's British or American English?, I see that he records this as a British form.

Palmer hasn't mentioned the big restriction on this, however. Algeo does, but I learned the restriction  the hard way: by tweeting "Can you really know someone do something?" The answer there is 'no'--British English speakers can only use the to-less version in the perfect aspect (the 'have/had verbed' forms). So:
  • General (BrE or AmE) perfect: I have known them to frequent dark alleys.
  • BrE-only perfect:   I have known them frequent dark alleys.
  • General English present:  I know them to frequent dark alleys.
  • Nobody's English present:  *I know them frequent dark alleys.
 
(Overly academic side point. Skip this unless you can name at least two theories of grammar!
I'm wondering how you get a [say, Chomskyan] theory of grammar to account for a complementation structure that is particular to a certain aspect of a certain verb. Maybe all theories are now so lexical that it's  possible--though you'd have to treat known and know as different lexical items, I guess. Would be easier to account for in a Construction Grammar, but still seems like a very heavy--or at least fiddly--cognitive load for a language to bear. If you know about such things, let me know in the comments, please!)


I should also say a bit about that help (to). As I said above, both of these are fine in AmE and BrE:

(3)   I helped them escape.


(4)  I helped them to escape.
 ...but what's interesting for us is that AmE prefers (3) [in 75% of the cases in the Brown corpus] and BrE prefers (4) [73% of the cases in the LOB corpus] (both figures from Algeo, p. 228).



And that, my friends, is how you write a blog post of less than 1000 words. When was the last time you had known me do that?  :)

talking about streets, roads, etc.

A while ago, I wrote about a difference in AmE and BrE use of street and road, in that in BrE it's more natural to cross the road and in AmE (certainly in a town or city) it's more common to cross the street. (I've also written about in/on the street, so see that post for more on that.) That's common-noun usage, but what about the proper names of vehicular paths?

There's no question that some ways of designating paths are more common in one country or the other. I've never seen a road named [Something] Trail or [Something] Boulevard in the UK (though see the comments for some counterexamples), and in the US there aren't as many Crescents or (BrE) Closes (pronounced with a /s/, not a /z/). But a problem for making generali{z/s}ations about such things is that the naming of streets or roads varies a lot on the local level in both countries, with different names based on regional differences, urban/rural differences, and terrain differences.

The other day Damien Hall (tweeting as @EvrydayLg) pointed out:
Struck since long B4 I lived there by US habit of omitting eg St, Rd in addresses. We don't.
He hypothesi{z/s}ed that it might be because street is more common in the US and therefore the default. But I don't think that's why. Instead, Americans are happy to say things like
go up Main and take the first right onto Union
...because in most cases that will be an unambiguous statement, since there will typically only be one thoroughfare called Main (Street) and one called Union (maybe Avenue) in a town. I often send packages to a friend who lives in Tennessee. I've never bothered to find out if she lives on Woodland Street or Woodland Lane because I only need to put '140 Woodland' (and the city, state and, if I'm nice, the zip code) on the package and it gets there. As you can see from this (AmE) yard sale sign, the practice of not saying street or road is common. (Where there is more than one with the same name, you'll hear the street/road/lane/whatever more regularly.)

(Side note on codes: Five-digit US zip codes only say which town or which part of a city the address is in, unlike six-or-seven-letter/number UK post codes, which generally indicate the town/part-of-city in the first half (letter-letter+1- or 2-digit number) and which street or part-of-street in the second half (number-letter-letter). Nine-digit US zip codes, called ZIP+4, are a more recent addition* that do indicate street, but which I don't actually use. I couldn't tell you what mine is at the US address I use. 
*It says how old I am that I consider something from 1983 a 'recent addition'. )
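To make the two-halves structure of UK post codes concrete, here is a rough sketch in Python that splits a code along the lines described in the side note. It is an assumption-laden simplification: the pattern covers only the 'letter-letter + 1 or 2 digits' and 'digit-letter-letter' shapes mentioned above, real post codes have further variants that it will reject, and the function name and the sample code are made up for illustration.

```python
import re

# Simplified pattern following the description above:
#   first half  = two letters + one or two digits (town or part of city)
#   second half = one digit + two letters (street or part of street)
# Assumption: real post codes have other shapes that this does not cover.
POSTCODE = re.compile(r"([A-Z]{2}[0-9]{1,2}) ?([0-9][A-Z]{2})")


def split_postcode(text):
    """Return (first half, second half), or None if the text doesn't fit the pattern."""
    match = POSTCODE.fullmatch(text.strip().upper())
    if match is None:
        return None
    return match.group(1), match.group(2)


print(split_postcode("bn1 9zz"))  # a made-up code in the simplified shape
print(split_postcode("12345"))    # a five-digit US zip code does not fit
```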

'Street'-less street names are so unambiguous that most Americans would immediately recogni{z/s}e that the film title State and Main refers to a corner in a town--most likely in the cent{er/re} of the town, since those are common street names in American towns and because we refer to (AmE) intersections or the corners at those intersections in that way. 

(I'm sure I've mentioned before that Main Street is a likely street to have in a town as the main street. It is also metaphorically used to refer to 'the inhabitants of small US towns considered as having a narrow-minded or materialistic worldview' (AHD5). So, politicians might worry about 'what will fly on Main Street'. The High Street is the proverbial main street in a British town (it may or may not be named High Street) and is used metaphorically to refer to the commercial market--i.e. 'what will fly on the high street' is what the masses are likely to want to buy.)

British roads need the street or road (etc.) because little is unambiguous when it comes to British road names.

Take my former (more common in AmE) neighbo(u)rhood as an example:


There is a Buckingham Road, which meets Buckingham Place. Buckingham Street runs parallel to Buckingham Road, but doesn't meet Buckingham Place because halfway through it changes its name to Clifton Street. Off the map are Clifton Road, Clifton Hill, Clifton Street, Clifton Place and Clifton Terrace. But we don't need to leave this map to see that there's also a West Hill Street, West Hill Road and West Hill Place. I also feel bad for the people who live on the parallel Albert Road and Alfred Road who probably get each other's (AmE) mail/(BrE) post all the time.

Once I found an unconscious man on Buckingham Place.  Except I didn't know which Buckingham it was. The ambulance people were (BrE) well (orig. AmE) pissed off at me.

If that weren't bad enough, I now live on a road that shares its entire name with another road in the same city. When I tell taxis where to take me I have to say "X Street, off Y Road". We always tell plumbers and such which one to go to (we even give them our post code) and then when they don't show up, we text them to say "no, it's the other one".

A famous exception to the 'one pathway per name' rule in the US is New York City, which has both a 3rd Street and a 3rd Avenue. Except that it doesn't really have a '3rd Street', since you need to put East or West in front of it in order for the house number to be meaningful--so if someone says they live on East 5th, you know it's East 5th Street. In New York and the US more generally Avenue is often abbreviated in speech (as well as writing) to its first syllable (written as Ave. or Av.). 

And so onto the Easts and Wests. In the US, you can reasonably expect that East Main Street and West Main Street are the same thoroughfare, but that house numbering starts from where they join (or divide, depending how you think of it). East Main Street will run to the east from that (AmE) intersection/(BrE) junction.

In some cities they put the compass-points after the name and that can mean something different. In Washington, DC, it indicates the quadrant of the city that that part of the road is in. So, 7th Street NW and 7th Street SW are one long road that runs north/south, but the parts of it are in different quadrants of the city. On the other side of the Capitol building, the street numbering starts over, and so 7th Street SE and 7th Street NE run parallel to the other 7th Street NW/SW.

So, the other day, I had to find Brunswick Street East in Hove (UK). Somehow (¡Apple Maps!) I ended up on Brunswick Street West. I knew that Brunswick Street East would not be a continuation of West (after all, the road was running north-south), but I hoped it would be the next street eastward. It was not. At least it was eastward. (I hadn't been willing to trust even that.) But I did get to see Brunswick Place, Brunswick Square and Brunswick Terrace in my explorations.

Finding street names is its own challenge. In the US, street signs tend to be affixed to poles at the corners of roads. At some big intersections, they may hang over the road on the wires that hold the (AmE) stop lights/(BrE/AmE) traffic lights. In the UK, they tend to be on buildings or walls near the end of the road. This may require some searching since some are high and some are low. Here's one of my favo(u)rites from Brighton:






House-numbering, of course, is another nightmare. In the US, it's pretty predictable that even numbers will be on one side of the road and odd ones on the other. In the UK, it might be that way (though you've no guarantee that 92 will be across the street from 93--it might be many houses further down). Another UK way is to have consecutive numbering up one side of the road (1, 2, 3, 4,...) to wherever the road ends and then down the other side, so that, say, 52 and 53 may be across the road from one another, but 1 will be across the road from 104. Another way it might be is that the name of the road on one side is different from the name on the other side--so, for example, the people at 15 Vernon Terrace in Brighton live across from the people at 17 Montpelier Crescent. (And Vernon Terrace only lasts for one (AmE) block, after which its name changes and house numbers re-start twice before you get to the sea, which has pleasantly few thoroughfare names.)
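The 'up one side and down the other' scheme is easier to see with the sums written out. Here is a small sketch in Python, under some loud assumptions: the road has the same number of houses on each side, the houses face each other exactly, and the function names are mine. Real roads are rarely this tidy.

```python
def opposite_consecutive(number, total):
    """House facing `number` on a road numbered 1..total up one side and back down the other.

    Assumes an even total, equal numbers of houses on each side,
    and houses that sit exactly opposite one another.
    """
    if not 1 <= number <= total:
        raise ValueError("house number is not on this road")
    return total + 1 - number


def side_odd_even(number):
    """Under the odds-and-evens scheme, which side of the road a house is on."""
    return "even side" if number % 2 == 0 else "odd side"


print(opposite_consecutive(52, 104))  # the house facing number 52
print(opposite_consecutive(1, 104))   # the house facing number 1
print(side_odd_even(92), "/", side_odd_even(93))
```

With 104 houses, the pairs always add up to 105, which is why the two middle numbers end up facing each other while the first and last houses do too.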

I've talked about differences in house numbering on Numberphile, so (BrE) have a look at the video if you are (orig. AmE) nerdy enough to want to hear more about house-numbering:




Are these British expressions British?

It seems to happen once a week that I'm talking or listening to someone and some interesting new combination of morphemes (meaningful word-parts) is uttered. The conversation will go something like this:
A:  Ooh, this cake has real taste-itude. 
B: Ha! Taste-itude, is that even a word?
Lynne: It is now.
People are saying it, people are understanding it. It's made out of morphemes and it's not a phrase. It's a word. It might not be a word that's going anywhere, but it's a word. And I'd go so far as to say it's an English word, since it's made of English word-parts according to English rules, pronounced with English sounds, and understood by English speakers.

Recently someone on Twitter took me to task for giving BrE versus AmE uses of tortilla as my Difference of the Day, protesting that tortilla isn't even an English word; that the difference is between European and Mexican Spanish, not British and American English. My response was: yes, the word(s) came from those Spanishes, but you can find tortilla in English dictionaries and how English speakers use tortilla can differ from how Spanish speakers use it. So, is tortilla an English word? It is now.

This isn't to say that any non-English word in an English sentence automatically becomes English. If I wrote "My favo(u)rite Swedish institution is fika, the social coffee break", a lexicographer would look at it and say: we don't need to put fika in our English dictionary because (a) it's been marked as foreign (with italics), (b) the writer felt the need to define it, indicating that it's unfamiliar in English, and (c) it describes something in another non-English-speaking culture. When the glorious time comes that English-speaking cultures embrace fika, we'll say things like "I'm just going to fika with Jo. Care to join us?" and the lexicographers will put it in English dictionaries.

All of this is preamble to thinking about what a "British word" is and what happens when an American word "becomes British". When words/meanings/expressions move from one dialect to another, it's not so easy to tell that they're foreign, because we don't tend to get those markers of 'foreignness' that we got in the fika example. The words are generally made out of English parts, and often their meaning is recoverable from the context. If we say that an American expression has 'become British' (or the reverse--but let's stick with one scenario) we could mean:
  • the expression has become less specific to America, and therefore British people say it as well as American people because it is now 'general English'.
  • the expression used to be American, but now British people say it and Americans don't. Thus, it is not 'general English', but 'British English'. 
This kind of thing has come up on the blog before when British media have distributed complaints about "Americanisms" coming to Britain, and people like me point out "Many of your so-called 'Americanisms' came from Britain, but the British forgot about them". (A nice example of that is now-AmE expiration versus more-recent-BrE expiry.)

This week, we can analy{s/z}e whether the same happens when Americans talk about Britishisms. (Of course, what's different is that Americans are likely to say "That's so cute! I'm going to start saying that!" rather than "Those people are ruining our language with these silly expressions!")
Here's a list of "British expressions" that has been going (a)round the web:



Like many things on the interwebs, there's no source-citing here. Judging from the 'we say' at zed, it's by an American who knows a bit about Britain. Some of the translations are fairly poor and some of it is fairly dated (chap illustrates both these charges).

What struck me about the list was that I was pretty sure that some of these were American English (originally, if not currently). And at least one I knew to be an Australianism. So, since I have finished my external-examining (it's a British academic thing, and it's a lot of work), I am celebrating by looking into all the items on the list. I won't bother to say "yes, that's originally British" about the majority that are. (Some of them have been discussed already on this blog; you can use the search box on the right to look for them.) But let's think about the ones that aren't.


(the) bee's knees This is 1920s American slang, and as far as I can tell it has never been more popular in the UK than the US. Yes, some British people say it, but Americans are saying it more. And whoever is saying it, they're probably elderly or affecting a vintage style.

know your onions Another old US phrase (the first two OED citations - 1908 and 1922 - are American; first British one comes in 1958). It is definitely used more in the UK now than in the US. World Wide Words has a nice post on it.

wicked to mean 'good, cool' is something that may have been re-invented in the UK (negative words have a way of being made positive in slangs), but it was certainly something I said in the 1980s in the US, earlier than it was being used in the UK. OED lists it as 'orig. U.S.' and cites F. Scott Fitzgerald for its first recorded use:
1920   F. S. Fitzgerald This Side of Paradise i. iii. 119   ‘Tell 'em to play “Admiration”!’ shouted Sloane... ‘Phoebe and I are going to shake a wicked calf.’

(a) tad To quote the OED: "colloq. (orig. and chiefly N. Amer.)." The 'chiefly' there is out-of-date; it's well used in BrE now (new ways of achieving understatement are always helpful in BrE). But it's never gone out of use in AmE, so its presence on the list is a puzzle.

(a) shambles To mean 'a scene of disorder or devastation', the OED says 'orig. U.S.' And yet it is in the list twice. (It is used more in the UK, but it's not unused in the US.)

skive Now, I've written about this word before (great word--didn't know it before coming to the UK), but in doing so I failed to mention that it started out in America, seemingly derived from French esquiver. Again, from the OED:
 1. intr. U.S. College slang. At the University of Notre Dame: to leave the college campus without permission. Also in extended use with reference to other disciplinary matters. Freq. with away, out, etc. Cf. skiver n.3 1. Now disused.
 2. trans. orig. U.S. College slang. To avoid (work or a duty) by leaving or being absent; (now) esp. to play truant from (school). Now chiefly Brit. colloq.

nosh comes from Yiddish and is "Originally: to nibble a snack, delicacy, etc. (chiefly N. Amer.)" (OED). Nowadays, in BrE it refers to any food, not just a snack or delicacy. Use of the word in the US is particularly New-Yorkish (as Yiddish-derived words often are), and the verb is not used so much in BrE.

uni Here's the Australianism. BrE speakers above a certain age will tell you it came into Britain through the soap opera Neighbours in the 1980s. BrE speakers of university age now probably have no idea it came from Australia. It is used a lot in the UK.


So, about 12% of the lists are expressions used by the British, but not invented by the British. So, they're British expressions in the sense that British people say them.

Some are not invented by the British and not exclusively said by the British. Seems a bit odd to call those ones British expressions.

These not-so-British expressions on the list probably indicate that the writer fell into an old trap: if you don't know an expression and then you hear someone with a different accent say it, it's easy to conclude that the expression is a regionalism that is particular to people with that accent. I fall into the trap too, like when I assumed station stop was a Britishism because I had only heard it in Britain (but then, I take trains more in Britain).  It's our duty as people who care about language to try to resist those easy conclusions, because we have to admit that our individual experience of vocabulary is an imperfect, biased, and ahistorical view of the language.

The other problem with the phrase British expressions (and one that plagues this blog) is what's "British enough" to be British. For something to be called a British expression is it enough that it is used in Britain? Is a Yorkshireism or a bit of slang from Multicultural London English a British expression? Or, for an expression to be British does it have to be used across the whole country (or at least the whole island)?

So, what do you think: should we call the originally-not-British items on this list British expressions? The next time a British person says Can I get a latte? and someone else says "That's not British!" should we say "It is now!"



Postscript: I just can't resist mentioning what I've learn{ed/t} about a British-British item on the list:

arse-over-tit is British through and through, but it was originally arse-over-tip. Its current form lends support to my belief that British English will find any excuse to say tit as often as possible.


f(o)etus and f(o)etal —and a bit on sulfur/sulphur

If you're looking for discussion of other (o)e or (a)e words, please click here to see/comment at the more comprehensive post on the topic.

So, as we've seen in that aforementioned blog post, British and American spelling differ sometimes in the use of the ligature (connected letter) œ, or as it's more often written now, the digraph (two letters for one sound) oe. To give a quick summary of the story so far:
  • English took a lot of its œ words from Latin.
  • Latin got them from Greek. œ is Latin's way of representing the Greek οι.
  • American English (following Noah Webster and other spelling reformers) usually simplifies the Latin/Greek oe to e.
But then there's foetus (or fœtus). This is a British spelling of the Latin word fetus. That is to say, the œ might look like it comes from a classical language, but it just doesn't. Sometime in the 16th century, someone (mistakenly, one might say) started spelling it with an œ, and it stuck.

This creates a dilemma for British spellers who know a bit about Latin. Spell it foetus and commit a little etymological crime. Spell it fetus and get accused of Americanization by people who don't know about the Latin—and maybe even by some who do know about it. And if there's one thing worse than committing Latin sins, it's being accused of spelling like an American.

But still, brave British doctors have fought to get rid of the o, mostly by writing letters to the editor of major medical journals. Here's one:

I shall resist to the  last ditch any movement for the general replacement of diphthongs* by single vowels – the American practice. But when, etymologically, the foreigner is correct and we are wrong, it would seem that by adhering obstinately to a false diphthong we are weakening our case for maintaining our justifiable diphthongs in the face of contrary “common usage” by far more than half the English-writing world. (Napier, L. Everard. 1 Nov. 1952. The correct spelling of medical terms [Letter to the Editor]. The Lancet vol. 260, pp. 885-6.)

The Lancet and the British Medical Journal now consider fetus and fetal the ‘correct’ spellings, and the Oxford Dictionaries entry for fetus remarks:
The spelling foetus has no etymological basis but is recorded from the 16th century and until recently was the standard British spelling in both technical and non-technical use. In technical usage fetus is now the standard spelling throughout the English-speaking world, but foetus is still found in British English outside technical contexts

At the foetus entry, it just says: "Variant spelling of fetus (chiefly in British non-technical use)."

How true is this, that it's the accepted technical spelling in the UK? In The Lancet and the BMJ, it's doctors writing for other doctors. What about the rest of the medical professions? What about when medical types communicate with patients?

My first stop was the NHS Choices website, where the readers are would-be patients. A search for foetus brings up 27 hits, but fetus has 7. But, going the other way, foetal has 66 hits and fetal 82. What's going on?

I contacted the website to ask if they had a policy on this and they were extremely helpful (as the NHS always has been for me ♥). They put me in contact with their Head of Editorial Production, who sent me both a link to their style guide (which has fetus as an Americanism to be avoided) and his own document entitled 'Fetality', which he wrote when the Fetal Anomaly Screening Programme (so spelled) asked if the rest of the website could switch to fetal/fetus. In his paper he gives several arguments for retaining foetus/foetal, even on pages where it will conflict with the FASP program(me)'s spelling, but I think this first one is key:
NHS Choices is a ‘British English’ service and, as stated in its Editorial Style Guide, is bound to:
  • Write plain English
  • Avoid medical jargon and technical terms as far as possible
On the basis of those two points, if it is accepted that foetus is the general spelling and fetus the technical-medical, NHS Choices should use foetus.
(Bolton, Barry. 2014. Fetality. Internal document, NHS. Received with thanks from the author.)
Looking again at the o-less hits on the NHS Choices site, many of them seem to be in comments from site users—so the house style doesn't apply. Are they misspelling it, or do they know the 'technical' spelling? Why so many more fetals? Possibly because it's in the name of a lot of things, not just the FASP program(me), such as the 'Fetal Medicine Unit team at St George's Hospital', which is indeed how the hospital spells that unit's name.

It's an interesting mixture: the NHS website keeps the traditional British spelling in communication with patients in order to avoid technical language, but the hospitals and such seem quite happy to foist the technical spelling on patients in the names of units and program(me)s.

To investigate this a little further, I did a little survey in which I asked for UK medical personnel to tell me which spelling they would use in a work context: foetus or fetus, sulphur or sulfur and amoeba or ameba. F(o)etus was the only one that respondents disagreed about:

 
(The 'it depends' person gave that answer for every question and said they'd use the American spelling if they were writing to an American.)

I invited respondents to explain their preferences to me, but unfortunately only four did, and two of those used the space to tell me about words I hadn't asked about. The two relevant comments were:
I am an allied health professional who wouldn't use these words much in my work, but these were how I was taught to spell them at school. I've heard in the past that "foetus" is completely wrong, though I can't quite remember why and I write the word so infrequently that I wouldn't change my spelling of it anyway!
and apparently not knowing about the etymology of fetus:
Homogenisation of the English language to accommodate American English is a pernicious assault on the richness and diversity of English usage. It shouldn't be tolerated!
Unfortunately, I didn't ask for demographic information beyond country of abode, so I can't see whether the people who prefer fetus are in professions in which they need the word more often than the ones who prefer foetus.

But my impression is that fetus/fetal seems to be something of a medical shibboleth in the UK now. Doctors use the e spelling and it sets them apart as 'in the know', and maybe they don't mind that the rest of the country goes about putting the o in it. All the better to tell who the truly educated are. I'd love to hear from people 'in the know' in the comments. Have I got that wrong?

And before I leave, a note about the other false etymological form that readers of The Lancet (well, at least one) have tried to change. Here's another letter to the editor:
SIR,-Spelling is a curious blend of phonetics, etymology, tradition, and nonsense ; we should take care not to let the last preponderate. Dr. Napier (Nov. 1) is to be congratulated on his attack on the absurd o which it is customary now to insert into fetus. I would like to raise support for a similar attack on the ph with which we generally mis-spell sulfur and the other words derived from it. Sulfur comes from a Latin word. Undeniably some Latin authors used the ph form, but there is good reason to think that this was a blunder, and most of the European languages that use the Latin root have not followed the erroneous spelling. The spelling sulfur was common in Britain from the 14th to 18th centuries, and this presumably explains its present day use in the U.S.A. It is in no sense an American innovation. (Pirie, N.W. 15 Nov. 1952. The correct spelling of medical terms [Letter to the Editor]. The Lancet vol. 260, pp. 987-8.)

The argument for sulfur seems not to have been heard—sulphur still rules Britannia absolutely.


Footnote
*It's a digraph, not a diphthong, but what do doctors know?


In other news...
Votes, please? I failed to be self-promotional enough to make it to the voting round for Bab.la's Top Language Lovers blog competition this year. (I foolishly assumed being nominated was enough to get to the voting round.) But I did get to the finals for my Twitter feed, under my name (Lynne Murphy), rather than my Twitter handle (@lynneguist). But if you (BrE) fancy helping me out with a vote (or sabotaging me with a vote against!), please click here to go to the voting page.

Cheeky Nando's: Marking season is to blame for many things, including my failure to do a timely, topical post on the Buzzfeed 'Cheeky Nando's' phenomenon. But happily Ben Yagoda has done one at the Chronicle of Higher Education Lingua Franca blog, so now I probably don't have to!  (To discuss cheeky Nando's, I recommend leaving comments at his post.) What I have done a post on is the BrE use of 'a [fast-food type]' to refer to a fast-food meal (a Chinese, a Burger King and, of course, a Nando's).

Thanks for reading to the bottom—this is longer than the (BrE) first-year essays I assign!

tape measure / measuring tape

Emma, an English friend now living in Canada, asked me:
Have you ever looked at measuring tape/tape measure for UK/US? A Canadian friend said she uses the first for the bendy fabric kind and the second for the more rigid, retractable builders' kind.
And I said 'That's how I do it too. What do you do?'  Since this was on Facebook, I now know that I know four Englishpeople who say tape measure for both. Everyone who's commented so far follows the English/North American division that Emma and her Canadian friend observed.

In other words, I learned to call this a measuring tape:

Photo by Ben Watkins: https://www.flickr.com/photos/falcifer/

and this a tape measure:

Photo by redjar: https://www.flickr.com/photos/redjar/with/136165399/

...and my BrE-speaking friends call them both tape measure.


What's interesting is that neither the North American semantic distinction nor the North America/UK difference is recorded in most dictionaries. They (both UK and US ones) tend to say measuring tape is another word for tape measure (Merriam-Webster [learner's dictionary], Oxford). Collins has measuring tape as an alternative for tape measure in its British English listings, but doesn't include it at all in American English. The American Heritage Dictionary doesn't have measuring tape at all. (The OED's first record for measuring tape is in 1805. Tape measure is 1873.)

Now, before you say 'maybe the distinction is a regional Americanism', note that Emma's friend is from western Canada, I'm from New York state and another Californian friend has reported that he makes the same distinction. There doesn't seem to be anything else similar among us either--male and female, people who sew and people who don't. Searching on Amazon.com, the distinction is not solid, but it's a tendency--one sees more of the metal things if searching 'tape measure' and more of the cloth things when searching 'measuring tape'. (The corpora just tell us that both terms are used in both countries.)

What the dictionaries do tend to tell us is that tape line is an American alternative for tape measure--but this is a term that's completely new to me. There is only one US example in the Corpus of Global Web-Based English, and in that one the author felt the need to clarify that they meant 'some kind of measuring tape of some sort'. In the Corpus of Contemporary American English, only one of the eight examples of tape line (as part of surveyors' tools) might be relevant--most are about making a line of tape (e.g. on a floor). And in the Corpus of Historical American English, the most recent relevant example is from the 1930s. The original citation in the OED is from Webster's American Dictionary of the English Language (1847), and it seems to have just been repeated in dictionaries ever since. So this looks much less current than the measuring tape/tape measure distinction. Attention lexicographers!

shock



In case you weren't paying attention, the UK had a general election yesterday, and the exit polls and final results were a surprise, given that the previous day's polls had indicated a much closer result. Because this is a language blog, I'm going to stick with a language observation, however tempting it is to do otherwise...

David S in the US emailed me with the following this morning:
Some time within the last year or so I started noticing the distinctive usage of the phrase "shock poll" in the British news media; since then it seems to have migrated to the US, though apparently not in major news outlets. It appears so far as I can tell to mean simply "poll with startling results", with adjectival "shock". Some googling shows that "shock survey" and "shock study" are out there as well.

Is this use of "shock" as an adjective in fact coming out of British newspaperese, and is its usage spreading beyond a delimited set of nouns?
Are British readers surprised to know this is a Britishism? Indeed it is. The dictionaries I've checked have no separate entry for shock as a noun premodifier meaning 'surprising', but it's very much there in the language, as can be seen in this screenshot from the Corpus of Global Web-Based English.

The columns of numbers are: TOTAL || US Canada UK Ireland Australia.


This list of words comes a good way down the list of [shock + noun] items in the corpus (hence the lack of column label(l)ing) because there are other premodifying uses that don't mean 'surprising', but have to do with more physical senses of shock, such as shock absorber, shock treatment and shock wave. These are General English, not specific to any country.


Another premodifying shock means 'intending to shock', as in shock rock (theatrical rock music, intended to shock/offend) and shock jock (i.e. a radio DJ who expresses unpopular opinions in order to gain attention and responses). The OED lists these as American in origin, but shock jock now has a much stronger showing in Australia in GloWBE--and it's known and used in BrE too. Some of the examples that are showing as British in the table above could also be interpreted as 'intending to shock'--particularly shock tactic. But for most of them, what is meant is 'a [noun] that the media didn't see coming'. Shock value, which also indicates 'intention to shock', is not American in origin, as far as I can tell. The OED's first example of that phrase is from the UK in the 1930s.

Though the OED doesn't list this 'surprising' sense of shock, it does have a 1974 example of shock news, which seems to be of the same ilk:
1974   Times 3 Apr. 1/1 (heading)    Shock news is broken to EEC ministers.
Like David S, I blame British media. British headlines are notorious for "noun piles", and shock poll is a two-word noun pile that is conveniently (for headline writers) shorter than shocking poll result.  I recommend reading Language Log on the subject of noun piles, but here's an example (without a shock):





The OED does, though, cover another 'shocking' BrEism: shock horror. This is used as a compound noun on its own or in a premodifying position, as in these OED examples:

1977   Gay News 7 Apr. 15/3   The message must have got through: certainly there were no shock-horror reactions and fun was had by all.
1980   Times Lit. Suppl. 31 Oct. 1240/4   The shock-horror world of the media men.
1981   Brit. Med. Jrnl. 18 Apr. 1312/2   The shock-horror TV Eye of recent weeks.

For some of us, the news today is less shock and more shock-horror. Oops, I got political. 






pleonasms

A pleonasm is a word or phrase with semantically redundant parts. So, for example, at this moment in time is a pleonasm because there are no moments outside time, so we don't really need to say in time. But people do.

Pleonastic expressions are things that language haters like to hate on. (These people often claim to be language lovers, but they don't seem to be very good at the love part.) So, they're the kind of thing that people complain to me about, with the Americans saying "Why do the British say X? It's repetitive and illogical", and the British saying "Why do Americans say Y? It's repetitive and illogical."

At their worst, these complaints come out as "Why do Americans/Brits always add extra words?"

When I get those complaints, I reply with some phrases from the speaker/writer's own dialect that have 'illogically redundant' words (it's not hard to do) and I say something like "language is not logical and it thrives on redundancy".

I mean, why say Yesterday we baked a cake? Yesterday is in the past, so why bother with the past tense marking on the verb? So redundant. Chinese wouldn't put up with that.

Thinking about these accusations that Brits/Americans always add extra words, I put a call out on Twitter and Facebook for BrE/AmE-specific pleonasms that others have noticed. We can see from the resulting lists below that there are no innocent parties in the Pleonasm Wars. Many of the expressions aren't only said in the 'offending' dialect, but they are more common in one than the other. To indicate the relative "Americanness" or "Britishness" of a phrase, I've given a ratio, which indicates the proportion of instances of the phrase in the British and American portions of the Corpus of Global Web-Based English. (The minority uses in the other dialect may be things like "Can you believe the British call beets beetroot?". That is, the fact that there are some in the other dialect doesn't mean it's necessarily really used in that dialect. The ratios help indicate the chances that it really is AmE- or BrE-specific.) I've bolded the bit of the expression that could arguably be left out without a change in meaning and put links to places I've discussed these before, if available.

American expressions that British folk might find pleonastic
irregardless       5:1  (though generally considered non-standard in AmE)
in and of itself   3:1
tuna fish            3:1 (0 BrE instances as closed compound tunafish)
where I( a)m at  2:1  (again, not exactly standard AmE; and the corpus numbers have a lot of 'noise')

(An American one I didn't count was off of because the of is there for grammatical reasons not semantic ones. See the old post for discussion.)

British expressions that American folk might find pleonastic
beetroot             22:1
hosepipe            13:1
in N days' time  10:1
goatee beard      9:1
go and [verb]    e.g. go and see = 6:1 versus go see 1:2; note that go+verb predates go and verb in English--the and has been added in BrE, not deleted in AmE
postgraduate      6:1
station stop         4:1
at this moment in time    4:1
chocolate brownies         3:1
general consensus        1.6:1
late addition (2019): marker pen 24:1
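
If you'd like to work out ratios like these for yourself, here is a minimal sketch in Python. It is not how the figures above were produced. It simply assumes you have raw hit counts for a phrase from the British and American portions of GloWBE, and that the two portions are close enough in size to compare directly (the sketch does not check this). The function name format_ratio and the counts in the usage lines are made up for illustration.

```python
def format_ratio(hits_a: int, hits_b: int) -> str:
    """Express two raw hit counts as an 'A:B' ratio with the smaller side scaled to 1.

    hits_a and hits_b are hypothetical counts, e.g. BrE hits and AmE hits
    for the same phrase. They are not taken from the corpus.
    """
    if hits_a <= 0 or hits_b <= 0:
        raise ValueError("both counts must be positive to form a ratio")

    def scaled(larger: int, smaller: int) -> str:
        value = larger / smaller
        # Assumption: whole numbers from 2 upwards (like 22:1),
        # one decimal place below that (like 1.6:1).
        return str(round(value)) if value >= 2 else f"{value:.1f}"

    if hits_a >= hits_b:
        return f"{scaled(hits_a, hits_b)}:1"
    return f"1:{scaled(hits_b, hits_a)}"


# Invented counts, purely for illustration
print(format_ratio(440, 20))   # 22:1
print(format_ratio(160, 100))  # 1.6:1
print(format_ratio(50, 100))   # 1:2
```

The rounding rule is a guess at how the lists above are written, not a documented convention. If the two parts of a corpus differed much in size, you would want to convert the counts to per-million-word frequencies before comparing them.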


You might want to argue that some of these are not redundant. It is a matter of perception. Brits might say beetroot isn't redundant because it distinguishes that part of the plant from the greens, but beetroot is redundant to Americans in the same way that carrotroot would be. Chocolate brownies is redundant because in AmE if it's not made of chocolate, it has to be called something else (e.g. blondies). (Americans do have the word brownie for other things too, but the context is enough to let us know it's a baked good and not a fairy.) It's been argued to me that station stop is not redundant because trains sometimes have to stop (e.g. for a signal) when they're not at a station, and they sometimes pass stations without stopping. Did you know there's a tuna fruit?

In the end, the Twitter and Facebook and email people gave me more British [alleged] pleonasms than American ones.  Possible reasons for this:
  • Maybe British English does have more of them.
  • Maybe my social media posts were at better times for the US than the UK. (My waking hours don't quite fit the UK, in spite of 15 years' residence.)
  • Maybe Americans notice British pleonasms more than Britons notice American pleonasms. (I was required to buy a copy of Strunk and White at college. I can't imagine the same happening in the UK, where writing isn't a required university subject. So, maybe Americans are trained to cut extra things out of language where British folk are not. We're the country most likely to excise extra letters in the spelling system too.)
 Feel free to raise the American pleonasm count (or the British one) in the comments. If I like them, I may retroactively add them to the list here.



All my linguistically-correct tolerance for pleonasms aside, I am a ruthless redactor of extra words in academic writing. I train my students in Strunk and White's Rule 13: Omit needless words. If they write
Another reason why the categorisation of chocolate* is significant for humans derives from the fact that humans are essentially and uniquely a ‘languaging’ species.

...they get back the following, with an obnoxious note along the lines of "Your way: 24 words; My way: 11 words. Don't make me read twice as many words as I have to!!": 
~~Another reason why the categorisation of c~~Chocolate* is also particularly relevant ~~significant~~ for humans ~~derives from the fact that humans are essentially and uniquely~~ as a ‘languaging’ species.
[i.e. Chocolate* is also particularly relevant for humans as a ‘languaging’ species]
* The noun has been changed to chocolate in order to protect the author's identity. But chocolate is particularly relevant to humans as a 'languaging' species. Without it, we couldn't have Cathy cartoons.


In writing academic essays for which (a) you have a word limit, so (b) the more words you use, the less you can say, and (c) you can be assured that your reader is going to be tired and grumpy before they even start reading, pithiness rules the day.


Acknowledgements
Thanks to those who contributed pleonasms to the list: Amanda P, Barbara J, Catherine P, David L, Iva, Jennifer, Kim E, Naomi N, Nicole S, Pam T, Rebecca M, Richard H, Sian C, Simon B.
I don't give full names unless I'm given permission to, and I am always happy to link your name to your blog/Twitter/webpage. So, if this applies to you, let me know and I'll add surnames and/or links.

an appreciation

I'm overdue for blogging here (I have a few topics lined up and partially researched) in part because I spent a very, very long time on US taxes and FATCA. This is definitely worthy of a rant. The US treatment of its expatriated citizens is absurd. But lots of other people are doing that rant. And I come here not to rant, but to appreciate.

I feel extremely privileged that writing this blog has led to so many interesting, polite, cooperative, informative, entertaining and just plain rewarding interactions--mostly online, occasionally in real life. Last week, a reader, correspondent and virtual friend died unexpectedly. I'm finding it strange to realise that you can miss someone you've never met. But the fact that the world is missing such a funny, interesting/interested, and generous person is difficult knowledge to have. That's before one even starts to consider that there are people who loved him closely who will be affected far, far more than internet acquaintances like myself. My heart goes out to them, though I do not know them.

Writing for the internet public about language is hard. It's also fun and has lots of perks. But it is hard because it's risky. There's always someone there to tell you that they think you're wrong, that they think you're unqualified, that you didn't talk about what they wanted you to talk about. It's hard because you can't always tell if people who respond to you are joking or talking down to you, if they're exasperated or just brief. And there are certainly people out there who haven't yet figured out how to tell when I'm any of those things. My strategy is to always try to read anything sent to me in the most positive way possible--to imagine a kind smile on their imagined faces and to try to have a sincere smile on mine when I reply. If I can't do that, it's better not to reply at all. I don't always succeed in not-replying to perceived rudeness, but with practice it gets easier.

Anyhow, that all said, my life on the web has been easy. (Which is good because the grief I do get is plenty enough!) Even though there is no shortage of people willing to be very rude on the internet about the national dialects I write about, they don't seem to come here (or to my Twitter feed or Facebook page) very often. Or maybe they do, but they behave themselves when they come here. If so, I'm very grateful to them for that restraint.

But more generally, the people who hang around this blog and virtually interact with me seem to be lovely people. If we knew each other in real life, we might well drive each other (BrE) bonkers, but maybe not.* There have been readers/commenters who were active for a while and then faded away; I'll never know if they're just lost to the blog or lost to the world. There are others who've been the blog's constant companions for years. And I'm sure the majority drop in for a word then forget about the blog. Whichever one you are, I just want to take a moment to appreciate the interactions we've had and will (I hope) continue to.




At any rate, here's to Marc Naimark. He is missed. As a tribute, here are some of the blog posts he inspired:
finger-tip search
write (to) someone
the big list of vegetables


* Marc and I got to know each other on a more personal level than some of us have, because we became Facebook friends in the early days when I accepted friend requests from names I recogni{s/z}ed from the blog. I now rarely accept friend requests from people I've not met in person. Sometimes I think I should do so, knowing how valuable I found those interactions with Marc, but on the other hand there were other requests that I accepted, then later became uncomfortable at having let those strangers into my family life. I'm sticking with that anti-social social-media policy (and directing people I don't know to interact with me on the Lynneguist page or Twitter feed) not because I don't want to get to know you better, but because my child's privacy is my priority. That said, perhaps we'll meet...

Abbr.

AmE = American English
BrE = British English
OED = Oxford English Dictionary (online)