Endless Fun With Google Ngrams. A Few Examples to Get You Started.

Google Labs

Without much fanfare, search giant Google has unleashed an incredibly addictive tool on an unsuspecting world.

It’s called the Ngram Viewer and it quickly gives users a colourful graph which shows how the use of any word or phrase has changed over the last few centuries.

In fact it spans a period (astoundingly) of 500 years, though the best results are reserved for the years between 1800 and 2000. This is possible because Google has been quietly carrying out the digitisation of around 15 million books since 2004.

Now a subset of those books (actually around five million of them, accounting for around 4% of all the books ever published) can be searched, covering some 500 billion words in total.

What does this mean? Well, in data visualisation terms anyone can now compare the impact certain people, events, names or phrases have made on popular culture down the centuries.
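For the curious, here is a rough sketch of the kind of sum that sits behind a graph like this: for each year, count how often a phrase turns up and divide by the total number of phrases of the same length printed that year. This is only an illustration under my own assumptions, not Google's actual code. The function name `ngram_share_by_year`, the toy `corpus` and the decision to ignore upper and lower case are all made up for the example.

```python
from collections import Counter


def ngram_share_by_year(corpus, phrase):
    """Return {year: share} for a phrase in a toy corpus.

    corpus is a list of (year, text) pairs (a hypothetical stand-in for
    scanned books). The share is the number of times the phrase occurs
    divided by the number of word sequences of the same length that year.
    Case is ignored here, which is a simplification.
    """
    target = tuple(phrase.lower().split())
    n = len(target)
    hits = Counter()
    totals = Counter()
    for year, text in corpus:
        words = text.lower().split()
        # Slide a window of n words across the text to build the n-grams.
        grams = list(zip(*(words[i:] for i in range(n))))
        totals[year] += len(grams)
        hits[year] += sum(1 for gram in grams if gram == target)
    # Skip years with no n-grams at all to avoid dividing by zero.
    return {y: hits[y] / totals[y] for y in sorted(totals) if totals[y]}


# Invented sample text, purely for illustration.
corpus = [
    (1965, "the beatles played and the beatles sang"),
    (1965, "the stones played"),
    (1985, "the stones toured and the stones recorded"),
]

for name in ("the beatles", "the stones"):
    print(name, ngram_share_by_year(corpus, name))
```

Dividing by the yearly total matters because far more books were printed in 2000 than in 1800, so raw counts alone would make almost every word look as if it were on the rise.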

I’ve seen graphs showing how Frank Sinatra and Elvis Presley jostled for a place in the social psyche of the last century, and another showing how El Dorado and Atlantis have vied for position as the most written about mythical place in the English language.

You can also find out why this is likely to be an absolute delight to linguists, wordsmiths and lexicographers of every stripe, as described in this fascinating take (including a bunch of interesting examples) on Joel Segal’s blog. This Guardian article also includes a selection of Ngrams pulled together by punters, while a fast-growing collection of examples can be found on Twitter at #ngrams.

Nothing is spared and users can delve into the thorniest subjects including politics, religion and sport. Of course, most people are just going to use it to compare historical usage of swear words (yep, that includes me!). Ach – here, it’s easier if I show you:

Since we’re talking about pop culture let’s start with … well, pop culture. Two of the biggest and most influential acts of the past 40 years were The Beatles and the Rolling Stones. Indeed, the rivalry between the bands and their respective fans during the 60s and 70s is now the stuff of music legend.

Beatles V Stones

But which was really the more written about? Well, the results are surprisingly close. While The Beatles ace it, there was a long spell in the 80s and 90s when Keef, Mick and co were a lot more written (and presumably talked) about.

Moving on, though still sticking with The Beatles, just which of the lovable mop tops made the biggest impact with the chroniclers of the time? The Lennon-McCartney songwriting pairing will live forever, but has that been reflected in the literature of the age?

John, Paul, George and Ringo

As you can see, even before his 1980 murder, John Lennon was written about considerably more than Macca. Since his untimely death his position as the most written about Beatle has increased significantly.

All of which started me thinking about the most valuable dead celebrities. Searches can only be conducted up to 2008, before Wacko Jacko snuffed it. So I decided to see how mentions of poor old John Lennon compared with those of other mouldering stars who were snatched from us too soon.

The enduring myths, legends and conspiracy theories around the deaths of Marilyn Monroe and Elvis Presley have kept both earning (and being written about) from beyond the grave.

Dead Celebrities

I was also intrigued to see how mentions of James Dean spiked dramatically in the mid and late 80s – matching my own memories of how he was (rather inexplicably, as far as I was concerned) something of an icon at the time I was hitting young adulthood.

None of those celebs will be making a Lazarus-like rise from the dead. Unlike that other act which seemed dead and buried, Take That. The boy band’s 2006 reunion was the start of a remarkable comeback.

But with results only searchable up to 2008, mentions of the group actually seem to have tailed off between 2006 and 2008, suggesting the reunion did not make it into bookish lore in time to register on this chart.

Manufactured Bands

What really surprised me was that clean cut Irish warblers Boyzone managed to keep within touching distance of Take That (bet Louis Walsh wishes he was in touching distance of the lads) in terms of book mentions during and beyond the mid 90s.

Neither boy band came anywhere near the phenomenon that was the Spice Girls. Having seen the success of the Take That reunion, I suspect we’ll be seeing a money spinning get-together of Ginger, Posh, Baby, Sporty and Scary in the not-too-distant future.

Religion also throws up some fascinating – if useless – results. Here in Scotland parts of the community have been blighted by sectarian divisions between Catholics and Protestants, pretty much as a symptom of the Troubles in Ireland. Here’s how historical tomes have recorded the words ‘Catholic’ and ‘Protestant’.

Catholic or Protestant

On a more global scale, the world has more recently been divided between Christians and Muslims. To refine that search I popped in the terms ‘Christianity’ and ‘Islam’.

As you’ll see the English language repository of books was far and away dominated by ‘Christianity’ until the end of World War One, when usage evened out. In the 1970s mentions of ‘Islam’ spiked for a while. However by the late 1990s, mentions of Islam in English language books were outstripping mentions of Christianity.

Christianity and Islam

This was happening several years before the 9/11 attack on the Twin Towers. Since then, the following graph (for the period 2000 to 2008) shows how mentions of ‘Islam’ are now far exceeding those for ‘Christianity’.

Post 9/11

In politics, the following graph shows how recent American presidents have fared in books published to 2008. Ronald Reagan is still referenced more than Bill Clinton while the term ‘George Bush’ (despite two presidents with that name), lags some way behind.

American Presidents

All three are on a downward trend while, from 2004, name checks of Barack Obama in the books of the age have been headed just one way – up. It will be fascinating to see how that develops over the next few years.

Closer to home I wanted to see how the two giants of recent politics have fared in print – the Tory party’s Iron Lady, Margaret Thatcher, and New Labour’s colossus, Tony Blair.

Thatcher v Blair

As the graph shows, mentions of Labour’s most successful ever leader eclipsed those for the country’s first woman leader around early 2002.
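If you would rather not squint at the chart to find the moment one line overtakes another, a few lines of code can pick out the crossover year. The sketch below is just one way of doing it, and everything in it is my own invention for illustration: the function name `first_overtake_year`, the two dictionaries and every number in them are made up, not real Ngram figures.

```python
def first_overtake_year(series_a, series_b):
    """Return the first year in which series_a climbs above series_b.

    Both arguments are {year: value} dicts. A crossover is counted when
    series_a was at or below series_b in the previous shared year and is
    above it in the current one. Returns None if that never happens.
    """
    years = sorted(set(series_a) & set(series_b))
    for prev, curr in zip(years, years[1:]):
        was_behind = series_a[prev] <= series_b[prev]
        now_ahead = series_a[curr] > series_b[curr]
        if was_behind and now_ahead:
            return curr
    return None


# Made-up yearly shares, NOT real Ngram data.
newcomer = {2000: 1.0, 2001: 2.0, 2002: 4.0}
veteran = {2000: 3.0, 2001: 3.0, 2002: 3.5}

print(first_overtake_year(newcomer, veteran))
```

Bear in mind that the real graphs are smoothed by default, so a crossover read off the chart may not match the raw yearly numbers exactly.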

A staple annual fixture in the Scottish media is the report on the most popular boys’ and girls’ names in any given year, culled from birth certificates lodged with the Registers of Scotland.

In 2010 the top four boys’ names in Scotland were Jack, Lewis, James and Logan. As this graph shows, number three name, James, has been the runaway most popular in print. Lewis and Jack have been pretty closely matched. Logan has come from nowhere.

Popular Boys' Names

Last year the most popular girls’ names were Sophie, Olivia, Ava and Emily. It is fascinating to see the peaks and troughs in the popularity of each over the past 200 years.

Popular Girls' Names

But I couldn’t pass an opportunity like this without having a look at my own chosen field – the media. As this graph shows, in the past 100 years the term ‘newspaper’ was eclipsed first by ‘radio’ and later by ‘television’.

Right up to 2008, however, mentions of all three traditional media models are still more prolific in print than the term ‘internet’. That won’t last long.

Media types

Changes in the Media

And finally, as a journalist turned PR man, I wanted to know how the respective professions have fared over the past 100 years or so. I have to say, I didn’t expect the term ‘public relations’ to be anywhere near as well-recorded in print as the term ‘journalism’. I’m pleasantly surprised by this:

Journalism and PR

So what’s the point of all this? Nothing. But it is fascinating and highly addictive – and I’d love you to share any interesting Ngrams you come up with.

