Imagine trying to make sense of the world with only a thousand different words at your disposal. It might cause difficulties, but perhaps not the ones you might expect.
We can see what it might look like in xkcd’s amusing Up Goer Five, which despite being deliberately silly is, in its own context, a perfectly adequate intralingual translation of a diagram that demonstrates the main points of how the Saturn V launch vehicle was constructed. This is rocket science. But it uses only the “ten hundred” most common US English words (the word “thousand” is ironically not one of them).
It’s a brilliant poster and I want one, but like many hilarious things, it would be horribly inappropriate if applied to most real-world situations. I would not actually use it even in a primary school classroom, at least not as a tool for explaining how rockets work. I think that nearly everyone would agree that an adequate or appropriate translation of “hydrogen” in the context of the Hindenburg Disaster is very unlikely to be “the kind of air that once burned a big sky bag and people died”. But the frequencies of that sentence’s words are not the essential problem. Sensitivity and culture are: the sentence is marked, unexpected, just not the kind of thing that anyone ever says.
The English word “hydrogen” is only a couple of centuries old and derived from the Greek (via French) for “bringing forth water”, which as an explanation of hydrogen’s nature is only illuminating in specific contexts. “Inflammable air” might have worked if you and I were 18th-century chemists. It is not necessarily any individual lexical choice, or indeed the size of the lexicon from which we choose, that determines fidelity and captures intention. Words are not often uttered as individual, stand-alone entities in their citation form. We speak as other people expect us to speak, almost all of the time. Language and meaning are co-constructed. It can take bloody ages.
In German, that most whimsical of languages, hydrogen is Wasserstoff – “water matter/material”. What does “a word” mean in a fusional language where Rhabarberbarbarabarbarbarenbartbarbierbierbarbärbel is allowable? Whether or not my in-laws and their recent ancestors really have “a word for hydrogen”, it didn’t stop them developing rockets. (Germans also name their televisions Fernseher, the “far seers”, which might sound like something from Lord of the Rings but carries the same intended meaning as the borrowed Greek root tele- and Latin root visi-.)
It can take several drummings-home to get sign language students and aspiring interpreters to stop asking the question “What is the sign for …?” – a question that implies an underlying mindset that different languages always have word-for-word, code-for-code correspondences – and ask for the context instead: “How would you sign … when we’re talking about …?” No-one can say exactly what the size of the British Sign Language (BSL) lexicon is: existing online dictionaries are on the order of 4,000 to 5,000 signs. Corpus linguistics may shed more light on that in future, but we will still have to deal with the question of how conventional and widespread a sign has to be before it “officially” becomes BSL – fortunately we have no Académie française staffed by dusty, conservative Immortals in the Deaf community – and even what is meant by “a sign”. At any rate, I’m pretty confident in asserting that there is no well-established individual sign in BSL that encapsulates exactly the same concept as “hydrogen” does. Instead, Deaf scientists and interpreters borrow from English, just as English borrowed from the French which borrowed from the Greek. That is only a problem if you want it to be one.
There is an urban myth that “newspaper” The Sun has fewer than a thousand different words in it. Like many things we really, really want to be true, it isn’t even close. Prolific linguist and lecturer David Crystal emphatically debunks the idea on his blog, stating that the linguistic diversity of a copy of The Sun with proper nouns removed is about 8,000 words – roughly equivalent to that of the King James Bible, one of history’s most influential and discussed translations (for better or worse). Professor Crystal’s demolition is the best antidote to some of the bilious, contradictory and pathetically incompetent reporting which tried to paint British teenagers as only knowing or using 800 words: sorry, Daily Mail, but you are having a bad problem and you will not go to space today.
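For anyone tempted to check a claim like this at home, counting the different words in a text takes only a few lines. The following is a rough sketch rather than a reconstruction of Professor Crystal’s method: the function name `distinct_words` is made up for illustration, and it makes no attempt to strip proper nouns or to group inflected forms (“sign”, “signs”, “signed”) under one headword, both of which a careful count would need to decide on.

```python
import re

def distinct_words(text):
    """Return the set of different word forms in a text, ignoring case.

    Simplifying assumption: a "word" is a run of the letters a-z,
    optionally joined by apostrophes (straight or curly), so "don't"
    counts as one word. Digits, accented letters and hyphenated
    compounds are not handled.
    """
    return set(re.findall(r"[a-z]+(?:['’][a-z]+)*", text.lower()))

sample = "The dog saw the other dog. The other dog didn't mind."
words = distinct_words(sample)
print(len(words), sorted(words))
```

Even this toy version forces the awkward question the rest of this post keeps circling: what exactly counts as “a word”?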
Past a certain point, a fairly low bar for complexity, it doesn’t practically matter to discourse what the size of a lexicon is. To say that meaning explodes out of language is like describing the Chicxulub Impact as a bump in the night: we simply cannot even start to visualise the staggering immensity of our own linguistic potential. If you want to try the impossible, or at least get a sense of the awe-inspiring magnitude of the numbers involved, you can turn once again to xkcd and get an answer to the question “How many English sentences can fit into a ‘tweet’?” The answer would be, for practical purposes, the same as for “How many BSL sentences can fit into a ‘Vine’?”: enough to fill a universe of universes.
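To get a feel for the size of that number, here is a back-of-the-envelope sketch. It rests on two assumptions that are not established in this post: a message length of 140 characters, and an information content of roughly 1.1 bits per character, which sits within the range of Claude Shannon’s classic estimates for written English. The names `count_messages` and `order_of_magnitude` are invented for the example, and the result is only as good as those assumptions.

```python
import math

def count_messages(length_chars, bits_per_char):
    """Approximate count of distinct meaningful messages of a given length.

    If each character carries about `bits_per_char` bits of information,
    a message of `length_chars` characters can take roughly
    2 ** (length_chars * bits_per_char) different meaningful forms.
    """
    return 2.0 ** (length_chars * bits_per_char)

def order_of_magnitude(n):
    """Return the power of ten just below n."""
    return math.floor(math.log10(n))

TWEET_LENGTH = 140            # assumed character limit
ASSUMED_BITS_PER_CHAR = 1.1   # assumed entropy of English text

tweets = count_messages(TWEET_LENGTH, ASSUMED_BITS_PER_CHAR)
print(f"Roughly 10^{order_of_magnitude(tweets)} possible messages")
```

Under those assumptions the count comes out on the order of 10^46. Nudging the bits-per-character figure up or down shifts the exponent by several places, but never brings the answer anywhere near a number a human being could picture.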