## Franklin Pezzuti Dyer

### Ambiguity in Language

#### Visual and mathematical ambiguity

You’ve probably seen the following famous picture (or some variation of it) before:

Is it two black faces in the foreground over a white background, or a white goblet in the foreground over a black background? Of course there’s no definite answer to this question - this picture is ambiguous, meaning that it has more than one possible interpretation. There are lots of visually ambiguous pictures like this, such as the famous Necker Cube:

We can’t tell what configuration the depicted cube is in because there’s no indication of which edges or faces are in front of the others. Really, there aren’t just two possible interpretations of this picture, but infinitely many, because we could be looking at a small cube close up, a large cube far away, or anything in between. When a three-dimensional object like a cube is depicted using only two dimensions in a flat image, some information is lost. This makes sense, since it’s not always possible to translate information from one medium to another without some information loss or errors, especially when cramming complex three-dimensional information into two measly dimensions.

I can’t resist including the following classic use of deliberate visual ambiguity to humorous effect:

However, I don’t want to spend too much time just looking at optical illusions and trick photography. The focus of this blog post is ambiguity in language - when some phrase or sentence, spoken or written, has more than one possible meaning. It’s a fun puzzle to deliberately construct ambiguous phrases or look for alternative meanings of sentences in a text while reading. However, in addition to this, analyzing ambiguity in language (and comparing it across languages) tells us something very interesting about the nature of human language and how our brains process language. I’ll spend the majority of this post dissecting and classifying various (often silly) examples of linguistic ambiguity and comparing across languages (mostly just between English, Spanish, and German). Towards the end of the post, I’ll briefly discuss the relevance of this analysis to the way humans naturally parse language.

Before jumping into language syntax, it would be helpful to first consider a mathematical example of ambiguity. Suppose I asked you (verbally) to compute “two times one plus three.” There’s clearly some ambiguity in this request: I could be asking you to compute twice the sum of one and three or the sum of twice one and three Similarly, if I asked you to calculate “one plus three squared,” I could either be referring to or This is why your middle school math teacher made such a big deal about PEMDAS. Mathematics is supposed to be an extremely precise language, which means that misunderstandings or equivocations need to be avoided at all costs. When we write a mathematical expression, it should refer to one thing and one thing only, not multiple possible things. When dealing with arithmetic and operations like addition, multiplication, and exponentiation, a standardized order of operations, coupled with the careful use of parentheses, is used to resolve ambiguity. For an example of parenthesization gone wild, see this question from Mathematics Stack Exchange.

### Words and sentences

The most obvious examples of linguistic ambiguity arise from words like “bear” and “bare” that sound the same, or words that can serve as more than one different part of speech, like “man,” which can be either a noun or an adjective.

But I’m not just going to fill this blog post with a long list of words like this, because not only are they the most boring type of linguistic ambiguity, but they also aren’t actually very common. When we communicate with each other, we don’t just blurt out words or phrases. We (usually) speak and write in full sentences, and the context provided by other words in a sentence helps us eliminate ambiguity. If someone says to me

“Meet me here at the same time tomorrow.”

I won’t wonder whether or not they intended to say

“Meat me here at the same time tomorrow.”

because that just doesn’t make any sense. “Meat” is almost always a noun, so the second sentence doesn’t contain any verb, and is completely grammatically awry. However, if someone tells me

“I have to go shopping for vegetables and meet for lunch with my friend.”

I really might not be sure whether the speaker intended to say

“I have to go shopping for vegetables and meat for lunch with my friend.”

In the second sentence, some other things are also unclear too. Are you going shopping with your friend, or eating lunch with your friend? Are the vegetables and meat both for lunch, or just the meat? I want to explore this kind of complex grammatical ambiguity, not just list a bunch of words with multiple meanings. Something happened in that sentence that allowed a certain word to serve either as a noun or a verb. What’s going on?

Let’s take a step back and talk about what exactly a sentence is. First of all, our language is creative, meaning that we can form sentences that we’ve never read or heard before by applying a few rules. Unlike many animals with a finite number of “signals” that have fixed meanings (bark means “I’m excited,” tail wag means I’m happy, butt sniff means “let’s get to know each other,” etc.), we can combine words and phrases in new ways to convey new ideas that we haven’t learned anywhere else and that certainly aren’t hardwired into our brains. If I so choose, I can create the sentence

“Today I saw a purple octopus with a top hat scuttling down the street.”

even though I’ve certainly never heard or read it before. Not only that, but I can use it to make you imagine and visualize a ridiculous event that never happened!

Our language is also recursive, meaning that we can build words or phrases and then stick them into other structures or grammatical templates to create arbitrarily long and complex sentences. For instance, take the sentence

“He stinks.”

Now, here’s a nice grammatical structure in the English language that allows me to convey secondhand information: if person X tells me sentence Y, then I can tell you

“X said that Y.”

Suppose my friend Alice tells me “that guy stinks.” If Alice’s observation stands in for Y and Alice stands in for X, then I can say

“Alice said that he stinks.”

Not only that, but if Alice heard this information from Bob, then I can go one layer deeper and say

“Alice said that Bob said that he stinks.”

And if Bob also received this as secondhand testimony from Chad, I can say

“Alice said that Bob said that Chad said that he stinks.”

I could keep going, but you get the picture. This is of course only a simple example; we apply many such templates and substitution rules in everyday conversations, and surely many more in formal writing.

#### Word-chain devices

Linguists have been trying for a long time to formalize and model the process that we use to build and parse sentences. In The Language Instinct, Steven Pinker mentions a model called a word-chain device that was proposed as the mechanism by which we form recursive sentences. Here’s an example borrowed from Pinker’s book:

By starting at “the/a/one” and following arrows until we reach the end, we can form all kinds of grammatically correct sentences, like

“A dog eats candy.”
“The happy boy eats ice cream.”
“One happy happy happy happy happy girl eats hot dogs.”

It seems like a good enough model for this simple example, but Pinker points out that it falls apart when we try to create word-chain devices for more complex sentences. To borrow another example from Pinker, consider the sentences

“Either the girl eats ice cream, or the boy eats candy.” “If the girl eats ice cream, then the boy eats hot dogs.”

This leads to the following word-chain device:

It seems to work at first, but closer inspection reveals that this broken mechanism allows us to produce inanities like

“If the dog eats ice cream or the girl eats candy.”
“Either the boy eats candy then the girl eats hot dogs.”
“If the happy dog eats ice cream then the dog eats ice cream then the dog eats ice cream.”

We could probably salvage this model by introducing some more complicated rules regarding which arrows to follow when, and how many times each arrow can be followed, and so on. But at that point, Occam’s Razor would obligate us to shave the whole thing away. Clearly language is more complex than just a series of movements from one word to the next.

#### Syntactic trees

Currently, the prevailing model of grammar in sentences is the syntactic tree, which I will describe momentarily, and which will be very useful for allowing us to understand syntactical ambiguity. However, be warned that I am not a professional linguist, and the syntactic tree model that I lay out might not be the most current version. In fact, there are some particular modules of the syntactic tree that are commonly used, but that would overcomplicate things for the sake of this post (e.g. the complementizer phrase and inflectional phrase). A simplified version of the syntactic tree will suffice for our purposes.

For any reader who isn’t already familiar with syntactic trees, have a look this example, and then I’ll explain what each part means:

This is a syntactic tree for the sentence

The cat sat on the mat.

Here’s a quick description of the types of modules of this tree:

• S represents an entire sentence, which branches into a noun phrase (NP) and a verb phrase (VP).
• NP represents a noun phrase, which could just be a noun (N) by itself like “cats” or “Jerry.” It could also be a noun preceded by a determiner (D), as in “the dog.” It can also contain an adjective.
• N represents a noun.
• Det represents a determiner like “a,” “the,” or “one.”
• VP represents a verb phrase, which always contains a verb but can also contain other modules, like another noun phrase (NP) to stand in as a direct or indirect object, a prepositional phrase (PP), or even another verb phrase.
• V represents a verb.
• PP represents a prepositional phrase, like “in Nevada” or “on the table.” This typically consists of a preposition (P) like “in” or “on” followed by a noun phrase like “Nevada” or “the table.”
• P represents a preposition.

Let’s have a closer look at the example sentence from earlier:

“I have to go shopping for vegetables and meet for lunch with my friend.”

If we take this to be a spoken sentence, we must also take into consideration the phonetic ambiguity between “meet” and “meat.” I detect at least five different possible meanings, and they can be represented by the following trees:

What monstrosities! By the way, the little triangles mean “there should be some more branches in here but I’ve left them out.” Again, be warned that my dissection of this sentence might not be exactly correct (for instance, I’m not sure if the verb phrase “go shopping” should instead be the noun phrase “to go shopping”), but it at least demonstrates my point that a simple sentence can have many possible meanings. To facilitate reading these trees, here are explanations of the specific meanings of the sentences in each one, respectively:

• I have to go shopping for veggies, and then I have to meet my friend so that we can eat lunch together.
• I have to go shopping with my friend, so that we can buy vegetables and meat, both of which will be eaten for lunch.
• I have to go shopping for meat and vegetables, both of which will be eaten when my friend and I eat lunch together.
• I have to go shopping for vegetables and meat, the latter of which will be eaten when my friend and I eat lunch together.
• I have to go shopping with my friend, so that we can buy vegetables and meat, the latter of which will be eaten for lunch.

That was a pretty complicated example. To really isolate and understand different sources of ambiguity, we’ll have to deal with some simpler examples. However, before moving on, I’d like to make an analogy that helps (me, at least) understand why such ambiguity occurs. As the syntactic tree indicates, the actors, actions, and abstractions in a typical sentence interact in a tangled web of nested relationships. When we attempt to pack all of this information into a single linear sentence, we lose information in the same way that we lose information when we draw a three-dimensional cube on a two-dimensional piece of paper. Loading up a simple sequence of words with so much relational information for the reader/listener to unpack is almost like compressing data on a computer, sending it somewhere else, and decompressing it on another device.

#### Lexical ambiguity

The examples of ambiguity in language that most easily come to mind are those caused by these three common culprits:

• Homophones are words that sound the same but have different meanings. A common example is the pair ”meet” and ”meat”.
• Homographs are words that are written the same but have different meanings. For instance, ”rebel” can either be a verb or a noun (however, when spoken, the noun is pronounced “rebel” and the verb is pronounced “rebel”).
• Homonyms are words that are identical in sound and appearance, but have different meanings. There are plenty of words like “rebel” that can be nouns or verbs but are pronounced identically in either form, such as “man,” “dog,” “run,” and even “sandwich.” Another classic example is the word “bank,” which could either refer to the edge of a river or a financial institution.

In languages like Spanish and German whose words have genders, sometimes the meaning of a word depends on its gender. “El corte” means a cut or edge, while “la corte” refers to a royal court. “Der Moment” means an instant in time, while “das Moment” can refer to physical momentum. There even exist autoantonyms, or words that can take on two opposite meanings (called Januswörter in German, based on Janus, the two-faced Greek god), like “inflammable” in English.

Like I said earlier, this type of ambiguity by itself isn’t very interesting. There really isn’t much to analyze in sentences like

“John went to the bank.”
“That dog smells.”
“Er starb nach einem Kampf gegen einen starken Krebs.”

in which two or more meanings of a particular word are interchangeable without changing the structure of the sentence. It’s worth noting, however, that this type of ambiguity runs amok in Spanish verbs, since the first-person and third-person conjugations of verbs in the imperfect and subjunctive tenses are often identical. For instance, the simple sentence

“Lo tenía.”

Could either mean “I had it” or “he had it.” Also, the present-tense conjugations for verbs in third-person and formal second-person are always identical. Thus, the sentence

“¿Tienen hambre?”

could mean “are they hungry?” or, formally, “are y’all hungry?” In English and German, this doesn’t happen often because every sentence has to have an explicit subject. However, in German, the third-person plural pronoun “sie” sounds identical to the formal second-person singular pronoun “Sie,” so the sentences

“Haben Sie Hunger?”
“Haben sie Hunger?”

are distinguishable on paper but not in speech.

#### Syntactic ambiguity

Here’s an example of a sentence that is ambiguous because of a homonym, but its alternate meanings have different structures:

“He collects seeds and plants.”

It’s not as drastic of a structure change as in the earlier example with “meet” and “meat,” but here are the two trees representing its possible meanings:

Because the two meanings of “plants” are different parts of speech, the structure of the tree can’t be preserved. The difference between the two trees above is the position of this “twig”:

The “and” can conjoin a noun to another noun or a verb to another verb, but not a verb to a noun. When “plants” is a noun, “seeds and plants” makes sense and “collects and plants” does not, but the opposite is true when “plants” is a verb. Interestingly, this kind of confusion couldn’t happen in written German, since all nouns are capitalized. Also, German is a V2 word order language, meaning that the second word in a sentence/clause must be its verb.

In many sentences, however, no words change meanings at all - only the relationships between the words change. Consider, for example, the sentence

“I saw the man with binoculars.”

The trees corresponding to the two possible meanings of this sentence are given by

Either I looked at a guy using binoculars, or I looked at a guy who had binoculars. In this case, the prepositional phrase “with binoculars” is causing trouble, since it can modify either “saw” or “the man.” Actually, prepositional phrases cause all kinds of trouble, as in each of the following sentences:

“The dancers leapt, twisted, and turned on the television.”
“We have to liberate the prisoners from the American embassy.”
“Please put the cake on the plate in the fridge.”
“Der Schmetterling verwandelt sich in einem Baum.”

The second-to-last sentence could mean that a butterfly in a tree is metamorphosing, or that it is actually morphing into a tree (the same trick works in English, and you may recognize it from the old joke about a car turning into a driveway). Prepositions can really cause confusion if we nest them multiple times in long sentences, like this:

“Put the cake in the box in the fridge.”
“Put the cake in the box in the fridge in the corner.”
“Put the cake in the box in the fridge in the corner in the kitchen.”
“Put the cake in the box in the fridge in the kitchen in the neighbor’s house.”

The first sentence has $2$ possible meanings, the second has $3$ possible meanings, the third has $4$, the fourth has $5$, and as you can probably guess, a similar sentence with $n$ prepositional phrases tacked onto the end will have $n$ possible meanings. However, the preposition “in” is harmless compared to something like “next to.” If we know that X is in Y and Y is in Z, then we can conclude that X is in Z, making “in” a transitive relationship. However, if X is next to Y and Y is next to Z, it does not follow that X is next to Z. Consider the following sentences:

“Put the cake next to the pie next to the stove.”
“Put the cake next to the pie next to the stove next to the window.”
“Put the cake next to the pie next to the stove next to the window next to the clock.”
“Put the cake next to the pie next to the stove next to the window next to the clock next to the fridge.”

In this case, the first sentence has $2$ possible meanings:

“Put the cake next to (the pie next to the stove).”
“Put (the cake next to the pie) next to the stove.”

But the second sentence has $5$ possible meanings:

“Put the cake next to (the pie next to (the stove next to the window)).”
“Put the cake next to ((the pie next to the stove) next to the window).”
“Put (the cake next to the pie) next to (the stove next to the window).”
“Put (the cake next to (the pie next to the stove)) next to the window.”
“Put ((the cake next to the pie) next to the stove) next to the window.”

The third sentence has $14$ possible interpretations, and the fourth sentence has $42$. If you’re curious, the number of possible interpretations of a sentence like this with $n$ prepositions at the end is equal to the nth Catalan Number, meaning that it grows exponentially. Of course, nobody would ever say sentences like these, but if you’re a theorist like myself, you’ll find them fascinating anyways.

Prepositions aren’t the only parts of speech that cause such trouble. Almost any word that refers to another word can be similarly ambiguous. For instance, the adjective in the following sentence creates ambiguity in the same way as the prepositions in the previous examples:

“I see scrawny cats and dogs.”

Are the dogs scrawny too, or just the cats? Here are the trees corresponding to the two interpretations of this sentence:

#### Comparison with Spanish and German

In Spanish, different kinds of ambiguity can be created using adjectives because they follow the nouns they modify, rather than preceding them as in English. For instance, the sentence

could either mean that a lady shouted at her angry mother, or an angry lady shouted at her mother. However, sometimes the requirement of gender agreement between an adjective and its referent eliminate ambiguity, as in this sentence:

There’s no confusion here - the word “enojada” refers to the dude’s “madre”. If it referred to him, it would have to be “enojado.” German takes this even farther by requiring adjective endings that not only must agree in gender, but also in case (distinguishing between subjects, direct objects, indirect objects, etc.).

However, there is one variety of ambiguity that can occur only in German, not in Spanish or English. In sentences like

“Viele störenden Vögel vertreiben die Geschäftsleute.”

the subject and the direct object of the verb are unclear. This sentence could be telling us about some businessmen chasing away a bunch of annoying birds, or a bunch of annoying birds chasing away some businessmen. Personally, I find it surprising that a language that insures against so many other kinds of ambiguity (by anally requiring case endings and V2 word order) would allow for confusion about something as crucial to the meaning of the sentence as its subject and object.

To summarize, there are three main types of syntactic ambiguity that I’ve noticed in English, Spanish, and German:

• The most boring kind, in which the two meanings of a word like “bank” serve the same grammatical function and are interchangeable without altering the sentence’s structure
• A sentence in which a word can take on two different meanings, but these two meanings have different parts of speech, so the sentence structure is fundamentally different in the two possible interpretations
• A sentence in which the meanings of words are the same in all possible interpretations, but their referents/relationships change

Another thing that I find extremely interesting is the fact that some languages protect against ambiguities that I didn’t even realize ran amok in English. For instance, the sentence

“He ran in the gym.”

could either indicate that someone ran around inside of a gym, or that someone ran into the gym from outside. In German, these two possibilities would be expressed respectively by the following two sentences:

“Er lief in die Sporthalle.”
“Er lief in der Sporthalle.”

There are many other examples of this in other languages. In Spanish the verbs ”ser” and “estar” would both translate into the verb “to be” in English, but they distinguish between permanent, essential characteristics (“He is a human” or “It is brown”) and transient, circumstantial ones (“He is happy” or “It is on the shelf”). Some languages like Sanskrit, in addition to just having singular and plural numbers for nouns, also have a dual case, referring to exactly two objects. Some languages even add affixes to words or statements to indicate how certain the speaker is of the information being communicated, whether the speaker heard the information secondhand from someone else, and so on. Alas, if only I knew some more languages, then I could list more examples here.

#### Why we might have ambiguity

In English, we have $26$ different letters, meaning that the number of possible words with $n$ letters is equal to $26^n$. This means that the number of possible five-letter words is equal to

Okay, I’ll admit that some of these words, like “kqqxv,” would be virtually unpronounceable. But even if we only counted words in the much more pronounceable form “CVCVC,” where “C” stands for a consonant and “V” stands for a vowel, the number of words would be

However, the English language has fewer than $250,000$ words, and that includes thousands of archaic and obsolete words. If you think about it, there’s no good reason for us to have any words with more than five letters, since there are vastly more five-letter combinations than words that we currently use. So why the heck do we have homonyms with more than one possible meaning packed into a single word? There are clearly more than enough letter/sound combinations for us to give each concept its own word.

Of course, languages aren’t designed, but evolved. Prehistoric humans did not all sit down together and collectively decide to make a language - it formed naturally and gradually. Even modern attempts to standardize and control language are usually futile, because no single authority can control how thousands and thousands of people speak on a daily basis. Existing words undergo subtle changes in meaning, new words form as memes spread through a population, existing languages mix together during intercultural interactions, and slang forms. This all suggests that language is probably influenced more by evolutionary laws than deliberate human attempts to improve it by, say, reducing ambiguity.

But even an evolutionary perspective seems to suggest that a living, evolving language should tend to become less ambiguous. Wouldn’t a population of people with a more ambiguous language be more prone to misunderstanding? Shouldn’t the group of people with the most efficient and least ambiguous language have a better chance of survival?

Some have suggested that some amount of ambiguity actually makes a language more efficient, not less. The human brain is really good at making subtle inferences, reading between the lines, and making good guesses about missing information based on context and prior knowledge. Therefore, a language that eliminates as much ambiguity as possible might be extremely cumbersome to speak, whereas loosening things up by allowing a little bit of uncertainty, which the listener could easily (in fact, almost automatically) resolve, might save the speaker a lot of trouble while making communication only slightly less reliable.

If you think about your everyday exchanges in spoken and written language, this should ring true. How often does a grammatically ambiguous sentence actually cause a significant misunderstanding between you and another person? Although the esoteric and convoluted language present in bureaucratic forms and legal documents can produce frequent ambiguities with real and weighty consequences (which Pinker discussed briefly in The Language Instinct), these types of confusion wouldn’t have affected our ancestors while language was first emerging.

In fact, most real-life examples of ambiguity that come easily to mind are deliberate and intended humorously. Homonyms and homophones feature ubiquitously in cheesy jokes like

“What did the fast tomato say to the slow tomato? ‘Ketchup!’”
“Did you hear about the guy whose whole left side was cut off? He’s all right now.”
“One morning I shot an elephant in my pajamas. How he got into my pajamas I’ll never know.”
“Time flies like an arrow, and fruit flies like a banana.”

...not to mention that every single one of those telephone pranks that Bart and Lisa Simpson play on Moe the Bartender makes use of phonetic ambiguity between a name (“Amanda Huggenkiss”) and some potentially embarrassing phrase that the hapless Moe is tricked into shouting across the bar (“a man to hug and kiss”). There’s also a whole class of puns revolving around phonetic ambiguity in Japanese, and some of them are quite amusing (although I’m sure they would be even more amusing if I spoke Japanese).

So the reality is that ambiguity probably adds value to a language, not just by counterintuitively increasing its efficiency, but also adding all kinds of culturally priceless corny puns. And with that realization, I’ll conclude this blog post. I encourage you, reader, to keep an eye out for ambiguous grammatical structures whenever you’re reading, writing, speaking, or listening. Personally, this not only provides me with unlimited grammatical puzzles and word games with which to fill my idle time, but it also gives me a nice chuckle now and then.