Cultural Evolution 8: Language Games 1, Speech

I'm bumping this post, from 2010, to the top of the queue for two reasons: 1) the section "Language Games and Game Theory" is germane to my recent post, Why do we need a genotype-phenotype distinction for cultural evolution? Because minds are built from the inside. The post proposed as an answer: minds are built from the inside. From that it follows that we can't read one another's minds, which is my point of departure in this post. 2) The following section, "What is a language and what are the memes?," is where I first worked out my current approach to the genetic component of culture, which I have since come to call coordinators. The rest of the posts in this particular series are gathered under the tag CE workshop. Note: You might want to read the comments for this post.

* * * * *
The key to the treasure is the treasure.
– John Barth
But I’m not talking of language games in Wittgenstein’s sense, though the Wittgenstein of the Tractatus had a considerable influence on me as an undergraduate. No, I’m thinking of game theory, not something I’ve studied, though I did have an undergraduate course on decision theory taught by R. B. Braithwaite. But I’m getting ahead of the game.
As the title says, this post is about language. There’s been a fair amount of work done on language from an evolutionary point of view, which is not surprising, as historical linguistics has well-developed treatments of language lineages and taxonomy, the “stuff” of large-scale evolutionary investigation. While this work is directly relevant to a consideration of cultural evolution, however, I will not be reviewing or discussing it. For it doesn’t deal with the theoretical issues which most concern me in these posts, namely, a conceptualization of the genetic and phenotypic entities of culture. This literature is empirically oriented in a way that doesn’t depend on such matters.
The Arbitrariness of the Sign

In particular, I want to deal with the arbitrariness of the sign. Given my approach to memes, that arbitrariness would appear to eliminate the possibility that word meanings could have memetic status. For, as you may recall, I’ve defined memes to be perceptual properties – albeit sometimes very complex and abstract ones – of physical things and events. Memes can be defined over speech sounds, language gestures, or printed words, but not over the meanings of words. Note that by “meaning” I mean the mental or neural event that is the meaning of the word, what Saussure called the signified. I don’t mean the referent of the word, which, in many cases, but by no means all, would have perceptible physical properties. I mean the meaning, the mental event. In this conception, it would seem that the meaning cannot be memetic.
That seems right to me. Language is different from music and drawing and painting and sculpture and dance, it plays a different role in human society and culture. On that basis one would expect it to come out fundamentally different on a memetic analysis.
This, of course, leaves us with a problem. If word meaning is not memetic, then how is it that we can use language to communicate, and very effectively over a wide range of cases? Not only language, of course, but everything that depends on language. Literature obviously – which I’ll take up in the next post – but much else as well.
Speech as a Means of Communication

Willard van Orman Quine has given us a classic thought experiment that points up the problem of word meaning. He broaches the issue by considering the problem of radical translation, “translation of the language of a hitherto untouched people” (Quine 1960, p. 28). He asks us to consider a “linguist who, unaided by an interpreter, is out to penetrate and translate a language hitherto unknown. All the objective data he has to go on are the forces that he sees impinging on the native’s surfaces and the observable behavior, vocal and otherwise, of the native.” That is to say, he has no direct access to what is going on inside the native’s head, but utterances are available to him. Quine then asks us to imagine that “a rabbit scurries by, the native says ‘Gavagai’, and the linguist notes down the sentence ‘Rabbit’ (or ‘Lo, a rabbit’) as tentative translation, subject to testing in further cases” (p. 29). And thus begins one of the best known intellectual romps in the philosophy of language.
Quine goes on to argue that, in thus proposing that initial translation, the linguist is making illegitimate assumptions. Perhaps he begins his argument by noting that the native might, in fact, mean “white” or “animal” and later on offers more exotic possibilities, the sort of things only a philosopher would think of. Quine also notes that whatever gestures and utterances the native offers as the linguist attempts to clarify and verify will be subject to the same problem. Quine’s argument is thorough and convincing.
When he did that work, however, he did not, of course, have access to a range of more recent work in cognitive anthropology and evolutionary psychology that indicated that our adapted minds have a preferred way of parsing the world, as do baboons. To be sure, this is “overwritten” and augmented in culture-specific ways, but those underlying perceptual and cognitive systems do not disappear. To consider a specific example, the work on folk taxonomy (Berlin 1992) suggests that there is a so-called basic level of designation, and that is at the level of “rabbit” and not “animal” (in fact, many languages don’t even have a word at that level of generality). So the linguist is reasonable in assuming “rabbit” is a more likely translation than “animal.” Other considerations are likely to rule out “white” or Quine’s other suggestions. I have no reason to believe that this cognitive architecture so constrains matters that there is only one possible referent for “Gavagai.” But I do think that it is likely to turn out that, all other things being equal, “rabbit” is in fact the best guess.
This situation, of course, is rather different from that of ordinary speech between people who share a common language. In the common situation both parties would know the meaning of “Gavagai.” Yet, however effective it is, ordinary speech sometimes fails to secure understanding between people and, where such understanding is achieved, that achievement has required back-and-forth speech. The mutual understanding is achieved through a process of negotiation. As William Croft reiterates in chapter 4 of Explaining Language Change, we cannot get inside one another’s heads and so must negotiate meanings in conversation.
That is to say, communication through language is not a matter of sending information through a pipeline. It does not happen according to what Michael Reddy (1993) has called the conduit metaphor. Reddy’s article is based on 53 example sentences. Here are the first three (p. 166):

1. Try to get your thoughts across better
2. None of Mary’s feelings came through to me with any clarity
3. You still haven’t given me any idea of what you mean

Reddy’s argument is that many of our statements about communication seem to be based on the notion of sending something (the thought, idea, feeling) through a conduit, hence he calls it the conduit metaphor. He knows that communication doesn’t work that way, but that’s not his central issue. His central concern is to detail the way we use the conduit metaphor to structure our thinking about communication.
Reddy’s argument is reminiscent of a somewhat earlier argument by Paul de Man, “Form and Intent in the American New Criticism” (1983, first published in 1971). Consider this passage (p. 25):

“Intent” is seen, by analogy with a physical model, as a transfer of a psychic or mental content that exists in the mind of the poet to the mind of a reader, somewhat as one would pour wine from a jar into a glass. A certain content has to be transferred elsewhere, and the energy necessary to effect the transfer has to come from an outside source called intention.

De Man’s point was that, when we read a text, the intention (de Man uses the term in its somewhat rarified philosophical sense) that gives life to those signs on the page is our intention, not the author’s. And he is right.
De Man’s insight, and similar ones by Derrida, Barthes, Foucault and others, had an electrifying effect on literary critics in the United States, leading to a tremendously fertile period in academic literary criticism that, however, became increasingly sclerotic in the 1990s. But that story’s neither here nor there. My point is simply that these thinkers were attempting to deal with a real problem and, ultimately, they failed.
What, for example, could Derrida (1976, p. 158) have possibly meant by proclaiming “There is nothing outside of the text”? What he did not mean is that the world is nothing but a text and a text created by more or less arbitrary social conventions. Read sympathetically, and in context, the phrase seems to mean something to the effect that there is no way we can “step outside” language so as to examine, in full omniscient and transcendental objectivity, the relationship between language and the world. And that, it seems to me, is true. We’re always going to be immersed in “language,” whether natural or the various languages of science and mathematics.
How, then, do we fly free of the bottle? We play games.
Language Games and Game Theory

Where de Man argues that intent cannot be transmitted from one speaker to another like pouring wine from a jar, William Croft points out that linguistic communication is tricky “precisely because our thoughts cannot leave our heads” (2000, p. 111). Croft is a linguist who has undertaken to explain language change using an evolutionary approach. He defines a language to be “the population of utterances in a speech community” (p. 26), thus focusing our attention, not on some abstract language system, but on the concrete production of speech.
How does Croft deal with the fact that we cannot transmit thoughts directly to another’s mind? He argues that meaning is negotiated in the back-and-forth of conversation and draws on game theory to make his argument (p. 95):

There is a problem here: the hearer cannot read the speaker’s mind, and she can’t read his. This is what is called a COORDINATION PROBLEM. In speaking and understanding, speaker and hearer are trying to coordinate on the same meaning.

Croft then introduces the notion of a third-party Schelling game in which two players “are presented by a third party with a set of stimuli” which helps them converge on the same meaning. Sometimes it works, sometimes not. One possibility, he argues, is to use “natural perceptual or cognitive distinctiveness [as] a COORDINATION DEVICE” (p. 96). That gives us the adapted mind that I invoked in discussing Quine’s problem. Croft goes on to discuss a variety of linguistic devices as non-conventional coordination devices.
While the details are interesting and important – I recommend his discussion to you – we need not worry about them now.
Save one. Croft notes that, in order for speaker and hearer to reach agreement in conversation, their mental states “need not be identical, though it is assumed that they are systematically related” (p. 99). Later on he notes that (p. 114):

successful communication involves not the recovery of an original, ‘correct’ interpretation of the speaker’s original intention, but instead an interpretation that evolves over the course of the conversation, and is assessed by the success or failure of the higher social-interactional goals that the interlocutors are striving to achieve.

One reason why this effort is not doomed to failure from the beginning is the fact that although we cannot read each other’s minds, we do inhabit a shared world.
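The coordination problem Croft describes, and the help that shared perceptual salience provides, can be made concrete with a little arithmetic. What follows is only a minimal sketch in Python, not Croft's own formalization: the function name `match_probability`, the four candidate meanings, and the salience weights are all hypothetical, chosen purely for illustration. It assumes two players who cannot communicate and who each pick one candidate independently, with probability proportional to how salient that candidate is to them.

```python
def match_probability(weights):
    """Chance that two players, choosing independently with probability
    proportional to `weights`, land on the same candidate."""
    total = sum(weights)
    # Both must pick candidate i: p_i * p_i, summed over all candidates.
    return sum((w / total) ** 2 for w in weights)

# Hypothetical candidate meanings for a single utterance.
candidates = ["rabbit", "white", "animal", "ear"]

# No shared salience: every candidate is equally likely for both players.
p_no_salience = match_probability([1, 1, 1, 1])

# Shared salience (illustrative weights): both players find the
# basic-level category "rabbit" far more prominent than the rest.
p_shared_salience = match_probability([7, 1, 1, 1])

print(f"no shared salience: {p_no_salience:.2f}")
print(f"shared salience:    {p_shared_salience:.2f}")
```

With these made-up weights the chance of converging on the same meaning in a single round roughly doubles, from one in four to about one in two, even though neither player has any access to the other's mind. The sketch covers only one round; the back-and-forth negotiation Croft describes would correspond to repeating the game with feedback.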
Croft’s general point, then, is simple: speech communication is a two-way interaction, not the one-way transmission of meaning, information, whatever, through a channel. De Man’s problem is thus solved for the case of face-to-face interaction, a common case, and surely the most basic one. Note that this solution does not involve recourse to a transcendental signified nor to stepping outside the text, nothing like that. It involves the ordinary and obvious means of interactive speech. In this sense, the key to the treasure is the treasure. Nothing else is required.
But what, you may ask, of written communication, where direct interaction is not possible? After all, de Man was a literary critic, writing about the reading of written texts. What about that?
Good question. I’m going to punt on it. But I observe that some written communication – correspondence – does involve interaction, but at a slower pace than conversation, often much slower. In the case of literary texts, yes, readers cannot ordinarily interact with authors, but they can interact with one another. I’ll say a little about that in the next post. Beyond that, yes, there are issues, serious issues. But this is not the place to address them. My concern here is just to get things started.

Note: Mathematician and psychologist Mark Changizi (1999) has an interesting argument about why vagueness of word meaning is essential to the proper functioning of language. His argument is grounded in considerations of computability and I recommend it to you. It makes an interesting complement to the game-theoretic conception of speaking.

Addition: See subsequent post reporting an experiment that David Hays did at RAND in the mid-1950s. It’s relevant to the game theoretic treatment of conversation.

What is a language and what are the memes?

Now I want to shift gears a bit and work my way back to the physical “side” of the linguistic sign, because that’s where we’re going to go looking for memetic entities.
Throughout this post I’ve been assuming that we know what a language is. Now I want to get picky. Here’s what Sydney Lamb has to say in Pathways of the Brain. He’s talking about Roman Jakobson, the great linguist (p. 41):

Using the term language in a way it is commonly used . . . we could say that he spoke six languages quite fluently: Russian, Czech, German, English, Swedish, and French, and he had varying amounts of skill in a number of others. But each of them except Russian was spoken with a thick accent. It was said of him that “He speaks six languages, all of them in Russian.” . . . the evidence indicates that from a neurocognitive point of view there is no such unit as a language. What exists from a neurocognitive point of view is not so much one linguistic system as a group of interconnected systems, relatively independent from one another.

Lamb goes on to assert that (p. 42):

Professor Jakobson’s internal linguistic information included a single phonological system, that of his native Russian, together with separate systems of grammar and lexicon for Russian, Czech, English, German, French, and Swedish – with some overlap in these grammars and lexicons . . . along with his more limited abilities in various additional languages; plus a conceptual system connected to them all.

So far we’ve been concerned with how meaning is negotiated, where meaning is a matter of the conceptual system. That’s on one “side” of the arbitrary sign, the side inside the brain. Now we’re going to look at the other “side” of the sign, the side that’s in public view, the physical sign. It’s that physical side that most differs among languages.
The question before us is: How do we conceptualize the memetic elements of language? In glossing the emic/etic distinction in a comment to John Wilkins I remarked that (now I’m simply repeating that comment) the distinction originates in linguistics, in the distinction between phonetics and phonemics. The former is about the psychophysics of speech sound while the latter is about phoneme systems. These are obviously very closely related matters, but they aren’t the same. We tend to perceive the speech stream as consisting of discrete sound entities, syllables and phonemes; this is the domain of phonemics. But the speech signal is, in fact, continuous. If you look at a sonogram of some chunk of speech, you can’t draw a series of vertical lines through it separating one phoneme from another; nor can you snip a tape recording into phoneme-long or syllable-long segments and reassemble it into something that sounds like natural speech. The aspects of the speech stream which are phonemically active differ from one language to another, which is why foreign languages all sound like “Greek.” Independently of the fact that you don’t know what the words mean or how the syntax works, you can’t even hear the phonemes in the speech stream.
Now, that’s the distinction I’m after, between phonemes and the raw speech stream. That’s the distinction I drew in my discussion of music (third post). Phonemes are those properties of the speech stream that are linguistically active. We need, however, to distinguish between segmental phonemes and suprasegmental phonemes. The segmental phonemes are roughly parallel to the letters of an alphabetic writing system. Suprasegmentals include tone, stress, and prosodic patterns. And then we need to consider ordering as well, as the order in which elements occur is certainly a property of the speech stream, and a most important one.
Before thinking about order, though, we need to think a bit more about what’s going on. Roughly speaking, two things need to be extracted from the speech signal: 1) word identities (to be somehow linked to word meanings), and 2) the relations between the words (syntax). My quick take on matters – I’m not a linguist and I’ve not thought this through – is that both segmental and suprasegmental phonemes are involved in both of those processes. Relations between words are often indicated by word affixes, which are realized through segmental phonemes. Word identities are certainly realized by segmental phonemes, but tone and accent are involved as well.
Beyond this, relations between words are signaled by word order. In linguistic typology, typical word order is the primary trait on which classification is based. Thus one has SVO languages (subject-verb-object), VSO languages (verb-subject-object), and so forth. As those designations suggest, word order indicates grammatical function, that is, relations between words.
Thus between word order and phonemes we’ve got a rich set of memetic elements. And we could consider morphology here as well. Taken together these aspects of the speech signal seem to be as memetically rich and abstract as the musical properties we looked at in discussing Rhythm Changes (first post).
* * * * *
We’ve got one last matter to attend to in this section: how does this discussion square with Croft’s treatment? After all, I’ve pretty much adopted his view of meaning as being negotiated in speech; what about his view on memetic issues? That’s tricky. In some ways our views are quite different. Different terms, different definitions. But we agree that selection acts on the physical material of spoken language, on utterances. That’s enough to get this conversation started.
In what follows I offer a few notes about the tricky differences in our terms and definitions. Croft has been influenced by Richard Dawkins and David Hull and so thinks of memetic elements as things that replicate rather than as properties that allow a common apprehension of the speech signal.
The emic/etic distinction seems to play little or no role in his thinking about how to characterize language as an evolving phenomenon, while it is central to mine. And he thinks of the phenotypic element as an interactor in Hull-Dawkins terminology and identifies that with the speaker, which is really quite different from my identification of the phenotypic element with the physical speech stream itself. And I do that – in parallel with my treatment of music – because that’s what’s acted upon in selection. But on that particular issue, Croft takes the same view that I have. He defines a language as (p. 26) “the population of utterances in a speech community.” He regards the utterance as the linguistic equivalent of DNA, which is fine, and coins the term “lingueme” to designate the memetic elements of language. Utterances consist of linguemes.

In Conversation

By way of starting to bring this post to a close, let me paraphrase and recast a passage from my essay-review of Steven Mithen’s The Singing Neanderthals (Benzon 2005).
In these discussions I have been assuming that the nervous system operates as a self-organizing dynamical system as, for example, Walter Freeman (e.g. 1995) has argued. Using Freeman’s work as a starting point, I’ve argued that, when individuals are making music with one another, their nervous systems are physically coupled with one another for the duration of that musicking (Benzon 2001, pp. 47 ff.). There is no need for any symbolic processing to interpret what one hears or so that one can generate a response that is tightly entrained to the actions of one’s fellows.
My earlier arguments were developed using the concept of coupled oscillators, which has been applied to the phenomenon of synchronized blinking by fireflies (Strogatz and Stewart 1993). Such tightly synchronized activity, I argued, is a critical defining characteristic of human musicking. What musicking does is bring all participants into a temporal framework where the physical actions – whether dance or vocalization – of different individuals are synchronized on the same time scale as that of neural impulses, that of milliseconds. Within that shared intentional framework the group can develop and refine its culture. Everyone cooperates to create sounds and movements they hold in common.
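For readers who have not met coupled oscillators before, here is a minimal sketch of the basic phenomenon in Python. It uses a two-oscillator phase model of the Kuramoto type, which is an illustrative stand-in and not a model taken from Strogatz and Stewart's article or from Beethoven's Anvil; the function name `final_phase_gap`, the natural frequencies, and the coupling strength are all assumed values. Each oscillator has its own preferred tempo, and the coupling term nudges each one toward the phase of the other.

```python
import math

def final_phase_gap(omega_a, omega_b, coupling, gap0=2.0, dt=0.001, steps=20000):
    """Euler-integrate two phase oscillators and return the final
    (unwrapped) phase gap theta_b - theta_a, in radians."""
    theta_a, theta_b = 0.0, gap0
    for _ in range(steps):
        # Each oscillator is pulled toward the other's current phase.
        pull = coupling * math.sin(theta_b - theta_a)
        theta_a += dt * (omega_a + pull)
        theta_b += dt * (omega_b - pull)
    return theta_b - theta_a

# Two oscillators with slightly different natural tempos (assumed values).
uncoupled_gap = final_phase_gap(1.0, 1.3, coupling=0.0)  # gap keeps drifting
coupled_gap = final_phase_gap(1.0, 1.3, coupling=0.5)    # gap settles down

# For this model, locking is possible when the tempo difference is at most
# twice the coupling; the locked gap then satisfies sin(gap) = diff / (2 * coupling).
locked_gap = math.asin((1.3 - 1.0) / (2 * 0.5))

print(f"uncoupled gap: {uncoupled_gap:.3f} rad")
print(f"coupled gap:   {coupled_gap:.3f} rad (predicted {locked_gap:.3f})")
```

Without coupling the two oscillators drift steadily apart. With coupling they settle into a fixed phase relationship and thereafter run at a common tempo, even though neither has changed its own preferred tempo. That is the sense of synchronization intended here, in miniature.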
There is no reason whatever to believe that one day fireflies will develop language. But, obviously, human beings have already done so. I believe that, given the way nervous systems operate, musicking is a necessary precursor to the development of language. And there is evidence that talking individuals must be within the same intentional framework. Consider an observation that Mithen offers early in his book (p. 17). He cites work by Peter Auer who, along with his colleagues, has analyzed the temporal structure of conversation. They discovered that, when a conversation starts, the first speaker establishes a rhythm to which the other speakers time their turn-taking. That is, even though they are only listening, other parties are actively attuned to the rhythm of the speaker’s utterance (cf. Condon 1986). What if this were necessary to conversation, and not just an incidental feature of it?
Thus we have the memetic features of the speech stream coupling speaker and hearer together into a single dynamical system. Assume, for the sake of argument, that we have a two-party conversation. When the parties enter into their conversation they each give up many degrees of behavioral freedom and agree to cooperate in arriving at a mutual understanding. Each party is internally partitioned so that meaning “happens” on one side of the partition, but not the other. Let’s call the meaning side the meaning system and the other side the signifying system. It is the two signifying systems that are physically synchronized with one another. And it is the meaning systems that are playing a cooperative game with one another. They play the game by manipulating their respective signifying systems so as to send signals to one another.
A most peculiar activity.
Next Two Posts

The next post will discuss literature, conceptualizing literary texts as performances in group-wide coordination games. The final post will attempt to wrap things up by discussing Nina Paley’s film, Sita Sings the Blues.

A Note About John von Neumann

It has long been obvious to me that the cognitive sciences are what happened when computation and the computer hit the behavioral sciences as a source of models and metaphors. That means that the cognitive sciences owe a debt to John von Neumann, the Hungarian mathematician who is widely credited with coming up with the scheme for realizing digital computation in electronic circuitry. Though the term “von Neumann machine” is widely applied to machines based on serial computation, it is worth noting that von Neumann also invented the concept of a cellular automaton, which is a scheme for parallel computation.
More to our point in this post, von Neumann is also one of the founders of game theory. Thus he stands behind some of the most interesting work that’s been done in behavioral biology over the past two or three decades. Though he’s not often mentioned in discussions of evolutionary psychology – at least not in the discussions I’ve read – his formative role in game theory has him standing behind evolutionary psychology as he stands behind the cognitive sciences.
Both of them are disciplines ultimately grounded in computation. Thus, if humanists are to fully benefit from the newer psychologies, we must come to terms with computation in one way or another. We need not become expert in either the theory or the practice of computation, but we must become sufficiently comfortable so that we can fruitfully collaborate with experts.
References

Benzon, William (2001). Beethoven’s Anvil. Basic Books.
Benzon, William (2005). Synch, Song, and Society. Human Nature Review 5, 2005, pp. 66-85. http://www.human-nature.com/nibbs/05/wlbenzon.html
Berlin, Brent (1992). Ethnobiological Classification: Principles of Categorization of Plants and Animals in Traditional Societies. Princeton, Princeton University Press.
Changizi, Mark A. (1999) Vagueness, rationality and undecidability: A theory of why there is vagueness. Synthese 120: 345-374.
Condon, W. S. (1986). Communication: Rhythm and Structure. Rhythm in Psychological, Linguistic and Musical Processes. eds. J. R. Evans and M. Clynes. Springfield, Illinois, Charles C Thomas: 55-78.
Croft, William (2000). Explaining Language Change: An Evolutionary Approach. Longman.
De Man, Paul (1983). Form and Intent in the American New Criticism. Blindness and Insight. University of Minnesota Press, 20-35.
Freeman, W. J. (1995). Societies of Brains: A Study in the Neuroscience of Love and Hate. Hillsdale, NJ, Lawrence Erlbaum.
Lamb, Sydney (1999). Pathways of the Brain: The Neurocognitive Basis of Language. John Benjamins Publishing Company.
Quine, Willard van Orman (1960). Word and Object. MIT Press.
Reddy, Michael J. (1993). The conduit metaphor – a case of frame conflict in our language about language. Metaphor and Thought (2nd edn), ed. Andrew Ortony, 164-201. Cambridge University Press.
Strogatz, S. H. and I. Stewart (1993). "Coupled Oscillators and Biological Synchronization." Scientific American (December): 102-109.
Previous Posts in this Series
Cultural Evolution 1: How “Thick” is Culture?
Cultural Evolution 2: A Phenomenological Gut Check on Gene-Culture Coevolution
Cultural Evolution 3: Performances and Memes
Cultural Evolution 4: Rhythm Changes 1
Cultural Evolution 5: Rhythm Changes 2
Cultural Evolution 6: The Problem of Design
Cultural Evolution 7: Where Are We At?
See also The Sound of Many Hands Clapping: Group Intentionality, which considers a very simple case of group behavior and thus is relevant to the issue of culture's collective nature. Consider this post an elaboration of my discussion of music in CE3: Performances and Memes.
