### Topic: Man vs. Beast

#### Daniel

• Experienced Linguist
• Posts: 1970
• English
##### Re: Man vs. Beast
« Reply #45 on: July 07, 2015, 09:51:52 AM »
Quote
I also don't see that it is an open question as to whether language is associated with cognition or communication -- I think it is uncontroversial that it is associated with both. Perhaps you mean something more specific than "associated with".
What I mean is that others would completely disagree with you. It's a controversial topic, more than an open question: most people seem to have an answer, but that answer differs, with each perspective having a very strong view.

#### panini

• Linguist
• Posts: 175
##### Re: Man vs. Beast
« Reply #46 on: July 07, 2015, 09:36:34 PM »
Quote
I also don't see that it is an open question as to whether language is associated with cognition or communication -- I think it is uncontroversial that it is associated with both. Perhaps you mean something more specific than "associated with".
What I mean is that others would completely disagree with you. It's a controversial topic, more than an open question: most people seem to have an answer, but that answer differs, with each perspective having a very strong view.
I still don't understand the nature of the controversy or disagreement that you are pointing to. I am utterly unaware of there being any controversy over the proposition that communication is a specific instance of the concept cognition. What exactly is it that you think is controversial or not obviously true about my position, which is, specifically, that language is fundamentally a tool for cognition, and that communication is a sub-case of cognition?

#### Daniel

• Experienced Linguist
• Posts: 1970
• English
##### Re: Man vs. Beast
« Reply #47 on: July 07, 2015, 10:32:21 PM »
It seems to me that a lot of people link language directly to communication, while others (perhaps more) make the argument you're making. That's all.

#### Guijarro

• Forum Regulars
• Linguist
• Posts: 97
• Spanish
##### Re: Man vs. Beast
« Reply #48 on: July 08, 2015, 03:49:57 AM »
“It is often pointed out that, thanks to their grammars and huge lexicons, human languages are incomparably richer codes than the small repertoire of signals used in animal communication.

Another striking difference –but one that is hardly ever mentioned– is that human languages are quite defective when regarded simply as codes. In an optimal code, every signal must be paired with a unique message, so that the receiver of the signal can unambiguously recover the initial message. Typically, animal codes (and also artificial codes) contain no ambiguity. Linguistic sentences, on the other hand, are full of semantic ambiguities and referential indeterminacies, and do not encode at all many other aspects of the meaning they are used to convey. This does not mean that human languages are inadequate for their function. Instead, what it strongly suggests is that the function of language is not to encode the speaker’s meaning, or, in other terms, that the code model of linguistic communication is wrong. (pg.332).

[…]

The human mind is characterized by two cognitive abilities with no real equivalent in other species on Earth: language and naive psychology (that is, the faculty to represent the mental state of others). […] It is because of the interaction between these two abilities that human communication was able to develop and acquire its incomparable power. From a pragmatic perspective, it is quite clear that the language faculty and human languages, with their richness and flaws, are only adaptive in a species that is already capable of naive psychology and inferential communication. The relatively rapid evolution of languages themselves, and their relative heterogeneity within one and the same linguistic community –we see these two features as linked– can only be adequately explained if the function of language in communication is only to provide evidence of the speaker’s meaning, and not to encode it.

In these conditions, research on the evolution of language faculty must be closely linked to research on the evolution of naive psychology”. (pg.338).

(Sperber, Dan & Gloria Origgi (2012): “A pragmatic perspective on the evolution of language” in Wilson, Deirdre & Dan Sperber (2012): Meaning and Relevance, Cambridge, C.U.P. )

« Last Edit: July 09, 2015, 01:49:02 AM by Guijarro »

#### Daniel

• Experienced Linguist
• Posts: 1970
• English
##### Re: Man vs. Beast
« Reply #49 on: July 08, 2015, 06:20:58 AM »
I've always found that argument (regarding imperfections in human communication) to be very confusing, for two reasons:

1. It may just be inherent that, given the complexity of our message and use in the real world (rather than, say, transferring a file by email with machines that are inherently accurate for every bit of information-- literally), such imperfections are just natural. Nothing "human" about it, just part of having a complex code.

2. It is, I think, a distraction from the real issue: it confuses interpretation and information, in that no code is fully unambiguous regarding interpretation, and no code is fully explicit. For example, the data stream for an audio file of the human voice has exactly the same ambiguities and underspecifications as that human voice, plus more when the channel is noisy and certain sounds are less clear.

So we must distinguish between the code itself (the combination of sounds->words being transmitted from one individual to another) and their purpose in communication (where these imperfections really arise).

There are of course some legitimate cases of ambiguity in the code itself. But this is just an effect of a natural evolution of the code by trial and error, where some messages happen to look alike. They are also rarely truly ambiguous in context and with intonation. Furthermore, this ambiguity is an effect of linearization of the signal more than anything.

The reason it (that is, ambiguity) doesn't come up for animals is that their codes aren't that complicated (at least as far as we understand them), but remember that they are certainly underspecified quite often ("danger" -- of what?).

#### Guijarro

• Forum Regulars
• Linguist
• Posts: 97
• Spanish
##### Re: Man vs. Beast
« Reply #50 on: July 08, 2015, 09:26:17 AM »
Sperber & Origgi do not talk about imperfect communication, as far as I understand their text. On the contrary, they think that the underdetermination of the linguistic code when trying to represent the speaker's meaning favours human freedom to use other means to arrive at the speaker's intention in a richer and more accurate communicative process. What they claim is that linguistically coded material couldn't have evolved and developed at such a pace if coding/decoding were the only process involved in human communication (in their paper they give an extensive example of how this could work in reality to account for the constant changes of human linguistic codes --which, understandably, do not show up in other species' codes). Our languages are so complex and rich in comparison with other animal codes precisely because changing elements of the code does not hinder the achievement of accurate communication. Animal codes are so unchangeable (and unable to expand) because they are the only means some species have to communicate, and a failure to match signal and message is counterproductive and may hinder communication.

ASIDE: They don't mention it, but I believe that we are able to understand some of the messages sent by our pets for we use our naive psychology to arrive at some kind of "mind reading" and, therefore, become aware of some of their intentions. The more I hear people talking about their pets (for instance, Malfet's son with his cat), the more I think this process is taking place in their relationships.

But I may be wrong, of course.
« Last Edit: July 08, 2015, 12:29:00 PM by Guijarro »

#### Copernicus

• Linguist
• Posts: 61
##### Re: Man vs. Beast
« Reply #51 on: July 14, 2015, 06:31:31 PM »
I'm not at all sure why one would conclude that animal communication is unambiguous or less ambiguous than human language.  If language evolved to facilitate replication of a train of thought--an understanding of the intentions of other animals--then it may well be that the more limited modes of expression available to non-human animals are open to many more interpretations.  The human "call system"--cries, laughter, screams, etc.--is certainly open to lots of different interpretations.  For example, crying can indicate sorrow, but also relief.  Human language gives us the advantage that it can be far more precise about thought content in different contexts than a limited set of calls, postures, and expressions can.

#### jkpate

• Forum Regulars
• Linguist
• Posts: 130
• American English
##### Re: Man vs. Beast
« Reply #52 on: July 16, 2015, 11:24:54 AM »
Quote from: Guijarro
“It is often pointed out that, thanks to their grammars and huge lexicons, human languages are incomparably richer codes than the small repertoire of signals used in animal communication.

Another striking difference –but one that is hardly ever mentioned– is that human languages are quite defective when regarded simply as codes. In an optimal code, every signal must be paired with a unique message, so that the receiver of the signal can unambiguously recover the initial message. Typically, animal codes (and also artificial codes) contain no ambiguity. Linguistic sentences, on the other hand, are full of semantic ambiguities and referential indeterminacies, and do not encode at all many other aspects of the meaning they are used to convey. This does not mean that human languages are inadequate for their function. Instead, what it strongly suggests is that the function of language is not to encode the speaker’s meaning, or, in other terms, that the code model of linguistic communication is wrong. (pg.332).

It depends on what you mean by "optimal" here. One definition could be no errors, but another could be an acceptable error rate. Formally, we can understand the error rate by considering the conditional entropy of the message $M$ given the signal $S$, denoted $H(M | S)$. If the signal is completely unambiguous, $H(M | S) = 0$ and there is no risk of an error. If there are two equally likely messages for $S$, then $H(M | S) = 1$ bit and there is risk of an error. If there are two possible messages but one is much more likely, then $H(M | S) < 1$ bit. The Noisy Channel Theorem establishes a bound on codes, using this conditional entropy, that have a pre-specified error rate greater than 0 as well as for codes that have an error rate arbitrarily close to zero. A non-zero error rate might well be tolerable if it is easy to recover from the errors (indeed, this is how lossy compression algorithms, such as JPEG and MP3, manage to provide such exceptional compression).
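To make the three cases above concrete, here is a minimal Python sketch. The toy distributions and the helper name `conditional_entropy` are made up for illustration and are not from any library; the only thing taken from the post is the quantity $H(M | S)$ itself.

```python
import math
from collections import defaultdict


def conditional_entropy(joint):
    """Return H(M | S) in bits.

    `joint` maps (message, signal) pairs to their joint probability.
    """
    # Marginal probability of each signal.
    p_signal = defaultdict(float)
    for (_, signal), p in joint.items():
        p_signal[signal] += p

    # H(M | S) = -sum over (m, s) of p(m, s) * log2 p(m | s)
    h = 0.0
    for (_, signal), p in joint.items():
        if p > 0:
            h -= p * math.log2(p / p_signal[signal])
    return h


# Hypothetical toy cases (all numbers are illustrative assumptions).
unambiguous = {("m1", "s1"): 0.5, ("m2", "s2"): 0.5}  # each signal has one message
coin_flip = {("m1", "s1"): 0.5, ("m2", "s1"): 0.5}    # two equally likely messages
skewed = {("m1", "s1"): 0.9, ("m2", "s1"): 0.1}       # one message much more likely

h_unambiguous = conditional_entropy(unambiguous)  # 0 bits
h_coin_flip = conditional_entropy(coin_flip)      # 1 bit
h_skewed = conditional_entropy(skewed)            # roughly 0.47 bits

print(h_unambiguous, h_coin_flip, round(h_skewed, 3))
```

The skewed case sits strictly between the other two, which is the sense in which a code can be ambiguous yet still have a low expected error rate.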

Moreover, many potential ambiguities are ruled out by the real world context -- let's call it $C$. By a general property of entropy, the conditional entropy of the message given the signal and the context is less than or equal to the entropy of the message given the signal alone: $H( M | S, C ) \leq H( M | S )$, with equality iff the message and the context are conditionally independent given the signal. Presumably for natural language they are not: the context tells the listener something about the message beyond what the utterance alone provides (i.e. $P(M | S, C) \neq P(M | S)$). So, for natural language, an information-theoretic approach entails that isolated utterances are more ambiguous than situated utterances.
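A small numerical check of the inequality $H( M | S, C ) \leq H( M | S )$, as a sketch only. The "bank" example, the probabilities, and the helper `cond_entropy` are all assumptions invented for illustration. One context variable is informative about the intended sense, the other (the weather) is deliberately chosen to tell the listener nothing about it.

```python
import math
from collections import defaultdict


def cond_entropy(joint, target, given):
    """Return H(target | given) in bits.

    `joint` maps outcome tuples to probabilities; `target` and `given`
    are tuples of positions inside each outcome tuple.
    """
    p_both = defaultdict(float)
    p_given = defaultdict(float)
    for outcome, p in joint.items():
        g = tuple(outcome[i] for i in given)
        t = tuple(outcome[i] for i in target)
        p_both[(t, g)] += p
        p_given[g] += p
    return -sum(
        p * math.log2(p / p_given[g]) for (_, g), p in p_both.items() if p > 0
    )


# Outcomes are (message, signal, context). The signal "bank" is ambiguous.
informative = {
    ("river", "bank", "fishing"): 0.45,
    ("finance", "bank", "fishing"): 0.05,
    ("river", "bank", "shopping"): 0.05,
    ("finance", "bank", "shopping"): 0.45,
}
# A context that says nothing about the message once the signal is known.
irrelevant = {
    ("river", "bank", "sunny"): 0.25,
    ("finance", "bank", "sunny"): 0.25,
    ("river", "bank", "rainy"): 0.25,
    ("finance", "bank", "rainy"): 0.25,
}

h_signal_only = cond_entropy(informative, target=(0,), given=(1,))      # 1 bit
h_with_context = cond_entropy(informative, target=(0,), given=(1, 2))   # about 0.47 bits
h_irrelevant = cond_entropy(irrelevant, target=(0,), given=(1, 2))      # still 1 bit

print(round(h_signal_only, 3), round(h_with_context, 3), round(h_irrelevant, 3))
```

In this toy setup the informative context cuts the ambiguity by more than half, while the irrelevant one leaves it untouched, which is the equality case.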

This is all just to say that the concerns you raise do not challenge the "language as code" view, and to show how an information-theoretic approach provides a natural treatment.
« Last Edit: July 16, 2015, 01:04:08 PM by jkpate »
All models are wrong, but some are useful - George E P Box

#### Guijarro

• Forum Regulars
• Linguist
• Posts: 97
• Spanish
##### Re: Man vs. Beast
« Reply #53 on: July 19, 2015, 09:46:37 AM »
Your argument seems impressive --at least for a fellow like me who becomes dizzy when he sees formulae in a text. As I am a perfect nullity in information theory, I could not make heads or tails of your last posting. But I am also a member of another forum concentrating on Relevance Theory, where I wrote asking for help to decipher (and, if possible, respond to) your text.

Here is what I got:

If I understand it correctly, your interlocutor might have a point in saying that, since (if only because of channel noise) there is no such thing in reality as an absolutely unambiguous signal, the optimality of a code is a gradual (sc. statistical) notion in code theory; from their point of view, this would make an "ideal" code a straw man.

Where this could be countered, I think, is the introduction of "the real world context" C as a factor in disambiguation. As shown by Sperber & Wilson, the context is not given, but "chosen". And this choice is not a matter of decoding (even in a more sophisticated sense of decoding); moreover, in metaphor, irony or ad hoc concepts, this choice may even make us override linguistic code rather than disambiguate it.

In fact, there has been some work in information science, esp. with respect to information retrieval systems, on the problem of "relevance" and the need to include context and users' knowledge and preferences in the definition of relevance. So possibly an overly simple code model could even be challenged on its home ground?
(Jan Straßheim)

#### jkpate

• Forum Regulars
• Linguist
• Posts: 130
• American English
##### Re: Man vs. Beast
« Reply #54 on: July 27, 2015, 11:49:46 PM »
Could you clarify how the Sperber and Wilson notion of context selection enters into the argument? My understanding of context selection in Relevance Theory is that listeners select features from the current environment and their background knowledge, but the set of features from which they select is fixed (when interpreting a given utterance). The conditional entropy $H(M|S,C)$ similarly will be sensitive to only those features $c_i$ of $C$ that are not statistically independent of the message given the utterance. For example, if a listener's background knowledge can be represented as a (potentially infinite) causal graph, only those features of the context that are not d-separated from the meaning by the features of the current situation will change $H(M|S,C)$.

Under this view, I understand this context selection business as saying that finding all non-d-separated nodes is computationally intractable, and speakers employ a variety of accessibility heuristics to find most of the most important ones. Is that inconsistent with your understanding?

« Last Edit: July 27, 2015, 11:53:08 PM by jkpate »

#### Copernicus

• Linguist
• Posts: 61
##### Re: Man vs. Beast
« Reply #55 on: July 28, 2015, 12:35:15 PM »
I can't speak for Guijarro, but I can try to express how I understood his point.  It might help to consider Michael Reddy's conduit metaphor of language, which he developed in the 1970s.  Basically, the idea is that language is thought of as a "pipe" through which information flows.  It is packaged or encoded at one end and decoded at the other.  This metaphor has dominated thinking about language for a very long time, and it is very powerful.  Formal languages tend to conform to it.  So to extract information from a linguistic signal, all one has to do is simply decode the content contained in it.  Information theory is essentially about signal processing, not natural human language processing.  The conduit metaphor is misleading for human language, but not formal symbolic systems.

An alternative metaphor--one that Charles Fillmore once expressed to me--is that language is "word-guided mental telepathy".  That is, it is thought replication that uses linguistic expressions as keys to unlock associative clumps of information.  So, if you look at his FrameNet approach to semantics, the "frames" represent clumps of conceptual associations that one can then map words onto.  So the semantic structure of a sentence is not actually structured like the linguistic signal, but the linguistic signal evokes it through association with frames.  Frames then structurally represent the information that the speaker is trying to communicate.  So I suppose that that is one way of looking at what you call "accessibility heuristics".  However, there could be non-linguistic information that also contributes to the slot-filling activity of assigning roles to entities.  Formal languages are always literal and compositional in usage, whereas natural language can have layers of non-literal and conventional significance.  The natural linguistic signal is always going to be a defective component of a discourse context.
« Last Edit: July 28, 2015, 12:37:36 PM by Copernicus »

#### jkpate

• Forum Regulars
• Linguist
• Posts: 130
• American English
##### Re: Man vs. Beast
« Reply #56 on: July 30, 2015, 02:16:34 PM »
Hmm, I still don't see how these points argue against the code view. Formal languages may be literal and compositional, but information-theoretically optimal codes often aren't. Arithmetic coding, for example, approaches information-theoretic limits and is not compositional.
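A toy illustration of the arithmetic coding point, as a sketch under stated assumptions. The two-symbol source `PROBS`, the function names, and the use of exact fractions are all choices made for this example; real arithmetic coders use finite precision and an explicit termination scheme, and here the decoder is assumed to know the message length. The code word for a whole message is found from the message's probability interval, so it is not the concatenation of the code words of its parts.

```python
import math
from fractions import Fraction

# Hypothetical skewed source: "a" is far more likely than "b".
PROBS = {"a": Fraction(9, 10), "b": Fraction(1, 10)}


def interval(message):
    """Narrow [0, 1) to the sub-interval [low, low + width) for `message`."""
    low, width = Fraction(0), Fraction(1)
    for symbol in message:
        cumulative = Fraction(0)
        for candidate, p in PROBS.items():
            if candidate == symbol:
                low += width * cumulative
                width *= p
                break
            cumulative += p
    return low, width


def encode(message):
    """Return the shortest bit string whose dyadic interval fits inside the
    message's interval."""
    low, width = interval(message)
    k = 0
    while True:
        j = math.ceil(low * 2**k)
        if Fraction(j + 1, 2**k) <= low + width:
            return format(j, "b").zfill(k) if k else ""
        k += 1


code_a = encode("a")
code_aa = encode("aa")
long_message = "a" * 20
code_long = encode(long_message)

# Ideal length for the whole message, in bits.
ideal_bits = -math.log2(interval(long_message)[1])

print(repr(code_a), repr(code_aa), len(code_long), round(ideal_bits, 2))
```

Any code that assigns a separate code word to each symbol needs at least one bit per symbol, so 20 bits for the 20-symbol message, whereas the interval-based code stays within about a bit or two of the ideal length.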

I think Fillmore's metaphor is fully compatible with an information-theoretic approach. Listeners don't just have a model for relating semantic representations to strings; they also have models for relating real world situations to semantics, models of likely real-world situations, models of other talkers, etc. Probability theory provides a natural and mathematically well-grounded language for expressing and testing potential models, and information theory is so profoundly tied up in probability theory that probabilistically sensible behavior is bound to be also information-theoretically sensible.

Maybe probability theory will end up being inadequate, but opponents of the code view are going to need to make much more specific criticisms to convince me.

---

Incidentally, the compositionality of natural language is actually an argument against the view that natural language achieves an information-theoretic optimum. Briefly, if language is compositional, then the length of the signal $l_{\mathbf m}$ for a message $\mathbf m$ is equal to the sum of the lengths of each component $m_i$ of the message:

$l_{\mathbf m} = \sum_{m_i \in \mathbf m} l_{m_i}$

It turns out that this summation over lengths corresponds to an assumption that the message components are statistically independent from each other (more discussion and derivation at the above link to my blog). In natural language, of course, what we usually consider to be components (words and constructions) are not statistically independent -- a sentence that mentions scrambled eggs is more likely to mention orange juice or coffee. Non-compositional language phenomena provide an opportunity to provide much shorter signals than would be possible in an exclusively compositional approach (think of the difference in length between "United States of America" and "USA"). So, the question then is whether talkers choose non-compositional forms in a way that moves language closer to the information-theoretic optimum, and there's some evidence that they do (e.g. Frank and Jaeger, 2008).
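The cost of the independence assumption can be checked on a toy case. This is only a sketch: the two binary components (whether a sentence mentions scrambled eggs, whether it mentions coffee), the probabilities, and the helper names are assumptions invented for illustration. It compares the expected signal length when each component gets its own ideal code length, $-\log_2 P(m_i)$, summed as in the formula above, against the expected length when the whole message is coded at once, $-\log_2 P(\mathbf m)$.

```python
import math
from collections import defaultdict

# Hypothetical joint distribution over (mentions_eggs, mentions_coffee).
# The two components are positively correlated, not independent.
joint = {
    (True, True): 0.4,
    (True, False): 0.1,
    (False, True): 0.1,
    (False, False): 0.4,
}


def marginal(joint, index):
    """Marginal distribution of the component at position `index`."""
    result = defaultdict(float)
    for outcome, p in joint.items():
        result[outcome[index]] += p
    return result


def expected_length(joint, length_of):
    """Expected code length in bits under a per-message length function."""
    return sum(p * length_of(outcome) for outcome, p in joint.items())


p_eggs = marginal(joint, 0)
p_coffee = marginal(joint, 1)

# Compositional: signal length is the sum of per-component lengths.
compositional = expected_length(
    joint, lambda m: -math.log2(p_eggs[m[0]]) - math.log2(p_coffee[m[1]])
)
# Holistic: one code length per whole message.
holistic = expected_length(joint, lambda m: -math.log2(joint[m]))

print(round(compositional, 3), round(holistic, 3), round(compositional - holistic, 3))
```

With these numbers the compositional scheme costs 2 bits per message on average and the holistic one about 1.72; the gap is the mutual information between the two components, and it vanishes exactly when they are independent.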
« Last Edit: July 30, 2015, 02:25:38 PM by jkpate »

#### Copernicus

• Linguist
• Posts: 61
##### Re: Man vs. Beast
« Reply #57 on: July 30, 2015, 08:42:28 PM »
Quote from: jkpate
Hmm, I still don't see how these points argue against the code view. Formal languages may be literal and compositional, but information-theoretically optimal codes often aren't. Arithmetic coding, for example, approaches information-theoretic limits and is not compositional.
IMO, the problem is that you still think of the meaning of an expression as somehow fully encoded in the signal.  The point I was trying to make is that it isn't.  The signal is semantically defective, but it contains information that enables the receiver to assemble the meaning, given assumptions made by the sender.  That is, linguistic meanings are essentially inferred from components of the signal, not encoded in it.  My criticism of the information-theoretic approach is that it fully buys into the "conduit metaphor" view of language.  That metaphor seems to hold up at the sentence level, but it ignores the fact that sentences only convey meaning in an assumed context.  However, once you start looking at the level of discourse processing, it breaks down rather quickly.

Quote from: jkpate
I think Fillmore's metaphor is fully compatible with an information-theoretic approach. Listeners don't just have a model for relating semantic representations to strings; they also have models for relating real world situations to semantics, models of likely real-world situations, models of other talkers, etc. Probability theory provides a natural and mathematically well-grounded language for expressing and testing potential models, and information theory is so profoundly tied up in probability theory that probabilistically sensible behavior is bound to be also information-theoretically sensible.
I think you are basically agreeing with me that the meaning of linguistic expressions requires context in order for a listener to discover it, but the information-theoretic approach is basically about signal processing, not meaning comprehension.  To understand an expression is essentially to integrate it with one's experiences--what you might call a "world model".  I'm not saying that the problem is computationally impossible, but that it involves a lot more than mere signal processing.  Probabilistic approaches work very well on a gross level for disambiguating word senses primarily because mutual information is an extremely powerful concept.  I think that they have proven their worth in applications such as text mining large amounts of data.

Quote from: jkpate
Maybe probability theory will end up being inadequate, but opponents of the code view are going to need to make much more specific criticisms to convince me.
How does probability theory actually help you generate linguistic structure?  There are two sides to language--production and comprehension.  The best that probability theory can do for you is provide you with a cloud of more or less related words.  How do you assemble those words into structured phrases that can be understood in a given context?   Where does probability help you to decide the quantifier scope?  You can extract meaning from clouds of words in a document, but constructing the document requires knowledge about how to structure the information for a discourse context.  Probabilistic approaches help with certain types of linguistic processing, but they are a dead end when it comes to real text understanding.  In fact, I don't think any approach that relies just on signal processing is scalable.  However, the conduit metaphor is well-ensconced in our thinking about language, so signal processing approaches sound very promising at first blush.

Quote from: jkpate
Incidentally, the compositionality of natural language is actually an argument against the view that natural language achieves an information-theoretic optimum. Briefly, if language is compositional, then the length of the signal $l_{\mathbf m}$ for a message $\mathbf m$ is equal to the sum of the lengths of each component $m_i$ of the message:

$l_{\mathbf m} = \sum_{m_i \in \mathbf m} l_{m_i}$

It turns out that this summation over lengths corresponds to an assumption that the message components are statistically independent from each other (more discussion and derivation at the above link to my blog). In natural language, of course, what we usually consider to be components (words and constructions) are not statistically independent -- a sentence that mentions scrambled eggs is more likely to mention orange juice or coffee. Non-compositional language phenomena provide an opportunity to provide much shorter signals than would be possible in an exclusively compositional approach (think of the difference in length between "United States of America" and "USA"). So, the question then is whether talkers choose non-compositional forms in a way that moves language closer to the information-theoretic optimum, and there's some evidence that they do (e.g. Frank and Jaeger, 2008).
If your assumption is that all of the information necessary to decode the signal is in the signal "pipeline", then you are right about that length metric.  I do not believe that that assumption is correct.  The expression "scrambled eggs" represents a very complex web of associations.  The trick is to get the right set of associations in the given discourse context.  If you base your notion of "discourse context" on just the literal meanings of the word cloud, you are going to miss critical information that relates to an analogical mapping.  That is, you aren't going to be able to handle a fundamental aspect of human language--metaphor.

#### Daniel

• Experienced Linguist
• Posts: 1970
• English
##### Re: Man vs. Beast
« Reply #58 on: July 30, 2015, 11:44:21 PM »
It seems to me that the idea of "information" in the signal is at least trivially true: something is transmitted and we can quantify that in terms of information. I believe jkpate's point is that we can theoretically optimize over that information at various levels including context. There are some questions of what the best way is, but I don't see a problem with the basic point.