Author Topic: What are linguistic intuitions of well-formedness? Why do we have them?  (Read 2380 times)

Offline Copernicus

  • Linguist
  • ***
  • Posts: 60
  • Country: us
    • Natural Phonology
For a generative linguist, intuitions of well-formedness are essentially what linguistic theory is about.  The mental construct that generativists call the "grammar" generates patterns that inform intuition.  Although I am a generative-trained linguist, having come to it in its heyday, the 1960s, I no longer have the faith in it that I once did.  But, if grammaticality intuitions are not grounded in a specialized cognitive mechanism, then where do they come from?  For me, a theory of psycholinguistics needs first and foremost to explain linguistic behavior.  It is clear that we strive to be grammatical when we produce language, so there has to be some connection between behavioral strategies and linguistic intuitions.  Hence, psycholinguistics needs to at least contain a theory of linguistic well-formedness.  Just because I reject the foundation of the generative paradigm, that doesn't let me off the hook of explaining how well-formedness intuitions fit in with my functionalist approach to language.

I am very much interested in the history of linguistics--especially the 19th-century progenitors of modern linguistic theory, Baudouin de Courtenay and Ferdinand de Saussure.  (Well, there were other "progenitors" such as Peirce, but I especially like these two for their views on phonetics and phonology.)  Baudouin created phonemic (phonological) theory when he recognized two types of phonetic alternations--physiophonetic (single-phoneme) and psychophonetic (two-phoneme).  Saussure essentially borrowed Baudouin's concept, but he put an entirely different spin on it.  What was the difference?  Both men talked about well-formedness intuitions, but Baudouin approached language from the perspective of an individual's psycholinguistic behavior.  He thought of phonology in terms of speaker intentions, perceptions, and productions.  Saussure, influenced heavily by Durkheim, the founder of modern sociology, approached language from the direction of social interactions.  So he developed a structuralist theory of language, which was about a kind of abstract system that existed in society, not just the individual's mind.  Nevertheless, humans, being social animals, think about the social consequences of language.  So there is a psychological aspect to Saussure's approach, just as there is a social aspect to Baudouin's.  But Saussure's langue was clearly a social systemic concept.  He was less interested than Baudouin in the parole (linguistic performance) of individual speakers.

Language is a two-faced abstraction.  It is a system of communication that can be described either as a psychological system or as a social system.  If you are doing historical linguistics or sociolinguistics, then you are probably not going to think much in terms of what goes on in an individual's head.  If you are doing a study of language learning, then you are going to study individual behavior (longitudinal study) but look for generalizable trends (horizontal study).  Generative grammar is very much a psychological construct (in Baudouin's sense), not a social construct (in Saussure's sense).  Chomsky never seemed comfortable with including sociolinguistics or diachronic linguistics in the linguistics curriculum as anything other than a kind of traditional academic baggage.  Interesting, but not exactly what linguistics should be about.

My own opinion is that well-formedness intuitions play a functional role in an important social aspect of language--as a driver of language standardization.  As linguists, we are often taught to look askance at efforts to standardize language.  When I began to work on an industrial language policy known as Simplified Technical English, the late Jim McCawley once jokingly asked me how it felt to have joined the "language police".  However, standardization is almost a necessity in a diverse linguistic community, because a major function of language is to communicate thoughts accurately and efficiently.  If you pay attention to the content of discourse, you will find that a good amount of it is devoted to so-called "metapragmatic" behavior, i.e. negotiations over how to use language.  The game of well-formedness judgments is an important part of the social glue that keeps the communication flowing smoothly. 

Where do well-formedness intuitions come from?  They probably have more than one source, not just a specialized "generator" in the head.  If we are confident that our speech is understood and "normal", then we can rely on introspective linguistic production to inform us that a linguistic expression is "well-formed".  The fact that I can pronounce /blak/ easily, but not /bnak/, tells me that /blak/ is well-formed English and /bnak/ is not.  But my attempt to actually say [bnak] is difficult only if what I try to say is /bnak/.  I have no trouble articulating that phonetic form, if I think of it as a syncopated articulation of /bə'nak/, just as [blak] could be a syncopated articulation of /bə'lak/.  Moreover, I tend to mispronounce /bnak/ as [bə'nak].  So the actual calculation of well-formedness can be done entirely on the basis of how I imagine myself producing the syllable.
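To caricature that calculation: if the intuition really rests on imagined production, it behaves like a lookup against the onsets one knows how to produce.  A minimal sketch (the onset inventory here is partial and invented purely for illustration, not a claim about the full English system):

```python
# Toy phonotactic check: a syllable counts as "well-formed" only if its
# onset (the consonants before the first vowel) is in a known inventory.
# The inventory below is deliberately tiny and illustrative.
LEGAL_ONSETS = {"", "b", "bl", "br", "k", "kl", "kr", "s", "sn", "st", "str"}
VOWELS = set("aeiou")

def onset(syllable: str) -> str:
    """Return the consonant cluster before the first vowel."""
    for i, ch in enumerate(syllable):
        if ch in VOWELS:
            return syllable[:i]
    return syllable

def well_formed(syllable: str) -> bool:
    return onset(syllable) in LEGAL_ONSETS

print(well_formed("blak"))  # True  -- /bl/ is a licit onset here
print(well_formed("bnak"))  # False -- /bn/ is not
```

Of course, nothing in this sketch decides whether the inventory itself is stored knowledge or a by-product of articulatory practice--which is exactly the question at issue.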
« Last Edit: June 09, 2015, 07:57:22 PM by Copernicus »

Offline panini

  • Linguist
  • ***
  • Posts: 83
Since naive speakers do not even know what "well-formedness" means, they certainly can't have any intuitions about it. And since most linguists also don't really understand "well-formedness", they can't have intuitions about it, either. The first step, then, is to correctly identify what those intuitions are.

There is a major split in human behavior in this realm, between socially-oriented people and individually-oriented people. Those in the former group judge stimuli against the great mass of experience with other speakers of the language, whereas those in the latter call on their own resources. Though, of course, people can behave inconsistently. When eliciting data from a speaker of a language, you will notice that some speakers say "Yes, some people say that" or "I don't think I've ever heard that", indicating that their standard is more socially based. Others will say "I've heard that, but I wouldn't say that". In other words, speakers intuit about radically different things, and speakers are often extremely wrong about what other people will say.

Speakers vary in their ability to introspect about stimuli, just as individuals vary in their ability to introspect about what is bothering them, when they know something is bothering them. Some speakers are pretty useless because they cannot distinguish syntactic incorrectness from factual incorrectness from cultural anomaly. Others can generate bizarre gibberish with phonological rules applied correctly, just for the sake of the paradigm. Speakers can be all over the map in accepting actually incorrect stuff, versus rejecting correct stuff, depending on whether they are being generous and accepting your attempt to use their language (their intuitions tell them that what you said is 'well-formed' in the language despite atrocious pronunciation and minor agreement errors; or they can reject a sentence that they themselves generated, because your [ɑ] isn't quite right).

The best hope for intuitions is via a rigorous training program, a.k.a. grad school, where you learn to correctly introspect about syntax, phonology, morphology and semantics. Though, "best" doesn't mean "good". Some intuitions about phonology are easy and legitimate -- you can introspect over whether the plural of "cat" is [kædɨz], and (one hopes) realize that that is not well-formed as the plural of "cat". OTOH you cannot legitimately "intuit" whether you can say [bnak], though you can intuit (know) whether it is a word. Of course, a linguist can take an ideological position that you "can't say" [bnak]. A non-linguist may be legitimately confounded if you present them with a stimulus (assuming a decent stimulus) and ask them to repeat what you said, but we don't know whether "I can't" means that the stimulus is "ill-formed" or they are just freaking out at the annoyance of a linguist harassing them.
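The [kædɨz] judgment is tractable precisely because the regular plural rule is trivial to state over final segments.  A toy sketch (the segment classes are simplified and invented for illustration; nothing here is a serious phonological analysis):

```python
# Toy statement of the regular English plural allomorphy, conditioned on
# the final segment of the stem. Segment classes are incomplete samples.
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ"}

def plural_suffix(final_segment: str) -> str:
    """Pick the plural allomorph from the stem's final segment."""
    if final_segment in SIBILANTS:
        return "ɨz"   # "bus" -> [bʌsɨz]
    if final_segment in VOICELESS:
        return "s"    # "cat" -> [kæts]
    return "z"        # "dog" -> [dɔgz]

print(plural_suffix("t"))  # "s" -- so [kædɨz] is not the plural of "cat"
```

The introspection is "easy and legitimate" exactly where the rule is this shallow; it gets unreliable as soon as the conditioning is more abstract.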

Linguist intuitions about what they can pronounce are notoriously useless (e.g. "can you pronounce 'phone book' as [fowm bʊk]?"). There are objective measurement techniques that can be appropriately applied to the question of when the lips close and whether the tongue is raised, and these questions of fact regarding articulatory dynamics are not open to introspection.

Needless to say, people do not have "grammaticality intuitions". They may have acceptability intuitions, but unacceptability comes from many sources, many of which have nothing to do with grammar. In fact, I think that misconception was put to rest decades ago, by the end of the 70's.

Despite having latched onto and briefly exploiting this notion of "speaker intuitions" in the 60's, I think the practice of generative grammar has silently abandoned that construct, in favor of the simple question of whether a form "is in the language" or "is not in the language". We still have your dialect / my dialect problems (or your speaker / my speaker), but that is never going to go away, because individuals differ, linguistically.
 
The goal of generative grammar, then, is to model the mental grammatical computations which underlie a speaker's ability to produce and comprehend utterances in the language. This will probably also correspond to speaker intuitions, but I think we now know that the goal is to model the mental faculty, and not the set of phenomena associated with speaker introspection.

Offline Copernicus

  • Linguist
  • ***
  • Posts: 60
  • Country: us
    • Natural Phonology
Since naive speakers do not even know what "well-formedness" means, they certainly can't have any intuitions about it. And since most linguists also don't really understand "well-formedness", they can't have intuitions about it, either. The first step, then, is to correctly identify what those intuitions are.
Thanks for the comments, panini.  I do agree that "well-formedness" is a fuzzy notion when it comes to natural language.  It is clear what it means in the context of true formal languages, but talking about "intuitions of well-formedness" for natural languages can get confusing.  Non-linguists can be especially confused, because they simply don't have the conceptual tools.  Having worked with informants before, though, I can tell you that some "naive speakers" are especially good at making judgments about what is "well-formed" in their language.  This was especially true for my field experience with Breton, because none of our informants had actually been taught grammatical "prescriptions" for Breton in school; the French government had suppressed Breton during their youth.

There is a major split in human behavior in this realm, between socially-oriented people and individually-oriented people. Those in the former group judge stimuli against the great mass of experience with other speakers of the language, whereas those in the latter call on their own resources. Though, of course, people can behave inconsistently. When eliciting data from a speaker of a language, you will notice that some speakers say "Yes, some people say that" or "I don't think I've ever heard that", indicating that their standard is more socially based. Others will say "I've heard that, but I wouldn't say that". In other words, speakers intuit about radically different things, and speakers are often extremely wrong about what other people will say.
I don't know about there being two types of people in that respect, but this is consistent with what I was saying--that there is a discrepancy between what we ourselves can say (or would habitually say) and what we think others would say. 

I am tempted to invoke Chomsky's much-debated e-language/i-language dichotomy here, but I am really talking purely about what is in the individual's head.  His dichotomy was more or less about the legitimate ontological status of linguistics--what linguists ought to bother with--and I am really talking about what Tom Bever used to call "psychogrammar" back in the 1970s.  There is the separate issue of whether "e-language" is a real thing that a linguistic description can be about.  My opinion on that is that it is, but that sense of "e-language" is non-psychological.  So it is something of a separate issue.  Unlike Chomsky, I believe that sociolinguistic approaches to language are every bit as relevant to linguistic theory as psychological approaches.  It's just that the sociolinguistic models (including diachronic studies) describe a different aspect of language--its existence at the level of a speech community.  Because language can be seen both as a psychological phenomenon and a social phenomenon, linguists have long struggled with how best to approach it.  I don't believe that sociolinguists should have to ride in the back of the bus.

Speakers vary in their ability to introspect about stimuli, just as individuals vary in their ability to introspect about what is bothering them, when they know something is bothering them. Some speakers are pretty useless because they cannot distinguish syntactic incorrectness from factual incorrectness from cultural anomaly. Others can generate bizarre gibberish with phonological rules applied correctly, just for the sake of the paradigm. Speakers can be all over the map in accepting actually incorrect stuff, versus rejecting correct stuff, depending on whether they are being generous and accepting your attempt to use their language (their intuitions tell them that what you said is 'well-formed' in the language despite atrocious pronunciation and minor agreement errors; or they can reject a sentence that they themselves generated, because your [ɑ] isn't quite right).

The best hope for intuitions is via a rigorous training program, a.k.a. grad school, where you learn to correctly introspect about syntax, phonology, morphology and semantics. Though, "best" doesn't mean "good". Some intuitions about phonology are easy and legitimate -- you can introspect over whether the plural of "cat" is [kædɨz], and (one hopes) realize that that is not well-formed as the plural of "cat". OTOH you cannot legitimately "intuit" whether you can say [bnak], though you can intuit (know) whether it is a word. Of course, a linguist can take an ideological position that you "can't say" [bnak]. A non-linguist may be legitimately confounded if you present them with a stimulus (assuming a decent stimulus) and ask them to repeat what you said, but we don't know whether "I can't" means that the stimulus is "ill-formed" or they are just freaking out at the annoyance of a linguist harassing them.

Linguist intuitions about what they can pronounce are notoriously useless (e.g. "can you pronounce 'phone book' as [fowm bʊk]?"). There are objective measurement techniques that can be appropriately applied to the question of when the lips close and whether the tongue is raised, and these questions of fact regarding articulatory dynamics are not open to introspection.
I think that you are approaching this from the perspective of a linguist trying to elicit data on intuitions.  I would say that a more natural case occurs in foreign language classrooms everywhere and every day.  If you want to know what happens when English speakers try to pronounce crazy consonant clusters (from an English perspective), just observe English students of any Slavic language in a classroom setting.  Russian learners know that they have to try to pronounce initial /zd/ clusters, but they habitually either devoice the entire cluster or insert an epenthetic vowel.  To pronounce Russian correctly, they have to learn to suppress those two phonologically-grounded substitutions in their speech.  On that basis, they know a Polish name like "Zbigniew" is not a possible English name, because they don't have to suppress any deviant pronunciations in pronouncing English names.  Any legitimate theory of phonology must have some way of accounting for the intuition that such a name is ill-formed for English, even though one may ultimately learn to pronounce such Polish or Russian clusters effortlessly.
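Those two repair strategies are simple enough to state mechanically, which is part of why they are so diagnostic.  A toy sketch (the segment mappings are drastically simplified and purely illustrative):

```python
# Toy model of the two repair strategies English learners apply to an
# initial /zd/ cluster: devoice the whole cluster, or break it up with
# an epenthetic schwa. The voicing map is an illustrative fragment.
DEVOICE = {"z": "s", "d": "t", "b": "p", "g": "k", "v": "f"}

def repair_by_devoicing(cluster: str) -> str:
    """Replace every voiced obstruent in the cluster with its voiceless mate."""
    return "".join(DEVOICE.get(c, c) for c in cluster)

def repair_by_epenthesis(cluster: str) -> str:
    """Insert a schwa between the consonants of the cluster."""
    return "ə".join(cluster)

print(repair_by_devoicing("zd"))   # "st"
print(repair_by_epenthesis("zd"))  # "zəd"
```

Learning Russian then amounts to learning to suppress both mappings--and the residual pull of those mappings is what the intuition of ill-formedness registers.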

Needless to say, people do not have "grammaticality intuitions". They may have acceptability intuitions, but unacceptability comes from many sources, many of which have nothing to do with grammar. In fact, I think that misconception was put to rest decades ago, by the end of the 70's.

Despite having latched onto and briefly exploiting this notion of "speaker intuitions" in the 60's, I think the practice of generative grammar has silently abandoned that construct, in favor of the simple question of whether a form "is in the language" or "is not in the language". We still have your dialect / my dialect problems (or your speaker / my speaker), but that is never going to go away, because individuals differ, linguistically.
Having been a student in the early 60s and 70s, I would say that your chronology is slightly off, but I do like your phrase "silently abandoned".  I do think that there is something of a consensus these days that one ought not to get hung up over the issue that concerned me in the OP.  Chomsky hasn't really said much about it since his 1986 e-language/i-language dichotomy, but it seems clear to me that modern linguistic theory still rests solidly on the competence/performance dichotomy.  That is, linguists can study grammatical structure in a compartmentalized or "autonomous" fashion without having to link it to linguistic production or perception strategies. Optimality Theory seems to abandon a classic performance/competence dichotomy, but it still treats morphonological phenomena as if they were in the same ballpark as phonological phenomena.  The ignored elephant in the room has been the so-called "dead" phoneme for at least half a century now.  So there aren't a lot of illuminating linguistic studies around on what would constitute an optimal alphabetic writing system or how one can explain the nature of rhyme in poetry--two subjects that were once offered in support of the psychological reality of phonological representation before the generative era.
 
The goal of generative grammar, then, is to model the mental grammatical computations which underlie a speaker's ability to produce and comprehend utterances in the language. This will probably also correspond to speaker intuitions, but I think we now know that the goal is to model the mental faculty, and not the set of phenomena associated with speaker introspection.
In that case, you are going to have a serious problem explaining exactly what it is that motivates a generative analysis of any language.  What exactly do you think you are "computing"?  You need to have a linguistic theory that explains something a little less vague than a general "speaker's ability to produce and comprehend utterances", because the theories that generativists come up with fall considerably short of that broad goal.  What is it that generative linguistics is really trying to explain?  Are generativists nothing more than psychologists who specialize in language?  In that case, there is no need to treat generative linguistics as a subject taught outside of a psychology department.
« Last Edit: June 10, 2015, 06:42:22 PM by Copernicus »

Offline panini

  • Linguist
  • ***
  • Posts: 83
Any legitimate theory of phonology must have some way of accounting for the intuition that such a name is ill-formed for English, even though one may ultimately learn to pronounce such Polish or Russian clusters effortlessly.
Perhaps we can make a deal: you won't insult my theory of phonology by calling it illegitimate, and I won't insult your theory of phonology by calling it something else.
Quote
In that case, you are going to have a serious problem explaining exactly what it is that motivates a generative analysis of any language.  What exactly do you think you are "computing"?
What we are computing are the forms and relations generated by the grammar, which is a mental faculty defining the extension of "a language". We are not psychologists, because there exist no psychological methods for determining what those mechanisms are, hence we use linguistic methods.

The role of mentalism varies a fair amount in generative grammar, ranging from extreme Chomskian "my current theory gives profound insight into the mind" doctrine, to deep agnosticism, where one is unwilling to make a strong claim that "rule iteration" is a property of the mind. There is a significant difference between hypothesizing that a certain linguistic concept might model an actual mental entity, and axiomatically stipulating that any concept justified by a certain canon of linguistic logic is ipso facto an actual mental entity. Any linguistic computation that requires a realized infinity, such as the need to inspect an infinite set of candidates in order to determine an output form, fails the test of psychological possibility. The same goes for any theory of grammar that requires learning a fact that cannot be learned from any evidence available to a child (such as, for children learning Medieval Yiddish, that [avek] comes from /aveg/).

I will admit that I'm not particularly sanguine about the prospects in syntax, which seems to still be stuck with a terminal case of a priorism, but I won't pretend to understand what syntacticians are up to. And there is no denying that phonology took a bad turn in 1993. Despite these setbacks, generative linguistics has made significant progress over the past 50 years, so the fact that we are considerably short of the ultimate goal is no more significant than the fact that theoretical physics is, after millennia of research, short of its ultimate goal.

Offline Daniel

  • Administrator
  • Experienced Linguist
  • *****
  • Posts: 1581
  • Country: us
    • English
I see two possibilities:

Short version:
1. We either can or cannot generate a sentence with our grammars.
2. We either do or do not find an easy path toward assigning meaning to the sentence pattern as a whole.


Expanded:

1. (roughly the standard Generativist position) We have a formal system for language and sentences are either generated by it or not. Our intuitions, unless clouded by 'performance' factors (as they say), will reflect whether or not our grammars do in fact generate a given sentence.

2. (along the lines of my current working hypothesis, one version of expanding the short version above) We have lots of habits regarding language, maybe there's even some underlying 'competence' and formal system in there somewhere, but the surface forms themselves are (or at least can be) arbitrary, and what we find to be well-formed is when we have established mappings between parts of the form ("constructions") in such a way that the whole sentence adds up to a meaningful idea.


So, either Generativism or pattern matching. Are there other options?
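Option 1 can be made concrete with a toy grammar: a string either is or is not generated, full stop, with no gradience.  A minimal sketch (the grammar and lexicon are invented for illustration, obviously nothing like a real fragment of English):

```python
# A toy context-free grammar: membership is all-or-nothing, which is the
# core of the standard Generativist picture of well-formedness.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["det", "n"]],
    "VP": [["v"], ["v", "NP"]],
}
LEXICON = {"det": {"the"}, "n": {"cat", "dog"}, "v": {"saw", "slept"}}

def derives(symbols, words):
    """True iff this symbol sequence can derive exactly this word sequence."""
    if not symbols:
        return not words
    head, rest = symbols[0], symbols[1:]
    if head in LEXICON:  # preterminal: must match the next word
        return bool(words) and words[0] in LEXICON[head] and derives(rest, words[1:])
    return any(derives(expansion + rest, words) for expansion in GRAMMAR[head])

print(derives(["S"], "the cat saw the dog".split()))  # True
print(derives(["S"], "cat the saw".split()))          # False
```

Option 2 would instead score how well the parts of the string map onto stored constructions, so "well-formed" comes out as a matter of degree rather than membership.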
Welcome to Linguist Forum! If you have any questions, please ask.

Offline Copernicus

  • Linguist
  • ***
  • Posts: 60
  • Country: us
    • Natural Phonology
Any legitimate theory of phonology must have some way of accounting for the intuition that such a name is ill-formed for English, even though one may ultimately learn to pronounce such Polish or Russian clusters effortlessly.
Perhaps we can make a deal: you won't insult my theory of phonology by calling it illegitimate, and I won't insult your theory of phonology by calling it something else.
I'm sorry if what I said offended you.  That is my honest opinion, and I actually expected you to agree with it.  After all, you did seem to think that linguistic analysis ought to be based on which forms are considered "in" the language and which are not.  So I don't really know why you felt I was insulting your theory of phonology.  Still, no offense intended.

Quote
In that case, you are going to have a serious problem explaining exactly what it is that motivates a generative analysis of any language.  What exactly do you think you are "computing"?
What we are computing are the forms and relations generated by the grammar, which is a mental faculty defining the extension of "a language". We are not psychologists, because there exist no psychological methods for determining what those mechanisms are, hence we use linguistic methods.
That leaves me puzzled as to what you think motivates a linguistic analysis.  Somebody has to collect data to analyze.  The type of data a person analyzes is very much motivated by that person's theoretical framework.  And there are really only two ways to construe a "language system"--as a psychological construct or as a social construct.  You have to have some means of determining what counts as valid linguistic forms and relations, don't you? 

The role of mentalism varies a fair amount in generative grammar, ranging from extreme Chomskian "my current theory gives profound insight into the mind" doctrine, to deep agnosticism, where one is unwilling to make a strong claim that "rule iteration" is a property of the mind.  There is a significant difference between hypothesizing that a certain linguistic concept might model an actual mental entity, and axiomatically stipulating that any concept justified by a certain canon of linguistic logic is ipso facto an actual mental entity...
I understand that there are linguists out there who are very uncomfortable with taking a stand on the psychological status of their formal linguistic description, but nobody can be agnostic about the validity of the data they are analyzing.  If one is merely describing patterns in the data and has no interest in the actual processes that produced the behavior, people can certainly question why they should be interested in the description.  How does the mere description of patterns relate to the behavior that generated them?

...Any linguistic computation that requires a realized infinity, such as the need to inspect an infinite set of candidates in order to determine an output form, fails the test of psychological possibility. The same goes for any theory of grammar that requires learning a fact that cannot be learned from any evidence available to a child (such as, for children learning Medieval Yiddish, that [avek] comes from /aveg/).
But now you seem to take the position that linguistic theory ought to be relevant to psychological processes, and you are making the same kind of generalizations that you chided me for at the beginning of your post.  How can you be agnostic about the psychological status of your analysis and then go on to say that it has to have psychological relevance?

I will admit that I'm not particularly sanguine about the prospects in syntax, which seems to still be stuck with a terminal case of a priorism, but I won't pretend to understand what syntacticians are up to. And there is no denying that phonology took a bad turn in 1993. Despite these setbacks, generative linguistics has made significant progress over the past 50 years, so the fact that we are considerably short of the ultimate goal is no more significant than the fact that theoretical physics is, after millennia of research, short of its ultimate goal.
IMO, phonology took a bad turn in the 1960s with SPE, and its current lack of popularity has a lot to do with that fact.  But I've already been over that ground.  I agree that linguistic theory has made a lot of advances in the past 50 years, but I'm not so sure that one can say the same for linguistic formalism.  There certainly is nothing like the same level of enthusiasm and interest in linguistics that existed before 1980.  Perhaps what you describe as "agnosticism" has something to do with that.

Offline Copernicus

  • Linguist
  • ***
  • Posts: 60
  • Country: us
    • Natural Phonology
I see two possibilities:

Short version:
1. We either can or cannot generate a sentence with our grammars.
2. We either do or do not find an easy path toward assigning meaning to the sentence pattern as a whole.


Expanded:

1. (roughly the standard Generativist position) We have a formal system for language and sentences are either generated by it or not. Our intuitions, unless clouded by 'performance' factors (as they say), will reflect whether or not our grammars do in fact generate a given sentence.

2. (along the lines of my current working hypothesis, one version of expanding the short version above) We have lots of habits regarding language, maybe there's even some underlying 'competence' and formal system in there somewhere, but the surface forms themselves are (or at least can be) arbitrary, and what we find to be well-formed is when we have established mappings between parts of the form ("constructions") in such a way that the whole sentence adds up to a meaningful idea.


So, either Generativism or pattern matching. Are there other options?
I like your second option better, of course.  :)  The problem with classical generative theory is that it cannot really exist independently of a theory of performance.  So you have to make a priori judgments about what is or is not an aspect of that performance "clouding factor". 

I do find theories that highlight communicative function, for example, Construction Grammar, more interesting than those that try to ignore it.  When we merely look at sentences and try to explain structure independently of discourse, we run into all sorts of problems.  One is that we tend to assume a tacit disambiguating context.  The problem is that the same string of words can mean many different things and have many different structural analyses.  Indeed, you get a combinatorial explosion of structural analyses as sentence length increases.  And it is not insignificant from a semantic perspective that there are many different ways to say roughly the same thing, even in a specific discourse context.  So I have always been fascinated by the question of precisely why we choose the words we do to express the meanings we intend.  I think that sentence grammar can only work when embedded in a theory of discourse. 

Offline Daniel

  • Administrator
  • Experienced Linguist
  • *****
  • Posts: 1581
  • Country: us
    • English
Oddly enough, I sort of look at it the other way around: I'd like to analyze syntax out of context. But doing so feels limited, so maybe in the end I do agree with you. I'm not sure it's context/pragmatics, though, as much as it is arbitrary constructions. I don't see why those have to necessarily be connected.
(But your point about ambiguity is a good one-- I haven't thought of a way to test just how ambiguous sentences are in average usage, but I agree with your intuition that often the sentence-- the words-- stand in a context without worrying about which meaning is intended.)
Welcome to Linguist Forum! If you have any questions, please ask.

Offline Copernicus

  • Linguist
  • ***
  • Posts: 60
  • Country: us
    • Natural Phonology
When I went to work for Boeing in 1987, I was hired to maintain and develop the grammar of a rather neat bottom-up syntactic parser that was grounded primarily in GPSG theory, which had basically an implied semantic component.  So I have had rather extensive experience in trying to build practical applications under a sentence-based, autonomous approach to syntax.  Even in my retirement, I still help maintain that system part time for them.  It is probably one of the most sophisticated parsers in existence, but that is because we have had it under development and deployed in a grammar-checking application since the late 1980s (See Boeing Simplified English Checker).  We named the parser under its covers the "Sapir parser", because Jim Hoard, who was managing the team, was a well-established linguist with a great love of all things linguistic.

Right from the beginning, it was obvious that not having any semantic underpinning was a great disadvantage to our ultimate goal of extracting information from text, but having structural "hooks" on which to hang semantic interpretations was extremely useful.  It turned out that we didn't need much semantics to create a useful style and grammar checker, although the great weakness in it is that it really cannot help with word sense disambiguation except at a very superficial level (mostly part of speech checking).  Jim did create an interesting "semantic graph" methodology for representing superficial aspects of situational semantics--primarily case roles and topics--but we didn't use it in our production system.  Semantic theory and deep understanding really require what we jokingly call "AI-complete" computation.  In other words, you need to implement full "word-guided mental telepathy", and there just isn't any feasible way to implement that, given our present state of knowledge and technology.

What we learned very quickly with our GPSG/HPSG approach was that the grammar grew like a weed.  We ran the system against Wall Street Journal articles at first, and technical aircraft documentation later.  The trick was to converge on the most reasonable parse by constraining individual syntactic rules with arbitrary weights so that the final display (which we call the "fronted tree") looked like what a reasonable writer would have written in the context of our target domain, which became maintenance instructions.  There were a lot of practical compromises, as you might expect, but the system works reasonably well as a standalone generator of single-sentence utterances, as long as you don't particularly care that the parse is exactly right for the writer's intended meaning.  What it needed to get right was to identify the constituents relevant to grammar and style restrictions imposed on technical writers, and that was an achievable goal (except for the need to resolve word senses in context).
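The rule-weighting idea can be sketched roughly as follows.  This is a hypothetical toy, not Boeing's code: the rule names, penalty values, and data structures here are all invented, and the real system's weighting scheme is surely more involved.  Each candidate parse accumulates penalties from the rules that built it, and the lowest-penalty tree is the one "fronted" for display.

```python
# Arbitrary penalties an analyst might assign to grammar rules
# (all names and values invented for illustration).
RULE_PENALTY = {
    "np_attach_pp": 0,        # prefer PP attachment to the nearest noun phrase
    "vp_attach_pp": 2,        # penalize attachment to the verb phrase
    "noun_noun_compound": 1,
}

def parse_penalty(rules_used):
    """Total penalty of one candidate parse, given the rules that built it."""
    return sum(RULE_PENALTY.get(r, 0) for r in rules_used)

def front_parse(candidates):
    """Pick the lowest-penalty candidate: the tree shown to the writer."""
    return min(candidates, key=lambda c: parse_penalty(c["rules"]))

# Two candidate analyses of the same PP-attachment ambiguity.
candidates = [
    {"tree": "(S (NP ...) (VP ... (PP ...)))", "rules": ["vp_attach_pp"]},
    {"tree": "(S (NP ... (PP ...)) (VP ...))", "rules": ["np_attach_pp"]},
]
best = front_parse(candidates)
```

The interesting design point is that nothing here is semantic: the weights simply encode what a reasonable writer in the target domain would most likely have meant, which is exactly the kind of practical compromise described above.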

What theoretical syntacticians not working in NLP tend to underestimate is the sheer size of the problem of structural ambiguity.  You see that when you have a 20-word sentence in which an ambiguous word or a conjunction might trigger hundreds, even thousands, of potential parses for the entire string of words.  When you start wading through that forest, you see that most of the parses are quite legitimate, given some imaginary discourse context, but the most likely parses are very few.  Well, that is why statistical techniques work so well--the constraining effect of context.  The point is that syntax is not really autonomous.  Syntactic configurations always mean something, and what they mean can only be calculated within a frame of discourse.  This was something that Roger Schank had realized years before in his pioneering work on contextual interpretation.  You can easily eliminate ambiguity, but you can only do so if you already know what you are talking about.
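A back-of-the-envelope way to see the scale of the problem (this is a pure combinatorial illustration, not anything from the Sapir parser itself): if every binary bracketing of a w-word string were structurally admissible, the number of candidate trees would grow as the Catalan numbers.

```python
from math import comb

def catalan(n: int) -> int:
    """Catalan number C(n): distinct binary-branching trees over n + 1 leaves."""
    return comb(2 * n, n) // (n + 1)

# Upper bound on binary bracketings of a w-word string: catalan(w - 1).
for w in (5, 10, 20):
    print(f"{w:2d} words -> {catalan(w - 1):,} bracketings")
```

A real grammar admits far fewer trees than this upper bound, but even a small fraction of catalan(19) leaves a 20-word sentence with an enormous parse forest, which is why domain constraints and statistical context prune so effectively.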

 

Offline panini

  • Linguist
  • ***
  • Posts: 83
So, first: I don't understand the meaning or importance of "motivating a linguistic analysis" (more generally, I don't understand "motivating"). What motivates me to get out of bed is the knowledge that I can tend my garden or write a book, depending on my mood. I assume that you're asking about justification, according to intellectual / epistemological criteria, specifically, "What justifies analysis X of language Q, as opposed to analysis Y of language Q".

But: you then observe that "Somebody has to collect data to analyze.  The type of data a person analyzes is very much motivated by that person's theoretical framework". This suggests that you are not asking about evidence / warrant / justification and other such epistemological constructs, you are actually interested in the personality question "what motivates me to get out of bed?".

I don't understand what you mean by "theoretical framework". In terms of names, I'm a substance-non-abusing Formal Phonologist. I also do field work (more of the latter than the former). My theoretical framework absolutely does not determine or motivate the kind of data that I collect and therefore analyze. I won't deny that some people who are primarily theoreticians but who also dabble with first-hand language description only ask a limited set of questions about long-distance extractions out of islands, but that's a reflection on poor understanding of best practices for field work. As to why I tend to gather / analyze more of one kind of data than another, that is because the day is finite and I believe in greater depth rather than greater breadth.

There are a number of ways to determine what counts as valid linguistic forms and relations in a language. Actual speaker behavior is the main one. If I ask a speaker "Can you say [tuʔəɬəd čəd kʷi sʔuladxʷ]?" and they say "Hell no" (and are consistent, and assuming I've done due diligence with the context, like saying that I'm talking about a dream), then I am entitled to exclude it from the set. Or, they may say "Hell yes" (and I would of course do due diligence on that as well). In the latter case, they might even volunteer that sentence. In any event, I may be relying on the speaker's ability to introspect, but what I'm accounting for is the behavior -- not just passive behavior, but active and reflective behavior.

As I mentioned before (and every generative linguist has mentioned before), a grammar is an aspect of the mind, which is an entity of individual psychology. So a grammar is a psychological entity. That does not make generative linguists psychologists -- it makes them mentalists.

The "mere" description of patterns relates to the causal mental mechanisms in that it provides something that a hypothesized mental mechanism actually does -- it is a logical precursor to theory-construction. If for instance there were a hypothesized mental mechanism Fred that had no known effect, Occam's Razor would eventually slice Fred out of existence. Sometimes, the evidence (the merely-described patterns) suggests but does not prove the existence of a causal mechanism. So being agnostic about the mental reality of a particular theoretical construct is well justified.

And I urge you to reflect on the claim that "nobody can be agnostic about the validity of the data they are analyzing". There are three attitudes that you can have about data: "I know that X is part of language Y", "I know that X is not part of language Y" and "I don't know if X is part of language Y". If you don't have sufficient evidence that X is part of Y, you shouldn't conclude that X is part of Y. So not only can people be agnostic about supposed data, they should be.

Given that I know that a grammar is a thing of individual psychology, I know that knowledge of the nature of a grammar is "psychologically relevant". It does not follow that I know that every conjectured specific mechanism of grammar is in fact a mechanism of grammar, and if I don't know whether e.g. rule iteration is a property of grammar, then I don't know if it is psychologically valid.

Offline Copernicus

  • Linguist
  • ***
  • Posts: 60
  • Country: us
    • Natural Phonology
So, first: I don't understand the meaning or importance of "motivating a linguistic analysis" (more generally, I don't understand "motivating"). What motivates me to get out of bed is the knowledge that I can tend my garden or write a book, depending on my mood. I assume that you're asking about justification, according to intellectual / epistemological criteria, specifically, "What justifies analysis X of language Q, as opposed to analysis Y of language Q".

But: you then observe that "Somebody has to collect data to analyze.  The type of data a person analyzes is very much motivated by that person's theoretical framework". This suggests that you are not asking about evidence / warrant / justification and other such epistemological constructs, you are actually interested in the personality question "what motivates me to get out of bed?".
Not at all, and please don't take anything I say here personally.  What I was referring to was more in the sense of a Kuhnian paradigm, where the data under analysis shifts depending on one's theoretical assumptions.  So pre-generative descriptivists shunned linguistic intuitions and pretended that their corpora were somehow acquired in an objective fashion.  The data one collects, then, depends very much on what one's theory dictates.  Let me digress with an anecdote to illustrate my point.

My first field methods course was with Mary Haas in 1970, who did things strictly by the descriptivist playbook.  We had an Aymara informant, Mr. Yapita, whose language Professor Haas did not know.  As her ideological principles demanded, we were required first to work out the phonetics.  Only then could we proceed to phonemic analysis, but not morphology.  After the phonemic inventory was established, we could proceed to address morphology.  As you can imagine, that didn't sit well with young generative linguists, so we accosted Mr. Yapita outside of the classroom to ask questions about the more interesting aspects of his language and query his intuitions.  When we tried to swear him to secrecy, he laughed and told us that Professor Haas was also meeting him outside the classroom and asking roughly the same questions.  :)  As a competent linguist, she knew the difference between theory and practice, but she was letting theory dictate the practice in the classroom.  She didn't want to have to explain why to her students.

I don't understand what you mean by "theoretical framework". In terms of names, I'm a substance-non-abusing Formal Phonologist. I also do field work (more of the latter than the former). My theoretical framework absolutely does not determine or motivate the kind of data that I collect and therefore analyze. I won't deny that some people who are primarily theoreticians but who also dabble with first-hand language description only ask a limited set of questions about long-distance extractions out of islands, but that's a reflection on poor understanding of best practices for field work. As to why I tend to gather / analyze more of one kind of data than another, that is because the day is finite and I believe in greater depth rather than greater breadth.
Perhaps I should have said "theoretical paradigm" rather than "theoretical framework".  Luckily, one has great tools these days when analyzing the speech of informants.  We used to have to make sound spectrographs on a rotating drum on paper that gave off a poisonous gas.  :)  Nowadays, you can produce acoustic analyses instantly and at will on a portable computer.  Still, the aspects of the data that you pay attention to will very much depend on the goals directing your research.  It is well known that hypotheses act as a filter on data collection, which is not a random activity.

There are a number of ways to determine what counts as valid linguistic forms and relations in a language. Actual speaker behavior is the main one. If I ask a speaker "Can you say [tuʔəɬəd čəd kʷi sʔuladxʷ]?" and they say "Hell no" (and are consistent, and assuming I've done due diligence with the context, like saying that I'm talking about a dream), then I am entitled to exclude it from the set. Or, they may say "Hell yes" (and I would of course do due diligence on that as well). In the latter case, they might even volunteer that sentence. In any event, I may be relying on the speaker's ability to introspect, but what I'm accounting for is the behavior -- not just passive behavior, but active and reflective behavior.
If I were analyzing morphonology and not phonology (as I define them, of course), then I might ask such a question.  Indeed, when I studied Breton as a grad assistant, we were sometimes looking at the behavior of Celtic mutations under conditions of language death.  So the lead researcher would try to get a younger and older pair of informants together to ask such questions about intuitions.  He then compared the answers to see which mutations were decaying fastest.  But, when we studied the phonemic inventory, we never paid attention to how speakers thought they would answer such questions.  We listened to recordings and voted on what we thought was being pronounced.  Nowadays, we would just look at the display of a spectrogram.  The problem is that naive speakers have a lot of false intuitions about what they can and can't say.

As I mentioned before (and every generative linguist has mentioned before), a grammar is an aspect of the mind, which is an entity of individual psychology. So a grammar is a psychological entity. That does not make generative linguists psychologists -- it makes them mentalists.
So here is where I push back a little.  Just what is it that you think "an entity of the mind" is?  I do not really understand what you think the difference is between a psychologist and a "mentalist", but I do think that psychologists would consider the subject matter of "mentalism" to be solidly within their purview of study.  I don't think that you can do "mentalism" without at least relating it to behavior that is mentally driven.

The "mere" description of patterns relates to the causal mental mechanisms in that it provides something that a hypothesized mental mechanism actually does -- it is a logical precursor to theory-construction. If for instance there were a hypothesized mental mechanism Fred that had no known effect, Occam's Razor would eventually slice Fred out of existence. Sometimes, the evidence (the merely-described patterns) suggests but does not prove the existence of a causal mechanism. So being agnostic about the mental reality of a particular theoretical construct is well justified.
I suggest that Occam's Razor applies sooner rather than later to cut Fred out of the picture.  If Fred has no known effect, then he cannot be studied.  Period.  Science never actually "proves" anything in an absolute sense, but empiricism demands at least a modicum of plausibility.  If you hypothesize a "Fred", then you need to be able to say something about what Fred does.  Otherwise, there is no point in keeping him around.

And I urge you to reflect on the claim that "nobody can be agnostic about the validity of the data they are analyzing". There are three attitudes that you can have about data: "I know that X is part of language Y", "I know that X is not part of language Y" and "I don't know if X is part of language Y". If you don't have sufficient evidence that X is part of Y, you shouldn't conclude that X is part of Y. So not only can people be agnostic about supposed data, they should be.
What we are discussing here is what it means to even say whether "X is part of language Y".  How do you know that "X" can be "part of language Y"?  That is not a trivial question.  It requires you to take a stand on just what it is that you are trying to prove.  The same question arose in the last century, when linguists were told to shut up and just confine themselves to studying a "corpus" that was gathered by means that we won't bother to discuss seriously.  That actually led to a lot of confusion and debate that, apparently, has still not been resolved.

Given that I know that a grammar is a thing of individual psychology, I know that knowledge of the nature of a grammar is "psychologically relevant". It does not follow that I know that every conjectured specific mechanism of grammar is in fact a mechanism of grammar, and if I don't know whether e.g. rule iteration is a property of grammar, then I don't know if it is psychologically valid.
I certainly agree with you that "grammar" can be construed as psychologically relevant.  I question that you can get away with remaining agnostic on its psychological status.  That's the kind of thing that really comes back to bite you in the end.

We do know that linguistic structure can be recursively and iteratively defined, but that just means that our theory must account for its iterative and recursive properties.  We also know that people are not random sentence generators.  That, too, must have a theoretical explanation.  Let's make a distinction between structures of infinite length (not linguistic) and structures of indeterminate length (linguistic).  Now we can talk about the factors that limit iteration and recursion.  That's where the interesting questions come in about the relationship between structure and communicative function.  You can't really have one without the other.
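The distinction between infinite and indeterminate length can be made concrete with a toy recursive grammar.  Everything here is invented for illustration: the grammar licenses unbounded embedding, while an external cap -- standing in for performance factors like memory and communicative need -- keeps every actual output finite.

```python
import random

# Toy recursive grammar (invented for illustration): NP can embed a
# relative clause containing another VP, so there is no longest sentence,
# yet every generated sentence is finite once recursion is capped.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["the", "N"], ["the", "N", "that", "VP"]],  # recursive expansion
    "VP": [["V", "NP"], ["V"]],
    "N":  [["dog"], ["cat"], ["linguist"]],
    "V":  [["saw"], ["chased"]],
}

# Expansions that re-enter the recursion; suppressed past the depth cap.
RECURSIVE = {"NP": [["the", "N", "that", "VP"]], "VP": [["V", "NP"]]}

def generate(symbol="S", depth=0, max_depth=4, rng=random):
    """Expand `symbol` top-down, cutting off recursive options past max_depth."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal word
    options = GRAMMAR[symbol]
    if depth >= max_depth:
        options = [o for o in options if o not in RECURSIVE.get(symbol, [])]
    words = []
    for sym in rng.choice(options):
        words.extend(generate(sym, depth + 1, max_depth, rng))
    return words

sentence = " ".join(generate(rng=random.Random(0)))
```

Swap the hard cap for a probability of recursing that decays with depth and you get the same qualitative behavior: sentences of indeterminate but always finite length, which is exactly the profile that a functional account of the limits on iteration and recursion needs to explain.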
« Last Edit: June 12, 2015, 03:26:05 PM by Copernicus »