Linguist Forum

Specializations => Morphosyntax => Topic started by: casey61694 on November 22, 2014, 08:52:03 PM

Title: Another question
Post by: casey61694 on November 22, 2014, 08:52:03 PM
Is case an inevitable part of natural language? Has its evolution ever been traced? Does its "emergence" usually follow the "emergence" of a pronominal system?
Title: Re: Another question
Post by: MalFet on November 23, 2014, 02:52:01 AM
Welcome to the forum!

Why would case be inevitable? There are lots of languages without any case marking.
Title: Re: Another question
Post by: casey61694 on November 23, 2014, 08:41:00 AM
It just seems to me that the idea underlying "case" is always present in language, explicit in some languages and implicit in others. So it is just a matter of time before this idea is inevitably expressed explicitly (to clarify relationships between verb or action participants). Are the languages without case marking extremely well-structured to erase any possible ambiguity?
Title: Re: Another question
Post by: freknu on November 23, 2014, 11:44:11 AM
Just look at English — aside from pronouns and the possessive (which could be argued to be a suffix rather than a case, similar to Swedish) there are virtually no cases in English. While languages such as Icelandic and Russian have several cases, Yue and Japanese, AFAIK, have no cases at all, and certain African languages have a metric tonne of cases.
Title: Re: Another question
Post by: Daniel on November 23, 2014, 12:07:22 PM
Case is not found in every language. Sometimes it appears (grammaticalizes) and sometimes it is lost. In that sense, it is not universal as a morphological category so it is not "inevitable". (It is possible that in the long history of languages they'll eventually cycle through it-- it's a relatively common phenomenon, so a language might, every 5,000 or 20,000 years, end up cycling back through having case. In that sense it might be "inevitable", but I don't think that's what you mean.)

On the other hand, if you're asking if it is "not a weird part of language" then that's probably true-- it's found in a lot of places around the world.

However, case is just one instance of marking participant roles in a sentence. Another option is word order as in English. And another is agreement on the verb (where, for example, Hungarian verbs agree with subjects and objects). Case is a sort of dependent marking (rather than head marking, as agreement would be) and, obviously, an overt morphological indication as opposed to just syntax.

Historically speaking, case always (almost always?) originates from post-positions. Interestingly, prepositions rarely become case markers (I don't know of any examples, but there may be some). So it often occurs in SOV languages (verb-final, head-final). In Japanese, for example, the "particles" could be considered case markers or postpositions. Over time, the postpositions become morphologically integrated into the nouns, and you get case. Then it may also extend to adjectives and so forth through agreement. This is just one aspect of a major cycle in which languages go from isolating (no/little morphology, as in Chinese, or, in some ways, English today) to agglutinative (regular morphology, just "glued" together, like Turkish, Swahili) to inflectional/fusional (irregular and non-isomorphic [many-meanings-to-one-form] morphology, like Latin, German, etc.). Then the inflections are lost, as happened from Old English to Modern English for the most part, and you end up cycling again and again over tens of thousands of years.

That's the answer from a typological/descriptive/morphological point of view.

But another answer would come from syntactic theory (specifically Generativism), which I think might be what you're thinking about. In many versions of syntactic theory, case is considered an inherent part of language. Case checking, the assignment of theta roles, and so forth are natural and ubiquitous parts of the compositional/computational system. Some languages overtly manifest morphological encoding for case, but all of them are believed to use case in the derivation of sentences-- for example, structural nominative case must be assigned to the subject or the sentence is ungrammatical. This can explain things like the subject moving "up" to get to where case can be assigned, and so forth. In this sense, case is a structural relationship. (Note that structural case is different from inherent case-- many cases, basically other than nominative and accusative, are not structural. They're just inherent, part of the semantics. That's true in a language like Finnish with 15+ cases.)

But that's just one kind of theory.

So in general the word "case" is ambiguous between structural case in Generative theories, and overt morphological case marking.
Title: Re: Another question
Post by: casey61694 on November 23, 2014, 12:40:13 PM
Wow. Thank you all for making this an awesome resource for amateur linguists! I'll definitely be using this site regularly! Hopefully, in time, I'll be able to help out others and start some interesting talks.
Title: Re: Another question
Post by: casey61694 on November 27, 2014, 11:04:02 AM
I hope it's okay that I post more random questions here. My latest question is: why do embedded/indirect questions seem to break down grammatically after two "embeddings"? For example, I cannot think of a case in which a question is embedded inside of another embedded question which is in turn embedded inside of another question. I cannot get beyond a "double" embedded sentence like "I wonder how you know what I'm bringing" or "I asked why you wondered whether we would be late". Why is this? Why only two?
I appreciate any help. Thank you!
Title: Re: Another question
Post by: freknu on November 27, 2014, 11:25:26 AM
Quote
I hope it's okay that I post more random questions here. My latest question is: why do embedded/indirect questions seem to break down grammatically after two "embeddings"? For example, I cannot think of a case in which a question is embedded inside of another embedded question which is in turn embedded inside of another question. I cannot get beyond a "double" embedded sentence like "I wonder how you know what I'm bringing" or "I asked why you wondered whether we would be late". Why is this? Why only two?
I appreciate any help. Thank you!

"I asked why you wondered whether they knew if they had foreknowledge of whether it had any probability of explaining whether what happened was in any way indicative of whether the recent events were the result of earlier problems or whether it begs the question of whether there was a problem to begin with?"

Hm... how many would that be?
Title: Re: Another question
Post by: casey61694 on November 27, 2014, 11:51:17 AM
Where is your respect for the linguist noobie?  This does help me though.  Thank you freknu!
Title: Re: Another question
Post by: Daniel on November 27, 2014, 08:32:01 PM
But freknu, do you understand that sentence? I don't. As a syntactician I can analyze it as grammatical and maybe jot down some notes to keep up then eventually translate it into real English... but I can't process it. It's possible that if you pronounced it with just the right intonation I'd be able to follow, but I think that's going to be tricky :)

The distinction is between competence (where deep embeddings like that are grammatical) and performance (where they are unacceptable and/or incomprehensible). (Many linguists make that distinction, on the assumption that there is a difference between what we know and how that translates into usage/behavior. Some would say, though, that any sentence we can't use is ungrammatical and that everything is just performance.)

So the explanation is that while syntactic structure is infinitely recursive, after a few levels, in some cases, we lose track of what's going on and get confused.

The classic example of this is center embedding as exemplified by:
The cat ran.
The cat the dog chased ran.
?The cat the dog the rat bit chased ran.
???The cat the dog the rat the bird saw bit chased ran.
...

In fact, I think I saw experimental evidence that speakers prefer this sentence on first glance:
The cat the dog the rat bit ran.
It's actually ungrammatical, because there aren't enough verbs for the subjects, but we're terrible at parsing deep center embeddings.

The question then is: why is there a limit? Is there a limit to grammatical structure, such that you can only embed twice, or three times? Or is there some limit to our processing ability? Generally it's attributed to the latter, although some would argue it's strange to say that we could "in theory" parse 100 levels of embedding, and would instead posit some actual limit in the grammar. Interesting questions there.

One argument I've heard (recently mentioned to me by a colleague) is that this may be working memory-- we can only remember so many things, and center embedding makes you remember more things-- the cat, the dog, the rat, the bird...-- and at some point we will get lost. But other kinds of sentences allow you to package the information together (such as conjunctions-- "the cat and the dog and the rat and the bird...") so you won't get as confused, just using one unit of memory for the whole thing. So the argument would be that the grammatical structure "forks" sometimes and that means you're using more memory, and sometimes merges back together, at which point you're using less. If you reach the maximum (perhaps 4 units of memory or so) you'll get confused.


Now, on the other hand, there are also some specific restrictions, sometimes called (or including) "island constraints", on what can be questioned. So that isn't about "only two" but about the specific meaning of the adverbial being questioned. Consider this odd sentence:
How did you wonder [whether] I would fix the car?
Can "how" refer to the embedded clause? vs:
What did you wonder whether I would fix?

In that case, it depends on the structure, not on levels of embedding (though embedding at all, regardless of how many levels, could have an effect).
Title: Re: Another question
Post by: casey61694 on November 27, 2014, 10:13:58 PM
Forgive me if I ask or talk about something you already covered. I tried to get as much out of what you said as I could. 

In exploring "multiple" embedded questions, I found that you have to be mindful of the order of the interrogative [spec, CP]'s you use since the first has the widest impact on the rest. The second interrogative [spec, CP] has the second-widest impact on the rest and so on. The first interrogative [spec, CP] should therefore be general and not have a specific subject/object role in the following clause (like "why", "whether" ...). After I discovered that, I got "I asked if she wonders why I memorize what day people are born on".

Even if we could parse 100 levels of compounded embeddings, I think most would choose to be economical and break it up. Would it even be possible to create a program to extract meaning out of something so complicated? How would you go about it?

Also, could you expand a little on island constraints?

Sorry if this is really disorganized.  I'm ready for sleep.
Title: Re: Another question
Post by: Daniel on November 27, 2014, 11:04:14 PM
I'm not sure what you mean by "widest impact". Formalizing this might help to answer your own question, actually.
According to generative grammar (and in general most approaches which assume infinite recursion) there is no inherent difference between something at the first level of embedding and the fifth or 100th. Sentences are either grammatical or not. The only explanation that could be compatible with what you're saying would be something about processing-- certainly the first word we hear in a sentence could bias us against the rest. If that's all it is, I don't think it's a very deep issue. But if it's something more, it might be important.

Quote
Even if we could parse 100 levels of compounded embeddings, I think most would choose to be economical and break it up. Would it even be possible to create a program to extract meaning out of something so complicated? How would you go about it?
If you could write a program that parses a few levels (say, up to 5) recursively, and you gave it enough memory and processing power, then yes, certainly it could do 100 or 1,000,000, or whatever.
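To make that concrete, here is a toy sketch in Python. The mini-grammar, token format, and function names are invented for illustration (this is not a real parser): the point is just that the same few recursive lines handle 2 levels and 100 levels alike.

```python
# Toy recursive-descent recognizer for embedded questions.
# Invented mini-grammar: S -> SUBJ VERB (WH S)?
# Subjects and verbs are single pre-split tokens; this is only a sketch.

WH = {"how", "what", "why", "whether", "if"}

def parse(tokens, i=0, depth=0):
    """Consume SUBJ VERB, then recurse if a wh-word opens a new clause."""
    i += 2                                   # consume SUBJ and VERB
    if i < len(tokens) and tokens[i] in WH:  # an embedded question follows
        return parse(tokens, i + 1, depth + 1)
    return i, depth

# "I wonder how you know what I'm bringing": two levels of embedding
_, levels = parse(["I", "wonder", "how", "you", "know", "what", "I'm", "bringing"])
print(levels)  # 2

# A hundred levels are no harder for the program, only for the human
deep = ["I", "ask"] + ["why", "you", "wonder"] * 99 + ["why", "we", "left"]
_, levels = parse(deep)
print(levels)  # 100
```

The grammar itself never mentions depth, so any limit has to come from outside the grammar (memory, processing), which is the competence/performance point above.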

Quote
Also, could you expand a little on island constraints?
Lots to read about that, with a very long history. Some of the early work that might interest you can be found in Ross 1967:
Ross, J. R. (1967). Constraints on Variables in Syntax (Ph.D dissertation). Massachusetts Institute of Technology, Cambridge, MA. [You can find a free-to-view PDF on the MIT website.]
There will also be a ton of information online.
https://www.google.com/search?q=island+constraints
Title: Re: Another question
Post by: freknu on November 28, 2014, 06:21:49 AM
Quote
But freknu, do you understand that sentence?

Of course I can, silly... :\ *scratches head*

There is also nothing wrong with "the cat that the dog chased that the rat bit, ran". Then again, maybe native English speakers don't use structures like that; I might just be using my native logic in speaking non-native English.
Title: Re: Another question
Post by: jkpate on November 28, 2014, 06:47:28 AM
Quote
Even if we could parse 100 levels of compounded embeddings, I think most would choose to be economical and break it up. Would it even be possible to create a program to extract meaning out of something so complicated? How would you go about it?
If you could write a program that parses a few levels (say, up to 5) recursively, and you gave it enough memory and processing power, then yes, certainly it could do 100 or 1,000,000, or whatever.

Just to follow up on this, one reason center-embedding has attracted so much attention is that it seems like it should be easy from a computational perspective. We know how to write grammars and parsers that in principle handle unlimited center embedding, but our computer programs struggle with other things that are easy for humans (like co-reference resolution).

djr33 is right that the general feeling among computational people is that it has something to do with working memory and incremental processing. Computational work on incremental parsers (parsers that build structure word-by-word, left-to-right) has found that some parsing strategies involve extra work for center-embedded structures, but not others. For example, if you apply a left-corner transform to a CFG so it can be parsed incrementally by a top-down parser, the only structures that increase stack depth are (pre-transform) structures with the zig-zag pattern of center-embeddings (http://web.science.mq.edu.au/~mjohnson/papers/acl97.pdf). If there is some kind of hard limit on stack depth, such a parser fails only on sentences with center embeddings. You may also be interested in reading a more recent paper on this topic (http://kmcs.nii.ac.jp/~noji/papers/coling14-left-corner.pdf) that uses the connection between left-corner parsing and transition-based parsing (probably the dominant paradigm for incremental parsing) to build parsing technology for exploring memory limitations and center-embedding cross-linguistically. I can provide a couple more references along these general lines if you're interested in a probabilistic perspective, although they focus more on memory limitations and garden paths rather than center embedding specifically.
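To give a rough feel for the stack-depth idea, here is a deliberately crude Python tally (not a left-corner parser; the N/V tagging is hand-assigned and the whole thing is a sketch): it just counts how many nouns are still waiting for their verb at any point in a left-to-right pass.

```python
# Crude illustration of incremental memory load: push each noun (N),
# let each verb (V) discharge one pending noun, and record the peak.
# This is a toy, not a real parsing algorithm.

def max_pending(tagged):
    """Peak number of nouns on the stack still awaiting a verb."""
    stack = peak = 0
    for tag in tagged:
        if tag == "N":
            stack += 1
            peak = max(peak, stack)
        else:  # "V" discharges one pending noun
            stack -= 1
    return peak

# Center-embedded: "the cat the dog the rat bit chased ran"
print(max_pending(["N", "N", "N", "V", "V", "V"]))  # 3: load grows with depth

# Right-branching: "the rat bit the dog that chased the cat that ran"
print(max_pending(["N", "V", "N", "V", "N", "V"]))  # 1: load stays constant
```

The zig-zag (center-embedded) pattern is the one whose peak grows with each added level, which mirrors the left-corner stack-depth result described above.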
Title: Re: Another question
Post by: casey61694 on November 28, 2014, 09:56:50 AM
Thank you jkpate! I really appreciate your input, but it might be a while before I'm ready to read and actually understand the papers you posted. I'm not there yet.

This is what I mean by "widest impact":
A sentence like "I know [whom you will ask] or [whom to ask]" can, I feel, be replaced by "I know [him/her]". This tells me that the CP "whom you will ask" or "whom to ask" is basically expressed in terms of the [spec, CP]. If something is embedded in this CP (like another CP), it too will be expressed in terms of the first [spec, CP] (and so on). Is that basically right?
Title: Re: Another question
Post by: Daniel on November 28, 2014, 10:54:27 PM
freknu,
Quote
Of course, I can, silly... :\ *scratches head*
Hmm.... I re-read it. It seems like normal English, but I get a little confused partway through it. It certainly could work if I already understood the context (maybe in the middle of an argument: "no, it's actually THIS WAY: ..."). It's also right-attaching (rather than center-embedding), so that might make it a little easier. It's on the border of making sense, but still, if you have 100 levels it could be very confusing. (Some interesting examples of this are early American legal documents, where multiple paragraphs can be one sentence; a favorite of mine is http://www.archives.gov/exhibits/featured_documents/emancipation_proclamation/transcript.html .)

Quote
There is also nothing wrong with "the cat that the dog chased that the rat bit, ran", then again, maybe native English speakers don't use structures like that. I might just be using my native logic in speaking non-native English.
Is there a limit, though? Can you go to 4 levels? Or 5? Or 10? Some languages are said to do 3 levels more easily than English does, but few if any can naturally do 4. If you can supply data (in any language) on this, I'd be curious. I wouldn't be shocked by 4-5 levels, but beyond that would be interesting! Even 4-5 would be worth publishing about, I think.


--
Quote
This is what I mean by "widest impact":
A sentence like "I know [whom you will ask] or [whom to ask]" can, I feel, be replaced by "I know [him/her]". This tells me that the CP "whom you will ask" or "whom to ask" is basically expressed in terms of the [spec, CP]. If something is embedded in this CP (like another CP), it too will be expressed in terms of the first [spec, CP] (and so on). Is that basically right?
The nature of recursion is that each level is independent and may internally be recursive as well. So I don't really follow. You can embed another level within that, or, also importantly, you could embed that in another level:
"I know that you know whom to ask." And so forth.
So "widest" is relative.
Title: Re: Another question
Post by: casey61694 on November 29, 2014, 01:34:02 AM
This really cleared things up for me, thank you!
Title: Re: Another question
Post by: casey61694 on December 03, 2014, 10:45:20 PM
I have another question. I have to admit that this one's pretty trivial, but I still think it's worth sharing:

 Are "semi"-wh-questions ("How about..." or "What about...") another way of asking yes/no questions?

For example:

"Do you like the color purple?"  "No."
"What about green?"  "No."
"How about red?"  "Yes."
Title: Re: Another question
Post by: Daniel on December 04, 2014, 12:15:24 AM
No. They introduce topics and imply (elided) yes/no questions.

"How about red, do you like red?"

(Certainly they function to do so, just like "You?", but there's no particularly interesting structural relationship between semi-wh- and yes/no- questions.)
Title: Re: Another question
Post by: casey61694 on December 09, 2014, 08:02:14 PM
Pardon the odd example.

"The storm ate up September’s cry of despair, delighted at its mischief, as all storms are."

In this sentence, is "as" acting as a "relative pro-participial phrase" introducing an adverbial-type clause?

Any clarification/help is appreciated. Thank you all!
Title: Re: Another question
Post by: Daniel on December 09, 2014, 09:48:34 PM
Can you explain a bit more?

"As" in that case is what is traditionally called a subordinating conjunction. In formal syntax it's probably best analyzed as a Complementizer. Either way, it links clauses together.

The meaning of that whole clause then is as an adverbial, specifically a manner adverbial, which could answer the question "How [was it delighted]?".

There are pro-forms that aren't pronouns. For example, "do" is a pro-verb (actually more like a pro-VP, though I guess pronouns are usually pro-NPs), and "so" is a pro-adverb[ial] (or sometimes pro-adjective?). But "as" here I don't think counts. Instead, the sentence just involves ellipsis. "As" certainly supports that by suggesting parallelism.
Title: Re: Another question
Post by: casey61694 on December 09, 2014, 11:18:38 PM
That is just how I've always parsed sentences like that. It makes semantic sense to me. It also behaves like other relative clauses. For example:

"This is the green toad, which no one else found."

The relative clause cannot be preposed (unlike a subordinate clause).

"Which no one else found, this is the green toad."

Similarly,

"This toad is green, {as all toads are}."

what is in curly braces cannot be preposed.

"As all toads are, this toad is green."

I think that "as" in this case is functioning as a relative pro-adjective introducing a relative adverbial clause.

Also, what do you mean by the last sentence in your last post?

Thank you djr!

Title: Re: Another question
Post by: jkpate on December 09, 2014, 11:28:35 PM
Quote
Similarly,

"This toad is green, {as all toads are}."

what is in curly braces cannot be preposed.

"As all toads are, this toad is green."

The preposed sentence is fine for me. I suppose I might be able to get a comparative reading with degrees of green-ness on the non-preposed version ("this toad is as green as all toads are") that is not available on the preposed version:

"This toad is (as) green as all toads are"
"*As all toads are, this toad is as green"
Title: Re: Another question
Post by: Daniel on December 10, 2014, 12:39:41 AM
Yes, the preposed version is fine, possibly not as natural.

As for my last sentence, usually under coordination there are parallel structures and there is ellipsis:
He sang and [he] danced.
He sang and she did so too.

So when we see ellipsis there, it probably represents something about those events/clauses being parallel. I don't have any specific analysis for 'as' in mind though.
Title: Re: Another question
Post by: casey61694 on December 12, 2014, 03:53:28 PM
Why can't "am" contract with the negative particle?

Is this sort of thing not possible because "am" ends with a consonant?  Don't "is" and "does" and "should" also end with consonants? I know nothing about phonology or phonetics, but I'm guessing there are different "degrees" of consonant.  Is "m" a "harder" consonant than "s" and "d"?

He's brave, isn't he?

You're brave, aren't you?

----------

I'm brave, aren't I? (?)

I amn't scared!
Title: Re: Another question
Post by: jkpate on December 12, 2014, 06:55:24 PM
There is a contracted form for "am not" in some varieties of English:

I'm brave, ain't I?
I ain't scared!

However, this form is stigmatized as uneducated, so many talkers avoid it. Another factor may be the high frequency of "am not," since very frequent forms tend to be irregular. However, they also tend to be short, but "am not" of course is longer than "ain't." In any event, I don't think there's any regular phonological rule involved.
Title: Re: Another question
Post by: casey61694 on December 12, 2014, 07:51:16 PM
Quote
very frequent forms tend to be irregular

It's kind of strange to me that the opposite isn't true instead. Shouldn't frequently used forms (like the verb "to be", for example, with its eight forms) become more streamlined with time and predictably follow "the rules"?

Could you maybe give another example of this? I appreciate your help!
Title: Re: Another question
Post by: jkpate on December 12, 2014, 08:41:17 PM
Perhaps the tendency is easier to understand if I state it in the other direction: less frequent forms tend to be more regular. It's well-known that language vocabularies tend to be Zipfian: most of the word types (i.e. lexical entries) are rare. A system is easier to learn or remember if the rarest 95% of forms follow a rule than if the most frequent 5% of forms follow the rule. You can more easily memorize an irregular 5% and a rule for the other 95% than a rule for 5% and irregular 95%. So, if you want to streamline the system over time, you get more bang for your buck by streamlining the rare types.
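A quick numerical sketch of that Zipfian point (the vocabulary size and the 5% cutoff are made-up parameters, chosen only for illustration): with frequencies proportional to 1/rank, the most frequent 5% of types account for most of the running text.

```python
# Zipfian sketch: the word of frequency rank r occurs with probability ~ 1/r.
# With 10,000 word types, how much running text do the top 5% of types cover?
V = 10_000
freqs = [1 / r for r in range(1, V + 1)]
total = sum(freqs)

top = int(0.05 * V)                      # the 500 most frequent types
coverage = sum(freqs[:top]) / total
print(f"{coverage:.0%}")                 # 69%
```

So memorizing an irregular 5% of types still accounts for roughly two-thirds of tokens, while a single rule quietly handles the long tail of rare types.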
Title: Re: Another question
Post by: Daniel on December 12, 2014, 11:36:08 PM
Any word can be irregular, but over time the less frequent ones will be forgotten and regenerated by the grammar as regular. If you want an example, you can look at the history of irregular verbs in English over the last 1000 years or so. As words are used less frequently, they regularize. We don't know exactly what the past or participle forms of "smite" should be, but they're in early versions of the Bible and other texts.

More frequent forms are preserved as irregular, and over time frequent forms may also become irregular, so eventually the most frequent forms probably will be. The verb "be" is irregular in almost every language (I'm thinking of a broad sample, but this is at least literally true throughout Europe), at least if the language has any irregular verbs at all.


There's also something similar where "complexity" (=irregularity?) is preserved by small communities with minimal contact with others. Several recent theories about social dimensions of complexity have focused on this, such as work by Trudgill.
Title: Re: Another question
Post by: casey61694 on December 13, 2014, 12:45:49 AM
So the natural tendency is for verbs (for instance) to be irregular?

If all people hypothetically had unlimited memory space, would it then be reasonable to say that this pattern would've continued and endured for pretty much every verb? Would "irregular" have become the new "regular"?

P.S. I've never thought of verb forms as being somewhat determined by limits on our memory. That's interesting.

Title: Re: Another question
Post by: jkpate on December 13, 2014, 02:35:05 AM
I have a few comments in response to your question. First, it's not necessarily about memory. It could also be about learning (as an infant). It should be easier to learn a rule if you have evidence for the rule from thousands of verb types, even if each individual verb type is rare. Second, even if it is about memory, that doesn't mean that people are not capable of memorizing each form. All we need is that humans prefer not to memorize each form.

I'm not sure what a "natural tendency" is, but I think there are lots of forces that could push verbs to be irregular. For example, some verbs are borrowed from other languages. Also, children's acquisition of verbs exhibits a U-shaped learning curve (http://unt.unice.fr/uoh/learn_teach_FL/affiche_theorie.php?id_activite=53). This is presumably because, early in the acquisition process, a child will have more reliable evidence for a particular verb than for rules that abstract over verbs. The child can learn individual verbs in isolation and get the irregular verbs right initially, then learn a rule that overgeneralizes, getting the irregular verbs wrong, and then learn the exceptions. (Statistical models can reproduce this curve, though amid extensive debate, which is summarized in this paper (http://psych.stanford.edu/~jlm/papers/PastTenseDebate.pdf).)