Author Topic: syntactic constituenthood  (Read 12112 times)

Offline mallu

  • Linguist
  • ***
  • Posts: 142
  • Country: in
syntactic constituenthood
« on: December 22, 2013, 06:41:18 AM »
Do the tests for syntactic constituenthood work for all languages?
I mean tests like substitution, topicalization, etc.
Could anybody tell me the answer?

Offline mallu

syntactic constituenthood
« Reply #1 on: December 22, 2013, 06:52:59 AM »
Are the tests for the syntactic constituenthood of a string the same in every language?

Offline lx

  • Global Moderator
  • Linguist
  • *****
  • Posts: 164
Re: syntactic constituenthood
« Reply #2 on: December 22, 2013, 07:13:40 AM »
Constituenthood isn't a definitive thing even within one language. There's often a mix-and-match approach that is used within one language. You can have units in a language pass some constituenthood tests and fail other ones. The behaviour across a wide range of tests usually gives a good indication as to whether something exists as a definitive constituent in a language.

So, absolutely not.

Some tests for constituenthood are based on features that a specific language allows. For example, in constituenthood tests in the Germanic languages, you can use the feature of subject-verb inversion to look for subjects as constituents, but that wouldn't work in a language like Italian that doesn't have the property of this type of inversion. It's intonation (and context) that canonically signals a distinction between a statement and a question in that case.

The name is a bit of a misnomer, as it implies a definitive result, which is not the case in pretty much any instance I can think of. "Constituent indicator" would be more fitting.
« Last Edit: December 22, 2013, 07:15:18 AM by lx »

Offline Daniel

  • Administrator
  • Experienced Linguist
  • *****
  • Posts: 2043
  • Country: us
    • English
Re: syntactic constituenthood
« Reply #3 on: December 23, 2013, 12:44:11 AM »
[Note: I just merged your two topics. I hope you don't mind, but it's best to keep these in one place. This is more about formal syntax than typology, so I put it in this forum.]

In addition to lx's answer:

Assuming that "constituents" exist in all languages (which isn't a crazy assumption but might be worth investigating-- cases like Pirahã and 'non-configurational' languages may present challenges), then yes and no.

In the most abstract sense, yes, these tests work very well if you can figure out how to properly apply them.

In a surface-level sense, no, sometimes they don't work very well.

Constituency tests probably work in all languages (or at least in all languages that have constituents).
But English constituency tests only work in languages similar enough to English to properly apply them.

Let's create a list of tests:

1. Replacement
If you can replace some set of elements, then it is a constituent.
This test is messy in the first place-- take an obvious non-constituent like "dog because" in "I want a dog because I like dogs" -- the problem is that this can be replaced: "I want a cat though I like dogs". So replacement relies on constituency intuitions anyway.

That said, I think replacement can work in just about any language, at least inasmuch as it works in English!

2. Pro-form replacement
If you can replace some element(s) with a pro-form, it is a constituent.
This one is more robust-- if you can replace something with "it" then it is a constituent. Or "do so" or other "pro-forms" like "such" and so forth.
Of course it's unclear (still relies on intuitions) whether replacement is accurate-- consider: "I want to study syntax a lot!" then "I want to study it." Does the "it" replace [syntax a lot]? It's hard to prove that it doesn't. We can check the semantics and intuit that it doesn't involve the "a lot" sense, but it's very hard to prove this on a literal level of pro-form replacement (and "it" never means exactly the same thing as what it replaces, in some sense). Still, it's a lot better than general replacement.

Cross-linguistically, I believe this works very well, given that all pro-forms replace constituents (at least I can't imagine one that doesn't). But some languages don't have pro-forms for everything. Not all languages have empty verbs like "do [so]" in the same way English does. Some languages can only awkwardly use inanimate pronouns ("it") in certain positions, like subjects.

3. Coordination
If you can coordinate X and Y, then X and Y are both constituents and the same type of constituents.
The second part of this is just false: "John is a republican and proud of it". It works on average, but coordination does not require syntactically equivalent parts, just that both are licensed in the environment ("John is a republican" and "John is proud of it").
But the first part does seem to work out fairly well: if you can coordinate two things, they're both constituents. "X and Y" involves constituents X and Y.
But of course then there's non-constituent coordination as in "I plan to [go with] and [help out] my friends." So that's a problem.
The coordination test is intuitively easy and useful, as long as you don't overuse it for things like that.

But the cross-linguistic problem is a big one: many, many languages just don't have English 'and'. Some languages use only juxtaposition so it's hard to be certain when coordination is being used syntactically, and other languages only allow coordination of some classes. For example, many languages have no coordinator for verbs, only nouns.

4. Ellipsis
If you can elide (skip over) part of a sentence, that is a constituent.
Sort of like replacement, this works out more or less. Of course sometimes trailing off might happen without a constituent or you can skip over two elements (that together don't form a constituent), but on average it works.

I imagine this is just fine cross-linguistically but that some languages have particular rules about what can be elided. Here we find that English is somewhat restrictive-- in Spanish I can say "El hombre habla" ('the man speaks') or just "habla" with the subject omitted-- and that is a constituent.

5. Movement (often fronting)
If it moves together, it's a constituent.
Various versions of this exist, including several well-known tests in English like clefting, pseudo-clefting, topicalization, question formation, etc. Generally these are fairly good tests because movement usually involves a single item. But there can be exceptions-- is "Therefore yesterday" a constituent in "Therefore yesterday we were tired"? No. So you must pick a more restrictive construction and be sure that multiple movements aren't allowed-- some languages allow double topics, for example.

Cross-linguistically the inconsistency is which tests work. How much movement is there in the language overall? What can be moved? The real problem is that most languages have arbitrary restrictions on certain kinds of movement. For example, all words are constituents, but "the" fails most movement tests: "The, I read book." etc.

My favorite test:
6. The book title test
If it can be a book title, it's a constituent.
Someone looked into this and found that something like only three books in the (US) Library of Congress catalog (a huge list, largest library in the US, maybe in the world) were exceptions, with weird titles along the lines of "I saw the."
Of course not all constituents make equally good book titles, but this test is actually very reliable, gives few false positives, and is very easy to apply. It relies on some semantics, perhaps, but that seems to work out well.

I have no information about this for any other languages, but I would imagine this applies very well cross-linguistically. It's so simple, and practically not affected by syntactic variation. Either it's a unit or not.

So in short, yes and no. The tests may fail if they don't work in a language, but the idea and reasoning behind the tests apply equally well in any language, if you can find a way to appropriately apply them.
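
What all of these tests are probing can be stated as a small sketch in Python: a string of words is a constituent if it is exactly the yield of some node in the sentence's tree. The tree below is a hypothetical bracketing of the replacement example ("I want a dog because I like dogs"), assumed only for illustration and not a claimed analysis; the names `TREE`, `leaves` and `is_constituent` are likewise made up for this sketch.

```python
# Toy parse tree as nested tuples: (label, child, child, ...).
# Leaves are plain strings. The bracketing is an illustrative
# assumption, not a claimed analysis of the sentence.
TREE = (
    "S",
    ("NP", "I"),
    ("VP",
        ("V", "want"),
        ("NP", ("D", "a"), ("N", "dog"))),
    ("CP",
        ("C", "because"),
        ("S",
            ("NP", "I"),
            ("VP", ("V", "like"), ("NP", ("N", "dogs"))))),
)


def leaves(node):
    """Return the words dominated by a node, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1:]:  # node[0] is the category label
        words.extend(leaves(child))
    return words


def is_constituent(tree, phrase):
    """True if some node dominates exactly the words in `phrase`."""
    target = phrase.split()
    if isinstance(tree, str):
        return [tree] == target
    if leaves(tree) == target:
        return True
    return any(is_constituent(child, phrase) for child in tree[1:])
```

Under this assumed bracketing, `is_constituent(TREE, "a dog")` holds while `is_constituent(TREE, "dog because")` does not. In practice the tree is not given, which is why the tests above are needed as indirect evidence for it.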
« Last Edit: December 23, 2013, 01:01:43 AM by djr33 »
Welcome to Linguist Forum! If you have any questions, please ask.

Offline Corybobory

  • Global Moderator
  • Linguist
  • *****
  • Posts: 138
  • Country: gb
    • English
    • Coryographies: Handmade Creations by Cory
Re: syntactic constituenthood
« Reply #4 on: December 23, 2013, 03:22:43 AM »
I have no idea what the answer to this question is, but I'd guess that not all constituent tests work with all languages. My hunch is that there can be constituent tests for each language that work, but not each individual test would work for all languages. I really wish I knew an agglutinative language or one with a really different syntax to test this on!
BA Linguistics, MSt Palaeoanthropology and Palaeolithic Archaeology, current PhD student (Archaeology, 1st year)


Offline Daniel

Re: syntactic constituenthood
« Reply #5 on: December 23, 2013, 03:41:13 AM »
"My hunch is that there can be constituent tests for each language that work, but not each individual test would work for all languages."
Exactly. (Unless some languages don't have constituents. Perhaps unlikely.)

"I really wish I knew an agglutinative language or one with a really different syntax to test this on!"
I don't think it would necessarily be the large scale typological differences, but rather the small details such as exactly what sort of clefting (if any) a language allows, and this could vary even by idiolect. It's not a question of isolating vs. polysynthesis (or anything else on that scale) but just the details.

As for agglutination, I've just been studying Turkish and I studied Swahili for a while.

Let's see--

1. Replacement: as well as it works anywhere, I don't see why it doesn't work there.

2. Pro-form replacement: in both languages, pronouns are relatively uncommon but I think they're usually grammatical even for light "it" uses. However, objects in Swahili are prefixed to the verb sometimes, so replacement would end up with a different look, something like Spanish:
Veo la cosa = see.1S the thing = 'I see the thing'
La veo = it see.1S = 'I see it'
Beyond that, neither language has a very good translation of English "do", so I don't know that "do so" would work out very smoothly. There's "yapmak" ('to do') in Turkish which works for "what are you doing" but I don't know about the abstract sense of "He does too"-- in fact, I'd think not.

Note that in Turkish there is no use of "I do" in answer to a question. You'd say-- "Do you read [the book]?" "Yes, I read."

Beyond this, I'm not familiar with the full lexicon of the languages but I don't know of other words to test other categories.
In short, it would work fairly well for NPs (or DPs) but probably not much else.

3. Coordination
VERY limited in both languages, so it just wouldn't work as a general test beyond categories that coordinate easily. You can use "na" in Swahili or "ve" in Turkish to coordinate sentences, but it's awkward, and subsentential elements don't always work that way: "na" sometimes means 'with', and in Turkish juxtaposition often takes the place of marked coordination, even in NPs.

4. Ellipsis:
In Turkish this would perhaps work well-- almost anything can be elided, sort of like Japanese. But it's kind of messy when potentially multiple constituents can be missing at one time.

5. Movement:
Probably would work fairly well in both. The word order is fairly flexible in them, so it would be easy to apply.
But in other languages like Latin (or Dyirbal) subconstituents can be split, so that gets messy. There's an interesting puzzle in Swahili:
Ali wa-na-soma na Juma
Ali 3PL-PRES-study NA Juma
~~Ali study and Juma.
'Ali studies with Juma' or 'Ali and Juma study'
That seems like a problem for constituency to me.
An interesting example from Turkish is:
Ben iyi bir öğrenci.
I good BIR student
'I am a good student.'
BIR is the numeral 'one' acting as an indefinite determiner. I can't really understand how far along this grammaticalization is, so I didn't gloss it specifically.
The phrase "iyi bir öğrenci" means "A good student". I have no idea what's going on with that syntax. Oddly enough, 'bir' can show up in what otherwise appear to be verbal compounds:
bir kitap okuyorum
BIR book read.1SG.PROG
'I am reading a book.'
In a definite context, we would find kitabı, marked with accusative. The reason it isn't marked with accusative in the above example is that it is a verbal compound. Yet 'bir' appears there somehow. A mystery.

6. Book titles:
Sure, why not? I don't know of any reason why this would fail for these languages. Of course there may be certain limitations of what is required to support some other word, so maybe some things would have false negatives on this test.

In short, the tests apply differently, but with the proper adjustments they would probably work out just about as well in these languages as in English.

There are some complicated details with morphological versus syntactic realization of elements though!