Recent Posts

Pages: 1 [2] 3 4 ... 10
11
Phonetics and Phonology / Re: Spelling letters phonetically without using the letter
« Last post by JackLay on December 26, 2018, 03:16:07 AM »
Quote
But no, there is no obvious or clear way to spell out some letter names like "B" without using those letters. I don't see why you'd want to do that, though.

This is where I hit a wall: letters whose names start with the sound that they make. As to why I'm doing this, I don't really have a reason. It just came into my head a couple of days ago and I've been obsessing over it since.

Quote
You could manipulate letters according to the rules of English spelling and spell the name of k as "kay" or "cay", likewise in Norwegian you could spell the name of the letter "kå". My point is that sounds, letters, names and spelling are different things. If I was forced to spell the name of the letter "b" without using "b", I would write it as "бe", maybe "βe" or "ᛒe" (Fuᚦark is usually considered a different alphabet), and hope that people could figure it out.

Thanks for the clarification. I did mean spelling the names of the letters, but couldn't think of a way to phrase it well at the time. You helped me big time by introducing "βe", "ᛒe", etc., as I never thought to use a foreign alphabet before.
12
Phonetics and Phonology / Re: Spelling letters phonetically without using the letter
« Last post by panini on December 25, 2018, 11:00:56 AM »
All I have to add is that you can manipulate sounds, but sounds cannot spell letters. Words are spelled with letters (in some alphabet: there are many alphabets); words typically have some associated sound. Individual letters are self-spelling, for example b is spelled "b", x is spelled "x". In English, k has a name which is pronounced [kɛɪ], and the name of the same letter in Norwegian is pronounced [ko]. You could manipulate letters according to the rules of English spelling and spell the name of k as "kay" or "cay", likewise in Norwegian you could spell the name of the letter "kå". My point is that sounds, letters, names and spelling are different things. If I was forced to spell the name of the letter "b" without using "b", I would write it as "бe", maybe "βe" or "ᛒe" (Fuᚦark is usually considered a different alphabet), and hope that people could figure it out.


13
Phonetics and Phonology / Re: Spelling letters phonetically without using the letter
« Last post by Daniel on December 25, 2018, 10:16:13 AM »
English spelling really doesn't follow any strict rules, so you can pronounce names however you want. I guess that could apply to naming letters too, so if you want to pronounce the name "Z" as "A" then you can do that. But no, there is no obvious or clear way to spell out some letter names like "B" without using those letters. I don't see why you'd want to do that, though.
14
Phonetics and Phonology / Spelling letters phonetically without using the letter
« Last post by JackLay on December 25, 2018, 12:04:12 AM »
Hello,
New to the forum, so I hope I'm posting this in the right place.
Is there a way to manipulate the sounds of the English language so as to spell letters the way they're named without using them?
The hardest I can think of off the top of my head would be letters like B, N, M, etc.,
while letters like A, C, E, etc. are fairly easy to write in this manner.
15
Outside of the box / Re: Croatian toponyms
« Last post by FlatAssembler on December 19, 2018, 11:25:46 AM »
Quote
His proposal is obviously motivated by the Croatian right-wing politics.
Quote
I'm not suggesting anyone's politically-motivated arguments are better than anyone else's.
Why would my hypotheses be politically motivated? I just don't get what you are trying to say. I am not a fan of Croatian right-wing politics; I am a libertarian. And even so, somebody motivated by Croatian right-wing politics would probably claim (without evidence) that the toponyms come from Croatian, not that they don't come from Croatian. Daniel, do you get what that guy is talking about?
Quote
The name of the river Vuka, for instance, obviously comes from the common Serbian personal name Vuk.
Then how is it that the name Vuka is attested back in antiquity, as Ulca, long before there were Serbs (or even Slavs) there? Also, how is "Vuk" an exclusively Serbian name? It's the word for "wolf" in both Croatian and Serbian.
Also, a bit of an off-topic remark: if the Serbs there would be so much better off had Serbia conquered Vuka, why don't they move to Serbia? You know, to a place where everyone shares their language and presumably also their religion? Could it be that they feel less oppressed by the Croatian government (which, I agree, doesn't treat people of different nationalities fairly) than they would be by the government of the Serbian president Aleksandar Vucic?
Quote
the name "Serbinon" is attested on the Ptolemy's map,
Historians don't quite agree on where Serbinon was, but the map you've linked to doesn't suggest it was near modern-day Zagreb. Nor, as far as I am aware, does any Croatian historian agree that Serbinon was there.
Zagreb is on the Sava river; Serbinon (at least on the map you've shown here) is on the northern bank of the Drava river (that is, somewhere in south-west Hungary). Zagreb would be somewhere near Siscia and Segestica and north-west of them, not north-east (unless you assume all the traditional locations are somehow wrong).
Besides, the right inference from the data you've shown is that Serbinon was named after the Serapia river (for it was near its confluence, at least on the map you've linked to), not after the Serbs (who, according to mainstream history, were never there to begin with, and were not even close until at least the 5th century CE).
Quote
I find it very hard to believe languages actually behave that way.
How is anything else even conceivable? Sound changes that, for example, randomly (without a phonological rule) changed some 'd's to 't's, some 'd's to 'z's and left some 'd's unchanged would make a language incomprehensible rather quickly. Or do you think that many words in languages come into being without etymologies, that many words are formed just by picking random sounds? That would also make a language incomprehensible.
Quote
I am pretty sure that if you tested those rules with more such words, they will fail to correctly predict how the shapes of the words changed more often than not.
So, you are not comfortable with hypotheses that can be hypothetically falsified? Sorry, that's what science is all about.
Quote
And I am also pretty sure no linguist specializing in Serbo-Croatian would affirm you the rules in the table are correct.
Have you tried asking one? Chances are, he or she knows less about it than I do and will be impressed by my work.
Quote
So, which approach do you think is more scientific?
Isn't it obvious by now? I'm basing my explanation that "Zagreb" comes from Illyrian *Dzigurevos on a set of potentially falsifiable hypotheses, presented in that table on my web page. You are running away from making falsifiable hypotheses about sound changes because "I don't think that's how languages work."
16
Semantics and Pragmatics / Polysemy in derivation
« Last post by vox on December 18, 2018, 05:39:03 AM »
I’m struggling with the treatment of polysemy in derivation, for instance the agent/instrument polysemy of -ER in English. I want to compare different theoretical views dealing with:
1. How to represent the articulation of what is common and what is distinct in the meanings?
2. How is the meaning selected/specified?
3. If a meaning is specified, are the other possible meanings just blocked? Are they unblockable? Could ‘eraser’ be reinterpreted as ‘person who erases’?

Do you know any author who has proposed a treatment for one of these questions? Or maybe you have your own opinion? I have read Rainer and Plag. I have to keep searching, but it would save me time if you could share interesting references.
17
Historical Linguistics / Re: Romance languages not descended from Latin.
« Last post by Daniel on December 16, 2018, 02:02:38 AM »
That addresses some relevant details, but overall, setting aside terminological preferences, I don't think it changes the most general points, as indicated in the conclusion of the article. (I just briefly skimmed it though.)
18
Historical Linguistics / Re: Romance languages not descended from Latin.
« Last post by Forbes on December 15, 2018, 03:39:23 PM »
I came across this article: prudentia.auckland.ac.nz/index.php/prudentia/article/download/840/791
19
Language-specific analysis / Re: What language is this?
« Last post by panini on December 14, 2018, 10:18:58 AM »
Well, the speech rhythm seems all wrong for a Celtic language, but that gives you something to test. You're likely to run into the problem that you faced with the Finnish person you talked to, that individuals may have attenuated ability to recognize a related language. While a Maltese speaker would most likely recognize that a speaker of another Maltese dialect is speaking a "related language", I doubt they would recognize (or be recognized by) a speaker of Gulf Arabic or Amharic.
20
Computational Linguistics / Re: Recommendation on Diachronic Corpus
« Last post by Daniel on December 13, 2018, 04:57:57 PM »
For English? (Although it may be more interesting to look at other languages, the resources will be more limited than what I describe below.)

There are a number of possibilities, and there will be a tradeoff between quality and quantity.

One of the largest corpora is Google Ngrams, based on data from Google Books, but the search tools are limited, it isn't full text, and once you get back before 1800 the data isn't quite as good. It's especially limited during the 1500s, and somewhat better after 1600, but there is still much less data than for later periods. What that means is that you will have potentially unrepresentative data points in the earlier years (and therefore much more noise in the graphs). Still, even that limited data may be more than you find in some other corpora, but it isn't balanced with the rest.

There are some nice corpora offered at BYU: https://corpus.byu.edu/
They have a good balance between quality, features, and ease of use, though with some limitations (you can't download the full text for all of them, you have limited queries per day depending on your access type after [free] registration, etc.). COHA is very nice for American English 1800-2000, with better (but less) data than Google Ngrams.

A few other specialty corpora there are also helpful:
The Hansard Corpus is especially nice, because it represents spoken British English from parliamentary proceedings since 1803, so that's really unique. Of course the language is formal, but I've found it useful.
Similar, but written and even more formal, is the new American Supreme Court corpus, from 1790.
The Time Magazine corpus is also interesting, though only the 1900s.

EEBO (Early English Books Online) is now available through there (also search online for "EEBO-TCP" for several other interfaces), and that's a good source for earlier material, better than Google Ngrams for the time, and full text is available (at least through some interfaces with login). But it's mid-1400s through 1600s, so it only covers about a century of the time period you're looking for. (Watch out for variant spellings in the corpus: you'll need to do a lot of manual work to get [and interpret!] the best data, but for that time period it's very useful.)
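To give a concrete idea of what handling variant spellings can involve, here's a minimal Python sketch (my own illustration, not part of any corpus interface) that collapses a few common Early Modern English alternations, u/v and i/j interchange and long s ("ſ"), into a single search key. A real study would need a much fuller rule set, or a dedicated normalizer such as VARD.

```python
# Illustrative normalizer for a few Early Modern English spelling
# alternations; the rule set here is deliberately tiny.
def normalize_eme(word: str) -> str:
    w = word.lower().replace("\u017f", "s")  # long s "ſ" -> s
    w = w.replace("v", "u")                  # "loue" / "love" collapse
    w = w.replace("j", "i")                  # "iudge" / "judge" collapse
    return w

# Variant spellings now map to the same key, so their hits can be pooled:
print(normalize_eme("loue") == normalize_eme("love"))    # True
print(normalize_eme("iudge") == normalize_eme("judge"))  # True
```

You'd normalize both your search term and the corpus tokens before counting, then inspect the pooled hits by hand, since rules this crude will also merge some genuinely distinct words.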

So as you can see, having a single corpus from 1600s-1900s is going to be difficult, although you could try to compare some features across several corpora to cover the full range. If you can pick just a subset of those years, I would personally strongly suggest the Hansard corpus because it is spoken language, and represents 200 years, so there's a lot to work with.

There are also some smaller corpora that might fit your requirements and maybe for almost the full period you're considering.

For example, Corpus of Late Modern English Texts (CLMET, in several editions, e.g. 3.0) is much smaller than some of the options above, but it's also maybe better balanced, and if you don't need a huge amount of text (either you're looking for relatively common features, or you're planning to look at each example manually so you can't handle a lot of data anyway), then something like that (it's just one example) might be good for you:
https://perswww.kuleuven.be/~u0044428/clmet3_0.htm
Of course for something like that you're probably going to be mostly getting data from published books, though sometimes you'll find some personal correspondence (letters) if you want something more natural. (There's also the question of whether you'd prefer probably very formal non-fiction, or possibly unrepresentative but colloquial fiction, e.g., for examples of dialog.)

There are also various specialty corpora such as full collections of all of Shakespeare's works, but those won't easily generalize to the full 400-year period.

Those are just some examples from my own experience. There are some other options as well, even making your own corpus from books you find online (anything about 100 years ago or older is likely accessible online for free from a combination of sources like Google Books, Archive.org, Project Gutenberg, etc.). For example, a simple scenario would be to choose one similar novel from each century and compare them, but there are much more complex ways to do it too. Something else I have seen is using an existing database of collected examples such as those found in the Oxford English Dictionary (searchable online with a subscription, probably through your university). That can work and offers a wide range of texts throughout the history of English, although the selection of examples in the OED is biased for illustrative purposes rather than a real balanced picture of what English was like. And don't assume they've really found the earliest examples for any words in those entries-- the OED is a huge project and therefore of limited accuracy for any individual word if you're most interested in when usage changed. It's good, but not the definitive answer on anything. (I saw a compelling conference abstract about how the OED has biased many research results in this way, with authors thinking something is later than it really was if you look into the details yourself, and I've done the same for my own work.)
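If you do go the roll-your-own route, the core bookkeeping is simple. Here's a sketch of the one-novel-per-century comparison; the helper is my own illustration, and in practice each text would be a whole book read from a file you downloaded yourself:

```python
import re

def rate_per_million(text: str, target: str) -> float:
    """Relative frequency of `target` per million word tokens."""
    words = re.findall(r"[a-z']+", text.lower())
    return words.count(target) / len(words) * 1_000_000

# In practice `sample` would be e.g.
# open("novel_1799.txt", encoding="utf-8").read(); here a toy string:
sample = "It was the best of times, it was the worst of times"
print(round(rate_per_million(sample, "the")))  # 166667
```

Comparing these rates across texts from different centuries gives you the same per-million-words figures the big corpus interfaces report, so results stay comparable if you later mix in Ngrams or COHA numbers.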

Something else to watch out for: I have personally gotten the impression that it's easy to think whatever phenomenon you're looking at "starts" near the beginning of your corpus, when its frequency is increasing. Be very careful about making such generalizations. (From my own research I know of a published paper, otherwise methodologically strong, that claims something began in the 1800s because it seemed to be increasing at the beginning of that period, but more recent research has shown it was already found in the 1500s.) It's very easy to get that impression, and I wonder exactly why this is, but watch out for it, especially for the time periods you're talking about.

In summary, there are a lot of sources, but you'll need to find what works for you. Finding a single good source, even if it wasn't exactly what you had planned, might be motivation enough to reframe your research questions to fit (for example, using the Hansard corpus for 1800s-1900s, rather than starting in the 1600s with a mix of corpora). Corpora also have very different genres represented, so watch out for that both in selecting them in general and also especially if you mix them to look at a larger timeframe. If you must do that, then the most consistent source (but not necessarily best data) will be published books.

The other consideration is your technical skills: if you can write enough code to search, organize and compile the results from plain text, then you might be best making your own corpus from texts available online. If you can't do that, then you should rely on some of the easy-to-use options (some of which are mentioned above) with automatic search functions, etc. The other question is how you will search the data: do you need a tagged corpus (with part of speech and other features) or do you want plain text? Are you looking at syntax? Morphology? That can have an impact on what kind of corpus you need. And also how much data you need, depending on the frequency of the phenomenon in question. An easy benchmark is to pick a corpus of Modern English (maybe COCA or BNC or just Google Ngrams) and then do a basic search to see how many results you get per million words-- that will give you an idea of the smallest reasonable size you can work with.
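As a back-of-the-envelope version of that benchmark: once a test search gives you a rate per million words, you can turn it into a minimum corpus size. A sketch (the rate and example count below are made-up illustrations, not real measurements):

```python
import math

def corpus_size_needed(rate_per_million: float, wanted_examples: int) -> int:
    """Word tokens needed to expect `wanted_examples` hits, rounded up."""
    return math.ceil(wanted_examples / rate_per_million * 1_000_000)

# E.g. if a trial search suggests ~2 hits per million words, and you want
# at least 500 examples to analyze:
print(corpus_size_needed(2, 500))  # 250000000, i.e. a 250-million-word corpus
```

That quickly shows whether a rare construction rules out a small hand-built corpus, or whether something like CLMET would already give you enough data.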