I am not sure I understand. Of course the proportion of words that look similar is higher among cognates than in those that aren't cognates (aside from borrowing). If there are a lot of cognates between languages, of course there are going to be cognates such as Croatian "svinja" and English "swine" or Croatian "sunce" and English "sun". It's not like all the phonemes are bound to change to dissimilar phonemes in a few thousand years.
Correct. I don't disagree with you about that, and the problem is less extreme than the example I gave. My point was that the most likely flaw in your argument (if there is one) would be that you are finding relatively obvious examples (that look convincing!), and that perhaps the true etymology is more obscure and less transparent than that. It's not about any of your particular derivations, just a general comment about something to watch out for: for example, you might find some of your etymologies more intuitive than existing proposals (which might actually be correct because they are less intuitive, if you follow what I mean).
It is great though to see you incorporating the known sound changes into your work, so as I've said, your proposals seem plausible to me, and unfortunately I can't comment further than that.
This is not really related to the topic, but why are you guys so much against the attempts to reconstruct older proto-languages? If the first human languages were sign languages that gradually evolved into fully spoken languages, isn't it reasonable to assume that all spoken languages share a common ancestor?
That's a topic for another thread! But I'd be happy to discuss it. In short, after about 10,000 years the changes have piled up too much to tell inherited words apart from borrowings or chance resemblances. I personally am not opposed to trying to go back farther, but there's a point where it becomes almost impossible. Reconstructing Proto-World is just not possible, since it's at least 5 or 10 times older than that 10,000-year 'limit'. Maybe we can push the limit to 20,000 years, but 50,000 or 100,000? The two methods we have are broad statistical comparison, which breaks down around 10,000 years (the noise can no longer be distinguished from the real data), and reconstructing based on our reconstructions, which is a viable possibility but ends up stretching our hypotheses (and layering them) too much to be sure of much. I could go into more detail (start a new topic?) if you'd like.
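To give a rough feel for why the signal fades around that depth, here is a toy calculation. It rests on assumptions that are not part of this discussion: a constant per-millennium retention rate for basic vocabulary (0.86 is a figure often quoted from classic glottochronology, which is itself heavily disputed) and an arbitrary 5% floor for chance resemblances. The function names are made up for illustration.

```python
# Toy model: two sister languages each keep a fixed share of their
# basic vocabulary per millennium, independently of each other.
# ASSUMPTION: retention=0.86 and noise_floor=0.05 are illustrative values only.

def shared_fraction(millennia, retention=0.86):
    """Expected share of basic vocabulary still cognate in BOTH languages."""
    return retention ** (2 * millennia)

def millennia_until_noise(noise_floor=0.05, retention=0.86):
    """First whole millennium at which the cognate share is no longer above the floor."""
    t = 0
    while shared_fraction(t, retention) > noise_floor:
        t += 1
    return t

for t in (2, 5, 10, 20):
    print(t, "millennia:", round(shared_fraction(t), 4))
print("signal reaches the noise floor after", millennia_until_noise(), "millennia")
```

Under these particular numbers the shared share sinks to the assumed noise floor at roughly ten millennia. Change either assumption and the crossover moves, so treat this as an illustration of the shape of the problem, not as a dating method.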
Why do you think those statistical methods work any better than common sense does? If you count the English dictionary equivalents of the Croatian words that start with a 't', do you think that a significant portion of them will start with a 'th'? Wouldn't the early loanwords and coincidences average them out?
I'm not arguing for blind statistics at all! What I'm saying is that if you have a large sample, you'll probably be right on average. If you have a single data point, there's no reason to assume you'll be right that one time. Imagine a game of darts where you win by guessing what the thrower is trying to hit. With a sample of one, you can only assume they were trying to hit wherever the dart landed (or near there). With a sample of 10 or 100 or more, you can much more likely guess where they were aiming overall. So figuring out whether two languages are related (do they share a lot of cognates) is a much easier question than figuring out the etymology of an individual word-- if you're wrong sometimes, you can still be right statistically, but not for an individual data point. Thus without any direct evidence, an individual etymology is harder to figure out and there's no clear way to verify that you're right.
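The darts analogy can be simulated directly. This is only a toy sketch under assumptions of my own: a one-dimensional board, normally distributed throws, and made-up values for the true aim and the spread. The function names are hypothetical.

```python
import random
import statistics

def estimate_aim(true_aim, spread, n_throws, rng):
    """Guess the thrower's target as the average of where the darts landed."""
    throws = [rng.gauss(true_aim, spread) for _ in range(n_throws)]
    return statistics.fmean(throws)

def typical_error(n_throws, trials=2000, true_aim=0.0, spread=5.0, seed=0):
    """Average distance between the guess and the true aim over many games.

    ASSUMPTION: true_aim and spread are arbitrary illustrative values.
    """
    rng = random.Random(seed)
    errors = [
        abs(estimate_aim(true_aim, spread, n_throws, rng) - true_aim)
        for _ in range(trials)
    ]
    return statistics.fmean(errors)

# Compare how far off the guess tends to be with 1, 10, and 100 darts.
for n in (1, 10, 100):
    print(n, "throws -> typical error", round(typical_error(n), 2))
```

The guess from a single dart is off by about the thrower's whole spread, while the guess from many darts tightens roughly with the square root of the sample size. That is the sense in which a claim about many cognates can be reliable even when any single etymology is not.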
But that doesn't mean we should just dump the dictionary into a statistical program and see what happens! (And far too often non-linguists do something along those lines and claim they've solved a major linguistics problem like where the homeland of Indo-European was-- they're almost always wrong, even though, unfortunately, papers like that can get a lot of attention in the world outside of linguistics.)
Well, finding a few hydronyms such as Colapis or Serapia, or a toponym near the river such as Andautonia, in a Native American language would prove that my methodology is flawed. But I don't think there are such.
Your methodology isn't flawed. It's just not clear how to verify that your individual results are correct. (They might be!!) And it's certain that in general many places are named after bodies of water.
My most serious opponent is probably the Croatian etymologist Petar Skok, who ascribed many toponyms to a "Mediterranean substratum", that is, the supposed Pelasgian language, spoken all the way from Italy to Turkey. His evidence was allegedly the same or similar elements appearing in toponyms in that area. However, he didn't ascribe any meaning to those supposedly repeating elements. So, I am pretty sure that his hypothesis is invalid.
Unclear. You might be right. The question is how to decide between the two.
I waged a Wikipedia war against his hypotheses there:
Wikipedia is not the place for original research, and a 'war' there is utterly meaningless, even if you 'win'. If you have new ideas, publish them. Sometimes science is no better than just sharing ideas and hoping other people read them. After they are published (and especially if others start to accept them as useful/good/reliable/etc. ideas) then you can cite them on Wikipedia (and elsewhere). And in publishing them you'll get a peer review from someone who really knows the specific subject in detail and can determine whether your contribution is worth sharing. The result will be your hypothesis standing as a competitor to the others out there. And time will tell what happens. Maybe nothing. Maybe just two plausible hypotheses with no clear way (currently) to decide between the two (that's what I observe at the moment). But yours will then be on par with the others.
I am a native speaker of Croatian and I know something about linguistics, so I think I can safely tell you that those etymologies are invalid.
Neither of those is a qualification for knowing etymologies. Native speakers have no intuition whatsoever about the historical state of their languages (I wasn't born knowing how Shakespeare wrote, for example, or with any intuition that my American English was somehow inherited from England, or before that older Germanic tribes). And knowing something about linguistics means you can attempt to figure it out (as you are doing!), not that your answers are necessarily correct.
And the same goes for the other supposed explanations using Croatian which I tried to contradict.
The way to dispute them is to show that you have an alternative explanation that is equally coherent, and that there may be some flaws in their argumentation. If so, you probably have something publishable. Again, this isn't my personal area, so I can't tell you if that's the case.
Etymology is not a major focus of linguistics publishing these days.
Perhaps because it takes less knowledge of linguistics to discuss morphosyntax than to discuss etymology. I could probably write two full pages about what part of speech is "plus" in Croatian without doing any actual research. But that probably won't be interesting to anybody whom I know.
Not that I'm personally offended, but that's a major oversimplification of things.
I don't disagree with you that there are some things in linguistics (including some published papers) that don't take much background to understand. But there are vastly more things that take a lot of background (e.g., a PhD, or lots and lots of self study). If you really know the subject well, the next step is to publish. If you're not ready for that, that's fine! But until you can publish about it, you're not at that level yet. (Admittedly there are some theory-internal political factors for some journals in terms of what is likely to be published, but it will rarely matter who you are, because for good journals the process is double-blind, so they don't know your name or whether you're affiliated with a university or whatever. Watch out for predatory pay-to-publish journals that will publish just about anything they receive, but that's another story.)
As for the real reasons why etymology isn't a major topic in linguistics today, here are some bullet points:
--Historical linguistics in general is not a major topic anymore. I personally don't like that (and some others also want to promote it), but most research now is about how languages 'work' (in the brain, in the mind, in society, in their structure, etc.) rather than where they came from.
--Emphasis is on how languages work rather than the details of individual lexical items.
--Etymology is still alive and well (relatively speaking) in the field of philology, which is like linguistics, but more interested in describing details like etymology rather than an explanatory science (linguistics) about how language works.
--Plus, etymology requires high-level knowledge of individual languages, and (unfortunately in my opinion) the personal language knowledge of individual linguists seems to be in decline, in favor of things like statistical, experimental, computational, corpus, etc., methods. To oversimplify, not all linguists today have studied Latin and Greek (as was once common), and few study Indo-European roots in detail. Historical linguists do, but there are fewer of them, as I said.
--Overall, there is also the fact that much of Historical Linguistics and etymology, in the sense that it interested the first historical linguists a couple of centuries ago (and which eventually led to modern Linguistics as a science), has actually been solved. There are dictionaries of etymologies for Proto-Indo-European roots, and while they are far from flawless, a large amount of research has been done, and has been done well. So linguists have moved on to other projects, while of course some still work on issues related to those original topics.
(Personally I study syntax, as well as morphology and semantics, mostly from the perspective of 'how language works', but also from historical and comparative perspectives. So I can certainly comment on this topic, but you'll need to find someone who specializes in Indo-European etymology, and perhaps Croatian in particular, to give you specialized feedback on this topic. It's nice to see you working on it though!)