### Author Topic: The language of old Europe

#### jkpate

• Forum Regulars
• Linguist
• Posts: 130
• American English
##### Re: The language of old Europe
« Reply #150 on: February 14, 2014, 04:50:51 PM »
What do you think is the statistical chance of the close - heat - sun - fire pattern arising as a random pile-up of letters in two distant European languages like Serbian and Irish? I am asking you for your personal opinion.

While statistical analysis does involve judgement calls, the level of chance is not just a matter of "personal opinion." There are well-established methods for calculating what the "chance" baseline is under various judgements about what constitutes "chance." The chi-square test I mentioned is a good one for a binary notion of "related to summer," while the logit regression is a good one for a continuous notion of "related to summer."
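To make the binary version concrete, here is a minimal sketch of a chi-square test of independence on a 2x2 table, in plain Python. The counts are invented purely for illustration (they come from no corpus), and `chi_square_2x2` is a hypothetical helper name, not a library function.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p-value (1 degree of freedom).

    Table layout:            has "s"   no "s"
        related to summer       a         b
        not related             c         d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Invented counts, for illustration only.
stat, p = chi_square_2x2(30, 70, 250, 650)
print("chi-square = %.3f, p = %.3f" % (stat, p))
```

A large p-value would mean the table is compatible with the "chance" baseline; a small one would mean the pattern is hard to attribute to chance under the stated definition of "related to summer."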

And how can you explain that a deliberate action, like trying to convey meaning to someone, would result in a random babble of sounds devoid of any premeditated meaning? I sometimes think that I might be talking to the wrong people. I was hoping to talk to linguists, and I am talking to statisticians.

Linguistics has been getting more quantitative recently, but that's because statistical methods are useful for getting reliable answers to some questions about language. I thought your theory involved a correlation. Statistics gives us the tools to measure and verify the existence of correlations and so should be of interest to you.

How can you statistically explain why this is the symbol for one "|"? Why this is the symbol for two "||"? Why do we use 0 as a symbol for ten ||||||||||?

Or how can you understand and explain this, using statistics?

1 = one
2 = two
12 = twelve

I can understand it using my, from your point of view useless, "looking over data" and comparing the language with how people think and how people live. Can you please explain what statistical method, or any other method that you regard as scientific, can explain either of the above two linguistic problems?

You think you understand it, but we don't think you understand it because your method appears to be staring at a bunch of data until a hunch develops. It doesn't matter if you have stared at the data for a very long time or you feel a strong personal commitment to the hunch: a hunch is a hunch. The core feature of science is epistemic humility, the acknowledgement that you personally might be wrong, and it is what drives people to gather data in impartial and replicable ways and perform rigorous statistical analysis. As MalFet said, hunches are not themselves results, but are good for inspiring these more difficult and reliable investigations.

Here's the bottom line: how much do you care about checking the correctness of your hunch? There are methods for doing so, but they will take work.
« Last Edit: February 14, 2014, 05:04:38 PM by jkpate »
All models are wrong, but some are useful - George E P Box

#### MalFet

• Global Moderator
• Serious Linguist
• Posts: 282
##### Re: The language of old Europe
« Reply #151 on: February 14, 2014, 10:10:31 PM »
MalFet

I am pretty good at understanding people, but I fail to understand what exactly you mean.

Quote
These *aren't* initial findings. You haven't done the analysis necessary to call them initial findings. These are an amalgam of correlations you've noticed that may or may not be statistically significant.

What do you think is the statistical chance of the close - heat - sun - fire pattern arising as a random pile-up of letters in two distant European languages like Serbian and Irish? I am asking you for your personal opinion.

From the Enlightenment on, the singular ambition of science has been to develop techniques for producing and evaluating knowledge independent of personal opinion. That's the punchline here. You're trying to claim that [n] sounds show up more frequently in words related to "boundaries", but you haven't involved yourself in any actual *measurement*. If I had the opposite intuition that you do, what would we do? Stand around and shout at each other until one of us got bored? That's not how science works.

And how can you explain that a deliberate action, like trying to convey meaning to someone, would result in a random babble of sounds devoid of any premeditated meaning? I sometimes think that I might be talking to the wrong people. I was hoping to talk to linguists, and I am talking to statisticians.

If you want to make scientific claims, you need to talk statistics. Statistics is the quantification and comparison of measured data. That's what science is. If, on the other hand, you want to just bathe in intuitions, there are plenty of people in fields like literary studies who would be happy to argue with you. Maybe you're more interested in the humanities than the social sciences. That's fine, but linguistics is a social science.

How can you statistically explain why this is the symbol for one "|"? Why this is the symbol for two "||"? Why do we use 0 as a symbol for ten ||||||||||?

Or how can you understand and explain this, using statistics?

1 = one
2 = two
12 = twelve

I can understand it using my, from your point of view useless, "looking over data" and comparing the language with how people think and how people live. Can you please explain what statistical method, or any other method that you regard as scientific, can explain either of the above two linguistic problems?

The question you're asking here is very large. Any good introductory text on comparative typology will steer you in the right direction. Jkpate has also given some good, concrete suggestions. Are you asking for a methodology to test your hypothesis?

#### jkpate

• Forum Regulars
• Linguist
• Posts: 130
• American English
##### Re: The language of old Europe
« Reply #152 on: February 15, 2014, 01:49:37 AM »
Ok, so I had some downtime while I was waiting for a model to fit, and thought I'd take a stab at the little WordNet project I briefly described for some practice with NLTK (I don't often use Python). I printed out the WordNet-based similarity measures with the following script:

#!/usr/bin/python3

from nltk.corpus import wordnet as wn
from nltk.corpus import wordnet_ic

brown_ic = wordnet_ic.ic('ic-brown.dat')
boundary = wn.synset( "boundary.n.01" )

print( "synset_name,lemma,path_similarity,wup_similarity,lch_similarity,jcn_similarity,lin_similarity,res_similarity" )

for synset in wn.all_synsets('n'):
    # In NLTK 3, name() and lemmas() are methods (they were attributes in NLTK 2).
    for lemma in synset.lemmas():
        print( "%s,%s,%f,%f,%f,%f,%f,%f" % (
            synset.name(), lemma.name(),
            boundary.path_similarity( synset ), boundary.wup_similarity( synset ),
            boundary.lch_similarity( synset ), boundary.jcn_similarity( synset, brown_ic ),
            boundary.lin_similarity( synset, brown_ic ), boundary.res_similarity( synset, brown_ic ) ) )

Next, I loaded it into R with some more processing:

> synsets <- read.csv( "boundary_synsets.csv" )
> synsets$has_n <- grepl( "n[^gk]", synsets$lemma )
> synsets[ grepl( "n$", synsets$lemma ), "has_n" ] <- T
> synsets[ grepl( "^n", synsets$lemma ), "has_n" ] <- T
> synsets <- synsets[ synsets$synset_name != "boundary.n.01" , ]

The last step discards entries from the comparison boundary synset class.
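For readers more comfortable with Python than R, the "has n" rule above can be sketched as follows. This is only a re-expression of the three `grepl` calls, not part of the original analysis; the function name `has_n` simply mirrors the R column name.

```python
import re

def has_n(lemma):
    """Mirror the R rule: an "n" not followed by g/k, or a word-final or word-initial "n"."""
    return bool(
        re.search(r"n[^gk]", lemma)   # "n" followed by any character except g or k
        or lemma.endswith("n")        # word-final "n" (nothing follows it)
        or lemma.startswith("n")      # word-initial "n"
    )

for word in ["snow", "sun", "king", "ink", "line"]:
    print(word, has_n(word))
```

The word-final check is needed because `n[^gk]` requires some character after the "n", so a lemma like "sun" would otherwise be missed.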

Now, we want to see how the inclusion of "n" as suggested by the orthography varies as a function of the similarity measures. To do this, we are going to do a regression analysis that finds weights for similarity measures that predict, for each lemma, whether it has an "n". If we find large positive weights, then we have evidence that the probability of having "n" goes up for words that are more similar to "boundary."

There are two basic kinds of measures. path_similarity, wup_similarity, and lch_similarity are based purely on the graph structure over synset types: words that are closer together are more similar. The other measures (res, jcn, and lin) also use corpus data (you can see the Python script uses the Brown corpus) to incorporate some information about the frequencies of words in the synset types. Since they're all based on the graph structure, they're pretty highly correlated with each other, and including all similarity measures would lead to a lot of instability in the parameters of a regression model. Accordingly, I'm going to pick one purely path-based measure and one measure that incorporates corpus information, and I'll pick the pair of measures that are least correlated with each other. Our two measures are the wup measure and the jcn measure (Pearson's r of about 0.12):

> cor( synsets[ , c( "path_similarity", "wup_similarity", "lch_similarity", "jcn_similarity", "lin_similarity", "res_similarity" ) ] )
                path_similarity wup_similarity lch_similarity jcn_similarity lin_similarity res_similarity
path_similarity       1.0000000      0.7097631      0.9536815     0.30902202      0.5268264     0.60494673
wup_similarity        0.7097631      1.0000000      0.7331615     0.11670811      0.5667497     0.94429379
lch_similarity        0.9536815      0.7331615      1.0000000     0.28071567      0.4972044     0.58303424
jcn_similarity        0.3090220      0.1167081      0.2807157     1.00000000      0.6623805     0.07985543
lin_similarity        0.5268264      0.5667497      0.4972044     0.66238052      1.0000000     0.56086049
res_similarity        0.6049467      0.9442938      0.5830342     0.07985543      0.5608605     1.00000000
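The selection step can be sketched in a few lines of Python. The values are copied from the matrix above and rounded to four places; treating lch as a structure-only measure is an assumption based on the fact that the script calls it without the Brown information-content argument.

```python
# Correlations between one structure-only measure (path, wup, lch)
# and one corpus-informed measure (jcn, lin, res), copied from the matrix above.
cor = {
    ("path", "jcn"): 0.3090, ("path", "lin"): 0.5268, ("path", "res"): 0.6049,
    ("wup", "jcn"): 0.1167, ("wup", "lin"): 0.5667, ("wup", "res"): 0.9443,
    ("lch", "jcn"): 0.2807, ("lch", "lin"): 0.4972, ("lch", "res"): 0.5830,
}

# Every key crosses the two families, so the minimum is the least
# correlated pair with one measure from each family.
best_pair = min(cor, key=cor.get)
print(best_pair, cor[best_pair])
```

Restricting the search to cross-family pairs matters: jcn and res are even less correlated with each other, but both use corpus information.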

Here's the logit regression:

> summary( glm( has_n ~ wup_similarity + jcn_similarity , synsets , family = "binomial" ) )

Call:
glm(formula = has_n ~ wup_similarity + jcn_similarity, family = "binomial",
    data = synsets)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.2734  -1.1593  -0.9994   1.1878   2.6317

Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)     0.32394    0.01290   25.11   <2e-16 ***
wup_similarity -1.16130    0.04502  -25.80   <2e-16 ***
jcn_similarity -3.31271    0.16292  -20.33   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 202744  on 146343  degrees of freedom
Residual deviance: 201527  on 146341  degrees of freedom
AIC: 201533

Number of Fisher Scoring iterations: 4

The best-fitting parameter estimates are negative, suggesting that nouns that are more similar to "boundary" are less likely to have "n" in them. And, according to the model assumptions, these estimates are highly significant. So here's some evidence that the opposite relationship happens to hold, at least in English. Now, this particular statistical model assumes that each entry is independent, which isn't true, in part because the measures for a lemma are computed on the basis of its synset, which is shared with other lemmas. I'm currently running a mixed logit regression and will report if anything changes (mixed logit regressions are very slow!).
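To see what negative weights mean in practice, the estimates can be turned back into predicted probabilities with the inverse logit. The sketch below copies the three coefficients from the glm summary above; the similarity values plugged in are arbitrary illustration points, not real lemmas, and `prob_has_n` is a hypothetical helper name.

```python
import math

# Coefficient estimates copied from the glm summary above.
INTERCEPT, B_WUP, B_JCN = 0.32394, -1.16130, -3.31271

def prob_has_n(wup, jcn):
    """Predicted probability of "n" under the fitted logit model."""
    logit = INTERCEPT + B_WUP * wup + B_JCN * jcn
    return 1.0 / (1.0 + math.exp(-logit))

# Arbitrary similarity values, chosen only to show the direction of the effect.
for wup, jcn in [(0.0, 0.0), (0.5, 0.05), (1.0, 0.1)]:
    print("wup=%.1f jcn=%.2f -> P(has n) = %.3f" % (wup, jcn, prob_has_n(wup, jcn)))
```

Because both slopes are negative, the predicted probability falls as either similarity score rises.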

Now, to enter datamining mode, I had a look through the most- and least-related words to boundary with

> tail( synsets[ order( synsets$jcn_similarity ), ] , n = 1000)

and

> head( synsets[ order( synsets$jcn_similarity ), ] , n = 1000)

There's not a clear pattern that I can see. There are a lot of "-tion" nouns that are dissimilar to "boundary," and few that are similar, but the negative correlation persists if we re-run the regression with "-tion" words excluded. So I'm not sure what's driving the correlation, and wouldn't be too confident about finding it again on a new dataset or a new language.

-- EDIT
The initial post did not count words that end or begin with "n" as "having n."

--EDIT #2

Ok, adding random effects for shared synset did not change the overall results:

> has_n_glmer <- glmer( has_n ~ wup_similarity + jcn_similarity + ( wup_similarity + jcn_similarity | synset_name ), synsets, family="binomial" )
> summary( has_n_glmer )
Generalized linear mixed model fit by maximum likelihood ['glmerMod']
Family: binomial ( logit )
Formula: has_n ~ wup_similarity + jcn_similarity + (wup_similarity + jcn_similarity | synset_name)
Data: synsets

     AIC      BIC   logLik deviance
  195184   195273   -97583   195166

Random effects:
Groups      Name           Variance Std.Dev. Corr
synset_name (Intercept)     1.382   1.176
            wup_similarity  9.334   3.055    -0.35
            jcn_similarity 15.692   3.961     0.58 -0.36
Number of obs: 146344, groups: synset_name, 82114

Fixed effects:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)     0.49530    0.01772   27.96   <2e-16 ***
wup_similarity -1.85586    0.06428  -28.87   <2e-16 ***
jcn_similarity -4.79283    0.23549  -20.35   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
            (Intr) wp_sml
wup_simlrty -0.851
jcn_simlrty -0.260 -0.075
« Last Edit: February 15, 2014, 03:22:02 AM by jkpate »
All models are wrong, but some are useful - George E P Box

#### freknu

• Forum Regulars
• Serious Linguist
• Posts: 396
• Ostrobothnian (Norse)
##### Re: The language of old Europe
« Reply #153 on: February 15, 2014, 12:08:09 PM »

#### dublin

• Linguist
• Posts: 101
##### Re: The language of old Europe
« Reply #154 on: February 16, 2014, 11:40:15 AM »
jkpate

Thank you very much for attempting to help, but I have no idea what all this means.

Can you give me a list of words which, according to the software you use, relate to boundary? Let's start from there. I don't know how the software calculates all this, and I would like to do some manual checks of its accuracy.

For example, does the list contain words like entrance, nip, snap, line, nose, snow, no? If it does not, then we have software that does not know what words related to boundary are, and any further calculations are based on a wrong initial data set. Again, I am very grateful for your help. It would also help if you could explain how they originally determined the word relationships when they were populating the database. That is crucial. Then, what words did they use? Modern English or Old English? We need to discard everything that came into English in the last 300 years at least. We want only words of European origin, no Asian or African borrowings. And so on....

Quote
From the Enlightenment on, the singular ambition of science has been to develop techniques for producing and evaluating knowledge independent of personal opinion.

The original relationship between words, at the time when the word database was created, was someone's personal opinion. Or did they use some magic to determine what is related to what?

Quote
If I had the opposite intuition that you do, what would we do? Stand around and shout at each other until one of us got bored?

No, we would look at each other's examples and data, and we would cross-check whether our "intuition" works for the examples we find for each other. So far, my intuition has worked on every example you guys found. I don't expect it to work on every word, but I do expect it to work on every word from languages which associate n with no, negation.

Quote
If you want to make scientific claims, you need to talk statistics. Statistics is the quantification and comparison of measured data. That's what science is. If, on the other hand, you want to just bathe in intuitions, there are plenty of people in fields like literary studies who would be happy to argue with you. Maybe you're more interested in the humanities than the social sciences. That's fine, but linguistics is a social science.

Please answer my two questions using statistics. Also explain to me how you can use statistics to understand the meaning of a word. Like dog. Or cat. You need to use the right tool for the right job. If you look at the contexts in which the word dog appears, you can statistically conclude that dog is somehow associated with a furry four-legged animal which says woof. But you can also just look around and listen. The result is the same. But if I ask you how come a dog is called a dog, you will not be able to tell me that using statistics. Maybe I am wrong, and I am willing to admit my mistake if you show me how you can tell me, with statistics, why a dog is called a dog.

You say linguistics is a social science. Then you must consider historical, archaeological, and ethnographic data if you want to understand language and its meaning and development. My method is based on a comparative study of all these disciplines and linguistics. Does your method include all the data from these disciplines and their relationships? What about psychology, neuroscience, genetics, theory of communication? My method includes correlating data from these disciplines as well. Does your method include this data? If your method does not include the above data, how can you claim that "linguistics is a social science", which it is? What you call hunch and intuition is the result of careful study.

I can tell you why 0 is used to represent ten |||||||||| because I use anthropological, ethnographic, historical and archaeological data and I correlate it with linguistic data. And that data tells me that we have two hands with ten fingers which look like sticks. We group sticks by grabbing them with two hands, forming a ring which looks like 0. We then tie them together using a string which also has the shape of 0. But the same position of our hands with nothing in them forms the shape 0 which means nothing, empty. The first mention of the use of 0 in mathematical texts was to describe ten ||||||||||, a bunch of sticks grouped together by our hands or by a string. Why was 0 used and not some other symbol? Because it looks like a ring, a string, ten fingers, like something we use to bind things together in bunches. Did you by the way notice how many boundary N words were in the last sentence? Please tell me how you can get this using statistics.

I am not being argumentative, I just want to learn. Maybe there is a useful method for analyzing language meaning which I don't know about.
« Last Edit: February 16, 2014, 12:46:18 PM by dublin »
The most important thing in science is to know when to stop laughing

#### dublin

• Linguist
• Posts: 101
##### Re: The language of old Europe
« Reply #155 on: February 16, 2014, 01:11:08 PM »
djr33

Quote
Really? Snake has more to do with summer than heat or travel?

First, what we associate with summer is not the same stuff primitive people associated with summer. Even their definition of summer was different from ours. For ancient Europeans, there were two parts of the year: light, warm, alive summer and dark, cold, dead winter. Summer lasted from the beginning of May to the beginning of November.

Can you travel in winter? You can. How is this then related to summer alone?
When can you see snakes? Only during the summer months. Snakes are related to summer alone.

Heat is present in summer, and is related to summer, but it is also related to fire, home, body. But what is the sound of heat? Which part of "all these s words have in them the sound of summer" did you not understand? The words I listed are not the only words which are related to summer, but they are all linked to the sound ssss, which is the sound of summer. If you were ever in nature in Europe in summer and actually had your ears opened, you would know what I am talking about.

This is what makes these words a context related cluster. Remember we were talking about word clustering from the point of view of meaning contexts.

You also said that there are hundreds of words which define summer. Can you give me any of these hundreds of words which I have missed and which describe the sound of summer day?

The word for sun (sun, sol, suria, sunce) is probably derived from the sound of summer. I am not saying that all people noticed this sound of summer and used it in their summer-related words.

Look at the Serbian language, where summer is leto. It could be related to the flight of birds. Leteti means to fly. Proleteti means to fly over, to fly past. Proleće - spring - the time when birds fly overhead, migrate, arrive from Africa. Leto - summer - the time when birds are flying here, when they land, nest.
Why is there a difference in the natural phenomena used by different people for naming summer? Different strokes for different folks. Different priorities, different observational skills, different logic. But the fact is they both used observed natural phenomena to name summer, meaning they both used meaning contexts to develop language.

The most important thing in science is to know when to stop laughing

#### MalFet

• Global Moderator
• Serious Linguist
• Posts: 282
##### Re: The language of old Europe
« Reply #156 on: February 16, 2014, 06:09:17 PM »
This doesn't really seem to be going anywhere, so I'll bow out here. As should be needless to say, however, if you want to make broad typological-distributional claims about cross-linguistic phones (which is not really comparable to the definition of "dog"), understanding at least the broad strokes of what jkpate is saying is going to be very important. Good luck with your work.
« Last Edit: February 16, 2014, 07:23:02 PM by MalFet »

#### Daniel

• Experienced Linguist
• Posts: 2066
• English
##### Re: The language of old Europe
« Reply #157 on: February 16, 2014, 06:27:08 PM »
Quote
....but i have no idea what all this means.
Something to work on. It's a problem that you don't know how to do what jkpate was suggesting and that you don't have an alternative reliable methodology to offer otherwise. Instead you say things like this:
Quote
Like does the list contain words like entrance, nip, snap, line, nose, snow, no. If it does not then we have software that does not know what words related to boundary are, and any further calculations are based on wrong initial data set.
Really? If the data doesn't support your hypothesis then the data and analysis are wrong?

Quote
First, what we associate with summer is not the same stuff primitive people associated with summer. Even their definition of summer was different from ours. For ancient Europeans, there were two parts of the year: light, warm, alive summer and dark, cold, dead winter. Summer lasted from the beginning of May to the beginning of November.
...
You're just making stuff up now.
You're telling me that snakes are more related to summer than heat?? The sun pretty much guarantees that all humans (all animals really) will associate heat with summer. What you're saying is crazy.
Travel, sure, maybe that's a new thing. Then again, the winter was impassable in many places so travel during the summer was also the case back then. *shrug*

Quote
If you were ever in nature in Europe in summer and actually had your ears opened, you would know what I am talking about.
Absurd. I've been in Europe during the summer and never heard "ssssss". You're just making things up.

Quote
You also said that there are hundreds of words which define summer. Can you give me any of these hundreds of words which I have missed and which describe the sound of summer day?
No, I can't. This is apparently all in your head, so I have no clue!
But there are lots of words related to summer. For example, "heat". Others might include "light", "tired", "dry", "dirt", "bright", and so forth.

You can do a corpus search to find words that occur near/with "summer" often. That's similar to what jkpate did above.
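As a rough illustration of what such a search involves, here is a minimal Python sketch that counts the words appearing within a few tokens of a target word. The text is a toy example invented for illustration, and `neighbours` is a hypothetical helper; a real search would run over a large corpus and would normally correct for overall word frequency.

```python
from collections import Counter
import re

def neighbours(text, target, window=3):
    """Count words appearing within `window` tokens of each occurrence of `target`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == target:
            lo, hi = max(0, i - window), i + window + 1
            # Count every token in the window except the target occurrence itself.
            counts.update(t for j, t in enumerate(tokens[lo:hi], lo) if j != i)
    return counts

# Toy text, invented for illustration only.
toy = ("The summer heat was dry and bright. In summer the light lasts long. "
       "Winter is dark, but summer heat returns.")
print(neighbours(toy, "summer").most_common(5))
```

The point of a procedure like this is that anyone can rerun it on the same corpus and get the same counts, whatever their intuitions are.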

You NEED a methodology that isn't just "common sense" in your own head!!

Quote
The word for sun (sun, sol, suria, sunce) is probably derived from the sound of summer. I am not saying that all people noticed this sound of summer and used it in their summer-related words.

Look at the Serbian language, where summer is leto. It could be related to the flight of birds. Leteti means to fly. Proleteti means to fly over, to fly past. Proleće - spring - the time when birds fly overhead, migrate, arrive from Africa. Leto - summer - the time when birds are flying here, when they land, nest.
This is a contradiction: why doesn't the word in Serbian also start with S? You're making no predictions, just telling stories.

Is your goal just to tell stories about perceived relationships? If so, fine. But that's NOT science.

#### dublin

• Linguist
• Posts: 101
##### Re: The language of old Europe
« Reply #158 on: February 17, 2014, 04:53:41 AM »
MalFet

Quote
This doesn't really seem to be going anywhere

Remember I told you that if we want to have a meaningful conversation we need to use the same language algorithm?

Guys, this is the crux of the problem:

Quote
You can do a corpus search to find words that occur near/with "summer" often. That's similar to what jkpate did above.

When I say a word is related to boundary, summer, whatever, I mean that its meaning is related to boundary, summer.... That it carries the meaning associated with the boundary, summer... meaning pattern.

When you say that a word is related to boundary or summer, you mean "found in texts near or related to the word boundary, summer".

These are completely different things. The way you understand relationship can be tested with statistics, but it answers a completely different question, totally unrelated to the question I am asking, which is: what is the meaning of the word, how is it expressed through sound, and how was it created in the first place?

djr33

I asked if the list of boundary-related words jkpate's software uses contains words like entrance, nip, snap, line, nose, snow, no for a reason. All these words carry the meaning of a boundary, definition, separation, negation. If jkpate's software does not contain these words, then this does not mean that jkpate's software is bad, just that it was written to answer different questions. It answers questions about relationship through position, rather than relationship through meaning. If jkpate's software does not use a meaning-based database, then we have software that does not know what words related to boundary are, and any further calculations are based on a wrong initial data set.

We can still use statistics though. Statistics is used to tell us distribution of the answers to a particular question. So we have to use correct question, to gather correct data.

The question is: which words carry a meaning related to boundary, definition, negation, separation? Assembling this data set can only be done by humans, because machines are not able to understand the deep meaning of words. They can only literally translate symbol patterns to literal meanings, and only if a human told them what the meaning was.

Here is an example. Take word blind.

Meaning derived from the meaning of sound blocks from which the word is constructed:

blind = bl + i + n + d = bel + je (is) + no + to (that) = white, clear + is + not + that = dark, obscured

Official etymology:

Quote

Old English blind "blind," also "dark, enveloped in darkness, obscure; unintelligent, lacking mental perception," probably from West Germanic *blinda- "blind" (cf. Dutch and German blind, Old Norse blindr, Gothic blinds "blind"), perhaps, via notion of "to make cloudy, deceive," from an extended Germanic form of the PIE root *bhel- (1) "to shine, flash, burn" (see bleach (v.)); cf. Lithuanian blendzas "blind," blesti "to become dark." The original sense, not of "sightless," but of "confused," perhaps underlies such phrases as blind alley (Chaucer's lanes blynde), which is older than the sense of "closed at one end" (1610s). In reference to doing something without seeing it first, by 1840. Of aviators flying without instruments or without clear observation, from 1919. Blindman's bluff is from 1580s.
The twilight, or rather the hour between the time when one can no longer see to read and the lighting of the candles, is commonly called blindman's holiday. [Grose, 1796]
Related: Blinded; blinding.

blind (v.)

"deprive of sight," early 13c., from Old English blendan "to blind, deprive of sight; deceive," from Proto-Germanic *blandjan (see blind (adj.)); form influenced in Middle English by the adjective. Related: Blinded; blinding.

blind (n.)

"a blind person; blind persons collectively," late Old English, from blind (adj.). Meaning "place of concealment" is from 1640s. Meaning "anything that obstructs sight" is from 1702.

Both are figured out from linguistic and historical data. No statistics involved. Do you call the stuff you find in an etymological dictionary of the English language "telling stories"? This stuff is impossible for a machine to figure out. You need to have people looking at it, as only people can analyze complex interrelated patterns.

So we can see that N in blind comes from negation, no. So blind is related to boundary through negation. It defines the boundary of sight.

Once you get all the words which carry the meaning of boundary into a list, you can then use statistics to see how many have the sound N in them. Which is what I have been proposing from the start.
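The counting step could be sketched in Python as below, using an exact binomial tail probability. The word list is the one mentioned earlier in this post; the baseline rate of "n" in ordinary words is a placeholder assumption that would have to be measured separately, and `binomial_tail` is a hypothetical helper name.

```python
from math import comb

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more hits at base rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

words = ["entrance", "nip", "snap", "line", "nose", "snow", "no"]
with_n = sum("n" in w for w in words)

BASELINE = 0.5  # placeholder base rate of "n" in ordinary words (an assumption)
print(with_n, len(words), binomial_tail(with_n, len(words), BASELINE))
```

A count like this is only informative if the list is assembled by meaning alone, without looking at which words contain "n", and if the baseline comes from comparable words.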

Look at this example. This is the analysis, based on my meaning carrying sound blocks, of words representing idea of one, separate, defined from all the languages you can easily find dictionaries for on the internet:

English - one = o + n + e = object + bound, defined, separate + is
Irish - aon = a + o + n = standing + object + bound, defined, separate
Serbian - jedan = je + da + n = is + object + bound, defined, separate
German - ein = e + i + n = is + continues, persists + bound, defined, separate
Latin - unum = u + n + u + m = in + bound, defined, separate + in + core, i am, is
Albanian - një = n + je = bound, defined, separate + is
Catalan - une = u + n + e = in + bound, defined, separate + is
Danish - en = e + n = is + bound, defined, separate
Dutch - een = e + n = is + bound, defined, separate
French - un = u + n = in + bound, defined, separate
Greek - ena = e + n + a = is +  bound, defined, separate + standing
Icelandic - einn = e + i + n = is + continues, persists + bound, defined, separate
Romanian - un = u + n = in + bound, defined, separate
Russian - odin = o + d + i + n = object + that + is + bound, defined, separate
Welsh - un = u + n = in + bound, defined, separate
Tamil - onraka, oru = o + nra + ka = object + bound, defined, separate, cut + pointing towards, ga; o + r + u = object + cut, separate + in
Berber - ižžən, ištən, yun, yiwen = ižə (you are, it is in early Medieval Serbian today jes) = es + n = is + bound, defined, separate; es + to + n = is + that + bound, defined, separate; je + u + n = is + in + bound, defined, separate;  je + we + n = is + know, see + bound, defined, separate

Every one of these languages uses a word for one, which separates, defines something, which consists of sound blocks which put together give us the meaning of one, separate, defined. How is this possible, if the sounds are random, meaningless?

Not all languages use the same logic to describe oneness. The languages in this group use different sounds from my meaning carrying sound blocks to convey oneness, separateness, definition through visibility. But because they all use the same logic, to create the same message, they all use the same sound blocks:

Korean - han = h(g)a + n = pointing towards, ga +  bound, defined, separate

Sanskrit - ekah, ekam = e + ka + ga (ma) = is + pointing towards, ga +  pointing towards, ga (me, entity, existence)
Arabic - eahada = ea + ga + da = is + pointing towards, ga + that
Armenian - mek = m + e + k  = me, entity, existence + is + pointing towards, ga
Basque - bat = b + a + t = matter, hard + stands + that
Bengali - eka = e + ka = is + pointing towards, ga + stands
Estonian - üks = u + k + s = in + pointing towards, ga + surface
Finnish - yksi = u + k + s + i = in + pointing towards, ga + surface + continues, persists
Georgian - ert = e + r + t = is + cut, separate + that
Gujarati - eka = e + ka = is + pointing towards, ga
Hindi  - eka = e + ka = is + pointing towards, ga
Hungarian - egy = e + g + i = is + pointing towards, ga + continues, persists
Indonesian - esa, satu = e + s + a = is + surface + stands + there
Khmer - mouy = m + o + u + y = me, entity, existence + object + in + is
Lao - nung = n + un + g = bound, defined, separate + in + pointing towards, ga
Latvian, Lithuanian - viens = vi + e + n + s = know, see + is + bound, defined, separate + surface
Malay - satu = s + a + t + u = surface + stands + there + in
Maltese - wieħed = vidjet (to see in Serbian Dinaric dialect) = vi + da + je + to = know, see + that + is + that
Maori - kotahi = k + o + ta + i = pointing towards, ga + object + that + continues, persists, is
Marathi - eka = e + ka = is + pointing towards, ga
Mongolian - neg = n + e + g = bound, defined, separate + is + pointing towards, ga
Nepali  - eka = e + ka = is + pointing towards, ga
Norwegian, Swedish  - ett, en = is + that , bound, defined, separate
Punjabi - lka = l + ka = line, boundary + ga
Telugu - oka = o + ka = object + pointing towards, ga
Thai - hnung, khn = h(g) + n + un + g =  pointing towards, ga + bound, defined, separate + in + pointing towards, ga; kh + n = pointing towards, ga + bound, defined, separate
Turkish - bir, tek = b + i + r = matter, hard + continues, persists + cut, separate; t + e + k = that + is + pointing towards, ga
Vietnamese - mot = m + o + t = me, entity, existence + object + that
Yoruba - okan = o + ka + n = object + pointing towards, ga + bound, defined, separate
Zulu - eyodwa, kunye = e + yo + d + va = is + pointing towards, ga + that + know, see; k + u + n + ye = pointing towards, ga + in + bound, defined, separate + is
Chinese - yi = je + i = is + continues, persists
Javanese - siji = s + i + j + i = surface + continuous + is + continues, persists
Japanese - ityi, iko = i + t + yi = continues + that + is

So all the languages fit into only two logical groups when it comes to describing oneness.

Is this just a coincidence, or are we discovering something that has been hidden from us all these years in plain sight? How is it possible that every language on this list uses words for one, separate, defined which are built from sound blocks whose sum meaning means one, separate, defined?

I discovered these meaning carrying sound blocks by comparing Serbian and Irish, and yet I can use them to analyze the word for one from any of the above languages. Does anyone wonder how this is possible? Maybe because these sound blocks are as old as the original human language?

Jpate i am almost finished my summary. It will take me maybe one or two more days.
« Last Edit: February 17, 2014, 05:01:28 AM by dublin »
The most important thing in science is to know when to stop laughing

#### Daniel

• Experienced Linguist
• Posts: 2066
• Country:
• English
##### Re: The language of old Europe
« Reply #159 on: February 17, 2014, 07:35:36 AM »
Quote
Remember i told you that if we want to have meaningful conversation we need to use the same language algorithm?
If we have a meaningful conversation, it will be because you say something reasonable and convincing. You can't blame us for your inability to convince us-- either because you're wrong or because you don't have any way to defend your argument that is convincing.

Quote
When i say word is related to boundary, summer...what ever, i mean that it's meaning is related to boundary, summer.... That it carries the meaning associated with boundary, summer...meaning pattern.

When you say that word is related to boundary summer you mean "found in texts near or related to word boundary, summer".

These are completely different things. The way you understand relationship can be tested with statistics, but answers completely different question, totally unrelated to the question i am asking, which is what is the meaning of the word and how it is expressed through sound and how it was created in the first place.
Collocations reveal quite a bit about the meaning of words. They aren't substitutes for each other, but they are related meanings. That's why they often appear in the same context. For example, we might very often find the words "Chicago" and "pizza" near each other in texts, much more often than, for example, "Texas" and "potato". What we would find is that there is some statistical relationship between the use/meaning of the two words. Does "Chicago" mean "pizza"? No, certainly not. But we would know that they often come up together.
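
A minimal sketch of how such a collocation relationship might be measured. It assumes a made-up toy corpus and uses pointwise mutual information (PMI) over same-sentence co-occurrence as the association score; neither the corpus nor the choice of PMI comes from this thread, and the function name `pmi_scores` is hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_scores(sentences):
    """Return PMI (log base 2) for each unordered word pair that co-occurs in a sentence."""
    n = len(sentences)
    word_counts = Counter()   # number of sentences containing each word
    pair_counts = Counter()   # number of sentences containing both words of a pair
    for sentence in sentences:
        words = set(sentence.lower().split())
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))
    scores = {}
    for pair, joint in pair_counts.items():
        a, b = tuple(pair)
        p_joint = joint / n
        p_a = word_counts[a] / n
        p_b = word_counts[b] / n
        scores[pair] = math.log2(p_joint / (p_a * p_b))
    return scores

# Toy corpus invented for illustration only; it is not real usage data.
toy_corpus = [
    "chicago pizza is famous",
    "we ate pizza in chicago",
    "texas is large",
    "potato soup is warm",
    "chicago is windy",
]

scores = pmi_scores(toy_corpus)
chicago_pizza = scores[frozenset(("chicago", "pizza"))]
print(chicago_pizza)  # positive: the pair co-occurs more often than chance would predict
print(frozenset(("texas", "potato")) in scores)  # the pair never co-occurs in this corpus
```

A positive score says only that two words tend to show up together. It does not say that one means the other, which is the distinction drawn above.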

And that's fine-- you don't have to use THAT methodology. Pick another one if you have a better idea. But come up with a methodology!! Something unbiased--

Quote
Assembling this data set can only be done by humans, because machines are not able to understand the deep meaning of words.
What makes you think that humans (or machines) could possibly come up with your analysis? Surveys are very often used in linguistic research, so go ahead with that. What kind of question would you ask people to attempt to defend your claims?

For example:
What word is most related to summer?
A) Snake
B) Heat
C) Travel
D) Snow

If you actually approach this objectively, it's pretty clear you won't get the answers you expect. Therefore, what you are claiming is simply unreliable and biased.

Quote
I discovered these meaning carrying sound blocks by comparing Serbian and Irish, ...
You did not! You observed what you believe to be patterns, yet no one else sees these patterns and the very idea of it goes against most of what historical linguists have come up with for the last 300 years. It's possible you're just way ahead of everyone else, but it's much more likely you're wrong.

Quote
Jpate i am almost finished my summary. It will take me maybe one or two more days.
Again, remember: 200-500 words. We want to know what you're claiming, that's all. If you actually know what you are claiming, it should not take you too long to write this. But whenever you post it, we'll take a look.

#### dublin

• Linguist
• Posts: 101
• Country:
##### Re: The language of old Europe
« Reply #160 on: February 17, 2014, 08:35:53 AM »
djr33

Quote
If we have a meaningful conversation, it will be because you say something reasonable and convincing.

You don't know what conversation means. We need to use the same language in order to have a conversation. If i say "related to boundary" and I mean has a meaning of boundary of something, and you say "related to boundary" and you mean found in the vicinity of the word boundary, then we are not speaking the same language, and will never convince each other of anything. And you could start reading people's posts from beginning to end before you start replying to them.

Quote
Collocations reveal quite a bit about the meaning of words. They aren't substitutes for each other, but they are related meanings. That's why they often appear in the same context. For example, we might very often find the words "Chicago" and "pizza" near each other in texts, much more often than, for example, "Texas" and "potato".

They could have related meaning but most often they don't. Your examples clearly show that you don't understand concept of "meaning" if you think that "Chicago" and "pizza" are related to the same meaning. This is exactly what i thought was happening. You mistake place correlation with meaning correlation. "Chicago" and "pizza" are only related through well known context, and this is a very good example of association through context. But they don't carry the same meaning. Let me give you example of what would have related meaning:  pizza and pita. Both are flat breads. Actually pizza is mispronunciation of pita. Add here pie. Another word related in meaning with pita and pizza. Now add Chicago. Do you see the difference? If you don't, then i don't understand how you teach linguistics.

Quote
You can't blame us for your inability to convince us

Really?

Quote
What makes you think that humans (or machines) could possibly come up with your analysis? Surveys are very often used in linguistic research, so go ahead with that. What kind of question would you ask people to attempt to defend your claims?

Give me list of words whose meaning is related to boundary, defining, separating, negating of something. They have to convey that meaning by themselves, without help of any other words.

For instance "nail" is boundary of finger, "finger" is boundary of hand, "hand" is boundary of arm. All three have N in them. Hand is also used for forming boundaries, on which we place things. hand = h(g)a + n + d = pointing towards, ga + bound, defined, separate + that

Quote
Old English hond, hand "hand; side; power, control, possession," from Proto-Germanic *khanduz (cf. Old Saxon, Old Frisian, Dutch, German hand, Old Norse hönd, Gothic handus). The original Old English plural handa was superseded in Middle English by handen, later hands.

But nail is boundary of toe is boundary of foot and foot is boundary of leg. You can rightly ask why does toe not have N in it? Well one explanation is that not all boundary words have to have N in them, as we have seen from words for ONE. But also, word toe could have originally had N in it and it lost it over time. This is in fact the case:

Quote
toe (n.)
Old English ta "toe" (plural tan), contraction of *tahe (Mercian tahæ), from Proto-Germanic *taihwo (cf. Old Norse ta, Old Frisian tane, Middle Dutch te, Dutch teen (perhaps originally a plural), Old High German zecha, German Zehe "toe"). Perhaps originally meaning "fingers" as well (many PIE languages still use one word to mean both fingers and toes), and thus from PIE root *deik- "to show" (see diction).
Þo stode hii I-armed fram heued to þe ton. [Robert of Gloucester, "Chronicle," c.1300]
The old plural survived regionally into Middle English as tan, ton. To be on (one's) toes "alert, eager" is recorded from 1921. To step on (someone's) toes in the figurative sense "give offense" is from late 14c. Toe-hold "support for the toe of a boot in climbing" is from 1880.

The word foot is associated mostly with walking. Also, we cannot create boundaries with feet. So foot does not carry the meaning of boundary.

Quote
Old English fot, from Proto-Germanic *fot (cf. Old Saxon fot, Old Norse fotr, Dutch voet, Old High German fuoz, German Fuß, Gothic fotus "foot"), from PIE *ped- (cf. Avestan pad-; Sanskrit pad-, accusative padam "foot;" Greek pos, Attic pous, genitive podos; Latin pes, genitive pedis "foot;" Lithuanian padas "sole," peda "footstep", Serbian put - road). Plural form feet is an instance of i-mutation. Of a bed, grave, etc., first recorded c.1300.

I said: "I discovered these meaning carrying sound blocks by comparing Serbian and Irish, ..." And then you said: "You did not! You observed what you believe to be patterns, yet no one else sees these patterns and the very idea of it goes against most of what historical linguists have come up with for the last 300 years. It's possible you're just way ahead of everyone else, but it's much more likely you're wrong."

Please explain how I can use them to analyze word for "one" from all these languages successfully? Look at the list, use logic, add meaning of sound blocks for each word. They are all consistent, there is almost no variation. How is this possible if I am wrong. As I said maybe I am  discovering something that has been hidden from us all these years in plain sight, because of the bias of all these historical linguists you swear by. Give me any other explanation for what i have just demonstrated to you. If you say it is coincidence, please show me the method you used to come to this conclusion, apart from refusing to accept the possibility that I could be right.

Quote
Again, remember: 200-500 words. We want to know what you're claiming, that's all. If you actually know what you are claiming, it should not take you too long to write this. But whenever you post it, we'll take a look.

I actually know what I am claiming. But as it is obvious that you guys don't understand basic terms and concepts like "conversation" and "meaning" in the same way I understand them, I have decided to be as precise as possible, and define all these terms in unambiguous fashion, in order to remove possibility of misunderstanding. This takes time. And thank you for bestowing your attention on me.
« Last Edit: February 17, 2014, 08:40:44 AM by dublin »

#### lx

• Global Moderator
• Linguist
• Posts: 164
##### Re: The language of old Europe
« Reply #161 on: February 17, 2014, 08:40:16 AM »
That awkward moment when someone doesn't understand collocation and thinks the other person said correlation
« Last Edit: February 17, 2014, 08:43:30 AM by lx »

#### dublin

• Linguist
• Posts: 101
• Country:
##### Re: The language of old Europe
« Reply #162 on: February 17, 2014, 08:45:40 AM »
No, lx, the awkward moment is when everyone who is reading this thread realizes that all of these linguists don't know what meaning means, and are substituting meaning with correlation derived from collocation.

#### Daniel

• Experienced Linguist
• Posts: 2066
• Country:
• English
##### Re: The language of old Europe
« Reply #163 on: February 17, 2014, 09:26:41 AM »
dublin, you're just making stuff up then becoming frustrated with us when we don't believe you.

Quote
I actually know what I am claiming. But as it is obvious that you guys don't understand basic terms and concepts like "conversation" and "meaning" in the same way I understand them, I have decided to be as precise as possible, and define all these terms in unambiguous fashion, in order to remove possibility of misunderstanding.
As you see, there's no point in continuing this. Your arguments are not scientific or convincing. They only make sense in your head.

You can't keep redefining everything until your theory works-- that's rhetoric, not science.

Let me be very clear, because you seem to have trouble with this: write an abstract (200-500 words) that makes your claims clear and establishes how they can be tested. They should be falsifiable: if the world were a certain way, we should be able to prove them wrong. Then you can show us that they are not wrong, because the data that would falsify them does not exist.

For the record, I agree with you that collocation is not the most immediate way to get at the meanings you want to analyze. But it's better than nothing, and that's all you've offered so far (aside from telling stories).

#### dublin

• Linguist
• Posts: 101
• Country:
##### Re: The language of old Europe
« Reply #164 on: February 17, 2014, 10:06:03 AM »
I have been extremely consistent in my claims from the first post. I have given you summary of my theory and the way to prove it and disprove it here:

http://linguistforum.com/historical-linguistics/the-language-of-old-europe/msg1198/#msg1198

But that is not the piece I am working on now, the one which is late. What I am writing is a unified theory of language, which is something I am much more interested in. And that takes time.

While i was looking for the original post, i noticed a completely new post from you, which i have never seen before, and which was probably edited after I read the original. It contains some very valid questions from you. So let me try to answer some of them:

Quote
I asked for a short abstract. (We all did.) You haven't written it.

This is the question from the first post after the above post in which i did give you the short summary. Maybe you did not see my post like i didn't see yours...

Quote
Phonosemantics is vacuous if any sound can have any meaning. If you are making a claim that only some sounds have "boundary" as a meaning and not others, then this should be testable. You have not demonstrated a way to objectively test it, just your own intuition.

From my post which you did not read, or maybe did not see: "Get all the words related to boundary, group them by common sound, get percentage for each group with common sound. Compare with threshold percentage. If distribution is uniform, we have randomness. If we have peaks, it is deliberate. Each peak represents one sound which is strongly related to meaning of boundary."
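
A minimal sketch of what the quoted procedure could look like in practice, under loudly stated assumptions: letters stand in for sounds, the "boundary" list is just the three words used earlier in this post, the baseline list is those plus the other body-part words mentioned here, and the threshold of 0.5 is arbitrary. The function names `sound_shares` and `find_peaks` are hypothetical.

```python
from collections import Counter

def sound_shares(words):
    """Fraction of words containing each letter (letters are a crude proxy for sounds)."""
    counts = Counter()
    for word in words:
        counts.update(set(word.lower()))  # count each letter once per word
    return {letter: count / len(words) for letter, count in counts.items()}

def find_peaks(target_words, baseline_words, threshold):
    """Letters whose share in target_words exceeds their baseline share by more than threshold."""
    target = sound_shares(target_words)
    baseline = sound_shares(baseline_words)
    return sorted(
        letter for letter, share in target.items()
        if share - baseline.get(letter, 0.0) > threshold
    )

# Toy lists taken from the examples in this post; far too small to prove anything.
boundary_words = ["nail", "finger", "hand"]
body_words = ["nail", "finger", "hand", "toe", "foot", "leg", "arm"]

peaks = find_peaks(boundary_words, body_words, threshold=0.5)
print(peaks)  # ['n']
```

With three hand-picked words the "peak" at N is built into the selection, so this only illustrates the mechanics. A real test would need a word list assembled independently of the sounds, a much larger baseline vocabulary, and a significance test such as the chi-square test mentioned earlier in the thread.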

Quote
If even one word isn't about boundaries, then what are you claiming? Why is N sometimes related to boundaries and not always? What kind of theory would be a "sometimes" theory. (Imagine a "sometimes we have gravity" theory!)
[Note: I actually think, if anything, this is the right approach. But it's crucial that you work out the details and significance.]

Not all words come from the same original family language. Different people hear things differently, see the world differently. As i demonstrated with the words for ONE, there were two types of logic used to create them.

You need to analyze every word separately, find out which language it comes from, and what its original meaning was. If a word comes from a language which has "no" as its negation word, then N in the words of that language carries the meaning of boundary.

Quote
How do you know? Humans are amazingly good at metaphors, so you just claiming "it's about boundaries" is basically meaningless. As a simple test, give me a word and tell me a concept it is "about". I'll make up a very convincing story. That isn't science or reliable, though. So, again, how can we rely on your analysis when it isn't something objectively testable?

How do you objectively test meaning?

Quote
Another way to consider this would be to wonder about exact opposites. The opposite of boundary might be "wide open space".

No, it is not. Boundary defines things. It does not necessarily enclose them, as i have already shown you.

Quote
What about "savannah", "open", "plain", "plane", "nebula", "navigate", etc.? If N can predict openness and boundaries, then what is it actually predicting?

Savannah comes from an African language.

open - Originally the past participle of the verb *eupaną, *ūpaną, related to *ūp (“up”). Root has no n. Could be from e + up + na = is + up + on

plain - From Anglo-Norman pleyn, playn, Middle French plain, plein, from Latin plānus (“flat, even, level, plain”). From Proto-Indo-European *pelh (“flat”), *pelh₂-. Root has no n. But pleyn can be pl + e + i + n = flat + is + continues, persists + in, on

Nebula - From Proto-Indo-European *nébʰos (“cloud”). Cognate with Ancient Greek νέφος (nephos), Sanskrit नभस् (nábhas). The original meaning is sky. Derived from ne b = not material, hard. Boundary of material world.

Navigate - From Middle English navigate, from Latin navigo, from nāvis (“ship”) + agō (“do”), from Proto-Indo-European *nau- (boat), possibly, from Tamil நாவாய் (nāvāi). From nāvis (“ship”) + agō (“I do”). Actually from na + v + i + go = on + water + persist, continue, float + go. navi = na + v + i = on + water + persist, continue, float

I will try to answer the rest tomorrow. Too busy now. Sorry.
« Last Edit: February 17, 2014, 12:24:00 PM by dublin »