Topic: Why do speech-to-text projects not rely on phonology?

Offline zaba

  • Serious Linguist
  • ****
  • Posts: 272
Why do speech-to-text projects not rely on phonology?
« on: March 12, 2014, 09:13:44 AM »
Most speech-to-text projects are based entirely on statistics, right? Isn't phonology more convenient?

Offline MalFet

  • Global Moderator
  • Serious Linguist
  • *****
  • Posts: 282
  • Country: us
Re: Why do speech-to-text projects not rely on phonology?
« Reply #1 on: March 12, 2014, 09:27:49 AM »
Phonology is the quintessential example of a gestalt effect. Designing algorithms to effectively recognize holistic forms turns out to be incredibly hard.

This fact is, ultimately, one of the better criticisms of computational theories of mind. Human minds and digital computers just seem to work...differently.

Offline zaba

Re: Why do speech-to-text projects not rely on phonology?
« Reply #2 on: March 12, 2014, 10:27:27 AM »
Wow, thanks.
Quote
Designing algorithms to effectively recognize holistic forms turns out to be incredibly hard.
Sure, I guess that sounds right. After all, there's a lot of phonology goin' on!

Quote
Phonology is the quintessential example of a gestalt effect.

Can you kindly elaborate on this with a sentence or two? I'm an ignoramus.

Offline MalFet

Re: Why do speech-to-text projects not rely on phonology?
« Reply #3 on: March 12, 2014, 10:37:42 AM »
Have you looked at the Wikipedia page on gestalt? I'm happy to expand on that however I can, but starting from scratch I'd only do a worse job of explaining things than any introductory blurb out there.

Offline zaba

Re: Why do speech-to-text projects not rely on phonology?
« Reply #4 on: March 12, 2014, 10:40:39 AM »
Do you think anything is lost in the process for lack of phonologists? How would things be different if there were phonologists?

Where can I see the repercussions of the lack of phonologists in speech-to-text, e.g. on Siri?

Sorry to bombard you with these likely idiotic questions.

Offline MalFet

Re: Why do speech-to-text projects not rely on phonology?
« Reply #5 on: March 12, 2014, 10:49:07 AM »
If a computer had access to a proper phonology, it would understand natural speech as well as humans do.

Offline zaba

Re: Why do speech-to-text projects not rely on phonology?
« Reply #6 on: March 12, 2014, 12:39:07 PM »
So with a proper model of the phonetics <> phonology interface, speech-to-text could be improved, if only theoretically. Is that true?

Offline lx

  • Global Moderator
  • Linguist
  • *****
  • Posts: 164
Re: Why do speech-to-text projects not rely on phonology?
« Reply #7 on: March 12, 2014, 01:12:06 PM »
There are speech-to-text algorithms that don't use a phonological model, but many, many algorithms and processes do. When you employ any sort of language model, you restrict the domain of applicability significantly, and there is a hell of a lot of money to be made in a system that can be applied across multiple languages. Many algorithms rely on models and statistics from specific languages, but the idea is that you can plug in a hefty-enough corpus, draw the same statistics, and get fairly comparable recognition rates. Once you start putting in phonological information, you really do become language-specific, and that is less desirable when you want your algorithm(s) to be utilisable across a broad spectrum, but it's still a very popular approach. I would have to take issue with the idea that phonological modelling in speech-to-text is as rare as your first post made it out to be.
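
To make the "plug in a corpus and draw the same statistics" idea concrete, here is a minimal Python sketch. It is not taken from any real speech-to-text system: the function name train_bigram and both toy corpora are invented for illustration. The only point is that the same counting code runs unchanged on symbol sequences from any language.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Estimate P(next symbol | previous symbol) by counting adjacent pairs."""
    counts = defaultdict(Counter)
    for seq in corpus:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    model = {}
    for prev, ctr in counts.items():
        total = sum(ctr.values())
        model[prev] = {nxt: c / total for nxt, c in ctr.items()}
    return model

# Hypothetical toy corpora: rough phone sequences, invented for illustration.
english = [["k", "ae", "t"], ["k", "ae", "n"], ["k", "i", "t"]]
spanish = [["k", "a", "s", "a"], ["k", "o", "s", "a"]]

# Same code, different language: nothing above knows which language it sees.
english_model = train_bigram(english)
spanish_model = train_bigram(spanish)
```

Nothing language-specific is hard-coded here, which is the appeal described above. Any phonological knowledge would have to enter through the choice of symbols or through extra constraints on the model.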

Offline jkpate

  • Forum Regulars
  • Linguist
  • *
  • Posts: 130
  • Country: us
    • American English
    • jkpate.net
Re: Why do speech-to-text projects not rely on phonology?
« Reply #8 on: March 12, 2014, 06:25:22 PM »
Basing a system on statistics does not preclude taking advantage of phonological facts. If "using phonology" means "using Optimality Theory constraints" or "using generative rules" then you are right that ASR systems, as far as I know, don't "use phonology." But if you mean the identification of relevant phonological features from data or learning facts about assimilation, then ASR systems do use phonology.
All models are wrong, but some are useful - George E P Box
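
As a rough illustration of "learning facts about assimilation" from data, here is a minimal Python sketch. Everything in it is an assumption made for the example: the observations are made-up toy data, and PLACE and realization_probs are hypothetical names, not part of any ASR toolkit. It simply counts how an underlying /n/ is realized depending on the place of articulation of the following consonant.

```python
from collections import Counter, defaultdict

# Hypothetical toy observations: (following consonant, how /n/ was realized).
OBSERVATIONS = [
    ("p", "m"), ("b", "m"), ("p", "m"), ("b", "n"),
    ("t", "n"), ("d", "n"), ("t", "n"),
    ("k", "ŋ"), ("g", "ŋ"), ("k", "n"),
]

# Place of articulation for each following consonant (assumed feature table).
PLACE = {"p": "labial", "b": "labial",
         "t": "coronal", "d": "coronal",
         "k": "velar", "g": "velar"}

def realization_probs(observations):
    """Estimate P(surface form of /n/ | place of the following consonant)."""
    counts = defaultdict(Counter)
    for following, surface in observations:
        counts[PLACE[following]][surface] += 1
    return {
        place: {s: c / sum(ctr.values()) for s, c in ctr.items()}
        for place, ctr in counts.items()
    }

probs = realization_probs(OBSERVATIONS)
```

On this toy data the counts show /n/ mostly surfacing as [m] before labials and as [ŋ] before velars. That is a purely statistical route to a fact a phonologist would state as a place-assimilation rule.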

Offline MalFet

Re: Why do speech-to-text projects not rely on phonology?
« Reply #9 on: March 12, 2014, 10:29:28 PM »
Right, that's an important distinction. If the term phonology is used broadly to just mean something like "category", then certainly speech recognition uses something along these lines. I'd hesitate to consider phonology to include nothing more specific than category, though.

I'd be curious, though...are there any systems that actually model full phonological systems, including things like allophony and underspecification? That seems very cumbersome for little payoff, but I'd be delighted to learn that some algorithm out there was actually trying to mimic human (rather than computer) perceptual qualities.