Author Topic: Brain-to-text technology demonstrated  (Read 1360 times)

Offline Copernicus

  • Linguist
  • ***
  • Posts: 60
  • Country: us
    • Natural Phonology
Brain-to-text technology demonstrated
« on: June 15, 2015, 02:19:48 PM »
Frontiers in Neuroscience has published a study that purports to demonstrate direct brain-to-text decoding from neurological signals: Brain-to-text: decoding spoken phrases from phone representations in the brain. Basically, text is recognized from brain signals recorded while subjects read text aloud from a screen. It looks like it could be very promising. What do others think?

Online Daniel

  • Administrator
  • Experienced Linguist
  • *****
  • Posts: 1485
  • Country: us
    • English
Re: Brain-to-text technology demonstrated
« Reply #1 on: June 15, 2015, 04:23:07 PM »
Text to brain to text? Interesting.
Welcome to Linguist Forum! If you have any questions, please ask.

Offline Copernicus

  • Linguist
  • ***
  • Posts: 60
  • Country: us
    • Natural Phonology
Re: Brain-to-text technology demonstrated
« Reply #2 on: June 15, 2015, 06:30:50 PM »
Yes, it is interesting. Of course, there is no intelligent linguistic analysis here, just statistical speech analysis. I'm not sure what the signals from the cortex really represented, because the subjects were seeing text, articulating speech sounds, and hearing their own speech while doing the experiment. The data was correlated with a normalized automatic speech recognition (ASR) analysis of the same text. I'm not sure how scalable this kind of program is. It would likely not work with subjects who speak with a heavy foreign or dialectal accent, since the acoustic analysis is tuned to SAE (newscaster) speech.
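For concreteness, here is a minimal sketch of the kind of ASR-style phone decoding being described: per-phone Gaussian likelihood models over windowed neural features, decoded with Viterbi against a phone bigram model. All names and data below are made up for illustration; this is not the authors' actual pipeline.

```python
# Hypothetical sketch: decode a phone sequence from windowed neural features
# using per-phone Gaussian likelihoods and a bigram phone model (Viterbi).
import numpy as np

def fit_phone_models(features, labels):
    """Fit a diagonal Gaussian per phone from labeled feature windows."""
    labels = np.asarray(labels)
    models = {}
    for p in sorted(set(labels.tolist())):
        x = features[labels == p]
        models[p] = (x.mean(axis=0), x.var(axis=0) + 1e-6)
    return models

def log_likelihood(x, model):
    mean, var = model
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def viterbi_decode(features, models, log_trans):
    """Most likely phone sequence given frame likelihoods and bigram
    transition log-probabilities (rows/cols in sorted phone order)."""
    phones = sorted(models)
    T, N = len(features), len(phones)
    score = np.full((T, N), -np.inf)
    back = np.zeros((T, N), dtype=int)
    for j, p in enumerate(phones):
        score[0, j] = log_likelihood(features[0], models[p])
    for t in range(1, T):
        for j, p in enumerate(phones):
            prev = score[t - 1] + log_trans[:, j]
            back[t, j] = np.argmax(prev)
            score[t, j] = prev[back[t, j]] + log_likelihood(features[t], models[p])
    path = [int(np.argmax(score[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(back[t, path[-1]])
    return [phones[j] for j in reversed(path)]

# Toy usage with random "neural" feature windows.
rng = np.random.default_rng(0)
train_x = rng.normal(size=(300, 16))
train_y = rng.choice(["AH", "S", "T"], size=300)
models = fit_phone_models(train_x, train_y)
uniform = np.log(np.full((3, 3), 1 / 3))     # flat bigram model
print(viterbi_decode(rng.normal(size=(10, 16)), models, uniform))
```

In a real system the transition model would come from a language model over phones/words rather than a flat bigram, which is where much of the decoding accuracy comes from.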

Online Daniel

  • Administrator
  • Experienced Linguist
  • *****
  • Posts: 1485
  • Country: us
    • English
Re: Brain-to-text technology demonstrated
« Reply #3 on: June 17, 2015, 09:03:33 PM »
Right. There was some fairly recent research out of UC Berkeley (I think) with visual input and fMRI(?) methods that were able to capture what the subject was seeing. The training data was basically a collection of YouTube videos, with new videos as the test input. The results were certainly not random, but not great (basically compilations of the most similar previous video clips, matched in groups of a few pixels or frames, etc.).
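For the curious, here is a toy sketch of that kind of reconstruction, with hypothetical names and random data rather than the Berkeley group's actual pipeline: rank a library of candidate clips by how well their predicted brain responses match the observed response, then average the best matches.

```python
# Toy sketch: reconstruct a frame by averaging the library clips whose
# (model-predicted) brain responses best correlate with the observed one.
import numpy as np

def zscore(x, axis=None):
    return (x - x.mean(axis=axis, keepdims=True)) / x.std(axis=axis, keepdims=True)

def reconstruct(observed, predicted_lib, clip_lib, k=10):
    """Average the k clips whose predicted responses match `observed` best."""
    scores = zscore(predicted_lib, axis=1) @ zscore(observed) / observed.size
    best = np.argsort(scores)[-k:]           # indices of the top-k matches
    return clip_lib[best].mean(axis=0)       # blurry averaged "reconstruction"

# Toy usage: 5000 library clips, 1000-voxel responses, 32x32 grayscale frames.
rng = np.random.default_rng(1)
predicted_lib = rng.normal(size=(5000, 1000))   # responses predicted by an encoding model
clip_lib = rng.random(size=(5000, 32, 32))      # the candidate clips/frames themselves
observed = rng.normal(size=1000)                # measured response to a novel clip
print(reconstruct(observed, predicted_lib, clip_lib).shape)   # -> (32, 32)
```

Averaging the top matches is exactly why the published reconstructions look like blurry composites of the nearest training clips.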

While that seems impressive ("Can we see what people are dreaming?", etc.), it was probably just reading off the input level, and here it's probably the same: reading off the input level of text rather than some deeper linguistic representation. On the other hand, it was features (phones) rather than raw images (e.g., a page), as far as I understand it, so that is a little deeper.
Welcome to Linguist Forum! If you have any questions, please ask.