Subvocal recognition

From Wikipedia, the free encyclopedia
Electrodes used in subvocal speech recognition research at NASA's Ames Research Center

Subvocal recognition (SVR) is the process of detecting subvocalization and converting the result to a digital output, either aural or text-based.[1] A silent speech interface is a device that allows speech communication without using the sound produced when people vocalize. A computer identifies the phonemes that an individual pronounces from non-auditory sources of information about their speech movements, and these phonemes are then used to recreate the speech using speech synthesis.[2]
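
The phoneme-identification step described above can be illustrated with a toy sketch. It assumes each moment of sensor data has already been reduced to a small feature vector, and it labels each frame with the nearest stored template. The template values, the three-phoneme inventory, and the function names are all hypothetical. Published systems use trained statistical or neural models, not this nearest-template rule.

```python
import math

# Hypothetical per-phoneme template feature vectors (made-up numbers for
# illustration; a real system would learn these from sensor recordings).
PHONEME_TEMPLATES = {
    "h": (0.9, 0.1, 0.2),
    "a": (0.2, 0.8, 0.1),
    "i": (0.1, 0.3, 0.9),
}


def classify_frame(features):
    """Return the phoneme whose template is closest (Euclidean) to the frame."""
    return min(
        PHONEME_TEMPLATES,
        key=lambda p: math.dist(features, PHONEME_TEMPLATES[p]),
    )


def decode(frames):
    """Label every frame, then collapse consecutive repeats into one phoneme."""
    phonemes = []
    for frame in frames:
        label = classify_frame(frame)
        if not phonemes or phonemes[-1] != label:
            phonemes.append(label)
    return phonemes


# Four synthetic frames: two near "h", one near "a", one near "i".
frames = [(0.85, 0.15, 0.2), (0.9, 0.1, 0.25), (0.25, 0.75, 0.1), (0.1, 0.35, 0.85)]
print(decode(frames))  # ['h', 'a', 'i']
```

The decoded phoneme sequence would then be passed to a speech synthesizer or converted to text, a step that is outside the scope of this sketch.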

Input methods

Silent speech interface systems have been created using ultrasound and optical camera input of tongue and lip movements.[3] Electromagnetic devices are another technique for tracking tongue and lip movements.[4]

Speech movements can also be detected by electromyography of the speech articulator muscles and the larynx.[5][6] Another source of information is non-audible murmur, the vocal tract resonance signals that are transmitted through bone conduction.[7]
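
As a rough illustration of the electromyography approach, the sketch below computes the root-mean-square (RMS) amplitude of a signal over fixed windows and flags the windows that exceed a threshold as muscle activity. RMS amplitude is a common surface-EMG feature, but the window length, the threshold, and the synthetic signal here are assumptions chosen for the example, not values from the cited studies.

```python
import math


def windowed_rms(samples, window):
    """Root-mean-square amplitude of each non-overlapping window of samples."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + window]) / window)
        for i in range(0, len(samples) - window + 1, window)
    ]


def active_windows(samples, window, threshold):
    """Indices of windows whose RMS exceeds the (assumed) activity threshold."""
    return [i for i, r in enumerate(windowed_rms(samples, window)) if r > threshold]


# Synthetic trace: quiet, a burst of muscle activity, then quiet again.
signal = [0.01, -0.02, 0.01, -0.01, 0.5, -0.6, 0.55, -0.45, 0.02, -0.01, 0.01, -0.02]
print(active_windows(signal, window=4, threshold=0.1))  # [1]
```

In a recognition system, features like these from several electrode channels would be passed to a classifier. Activity detection alone does not identify which phoneme was articulated.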

Silent speech interfaces have also been built as brain–computer interfaces, using brain activity in the motor cortex recorded by intracortical microelectrodes.[8]

Such devices are designed as aids for people who cannot produce the phonation needed for audible speech, for example after a laryngectomy.[9] Another use is communication when speech is masked by background noise or distorted by self-contained breathing apparatus. A further practical use is where silent communication is needed, such as when privacy is required in a public place or when hands-free silent data transmission is needed during a military or security operation.[3][10]

In 2002, the Japanese company NTT DoCoMo announced it had created a silent mobile phone using electromyography and imaging of lip movement. The company stated that "the spur to developing such a phone was ridding public places of noise," adding that, "the technology is also expected to help people who have permanently lost their voice."[11] The feasibility of using silent speech interfaces for practical communication has since been demonstrated.[12]

In 2019, Arnav Kapur, a researcher at the Massachusetts Institute of Technology, conducted a study known as AlterEgo. Its implementation of the silent speech interface enables communication between the user and external devices by detecting the neuromuscular signals that reach the speech muscles during internal articulation. From these signals, the AlterEgo system deciphers the user's intended words and translates them into text or commands without the need for audible speech.[13]

Research and patents

With a grant from the U.S. Army, research into synthetic telepathy using subvocalization is taking place at the University of California, Irvine under lead scientist Mike D'Zmura.[14]

NASA's Ames Research Center in Mountain View, California, is conducting subvocalization research under the supervision of Charles Jorgensen.[citation needed]

The Brain Computer Interface R&D program at the Wadsworth Center, under the New York State Department of Health, has confirmed that consonants and vowels can be deciphered from imagined speech, which allows for brain-based communication using imagined speech.[15] This work, however, uses EEG rather than subvocalization techniques.

US patents on silent communication technologies include: US Patent 6587729 "Apparatus for audibly communicating speech using the radio frequency hearing effect",[16] US Patent 5159703 "Silent subliminal presentation system",[17] US Patent 6011991 "Communication system and method including brain wave analysis and/or use of brain activity",[18] and US Patent 3951134 "Apparatus and method for remotely monitoring and altering brain waves".[19] The latter two rely on brain wave analysis.

In fiction

See also

References

  1. ^ [citation missing]
  2. ^ Denby B, Schultz T, Honda K, Hueber T, Gilbert J.M., Brumberg J.S. (2010). Silent speech interfaces. Speech Communication, 52: 270–287.
  3. ^ a b Hueber T, Benaroya E-L, Chollet G, Denby B, Dreyfus G, Stone M. (2010). Development of a silent speech interface driven by ultrasound and optical images of the tongue and lips. Speech Communication, 52: 288–300.
  4. ^ Wang, J., Samal, A., & Green, J. R. (2014). Preliminary test of a real-time, interactive silent speech interface based on electromagnetic articulograph, the 5th ACL/ISCA Workshop on Speech and Language Processing for Assistive Technologies, Baltimore, MD, 38-45.
  5. ^ Jorgensen C, Dusan S. (2010). Speech interfaces based upon surface electromyography. Speech Communication, 52: 354–366.
  6. ^ Schultz T, Wand M. (2010). Modeling Coarticulation in EMG-based Continuous Speech Recognition. Speech Communication, 52: 341–353.
  7. ^ Hirahara T, Otani M, Shimizu S, Toda T, Nakamura K, Nakajima Y, Shikano K. (2010). Silent-speech enhancement using body-conducted vocal-tract resonance signals. Speech Communication, 52: 301–313.
  8. ^ Brumberg J.S., Nieto-Castanon A, Kennedy P.R., Guenther F.H. (2010). Brain–computer interfaces for speech communication. Speech Communication, 52: 367–379.
  9. ^ Deng Y., Patel R., Heaton J. T., Colby G., Gilmore L. D., Cabrera J., Roy S. H., De Luca C.J., Meltzner G. S. (2009). Disordered speech recognition using acoustic and sEMG signals. In INTERSPEECH-2009, 644–647.
  10. ^ Deng Y., Colby G., Heaton J. T., and Meltzner G. S. (2012). Signal Processing Advances for the MUTE sEMG-Based Silent Speech Recognition System. Military Communication Conference, MILCOM 2012.
  11. ^ Fitzpatrick M. (2002). Lip-reading cellphone silences loudmouths. New Scientist.
  12. ^ Wand M, Schultz T. (2011). Session-independent EMG-based Speech Recognition. Proceedings of the 4th International Conference on Bio-inspired Systems and Signal Processing.
  13. ^ [citation missing]
  14. ^ [citation missing]
  15. ^ [citation missing]
  16. ^ Apparatus for audibly communicating speech using the radio frequency hearing effect
  17. ^ Silent subliminal presentation system
  18. ^ Communication system and method including brain wave analysis and/or use of brain activity
  19. ^ Apparatus and method for remotely monitoring and altering brain waves
  20. ^ Clarke, Arthur C. (1972). The Lost Worlds of 2001. London: Sidgwick and Jackson.
