For Non-Technical Readers:
In a patent published on June 1, 2023, Google revealed a technology aimed at improving communication for individuals with atypical speech patterns. For those who aren’t familiar, atypical speech includes conditions such as stuttering, lisping, or other speech sound disorders. The patented technology works to bridge the gap between typical and atypical speech, allowing for better understanding during conversations.
The system uses more than one speech recognizer to transcribe the speech of a user with atypical speech patterns. The transcription can then be displayed to others or converted into synthesized speech. Its key feature is that it adapts to the context of the conversation: terms drawn from what the other speaker has said help it recognize the atypical speech more accurately, which makes communication smoother and more effective.
Technical Details:
The Google patent, published on June 1, 2023, discloses a method for transcribing atypical speech in real time. This method could change the way individuals with speech disorders such as stuttering, lisping, or voice disorders communicate with others.
The patent document describes a system employing multiple speech recognizers that work in conjunction. These recognizers activate selectively depending on whether the system detects typical or atypical speech. The system uses a general speech recognizer to generate a transcription of speech from a speaker with typical speech. This transcription is then analyzed to identify one or more bias terms, which serve to tune an alternative speech recognizer. The system employs this alternative, biased recognizer when transcribing speech from a speaker with atypical speech.
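The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patent's implementation: the class name `TranscriptionRouter`, the `Features` type, and the four callables are hypothetical placeholders, and the recognizers in the usage example are trivial stubs rather than real speech models.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set

# Hypothetical stand-in for acoustic features; a real system would pass audio frames.
Features = List[float]


@dataclass
class TranscriptionRouter:
    """Routes each utterance to the general or the alternative (biased) recognizer."""

    detect_atypical: Callable[[Features], bool]
    general_recognizer: Callable[[Features], str]
    alternative_recognizer: Callable[[Features, Set[str]], str]
    find_bias_terms: Callable[[str], Set[str]]
    bias_terms: Set[str] = field(default_factory=set)

    def transcribe(self, features: Features) -> str:
        if self.detect_atypical(features):
            # Atypical speech: use the alternative recognizer, biased on the
            # terms collected so far from the typical speaker.
            return self.alternative_recognizer(features, self.bias_terms)
        # Typical speech: transcribe normally, then harvest bias terms.
        text = self.general_recognizer(features)
        self.bias_terms |= self.find_bias_terms(text)
        return text


# Usage with stub components (all behavior here is made up for illustration).
router = TranscriptionRouter(
    detect_atypical=lambda f: f[0] > 0.5,
    general_recognizer=lambda f: "did you see the soccer game",
    alternative_recognizer=lambda f, terms: f"<biased toward {sorted(terms)}>",
    find_bias_terms=lambda text: {w for w in text.split() if len(w) > 4},
)
print(router.transcribe([0.1]))  # typical utterance, updates the bias terms
print(router.transcribe([0.9]))  # atypical utterance, uses the bias terms
```

Keeping the bias terms as state on the router reflects the idea that the typical speaker's words shape how the next atypical utterance is recognized.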
The novel element is the use of conversational context: biasing the alternative speech recognizer toward terms drawn from the conversation improves its transcription accuracy for a speaker with atypical speech patterns.
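To make the idea of biasing concrete, here is a minimal sketch of one common way to apply it: rescoring a recognizer's candidate transcriptions so that those containing bias terms rank higher. The summary above does not say which biasing mechanism the patent uses, so this technique, the function name `rescore_with_bias`, the `boost` parameter, and the scores in the example are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple


def rescore_with_bias(
    hypotheses: List[Tuple[str, float]],
    bias_terms: Iterable[str],
    boost: float = 1.0,
) -> str:
    """Return the best hypothesis after boosting those that contain bias terms.

    Each hypothesis is a (text, score) pair where a higher score is better,
    for example a log-probability from the recognizer.
    """
    terms = {t.lower() for t in bias_terms}

    def biased_score(item: Tuple[str, float]) -> float:
        text, score = item
        matches = sum(1 for word in text.lower().split() if word in terms)
        return score + boost * matches

    return max(hypotheses, key=biased_score)[0]


# Made-up candidates for one atypical utterance.
hypotheses = [("i love sucker", -1.0), ("i love soccer", -1.4)]
print(rescore_with_bias(hypotheses, []))          # i love sucker
print(rescore_with_bias(hypotheses, ["soccer"]))  # i love soccer
```

Without context the acoustically closer but wrong candidate wins; once "soccer" is known to be a topic of the conversation, the boost is enough to flip the ranking.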
Further, the system includes an acoustic feature receiver, a general speech recognizer, an alternative speech recognizer, and a transcription analyzer. The receiver takes acoustic features from a first utterance spoken by a user with typical speech and from a second utterance spoken by a user with atypical speech.
In terms of hardware implementation, the patent discusses the integration of data processing hardware to perform operations such as acoustic feature reception, speech recognition, transcription analysis, and bias term identification.
Additionally, the patent discloses the use of a speech classifier model that determines whether an utterance comes from a speaker with atypical speech. The model is trained on a range of atypical and typical utterances from speakers with various speech conditions.
The patent also describes several methods for identifying bias terms. This can be done by identifying non-functional terms in the first transcription (that is, content words rather than function words such as "the" or "of"), by conducting a term frequency-inverse document frequency (tf-idf) analysis on the first transcription, or by identifying mutual information included in multiple transcriptions generated by the general speech recognizer during an ongoing conversation between the two users.
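As a rough illustration of the first two methods, the sketch below drops function words from a transcription and ranks the remaining words by a simple tf-idf score against a small background collection. The function-word list, the background sentences, the smoothing in the idf formula, and the names `tokenize` and `bias_terms_tfidf` are assumptions chosen for the example; the patent summary does not specify any of them. The mutual-information method is not shown.

```python
import math
from collections import Counter
from typing import List

# Small, hypothetical function-word list; a real system would use a fuller one.
FUNCTION_WORDS = {
    "the", "a", "an", "to", "of", "is", "are", "was", "i", "you", "it", "we",
    "in", "on", "at", "for", "and", "did", "do", "that", "this",
}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and strip basic punctuation from each word."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return [w for w in words if w]


def bias_terms_tfidf(transcript: str, background: List[str], top_k: int = 3) -> List[str]:
    """Rank the content words of a transcript by tf-idf and return the top_k."""
    docs = [set(tokenize(d)) for d in background]
    words = [w for w in tokenize(transcript) if w not in FUNCTION_WORDS]
    tf = Counter(words)
    n_docs = len(docs) + 1  # count the transcript itself as one document

    def score(term: str) -> float:
        df = 1 + sum(term in d for d in docs)  # the transcript contains the term
        idf = math.log(n_docs / df) + 1.0
        return (tf[term] / len(words)) * idf

    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(tf, key=lambda t: (-score(t), t))[:top_k]


background = [
    "What did you do last night",
    "That was a great movie",
    "I watch the news at night",
]
transcript = "Did you watch the soccer game last night? The soccer final was great."
print(bias_terms_tfidf(transcript, background))  # ['soccer', 'final', 'game']
```

Words that are frequent in the current transcription but rare in general conversation, such as "soccer" here, score highest and are natural candidates for biasing.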
Furthermore, this technology doesn’t require any specialized hardware. It can be implemented on any data processing device, be it a user’s personal device or a remote computing device. Overall, Google’s new patent paves the way for significant strides in assisting individuals with atypical speech in effective communication.