Google's latest patent application reveals Bard-Powered Personal Assistant

For Non-Technical Readers

Ever been frustrated when your Android phone misunderstands your voice commands? Google has filed a patent application for a technology that aims to improve how your phone listens and responds to you. It focuses on Automatic Speech Recognition (ASR), with the goal of making it both more accurate and faster. Imagine chatting with your Google Assistant with fewer mistakes and very little delay: that is what this patent application is about.

Technical Details

Digging into the nitty-gritty, the patent application describes Google’s technique for training a two-pass streaming neural network model equipped with a Listen-Attend-Spell (LAS) decoder. The system works with different kinds of training examples, such as supervised audio-text pairs and unpaired text sequences, and processes each kind differently to improve accuracy.

For instance, in the case of an unpaired text sequence, the model computes a cross-entropy loss based on a log probability tied to that sequence's context vector and updates the LAS decoder accordingly. When dealing with a supervised audio-text pair, it instead updates the LAS decoder together with the corresponding acoustic context vector.
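To make the loss in the paragraph above concrete, here is a minimal sketch of a cross-entropy computed from per-step decoder scores, plus a helper that picks which context vector an example would use. The names `TrainingExample`, `context_kind`, and `cross_entropy` are hypothetical and chosen for illustration; the patent application's actual model and update rule are not reproduced here.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one decoding step."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_norm for x in logits]


def cross_entropy(step_logits: Sequence[Sequence[float]],
                  target_ids: Sequence[int]) -> float:
    """Mean negative log probability of the reference tokens."""
    if len(step_logits) != len(target_ids) or not target_ids:
        raise ValueError("need one non-empty row of logits per target token")
    total = 0.0
    for logits, target in zip(step_logits, target_ids):
        total -= log_softmax(logits)[target]
    return total / len(target_ids)


@dataclass
class TrainingExample:
    # Hypothetical container: audio_features is None for unpaired text.
    target_ids: List[int]
    audio_features: Optional[List[List[float]]] = None


def context_kind(example: TrainingExample) -> str:
    """Assumption: paired audio uses an acoustic context, text-only a text context."""
    return "acoustic" if example.audio_features is not None else "text"


# Toy usage: uniform scores over a two-token vocabulary give a loss of log(2).
unpaired = TrainingExample(target_ids=[0, 1])
loss = cross_entropy([[0.0, 0.0], [0.0, 0.0]], unpaired.target_ids)
```

In a real system the logits would come from the LAS decoder conditioned on the chosen context vector, and the loss would drive a gradient update; both steps are left out of this sketch.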

What sets this patent application apart is its focus on reducing the Word Error Rate (WER), specifically for long-tail entities, meaning rare words such as uncommon names. To achieve this, Google’s technology combines two log probabilities: one derived from an acoustic context vector and the other from a text context vector.
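The article does not say how the two log probabilities are combined. One common choice, assumed here purely for illustration, is a weighted sum in log space. The function name `combine_log_probs`, the `weight` parameter, and all the numbers below are hypothetical.

```python
from typing import List, Sequence


def combine_log_probs(acoustic: Sequence[float],
                      text: Sequence[float],
                      weight: float = 0.5) -> List[float]:
    """Weighted sum of two per-token log-probability vectors.

    weight=1.0 trusts only the acoustic context; weight=0.0 only the text context.
    """
    if len(acoustic) != len(text):
        raise ValueError("both vectors must cover the same vocabulary")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * a + (1.0 - weight) * t for a, t in zip(acoustic, text)]


def best_token(vocab: Sequence[str], scores: Sequence[float]) -> str:
    """Return the vocabulary entry with the highest score."""
    return vocab[max(range(len(vocab)), key=lambda i: scores[i])]


# Made-up scores: the audio slightly prefers the common spelling, while the
# text-side context strongly prefers the rare name.
vocab = ["Caitlin", "Kaitlyn"]
acoustic_log_probs = [-0.6, -0.8]
text_log_probs = [-3.0, -0.2]

acoustic_only = best_token(vocab, combine_log_probs(acoustic_log_probs, text_log_probs, weight=1.0))
combined = best_token(vocab, combine_log_probs(acoustic_log_probs, text_log_probs, weight=0.5))
```

With these toy numbers the acoustic score alone picks the common spelling, while the combined score picks the rare one, which is the kind of long-tail correction the paragraph describes.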

The LAS decoder may also operate in a beam search mode, guided by a hypothesis produced by a Recurrent Neural Network Transducer (RNN-T) decoder during the first pass of the two-pass streaming neural network model.
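How exactly the first-pass hypothesis guides the second pass is not spelled out in the article. As a simplified stand-in, the sketch below uses n-best rescoring rather than a full beam search: a second-pass scorer re-ranks candidate transcripts from the first pass. The `rescore` function, the `weight` parameter, and the toy scores are all assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple

# (transcript, first-pass log probability); a hypothetical representation.
Hypothesis = Tuple[str, float]


def rescore(first_pass: Sequence[Hypothesis],
            second_pass_score: Callable[[str], float],
            weight: float = 0.5) -> str:
    """Re-rank first-pass hypotheses with a second-pass log probability."""
    if not first_pass:
        raise ValueError("need at least one first-pass hypothesis")

    def total(hypothesis: Hypothesis) -> float:
        text, first_score = hypothesis
        return weight * first_score + (1.0 - weight) * second_pass_score(text)

    return max(first_pass, key=total)[0]


# Toy stand-ins for the two decoders: a fixed n-best list and a lookup table.
n_best = [("call Caitlin", -1.0), ("call Kaitlyn", -1.2)]
toy_second_pass = {"call Caitlin": -2.5, "call Kaitlyn": -0.5}

first_pass_choice = rescore(n_best, toy_second_pass.__getitem__, weight=1.0)
two_pass_choice = rescore(n_best, toy_second_pass.__getitem__, weight=0.5)
```

Setting `weight=1.0` ignores the second pass entirely, which makes it easy to see what the second pass changes: here the first pass alone keeps its top hypothesis, while the combined score promotes the second one.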

Call to Action

