Can Siri read lips? Apple believes it’s possible

Photo credit: Tada Images / Shutterstock.

Siri, Apple’s virtual assistant, continues to evolve and improve. Over the years, Siri has become better at accurately understanding voice commands, even in noisy surroundings. But what if Siri could also read people’s lips?

A recent patent application from Apple suggests that this may be a future capability of Siri. The application describes a system that would track the movements of a person’s lips and then use artificial intelligence to translate those movements into words.

Using Motion Sensing for Keyword Detection

Apple acknowledges the limitations of current speech recognition technologies like Siri: continuous voice monitoring through microphones consumes significant battery life and computing power and can be affected by background noise. In the proposed system, the smartphone camera would not be used for lip-reading. Instead, the device’s motion sensors would capture movements made by the user’s lips, neck, or head, and the program would analyze the data to determine whether they correspond to human speech. The patent application is titled “Keyword Detection Using Motion Sensing.”
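To make the idea concrete, here is a minimal sketch of how software might flag stretches of motion data that deserve a closer look. It is not the method described in the patent application: the function names, window size, and threshold are all hypothetical, and the input is synthetic accelerometer magnitudes rather than real sensor readings. The sketch simply measures how much the signal varies within each fixed-size window and marks the window as active when that variation crosses a threshold.

```python
import math

def window_energy(samples):
    """Return the variance of a window: mean squared deviation from its mean."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)

def flag_active_windows(magnitudes, window_size=50, threshold=0.01):
    """Mark each non-overlapping window as active (True) or still (False).

    window_size and threshold are illustrative assumptions, not values
    taken from the patent application.
    """
    flags = []
    for start in range(0, len(magnitudes) - window_size + 1, window_size):
        window = magnitudes[start:start + window_size]
        flags.append(window_energy(window) > threshold)
    return flags

# Synthetic data: 50 motionless samples, then 50 samples of small oscillation.
still = [1.0] * 50
moving = [1.0 + 0.3 * math.sin(2 * math.pi * i / 10) for i in range(50)]

print(flag_active_windows(still + moving))  # prints [False, True]
```

A real system would need far more than this. A threshold on variance can tell movement from stillness, but it cannot tell speech from walking or chewing, which is where the trained models mentioned in the application would come in.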


The patent suggests that these sensors could include accelerometers and gyroscopes, which are less susceptible to interference than microphones. According to the patent, this motion-detecting technology could be incorporated into AirPods or even “smart glasses,” with the data transmitted to the user’s iPhone. The system would be capable of detecting even subtle facial, neck, or head movements. Although Apple’s ambitions for smart glasses have diminished, the company still has high expectations for its Vision Pro headset.

While the patent application does not indicate when or if Apple plans to bring this technology to market, it showcases Apple’s ongoing effort to enhance Siri’s capabilities.

There are several potential benefits to a lip-reading Siri. First, it would enable Siri to understand people even when they are wearing masks or in noisy environments. This would be particularly useful in situations where speaking is challenging, such as in crowded rooms or while driving. Second, lip reading could improve the accuracy of Siri’s voice recognition: by tracking lip movements, the AI could better interpret what someone is saying.

However, there are also challenges to overcome before a lip-reading Siri can be fully realized. One is the inherent inaccuracy of lip reading, since the same word can be pronounced in various ways. Another is the difficulty of lip reading in noisy surroundings, as the AI would need to reliably distinguish lip movements from background noise.

