Automated lip reading

1/3/2023

Lip-reading is notoriously difficult, depending as much on context and knowledge of language as it does on visual clues. But researchers are showing that machine learning can be used to discern speech from silent video clips more effectively than professional lip-readers can. In one project, a team from the University of Oxford’s Department of Computer Science has developed a new artificial-intelligence system called LipNet. As Quartz reported, its system was built on a data set known as GRID, which is made up of well-lit, face-forward clips of people reading three-second sentences. Each sentence is based on a string of words that follow the same pattern. The team used that data set to train a neural network, similar to the kind often used to perform speech recognition. In this case, though, the neural network identifies variations in mouth shape over time, learning to link that information to an explanation of what’s being said.

Abstract (Book series: LNCS, volume 13086)

In Iranian Sign Language (ISL), alongside the movement of fingers and arms, the dynamic movement of the lips is also essential to perform and recognize a sign completely and correctly. In a follow-up to our previous studies on empowering the RASA social robot to interact with individuals with hearing problems via sign language, we have proposed two automated lip-reading systems based on DNN architectures, a CNN-LSTM and a 3D-CNN, on the robotic system to recognize words from the OuluVS2 database. In the first network, a CNN was used to extract static features, and an LSTM was used to model temporal dynamics. In the second one, a 3D-CNN network was used to extract appropriate visual and temporal features from the videos. Accuracy rates of 89.44% and 86.39% were obtained for the presented CNN-LSTM and 3D-CNN networks, respectively, which is fairly promising for our automated lip-reading robotic system. Although the proposed non-complex networks did not provide the highest accuracy for this database (based on the literature), 1) they were able to provide better results than some of the more complex and even pre-trained networks in the literature, 2) they are trained very fast, and 3) they are quite appropriate and acceptable for the robotic system during Human-Robot Interactions (HRI) via sign language.
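
To make the difference between the two architectures more concrete, here is a minimal shape-tracing sketch in plain Python. It is not the authors' implementation: the clip size (16 frames of 64x64 pixels), the channel count, the kernel size, the hidden size, and the function names are all assumptions chosen for illustration. It only shows how a per-frame 2D convolution leaves the time axis for an LSTM to handle, while a 3D convolution slides over time and space together.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length of a convolution along a single axis."""
    return (size + 2 * padding - kernel) // stride + 1


def cnn_lstm_shapes(t, h, w, channels=8, kernel=3, hidden=32):
    """Trace tensor shapes through one 2D conv layer followed by an LSTM.

    All hyperparameters are hypothetical, not taken from the paper.
    """
    # The 2D convolution sees each frame on its own, so t is unchanged.
    h2, w2 = conv_out(h, kernel), conv_out(w, kernel)
    per_frame_features = channels * h2 * w2
    return {
        "conv_out": (t, channels, h2, w2),
        # The LSTM reads one flattened feature vector per frame.
        "lstm_input": (t, per_frame_features),
        # Its final hidden state summarises the whole clip.
        "clip_embedding": (hidden,),
    }


def cnn3d_shapes(t, h, w, channels=8, kernel=3):
    """Trace tensor shapes through one 3D conv layer (hypothetical sizes)."""
    # The 3D kernel also spans neighbouring frames, so time shrinks too.
    t2 = conv_out(t, kernel)
    h2, w2 = conv_out(h, kernel), conv_out(w, kernel)
    return {
        "conv_out": (channels, t2, h2, w2),
        "clip_embedding": (channels * t2 * h2 * w2,),
    }


if __name__ == "__main__":
    print(cnn_lstm_shapes(16, 64, 64))
    print(cnn3d_shapes(16, 64, 64))
```

In this sketch the CNN-LSTM keeps all 16 time steps after the convolution and leaves temporal modelling to the recurrent layer, whereas the 3D-CNN reduces the time axis inside the convolution itself, mixing motion and appearance in a single feature map.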
