A new technology from researchers at Carnegie Mellon University adds sound and vibration awareness to create truly context-aware computing. The system, called Ubicoustics, gives smart devices an extra layer of context, allowing a smart speaker to know it's in a kitchen or a smart sensor to know you're in a tunnel versus on the open road.
“A smart speaker sitting on a kitchen countertop cannot figure out if it is in a kitchen, let alone know what a person is doing in a kitchen,” said Chris Harrison, a researcher at CMU’s Human-Computer Interaction Institute. “But if these devices understood what was happening around them, they could be much more helpful.”
The first implementation of the system uses built-in speakers to create “a sound-based activity recognition.” How they are doing this is quite fascinating.
“The main idea here is to leverage the professional sound-effect libraries typically used in the entertainment industry,” said Gierad Laput, a PhD student. “They are clean, properly labeled, well-segmented and diverse. Plus, we can transform and project them into hundreds of different variations, creating volumes of data perfect for training deep-learning models.”
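The transformation step Laput describes can be sketched in a few lines. The example below is a hypothetical illustration, not the team's actual pipeline: it takes one labeled clip and spins it into several training variants with random gain, light noise, and a time offset, using only NumPy.

```python
import numpy as np

def augment(clip, rng, n_variants=8):
    """Generate simple variants of one labeled sound clip (illustrative only):
    random gain, additive background noise, and a circular time shift."""
    variants = []
    for _ in range(n_variants):
        v = clip * rng.uniform(0.5, 1.5)               # random gain
        v = v + rng.normal(0, 0.005, size=v.shape)     # light background noise
        v = np.roll(v, rng.integers(0, len(v)))        # random time offset
        variants.append(v.astype(np.float32))
    return variants

rng = np.random.default_rng(0)
# Stand-in for a clean, labeled sound effect: a 440 Hz tone at 16 kHz
clip = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000).astype(np.float32)
augmented = augment(clip, rng)
print(len(augmented))  # 8 variants from a single labeled clip
```

Because each variant keeps the original label, a small sound-effect library can yield the volume of examples a deep-learning model needs.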
From the release:
Laput said recognizing sounds and placing them in the correct context is challenging, in part because multiple sounds are often present and can interfere with each other. In their tests, Ubicoustics had an accuracy of about 80 percent — competitive with human accuracy, but not yet good enough to support user applications. Better microphones, higher sampling rates and different model architectures all might increase accuracy with further research.
In a separate paper, HCII Ph.D. student Yang Zhang, along with Laput and Harrison, describes what they call Vibrosight, which can detect vibrations in specific locations in a room using laser vibrometry. It is similar to the light-based devices the KGB once used to detect vibrations on reflective surfaces such as windows, allowing them to listen in on the conversations that generated the vibrations.
This system uses a low-power laser and reflectors to sense whether an object is on or off or whether a chair or table has moved. The sensor can monitor multiple objects at once and the tags attached to the objects use no electricity. This would let a single laser monitor multiple objects around a room or even in different rooms, assuming there is line of sight.
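The on/off sensing described above boils down to reading vibration off a reflective tag and thresholding its energy. Here's a minimal sketch of that idea, assuming a simulated vibration signal and an illustrative threshold rather than Vibrosight's actual processing:

```python
import numpy as np

def is_running(vibration, threshold=0.01):
    """Classify a tagged object as on/off from the RMS energy of its
    reflected-laser vibration signal (threshold is illustrative)."""
    rms = np.sqrt(np.mean(np.square(vibration)))
    return rms > threshold

rng = np.random.default_rng(1)
idle = rng.normal(0, 0.001, 2048)  # sensor noise only: object is off
# Simulated 60 Hz motor hum superimposed on the noise: object is on
running = idle + 0.05 * np.sin(2 * np.pi * 60 * np.arange(2048) / 2048)

print(is_running(idle), is_running(running))  # False True
```

Since the tags are passive retroreflectors, all the power and computation stays with the single laser sensor doing the sweeping.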
The research is still in its early stages, but expect to see robots that can hear when you’re doing the dishes and, depending on their skills, hide or offer to help.