Home → News → Deep Learning Turns Mono Recordings Into Immersive... → Full Text

Deep Learning Turns Mono Recordings Into Immersive Sound

By Technology Review

January 3, 2019

[article image]

Researchers at the University of Texas and Facebook Research have taught an artificial intelligence (AI) system to convert ordinary monaural sounds into nearly three-dimensional sounds.

The researchers trained the AI to pick up visual cues to determine a sound's originating direction, so when provided a video of a scene and mono sound recording, the machine learning system works out this direction and distorts the interaural time and level differences to generate the effect for the listener.

The researchers said, "We call the resulting output 2.5D visual sound—the visual stream helps 'lift' the flat single channel audio into spatialized sound."

The AI was trained on a database of about 2,000 binaural recordings of videoed musical clips.

The researchers said their next step is "to explore ways to incorporate object localization and motion, and explicitly model scene sounds."

From Technology Review
View Full Article


Abstracts Copyright © 2019 SmithBucklin, Washington, DC, USA


No entries found