We want to capture and reconstruct the spatial acoustic characteristics of a real room in order to synthesize immersive auditory experiences.
To do this, we require only:
We use this information to fit DiffRIR, a differentiable acoustic inverse rendering framework with interpretable parametric models of the scene's salient acoustic features, including sound source directivity and surface reflectivity.
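To make the idea of differentiable inverse rendering concrete, below is a minimal, self-contained sketch in PyTorch. It is not the paper's implementation: reflection-path delays are treated as fixed by the geometry, while per-surface reflectivities and a source gain are fit by gradient descent so that rendered RIRs match a (here, synthetic) measurement. All path lengths and constants are illustrative assumptions.

```python
# A minimal sketch (not the paper's implementation) of differentiable acoustic
# inverse rendering: reflection-path delays are fixed by geometry, while
# per-surface reflectivities and a source gain are learned by gradient descent
# so that rendered RIRs match "measured" ones. All numbers are illustrative.
import torch

FS = 16000              # sample rate in Hz (assumed)
RIR_LEN = 2048          # rendered RIR length in samples
SPEED_OF_SOUND = 343.0  # m/s

# Hypothetical path lengths (meters) for one source-listener pair: a direct
# path plus first-order reflections off three surfaces (-1 = direct path).
path_lengths = [3.0, 5.5, 7.2, 9.8]
surface_of_path = [-1, 0, 1, 2]

def render_rir(reflectivity, source_gain):
    """Render an RIR as a sum of delayed, attenuated impulses."""
    rir = torch.zeros(RIR_LEN)
    for length, surf in zip(path_lengths, surface_of_path):
        delay = int(length / SPEED_OF_SOUND * FS)   # fixed, not learned
        amp = source_gain / length                  # 1/r spreading loss
        if surf >= 0:                               # reflected path:
            amp = amp * torch.sigmoid(reflectivity[surf])  # reflectivity in (0, 1)
        impulse = torch.zeros(RIR_LEN)
        impulse[delay] = 1.0
        rir = rir + amp * impulse
    return rir

# Synthetic "measurement" standing in for a recorded RIR at this listener.
with torch.no_grad():
    target = render_rir(torch.tensor([2.0, -1.0, 0.5]), torch.tensor(0.8))

# Learnable acoustic parameters, fit by minimizing the rendering error.
reflectivity = torch.zeros(3, requires_grad=True)
source_gain = torch.tensor(1.0, requires_grad=True)
opt = torch.optim.Adam([reflectivity, source_gain], lr=0.05)

for step in range(500):
    opt.zero_grad()
    loss = torch.mean((render_rir(reflectivity, source_gain) - target) ** 2)
    loss.backward()
    opt.step()

print("fitted reflectivities:", torch.sigmoid(reflectivity).tolist())
print("fitted source gain:", source_gain.item())
```

In the actual framework, rendered RIRs are compared against real measurements at sparse listener positions; the sketch above only illustrates the gradient-based fitting loop over interpretable acoustic parameters.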
After training, DiffRIR can recover the fully immersive acoustic field of a room, and:
The DiffRIR dataset contains real RIRs and music recordings from four rooms: a Classroom, an acoustically Dampened Room, a Hallway, and a Complex Room with many surfaces. For the latter three rooms, we collect additional subdatasets in which we vary the location and/or orientation of the speaker, or the presence and location of standalone whiteboard panels in the room. These subdatasets are used to evaluate zero-shot generalization to changes in room layout. The dataset can be found on Zenodo.
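As a rough illustration of how RIR data like this can be auditioned, the snippet below convolves a dry (anechoic) recording with an RIR to simulate how it would sound in the corresponding room. The file names are placeholders, not the dataset's actual layout.

```python
# A rough illustration, not a loader for the actual dataset layout: convolving
# a dry recording with a room impulse response applies that room's acoustics.
import numpy as np
import soundfile as sf
from scipy.signal import fftconvolve

music, fs_music = sf.read("dry_music.wav")        # placeholder file name
rir, fs_rir = sf.read("classroom_rir_000.wav")    # placeholder file name
assert fs_music == fs_rir, "resample so both signals share one sample rate"

if music.ndim > 1:                 # downmix stereo to mono for simplicity
    music = music.mean(axis=1)
if rir.ndim > 1:
    rir = rir.mean(axis=1)

wet = fftconvolve(music, rir)                     # reverberant ("wet") signal
wet = wet / (np.max(np.abs(wet)) + 1e-9)          # normalize to avoid clipping

sf.write("music_in_classroom.wav", wet, fs_music)
```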
@InProceedings{hearinganythinganywhere2024,
  title     = {Hearing Anything Anywhere},
  author    = {Mason Wang and Ryosuke Sawata and Samuel Clarke and Ruohan Gao and Shangzhe Wu and Jiajun Wu},
  booktitle = {CVPR},
  year      = {2024}
}