You can contact me at ycda [at] stanford [dot] edu, or find me on Twitter and LinkedIn.
Research Interests
Capturing Real Auditory Scenes: Recently, my work has focused on virtualizing real auditory scenes and acoustic spaces. For instance, imagine capturing a video of a concert and then moving freely around the concert space, or taking several videos of a fireworks show and compiling them into an interactive 3D experience. Perhaps you could capture the intrinsic acoustic properties of your living room in a way that lets you listen to your favorite artist there.
Differentiable and Inverse Audio Rendering: Audio renderers often rely on slow, non-differentiable techniques, which makes them difficult to fit to real scenes via gradient-based optimization and often yields simulations that do not match the real-world sounds they attempt to replicate. Inspired by visual inverse rendering and capture techniques, I believe that combining physical inductive biases with machine learning can help us fit simulations to real scenes, and thus make them more accurate.
AI-Assisted Sound Design and Music-Making: Making music involves many steps: writing melodies and themes, chord progressions, arrangement, sound design, mixing, mastering, and more. Some musicians are more inclined toward certain parts of this process than others. My goal is to provide musical artists with controllable assistance for the parts of the music-making process they are less familiar with.
Selected Projects
Hearing Anything Anywhere: We present a method for capturing a real acoustic space from 12 RIR measurements, letting us play any audio signal in the room and listen from any location and orientation. We develop an audio inverse-rendering framework that synthesizes the room's acoustics (monaural and binaural RIRs) at novel locations and creates immersive auditory experiences, such as simulated music.
SoundCam: Humans induce subtle changes in a room's acoustic properties. By observing these changes (explicitly via RIR measurements, or by playing and recording music in the room), we can determine a person's presence, location, and identity.
RealImpact: Everyday objects possess distinct sonic characteristics determined by their shape and material. RealImpact is the largest dataset of object impact sounds to date, with 150,000 recordings from 50 objects of varying shape and material.
Education and Experience
MIT, August 2024–Present
EECS PhD Student
Cambridge, Massachusetts
Sony AI, June 2024–August 2024
Research Intern, Music Foundation Model Team
Tokyo, Japan
Stanford University, September 2022–June 2024
M.S. in Electrical Engineering, specialization in Signal Processing and Optimization
GPA: 4.22/4.3
Course Assistant for ENGR 108 (3x), EE 178 (1x)
Research Assistant in CS (1x), EE (1x)
The University of Chicago, October 2018–June 2022
B.S. in Computer Science with a Specialization in Machine Learning
B.A. in Mathematics
GPA: 4.0/4.0
Honors: Odyssey Scholar, Enrico Fermi Scholar, Robert Maynard Hutchins Scholar,
Summa Cum Laude
News
04/13/24 – First-author submission to ISMIR 2024!
02/26/24 – Hearing Anything Anywhere is accepted to CVPR 2024!
02/16/24 – Mason is accepted to 5 PhD programs (out of 5)!
09/21/23 – SoundCam is accepted to NeurIPS Datasets and Benchmarks 2023!
02/27/23