Apply

We welcome students who are passionate about audio, speech processing, machine learning, and generative AI research.

Application Info

Applicants are encouraged to send:

• CV / Resume

• Research interests

• Project or coding experience

• Relevant coursework or activity records

Links to GitHub repositories, portfolio pages, papers, demos, or other technical materials are highly encouraged. We value curiosity, consistency, and a strong motivation to learn and build.

For inquiries or applications, please contact: jwoo@kaist.ac.kr

Students interested in joining the lab are encouraged to explore general research foundations in audio and speech AI.

Study

Students interested in joining the lab are encouraged to explore research topics related to:

• Speech & Audio

speech enhancement

sound source separation

target sound extraction

speech recognition

• Spatial & Acoustic Audio Systems

spatial audio

room acoustics & RIR

microphone array signal processing

• Deep Learning for Audio

PyTorch / PyTorch Lightning

CNN / RNN / Transformers / Mamba

self-supervised representation learning

• Generative Audio Modeling

diffusion models

flow matching / MeanFlow

audio generation & restoration

• Audio Foundation Models

WavLM / HuBERT / Whisper

multimodal audio-language models

These topics are intended as preliminary areas of exploration for students interested in audio and speech AI research.

Application

Students are encouraged to engage in research projects related to audio and speech technologies.

• Sound Event Localization & Detection

Rebuilding Class / DoA decoders with Transformer-based architectures

Applying multi-scale structures to DeFT-based blocks (SEMamba++)

Exploring magnitude / phase mu-law compression & decompression techniques

• Multichannel Target Sound Extraction with Spatial Queries

Simulating complex multi-source acoustic environments

Designing spatial queries using target direction and distance information

Developing discriminative and generative audio extraction models

Investigating robust spatial representation learning for multichannel audio

• Spatial Audio Understanding for Large Audio Language Models

Building spatial audio understanding benchmarks for multimodal LLMs

Generating simulated spatial audio datasets using audio simulation tools

Exploring multimodal learning with spatial acoustic information

• Spatial Audio Rendering

Simulating Room Impulse Responses in complex rooms

Developing models for Room Impulse Response generation

Exploring evaluation metrics related to human auditory perception

By taking part in these projects, students can gain hands-on experience with audio and speech AI research.