About me

Hi, welcome to my website!

I am Roshan Sharma, a Ph.D. candidate in the Electrical and Computer Engineering Department at Carnegie Mellon University. I started my Ph.D. in 2019 at Carnegie Mellon with Dr. Florian Metze, working on conversation transcription. Starting in 2020, I had the opportunity to work with Dr. Ian Lane on multi-party conversational speech recognition.

I began working with Dr. Bhiksha Raj in February 2022 in the Machine Learning for Signal Processing Group on Spoken Language Understanding. I collaborate frequently with my colleagues in Watanabe’s Audio and Voice Lab, led by Dr. Shinji Watanabe.

My research interests lie broadly in spoken language processing and natural language processing. I am interested in teaching machines to do complex tasks that involve understanding speech and language. More specifically, my research spans abstract learning tasks like speech summarization, where the relation between the input and output is indirect, to say the least.

Over the course of my Ph.D., I have had the incredible fortune to work on multiple problems in spoken language processing, including speech recognition, speech enhancement, speech emotion recognition, and summarization, among others. I am thrilled to discuss potential collaborations and my past work. Please reach out if you are interested!