Pavan Seshadri

I'm a second-year master's student at Georgia Tech advised by Dr. Alexander Lerch in the Music Informatics Group. I received my bachelor's degree in Computer Science from Georgia Tech in 2021, with a minor in Music Technology. My undergraduate research focused on representation learning methods for music performance assessment. I have also spent time working with Dr. Peter Knees (TU Wien) on music recommendation projects.

My current research focuses on audio/music similarity for cold-start music discovery. I am broadly interested in topics spanning speech/audio and language representation learning, including information retrieval, recommendation systems, and multimodal learning.

Prior to starting graduate school, I was a Machine Learning Engineer at Amazon where I worked on NLP research and infrastructure for product classification.

Starting May 2024, I will be joining Music.ai as a research intern, working with Dr. Filip Korzeniowski and Dr. Richard Vogl.

I will be available and looking for full-time roles starting September 2024.

Email / CV / LinkedIn / GitHub / Google Scholar

Peer-Reviewed Publications

ASPED: An Audio Dataset for Detecting Pedestrians
Pavan Seshadri, Chaeyeon Han, Bon-Woo Koo, Noah Posner, Subhrajit Guhathakurta, Alexander Lerch
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP, Seoul, South Korea, 2024
arXiv / code (Coming soon)

We introduce the new audio analysis task of pedestrian detection and present a new large-scale dataset for this task. While the preliminary results prove the viability of using audio approaches for pedestrian detection, they also show that this challenging task cannot be easily solved with standard approaches.

Leveraging Negative Signals with Self-Attention for Sequential Music Recommendation
Pavan Seshadri, Peter Knees
Proceedings of the 1st Workshop on Music Recommender Systems, 17th ACM Conference on Recommender Systems, MuRS @ RecSys, Singapore, 2023 (Oral Presentation)
arXiv / code

We present a study using self-attentive architectures for next-track sequential music recommendation. We additionally propose a contrastive learning subtask to learn session-level track preference from implicit user signals, resulting in a 3-9% increase in top-K hit rate relative to baseline approaches that ignore negative feedback.

Improving Music Performance Assessment with Contrastive Learning
Pavan Seshadri, Alexander Lerch
Proceedings of the International Society for Music Information Retrieval, ISMIR, Online, 2021
arXiv / code

Neural networks trained with a contrastive loss can exceed state-of-the-art performance on music performance assessment (MPA) regression tasks by learning a better-clustered latent space.

Other Publications

AVASPEECH-SMAD: A Strongly Labelled Speech and Music Activity Detection Dataset with Label Co-occurrence
Yun-Ning Hung, Karn N. Watcharasupat, Chih-Wei Wu, Iroro Orife, Kelian Li, Pavan Seshadri, Junyoung Lee
Late-Breaking Demos of the International Society for Music Information Retrieval, 2021
arXiv / code

We propose a dataset, AVASpeech-SMAD, which provides frame-level music labels for the existing AVASpeech dataset, originally consisting of 45 hours of audio and speech activity labels.

Experience

Amazon
Software Development Engineer, Product Knowledge Classification

Amazon
Software Development Engineer Intern, Browse Auto Classification

Template borrowed from Jon Barron