Mus2Vid – Visualize Music | Purdue SERIS
♪  Mus2Vid  ·  Spring 2022 – Present

Visualize Music for
Interactive Entertainment

Music is a powerful and universal language that unites people from all backgrounds — yet for those who are hearing-impaired, it remains largely inaccessible. Mus2Vid uses real-time diffusion models to transform classical music into vivid, emotion-driven visual experiences for everyone.

✏  Please fill out our survey

Introduction

Music is a powerful and universal language with the unique ability to unite people from diverse backgrounds and cultures. Yet, for those who are hearing-impaired, this enchanting language remains largely inaccessible. Our project seeks to break down these barriers by visualizing music — making it not only accessible, but also a source of immersive and interactive entertainment for everyone, regardless of their hearing ability.


Statement of Problem

Imagine one out of every five friends in your life is unable to share the joy of music due to hearing impairment. When you get swept away by the rhythm, or experience laughter and tears stirred by a song, they can only observe in silence — unable to partake in the world of music that surrounds you.

These friends miss out on exquisite melodies, harmonies, and the enchanting allure of music, and cannot share in the emotional connection it provides to others. This is unjust: everyone has the right to savor music, a universal gift and a testament to the cultural evolution of humanity. We aspire to use modern technology to restore to them the enjoyment they rightfully deserve.


Proposed Solution

Our project develops innovative technologies that transform music into a visual and interactive experience. We focus on using emotion as the connective thread between senses — aligning music, words, and pictures to achieve real-time music visualization. Here's how our approach works:

01

Music Analysis: Multidimensional Exploration

We dissect music from multiple angles — examining instruments, rhythm, pitch range, harmony, and dynamics. Emotion becomes a pivotal element, capturing the feelings and moods conveyed by each composition.
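As a minimal sketch of this analysis stage, the snippet below extracts a few coarse descriptors from a raw audio frame: energy (dynamics), a rough pitch estimate, and spectral centroid (a timbre proxy). A production system would use a dedicated audio library (e.g. librosa) for tempo, chroma, and richer timbre features; the feature names here are illustrative, not the project's actual code.

```python
import numpy as np

def extract_features(audio: np.ndarray, sr: int = 22050) -> dict:
    """Extract a few coarse descriptors of a mono audio frame."""
    # Dynamics: root-mean-square energy of the frame.
    rms = float(np.sqrt(np.mean(audio ** 2)))
    # Rough pitch estimate: for a simple periodic signal, zero
    # crossings occur twice per cycle, so crossings / (2 * duration)
    # approximates the fundamental frequency in Hz.
    crossings = int(np.sum(np.abs(np.diff(np.sign(audio))) > 0))
    duration = len(audio) / sr
    pitch_hz = crossings / (2.0 * duration)
    # Spectral centroid: magnitude-weighted mean frequency,
    # a crude proxy for brightness / timbre.
    spectrum = np.abs(np.fft.rfft(audio))
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / sr)
    centroid = float(np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-12))
    return {"rms": rms, "pitch_hz": pitch_hz, "centroid_hz": centroid}

# A 440 Hz sine at half amplitude: the pitch estimate lands near 440.
t = np.linspace(0, 1.0, 22050, endpoint=False)
features = extract_features(0.5 * np.sin(2 * np.pi * 440 * t))
```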

02

Machine Learning Classification: Understanding Emotional Resonance

State-of-the-art machine learning models classify the emotions a composition conveys and map them to textual descriptions, establishing a tangible connection between the music and the visual experiences it evokes.
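To make the idea concrete, here is a toy rule-based classifier over the valence/arousal model of emotion. The thresholds and the major-mode valence proxy are illustrative placeholders; the project itself trains learned models rather than hand-written rules.

```python
# Toy emotion classifier: map coarse musical attributes onto the
# four valence/arousal quadrants. Thresholds are illustrative only.

def classify_emotion(energy: float, tempo_bpm: float, mode: str) -> str:
    """Return a discrete emotion label for one analysis window."""
    # Arousal: loud and fast music reads as high-arousal.
    arousal = "high" if (energy > 0.2 and tempo_bpm > 100) else "low"
    # Valence: major mode as a crude proxy for positive valence.
    valence = "positive" if mode == "major" else "negative"
    quadrants = {
        ("high", "positive"): "joyful",
        ("high", "negative"): "tense",
        ("low", "positive"): "serene",
        ("low", "negative"): "melancholic",
    }
    return quadrants[(arousal, valence)]
```

A learned classifier would replace the thresholds but keep the same interface: features in, emotion label (or valence/arousal coordinates) out.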

03

Text Prompts: Adding Language to Music

Machine learning models generate descriptive text prompts that encapsulate the emotional core of the music. These prompts act as keys that unlock visual representations of auditory masterpieces.
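A simple template-based version of this stage might look like the following. The imagery vocabulary is purely illustrative; in practice a language model can produce far richer prompts than a fixed lookup table.

```python
# Turn an emotion label plus a musical attribute into a text prompt
# suitable for a text-to-image model. Vocabulary is illustrative.

EMOTION_IMAGERY = {
    "joyful": "a sunlit meadow bursting with wildflowers, vibrant colors",
    "tense": "a storm gathering over jagged cliffs, dramatic lighting",
    "serene": "a still mountain lake at dawn, soft pastel tones",
    "melancholic": "an empty rain-soaked street at dusk, muted blues",
}

def build_prompt(emotion: str, instrument: str) -> str:
    """Compose a descriptive prompt from emotion and instrumentation."""
    scene = EMOTION_IMAGERY[emotion]
    return f"{scene}, evoking a {emotion} mood, inspired by solo {instrument}"

prompt = build_prompt("serene", "piano")
```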

04

Generation Models: Visualizing Music

Text prompts feed our diffusion-based generation models, which transform abstract descriptions into vivid, interactive visual effects — from evocative images to mesmerizing videos — each crafted to complement the music.
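The generation stage reduces to handing each prompt to a text-to-image pipeline. The wrapper below assumes the diffusers-style interface (`pipeline(prompt, num_inference_steps=...)` returning a result with an `.images` list), with a stub standing in for a real model such as one loaded via `StableDiffusionPipeline.from_pretrained(...)`; the function and class names are illustrative.

```python
# Sketch of the generation stage: one prompt in, one image out.
# Any callable matching the diffusers-style pipeline interface fits.

def render_prompt(pipeline, prompt: str, steps: int = 20):
    """Run one text-to-image generation and return the first image."""
    result = pipeline(prompt, num_inference_steps=steps)
    return result.images[0]

class _StubResult:
    def __init__(self, images):
        self.images = images

class StubPipeline:
    """Stand-in for a real diffusion pipeline, for illustration only."""
    def __call__(self, prompt, num_inference_steps=20):
        return _StubResult(images=[f"<image for: {prompt}>"])

frame = render_prompt(StubPipeline(), "a still mountain lake at dawn")
```

Because the wrapper only depends on the call interface, the stub can be swapped for a real GPU-backed pipeline without changing the surrounding code.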

05

Continuous Updates: Real-Time Personalization

Visual effects evolve in real-time as music plays, harmonizing with its ebb and flow. This dynamic feature ensures every audience member receives a personalized and interactive experience unique to that moment.
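One common way to make such a loop feasible in real time is to re-analyze every audio window but only trigger the expensive generation step when the detected emotion actually changes. The sketch below illustrates that caching pattern with stub functions; it is an assumption about how such a loop could be structured, not the project's implementation.

```python
# Real-time loop sketch: classify every window, but regenerate the
# visual only when the emotion label changes. Stubs are illustrative.

def visualize_stream(windows, classify, generate):
    """Yield one visual per audio window, regenerating on emotion change."""
    last_emotion, frame = None, None
    for window in windows:
        emotion = classify(window)
        if emotion != last_emotion:
            frame = generate(emotion)   # expensive diffusion call
            last_emotion = emotion
        yield frame                     # unchanged frames are reused

frames = list(visualize_stream(
    ["calm", "calm", "storm"],          # pretend-analyzed windows
    classify=lambda w: w,               # stub classifier
    generate=lambda e: f"<{e} visual>", # stub generator
))
```

Here generation runs only twice for three windows, which is what keeps latency bounded as the music plays.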


Impact

Our project has the potential to bring about significant changes in how people experience and appreciate music across multiple dimensions:

Accessibility for Hearing-Impaired

Opens the world of music to those with hearing impairments, ensuring it becomes an inclusive and enjoyable experience for a broader audience.

Personalized Entertainment

Combining music and visual effects offers captivating personalized experiences that add a new dimension of engagement for all music enthusiasts.

Entertainment Industry

Dynamic visual content synchronized with music provides fresh creative possibilities and new ventures within the broader entertainment industry.

Expanded Horizons

Enables audiences to perceive music in a more multifaceted, emotionally resonant way — deepening appreciation and understanding of music as art.

Inclusivity & Innovation

Exemplifies how cutting-edge machine learning and generation models can transform and enrich human interaction with art and entertainment.


Demo Showcase

Experience Mus2Vid in action — watch real-time diffusion models generate visual accompaniments to classical music:

Watch on YouTube ↗