Sunday, 10 August 2025

Handle audio in a video file by using VideoFileClip in the MoviePy library

Python  MoviePy 

VideoFileClip is a core class in the MoviePy library for working with video files in Python. It provides methods to manipulate videos, extract audio, edit frames, and more. It suits Python users who want to extract audio with minimal code, keep audio and video in sync, pass the result on to other Python tools for further processing, and avoid writing low-level FFmpeg commands by hand. The examples below use the MoviePy 2.x API (from moviepy import ..., subclipped(), with_effects()); older 1.x releases use different import paths and method names.

Load a video file and get some basic information

from moviepy import VideoFileClip

# Load a video file
video = VideoFileClip("data/my-video.mp4")

# Get some basic information
print(f"Duration: {video.duration} seconds")
print(f"Size: {video.size}")  # (width, height)
print(f"FPS: {video.fps}")


Extract audio from video

Here are several ways to extract audio:

Example 1: Save audio directly to a file

from moviepy import VideoFileClip

# Load a video file
video = VideoFileClip("data/my-video.mp4")

# Extract audio and save to file
video.audio.write_audiofile("data/my_output_audio.mp3")

# You can also specify codec and bitrate
video.audio.write_audiofile("data/my_output_high_quality.mp3", codec='libmp3lame', bitrate='320k')
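
One detail worth guarding against: when a video file has no audio track, video.audio is None, so calling write_audiofile() on it raises an AttributeError. The sketch below wraps the extraction in a small helper. The name extract_audio is hypothetical (it is not part of MoviePy), and a stand-in object is used so the snippet runs without a real video file.

```python
from types import SimpleNamespace


def extract_audio(clip, output_path):
    """Write clip.audio to output_path.

    Returns False (and writes nothing) when the clip has no audio track.
    `clip` is expected to behave like a MoviePy VideoFileClip.
    """
    if clip.audio is None:
        return False
    clip.audio.write_audiofile(output_path)
    return True


# Stand-in for a video without an audio track (hypothetical test object)
silent_clip = SimpleNamespace(audio=None)
print(extract_audio(silent_clip, "data/unused.mp3"))  # False
```

With a real file, you would pass VideoFileClip("data/my-video.mp4") as the first argument.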


Example 2: Get audio as an AudioClip object

from moviepy import VideoFileClip

# Load a video file
video = VideoFileClip("data/my-video.mp4")

audio = video.audio  # Get audio as AudioClip object

# You can then manipulate the audio or save it
audio.write_audiofile("data/output_audio.wav")  # Save as WAV


Example 3: Extract a specific portion of audio

from moviepy import VideoFileClip

# Load a video file
video = VideoFileClip("data/my-video.mp4")

# Extract audio from 10s to 30s
audio_clip = video.subclipped(10, 30).audio
audio_clip.write_audiofile("data/partial_audio.mp3")
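
If the requested range might run past the end of the video, it is safer to clamp it to the clip's duration first, since out-of-range times may fail or give unexpected results depending on the MoviePy version. Here is a minimal sketch; clamp_range is a hypothetical helper, not a MoviePy function.

```python
def clamp_range(start, end, duration):
    """Clamp a (start, end) range in seconds to the interval [0, duration]."""
    start = max(0.0, min(start, duration))
    end = max(start, min(end, duration))
    return start, end


# A 25-second video cannot provide audio up to 30s, so the end is cut to 25
print(clamp_range(10, 30, 25))  # (10, 25)
```

The result can then be passed on as video.subclipped(*clamp_range(10, 30, video.duration)).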


Example 4: Extract audio and apply effects

from moviepy import VideoFileClip, afx

# Load a video file
video = VideoFileClip("data/my-video.mp4")

# Get audio and apply effects
audio = video.audio

# Increase volume by 50%
audio = audio.with_effects([afx.MultiplyVolume(1.5)]) 
# 2-second fade in
audio = audio.with_effects([afx.AudioFadeIn(2)])  
# 2-second fade out
audio = audio.with_effects([afx.AudioFadeOut(2)])  

audio.write_audiofile("data/processed_audio.mp3")

In the code snippet above, with_effects() is called three times, once for each effect. You can also pass all the effects to a single call as one list:

audio = audio.with_effects([afx.MultiplyVolume(1.5), afx.AudioFadeIn(2), afx.AudioFadeOut(2)]) 
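
The factor given to MultiplyVolume scales the amplitude of the samples, while audio tools often express volume changes in decibels (dB). The standard relationship is dB = 20 * log10(factor). The two helpers below are a small sketch of that conversion; their names are made up for this example and are not part of MoviePy.

```python
import math


def factor_to_db(factor):
    """Convert an amplitude factor (e.g. 1.5 for +50%) to decibels."""
    return 20 * math.log10(factor)


def db_to_factor(db):
    """Convert a gain in decibels to an amplitude factor."""
    return 10 ** (db / 20)


print(round(factor_to_db(1.5), 2))  # 3.52
print(round(db_to_factor(-6), 2))   # 0.5
```

So a factor of 1.5 corresponds to roughly +3.5 dB, and a 6 dB reduction is approximately a factor of 0.5.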


Example 5: Batch process multiple videos

You can use the code snippet below to extract audio from multiple videos in a batch process.

from moviepy import VideoFileClip
import os

folder = "data"

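for filename in os.listdir(folder):
    # Match extensions case-insensitively (e.g. ".MP4")
    if filename.lower().endswith((".mp4", ".avi", ".mov")):
        video_path = os.path.join(folder, filename)
        audio_path = os.path.join(folder, os.path.splitext(filename)[0] + ".mp3")

        try:
            # The context manager closes the file reader when done
            with VideoFileClip(video_path) as video:
                if video.audio is None:
                    print(f"Skipped (no audio track): {video_path}")
                    continue
                video.audio.write_audiofile(audio_path)
            print(f"Processed: {video_path}")
        except Exception as e:
            print(f"Error processing {video_path}: {e}")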
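
The path handling in the loop can also be written with pathlib, which keeps the extension check and the output name in one place. This is only a sketch of that idea: audio_path_for and VIDEO_EXTENSIONS are hypothetical names chosen for this example.

```python
from pathlib import Path

# Extensions treated as video files (same set as in the loop above)
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov"}


def audio_path_for(video_path, audio_ext=".mp3"):
    """Return the audio output path for a video file.

    Returns None when the file extension is not a known video extension.
    The check ignores case, so "Clip.MP4" is accepted.
    """
    path = Path(video_path)
    if path.suffix.lower() not in VIDEO_EXTENSIONS:
        return None
    return path.with_suffix(audio_ext)


print(audio_path_for("data/holiday.MOV"))  # data/holiday.mp3 (POSIX-style separator)
print(audio_path_for("data/notes.txt"))    # None
```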


Example 6: Extract audio with custom parameters

Before customizing the output, it helps to understand a few audio parameters.

Sample rate (measured in hertz, Hz) is the number of audio samples captured per second when analog sound is converted into a digital signal. It determines the highest frequency that can be represented, which is about half the sample rate. In write_audiofile(), the sample rate is set with the fps parameter. If you leave it out, MoviePy uses the clip's own rate, which is 44100 Hz (standard CD quality) for audio loaded through VideoFileClip with default settings.

Bitrate is the amount of data used per second to encode the audio, measured in kbps (kilobits per second). For lossy formats such as MP3, a higher bitrate preserves more detail (better fidelity) but produces larger files.

The two are related but distinct: the sample rate is how often the audio is sampled (resolution in time), while the bitrate is how much data is spent storing those samples once they are encoded.

A codec (short for coder-decoder) is a software or hardware tool that compresses and decompresses digital audio data. It determines how audio is encoded (for storage or transmission) and decoded (for playback). For MP3 output the codec is libmp3lame. MoviePy picks a codec from the file extension when none is given, so passing codec="libmp3lame" simply makes that choice explicit. MP3 supports bitrates up to 320 kbps.

When extracting or converting audio, it is often useful to force stereo output. Many devices and applications expect stereo audio (e.g., smartphones, YouTube), so forcing two channels avoids playback issues with mono or surround sources, and unlike a mono downmix it keeps left/right panning. write_audiofile() passes extra FFmpeg arguments through its ffmpeg_params parameter, so we can set -ac 2 to force the output audio to be stereo. -ac is short for "audio channels", and 2 means two channels (left and right).
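
To make the relationship between these numbers concrete, here is a small arithmetic sketch. It assumes uncompressed PCM audio for the first function (bitrate = sample rate x bit depth x channels) and a constant bitrate for the second; the function names are invented for this example, and real files differ slightly because of container overhead and variable bitrate encoding.

```python
def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


def estimated_size_mb(bitrate_kbps, duration_s):
    """Approximate file size in megabytes for a constant-bitrate stream."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000


# CD-quality stereo: 44100 Hz, 16 bits per sample, 2 channels
print(pcm_bitrate_kbps(44100, 16, 2))  # 1411.2

# A 5-minute MP3 encoded at 192 kbps
print(estimated_size_mb(192, 300))     # 7.2
```

This shows why a 192 kbps MP3 is several times smaller than the same audio stored as uncompressed WAV at about 1411 kbps.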

The following code example shows the custom parameter settings when using the function write_audiofile().

from moviepy import VideoFileClip

video = VideoFileClip("data/my-video.mp4")

# Extract audio with specific parameters
video.audio.write_audiofile(
    "data/custom_audio.mp3",
    fps=44100,        # Sample rate
    bitrate="192k",   # Bitrate
    codec="libmp3lame",  # Audio codec
    ffmpeg_params=["-ac", "2"]  # Force stereo output
)


Example 7: For long videos, consider processing in chunks

The code snippet below divides the long video into 5-minute segments for processing.

from moviepy import VideoFileClip

video = VideoFileClip("data/my-video.mp4")

# Process 5-minute chunks
chunk_size = 300  # seconds

start = 0
while start < video.duration:
    # The last chunk may be shorter than chunk_size
    end = min(start + chunk_size, video.duration)
    chunk = video.subclipped(start, end)
    chunk.audio.write_audiofile(f"data/audio_part_{start // chunk_size}.mp3")
    start += chunk_size

video.close()
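
The chunk boundaries can also be computed separately from MoviePy, which makes it easy to check that the whole duration is covered, including a final fraction of a second when the duration is not a whole number. The generator below is a minimal sketch; chunk_ranges is a hypothetical helper name.

```python
def chunk_ranges(duration, chunk_size):
    """Yield (start, end) pairs in seconds that cover [0, duration]."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    start = 0.0
    while start < duration:
        # The final chunk is cut short so it never runs past the end
        end = min(start + chunk_size, duration)
        yield start, end
        start = end


# A video of 600.5 seconds split into 5-minute chunks
print(list(chunk_ranges(600.5, 300)))
# [(0.0, 300.0), (300.0, 600.0), (600.0, 600.5)]
```

Each pair can be passed to video.subclipped(start, end) in the same way as in the loop above.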



