Extracting Audio from Video with Python

  • 2021-10-16 02:15:33
  • OfStack

Brief introduction

OpenCV's VideoCapture class reads only the video frames, so the audio track is lost. To process the audio as well, you need another library: MoviePy. MoviePy is a Python video editing library that supports cutting, concatenation, title insertion, video compositing, video processing and custom effects.

Installation


pip install moviepy

Code


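from moviepy.editor import VideoFileClip  # moviepy.editor is the MoviePy 1.x import path

video = VideoFileClip('test.mp4')
audio = video.audio  # None if the video has no audio track
audio.write_audiofile('test.mp3')
video.close()  # release the underlying ffmpeg reader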

You can also use the ffmpeg-python library instead of installing MoviePy (see Reference 4), although the code is slightly more complicated.
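
As a rough illustration of the ffmpeg route, here is a minimal sketch that calls the ffmpeg command-line tool through the standard subprocess module. It does not use the ffmpeg-python wrapper from the reference. The helper names build_extract_command and extract_audio are hypothetical, and the sketch assumes an ffmpeg executable is available on PATH.

```python
import shutil
import subprocess


def build_extract_command(video_path, audio_path):
    """Return the ffmpeg argument list that extracts only the audio track."""
    return [
        "ffmpeg",
        "-y",              # overwrite the output file if it already exists
        "-i", video_path,  # input video
        "-vn",             # drop the video stream
        audio_path,        # ffmpeg chooses the encoder from the file extension
    ]


def extract_audio(video_path, audio_path):
    """Run ffmpeg; raises CalledProcessError if ffmpeg reports a failure."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg executable not found on PATH")
    subprocess.run(build_extract_command(video_path, audio_path), check=True)


if __name__ == "__main__":
    # Show the command that would be run for the example file names above.
    print(" ".join(build_extract_command("test.mp4", "test.mp3")))
```

Calling extract_audio('test.mp4', 'test.mp3') would then play the same role as the MoviePy snippet above, provided ffmpeg is installed.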

Audio format


extensions_dict = {
    'mp4':  {'type': 'video', 'codec': ['libx264', 'libmpeg4', 'aac']},
    'ogv':  {'type': 'video', 'codec': ['libtheora']},
    'webm': {'type': 'video', 'codec': ['libvpx']},
    'avi':  {'type': 'video'},
    'mov':  {'type': 'video'},
    'ogg':  {'type': 'audio', 'codec': ['libvorbis']},
    'mp3':  {'type': 'audio', 'codec': ['libmp3lame']},
    'wav':  {'type': 'audio', 'codec': ['pcm_s16le', 'pcm_s24le', 'pcm_s32le']},
    'm4a':  {'type': 'audio', 'codec': ['libfdk_aac']},
}

The table shows that ogg, mp3, wav and m4a are supported as audio output formats. In my own test, writing m4a failed, so I recommend using only mp3 and wav.

In a test with a 2-minute video, the mp3 output was 1.83 MB and the wav output was 20.1 MB.

mp3 is lossy and wav is lossless; choose whichever fits your needs.
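
The size gap can be checked with a little arithmetic. The sketch below assumes 44.1 kHz, 16-bit stereo audio for the wav file and a constant 128 kbps bitrate for the mp3 file. These parameters are assumptions, not values reported by the test, and file headers are ignored.

```python
def wav_size_bytes(duration_s, sample_rate=44100, channels=2, bytes_per_sample=2):
    """Uncompressed PCM size: every sample of every channel is stored."""
    return duration_s * sample_rate * channels * bytes_per_sample


def mp3_size_bytes(duration_s, bitrate_kbps=128):
    """Constant-bitrate size: the bitrate is in kilobits (1000 bits) per second."""
    return duration_s * bitrate_kbps * 1000 // 8


MIB = 1024 * 1024
print(f"wav: {wav_size_bytes(120) / MIB:.2f} MiB")  # wav: 20.19 MiB
print(f"mp3: {mp3_size_bytes(120) / MIB:.2f} MiB")  # mp3: 1.83 MiB
```

Under these assumptions the estimates for a 2-minute clip come out close to the sizes measured above, which suggests the roughly 11x difference comes from mp3 compression rather than from anything specific to the test video.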

Remarks

For lower-level audio and video processing, use ffmpeg directly.

Supplement: using Python to process an mp4 video, extract its audio as mp3 or wav, and trim it

Extracting the audio of an mp4 video into an mp3 or wav file

mp3 is a lossy format and wav is a lossless one. For the video I tested, the exported mp3 was only a few tens of KB, while the exported wav was more than 3 MB.


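from moviepy.editor import VideoFileClip  # moviepy.editor is the MoviePy 1.x import path

video = VideoFileClip('aa.mp4')
audio = video.audio
audio.write_audiofile('test.wav')  # lossless, larger file
audio.write_audiofile('test.mp3')  # lossy, much smaller file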

Trimming mp3 or wav files


from scipy.io import wavfile

# wavfile.read returns a tuple (sample_rate, data).
# sample_rate is the sampling frequency in samples per second (Hz).
# data is an ndarray of samples: one-dimensional for mono, and
# two-dimensional with shape (n_samples, n_channels) for stereo.
# Slicing data along its first axis therefore trims the audio.
sample_rate, data = wavfile.read('test.wav')

# start_s and end_s are the start and end of the excerpt in seconds.
# Multiplying by sample_rate (44100 in the original test file) converts
# seconds into sample indices. Here we keep seconds 13 to 48.
start_s, end_s = 13, 48
wavfile.write('test2.wav', sample_rate, data[start_s * sample_rate:end_s * sample_rate])
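
The same trimming idea can be sketched without SciPy, using only the standard-library wave module. This swaps in a different technique from the snippet above: it copies raw frames instead of slicing an ndarray, and it only handles uncompressed PCM wav files. The function name trim_wav is hypothetical.

```python
import wave


def trim_wav(src_path, dst_path, start_s, end_s):
    """Copy the frames between start_s and end_s (in seconds) to a new file.

    Returns the number of frames written.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()   # read the rate instead of assuming 44100
        total = src.getnframes()
        # Convert seconds to frame indices and clamp them to the file length.
        start = max(0, min(int(start_s * rate), total))
        end = max(start, min(int(end_s * rate), total))
        src.setpos(start)
        frames = src.readframes(end - start)

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)       # same channels, sample width and rate
        dst.writeframes(frames)     # the frame count in the header is corrected
    return end - start


if __name__ == "__main__":
    import os
    if os.path.exists("test.wav"):
        # Keep seconds 13 to 48, as in the SciPy example.
        trim_wav("test.wav", "test2.wav", 13, 48)
```

Reading the frame rate from the file avoids hard-coding 44100, so the cut points stay correct for files recorded at other sampling rates.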
