python绘制频谱图,在音频分析中绘制频谱图

257 阅读 0 评论 170 点赞

我是靠谱客的博主着急世界，这篇文章主要介绍python绘制频谱图,在音频分析中绘制频谱图，现在分享给大家，希望可以做个参考。

I am working on speech recognition using neural network. To do so I need to get the spectrograms of those training audio files (.wav) . How to get those spectrograms in python ?

解决方案

There are numerous ways to do so. The easiest is to check out the methods proposed in Kernels on Kaggle competition TensorFlow Speech Recognition Challenge (just sort by most voted). This one is particularly clear and simple and contains the following function. The input is a numeric vector of samples extracted from the wav file, the sample rate, the size of the frame in milliseconds, the step (stride or skip) size in milliseconds and a small offset.

from scipy.io import wavfile

from scipy import signal

import numpy as np

sample_rate, audio = wavfile.read(path_to_wav_file)

def log_specgram(audio, sample_rate, window_size=20,

step_size=10, eps=1e-10):

nperseg = int(round(window_size * sample_rate / 1e3))

noverlap = int(round(step_size * sample_rate / 1e3))

freqs, times, spec = signal.spectrogram(audio,

fs=sample_rate,

window='hann',

nperseg=nperseg,

noverlap=noverlap,

detrend=False)

return freqs, times, np.log(spec.T.astype(np.float32) + eps)

Outputs are defined in the SciPy manual, with an exception that the spectrogram is rescaled with a monotonic function (Log()), which depresses larger values much more than smaller values, while leaving the larger values still larger than the smaller values. This way no extreme value in spec will dominate the computation. Alternatively, one can cap the values at some quantile, but log (or even square root) are preferred. There are many other ways to normalize the heights of the spectrogram, i.e. to prevent extreme values from "bullying" the output :)

freq (f) : ndarray, Array of sample frequencies.

times (t) : ndarray, Array of segment times.

spec (Sxx) : ndarray, Spectrogram of x. By default, the last axis of Sxx corresponds to the segment times.

Alternatively, you can check the train.py and models.py code on github repo from the Tensorflow example on audio recognition.

Here is another thread that explains and gives code on building spectrograms in Python.