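Overview
This is my first contact with this field. My background is in computer software and I knew nothing about signal processing, so joining this lab meant taking in a lot of new knowledge. Honestly, it was hard for me: building the foundation took a long time and almost everything had to start from scratch. I began with Oppenheim's "Signals and Systems", Song Zhiyong's 《MATLAB在语音信号分析与合成的应用》 (MATLAB in speech signal analysis and synthesis), "Numerical Methods", a signal processing textbook (《信号处理教程》), "Probability and Mathematical Statistics", "Introduction to Algorithms", Zhou Zhihua's "Machine Learning", Li Hang's "Statistical Learning Methods", and so on. Slowly I came to understand the tip of the iceberg of signal processing.
Today I am writing a short summary of my first project:
The speech signals we collect usually contain a lot of noise, so we often have to filter and denoise them, and at the same time cut out the parts of the signal we need.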
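As a small worked example of the filtering step: MATLAB's filter design functions expect cutoffs normalized by the Nyquist frequency, i.e. (f/fs)*2. The sketch below computes this for a 5 kHz to 15 kHz band at fs = 44.1 kHz and, if the Signal Processing Toolbox function butter is available, applies the filter. The synthetic test signal and the filter order 3 are assumptions for illustration, not measured data.

```matlab
fs = 44100;                       % sampling rate in Hz
band = [5000 15000];              % pass band in Hz
Wn = band / (fs/2);               % normalized cutoffs, same as (f/fs)*2

% Synthetic test signal (assumption): a 1 kHz tone outside the band
% plus a 10 kHz tone inside it
t = (0:fs-1).' / fs;
x = sin(2*pi*1000*t) + sin(2*pi*10000*t);

if exist('butter', 'file')        % butter needs the Signal Processing Toolbox
    [b, a] = butter(3, Wn, 'bandpass');
    y = filter(b, a, x);          % band-pass filtered signal
else
    y = x;                        % no toolbox available: leave the signal as it is
end
fprintf('Normalized band: [%.4f %.4f]\n', Wn(1), Wn(2));
```

The same arithmetic gives 18800/22050 = 0.8526 and 19200/22050 = 0.8707 for a narrow band around 19 kHz.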
Below is how I cut the segments out of the signal, although the result is not very good.
Cutting the signal based on its peaks.
function extract_middle_click()
% Cut a fixed-length window around every detected click and save it as text.
readFilePath='D:datatoothSu*.wav';
readPathStr='D:datatoothSu';
savePathStr='D:datatoothddSu';
halfWin=2000; % samples kept on each side of a peak
fileList=dir(readFilePath);
fileNum=length(fileList);
% mkdir only warns if the folder already exists, but creating it once,
% outside the loop, is enough.
if ~exist(savePathStr,'dir')
    mkdir(savePathStr);
end
for j=1:fileNum
    name=fileList(j).name; % file name with extension (speakers: Zhao-zhang, Syam, LWF, Su)
    [~,varStr]=fileparts(name); % file name without the extension
    fileName=strcat(readPathStr,name); % full path of the file
    data=audioread(fileName);
    % Optional band-pass filter; the normalized cutoff is (f/fs)*2 with fs = 44.1 kHz:
    % [b,a]=butter(3,[5000/44100*2,15000/44100*2],'bandpass');
    % inputsignal=filter(b,a,data);
    event_index=identify_middle_click_index(data);
    disp(['Peak indices: ' num2str(event_index)]);
    for i=1:length(event_index)
        % Clamp the window so that it never leaves the recording
        first=max(event_index(i)-halfWin,1);
        last=min(event_index(i)+halfWin,size(data,1));
        datarange=data(first:last);
        % datarange=inputsignal(first:last);
        % datarange=datarange/max(abs(datarange));
        % Other filters that were tried (18800 Hz to 19200 Hz band around 19 kHz):
        % [b,a]=butter(6,[0.8526,0.8707],'bandpass');
        % filterData=filter(b,a,datarange);
        % Fir=fir1(5000,[18985/44100*2,19015/44100*2],'stop');
        % outdata=filter(Fir,1,filterData);
        % Use both the file name and the click number in the output name;
        % using only j would overwrite the same file for every click.
        newStr=[savePathStr,varStr,'_',int2str(i),'.txt'];
        dlmwrite(newStr,datarange);
        figure
        plot(datarange);
    end
end
end
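A window of 2000 samples on each side of a peak can run past the start or the end of the recording when a peak lies close to an edge. The following is a minimal sketch of clamping the window. The recording length, the random samples and the peak positions are made up for illustration.

```matlab
N = 10000;                        % length of a hypothetical recording
data = randn(N, 1);               % placeholder samples
event_index = [500 5000 9500];    % hypothetical peak positions
halfWin = 2000;                   % samples kept on each side of a peak

segLen = zeros(size(event_index));
for i = 1:numel(event_index)
    first = max(event_index(i) - halfWin, 1);   % do not start before sample 1
    last  = min(event_index(i) + halfWin, N);   % do not run past the last sample
    segment = data(first:last);
    segLen(i) = numel(segment);
end
disp(segLen);                     % 2500 4001 2501
```

Clamped segments are shorter than the full 4001 samples, so if a later step needs equal lengths they would have to be zero-padded or discarded.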
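function [event_index] = identify_middle_click_index(inputsignal) % returns the final peak indices
nf = 0.04; % noise floor: read off the time-domain plot the level that the real peaks exceed
span = 20; % span of the moving average used in findpeak
peakdistance = 4000; % minimum gap (in samples) between two consecutive peaks
peakdistance2 = 20000; % maximum gap (in samples) between two consecutive peaks
event_index = [];
[~,~,locs] = findpeak(inputsignal,nf,span); % find peaks
% disp(['number of peaks: ' num2str(length(locs))]);
% locs holds the peak positions (indices); the second output holds the peak values
if isempty(locs)
    return; % no peak above the noise floor
end
j=2;
event_index(1)=locs(1);
for i=2:length(locs)
    gap=locs(i)-locs(i-1);
    if gap>peakdistance && gap<peakdistance2
        event_index(j)=locs(i);
        j=j+1;
    end
end
end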
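To see what the two distance thresholds do, here is a small sketch on made-up peak positions. It applies the same rule as the loop (the first peak is always kept, and a later peak is kept when its gap to the previous peak is between 4000 and 20000 samples) in vectorized form.

```matlab
locs = [1000 1500 6000 11000 40000 45000];  % hypothetical peak positions
peakdistance  = 4000;                       % minimum gap in samples
peakdistance2 = 20000;                      % maximum gap in samples

gap  = diff(locs);                          % gap to the previous peak
keep = [true, gap > peakdistance & gap < peakdistance2];
event_index = locs(keep);
disp(event_index);                          % 1000 6000 11000 45000
```

Note that the gap is measured to the previous detected peak, not to the previous kept event, which is why 6000 is kept even though 1500 was dropped.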
Finding the peaks of each speech signal
function [lined_data,peaks,locs] = findpeak(x,nf,span)
%Function used to get the peaks (local maxima) from the given data
% [lined_data,peaks,locs] = findpeak(x,nf,span)
% lined_data => peaks in the locations
% peaks => Just the peak values
% locs => location at which peaks are occurring
% x => data for which peaks have to be obtained
% nf => Noise Floor
% span => span of the moving average required
x(x<nf)=nf; %Keep the values above the noise floor, raise everything else to the noise floor
x_smoothed=smooth(x-min(x),span,'moving'); %smoothing the shifted current snapshot
%The span (20 here) is decided based on the type of data that is taken. It is
%like a cutoff frequency for a LPF. This moving average actually helps the
%findpeaks() function defined in the Matlab library to decide the peak more
%efficiently, especially in the case of experimental results when there is
%randomness in the data obtained.
[peaks,locs]=findpeaks(x_smoothed); %get the peaks from the data
lined_data=zeros(1,length(x)); %lined data will have peaks at locations
lined_data(locs)=peaks;
lined_data=lined_data+min(x); %shifting it back to original values
peaks=peaks+min(x); %Shifting it back to its original values
end
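The three steps of findpeak (clamp to the noise floor, smooth, take local maxima) can be tried out without any toolbox. The sketch below is not the listing above: it replaces smooth with a conv-based moving average and findpeaks with a strict local-maximum test, and it runs on a synthetic signal with two bumps. The signal, the odd span of 21 and the stand-in functions are all assumptions.

```matlab
n  = (1:1000).';
x  = 0.01 + 0.5*exp(-(n-300).^2/200) + 0.8*exp(-(n-700).^2/200);  % two synthetic bumps
nf = 0.04;                        % noise floor
span = 21;                        % odd span so the moving average is centered

x(x < nf) = nf;                   % raise everything below the noise floor to it
base = min(x);                    % equals nf here
xs = conv(x - base, ones(span,1)/span, 'same');   % moving average of the shifted signal

% strict local maxima of the smoothed signal (stand-in for findpeaks)
mid  = xs(2:end-1);
locs = find(mid > xs(1:end-2) & mid > xs(3:end)) + 1;
peaks = xs(locs) + base;          % shift back to the original level
disp(locs.');                     % 300 700
```

Subtracting min(x) before smoothing makes the flat noise-floor regions exactly zero, so they produce no spurious maxima.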
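Next comes the feature extraction part:
1. Compute the MFCC features of each person.
2. Look at the plot of each person's MFCCs.
3. Run a correlation analysis on each person's MFCC features
A=corr(MFCCs); % correlation between the coefficients (columns)
A=corr(MFCCs'); % correlation between the clicks (rows)
and inspect the plots.
4. The MFCC features of the different people are not in one mat file (mainly because I did not write the batch-processing code well),
so all the MFCC features have to be put together:
① Double-click one person's mat file, i.e. load it; the variable is called MFCCs.
Define mfcc=MFCCs;
② Then open another person's MFCC mat file; its variable is also called MFCCs. Stack it below the first one so that every row is still one click with 13 coefficients:
mfcc=[mfcc;MFCCs]
....
In the end all the individual mat files are merged into one matrix.
Finally save it with save 4.mat mfcc;
5. Now the correlation of all the MFCCs can be inspected for a simple analysis:
A=corr(mfcc')
6. Open the complete feature set, i.e. the merged mat file 4.mat, and add the labels 1, 2, 3, ... as the last column.
7. Feed the data into an SVM to train the model.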
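Steps 4 to 6 can also be done in a loop instead of by hand. The sketch below assumes that every person's feature matrix has one row per click and 13 columns, which is the layout the SVM code expects (13 features plus a label in column 14). Random matrices stand in for the loaded mat files, and corrcoef is used as a base-MATLAB alternative to corr.

```matlab
% Hypothetical per-person feature matrices: one row per click, 13 MFCCs per row
perPerson = {rand(3,13), rand(2,13), rand(4,13)};

mfcc = [];
for p = 1:numel(perPerson)
    F = perPerson{p};
    labelCol = p * ones(size(F,1), 1);     % label 1, 2, 3, ... for person p
    mfcc = [mfcc; F, labelCol];            % stack rows, label goes into column 14
end

R = corrcoef(mfcc(:,1:13).');              % click-by-click correlation matrix
disp(size(mfcc));                          % 9 14
```

With real data, each cell would be filled from a file, for example S = load(fileName); F = S.MFCCs; where fileName is one of the per-person mat files.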
Code for extracting the MFCC features
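function MFCCs = extract_mfcc()
% Compute one averaged MFCC vector per saved click file.
filePath='D:datatoothddZhao-zhang*.txt';
pathStr='D:datatoothddZhao-zhang';
fileList=dir(filePath);
fileNum=length(fileList);
MFCCs = [];
hamming = @(N)(0.54-0.46*cos(2*pi*(0:N-1).'/(N-1))); % Hamming window of length N
for i=1:fileNum
    name=fileList(i).name;
    fileName=strcat(pathStr,name); % full path of the file
    data=dlmread(fileName);
    % The argument order appears to follow a third-party mfcc function:
    % signal, fs (Hz), frame length (ms), frame shift (ms), pre-emphasis,
    % window, frequency range (Hz), filterbank channels, cepstral coefficients, lifter.
    [ CC, FBE, frames ] = mfcc(data,44100,25,10,0.97,hamming,[5000,15000],20,13,22);
    MFCCs = [MFCCs,mean(CC,2)]; % average over the frames: one column per file
end
MFCCs = MFCCs'; % one row per file, one column per coefficient
save('MFCCs.mat','MFCCs'); % save only the feature matrix, not the whole workspace
end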
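To make the numbers in the mfcc call more concrete, the sketch below works out the framing for one 4001-sample segment (2000 samples on each side of a peak) at 44.1 kHz with 25 ms frames and a 10 ms shift. Counting only full frames without padding is an assumption, since the actual mfcc function may pad the last frame, and a random matrix stands in for its output.

```matlab
fs = 44100;  Tw = 25;  Ts = 10;            % sampling rate (Hz), frame length and shift (ms)
Nw = round(Tw * fs / 1000);                % samples per frame
Ns = round(Ts * fs / 1000);                % samples per shift

hamming = @(N) (0.54 - 0.46*cos(2*pi*(0:N-1).'/(N-1)));
w = hamming(Nw);                           % tapers from 0.08 at the edges to about 1 in the middle

L = 4001;                                  % segment length in samples
nFrames = floor((L - Nw) / Ns) + 1;        % full frames only, no padding (assumption)

CC = rand(13, nFrames);                    % placeholder for the 13 x nFrames coefficient matrix
feature = mean(CC, 2).';                   % one 1 x 13 row per segment
fprintf('%d samples/frame, %d samples/shift, %d frames\n', Nw, Ns, nFrames);
```

Averaging over only a handful of frames per segment discards the time structure of the click, which may be worth keeping in mind when judging the classification results.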
The SVM code is as follows:
function ac = ovoSVM()
% One-against-one SVM; svmtrain and svmpredict are the LIBSVM MATLAB functions.
load mfcc.mat % data format: n*m matrix, n observations, m-1 feature dimensions, the last column holds the labels
%[meas,species] = formatdata_svm();
labels = mfcc(:,14);
[~,~,labels] = unique(labels); % map the labels to 1/2/3
observations = mfcc(:,1:13);
data = zscore(observations); % scale features
numInst = size(data,1); % number of rows (observations)
% split training/testing
idx = randperm(numInst); % random permutation of the row indices, e.g. of 1..16
numTrain = 8;
trainData = data(idx(1:numTrain),:); testData = data(idx(numTrain+1:end),:);
trainLabel = labels(idx(1:numTrain)); testLabel = labels(idx(numTrain+1:end));
% Alternative with fixed parameters:
% model=svmtrain(trainLabel,trainData,'-c 24 -g 4.1');
% [prediction_decision_label,prediction_accuracy,dec_value]=svmpredict(testLabel,testData,model);
% [training_decision_label,training_accuracy,dec_value]=svmpredict(trainLabel,trainData,model);
% grid search over C and gamma with 5-fold cross-validation
% (a narrower grid that was also tried: log2c = -1:3, log2g = -4:1)
bestcv = 0;
for log2c = -4:12
    for log2g = -8:4
        cmd = ['-v 5 -c ', num2str(2^log2c), ' -g ', num2str(2^log2g)];
        cv = svmtrain(trainLabel, trainData, cmd); % with -v the return value is the cross-validation accuracy
        if (cv >= bestcv)
            bestcv = cv; bestc = 2^log2c; bestg = 2^log2g;
        end
        % fprintf('%g %g %g (best c=%g, g=%g, rate=%g)\n', log2c, log2g, cv, bestc, bestg, bestcv);
    end
end
% train one-against-one model
cmd2 = ['-c ', num2str(bestc), ' -g ',num2str(bestg), ' -b 1 '];
model = svmtrain(double(trainLabel), trainData, cmd2);
% get probability estimates of test instances
[pred,acc,preb] = svmpredict(double(testLabel), testData, model, '-b 1');
disp(pred);
ac = acc(1); % accuracy in percent
disp(['the accuracy is: ' num2str(ac) '%']);
CM=confusionmat(testLabel,pred); % rows: ground truth, columns: prediction
imagesc(CM);
colormap(flipud(gray));
axis xy;
xlabel('Prediction'); % x axis runs over the columns of CM
ylabel('Groundtruth'); % y axis runs over the rows of CM
end
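The layout of the confusion matrix is easy to mix up, so here is a small sketch that builds one by hand in base MATLAB and derives the accuracy from it. The labels and predictions are made up; accumarray is used as a stand-in for confusionmat and follows the same convention (rows are the ground truth, columns are the predictions).

```matlab
testLabel = [1; 1; 2; 2; 3; 3];            % hypothetical ground truth
pred      = [1; 2; 2; 2; 3; 1];            % hypothetical predictions
K = 3;                                     % number of classes

% rows = ground truth, columns = prediction
CM = accumarray([testLabel, pred], 1, [K K]);
accuracy = 100 * trace(CM) / sum(CM(:));   % percentage of correct predictions

disp(CM);
fprintf('accuracy: %.2f%%\n', accuracy);
```

When such a matrix is shown with imagesc, the columns run along the x axis, so x corresponds to the prediction and y to the ground truth.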