"Perceptual evaluation of music mixing practices," 138th Convention of the Audio Engineering Society, Warsaw, 7 May 2015

I will present the following paper

Brecht De Man, Matthew Boerum, Brett Leonard, Richard King, George Massenburg and Joshua D. Reiss, “Perceptual evaluation of music mixing practices,” 138th Convention of the Audio Engineering Society, May 2015.

on Thursday 7 May at 3pm CET in the session “P3 - (Lecture) Recording and Production” at the upcoming Convention of the Audio Engineering Society, in Warsaw, Poland.

Abstract:

The relation of music production practices to preference is still poorly understood. Due to the highly complex process of mixing music, few studies have been able to reliably investigate mixing engineering, as investigating one process parameter or feature without considering the correlation with other parameters inevitably oversimplifies the problem. In this paper we present an experiment where different mixes of different songs, obtained with a representative set of audio engineering tools, are rated by experienced subjects. The relation between the perceived mix quality and sonic features extracted from the mixes is investigated, and we find that a number of features correlate with quality.

The paper will be available from the AES E-Library from next week onwards.

Resources can be found here:

raw tracks, mixes, and DAW files (Avid Pro Tools 10) for 6 out of 10 songs in the test can be found here, and will be searchable on the Open Multitrack Testbed;
code for the MATLAB-based listening test interface I designed can be found on SoundSoftware;
the impulse responses from the left and right speaker in the CIRMMT Critical Listening lab (44.1 kHz/24 bit PCM WAV files), with the power spectral density plotted below.

PSD left

PSD right

Impulse responses recorded and graphs plotted by Brett Leonard, with the MATLAB script below.

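clear all; close all;
tic;
% Run this from the folder that contains the impulse responses (IRs).
[irL, fs] = audioread('Left-01.wav');    % left speaker IR
[irR, fs2] = audioread('Right-01.wav');  % right speaker IR
assert(fs == fs2, 'Left and right IRs must have the same sample rate.');

% Left speaker: periodogram estimate, zero-padded to the next power of two
figure(1);
nfft = 2^nextpow2(length(irL));
Pxx = abs(fft(irL,nfft)).^2/length(irL)/fs;

% Create a single-sided spectrum from the first half of the FFT bins
% (dspdata.psd is part of the Signal Processing Toolbox)
Hpsd = dspdata.psd(Pxx(1:length(Pxx)/2),'Fs',fs);
plot(Hpsd);
ylim([-220 -80]);

% Right speaker: same procedure
figure(2);
nfft = 2^nextpow2(length(irR));
Pxx = abs(fft(irR,nfft)).^2/length(irR)/fs;

% Create a single-sided spectrum from the first half of the FFT bins
Hpsd = dspdata.psd(Pxx(1:length(Pxx)/2),'Fs',fs);
plot(Hpsd);
ylim([-220 -80]);
toc;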
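
For reference, the estimate computed by the script above can be written as a periodogram. This is a reading of the code rather than a formula from the paper, and the symbols are introduced here for illustration only: h[n] is the impulse response of length N, N_fft is the zero-padded FFT length, and f_s is the sample rate.

```latex
\hat{P}[k] = \frac{1}{N f_s}
  \left| \sum_{n=0}^{N-1} h[n]\, e^{-j 2\pi k n / N_{\mathrm{fft}}} \right|^{2},
\qquad k = 0, 1, \dots, \frac{N_{\mathrm{fft}}}{2} - 1
```

Only the first half of the bins is kept, and the result is plotted on a dB scale. A conventional one-sided PSD would also double the bins between DC and Nyquist. The script does not appear to do so, which would lower the curve by about 3 dB without changing its shape.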

Furthermore, the audio was resampled from 96 kHz or 88.2 kHz to 44.1 kHz using SoX, so that features are extracted from a more perceptually relevant frequency range. The performance of SoX can be compared to other resampling algorithms at src.infinitewave.ca.

HQ batch #audio resampling: for i in *.wav; do sox $i -b 24 temp.wav rate -v 48k; done
w/ @TheBaronHimself #SoX pic.twitter.com/6KfaHBQ0jP
— Brecht De Man (@BrechtDeMan) February 12, 2015
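
The one-liner in the tweet targets 48 kHz and writes every result to the same temp.wav. A minimal sketch of a batch version that keeps one output file per input and defaults to 44.1 kHz could look like the following. The function name `resample_all`, its folder arguments and the default rate are hypothetical and chosen for illustration; this is not necessarily the exact command used for the paper. Only the SoX options already shown in the tweet (`-b 24` and `rate -v`) are used.

```shell
#!/bin/sh
# Hypothetical helper (name and arguments are illustrative assumptions):
# resample every WAV file in a source folder with SoX, writing 24-bit
# files of the same name into a separate output folder.
resample_all() {
    src_dir=$1
    out_dir=$2
    rate=${3:-44100}            # assumed default: 44.1 kHz, as in the text
    mkdir -p "$out_dir" || return 1
    for f in "$src_dir"/*.wav; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$f" ] || continue
        # -b 24: 24-bit output; rate -v: very high quality resampling
        sox "$f" -b 24 "$out_dir/$(basename "$f")" rate -v "$rate" || return 1
    done
}
```

It could be called as `resample_all ./mixes_96k ./mixes_44k1` (folder names made up for this example). Quoting `"$f"` keeps file names with spaces intact, which the unquoted `$i` in the one-liner would split.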