A .f0 file is an ASCII file with two columns containing the time in seconds of the analysed frames and the corresponding fundamental frequencies in Hertz:
time_in_secs_1   f0_value_1
time_in_secs_2   f0_value_2
time_in_secs_3   f0_value_3
      .              .
      .              .
      .              .
Here is an example of an .f0 file :
0.020000   117.934227
0.030000    74.105606
0.040000    98.863930
    .           .
    .           .
    .           .
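As an illustration of the two-column layout, here is a minimal Python sketch that reads the contents of an .f0 file into two lists. The function name parse_f0 is hypothetical and is not part of the f0 program; the sketch assumes whitespace-separated columns and silently skips lines that are not two numbers.

```python
def parse_f0(text):
    """Return (times, f0s) read from the text of a two-column .f0 file.

    Hypothetical helper: it assumes whitespace-separated columns and
    ignores any line that does not hold exactly two numbers.
    """
    times, f0s = [], []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # blank line or unexpected column count
        try:
            time_s, f0_hz = float(fields[0]), float(fields[1])
        except ValueError:
            continue  # non-numeric filler such as ". ."
        times.append(time_s)
        f0s.append(f0_hz)
    return times, f0s


# The three frames of the example file shown above.
sample = """0.020000 117.934227
0.030000 74.105606
0.040000 98.863930
"""
times, f0s = parse_f0(sample)
```

To read a real file, the same function can be applied to the result of open(path).read().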
Here is the "text" conversion of an .f0.sdif file :
SDIF

1NVT
{
StreamID 0;
Date Wed_Jun__7_15.58.58_2000_;
TableName GenericBreakPointFunction;
WrittenBy Pm_Version_1.2.2;
}

SDFC

1FQ0 1 0 0.02
  1FQ0 0x0004 1 1
  117.934

1FQ0 1 0 0.03
  1FQ0 0x0004 1 1
  74.1056

1FQ0 1 0 0.04
  1FQ0 0x0004 1 1
  98.8639

. . .

ENDC
ENDF

See option -L for the specifications of .f0.long and .f0.long.sdif files.
f0 is available on Unix platforms at Ircam and is called from a command line using the options below :
-twidth : FFT width in samples (default : 512).
It is the size of the FFT applied to the signal frame after zero padding.
It should be a power of two and greater than or equal to the number of samples
in the analysed signal frame.
-istep : analysis step in samples (default : 64).
After each frame analysis, the signal window (or frame) is advanced
by this step.
-Nwidth : analysis window width in samples
(default : 1024).
This is the size in samples of the signal window (or frame) which is analysed
by FFT at each step. For spectral peaks to appear separated in
the FFT analysis,
the duration of the signal window should be at least 3 times the inverse of
the smallest distance in Hertz between the peaks or partials which should
be detected in the analysis.
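The sizing rules for -N and -t can be turned into a small calculation. The Python sketch below is only an illustration: the function names, the 44100 Hz sampling rate and the 100 Hz partial spacing are assumptions chosen for the example. It converts "3 times the inverse of the smallest spacing" from seconds to samples, then picks the smallest power of two that holds the window.

```python
import math


def min_window_samples(sample_rate_hz, min_partial_spacing_hz, factor=3.0):
    """Smallest window, in samples, lasting `factor` periods of the spacing.

    The window duration in seconds is factor / spacing; multiplying by the
    sampling rate converts it to samples.
    """
    return math.ceil(factor * sample_rate_hz / min_partial_spacing_hz)


def fft_width_for(window_samples):
    """Smallest power of two greater than or equal to the window size."""
    width = 1
    while width < window_samples:
        width *= 2
    return width


# Assumed example values: 44100 Hz sound, partials at least 100 Hz apart.
n_width = min_window_samples(44100, 100.0)  # 3 * 44100 / 100 = 1323
t_width = fft_width_for(n_width)            # next power of two: 2048
```

With these assumed values, the window would be given as -N1323 and the FFT width as -t2048.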
-ooName : output file name.
If this option is not used, the output is the standard output.
-fmf0_min
-fMf0_max :
Lower and upper limits of the interval within which fundamental
frequencies are searched. Defaults are 50 and 200 Hz respectively. It
is safer, when possible, to limit the interval [f0_min, f0_max] to one
octave. The interval is best adjusted after a first f0 detection pass, before
starting another. In any case, compare the fundamental frequency of
your original file (e.g. by ear or by looking at the spacing of the
partials on a spectrum, as in the image above) to the -fm and -fM
limits and to the result of the analysis: octave errors, among others, are
frequent and often result from wrong settings of f0_min and f0_max.
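The one-octave recommendation can be checked before running an analysis. The following Python sketch is an illustration only; both function names are hypothetical, and centring the interval geometrically on a rough pitch estimate is an assumption, not something the f0 program does.

```python
import math


def spans_more_than_octave(f0_min, f0_max):
    """True when [f0_min, f0_max] is wider than one octave (ratio above 2)."""
    return f0_max > 2.0 * f0_min


def one_octave_around(rough_f0_hz):
    """One-octave interval centred geometrically on a rough estimate.

    Assumption of this sketch: the estimate sits half an octave above
    f0_min and half an octave below f0_max.
    """
    return rough_f0_hz / math.sqrt(2.0), rough_f0_hz * math.sqrt(2.0)


# The default limits (50 and 200 Hz) cover two octaves.
defaults_too_wide = spans_more_than_octave(50.0, 200.0)

# Narrower limits around a rough estimate of 110 Hz (a made-up value).
f0_min, f0_max = one_octave_around(110.0)
```

The resulting f0_min and f0_max would then be passed with -fm and -fM.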
-fvmf0_min_file
-fvMf0_max_file :
f0_min_file and f0_max_file are files which contain, respectively, time-varying lower limits and time-varying upper limits of the interval within which fundamental frequencies are searched. They are ASCII files with two columns containing the time in seconds and the corresponding limit frequency in Hertz (i.e. what is known as a break-point function file or piecewise-linear function):
time_in_secs_1   f0_value_1
time_in_secs_2   f0_value_2
time_in_secs_3   f0_value_3
      .              .
      .              .
      .              .

N.B. time_in_secs_i does not have to be the time of analysed frame i: for each frame, the value of the limit frequency is obtained by linear interpolation.
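The linear interpolation of a limit frequency at a frame time can be sketched as follows in Python. This is not the program's own code: the function name is hypothetical, and holding the first or last value for times outside the break points is an assumption of the sketch, since the behaviour outside the given range is not specified here.

```python
from bisect import bisect_right


def interpolate_limit(breakpoints, t):
    """Limit frequency at time t from (time_in_secs, freq_hz) break points.

    `breakpoints` must be sorted by time. Outside the covered range the
    nearest end value is held (an assumption of this sketch).
    """
    times = [time_s for time_s, _ in breakpoints]
    if t <= times[0]:
        return breakpoints[0][1]
    if t >= times[-1]:
        return breakpoints[-1][1]
    i = bisect_right(times, t)  # guarantees times[i - 1] <= t < times[i]
    (t0, v0), (t1, v1) = breakpoints[i - 1], breakpoints[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


# Made-up lower-limit curve rising from 50 Hz to 100 Hz over one second.
f0_min_curve = [(0.0, 50.0), (1.0, 100.0)]
limit_at_half_second = interpolate_limit(f0_min_curve, 0.5)
```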
-Ffreq : The f0 estimation is based on the regular spacing of possible harmonic partials up to this frequency. Only partial frequencies below freq are considered (default : fe/2).
-fdfreq : When the spectrum is equal to zero, f0 is set to freq (default : 0).
-bnumber : number of the first FFT analysis taken into account (default : 1).
-enumber : number of the last FFT analysis taken into account (default : last one).
-snfloor : This value specifies the noise floor to be used when calculating the fundamental frequency of the sound file (default : 50 dB).
-jsubdivision : histogram bands subdivision (default : 2).
-lpercentage : f0 percentage for the probability law (default : 0.4).
-P : frame times start at i/fe - Nwidth/2fe instead of Nwidth/2fe.
-S : SDIF output of long or short f0 results.
-L : long f0 output file
By default, the format of the fundamental frequency file is as described above (.f0 and .f0.sdif files). The option -L causes the fundamental frequency file format to be extended with more information :
A .f0.long file is an ASCII file with five columns containing the time in seconds of the analysed frames and the corresponding fundamental frequency, score, real amplitude and confidence (of a good pitch) :
time_in_secs_1   f0_value_1   score_1   real_amplitude_1   confidence_1
time_in_secs_2   f0_value_2   score_2   real_amplitude_2   confidence_2
time_in_secs_3   f0_value_3   score_3   real_amplitude_3   confidence_3
      .              .           .             .                 .
      .              .           .             .                 .
      .              .           .             .                 .
Here is an example of an .f0.long file :
0.020000   117.934227    72.390953   0.000863   0.534666
0.030000    74.105606   121.991577   0.001941   0.458296
0.040000    98.863930    91.781097   0.001888   0.509323
    .           .            .           .          .
    .           .            .           .          .
    .           .            .           .          .
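The five columns can be given names when the file is read. The Python sketch below is a hypothetical reader, not part of the f0 program; the field names follow the column list above, and the 0.5 confidence threshold used in the usage lines is an arbitrary value chosen for illustration.

```python
from typing import NamedTuple


class LongFrame(NamedTuple):
    """One row of a .f0.long file (field names chosen for this sketch)."""
    time_s: float
    f0_hz: float
    score: float
    real_amplitude: float
    confidence: float


def parse_f0_long(text):
    """Return a list of LongFrame rows, skipping lines that do not fit."""
    frames = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # blank line or unexpected column count
        try:
            frames.append(LongFrame(*(float(field) for field in fields)))
        except ValueError:
            continue  # non-numeric filler such as ". . . . ."
    return frames


# The three frames of the example file shown above.
sample = """0.020000 117.934227 72.390953 0.000863 0.534666
0.030000 74.105606 121.991577 0.001941 0.458296
0.040000 98.863930 91.781097 0.001888 0.509323
"""
frames = parse_f0_long(sample)

# Keep only frames above an arbitrary, illustrative confidence threshold.
kept_times = [frame.time_s for frame in frames if frame.confidence > 0.5]
```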
Here is the "text" conversion of an .f0.long.sdif file :
SDIF

1NVT
{
StreamID 0;
Date Fri_Jun__9_11.41.21_2000_;
TableName GenericBreakPointFunction;
WrittenBy Pm_Version_1.2.2;
}

SDFC

1FQ0 1 1073832416 0.02
  1FQ0 0x0004 1 4
  117.934 72.391 0.000862592 0.534666

1FQ0 1 1073832416 0.03
  1FQ0 0x0004 1 4
  74.1056 121.992 0.00194089 0.458296

1FQ0 1 1073832416 0.04
  1FQ0 0x0004 1 4
  98.8639 91.7811 0.00188833 0.509323

. . .

-H : output whole histogram instead of max value.