Guide: stereo to binaural conversion for headphone listening
Posted: Tue Jun 03, 2014 9:30 pm
Hello everyone!
Some folks at the MQN thread have asked me to share my quest into binaural headphone playback, so I decided to start this new thread. I’m afraid the first post will be somewhat long, but I think it’s going to be worth it. I published part of it before somewhere, but have added a lot of new bits. To make it all more bearable (and fun), I’ll use easily digestible episodes. :-)
Part 1: Introduction
First, an appetizer, download: Virtual Barbershop
If you know it already, great; if you don’t, listen to it on headphones straight up, and be amazed! Now THAT is what I would call a 3-axis, 360-degree sound stage!
I’m going to share with you my quest for a similarly effortless, realistic, natural, life-like, transparent sound stage through headphones. I thought to myself: wouldn’t it be absolutely great, if I could listen to my favourite music with the same life-like presence as the virtual barbershop. That’s what started me on my quest and although in the end I have not managed to place my favourite musicians in a virtual space exactly as sharply defined as the barbershop, I have come very close!
Here are some pre-processed samples in 16/44 flac I've created by the process I'll explain here. They can be played directly through headphones and will give you a good idea of what to expect:
Binaural Samples
Part 2: All the World’s a Sound Stage?
I guess this chapter is not going to be news to most of you, but I’ll include it anyway to paint a complete picture and as preparation for what comes next. From the very beginning, music recording has been focused on reproduction through loudspeakers rather than headphones. That’s why almost all recordings to date are “stereophonic” recordings. In modern sound studios stereo mixes are created from multiple mono tracks, but in the old days a stereo recording was made by placing two microphones a certain distance apart, and realistic playback of a 3-dimensional “stereo-image” was possible through two speakers similarly placed a certain distance apart, a phenomenon all of us know very well. This type of recording was and still is meant to be heard through a pair of loudspeakers in order to unfold and re-create its inherent 3D-image or sound stage. If heard through headphones however, each channel that’s supposed to be heard by both the left and the right ear, is instead heard only by one ear, causing the stereo-image to collapse into a flat line between both ears.
This is the (not much of a) “sound stage” that we perceive through headphones while listening to stereophonic music recordings in our natural state of hearing. Note the two issues I have underlined, which I will get into separately.
Part 3: Binaural Minority Report
The Virtual Barbershop (VB) is NOT a stereophonic recording. It’s a binaural recording, tweaked digitally by way of a proprietary algorithm. The binaural recording technique is one specifically designed to be played back through headphones. Two microphones are placed in a dummy head, where our eardrums are located. If the dummy were an exact plaster cast of our own head and ears, we would have no need of digitally enhancing the recording. Anything would then sound exactly like the VB. I’m sure you can understand why. In order for it to create its 3D realism to such an extent in no matter which pair of human ears, the digital algorithm that’s whispered into your ear at the end of the clip is used. It enhances the so-called head-related transfer functions (HRTF) of the recorded sounds. This is what creates the main difference between the perception of front and rear sounds. Of the very few binaural recordings that are made, only some will give you that exact front/rear positioning like the VB. I'll tell you why: some binaural recordings are recorded with a Jecklin Disc, or a dummy head without ears, so typically the perceived space is placed either 180 degrees behind you OR 180 degrees in front of you, rather than the full 360 like the VB. It’s our ears, and in this case I mean those funny pieces of meat sticking out of the sides of our heads, that allow us to discern between a sound coming from the front or the rear. They screen the sounds coming from the rear more than they do the sounds coming from the front. The way sound is altered by our outer ear is described by these HRTFs. So the first clue I followed was the mysterious algorithm that was whispered in my left ear. But first, as I promised above, I’d like to share my experience with natural hearing and the lack thereof!
Part 4: The Red or the Blue Pill
Our brain is an amazing thing capable of performing awe-inspiring feats. As we come into the world, our ears (that is our brain) don’t have as yet the capacity to locate sounds. We have to slowly start learning to interpret those slight phase-shifts in sound, those reflections and diffractions that are caused by the unique shape of our ears and our head (HRTF). We would lose that capability if we suddenly lost our ears or were outfitted with differently shaped ears, at least at first, but as we got used to those new ears, we would slowly gain that capability again. This shows that we are able to “re-program” our brain in order to preserve our capacity for 3-dimensional hearing. In case of the new set of ears we merely have to continue our inherent capability for “natural” hearing, based on those subtle HRTF cues, so although the transitional adaptation period might be slightly confusing and tiring for our brain, once re-programmed, we are again able to listen effortlessly to the sounds in the world around us. Now, what does this have to do with anything?
Let’s talk about headphone fatigue. This is the reason why I personally always preferred listening to speakers rather than headphones. While we listen to stereophonically recorded music through headphones, our brain is receiving auditory information that is in some way distorted and unnatural. Some aspects of it, like the frequency spectrum and the timing, are OK, but the directional cues are plainly NOT there in the way our brain is used to receiving them. So rather than give up and leave us with the narrow between-the-ears stereo-image we actually perceive in that natural state, our brain starts the process of re-programming itself in order to re-instate the illusion of natural positional hearing. This takes time and effort. It does cause fatigue, but after some time our wonderful brain IS actually able to have us believe that we are listening to a speaker-like sound stage. And the more we get used to it, the less fatiguing it gets and we are happy. It’s not exactly natural, and it still does take some small effort for the brain to maintain the illusion, but it kind of works, and at least there are absolutely no changes made to the frequency spectrum, the timing or resonant harmonics of the source.
For a long time this was the one and only choice available for headphone listening, but now I’m going to offer you a pill of a different colour. What if we could spare the brain the initial time and effort to re-program itself for headphone listening and the continuous effort it takes to uphold an illusion? I believe that fatigue still occurs in almost everyone who's gotten used to headphone listening, because it just takes much more effort to translate those invalid auditory cues into a coherent sound stage, at least compared to the natural HRTF phase-based cues.
I’m not the first one to get the idea of some sort of pre-processing to make the sound more natural. Some headphone amp makers started experimenting with hardware-based cross-feed circuits, so that’s one of the things I started experimenting with.
Part 5: Cross-Feed, just a gimmick?
Cross-feed, as the name implies, feeds a little bit of the left channel into the right, and vice versa.
I found out that there were actually some Foobar plugins offering software cross-feed. There's the Bauer stereophonic-to-binaural DSP. The name was very promising and having played around with it and its settings, I liked it. It emulates various hardware based cross-feed circuits and makes subtle changes to the sound. It helps the brain a little bit more with deciphering spatial cues and building a small-scale sound stage, but still leaves something for the brain to do: expanding the soundstage outward; all in all, a good compromise, but not the end of the journey. At this point in time I found the virtual barbershop demo, clearly demonstrating that even more should be possible, so I started digging deeper.
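To make the principle concrete, here is a minimal Python sketch of cross-feed in its barest form. It is not what the Bauer DSP or any hardware circuit actually does: real cross-feed also low-pass filters and slightly delays the signal that is bled across, and this sketch leaves both out. The function name and the `amount` parameter are made up for illustration.

```python
def crossfeed(left, right, amount=0.3):
    """Mix a fraction of each channel into the opposite one.

    `amount` is a hypothetical control: 0.0 leaves the stereo signal
    untouched, 1.0 gives both ears the same mono mix.
    No filtering or delay is applied (a simplification).
    """
    if not 0.0 <= amount <= 1.0:
        raise ValueError("amount must be between 0 and 1")
    # Divide by (1 + amount) so a full-scale mono signal stays at full scale.
    norm = 1.0 + amount
    out_left = [(l + amount * r) / norm for l, r in zip(left, right)]
    out_right = [(r + amount * l) / norm for l, r in zip(left, right)]
    return out_left, out_right


# A sound that exists only in the left channel now also reaches the right ear.
left_out, right_out = crossfeed([1.0, 0.5], [0.0, 0.0], amount=0.5)
print(left_out, right_out)
```

With `amount=0.5` the right ear receives the left-only sound at half the level of the left ear, which is the kind of inter-ear leakage loudspeakers produce naturally.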
Part 6: Positional Audio
I started searching for that mysterious Cetera algorithm responsible for the WOW-factor in the Virtual Barbershop. I found that the demo was created by a manufacturer of hearing aids called Starkey. The Cetera algorithm was the software part of a hearing aid developed in the late nineties, in cooperation with another company called QSound Labs. This company, then as well as now, specializes in a wide spectrum of 3D audio solutions. Their technology is implemented in various ways, software as well as hardware based. In fact, I discovered that since the nineties various companies had started research into 3D audio, both for studio purposes, like music recording and movie surround tracks, as well as positional audio for the PC (think first-person shooters). SRS Labs, for instance, has worked in the same field as QSound. Both these companies have developed software packages for the PC, able to process and enhance sound and music in a variety of ways, including headphone surround. I’ll not go into details here, as their products are usually shipped with certain hardware, like PC sound cards, or if sold separately, are only usable as part of the operating system, which means upsampling/downsampling, etc., so that doesn’t really serve the purpose of audiophile music listening.
There’s one company however that I haven't mentioned yet, Lake Technology. This Australian company developed digital audio algorithms for recording studios. One of their algorithms allowed movie studio technicians to use headphones to work with and monitor 5.1 movie tracks. After Dolby Laboratories licensed the technology and then even bought the whole company, it became known as Dolby Headphone. Being a company with a slightly different focus compared to the others mentioned before, Dolby Headphone was licensed to manufacturers of DVD-players and other home-theatre equipment, where its algorithms were hardwired into the signal path.
Investigating Dolby Headphone I stumbled upon a thread at the Hydrogenaudio forum and discovered that someone had had similar thoughts already, and that turned out to be a significant discovery.
Part 7: Dolby Headphone Wrapper
Someone had already developed a great piece of software, called the Dolby Headphone Wrapper (DHW). It’s now an official 3rd party Foobar plugin and, used correctly and in combination with certain other plugins, it goes far beyond what regular cross-feed processing can do.
The Dolby Headphone algorithm is not only built into stand-alone dvd-players, but is also part of a number of commercial software dvd-players for the pc. One little file in particular takes care of it: dolbyHph.dll and the Foobar plugin utilizes that file. It is possibly not 100% legal to distribute it, but there are trial versions of software dvd-players available for download that include that dll-file.
The wrapper converts a 5.1 channel input into a binaural 2-channel output for headphones. Dolby Headphone does work with a 2-channel input as well, but the result won't be as good.
So what can we put before DHW to change a 2-channel stereophonic track into a 5.1 surround track? For almost all my music - be it rock, classic, pop or folk - I listen to through headphones, I use my customized DSP chain based on Dolby Headphone. With it I experience something better than a speaker-like soundstage. I feel as if I'm smack in the middle of a live soundstage. Words cannot begin to describe what these DSP’s do to any source of music, no matter how it’s recorded.
So what is the missing link?
Part 8: The Icing on the Cake
A guy named Steve Thomson created a free piece of software called V.I. Stereo to 5.1 Converter VST Plugin Suite (VI) that incorporates a number of algorithms (e.g. ambisonics) to place sounds into the proper place in the 3-dimensional sound stage. It creates a living, breathing atmosphere out of the slightest auditory cues available in the original signal. No matter how the recording is made, as long as it’s stereophonic and not already binaural, VI will create a 360-degree image that is absolutely believable. It is VI that is responsible for placing echoes, resonances and other subtle or not so subtle cues at the proper place in the virtual sound stage, without ever overdoing it in such a way that it’s perceived as unnatural. A singer for instance is typically placed front center, but the acoustic reverberation of the voice that is part of the original recording is placed all around the listener, just as it would be if the singer were standing before you in a real room. And this applies to all instruments and sounds. The result is impressive. Of course some recordings work better than others with it, but all in all it’s pretty amazing how intelligently VI and Dolby Headphone work together to create such a realistic sound space. I have spent considerable time finding the setting for VI where the focus and front/rear division are most realistic. I suggest you start with this and only if you are the experimenting type, change the settings and see if your taste is different than mine.
Part 9: Putting it all together
So, what do you need?
Download the package I prepared on Dropbox. It contains Dolby Headphone Wrapper (foo_dsp_dolbyhp.dll), VI Suite (VI_Setup.zip), VST adapter plugin (foo_vst.dll), SoX resampler (foo_dsp_resampler.dll) and another unnamed but necessary file. Install VI Suite according to the instructions included with it. Place the three Foobar plugins in the components folder of your foobar2000 installation folder. Start Foobar. Open Preferences and go to Components>VST plug-ins. Click Add, navigate to the folder where VI Suite was installed and add VI.dll to the VST list. Click OK and restart Foobar. Then open Preferences again and go to Playback>DSP Manager. Move Dolby Headphone and VI from the right to the left pane to activate these DSPs. If you have a DAC that works at 24-bit/96 kHz, I suggest you add the Resampler (SoX) to the list as well. Make sure the DSPs are listed in the following order: VI, DH, SoX.
Configure Dolby Headphone (click on DSP in list and then click "Configure selected"). Point the wrapper to the dolbyHph.dll file you must have sourced somehow :-) and saved on your PC somewhere (I suggest the Foobar folder). There are 3 choices for your virtual room. I tend to use the DH2 live room, as this is a good compromise between directness and spaciousness. The DH1 reference room is smaller, so less reverb, and the DH3 movie theater is large, so will create a very spacious effect, really impressive and pleasant to just let wash over you, but muddles detail somewhat. It's up to personal preference, but the VI settings I use and my converted samples are all based on room 2. Set amplification at 100% and make sure to leave Dynamic Compression off.
Configure VI. A red settings screen pops up. There are 4 sliders and 3 buttons. The top on/off button should obviously be on. Leave or switch the other buttons off. The sliders are each divided into 100 units. I call the centre point 0, with the leftmost at -50 and the rightmost at 50. Set them as follows:
Width correction: -15
Front ambience: 0
Rear ambience: -10
Rear level: -45
I have spent a lot of time figuring out these optimal settings. They are of course subject to personal preference, so feel free to experiment yourself. I'll warn you though: a few notches can cause the illusion to collapse.
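If it helps when dialling these in, the sketch below converts the centre-zero numbers above into a position counted from the left end of each slider. That a slider can be read as 0 to 100 from the left is an assumption derived from the 100 units mentioned above; the dictionary and function names are hypothetical and not part of VI.

```python
# Centre-zero values as listed above (-50 = leftmost, 0 = centre, 50 = rightmost).
VI_SETTINGS = {
    "Width correction": -15,
    "Front ambience": 0,
    "Rear ambience": -10,
    "Rear level": -45,
}


def to_slider_position(centred_value):
    """Convert a centre-zero value to units counted from the slider's left end.

    Assumes the slider spans 0..100 from left to right (an assumption,
    not something VI documents here).
    """
    if not -50 <= centred_value <= 50:
        raise ValueError("value must lie between -50 and 50")
    return centred_value + 50


for name, value in VI_SETTINGS.items():
    print(f"{name}: {value:+d} -> {to_slider_position(value)} units from the left")
```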
If you decided to resample, configure SoX by setting target samplerate to 96000, quality to best, and leave the rest of the settings as they are.
I also use ReplayGain in the conversion, because especially with test tracks I don't like continuously having to fiddle with the volume knob. If you don't know what it is, or know already you don't want to use it, just skip this paragraph. I set +3dB for tracks without RG info and +6dB for the ones with RG info. For test tracks I choose track source mode and for listening to a whole album I obviously choose album source mode. This also offsets the loss of gain the DSP chain induces. I just apply gain, I don't select prevent clipping according to peak, as this is a useless feature that more often than not nullifies the whole purpose of using RG. If you decide to use RG as well, just add the Advanced Limiter DSP and place it last in the list. This doesn't need to be configured. It's just a simple filter that only touches samples that are actually clipping, and will only ever change anything in rare cases where average gain level is low and peaks are relatively high.
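For anyone wondering what those decibel figures do to the signal: a gain in dB corresponds to a linear scale factor of 10^(dB/20). The sketch below works that out, assuming the +3 dB and +6 dB figures act as preamp values, with the +6 dB added on top of the gain stored in the track's RG info. The function names are hypothetical; this is only the arithmetic, not foobar2000's code.

```python
def replaygain_scale(rg_gain_db, preamp_with_rg_db=6.0, preamp_without_rg_db=3.0):
    """Return the linear factor applied to every sample.

    rg_gain_db: the stored ReplayGain value in dB, or None if the track has none.
    The two preamp defaults are the values mentioned above.
    """
    if rg_gain_db is None:
        total_db = preamp_without_rg_db
    else:
        total_db = rg_gain_db + preamp_with_rg_db
    return 10 ** (total_db / 20)


def would_clip(peak, scale):
    """True if a sample of magnitude `peak` (1.0 = full scale) exceeds full scale after scaling."""
    return peak * scale > 1.0


print(replaygain_scale(None))                    # track without RG info, +3 dB
print(replaygain_scale(-6.0))                    # RG says -6 dB, preamp +6 dB
print(would_clip(0.9, replaygain_scale(None)))   # loud peak plus +3 dB
```

The last line shows why a limiter at the end of the chain is useful when clipping prevention is switched off: a peak at 0.9 of full scale would go over once +3 dB is applied.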
When you've got your DSP chain configured, type a name into the empty field under DSP chain presets and click Save. You can now load this DSP chain with all its settings by simply selecting it in the preset list and clicking Load. The point however is not to use the chain in real time (although you could of course do that too, if you'd like), but only to use Foobar to convert the original file into a binaural version. For ease of use you should now create a conversion preset. Right click on any file in your library and select Convert and then the ... (three dots at the bottom).
You are now in the Converter Setup window. There are 4 main parts, which you reach by clicking on the links. Start at the top with Output format. I suggest you use WAV, but a losslessly compressed format such as FLAC is fine too. If your DAC only supports 16/44, select 16-bit under Output bit depth and under Dither select always. If your DAC supports 24/96, select 24-bit and under Dither select never, or lossy sources only if you sometimes convert MP3s or other lossy formats.
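Why dither at 16-bit? The DSP chain produces sample values far finer than 16 bits can hold, and simply rounding them ties the rounding error to the music. Adding a tiny amount of random noise before rounding turns that error into a steady, benign hiss. Below is a minimal sketch of generic TPDF (triangular) dither; Foobar's own dither may well work differently (for instance with noise shaping), and the function name is made up.

```python
import random


def quantize_16bit(sample, dither=True, rng=random):
    """Convert a float sample in [-1.0, 1.0] to a 16-bit integer value.

    With dither=True, triangular noise of up to +/-1 LSB is added before
    rounding (generic TPDF dither, an assumption about the method).
    """
    scaled = sample * 32767.0
    if dither:
        # The difference of two uniform values has a triangular distribution.
        scaled += rng.random() - rng.random()
    value = int(round(scaled))
    # Clamp to the valid 16-bit range in case the noise pushed it over.
    return max(-32768, min(32767, value))


rng = random.Random(0)  # seeded only to make the demo repeatable
print(quantize_16bit(0.25, dither=False))
print([quantize_16bit(0.25, rng=rng) for _ in range(8)])
```

Without dither the same input always lands on the same integer; with dither it wanders between neighbouring values, which averages out to the true level.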
Go back and go to the Destination part. Just read about the various output destination options. That pretty much speaks for itself.
Leave the last link, Other, as it is: "When finished do nothing".
Now we come to the most important part: Processing. Open the submenu. If you want to use ReplayGain, set that up under the relevant header. Now you can simply select the DSP preset you created above and load it. The Active DSPs list should then be populated with your previously selected and configured plugins. Go back. Click the Save button. Select Create new preset. Give it a name and press Enter. You now have created a Binaural Conversion Preset. All you need to do for future use is right click on a track, go to Convert and then click on the preset name you've chosen before. Your file will be converted and saved in the location and under the name you've setup in the Destination part. You can now play this new file on your headphones, preferably with a good headphone amp or DAC with a headphone output, and using a top class music player like MQn.
Wow, that was a long story. Thanks for your patience if you managed to make it this far. :-)
I hope this will give you as much musical pleasure as it has given me already for a number of years.
Enjoy!
Some folks at the MQN thread have asked me to share my quest into binaural headphone playback, so I decided to start this new thread. I’m afraid the first post will be somewhat long, but I think it’s going to be worth it. I published part of it before somewhere, but have added a lot of new bits. To make it all more bearable (and fun), I’ll use easily digestible episodes. :-)
Part 1: Introduction
First, an appetizer, download: Virtual Barbershop
If you know it already, great, if you don’t, listen to it on headphones straight up, and be amazed! Now THAT is what I would call a 3-axis 360 degree sound stage!
I’m going to share with you my quest for a similarly effortless, realistic, natural, life-like, transparent sound stage through headphones. I thought to myself: wouldn’t it be absolutely great, if I could listen to my favourite music with the same life-like presence as the virtual barbershop. That’s what started me on my quest and although in the end I have not managed to place my favourite musicians in a virtual space exactly as sharply defined as the barbershop, I have come very close!
Here are some pre-processed samples in 16/44 flac I've created by the process I'll explain here. They can be played directly through headphones and will give you a good idea of what to expect:
Binaural Samples
Part 2: All the World’s a Sound Stage?
I guess this chapter is not going to be news to most of you, but I’ll include it anyway to paint a complete picture and as preparation for what comes next. From the very beginning, music recording has been focused on reproduction through loudspeakers rather than headphones. That’s why almost all recordings to date are “stereophonic” recordings. In modern sound studios stereo mixes are created from multiple mono tracks, but in the old days a stereo recording was made by placing two microphones a certain distance apart, and realistic playback of a 3-dimensional “stereo-image” was possible through two speakers similarly placed a certain distance apart, a phenomenon all of us know very well. This type of recording was and still is meant to be heard through a pair of loudspeakers in order to unfold and re-create its inherent 3D-image or sound stage. If heard through headphones however, each channel that’s supposed to be heard by both the left and the right ear, is instead heard only by one ear, causing the stereo-image to collapse into a flat line between both ears.
This is the (not much of a) “sound stage” that we perceive through headphones while listening to stereophonic music recordings in our natural state of hearing. Note the two issues I have underlined, which I will get into separately.
Part 3: Binaural Minority Report
The Virtual Barbershop (VB) is NOT a stereophonic recording. It’s a binaural recording, tweaked digitally by way of a proprietary algorithm. The binaural recording technique is one specifically designed to be played back through headphones. Two microphones are placed in a dummy head, where our eardrums are located. If the dummy would be an exact plastered copy of our own head and ears, we would have no need of digitally enhancing the recording. Anything would then sound exactly as the VB. I’m sure you can understand why. In order for it to create its 3D realism to such an extent in no matter which pair of human ears, the digital algorithm that’s whispered into your ear at the end of the clip is used. It enhances the so-called head-related transfer functions (HRTF) of the recorded sounds. This is what creates the main difference between the perception of front and rear sounds. Of the very few binaural recordings that are made, only some will give you that exact front/rear positioning like the VB. I'll tell you why: some binaural recordings are recorded with a Jecklin Disc, or a dummy head without ears, so typically the perceived space is placed either 180 degrees behind you OR 180 degrees in front of you, rather than the full 360 like the VB. It’s our ears, and in this case I mean those funny pieces of meat sticking out of the sides of our heads, that allow us to discern between a sound coming from the front or the rear. They screen the sounds coming from the rear more than they do the sounds coming from the front. The way sound is altered because of our outer ear is determined by these HRTF. So the first clue I followed was the mysterious algorithm that was whispered in my left ear. But first, as I promised above, I‘d like to share my experience with natural hearing and the lack thereof!
Part 4: The Red or the Blue Pill
Our brain is an amazing thing capable of performing awe-inspiring feats. As we come into the world, our ears (that is our brain) don’t have as yet the capacity to locate sounds. We have to slowly start learning to interpret those slight phase-shifts in sound, those reflections and diffractions that are caused by the unique shape of our ears and our head (HRTF). We would lose that capability if we would suddenly lose our ears or be outfitted with differently shaped ears, at least at first, but as we would get used to those new ears, we would slowly gain that capability again. This shows that we are able the “re-program” our brain in order to preserve our capacity for 3-dimensional hearing. In case of the new set of ears we merely have to continue our inherent capability for “natural” hearing, based on those subtle HRTF cues, so although the transitional adaption period might be slightly confusing and tiring for our brain, once re-programmed, we are again able to listen effortlessly to the sounds in the world around us. Now, what does this have to with anything?
Let’s talk about headphone fatigue. This is the reason why I personally always preferred listening to speakers rather than headphones. While we listen to stereophonically recorded music through headphones, our brain is receiving auditory information that is in some way distorted and unnatural. Some aspects of it, like the frequency spectrum and the timing, are OK, but the directional cues are plainly NOT there in the way our brain is used to receiving them. So rather than give up and leave us with the narrow between-the-ears stereo-image we actually perceive in that natural state, our brain starts the process of re-programming itself in order to re-instate the illusion of natural positional hearing. This takes time and effort. It does cause fatigue, but after some time our wonderful brain IS actually able to have us believe that we are listening to a speaker-like sound stage. And the more we get used to it, the less fatiguing it gets and we are happy. It’s not exactly natural, and it still does take some small effort for the brain to maintain the illusion, but it kind of works, and at least there are absolutely no changes made to the frequency spectrum, the timing or resonant harmonics of the source.
For a long time this was the one and only choice available for headphone listening, but now I’m going to offer you a pill of a different colour. What if we could spare the brain the initial time and effort to re-program itself for headphone listening and the continuous effort it takes to uphold an illusion. I believe that fatigue is still occurring to almost everyone who's gotten used to headphone listening, because it just takes much more effort to translate those invalid auditory cues into a coherent sound stage, at least compared to the natural HRTF phase-based cues.
I’m not the first one to get the idea of some sort of pre-processing to make the sound more natural. Some headphone amp makers started experimenting with hardware-based cross-feed circuits, so that’s one of the things I started experimenting with.
Part 5: Cross-Feed, just a gimmick?
Cross-feed, as the name implies, feeds or little bit of the left channel into the right, and vice versa.
I found out that there were actually some Foobar plugins offering software cross-feed. There's the Bauer stereophonic-to-binaural DSP. The name was very promising and having played around with it and its settings, I liked it. It emulates various hardware based cross-feed circuits and makes subtle changes to the sound. It helps the brain a little bit more with deciphering spatial cues and building a small-scale sound stage, but still leaves something for the brain to do: expanding the soundstage outward; all in all, a good compromise, but not the end of the journey. At this point in time I found the virtual barbershop demo, clearly demonstrating that even more should be possible, so I started digging deeper.
Part 6: Positional Audio
I started searching for that mysterious Cetera algorithm responsible for the WOW-factor in the Virtual Barbershop. I found that the demo was created by a manufacturer of hearing aids called Starkey. The Cetera algorithm was the software part of a hearing aid developed in the late nineties, in cooperation with another company called QSound Labs. This company, then as well as now, specializes in a wide spectrum of 3D audio solutions. Their technology is implemented in various ways, software as well as hardware based. In fact, I discovered that since the nineties various companies had started research into 3D audio, both for studio purposes, like music recording and movie surround tracks, as well as positional audio for the PC (think first-person shooters). SRS Labs, for instance, has worked in the same field as QSound. Both these companies have developed software packages for the PC, able to process and enhance sound and music in a variety of ways, including headphone surround. I’ll not go into details here, as their products usually are shipped with certain hardware, like PC sounds cards, or if sold separately, are only useable as part of the operating system, which means upsampling/downsampling, etc, so that doesn’t really serve the purpose of audiophile music listening.
There’s one company however that I haven't mentioned yet, Lake Technology. This Australian company developed digital audio algorithms for recording studios. One of their algorithms allowed movie studio technicians to use headphones to work with and monitor 5.1 movie tracks. After Dolby Laboratories licensed the technology and then even bought the whole company, it became known as Dolby Headphone. Being a company with a slightly different focus compared to the others mentioned before, Dolby Headphone was licensed to manufacturers of DVD-players and other home-theatre equipment, where its algorithms were hardwired into the signal path.
Investigating Dolby Headphone I stumbled upon a thread at the Hydrogenaudio forum and discovered that someone had had similar thoughts already, and that turned out to be a significant discovery.
Part 7: Dolby Headphone Wrapper
Someone had already developed a great piece of software, called the Dolby Headphone Wrapper (DHW). It’s now an official 3rd party Foobar plugin and using it correctly and in combination with certain other plugins improves regular cross-feed processing by several orders of magnitude.
The Dolby Headphone algorithm is not only built into stand-alone dvd-players, but is also part of a number of commercial software dvd-players for the pc. One little file in particular takes care of it: dolbyHph.dll and the Foobar plugin utilizes that file. It is possibly not 100% legal to distribute it, but there are trial versions of software dvd-players available for download that include that dll-file.
The wrapper converts a 5.1 channel input into a binaural 2-channel output for headphones. Dolby Headphone does work with a 2-channel input as well, but the result won't be as good.
So what can we put before DHW to change a 2-channel stereophonic track into a 5.1 surround track? For almost all my music - be it rock, classic, pop or folk - I listen to through headphones, I use my customized DSP chain based on Dolby Headphone. With it I experience something better than a speaker-like soundstage. I feel as if I'm smack in the middle of a live soundstage. Words cannot begin to describe what these DSP’s do to any source of music, no matter how it’s recorded.
So what is the missing link?
Part 8: The Icing on the Cake
A guy named Steve Thomson created a free piece of software called V.I. Stereo to 5.1 Converter VST Plugin Suite (VI) that incorporates a number of algorithms (i.e. ambisonics) to place sounds into the proper place in the 3-dimensional sound stage. It creates a living, breathing atmosphere out of the slightest auditory cues available in the original signal. No matter how the recording is made, as long as it’s stereophonic and not already binaural, VI will create a 360 degrees image that is absolutely believable. It is VI that is responsible for placing echoes, resonances and other subtle or not so subtle cues at the proper place in the virtual sound stage, without ever overdoing it in such a way that it’s perceived as unnatural. A singer for instance is typically placed front center, but the acoustic reverberation of the voice that is part of the original recording is placed all around the listener just as it should be if the singer would be standing before you in a real room. And this applies to all instruments and sounds. The result is impressive. Of course some recordings work better than others with it, but all in all it’s pretty amazing how intelligently VI and Dolby Headphone work together to create such a realistic sound space. I have spent considerable time finding the optimal setting for VI where the focus and front/rear division is optimal and most realistic. I suggest you start with this and only if you are the experimenting type, change the settings and see if your taste is different than mine.
Part 9: Putting it all together
So, what do you need?
Download the package I prepared on Dropbox. It contains Dolby Headphone Wrapper (foo_dsp_dolbyhp.dll), VI Suite (VI_Setup.zip), VST adapter plugin (foo_vst.dll), SoX resampler (foo_dsp_resampler.dll) and another unnamed but necessary file. Install VI Suite according to the instructions included with it. Place the three Foobar plugins in the components folder of your foobar2000 installation folder. Start Foobar. Open Preferences and go to Components>VST plug-ins. Click Add, navigate to the folder where VI Suite was installed and add VI.dll to the VST list. Click OK and restart Foobar. Then open Preferences again and go to Playback>DSP Manager. Move Dolby Headphone and VI from the right to the left pane to activate these DSPs. If you have a DAC that works at 24-bit/96 kHz, I suggest you add the Resampler (SoX) to the list as well. Make sure the DSPs are listed in the following order: VI, DH, SoX.
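To see why the order matters, here is a toy Python model of the chain. It is only a sketch under my own assumptions: the function names are made up, VI is modelled as turning 2 channels into 6 (5.1), Dolby Headphone as folding whatever it receives down to 2 channels, and SoX as changing only the sample rate. It tracks channel counts and sample rates, not actual audio, and says nothing about how the real plugins work internally.

```python
# Toy model of the DSP chain: each stage maps (channels, sample_rate)
# to a new (channels, sample_rate). All names here are hypothetical.

def vi(channels, rate):
    # Assumption: VI takes stereo in and produces 5.1 (6 channels).
    if channels != 2:
        raise ValueError("VI expects a 2-channel stereo input")
    return 6, rate

def dolby_headphone(channels, rate):
    # Assumption: DH always produces a 2-channel headphone signal.
    return 2, rate

def sox(channels, rate, target_rate=96000):
    # The resampler leaves the channel count alone.
    return channels, target_rate

def run_chain(stages, channels=2, rate=44100):
    """Push a (channels, rate) description through the stages in order."""
    for stage in stages:
        channels, rate = stage(channels, rate)
    return channels, rate

good = run_chain([vi, dolby_headphone, sox])
bad = run_chain([dolby_headphone, vi, sox])
print("VI, DH, SoX ->", good)
print("DH, VI, SoX ->", bad)
```

In this model the recommended order ends on 2 channels at 96000 Hz, while putting DH before VI ends on 6 channels, which is not something you can feed to headphones.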
Configure Dolby Headphone (click on the DSP in the list and then click "Configure selected"). Point the wrapper to the dolbyHph.dll file you must have sourced somehow :-) and saved on your PC somewhere (I suggest the Foobar folder). There are 3 choices for your virtual room. I tend to use the DH2 live room, as this is a good compromise between directness and spaciousness. The DH1 reference room is smaller, so it has less reverb, and the DH3 movie theater is large, so it creates a very spacious effect, really impressive and pleasant to just let wash over you, but it muddles detail somewhat. It's up to personal preference, but the VI settings I use and my converted samples are all based on room 2. Set amplification to 100% and make sure to leave Dynamic Compression off.
Configure VI. A red settings screen pops up. There are 4 sliders and 3 buttons. The top on/off button should obviously be on. Leave or switch the other buttons off. The sliders are each divided into 100 units. I call the centre point 0, with the leftmost at -50 and the rightmost at 50. Set them as follows:
Width correction: -15
Front ambience: 0
Rear ambience: -10
Rear level: -45
I have spent a lot of time figuring out these optimal settings. They are of course subject to personal preference, so feel free to experiment yourself. I'll warn you though: a few notches can cause the illusion to collapse.
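If you prefer to count slider steps from the left edge instead of from the centre, the conversion is just an offset of 50. A small Python sketch, assuming each slider really has 100 equal steps as described above (the function name is my own and not part of VI):

```python
# Settings in the centred notation used above:
# 0 = slider centre, -50 = leftmost, +50 = rightmost.
CENTRED_SETTINGS = {
    "Width correction": -15,
    "Front ambience": 0,
    "Rear ambience": -10,
    "Rear level": -45,
}

def to_steps_from_left(centred_value):
    """Convert a centred value (-50..50) to steps from the left edge (0..100)."""
    if not -50 <= centred_value <= 50:
        raise ValueError("centred value must be between -50 and 50")
    return centred_value + 50

steps_from_left = {
    name: to_steps_from_left(value)
    for name, value in CENTRED_SETTINGS.items()
}

for name, steps in steps_from_left.items():
    print(f"{name}: {steps} of 100 steps from the left")
```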
If you decided to resample, configure SoX by setting target samplerate to 96000, quality to best, and leave the rest of the settings as they are.
I also use ReplayGain in the conversion, because especially with test tracks I don't like continuously having to fiddle with the volume knob. If you don't know what it is, or know already you don't want to use it, just skip this paragraph. I set +3dB for tracks without RG info and +6dB for the ones with RG info. For test tracks I choose track source mode and for listening to a whole album I obviously choose album source mode. This also offsets the loss of gain the DSP chain induces. I just apply gain, I don't select prevent clipping according to peak, as this is a useless feature that more often than not nullifies the whole purpose of using RG. If you decide to use RG as well, just add the Advanced Limiter DSP and place it last in the list. This doesn't need to be configured. It's just a simple filter that only touches samples that are actually clipping, and will only ever change anything in rare cases where average gain level is low and peaks are relatively high.
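For the curious, the arithmetic behind these settings can be sketched in Python. This assumes the usual ReplayGain model, where the stored track gain and the preamp are added in dB and converted to a linear factor of 10^(dB/20), with 1.0 as digital full scale. The function names are hypothetical, the example gain and peak values are invented for illustration, and the level change caused by VI and Dolby Headphone is ignored.

```python
def db_to_linear(db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10 ** (db / 20)

def playback_scale(track_gain_db=None, preamp_with_rg_db=6.0,
                   preamp_without_rg_db=3.0):
    """Linear factor applied to the samples.

    track_gain_db is the stored ReplayGain value, or None when the
    track has no RG info. Preamp defaults are the values from the text.
    """
    if track_gain_db is None:
        return db_to_linear(preamp_without_rg_db)
    return db_to_linear(track_gain_db + preamp_with_rg_db)

def would_clip(peak, scale):
    """True if the scaled peak exceeds full scale (1.0)."""
    return peak * scale > 1.0

# Invented example: a loud track with RG gain -8 dB and peak 0.98.
loud = would_clip(0.98, playback_scale(track_gain_db=-8.0))
# Invented example: a quiet track with RG gain +2 dB and peak 0.7.
quiet = would_clip(0.7, playback_scale(track_gain_db=2.0))
print("loud track clips:", loud)
print("quiet track clips:", quiet)
```

The second case is the kind where the limiter has something to do: low average level, so a large positive gain, combined with relatively high peaks.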
When you've got your DSP chain configured, type a name into the empty field under DSP chain presets and click Save. You can now load this DSP chain with its configuration by simply selecting it in the preset list and clicking Load. The point however is not to use the chain in real time (although you could of course do that too, if you'd like), but only to use Foobar to convert the original file into a binaural version. For ease of use you should now create a conversion preset. Right click on any file in your library, select Convert and then click the "..." entry (three dots at the bottom).
You are now in the Converter Setup window. There are 4 main parts, which you reach by clicking on the links. Start at the top with Output format. I suggest you use WAV, but a compressed lossless format is okay too. If your DAC only supports 16/44, select 16-bit under Output bit depth and select always under Dither. If your DAC supports 24/96, select 24-bit and under Dither select never, or lossy sources only if you sometimes convert MP3s or other lossy formats.
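The bit depth and dither choice boils down to a simple decision, sketched here in Python. The function and its parameter names are hypothetical. It just restates the rule above and does not talk to Foobar in any way.

```python
def output_settings(dac_supports_24_96, converts_lossy_sources=False):
    """Return the Output bit depth and Dither choices suggested above."""
    if dac_supports_24_96:
        dither = "lossy sources only" if converts_lossy_sources else "never"
        return {"bit_depth": 24, "dither": dither}
    # A 16/44-only DAC: reduce to 16-bit and always dither.
    return {"bit_depth": 16, "dither": "always"}

print(output_settings(dac_supports_24_96=False))
print(output_settings(dac_supports_24_96=True))
print(output_settings(dac_supports_24_96=True, converts_lossy_sources=True))
```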
Go back and open the Destination part. The various output destination options there pretty much speak for themselves.
Leave the last link, Other, as is: "When finished do nothing".
Now we come to the most important part: Processing. Open the submenu. If you want to use ReplayGain, set that up under the relevant header. Now you can simply select the DSP preset you created above and load it. The Active DSPs list should then be populated with your previously selected and configured plugins. Go back. Click the Save button. Select Create new preset. Give it a name and press Enter. You have now created a Binaural Conversion Preset. All you need to do for future use is right click on a track, go to Convert and then click on the preset name you chose before. Your file will be converted and saved in the location and under the name you've set up in the Destination part. You can now play this new file on your headphones, preferably with a good headphone amp or DAC with a headphone output, and using a top class music player like MQn.
Wow, that was a long story. Thanks for your patience if you managed to make it this far. :-)
I hope this will give you as much musical pleasure as it has given me already for a number of years.
Enjoy!