In subjective listening tests, natural sound sources were presented to subjects as vertically oriented phantom images from two layers of loudspeakers, ‘height’ and ‘main’. Subjects were required to reduce the amplitude of the height layer until the position of the resultant sound source matched that of the same source presented from the main layer only (the localisation threshold). Delays of 0, 1 and 10 ms were applied to the height layer with respect to the main layer, and both vertical stereophonic and quadraphonic conditions were tested. The localisation thresholds were not significantly affected by sound source or presentation method; the only variable with a significant effect was interchannel time difference (ICTD). For an ICTD of 0 ms, the median threshold was −9.5 dB, which was significantly lower than the −7 dB found for both 1 and 10 ms. The results have implications both for the recording of sound sources for three-dimensional (3D) audio reproduction formats and for the rendering of 3D images.
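To give a sense of scale for the reported thresholds, the following minimal Python sketch converts the median height-layer levels into linear amplitude ratios relative to the main layer, and expresses the tested delays in samples. The function names and the 48 kHz sample rate are illustrative assumptions and are not taken from the study; only the dB values and delays come from the results above.

```python
def db_to_amplitude_ratio(level_db: float) -> float:
    """Convert a relative level in dB to a linear amplitude ratio (20*log10 convention)."""
    return 10 ** (level_db / 20)


def delay_in_samples(delay_ms: float, sample_rate_hz: int = 48_000) -> int:
    """Express a delay in whole samples. The 48 kHz default is an assumption."""
    return round(delay_ms * sample_rate_hz / 1000)


# Median localisation thresholds (dB) reported for each ICTD (ms).
median_threshold_db = {0: -9.5, 1: -7.0, 10: -7.0}

for ictd_ms, level_db in median_threshold_db.items():
    ratio = db_to_amplitude_ratio(level_db)
    samples = delay_in_samples(ictd_ms)
    print(f"ICTD {ictd_ms:>2} ms ({samples:>3} samples): "
          f"{level_db:+.1f} dB = amplitude ratio {ratio:.3f}")
```

Under these assumptions, −9.5 dB corresponds to a height-layer amplitude of roughly one third of the main layer, and −7 dB to a little under one half.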