The equal segment microphone array (ESMA) is a multichannel microphone technique that attempts to capture a sound field over 360° without any overlap between the stereophonic recording angles of adjacent microphone pairs. This study investigated the optimal microphone spacing for a quadraphonic ESMA using cardioid microphones. Recordings of a speech source were made using ESMAs with four microphone spacings (0 cm, 24 cm, 30 cm, and 50 cm), which were chosen on the basis of different psychoacoustic models for microphone array design. Multichannel and binaural stimuli were created with the reproduced sound field rotated in 45° intervals. Listening tests were conducted to examine the accuracy of phantom image localization for each microphone spacing in both loudspeaker and binaural headphone reproduction. The results generally indicated that the 50 cm spacing, which was derived from an interchannel time and level trade-off model that is perceptually optimized for a 90° loudspeaker base angle, produced more accurate localization than the 24 cm and 30 cm spacings, which were based on conventional models derived from the standard 60° loudspeaker setup. The 0 cm spacing produced the worst accuracy, with the most frequent bimodal distributions of responses between the front and back regions. Analyses of the interaural time and level differences of the binaural stimuli supported the subjective results. In addition, two approaches for adding the vertical dimension to the ESMA (ESMA-3D) were devised. The findings of this study are considered to be useful for acoustic recording for virtual reality applications as well as for multichannel surround sound.
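To illustrate how microphone spacing affects the interchannel cues that a time and level trade-off model works with, the following is a minimal sketch and not the method used in the study. It assumes a simplified geometry that the abstract does not specify: two adjacent ideal cardioid microphones separated laterally by the spacing, with axes at +45° and -45°, a far-field source, and a speed of sound of 343 m/s. All function names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at about 20 degrees C


def cardioid_gain(source_deg, axis_deg):
    """Ideal cardioid sensitivity, 0.5 * (1 + cos(angle off axis))."""
    gain = 0.5 * (1.0 + math.cos(math.radians(source_deg - axis_deg)))
    return max(gain, 1e-12)  # floor avoids log10(0) at the rear null


def ictd_seconds(spacing_m, source_deg):
    """Far-field interchannel time difference (right minus left arrival).

    Assumes the two microphones are separated by spacing_m along the
    left-right axis; positive azimuth is to the left of the pair's centre line.
    """
    return spacing_m * math.sin(math.radians(source_deg)) / SPEED_OF_SOUND


def icld_db(source_deg, left_axis_deg=45.0, right_axis_deg=-45.0):
    """Interchannel level difference (left re right) in dB for two cardioids."""
    left = cardioid_gain(source_deg, left_axis_deg)
    right = cardioid_gain(source_deg, right_axis_deg)
    return 20.0 * math.log10(left / right)


if __name__ == "__main__":
    # Spacings taken from the abstract; source angles chosen for illustration.
    for spacing_cm in (0, 24, 30, 50):
        for source_deg in (0, 15, 30, 45):
            dt_ms = 1000.0 * ictd_seconds(spacing_cm / 100.0, source_deg)
            dl_db = icld_db(source_deg)
            print(f"{spacing_cm:2d} cm, {source_deg:2d} deg: "
                  f"ICTD = {dt_ms:.3f} ms, ICLD = {dl_db:.2f} dB")
```

Under these assumptions the level difference depends only on the source angle, while the time difference scales with the spacing. A 0 cm (coincident) array therefore provides level cues alone, and wider spacings add progressively larger time differences for the same source angle.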