Demand for virtual reality (VR) and 360° video applications continues to grow. Unlike conventional 2D videos, 360° videos immerse users by providing a navigable panoramic view. However, achieving adequate quality of experience (QoE) poses significant network challenges, especially in mobile delivery setups. Despite the tremendous improvements offered by 5G and beyond mobile networks, streaming 360° videos in the same fashion as 2D videos is suboptimal, and scaling to large numbers of users calls the feasibility of this approach into question. This paper explores the use of caching and multicasting solutions for the mobile delivery of VR and 360° videos. First, we provide an overview of immersive technologies and their distinctive characteristics. Then, we discuss the network challenges associated with 360° videos and the role of robust caching and multicasting schemes that exploit the unique features of 360° videos and capitalize on the correlations among end-users' viewports. Having established the foundations and challenges of 360° video streaming, we compare the state-of-the-art literature, focusing on video streaming optimization aspects. We conclude by discussing the current status of the field and future research directions.