Recent advances in cameras, cell phones and camcorders, particularly in the resolution at which they record images and video, mean that large amounts of video data are generated daily. This data is often so large that manually inspecting it for useful content is time consuming and error prone, so automated analysis is required to extract useful information and metadata. Existing video analysis systems lack automation and scalability, and they rely on supervised learning, which requires substantial amounts of labelled data and training time. We present a cloud-based, automated video analysis system that processes large numbers of video streams and whose underlying infrastructure scales with the number and size of the streams being considered. The system automates the video analysis process and reduces manual intervention: an operator only specifies which object of interest is to be located in the video streams. The streams are then automatically fetched from cloud storage and analysed in an unsupervised way. The proposed system located and classified an object of interest in one month of recorded video streams, totalling 175 GB, in 6.52 h on a 15-node cloud. The GPU-powered infrastructure took 3 h to accomplish the same task. To achieve high performance, the occupancy of GPU resources in the cloud is optimized and data transfer between CPU and GPU is minimized. The scalability of the system is demonstrated, along with a classification accuracy of 95%.
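To put the reported timings in perspective, the following minimal sketch derives throughput and relative speedup from the figures stated above (175 GB, 6.52 h on 15 nodes, 3 h on the GPU-powered infrastructure). The variable and function names are illustrative, and the per-node figure assumes the workload was spread evenly across the 15 nodes, which the text does not state.

```python
# Figures as reported in the abstract above.
DATA_GB = 175.0      # one month of recorded video streams
CPU_HOURS = 6.52     # runtime on the 15-node cloud
GPU_HOURS = 3.0      # runtime on the GPU-powered infrastructure
CPU_NODES = 15


def throughput_gb_per_hour(data_gb: float, hours: float) -> float:
    """Average volume of video data processed per hour."""
    return data_gb / hours


cpu_rate = throughput_gb_per_hour(DATA_GB, CPU_HOURS)
gpu_rate = throughput_gb_per_hour(DATA_GB, GPU_HOURS)

# Relative speedup of the GPU run over the 15-node run for the same task.
speedup = CPU_HOURS / GPU_HOURS

# Assumption: work is evenly distributed across the nodes.
per_node_rate = cpu_rate / CPU_NODES

print(f"15-node cloud: {cpu_rate:.2f} GB/h ({per_node_rate:.2f} GB/h per node)")
print(f"GPU infrastructure: {gpu_rate:.2f} GB/h")
print(f"Speedup: {speedup:.2f}x")
```

Under these figures the 15-node cloud averages roughly 26.8 GB/h and the GPU-powered infrastructure roughly 58.3 GB/h, a speedup of about 2.2x.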