Advances in computer hardware and communications have enabled many new applications in the Internet of Things. Single-board computers (SBCs) that combine high performance with low power consumption are increasingly used to run deep learning inference at the edge of the network. In this article, we investigate a cooperative task execution system in an edge computing architecture. In our topology, the edge server offloads different workloads to end devices, which collaboratively execute object detection on the transmitted sets of images. The proposed system aims to optimize both execution accuracy and execution time for deep learning inference. In particular, we implement new policies to optimize the end-to-end (E2E) execution time and the execution accuracy of the system, highlighting the key role of effective image compression and of the batch sizes (splitting decisions) that the end devices receive from a server at the network edge. As the object detector we use You Only Look Once (YOLO) version 5, one of the most popular object detectors. Our heterogeneous testbed consists of an edge server and three end devices that differ in processor type (CPU/TPU), RAM size, and neural network input size, which allows us to identify sharp trade-offs. First, we implement YOLOv5 on the end devices and evaluate the model on the COCO dataset using Precision, Recall, and mean Average Precision (mAP). We then explore optimal trade-offs among task-splitting strategies and compression decisions to optimize overall performance. We demonstrate that offloading workloads to multiple end devices based on different splitting decisions and compression values improves the system's performance and allows it to respond under real-time conditions without needing a server or cloud resources.
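To make the notion of a splitting decision concrete, the following minimal Python sketch divides a batch of images among heterogeneous end devices in proportion to each device's estimated processing rate, and shows how a smaller (more compressed) image size shortens the estimated completion time. It is an illustration under stated assumptions, not the policy evaluated in this article: the `Device` fields, the linear cost model (transfer time plus inference time per image), and all numeric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    infer_s_per_image: float  # hypothetical per-image inference time (seconds)
    link_bytes_per_s: float   # hypothetical throughput from the edge server


def per_image_cost(dev, image_bytes):
    """Estimated transfer time plus inference time for one image, in seconds."""
    return image_bytes / dev.link_bytes_per_s + dev.infer_s_per_image


def split_batch(devices, n_images, image_bytes):
    """Assign images in proportion to each device's rate (1 / per-image cost)."""
    rates = [1.0 / per_image_cost(d, image_bytes) for d in devices]
    total = sum(rates)
    ideal = [n_images * r / total for r in rates]
    counts = [int(x) for x in ideal]
    # Give the leftover images to the devices with the largest fractional parts.
    leftovers = n_images - sum(counts)
    order = sorted(range(len(devices)),
                   key=lambda i: ideal[i] - counts[i], reverse=True)
    for i in order[:leftovers]:
        counts[i] += 1
    return counts


def makespan(devices, counts, image_bytes):
    """Estimated time until the slowest device finishes its share."""
    return max(c * per_image_cost(d, image_bytes)
               for d, c in zip(devices, counts))


# Hypothetical devices; the numbers are placeholders, not measurements.
devices = [
    Device("fast", infer_s_per_image=0.1, link_bytes_per_s=1_000_000),
    Device("medium", infer_s_per_image=0.3, link_bytes_per_s=1_000_000),
    Device("slow", infer_s_per_image=0.6, link_bytes_per_s=1_000_000),
]

counts = split_batch(devices, n_images=30, image_bytes=100_000)
t_proportional = makespan(devices, counts, 100_000)
t_equal = makespan(devices, [10, 10, 10], 100_000)

# Stronger compression (smaller images) reduces the transfer component.
counts_small = split_batch(devices, n_images=30, image_bytes=50_000)
t_compressed = makespan(devices, counts_small, 50_000)
```

Under this simple model the proportional split finishes sooner than an equal split, and halving the image size shortens the estimate further. The sketch ignores the accuracy cost of compression, which a real policy would have to weigh against the time saved.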