Optical Music Recognition (OMR) is a specialized field within computer vision focused on converting scanned images of musical scores into machine-readable digital formats. Unlike traditional Optical Character Recognition (OCR) for text, OMR faces unique challenges arising from the complex and highly symbolic nature of musical notation, whose symbols exhibit intricate spatial relationships and hierarchical structure. This thesis shifts the focus of OMR research from theoretical models to practical, scalable systems designed to handle large music archives. The proposed approach leverages the capabilities of modern object detection models, particularly the YOLO11 series, to create a modular, efficient OMR pipeline capable of processing thousands of pages quickly and accurately. By separating the detection and interpretation stages, this modular framework enables fine-tuning of individual components to optimize for both speed and robustness. The pipeline produces simplified MusicXML outputs, and experiments are conducted on the OmniOMR and OLiMPiC datasets. The results demonstrate significant improvements in both detection accuracy and processing speed. The work also provides open-source libraries to enable future experimentation and development.