One of the essential needs of the media and entertainment (M&E) industry has been to quickly divide video content into smaller segments so that only the necessary content can be prepped for cataloging and eventual broadcasting. Digitally added segments such as Color Bars, Episodic Slates, Black Screens, and Break Bumpers make extracting the real content a tricky and time-consuming task that requires personnel with industry-specific know-how.
The creative nature of video content further complicates segmentation, so the process is executed almost entirely through manual inspection. With the surge in the number of assets being produced for Broadcast and OTT consumption, it becomes imperative to automate the segmentation flow as far as possible, with a human operator needed only to decide on the finer, creative boundaries of the detected segments.
Currently, the solutions available in the market cover only a minuscule fraction of the detectable segment types and/or do not generate frame-accurate segments. This yields only a minimal reduction in the overall effort, rendering these solutions infeasible for large-scale operations.
Furthermore, the existing solutions cannot learn from manually fine-tuned segment data and hence will never approach the accuracy demanded by a problem that is ultimately driven by creative needs.
We therefore propose a segmentation approach that learns predictable features over time and improves its predictions based on feedback from a QC step.
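To make the feedback loop concrete, the sketch below shows one minimal, hypothetical form it could take: a detector flags Black Screen frames by a luminance threshold, a QC operator supplies frame-accurate corrections, and the detector adjusts its threshold so the corrected frames are captured on the next pass. All names, the luminance feature, and the update rule are illustrative assumptions, not the actual system; a production detector would learn richer features per segment type.

```python
# Hypothetical sketch of the proposed predict -> QC -> update loop.
# The feature (mean luminance per frame) and the threshold update
# rule are assumptions for illustration only.

class SegmentDetector:
    """Flags Black Screen frames by mean-luminance thresholding."""

    def __init__(self, threshold: float = 16.0):
        self.threshold = threshold  # initial luminance cutoff (assumed value)

    def predict(self, luminance_per_frame: list[float]) -> list[int]:
        """Return indices of frames predicted to be black."""
        return [i for i, y in enumerate(luminance_per_frame)
                if y < self.threshold]

    def incorporate_feedback(self, luminance_per_frame: list[float],
                             corrected_black_frames: list[int]) -> None:
        """The QC step supplies frame-accurate corrections; raise the
        threshold just enough to cover every corrected frame."""
        if corrected_black_frames:
            needed = max(luminance_per_frame[i]
                         for i in corrected_black_frames)
            self.threshold = max(self.threshold, needed + 1e-6)


det = SegmentDetector()
luma = [120.0, 4.0, 18.0, 2.0, 130.0]      # toy per-frame luminance values
print(det.predict(luma))                    # initial prediction: [1, 3]
det.incorporate_feedback(luma, [1, 2, 3])   # QC marks frame 2 as black too
print(det.predict(luma))                    # after feedback: [1, 2, 3]
```

The same pattern generalizes beyond a single threshold: each QC correction becomes a labeled example, so any trainable model in place of `SegmentDetector` improves as operators review its output.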