Automatic Recurring Video Clip Detection

Target: Automatically find recurring video clips (e.g. commercials and news segments) in a video database.

Summary: Recurring video segments are found in a video database without any prior knowledge. First, visual features are extracted. For computational efficiency, features are extracted only at critical points: the extrema of the edge strength of the frames. A subset of the database is then selected as the query set, and the rest is used as the reference set for comparison. The only requirement is that a query segment recurs in the reference set. Query features are matched against reference features, and sequential matches are grouped. The detected sequential groups are post-processed with heuristics to obtain the final recurring video clips.
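
The critical-point selection and the grouping of sequential matches can be sketched as follows. This is only an illustration under stated assumptions, not the project's implementation: it assumes the edge strength is already available as one number per frame, that a match is a (query frame, reference frame) pair, and that matches of one recurring clip share a constant frame offset. The names local_extrema, group_sequential_matches, max_gap and min_length are hypothetical.

```python
from collections import defaultdict
from typing import List, Tuple


def local_extrema(edge_strength: List[float]) -> List[int]:
    """Return frame indices where edge strength is a strict local min or max."""
    extrema = []
    for i in range(1, len(edge_strength) - 1):
        prev, cur, nxt = edge_strength[i - 1], edge_strength[i], edge_strength[i + 1]
        if (cur > prev and cur > nxt) or (cur < prev and cur < nxt):
            extrema.append(i)
    return extrema


def group_sequential_matches(
    matches: List[Tuple[int, int]], max_gap: int = 50, min_length: int = 3
) -> List[Tuple[int, int, int]]:
    """Group (query_frame, reference_frame) matches into sequential runs.

    Assumption: matches of one recurring clip share the same offset
    (reference_frame - query_frame). Within one offset, a run is split
    whenever two neighbouring query frames are more than max_gap apart.
    Runs shorter than min_length matches are discarded.
    Returns (query_start, query_end, offset) tuples.
    """
    by_offset = defaultdict(list)
    for q, r in matches:
        by_offset[r - q].append(q)

    groups = []
    for offset, q_frames in by_offset.items():
        q_frames.sort()
        run = [q_frames[0]]
        for q in q_frames[1:]:
            if q - run[-1] <= max_gap:
                run.append(q)
            else:
                if len(run) >= min_length:
                    groups.append((run[0], run[-1], offset))
                run = [q]
        if len(run) >= min_length:
            groups.append((run[0], run[-1], offset))
    return sorted(groups)


# Toy data: per-frame edge strength and a few frame-level matches.
key_frames = local_extrema([0.0, 2.0, 1.0, 3.0, 3.0, 0.5, 4.0])
segments = group_sequential_matches(
    [(10, 110), (20, 120), (30, 130), (400, 130), (35, 135), (200, 900)]
)
print(key_frames, segments)
```

In the toy data, the four matches with offset 100 form one sequential group, while the two isolated matches are dropped as too short. A real matcher would likely need to tolerate small offset drift, which this sketch does not attempt.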

Test Set

An artificial test set is generated from the 12 commercials listed below.
Id  Short Name  Full Name                                                    Duration (s)
1   kent1       Kent Bayram Reklamı 2012                                     63
2   didi1       Ne Yersen Ye Yanında DİDİ!                                   70
3   didi2       didi Soğuk Çay - Reklam Filmi 2                              52
4   thy1        Drogba vs. Messi - #EpicFood                                 29
5   thy2        Dünyanın 100 ülkesine uçan tek havayolu - Türk Hava Yolları  17
6   thy3        Fly Turkish                                                  32
7   thy4        Kobe vs. Messi - The Selfie Shootout                         60
8   thy5        Türk Hava Yolları _ Hayal Edince (Dream)                     130
9   thy6        Türk Hava Yolları ile Afrika’ya uç!                          60
10  thy7        Turkish Airlines - Bu Gurur Hepimizin                        92
11  thy8        Vazgeçme                                                     150
12  thy9        WE’R From Turkey                                             104

Individual clips are concatenated in different orders to form a video database. The clip order of each generated video is given in the following table.

Name      Combination Order (in terms of clip IDs)  Duration (s)
concat1   1,2,3,4,5,6,7,8,9,10,11,12                864
concat2   12,11,10,9,8,7,6,5,4,3,2,1                864
concat3   10,9,1,11,7,3,6,2,5,12,8,4                864
concat4   11,8,6,7,10,1,9,2,4,5,12,3                864
concat5   5,2,1,4,9,3,11,7,12,6,8,10                864
concat6   8,9,2,7,10,12,1,5,6,3,4,11                864
concat7   3,2,8,11,7,6,1,12,5,9,10,4                864
concat8   11,12,4,6,10,8,2,5,3,1,9,7                864
concat9   7,2,12,3,11,1,6,10,8,9,5,4                864
concat10  5,11,2,1,9,10,12,3,8,6,4,7                864
concat11  9,1,12,2,8,7,6,4,11,5,10,3                864
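
Because each database video is a plain concatenation, the ground-truth position of every clip follows from the clip order and the clip durations. A minimal sketch of that bookkeeping is given below; the function name clip_boundaries is hypothetical, and the example uses only the durations of clips 4, 5 and 6 from the commercial table above in a made-up order.

```python
from typing import Dict, List, Tuple


def clip_boundaries(
    order: List[int], durations: Dict[int, int]
) -> List[Tuple[int, int, int]]:
    """Return (clip_id, start_s, end_s) for each clip of a concatenated video."""
    boundaries = []
    start = 0
    for clip_id in order:
        end = start + durations[clip_id]
        boundaries.append((clip_id, start, end))
        start = end
    return boundaries


# Durations in seconds of clips 4, 5 and 6; the order [6, 4, 5] is illustrative.
durations = {4: 29, 5: 17, 6: 32}
print(clip_boundaries([6, 4, 5], durations))
# -> [(6, 0, 32), (4, 32, 61), (5, 61, 78)]
```

Boundaries computed this way are only as precise as the listed whole-second durations.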

Results

One concatenated file is used as the query set and the others are used as the reference set. The clip detection performance for typical selections is given in the following table. The results indicate that detection performance improves as the number of instances of a query segment in the reference set increases. For this test setup, the clips are detected successfully, with boundary misalignments of 1-3 s.

Query Set  Reference Set (Concat Id)  Num of Instances  Correct  Missed/Split/Merged  Notes
concat1    2                          1                 10       2                    ?
concat1    3                          1                 10       2                    ?
concat1    4                          1                 10       2                    (6,7) exists in concat4
concat1    5                          1                 12       0
concat1    6                          1                 8        4                    (8,9; 3,4) exist in concat6
concat1    7                          1                 10       2                    ?
concat1    8                          1                 10       2                    (11,12) exists in concat8
concat1    9                          1                 10       2                    (8,9) exists in concat9
concat1    10                         1                 10       2                    (9,10) exists in concat10
concat1    11                         1                 12       0
concat1    2                          1                 10       2
concat1    2,3                        2                 10       2
concat1    2,3,4                      3                 10       2
concat1    2,3,4,5                    4                 10       2
concat1    2,3,4,5,6                  5                 10       2
concat1    2,3,4,5,6,7                6                 12       0
concat1    2,3,4,5,6,7,8              7                 12       0
concat1    2,3,4,5,6,7,8,9            8                 12       0
concat1    2,3,4,5,6,7,8,9,10         9                 12       0
concat1    2,3,4,5,6,7,8,9,10,11      10                12       0
concat1    2,3,4,5,6,7,8,9,10,11+3h   10                12       0                    3h denotes 3 hours of unrelated video added as a false-positive test. No false positives occur.
concat2    1                          1                 10       2
concat2    1,3,4,5,6,7,8,9,10,11+3h   10                10       2                    3h denotes 3 hours of unrelated video added as a false-positive test. No false positives occur.
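
The scoring procedure behind the Correct column is not described here. One plausible reading, sketched below, counts a ground-truth clip as correctly detected when a detection matches both of its boundaries within a tolerance. The function name count_correct, the one-to-one matching rule and the 3 s default (the upper end of the misalignment range quoted above) are all assumptions.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_s, end_s)


def count_correct(
    detected: List[Segment], ground_truth: List[Segment], tolerance_s: float = 3.0
) -> int:
    """Count ground-truth clips matched by a detection at both boundaries.

    Each detection can be used for at most one ground-truth clip.
    """
    used = set()
    correct = 0
    for gt_start, gt_end in ground_truth:
        for i, (d_start, d_end) in enumerate(detected):
            if i in used:
                continue
            if (abs(d_start - gt_start) <= tolerance_s
                    and abs(d_end - gt_end) <= tolerance_s):
                used.add(i)
                correct += 1
                break
    return correct


# Illustrative data: three ground-truth clips, two detections.
ground_truth = [(0, 32), (32, 61), (61, 78)]
detected = [(1, 31), (33, 78)]  # the second detection merges two clips
print(count_correct(detected, ground_truth))
# -> 1
```

In this example the merged detection matches neither of the two clips it spans, so both would fall under Missed/Split/Merged, which is consistent with merges appearing as pairs in that column.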

Contact: Ersin ESEN