Measurements of how often and how prominently trademarks and logos appear in a video are important in marketing and sponsorship. Sponsors can use these statistics to estimate the number of TV viewers who noticed their brand and thereby evaluate the effect of the sponsorship. The goal of this project is to create a semi-automatic system for the detection, tracking and recognition of pre-defined brands and trademarks in broadcast television. The number of appearances of a logo, together with its position, size and duration, is recorded to derive indexes and statistics that can be used for marketing analysis.
To obtain a technique that is sufficiently robust to partial occlusions and deformations, we use local neighborhood descriptors of salient points (SIFT features) as a compact representation of the important aspects and local texture of trademarks. By combining the results of local point-based matching, we are able to detect and recognize entire trademarks. Whether a video frame contains a reference trademark is decided by thresholding the normalized match score, that is, the fraction of the trademark's SIFT points that have been matched in the frame. Finally, we compute a robust estimate of the cloud of matched points in order to localize the trademark and to approximate its area.
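The decision and localization steps described above can be illustrated with a minimal sketch. It is not the system's actual implementation: the nearest-neighbour ratio test used for matching, the threshold value of 0.2, and the median-plus-median-distance rule used as the "robust estimate" are all assumptions chosen for illustration, and the function names (`match_points`, `normalized_match_score`, `robust_box`, `detect`) are hypothetical. Descriptors are plain tuples here; real SIFT descriptors would be 128-dimensional vectors produced by a feature extractor.

```python
import math
from statistics import median


def match_points(logo_desc, frame_desc, ratio=0.8):
    """Match each logo descriptor to its nearest frame descriptor.

    A match is kept only if the nearest neighbour is clearly closer than
    the second nearest (ratio test; an assumption, not stated in the text).
    Returns a list of (logo_index, frame_index) pairs.
    """
    matches = []
    for i, d in enumerate(logo_desc):
        dists = sorted((math.dist(d, f), j) for j, f in enumerate(frame_desc))
        if not dists:
            continue
        if len(dists) == 1 or dists[0][0] < ratio * dists[1][0]:
            matches.append((i, dists[0][1]))
    return matches


def normalized_match_score(n_matched, n_logo_points):
    """Fraction of the trademark's points that were matched in the frame."""
    return n_matched / n_logo_points if n_logo_points else 0.0


def robust_box(points, k=3.0):
    """Bounding box of the matched points after discarding outliers.

    The centre is the coordinate-wise median; points farther from it than
    k times the median distance are dropped (an assumed robust rule).
    Returns (x_min, y_min, x_max, y_max).
    """
    cx = median(p[0] for p in points)
    cy = median(p[1] for p in points)
    dists = [math.hypot(x - cx, y - cy) for x, y in points]
    spread = median(dists)
    keep = [p for p, d in zip(points, dists) if d <= k * spread] or list(points)
    xs = [p[0] for p in keep]
    ys = [p[1] for p in keep]
    return (min(xs), min(ys), max(xs), max(ys))


def detect(logo_desc, frame_desc, frame_xy, threshold=0.2):
    """Return (score, box); box is None when the score is below threshold."""
    matches = match_points(logo_desc, frame_desc)
    score = normalized_match_score(len(matches), len(logo_desc))
    if score < threshold:
        return score, None
    matched_xy = [frame_xy[j] for _, j in matches]
    return score, robust_box(matched_xy)


def box_area(box):
    """Approximate trademark area in pixels from its bounding box."""
    x0, y0, x1, y1 = box
    return (x1 - x0) * (y1 - y0)
```

Calling `detect` once per frame yields a score and, when the trademark is found, a box whose position and area could feed the per-logo statistics (appearances, position, size, duration). An axis-aligned box is the simplest choice; a rotated box or a fitted homography would follow perspective deformations more closely.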