Localization and mapping with a robotic PTZ sensor aims to estimate the camera pose while keeping the map of a wide area up to date. While this has previously been attempted by adapting SLAM algorithms, no earlier method explicitly estimates a varying focal length, and none addresses the problem of remaining operational over long periods of time.
In recent years, pan-tilt-zoom (PTZ) cameras have become increasingly common, especially as surveillance devices in large areas. Despite their widespread use, there are still unresolved issues regarding their effective exploitation for scene understanding at a distance. A typical operating scenario is abnormal behavior detection, which requires both the simultaneous analysis of target 3D trajectories and sufficient image resolution to perform biometric recognition of the targets.
This cannot generally be achieved with a single stationary camera, mainly because of its limited field of view and poor resolution with respect to scene depth. Overcoming these limits is crucial for the challenging task of managing the sensor to track, detect, and recognize several targets at high resolution in 3D. In fact, as in the human visual system, this can be achieved by slewing the video sensor from target to target and zooming in and out as necessary.
This challenging problem, however, has been largely neglected, mostly because of the absence of reliable and robust approaches to PTZ camera localization and mapping that also support 3D tracking of targets. To this end, we are interested in acquiring and maintaining an estimate of the camera zoom and orientation, relative to some geometric 3D representation of its surroundings, as the sensor performs pan, tilt, and zoom operations over time.
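To make the estimated quantities concrete, the following is a minimal sketch of a common idealization of a PTZ camera, not necessarily the model adopted in this work. It assumes that the camera rotates about its optical center, that pixels are square with zero skew, and that the principal point is known. Under these assumptions the camera state reduces to pan, tilt, and focal length, and two views are related by the homography H = K_b R_b R_a^T K_a^{-1}. The function names, the axis conventions, and the numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np


def intrinsics(f, cx=0.0, cy=0.0):
    """Calibration matrix for focal length f (pixels); square pixels, zero skew (assumed)."""
    return np.array([[f, 0.0, cx],
                     [0.0, f, cy],
                     [0.0, 0.0, 1.0]])


def rotation(pan, tilt):
    """Camera rotation for pan about the y axis and tilt about the x axis (radians).

    The axis and sign conventions are an assumption of this sketch.
    """
    cp, sp = np.cos(pan), np.sin(pan)
    ct, st = np.cos(tilt), np.sin(tilt)
    r_pan = np.array([[cp, 0.0, -sp],
                      [0.0, 1.0, 0.0],
                      [sp, 0.0, cp]])
    r_tilt = np.array([[1.0, 0.0, 0.0],
                       [0.0, ct, -st],
                       [0.0, st, ct]])
    return r_tilt @ r_pan


def ptz_homography(state_a, state_b):
    """Homography mapping pixels of view a to view b; each state is (pan, tilt, f)."""
    pan_a, tilt_a, f_a = state_a
    pan_b, tilt_b, f_b = state_b
    k_a, k_b = intrinsics(f_a), intrinsics(f_b)
    r_a, r_b = rotation(pan_a, tilt_a), rotation(pan_b, tilt_b)
    # Valid only for a camera that rotates and zooms without translating.
    return k_b @ r_b @ r_a.T @ np.linalg.inv(k_a)


def warp_point(h, point):
    """Apply homography h to a 2D pixel and return the dehomogenized result."""
    x = h @ np.array([point[0], point[1], 1.0])
    return x[:2] / x[2]


if __name__ == "__main__":
    # Hypothetical states: same orientation, focal length doubled by zooming in.
    h = ptz_homography((0.1, -0.05, 800.0), (0.1, -0.05, 1600.0))
    print(warp_point(h, (10.0, 5.0)))
```

In this idealization, estimating zoom and orientation over time amounts to recovering (pan, tilt, f) for each frame so that the induced homographies agree with the observed image correspondences.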