Author Archives: admin

Mnemosyne: smart environments for cultural heritage

Mnemosyne is a research project carried out by the Media Integration and Communication Center (MICC), University of Florence, together with Thales Italy SpA, and funded by the Region of Tuscany. The goal of the project is to study and experiment with smart environments that adopt natural interaction paradigms to promote artistic and cultural heritage through the analysis of visitors' behaviors and activities.

Mnemosyne Interactive Table at the Museum of Bargello

The idea behind this project is to use techniques derived from video surveillance to design an automatic profiling system capable of understanding the personal interests of each visitor. The computer vision system uses fixed cameras to monitor and analyze the movements and behaviors of visitors in the museum in order to extract a profile of interests for each visitor.

This profile of interest is then used to personalize the delivery of in-depth multimedia content enabling an augmented museum experience. Visitors interact with the multimedia content through a large interactive table installed inside the museum. The project also includes the integration of mobile devices (such as smartphones or tablets) offering a take-away summary of the visitor experience and suggesting possible theme-related paths in the collection of the museum or in other places of the city.

The system operates in full respect of visitor privacy: the cameras and the vision system capture only information on the appearance of the visitor, such as the color and texture of their clothes. The appearance of the visitor is encoded into a feature vector that captures its most distinctive elements. The feature vectors are then compared with each other to re-identify each visitor.
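As a rough illustration of the idea of encoding appearance into a feature vector and comparing vectors, here is a minimal Python sketch. It is not the descriptor used in Mnemosyne: the coarse RGB histogram, the Hellinger distance and the function names (`color_histogram`, `hellinger_distance`, `reidentify`) are all assumptions chosen for brevity.

```python
from math import sqrt


def color_histogram(pixels, bins=4):
    """Quantize (r, g, b) pixels (0..255) into a normalized bins**3 histogram.

    This stands in for an appearance descriptor; it is an assumption,
    not the feature actually used by the project.
    """
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        index = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[index] += 1.0
    total = sum(hist) or 1.0
    return [value / total for value in hist]


def hellinger_distance(p, q):
    """Distance between two normalized histograms (0 = identical, 1 = disjoint)."""
    overlap = sum(sqrt(a * b) for a, b in zip(p, q))
    return sqrt(max(0.0, 1.0 - overlap))


def reidentify(query, known):
    """Return the id in `known` (id -> descriptor) closest to `query`."""
    return min(known, key=lambda vid: hellinger_distance(query, known[vid]))


# Two previously seen visitors, described only by clothing colors.
known = {
    "visitor_red": color_histogram([(200, 10, 10)] * 50 + [(220, 20, 20)] * 50),
    "visitor_blue": color_histogram([(10, 10, 200)] * 100),
}
# A new observation: mostly red clothing with some blue background pixels.
query = color_histogram([(210, 15, 15)] * 80 + [(10, 10, 200)] * 20)
match = reidentify(query, known)
```

Only the histograms are stored and compared here; no face or other biometric data enters the descriptor, which is the privacy property the paragraph above describes.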

Mnemosyne is the first installation in a museum context of a computer vision system to provide visitors with personalized information on their individual interests. It is innovative because the visitor is not required to wear or carry special devices, or to take any action in front of the artworks of interest. The system will be installed, on a trial basis until June 2015, in the National Museum of the Bargello in the Hall of Donatello, in collaboration with the management of the Museum itself.

The project required the work of six researchers (Svebor Karaman, Lea Landucci, Andrea Ferracani, Daniele Pezzatini, Federico Bartoli and Andrew D. Bagdanov) over four years. The installation is the first realization of the Regional Competence Centre NEMECH (New Media for Cultural Heritage), formed by the Region of Tuscany and the University of Florence with the support of the City of Florence.

Claudio Baecchi

Claudio Baecchi was born on December 14th, 1984 in Florence. He received a laurea degree in computer engineering from the University of Florence, with a thesis on Fisher Feature Fusion Forests for visual object recognition. He currently works at the Visual Information and Media Lab of the Media Integration and Communication Centre, University of Florence.

From re-identification to identity inference

Person re-identification is a standard component of multi-camera surveillance systems. Accurate re-identification is essential, particularly in scenarios in which the long-term behaviour of persons must be characterized. In realistic, wide-area surveillance scenarios such as airports, metro and train stations, re-identification systems should be capable of robustly associating a unique identity with hundreds, if not thousands, of individual observations collected from a large distributed network of sensors.

Traditionally, re-identification scenarios are defined in terms of a set of gallery images of a number of known individuals and a set of test images to be re-identified. For each test image or group of test images of an unknown person, the goal of re-identification is to return a ranked list of individuals from the gallery.
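The retrieval formulation described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: descriptors are plain numeric tuples, similarity is Euclidean distance, and the function name `rank_gallery` is hypothetical.

```python
from math import dist


def rank_gallery(test_descriptor, gallery):
    """Rank gallery individuals for one test image.

    gallery maps an identity to the list of descriptors known for it.
    Each identity is scored by its closest gallery descriptor
    (an assumed scoring rule), and identities are returned best first.
    """
    scores = {
        identity: min(dist(test_descriptor, g) for g in descriptors)
        for identity, descriptors in gallery.items()
    }
    return sorted(scores, key=scores.get)


gallery = {
    "A": [(0.0, 0.0), (1.0, 0.0)],
    "B": [(5.0, 5.0)],
    "C": [(2.0, 2.0)],
}
ranking = rank_gallery((1.2, 0.1), gallery)
```

The first entry of `ranking` is the Rank-1 answer; the remaining entries are the lower-ranked candidates returned to the operator.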

Configurations of the re-identification problem are generally classified according to how much group structure is available in the gallery and test image sets. In a single-shot image set there is no grouping information available. Though there might be multiple images of an individual, there is no knowledge of which images correspond to that person. In a multi-shot image set, on the other hand, there is explicit grouping information available. That is, it is known which images correspond to the same individual.

While such characterizations of re-identification scenarios are useful for establishing benchmarks and standardized datasets for experimentation on the discriminative power of descriptors for person re-identification, they are not particularly realistic with respect to many real-world application scenarios. In video surveillance scenarios, it is more common to have many unlabelled test images to re-identify and only a few gallery images available.

Another unrealistic aspect of traditional person re-identification is its formulation as a retrieval problem. In most video surveillance applications, the accuracy of re-identification at Rank-1 is the most critical metric and higher ranks are of much less interest.
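To make the Rank-1 metric concrete, the sketch below computes rank-k accuracy from a set of ranked lists. The function name `rank_k_accuracy` and the toy data are illustrative assumptions, not part of any benchmark protocol.

```python
def rank_k_accuracy(ranked_lists, true_ids, k=1):
    """Fraction of test images whose true identity appears in the top k.

    ranked_lists[i] is the ranked list of gallery identities returned
    for test image i, and true_ids[i] is its ground-truth identity.
    """
    hits = sum(
        1 for ranked, truth in zip(ranked_lists, true_ids) if truth in ranked[:k]
    )
    return hits / len(true_ids)


ranked_lists = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["C", "A", "B"],
    ["A", "C", "B"],
]
true_ids = ["A", "A", "C", "B"]

rank1 = rank_k_accuracy(ranked_lists, true_ids, k=1)
rank2 = rank_k_accuracy(ranked_lists, true_ids, k=2)
```

In a surveillance setting the value at `k=1` is the one that matters most, since an automatic system acts on the single best match rather than on a list.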

Based on these observations, we have developed a generalization of person re-identification which we call identity inference. The identity inference formulation is expressive enough to represent existing single- and multi-shot scenarios, while at the same time also modelling a larger class of problems not discussed in the literature.

In particular, we demonstrate how identity inference models problems where only a few labelled examples are available, but where identities must be inferred for very many unlabelled images. In addition to describing identity inference problems, our formalism is also useful for precisely specifying the various multi- and single-shot re-identification modalities in the literature.

We show how a Conditional Random Field (CRF) can then be used to efficiently and accurately solve a broad range of identity inference problems, including existing person re-identification scenarios as well as more difficult tasks involving very many test images. The key aspect of our approach is to constrain the identity labelling process through local similarity constraints between all available images.
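The following Python sketch illustrates the general shape of such a model; it is not the formulation or the inference procedure of the paper. It assumes a unary cost given by the distance to the nearest labelled image of each identity, a pairwise penalty on k-nearest-neighbour edges when two similar images receive different labels, and iterated conditional modes (ICM) as a simple stand-in optimizer. All names and parameters are hypothetical.

```python
from math import dist, exp


def infer_identities(features, labels, k=2, weight=1.0, sweeps=10):
    """Label every image given a few labelled ones (None = unlabelled)."""
    n = len(features)
    identities = sorted({label for label in labels if label is not None})

    def unary(i, identity):
        # Distance from image i to the closest labelled image of `identity`.
        return min(
            dist(features[i], features[j]) for j in range(n) if labels[j] == identity
        )

    # Local similarity constraints: symmetric k-nearest-neighbour graph.
    edges = [set() for _ in range(n)]
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: dist(features[i], features[j]),
        )
        for j in others[:k]:
            edges[i].add(j)
            edges[j].add(i)

    # Start from the unary-only solution; labelled images stay fixed.
    current = [
        label if label is not None else min(identities, key=lambda c: unary(i, c))
        for i, label in enumerate(labels)
    ]

    for _ in range(sweeps):
        changed = False
        for i in range(n):
            if labels[i] is not None:
                continue

            def cost(candidate):
                # Penalize disagreeing with similar neighbours.
                pairwise = sum(
                    exp(-dist(features[i], features[j]))
                    for j in edges[i]
                    if current[j] != candidate
                )
                return unary(i, candidate) + weight * pairwise

            best = min(identities, key=cost)
            if best != current[i]:
                current[i] = best
                changed = True
        if not changed:
            break
    return current


features = [(0, 0), (0.5, 0), (1, 0), (1.5, 0), (10, 0), (10.5, 0), (11, 0)]
labels = ["A", None, None, None, "B", None, None]
inferred = infer_identities(features, labels)
```

Here only two of the seven images carry a label, which mirrors the few-labelled, many-unlabelled setting described above. ICM finds only a local minimum of the energy; a proper CRF solver would be the natural replacement.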

PITAGORA. Airport Operations Management

The PITAGORA project on Airport Operations Management is financed under the auspices of the POR CReO FESR program of the Region of Tuscany and co-financed by the European Regional Development Fund. The PITAGORA consortium consists of one large enterprise, five SMEs and two universities.

PITAGORA project on Airport Operations Management

The primary goal of the project is to investigate the principal problems in airport operations control: collaboration, resources, and crises. In the course of the two-year project the consortium will design, develop and create innovative prototypes for an integrated platform for optimal airport management.

The PITAGORA platform will be based on an open architecture consisting of the following modules:

  • airport collaboration module;
  • energy resource optimization module;
  • human resources management module;
  • crisis management module;
  • passenger experience module.

MICC is the principal scientific partner in the project consortium and is leader of the Passenger Experience workpackage. In this workpackage the MICC will develop techniques for automatic understanding of passenger activity and behaviour through the use of RGB-D sensors.

The showcase prototype of this work will be a Virtual Digital Avatar (VDA) that interacts with the passenger in order to obtain an estimate of the volume of the passenger's carry-on luggage. The VDA will greet the passenger, asking them to display their hand luggage for non-intrusive inspection. Until the system has obtained a reliable estimate of the volume and dimensions of the passenger's luggage, the VDA will interact with the passenger, asking them to turn the baggage and adjust the system's view of it in order to improve the estimate.
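The interaction loop can be pictured with a small Python sketch. It is purely illustrative: it assumes the RGB-D sensor yields 3D points (in metres) already segmented to the luggage and registered in a common frame, uses an axis-aligned bounding box as the volume estimate, and treats the estimate as reliable once it changes by less than a tolerance between views. None of these choices comes from the project itself.

```python
def bounding_box(points):
    """Return ((dx, dy, dz), volume) of the axis-aligned box around the points."""
    xs, ys, zs = zip(*points)
    dims = (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))
    return dims, dims[0] * dims[1] * dims[2]


def estimate_volume(views, tolerance=0.05):
    """Merge successive views until the volume estimate stabilizes.

    views: iterable of point clouds, one per pose requested by the avatar.
    Returns (volume, views_used, reliable).
    """
    merged = []
    previous = None
    used = 0
    for used, cloud in enumerate(views, start=1):
        merged.extend(cloud)
        _, volume = bounding_box(merged)
        if (
            previous is not None
            and previous > 0
            and abs(volume - previous) / previous <= tolerance
        ):
            return volume, used, True
        previous = volume
    # Ran out of views before the estimate settled.
    return previous, used, False


views = [
    [(0.0, 0.0, 0.0), (0.4, 0.3, 0.1)],  # first glimpse: depth underestimated
    [(0.4, 0.3, 0.2)],                   # after turning: full depth visible
    [(0.2, 0.1, 0.2)],                   # third view adds nothing new
]
volume, views_used, reliable = estimate_volume(views)
```

In this toy run the third view confirms the second, so the loop stops asking for more poses; with only the first two views it would report the estimate as not yet reliable.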

A prototype system for measuring crowd density and passenger flux in airports will also be developed by MICC in the PITAGORA project. This prototype system will be used to monitor queues and to measure critical crowding situations that can occur in airport queues.

Finally, MICC will develop a web application for passenger profiling and social networking inside the airport.

SISSI: Intermodal System Integrated for Security and Signaling on Rail

The SISSI project is a three-year project focusing on the design and development of a multi-sensor portal for train safety. MICC participates in this project. SISSI is funded by the Region of Tuscany and MICC contributes its expertise in video and image analysis to the project in order to analyze passing cargo trains and measure and detect critical situations.

This project involves the exploitation of high-speed sensors (up to 18000Hz), both linear and matrix, in the visible and thermal spectra in order to measure critical factors in passing cargo trains. The matrix sensor (640×480 pixels @ 300Hz) works in the visible spectrum and is used to detect the train pantograph in order to avoid false alarms in the shape analysis system.

Pantograph detection samples

Two linear cameras (4096×1 pixels @ 18500Hz) are used to observe the profile of the train and stitch together a complete lateral image of it. These images can then be used to extract the identifier of each wagon. Finally, two thermal cameras (256×1 pixels @ 512Hz) are used to segment the train by temperature and compute the maximum and average temperature over a grid of sub-regions.
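The two processing steps mentioned here, stitching line scans into an image and computing per-cell temperature statistics, can be sketched as follows. This is an assumed, simplified version in plain Python: each scan line is treated as one image column, temperatures are plain numbers, and the function names are hypothetical.

```python
def stitch_lines(scan_lines):
    """Turn successive line scans into a 2D image.

    Each scan line is one vertical column of the lateral view, so the
    image is the transpose of the list of scans.
    """
    return [list(row) for row in zip(*scan_lines)]


def grid_temperatures(image, rows, cols):
    """Split the image into rows x cols cells and return (max, mean) per cell.

    Assumes rows <= image height and cols <= image width so no cell is empty.
    """
    height, width = len(image), len(image[0])
    stats = []
    for r in range(rows):
        r0, r1 = r * height // rows, (r + 1) * height // rows
        row_stats = []
        for c in range(cols):
            c0, c1 = c * width // cols, (c + 1) * width // cols
            cell = [image[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row_stats.append((max(cell), sum(cell) / len(cell)))
        stats.append(row_stats)
    return stats


# Four successive scans of a 4-pixel thermal line (toy values).
scans = [
    [20, 21, 22, 23],
    [20, 25, 30, 35],
    [40, 41, 42, 43],
    [20, 20, 80, 20],
]
image = stitch_lines(scans)
stats = grid_temperatures(image, rows=2, cols=2)
```

A hot spot such as the value 80 in the last scan shows up as a high maximum in exactly one grid cell, which is what makes the per-cell maximum useful for raising alarms while the mean tracks the overall heating of that region.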

Web Framework for cultural tourism in smart cities

This is a prototype of a web framework for defining and modifying a personalized visit to the city of Florence, accessible through different devices. In particular, the system uses a wall-mounted touchscreen in a visitor center for the initial definition of a city visit plan, which can then be transferred to a mobile phone. Once the route plan is transferred, the mobile application allows the plan to be updated and changed, and gives access to geolocalized information on each Point Of Interest during the visit to the city. An application server platform and a network infrastructure make it possible to record user activities as well as to search and retrieve personalized data.
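One way to picture the transfer of a plan from the touchscreen to the phone is as a small serializable data structure. The sketch below is an assumption about how such a plan could be represented, not the framework's actual data model: the class names, fields, JSON layout and coordinates are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class PointOfInterest:
    name: str
    lat: float
    lon: float


@dataclass
class VisitPlan:
    visitor_id: str
    stops: list = field(default_factory=list)

    def to_json(self):
        """Serialize the plan, e.g. for transfer from the touchscreen."""
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, payload):
        """Rebuild the plan on the receiving device."""
        data = json.loads(payload)
        stops = [PointOfInterest(**stop) for stop in data["stops"]]
        return cls(data["visitor_id"], stops)


# Plan defined on the wall-mounted touchscreen (placeholder coordinates).
plan = VisitPlan("visitor-1", [PointOfInterest("Stop A", 43.77, 11.25)])

# Transferred to the phone, then updated during the visit.
on_phone = VisitPlan.from_json(plan.to_json())
on_phone.stops.append(PointOfInterest("Stop B", 43.78, 11.26))
```

Keeping the plan as plain JSON means the same payload could be stored by the application server and read back by either device.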

People Interacting with the touchscreen

The prototype system is currently under test at the Media Integration and Communication Center of the University of Florence and is developed in a joint project between the University of Florence and the Municipality of Florence. It will be part of the newly started project Social Museum and Smart Tourism that has been funded under the Cluster program of MIUR. It is expected to be in operation by January 1st 2014.

The mobile application interface

Francesco Turchini

Francesco Turchini was born in Florence on December 25th, 1984. He received a master's degree in computer engineering from the University of Florence in 2013, with a thesis on “Fisher Feature Fusion Forests for visual object recognition”. Currently, he is a PhD candidate at the Media Integration and Communication Center, University of Florence. His main interests focus on image and video analysis for machine learning.

User Intentions in Multimedia Information Systems

In this talk Dr. Mathias Lux, Assistant Professor at the Institute for Information Technology (ITEC) at Klagenfurt University, will investigate possibilities, challenges and opportunities for integrating user intentions into multimedia production, sharing, and retrieval.

Mathias Lux, creator of LIRe (Lucene Image Retrieval)

Abstract: how to build better multimedia information systems? Management and organization of multimedia data has become easier thanks to the wide availability of metadata as well as advances in content-based image retrieval (CBIR); these advances, however, do not address what matters the most: the actual users of multimedia information systems. The goals and aims of users, i.e., their intentions, need to be put into focus in some creative way.

An Evaluation of Nearest-Neighbor Methods for Tag Refinement

The success of media sharing and social networks has led to the availability of extremely large quantities of images that are tagged by users. The need for methods that manage the combination of media and metadata efficiently and effectively poses significant challenges. In particular, automatic image annotation of social images has become an important research topic for the multimedia community.

Detected tags in an image using Nearest-Neighbor Methods for Tag Refinement

We propose and thoroughly evaluate the use of nearest-neighbor methods for tag refinement. We performed an extensive and rigorous evaluation using two standard large-scale datasets to show that the performance of these methods is comparable with that of more complex and computationally intensive approaches. Unlike the latter, nearest-neighbor methods can be applied to ‘web-scale’ data.
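To give a flavour of what a nearest-neighbor method for tag refinement looks like, here is a minimal Python sketch of a neighbour-voting scheme. It is one simple variant written for illustration and is not the released ICME13 code: visual features are toy 2D tuples, similarity is Euclidean distance, and the scoring rule (neighbour votes minus the votes expected from overall tag frequency) is an assumption.

```python
from collections import Counter
from math import dist


def refine_tags(query_feature, dataset, k=3, top=2):
    """Suggest tags for an image from the tags of its visual neighbours.

    dataset: list of (feature, tags) pairs for already tagged images.
    """
    neighbours = sorted(dataset, key=lambda item: dist(query_feature, item[0]))[:k]
    votes = Counter(tag for _, tags in neighbours for tag in set(tags))
    prior = Counter(tag for _, tags in dataset for tag in set(tags))
    # Votes among neighbours minus the votes expected by chance.
    score = {tag: votes[tag] - k * prior[tag] / len(dataset) for tag in votes}
    return sorted(score, key=lambda tag: (-score[tag], tag))[:top]


dataset = [
    ((0.0, 0.0), ["beach", "sea"]),
    ((0.1, 0.0), ["sea", "sky"]),
    ((0.0, 0.2), ["sea", "2013"]),
    ((5.0, 5.0), ["city", "2013"]),
    ((5.0, 5.1), ["city", "night"]),
    ((6.0, 5.0), ["city", "2013"]),
]
suggested = refine_tags((0.05, 0.05), dataset, k=3, top=2)
```

The prior correction pushes down tags such as "2013" that are frequent everywhere and therefore say little about the image content. The only expensive step is the neighbour search, which is why this family of methods scales to very large collections when paired with an approximate nearest-neighbour index.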

Here we make available the code and the metadata for NUS-WIDE-240K.

  • ICME13 Code (~8.5 GB, code + similarity matrices)
  • NUS-WIDE-240K dataset metadata (JSON format, about 25 MB): a subset of 238,251 images from NUS-WIDE-270K that we retrieved from Flickr together with user data. Note that NUS is now releasing the full image set subject to an agreement and disclaimer form.

If you use this data, please cite the paper as follows:

@InProceedings{UBBD13,
  author       = "Uricchio, Tiberio and Ballan, Lamberto and Bertini, Marco and Del Bimbo, Alberto",
  title        = "An evaluation of nearest-neighbor methods for tag refinement",
  booktitle    = "Proc. of IEEE International Conference on Multimedia \& Expo (ICME)",
  month        = "jul",
  year         = "2013",
  address      = "San Jose, CA, USA",
  url          = "http://www.micc.unifi.it/publications/2013/UBBD13"
}

Tiberio Uricchio

Tiberio Uricchio received his B.S. and M.S. degrees, both in computer engineering, from the University of Florence, Italy, in 2009 and 2012 respectively. Currently he is a PhD candidate at the Visual Information and Media Lab of the Media Integration and Communication Center at the University of Florence, Italy, under the supervision of Prof. Alberto Del Bimbo. His research interests include image and video understanding with social media analysis and machine learning.
