Multimedia Pattern Recognition

Pattern Recognition in Multimedia Data

»Multimedia Pattern Recognition« covers the research and development of pattern recognition processes, tools and software solutions for voice, image, audio and video data as well as documents. The surge in digital data has increased demand for automatic analysis and recognition methods, because evaluating such enormous amounts of data manually would be prohibitively expensive. The analysis often has to run in real time, for example for the automatic recognition of traffic signs or the generation of subtitles during live broadcasts.

Methodology: Data-driven Procedures and Learning Procedures

Two main principles guide our research and development of pattern recognition technologies. First, recognition is usually performed by statistical classifiers that have previously been trained on extensively annotated data pools. Data-driven learning procedures of this kind are what make robust recognition technologies possible. Second, pattern recognition technology is selected and developed so that it is suitable for real-life scenarios, and each specific application scenario presents its own challenges to research and development teams.

Stable Recognition Technology for Long Runtimes  

During the actual recognition process, statistical classifiers such as deep neural networks, support vector machines and hidden Markov models are applied to the transformed input data. Because the recognition technology is used in production environments, it is important that it behaves stably at runtime and is easy to maintain.
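
The two steps described above, training on annotated data and then applying the classifier to transformed input, can be illustrated with a minimal Python sketch. It uses a nearest-centroid classifier as a deliberately simple stand-in for the deep neural networks, support vector machines and hidden Markov models named in the text. The feature transformation, the function names and the toy data are assumptions made for illustration only and do not reflect the actual software libraries.

```python
from collections import defaultdict
from math import dist


def extract_features(signal):
    """Hypothetical transformation step: map a raw signal to (mean, value range)."""
    mean = sum(signal) / len(signal)
    return (mean, max(signal) - min(signal))


def train(annotated_samples):
    """Compute one feature centroid per label from (signal, label) pairs."""
    grouped = defaultdict(list)
    for signal, label in annotated_samples:
        grouped[label].append(extract_features(signal))
    return {
        label: tuple(sum(axis) / len(axis) for axis in zip(*vectors))
        for label, vectors in grouped.items()
    }


def recognize(centroids, signal):
    """Return the label whose centroid is closest to the transformed signal."""
    features = extract_features(signal)
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Toy annotated data pool (assumed values, for illustration only).
training_data = [
    ([0.1, 0.2, 0.1, 0.0], "silence"),
    ([0.0, 0.1, 0.2, 0.1], "silence"),
    ([0.9, 0.1, 0.8, 0.2], "speech"),
    ([0.7, 0.0, 0.9, 0.3], "speech"),
]

model = train(training_data)
label = recognize(model, [0.8, 0.1, 0.7, 0.2])
```

In a production setting the centroid model would be replaced by one of the classifiers listed above, while the split into a training function and a recognition function would stay the same.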

Research Projects

Our researchers work on pattern recognition technology across all media. The technology is maintained and continually enhanced in extensive software libraries and is tailored to each customer’s specific needs.

Document Analysis

We research and develop innovative algorithms to improve the quality and speed of document analysis. We also continually work on solutions to new problems such as table structure recognition and document classification.

Voice and Audio Analysis

Our research topics cover the indexing of spoken documents as well as pattern recognition on audio data. The emphasis is on improving our German-language voice recognizer. We also research procedures for audio segmentation and voice recognition.

Image and Video Analysis

We optimize basic technologies for object detection, recognition and tracking in terms of runtime, result quality, and resource and energy consumption. These methods are applied not only to camera images and video data but also to ultrasound, hyperspectral, laser and radar imaging.