Audio Mining System for the ARD

Fully automated indexing of the ARD archives

We help journalists and editors quickly find the information they need in the archives of the ARD. To this end, we developed the Audio Mining System, which automatically extracts information from audio and video files (i.e. television and radio programs): the spoken words are transcribed, the program is segmented by speaker, and relevant keywords are added.

The ARD, which is a joint organization of Germany's regional public-service broadcasters, including its international broadcaster Deutsche Welle, owns a large archive of produced programs, dating back to the beginnings of the Federal Republic (West Germany) and the German Democratic Republic (East Germany). Most of this archival content has been digitized over the years.

Each archived item carries manually annotated metadata such as a title, series information or a short abstract. The Audio Mining System automatically analyzes the recordings, extracts the spoken text and integrates this information into the archival systems of the broadcaster. Journalists and editors can then search the entire archive in seconds and find the relevant segments within a program.

Implementation

Fraunhofer IAIS has been collaborating with WDR since 2015 to put the Audio Mining System into practice. The system is operated in the computing center of the ARD and is connected to the archival systems of the individual broadcasters. It runs in a processing cluster so that results are available quickly, even during processing peaks, and is currently sized to process up to 2000 hours of audio and video material per day. Newly published material is processed fully automatically: the spoken content is transcribed and keywords are extracted for each program. The speaker recognition module, which finds original quotes by public figures such as politicians, is in a pilot phase.
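To put the sizing figure in perspective, the following Python sketch converts the stated capacity of 2000 hours per day into an aggregate real-time factor. It also estimates how long a backlog would take to clear. Only the 2000 hours figure comes from the text; the function names, the backlog size and the daily intake used in the example are hypothetical and chosen purely for illustration.

```python
HOURS_PER_DAY = 24


def aggregate_realtime_factor(capacity_hours_per_day: float) -> float:
    """Hours of material processed per wall-clock hour, summed over the cluster."""
    return capacity_hours_per_day / HOURS_PER_DAY


def backlog_days(backlog_hours: float,
                 capacity_hours_per_day: float,
                 new_hours_per_day: float = 0.0) -> float:
    """Days needed to clear a backlog while new material keeps arriving.

    Both backlog_hours and new_hours_per_day are illustrative inputs,
    not figures taken from the ARD installation.
    """
    spare = capacity_hours_per_day - new_hours_per_day
    if spare <= 0:
        raise ValueError("capacity must exceed the daily intake")
    return backlog_hours / spare


# Stated capacity from the text: up to 2000 hours per day.
factor = aggregate_realtime_factor(2000)

# Hypothetical scenario: 10,000 archived hours, 500 new hours per day.
days = backlog_days(10_000, 2000, new_hours_per_day=500)
```

Under these assumptions the cluster as a whole works through roughly 83 hours of material per wall-clock hour. This is an aggregate over all nodes and says nothing about the speed of a single transcription job.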

Customer-specific adjustments

Besides supporting the operation of the Audio Mining System, Fraunhofer IAIS develops further functionality for the particular needs of journalists, editors and investigators of the ARD. Since new topics, new terms and proper names constantly emerge in the news domain, we developed a self-learning speech recognition system that can learn new words and phrases on a daily basis. Moreover, we created customer-specific models that cover certain topics or regional characteristics. In an ongoing development project, we are implementing further features, such as recognizing foreign names and correctly formatting numbers and dates.

Benefits for the customer

Our Audio Mining System opens up the content of archives so that journalists can find relevant information more quickly. The ARD archival systems can thus be used to find programs about specific topics that contain relevant keywords. Within a program, the journalist can then jump to the exact time stamp at which a term is used. By combining speech recognition and speaker recognition, original quotes by known persons can be searched directly.
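The combination of time-stamped transcripts and speaker segments can be illustrated with a minimal Python sketch. It is not the actual implementation or data model of the Audio Mining System: the classes Word and SpeakerSegment, the function find_term and the sample data are all hypothetical. The sketch assumes that speech recognition yields one start time per word and that speaker recognition yields labeled, non-overlapping time ranges.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Word:
    text: str
    start: float  # seconds from the beginning of the program


@dataclass(frozen=True)
class SpeakerSegment:
    speaker: str
    start: float  # inclusive, in seconds
    end: float    # exclusive, in seconds


def speaker_at(segments: List[SpeakerSegment], t: float) -> Optional[str]:
    """Return the speaker label whose segment covers time t, if any."""
    for seg in segments:
        if seg.start <= t < seg.end:
            return seg.speaker
    return None


def find_term(words: List[Word],
              segments: List[SpeakerSegment],
              term: str,
              speaker: Optional[str] = None) -> List[float]:
    """Time stamps at which `term` is spoken, optionally by one speaker only."""
    needle = term.casefold()
    hits = []
    for word in words:
        if word.text.casefold() != needle:
            continue
        if speaker is not None and speaker_at(segments, word.start) != speaker:
            continue
        hits.append(word.start)
    return hits


# Invented sample data for one program.
words = [
    Word("die", 0.0), Word("Energiewende", 0.4), Word("kommt", 1.2),
    Word("Energiewende", 12.5), Word("jetzt", 13.3),
]
segments = [
    SpeakerSegment("Speaker A", 0.0, 10.0),
    SpeakerSegment("Speaker B", 10.0, 20.0),
]

all_hits = find_term(words, segments, "energiewende")
quotes_by_b = find_term(words, segments, "energiewende", speaker="Speaker B")
```

Here all_hits contains every time stamp at which the term occurs, while quotes_by_b keeps only the occurrences that fall inside a segment attributed to the requested speaker. A production system would use an inverted index rather than a linear scan; the loop is kept for readability.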