Vocabulary-based Data Integration

Interlink, integrate and exchange data along digitized value chains: Inspired by agile software and content development, the »VoCol« methodology and tool environment supports vocabulary-based smart data integration to leverage the semantic interoperability within organizations and beyond.

© Fraunhofer IAIS

Data as a Strategic Asset

Nowadays, business processes are being digitized across all industries. Just-in-time manufacturing and mass customization generate vast amounts of data at a faster pace than ever. Specialization and outsourcing multiply the number of actors involved in business exchanges. Data management is adapting to these trends: data quality is assured proactively and data is increasingly considered a strategic asset.

Emerging Company Challenges

Increasing customization, specialization and outsourcing also increase the heterogeneity of the data, formats, structures and schemas involved in business processes. At the same time, the volume of unstructured and semi-structured data is growing exponentially.

In such settings, it is challenging to:

  • guarantee coherent and transparent data management,
  • efficiently leverage data lake repositories and avoid data silos,
  • establish data sharing agreements with strategic business partners for mutual benefit,
  • analyze data in real time and in an integrated manner.

Customizable Solution

Our scalable ontology-based data integration solution »VoCol« effectively establishes interoperability between heterogeneous data sources and IT systems. This is achieved by resolving semantic conflicts using ontologies, which define domain terminology explicitly and precisely.
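
The idea of resolving semantic conflicts through a shared vocabulary can be illustrated with a minimal sketch. It is not the »VoCol« implementation: plain Python dictionaries stand in for an ontology, and every name in it (the `ex:` terms, the source names `erp` and `webshop`, the field names and the function `to_vocabulary`) is a hypothetical example chosen for illustration.

```python
# Hypothetical shared vocabulary: each term has exactly one explicit definition.
VOCABULARY = {
    "ex:productId": "Unique identifier of a product",
    "ex:weightKg": "Product weight in kilograms",
}

# Hypothetical per-source mappings: local field -> (vocabulary term, converter).
# The converters resolve differences in data type and unit.
MAPPINGS = {
    "erp": {
        "ArtNr": ("ex:productId", str),
        "Gewicht_g": ("ex:weightKg", lambda grams: grams / 1000),
    },
    "webshop": {
        "sku": ("ex:productId", str),
        "weight_lb": ("ex:weightKg", lambda pounds: round(pounds * 0.45359237, 3)),
    },
}


def to_vocabulary(source, record):
    """Translate a source-specific record into shared vocabulary terms."""
    result = {}
    for field, value in record.items():
        term, convert = MAPPINGS[source][field]
        if term not in VOCABULARY:
            raise KeyError(f"{term} is not defined in the shared vocabulary")
        result[term] = convert(value)
    return result


# Two systems describe the same product with different names and units.
erp_record = to_vocabulary("erp", {"ArtNr": 4711, "Gewicht_g": 2500})
shop_record = to_vocabulary("webshop", {"sku": "4711", "weight_lb": 5.5})

print(erp_record)
print(shop_record)
```

After translation, both records use the same keys and the same unit, so they can be compared or merged without knowledge of the source systems. In a real deployment the vocabulary would typically be expressed in a standard ontology language rather than in application code.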

We support the creation of enhanced, comprehensive and smart information models that prevent the emergence of disparate data silos. These information models are further evolved in a coherent collaborative work environment.

Boost Digitization

We develop integrated information models at the company, B2B and B2C levels, adhering to applicable industry standards. We provide concise, non-technical overviews of these information models to facilitate strategic decision making. The information models ensure smooth data exchange and interoperability at all levels. They turn data into smart data, significantly boosting its quality and the creation of new products. They accelerate the digitization of business processes and enhance traceability throughout the supply and value chain. Integrating and linking data from different sources can additionally help to uncover unique and value-creating business insights.

Examples of existing large-scale deployments of vocabulary-based integration approaches include:

  • the schema.org initiative (driven by the large search engines Google, Bing and Yandex), which defines vocabularies for Web data and e-commerce, resulting in 15-20% of all Web pages containing such data nowadays,
  • the German and European Digital Library initiatives, which exchange and aggregate metadata about cultural heritage artefacts from thousands of memory institutions (museums, libraries, archives),
  • initiatives in the life science, supply chain and manufacturing domains that aim to define vocabularies for efficient data exchange.
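
To make the schema.org deployment mentioned above more concrete, the following sketch builds a product description using schema.org terms and serializes it as JSON-LD, one of the formats in which such data is embedded in Web pages. The product values (name, SKU, price) are invented for illustration; only the vocabulary terms (`Product`, `Offer`, `name`, `sku`, `offers`, `price`, `priceCurrency`) come from schema.org.

```python
import json

# Product description using schema.org vocabulary terms.
# All values below are made-up example data.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Widget",
    "sku": "4711",
    "offers": {
        "@type": "Offer",
        "price": "19.99",
        "priceCurrency": "EUR",
    },
}

# Serialize to JSON-LD text, as it could be embedded in a Web page.
json_ld = json.dumps(product, indent=2)
print(json_ld)
```

Because every publisher uses the same terms, a consumer such as a search engine can interpret the price and currency without a site-specific parser.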

Reference Projects

  • International Data Spaces: Digital Sovereignty over Data (initiative with more than 100 industrial partners funded by the Federal Ministry of Education and Research).
  • bIoTope – Building an IoT Open Innovation Ecosystem for Connected Smart Objects (research project funded by the European Commission).
  • Individual research projects with different industrial partners in the automation, manufacturing and consumer electronics domain.


  • Grangel-González, Halilaj, Auer, Lohmann, Lange, Collarana: »An RDF-based Approach for Implementing Industry 4.0 Components with Administration Shells«. In: 21st International Conference on Emerging Technologies and Factory Automation (ETFA ’16). IEEE.
  • Halilaj, Grangel-González, Coskun, Auer: »Git4Voc: Git-Based Versioning for Collaborative Vocabulary Development«. In: 10th International Conference on Semantic Computing (ICSC ’16), pp. 285–292. IEEE.
  • Petersen, Lange, Auer, Frommhold, Tramp: »Towards Federated, Semantics-based Supply Chain Analytics«. In: 19th International Conference on Business Informatics Systems (BIS ’16), pp. 436–447. Springer.


Let’s discuss how vocabulary-based smart data integration can boost digitization in your organisation: