Text Analytics

Conversations on social media and the web, reports and emails written in companies, legal documents and patent information, knowledge stored in libraries and scientific publications – text is the predominant form of communication in today's society. Understanding texts can give insight into what people think, tap into knowledge that is waiting to be exploited, and help humans and machines communicate efficiently. Significant advances have been made in recent years towards understanding texts and discovering their semantics.

Techniques and methods

We investigate and develop algorithms for semantically mining various forms of unstructured text data, including topic detection to efficiently summarize large text corpora, named entity and relation extraction to obtain semantic information, and sentiment analysis to capture the subjective, emotional tone of the author. Novel deep learning algorithms make text analytics scalable to very large collections of data. This, coupled with linguistic know-how and long-standing practical experience, allows us to gain new insights into all kinds of texts.

Our research focuses on text mining for challenging, innovative applications. These include:

  • Mining social media and the web to uncover trends, follow discussions, and chart emotions and interests
  • Extracting high-level information from technical documents, business documents such as contracts, patents, scientific literature, and other textual sources
  • Learning from structured and unstructured data, combining textual information with information from other data sources such as databases or linked open data
  • Mining medical texts, e.g. from electronic health records