-
Home
-
European Projects
-
Large-scale, Cross-lingual Trend Mining and Summar.. (TrendMiner)
Large-scale, Cross-lingual Trend Mining and Summarisation of Real-time Media Streams
(TrendMiner)
Start date: Nov 1, 2011,
End date: Oct 31, 2014
PROJECT
FINISHED
The recent massive growth in online media and the rise of user-authored content (e.g weblogs, Twitter, Facebook) has lead to challenges of how to access and interpret these strongly multilingual data, in a timely, efficient, and affordable manner. Scientifically, streaming online media pose new challenges, due to their shorter, noisier, and more colloquial nature. Moreover, they form a temporal stream strongly grounded in events and context. Consequently, existing language technologies fall short on accuracy, scalability and portability.The goal of this project is to deliver. innovative, portable open-source real-time methods for cross-lingual mining and summarisation of large-scale stream media.TrendMiner will achieve this through an inter-disciplinary approach, combining deep linguistic methods from text processing, knowledge-based reasoning from web science, machine learning, economics, and political science. No expensive human annotated data will be required due to our use of time-series data (e.g. financial markets, political polls) as a proxy. A key novelty will be weakly supervised machine learning algorithms for automatic discovery of new trends and correlations. Scalability and affordability will be addressed through a cloud-based infrastructure for real-time text mining from stream media.Results will be validated in two high-profile case studies: financial decision support (with analysts, traders, regulators, and economists) and political analysis and monitoring (with politicians, economists, and political journalists).The techniques will be generic with many business applications: business intelligence, customer relations management, community support. The project will also benefit society and ordinary citizens by enabling enhanced access to government data archives, summarisation of online health information , and tracking of hot societal issues.TrendMiner addresses Objective ICT-2011.4.2 Language Technologies, target outcome b) Information access and mining.