CSIDM-200803: Multimodal-based Immersive Environment for Personalized Browsing and Language Learning

Principal Investigators:

This project aims to create an immersive environment that combines multiple modalities (video, audio and text) to provide users with personalized services. We use sports games as the application domain to implement and illustrate the functions of the immersive environment, and we focus on two applications built on it: personalized browsing and language learning. The specific technical objectives are:

(A) Creation of the immersive environment. An immersive environment here refers to a 3D virtual environment that enables users to view events and objects from any angle. We will use multi-view 2D videos to create the 3D virtual environment and support multi-granularity 3D scene generation.

(B) Synchronization of multi-modal signals. We will use external sources to assist video semantic analysis and integrate these sources into the immersive environment. For sports games, we use web-cast text and commentator speech to help detect semantic events in the video. The main challenge is aligning the text, speech and video that describe the same event.

(C) Personalized browsing of sports games in the immersive environment. We will extract events in sports games using multi-modal features and learn users’ preferences using machine learning. By matching detected events against the user preference model, users can browse the events they want from any angle in the immersive environment.

(D) Language learning in the immersive environment. For the detected events, we will provide related text and speech in different languages that describe them, so that users can learn different languages in the immersive environment.

(E) Prototype system for a multi-modal immersive sports game environment. We will build a prototype system for sports games that allows users to view events according to their preferences and to learn the languages used to describe those events.
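
As a rough illustration of objectives (B) and (C), the sketch below pairs time-stamped web-cast text events with candidate video segments and then ranks the aligned events against a user preference model. It is a minimal sketch under stated assumptions, not the project's method: the names TextEvent, VideoSegment, align_events and rank_events, the clock_offset_s and tolerance_s parameters, and the representation of the preference model as per-label weights are all hypothetical. The project itself describes using multi-modal features and machine learning for these steps.

```python
from dataclasses import dataclass


@dataclass
class TextEvent:
    """A web-cast text entry (hypothetical structure)."""
    clock_s: float      # time stamp reported in the web-cast text, in seconds
    label: str          # event type, e.g. "goal"
    description: str    # text describing the event


@dataclass
class VideoSegment:
    """A candidate event segment detected in the video (hypothetical structure)."""
    start_s: float
    end_s: float


def _distance(t, seg):
    """Distance in seconds from time t to a segment (0 if t falls inside it)."""
    if t < seg.start_s:
        return seg.start_s - t
    if t > seg.end_s:
        return t - seg.end_s
    return 0.0


def align_events(text_events, segments, clock_offset_s=0.0, tolerance_s=30.0):
    """Pair each text event with the nearest video segment in time.

    clock_offset_s is an assumed constant offset between the web-cast clock
    and the video timeline. Events with no segment within tolerance_s are
    left unaligned and skipped.
    """
    aligned = []
    for ev in text_events:
        if not segments:
            break
        t = ev.clock_s + clock_offset_s
        best = min(segments, key=lambda seg: _distance(t, seg))
        if _distance(t, best) <= tolerance_s:
            aligned.append((ev, best))
    return aligned


def rank_events(aligned, preference_weights, top_k=3):
    """Rank aligned events by an assumed per-label preference weight."""
    scored = [(preference_weights.get(ev.label, 0.0), ev, seg)
              for ev, seg in aligned]
    # Stable sort: ties keep their original (chronological) order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(ev, seg) for score, ev, seg in scored if score > 0.0][:top_k]


if __name__ == "__main__":
    texts = [
        TextEvent(120.0, "goal", "Goal scored from the edge of the box"),
        TextEvent(300.0, "foul", "Foul in midfield"),
        TextEvent(900.0, "corner", "Corner kick"),
    ]
    segments = [VideoSegment(125.0, 140.0), VideoSegment(290.0, 310.0)]
    pairs = align_events(texts, segments)
    for ev, seg in rank_events(pairs, {"goal": 0.9, "foul": 0.2}):
        print(ev.label, seg.start_s, seg.end_s)
```

Nearest-in-time matching with a tolerance is only the simplest possible alignment rule; it assumes the web-cast clock and the video timeline differ by a known constant offset, which may not hold in practice.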

The proposed project aligns with CSIDM’s “Immersive language learning” vision by contributing our immersive environment creation solution as an enabling technology for the multi-lingual learning market.