An auditory scene is a semantically consistent sound segment characterised by a few dominant sound sources. Auditory scene categorization is the task of automatically grouping auditory scenes with similar semantic events in an audio stream. The technique is in demand for many multimedia applications, e.g., semantic event detection, audio indexing in video clips and context-aware computing.
In this project, a state-of-the-art technique for auditory scene categorization will be investigated in order to build a system for analysing the audio content of TV shows and movies. The main issues to be investigated include the intermediate-level representation of audio features, unsupervised categorization methods and the mapping between the intermediate-level representation and semantic categories. The deliverable of this project will be an auditory scene categorization system that segments an audio stream into audio segments corresponding to semantic categories.
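To give a feel for the kind of pipeline involved, the following is a toy sketch only, not the technique the project will investigate. It assumes a single frame-level feature (short-time energy), a two-cluster 1-D k-means as the unsupervised step, and a final pass that merges consecutive frames with the same cluster label into segments. The function names, the choice of feature and the number of clusters are all illustrative assumptions; a real system would use richer features and would still need the mapping from clusters to semantic categories.

```python
from itertools import groupby


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame (assumed feature)."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def two_means(values, iterations=20):
    """Toy 1-D k-means with k=2, initialised at the minimum and maximum value."""
    low, high = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to the nearer of the two centroids.
        labels = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        # Update each centroid; an empty cluster keeps its previous centroid.
        if group0:
            low = sum(group0) / len(group0)
        if group1:
            high = sum(group1) / len(group1)
    return labels


def labels_to_segments(labels, frame_seconds):
    """Merge runs of identical frame labels into (start, end, label) tuples."""
    segments = []
    index = 0
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        start = index * frame_seconds
        end = (index + length) * frame_seconds
        segments.append((start, end, label))
        index += length
    return segments


# Synthetic "audio": quiet, loud, quiet. Frame length and duration are made up.
samples = [0.0] * 8 + [1.0] * 8 + [0.0] * 4
energies = frame_energies(samples, frame_len=4)
labels = two_means(energies)
segments = labels_to_segments(labels, frame_seconds=0.5)
```

Here the cluster labels 0 and 1 carry no meaning by themselves; attaching names such as "speech" or "music" to them is the mapping problem mentioned above.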
Prerequisites: Good programming skills are essential. An interest in
speech information processing and pattern recognition, as well as a good
mathematics background, would be an advantage.