Special Research Initiatives - Grant ID: SR0354596
Funder
Australian Research Council
Funding Amount
$20,000.00
Summary
Perception and Action in Auditory Scenes (PAAS): Neural, Behavioural, Computational and Mechanical Systems. Auditory scenes are temporal and ephemeral yet pervasively influence human life. How humans negotiate such scenes has not been solved, a fact highlighted by attempts to build machines to respond to speech, warnings etc., in real-world situations with room reverberation, different talkers, and background noise. No one discipline can solve such problems. In this network outstanding researchers from physical, medical, human, and social sciences with interests in speech, music and audition will provide insights into how humans and machines localize, recognize, interpret and produce auditory events, and advance frontier technologies, e.g., automatic speech recognition, hearing prostheses, auditory monitoring/warning systems.
Linkage Infrastructure, Equipment And Facilities - Grant ID: LE100100211
Funder
Australian Research Council
Funding Amount
$650,000.00
Summary
The Big Australian Speech Corpus: An audio-visual speech corpus of Australian English. Contemporary speech science and technology are driven by the availability of large speech corpora. While audio databases exist for languages spoken in America, Europe and Japan, there is currently no large auditory-visual database of spoken language, and certainly not one for Australian English. Here we will establish the Big Australian Speech Corpus, which will support speech science research and development using Australian English and facilitate the development of Australian speech technology applications, from automatic speech recognition and text-to-speech synthesis used in taxi and other ordering services, to hearing prostheses and talking head aids for learning-impaired children, and a range of security and forensic applications.
Cognitive Modelling of Computer Games Pidgins. This project develops a pidgin language for use in computer games and models the way human users exploit such languages. It has implications also for computer assisted collaborative work and other educational or entertainment interactive environments like computer games. By developing mini-languages for games environments we can dramatically simplify the speech recognition problem and make recognition robust across different speech cultures and backgrounds. We use protocol analysis and markup techniques for modelling dialogues between human player and reactive agents in computer games.
How, What and Who in Human Communication: Movement of Face and Voice. The aim of this project is to identify the essential characteristics of tone, affect, and identity from face and voice using a combination of signal processing, biological, and behavioural techniques in order to develop a comprehensive model of auditory-visual speech processing and communication. This research will significantly improve understanding of the basis of auditory-visual perception and production in tonal languages and in affective communication, facilitate links between neurophysiological processes and auditory-visual speech processing; and contribute to applications in automatic person recognition, automatic speech recognition, text-to-speech systems, and talking head aids for the hearing impaired.
Filters reveal what flicker conceals: temporal processing in the human visual system. I have recently discovered a new form of camouflage using 10 Hz luminance flicker. This project will quantify this effect and examine the extent to which it generalises across colour and spatial dimensions and to video sequences depicting natural scenes. This information is expected to provide foundational information to technologies relating to national security that rely on visual concealment. This research will examine the extent to which filtering out these camouflaging frequencies enhances our sensitivity to low temporal frequency information. This decamouflaging aspect of my research is expected to improve the clarity of digital video-based technologies including ultrasound, educational, info-tainment and defence applications.
Broadcasting 3D Audio: Recording, Transmission, and Playback. With the current state of the art, a performance at the Sydney Opera House cannot be recorded and broadcast such that you can listen to it as if you are in the best seat of the house. The goal of our project is to develop the ultimate form of multi-channel audio broadcasting to create this experience. We will develop and implement effective systems for recording, broadcasting and playback of 3D audio in three different scenarios: individual headphone reproduction; small loudspeaker array reproduction; and large loudspeaker array reproduction. We will create optimal recording techniques and broadcasting software for each of these playback techniques.
Temporal segmentation, leadership and cognition in musical improvisation and creativity. Improvisation is core to conversation and to creative and social emergence. This project investigates musical improvisation, in order to reveal constituent processes, using computational and cognitive approaches. Mechanisms for generating transitions in the temporal stream, and for asserting social power or position in it are assessed. Improvised material can be explored, modified, and developed in the creative process, and the project investigates how this occurs and whether computers can facilitate the process. Such contributions can be critical to the development of innovation in research and cultural arenas in Australia.