Design Project



Supervisor: George Tzanetakis

GTZ1 Speech responsive robot using microphone array and prosodic analysis

The goal of this project is to make a robot respond to speaker location and non-verbal speech characteristics (such as emotion, gender, or age). For example, as a speaker moves around, the robot would analyze the microphone array input and turn its head to follow the movement. As another example, it might react differently to a child than to an adult. Speech analysis combined with machine learning can be used to train a model that discriminates between adult and child speakers. The project combines mechatronics and digital audio signal processing. The robot does not have to be complicated and can either be built using existing robot building platforms or built from scratch using sensors and actuators.
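
As a rough illustration of how a microphone array can indicate speaker direction, the sketch below estimates the time difference of arrival between two microphones by brute-force cross-correlation and converts it to a bearing. It is only one possible starting point, not a prescribed method for the project. The function names, the two-microphone setup, the far-field assumption, and the 343 m/s speed of sound are all assumptions made for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air (assumption)


def best_lag(left, right, max_lag):
    """Return the lag in samples at which `right` best matches `left`.

    A positive lag means the sound reached the right microphone later,
    i.e. the source is closer to the left microphone.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, sample in enumerate(left):
            m = n + lag
            if 0 <= m < len(right):
                score += sample * right[m]
        if score > best_score:
            best, best_score = lag, score
    return best


def bearing_degrees(lag, sample_rate, mic_spacing):
    """Convert a lag into a far-field bearing in degrees.

    0 is straight ahead; positive angles point toward the left microphone.
    `mic_spacing` is the distance between the microphones in metres.
    """
    delay = lag / sample_rate
    # Clamp so that noisy lags cannot push asin outside its domain.
    ratio = max(-1.0, min(1.0, delay * SPEED_OF_SOUND / mic_spacing))
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    # Hypothetical test signal: an impulse that arrives 3 samples later
    # at the right microphone.
    left = [0.0] * 32
    right = [0.0] * 32
    left[10] = 1.0
    right[13] = 1.0
    lag = best_lag(left, right, max_lag=8)
    print(lag, bearing_degrees(lag, sample_rate=16000, mic_spacing=0.2))
```

A robot head controller could poll such an estimate periodically and turn toward the reported bearing. A real array would likely need more microphones and a more robust correlation method to cope with noise and reverberation.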

GTZ2 Sound Spatialization System using Novel Controllers

Using multi-channel loudspeaker systems and signal processing, it is possible to place individual sound sources at particular locations in space as well as move them along user-specified trajectories. Sound spatialization is especially important for games and movies. Currently, spatialization is performed using controllers such as sliders and joysticks. In this project, students will be required to implement different spatialization methods suitable for interactive multi-channel rendering, and to design and implement novel ways of controlling them. Some ideas are: using the Wii controller, the Radio Drum (two sticks that each continuously transmit x, y, and z coordinates), or a touchscreen with gestural input such as the iPhone. The project requires a combination of audio signal processing, software development (for the real-time interactive rendering), and sensor data processing.
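
To make the idea of placing a source in space concrete, the sketch below computes loudspeaker gains for a source at a given azimuth, using equal-power panning between the two nearest loudspeakers. It is one simple spatialization method among many, not the one the project mandates. The function name `ring_gains` and the layout (loudspeakers equally spaced on a circle, loudspeaker 0 at 0 degrees) are assumptions made for the example.

```python
import math


def ring_gains(azimuth_deg, num_speakers):
    """Return one gain per loudspeaker for a source at `azimuth_deg`.

    Assumes loudspeakers equally spaced on a circle, with loudspeaker 0
    at 0 degrees. Only the two loudspeakers adjacent to the source are
    active, and their gains follow a sine/cosine law so that the summed
    power stays constant as the source moves.
    """
    if num_speakers < 2:
        raise ValueError("need at least two loudspeakers")
    spacing = 360.0 / num_speakers
    position = (azimuth_deg % 360.0) / spacing
    lower = int(position) % num_speakers
    upper = (lower + 1) % num_speakers
    fraction = position - int(position)
    gains = [0.0] * num_speakers
    gains[lower] = math.cos(fraction * math.pi / 2)
    gains[upper] = math.sin(fraction * math.pi / 2)
    return gains


if __name__ == "__main__":
    # Sweep a source around a hypothetical four-loudspeaker ring.
    for azimuth in range(0, 360, 45):
        print(azimuth, [round(g, 3) for g in ring_gains(azimuth, 4)])
```

A controller such as a joystick, the Radio Drum, or a touchscreen would then only need to supply the azimuth (and possibly distance) over time, with the gains recomputed for each audio block.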

GTZ3 Distributed sound processing using "physical" delay lines

The goal of this project is to do sound processing using several computers connected through a fast network. Many synthesis and sound processing algorithms use delay lines, components that delay incoming samples by adjustable amounts, as building blocks. Network connections always involve some amount of delay. The main idea is to use these "physical" delays in the network as a way of implementing delay lines. That way, the overall latency of the sound processing could be reduced and the computational cost of simulating delay lines could be avoided. This project involves skills in network programming, software engineering, and audio signal processing.
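
For reference, a conventional software delay line, the component that the network delay would stand in for, can be sketched as a circular buffer. This is a minimal illustration with integer sample delays only. The class name `DelayLine` and its methods are assumptions made for the example, not part of any existing library.

```python
class DelayLine:
    """Delay incoming samples by an adjustable whole number of samples."""

    def __init__(self, max_delay):
        # One extra slot so that a delay of `max_delay` samples is valid.
        self.buffer = [0.0] * (max_delay + 1)
        self.write_index = 0
        self.delay = max_delay

    def set_delay(self, delay):
        """Change the delay; it must fit inside the allocated buffer."""
        if not 0 <= delay < len(self.buffer):
            raise ValueError("delay out of range")
        self.delay = delay

    def process(self, sample):
        """Store `sample` and return the sample written `delay` steps ago."""
        self.buffer[self.write_index] = sample
        read_index = (self.write_index - self.delay) % len(self.buffer)
        delayed = self.buffer[read_index]
        self.write_index = (self.write_index + 1) % len(self.buffer)
        return delayed


if __name__ == "__main__":
    line = DelayLine(max_delay=3)
    print([line.process(x) for x in [1.0, 2.0, 3.0, 4.0, 5.0]])
```

In the distributed version proposed here, the buffer would in effect be replaced by the transit time of audio packets between machines, so the achievable delays would depend on the network rather than on a chosen buffer length.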