Below are links to sound recordings of the AudioElla system in action. For these recordings, the AudioElla system was run on an IBM ThinkPad® laptop computer with 256 MB of RAM and a 1000 MHz Pentium processor. This is comparable to the computer hardware contemplated for use in deployed systems.
The voice recognition function requires no user voice training, and users can be changed at any time. The conversations are completely hands-free after the program is started. Ella stops listening while she is talking, and resumes listening when she is finished, so there is no "push-to-talk" button needed.
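The listen/talk alternation described above can be pictured as a small two-state gate. The following Python sketch is only an illustration under stated assumptions: the class `HalfDuplexGate` and its method names are hypothetical and are not taken from Robitron.exe or EllaZ.dll.

```python
from enum import Enum, auto


class State(Enum):
    LISTENING = auto()
    SPEAKING = auto()


class HalfDuplexGate:
    """Hypothetical gate that ignores microphone input while speech is playing."""

    def __init__(self):
        self.state = State.LISTENING
        self.heard = []  # utterances passed on to the recognizer

    def start_speaking(self):
        # Synthesis begins: stop listening so the system does not hear itself.
        self.state = State.SPEAKING

    def finish_speaking(self):
        # Synthesis ends: resume listening automatically, with no button press.
        self.state = State.LISTENING

    def on_audio(self, utterance):
        # Accept input only while listening; report whether it was accepted.
        if self.state is State.LISTENING:
            self.heard.append(utterance)
            return True
        return False


gate = HalfDuplexGate()
gate.on_audio("tell me a joke")         # accepted
gate.start_speaking()
gate.on_audio("echo of Ella's voice")   # ignored while she is talking
gate.finish_speaking()
gate.on_audio("another one")            # accepted again
print(gate.heard)
```

Because the gate reopens as soon as synthesis finishes, the user never has to signal that it is their turn, which is what removes the need for a "push-to-talk" button.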
Three different AT&T Natural Voices™ were used for voice synthesis, namely Crystal (American English), Mike (American English) and Audrey (British English). Recorded voices feature Robby Garner (jokes, limericks and poems) and Zhang Ying (slot machine symbols). The voice recognition engine used is ScanSoft's VoCon 3200® SDK. The interface program is Robitron.exe (C++), prepared by Robby Garner. Finally, EllaZ.dll (VB.Net) is the natural language response engine.
The recordings were made at the Tianjin office of EllaZ Systems during actual operation of the AudioElla system. The recordings were edited for length and content, but all the recorded inputs and responses reflect actual system interaction. For the Car Manual Mode recording, a few steps were taken to reflect potential operating conditions: the windows were opened (on a windy day with traffic noise), instrumental music was played in the background, and an inexpensive microphone was placed about two feet away.
Simply click on the links below to listen to recordings of conversations with the system. They should open automatically in your browser's default audio player (such as Windows Media Player®). The recordings are MP3 files saved at a bit rate of 32 kbps. We also have the files available at a higher-quality 128 kbps bit rate, but those are rather large for typical web browser access.
AudioElla has other features and capabilities that we haven't yet made sample recordings of. They include the ability to:
Another feature under development is an "Interrupt Button" that instantly silences the system when pressed. When the button is pressed again, the system resumes. If a long-playing feature was active (e.g., book reading or a music track), Ella will ask the user whether they want to resume the prior activity where it left off or go back to the Base Mode. The Interrupt Button will likely require multi-threading in the interface program.
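To show why multi-threading comes into play, here is a minimal Python sketch of an interrupt toggle. It is an assumption-laden illustration, not the Robitron.exe design: `InterruptiblePlayer`, `press_interrupt`, and `play_chunk` are hypothetical names, playback is simulated by appending to a list, and the question Ella would ask the user is reduced to a direct resume. The sketch silences playback at the next chunk boundary, so a real system would need small chunks (or a way to stop audio output mid-chunk) to feel instant.

```python
import threading


class InterruptiblePlayer:
    """Hypothetical long-playing feature that can be silenced and resumed."""

    def __init__(self, chunks, play_chunk):
        self.chunks = list(chunks)
        self.play_chunk = play_chunk
        self.position = 0                     # index of the next chunk to play
        self._silenced = threading.Event()

    def press_interrupt(self):
        """Toggle silence; return True if the system is now silenced."""
        if self._silenced.is_set():
            self._silenced.clear()
        else:
            self._silenced.set()
        return self._silenced.is_set()

    def run(self):
        # Worker-thread loop: check the flag before each chunk, keep position.
        while self.position < len(self.chunks) and not self._silenced.is_set():
            self.play_chunk(self.chunks[self.position])
            self.position += 1


played = []
reached_page_2 = threading.Event()
may_continue = threading.Event()


def play_chunk(chunk):
    # Stand-in for audio output; pauses on "page 2" so the demo is repeatable.
    played.append(chunk)
    if chunk == "page 2":
        reached_page_2.set()
        may_continue.wait()


player = InterruptiblePlayer(["page 1", "page 2", "page 3"], play_chunk)

worker = threading.Thread(target=player.run)
worker.start()
reached_page_2.wait()                   # playback is now in progress
silenced = player.press_interrupt()     # first press: silence
may_continue.set()
worker.join()
paused_at = player.position             # where playback can pick up again

player.press_interrupt()                # second press: allow resuming
worker = threading.Thread(target=player.run)
worker.start()
worker.join()
print(paused_at, played)
```

Playback runs on a worker thread while the main thread stays free to notice the button press, which is the reason a single-threaded interface loop would struggle with this feature.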
There are no technical limitations to using EllaZ Systems' natural language interaction for the control of external systems. Commands to turn on lights, adjust thermostats, set alarm systems, and so on can be implemented. Programs can contain verification steps (as shown in some of the above recordings), but of course, caution should be exercised in using natural language control of safety-critical systems.
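A verification step of the kind mentioned above might look like the following Python sketch. Everything in it is hypothetical: `handle_command`, the `confirm` callback, and the `devices` dictionary merely stand in for a real recognizer and real external hardware.

```python
def handle_command(command, confirm, devices):
    """Apply a device command only after the user confirms it.

    command -- (device, setting) tuple, e.g. ("lights", "on")
    confirm -- callable that asks the user a question and returns the reply
    devices -- dict standing in for the state of the external system
    """
    device, setting = command
    reply = confirm(f"Do you want me to set {device} to {setting}?")
    if reply.strip().lower() in {"yes", "yes please", "ok"}:
        devices[device] = setting
        return f"{device} set to {setting}."
    # Any other reply, including an unrecognized one, leaves the device alone.
    return f"Leaving {device} unchanged."


devices = {"lights": "off", "thermostat": "68"}
first = handle_command(("lights", "on"), lambda question: "Yes", devices)
second = handle_command(("thermostat", "90"), lambda question: "no", devices)
print(first)
print(second)
print(devices)
```

Treating anything other than a clear "yes" as a refusal is a deliberately conservative choice, in keeping with the caution advised for safety-critical systems.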
The purpose of these various demonstrations is to show how EllaZ Systems bring together computer technologies to make natural language computer interaction possible. The examples and similar possibilities can be implemented in a practical fashion, using currently available hardware at modest cost. Possible deployments include embedded systems, PC programs, and networked systems. System application options include internet access, entertainment programs, disabled-access programs, automotive interfaces, PDAs, and mobile phones. A wide range of possibilities exists, limited only by imagination and commitment to implementation.
Copyright © 2007 EllaZ Systems. All Rights Reserved.