Web Posted on: February 13, 1998
DRACULAvox: A NEW PHILOSOPHY OF NAVIGATION IN THE GUI USING A SPEECH SYNTHESIZER
by Junko FUKUDA and Ioan MONTANE, PhD,
President of euroBRAILLE,
Paris, France
WHAT IS THE PROBLEM?
In order to navigate GUIs with a speech synthesizer, we have to solve the following problems:
- The GUI is a collection of images, graphical objects and text, but a speech synthesizer can only read text.
- In a GUI, the information on a screen can amount to 2,000 to 5,000 characters of text, but a speech synthesizer can read only 10 to 14 characters per second (3 to 6 minutes per screen).
- The GUI is normally controlled directly with a mouse; a speech synthesizer can only be controlled through an interactive keypad.
- The organization of the GUI is intuitive for sighted people, who can grasp the overall situation at a glance without reading much information in detail. The speech synthesizer can only read text or a summary.
- The GUI is based on the notion of a tree structure (arborescence), but presenting that tree through a speech synthesizer is not straightforward.
- The GUI uses a large number of attributes such as color, size, font and shape to convey information. The speech synthesizer offers only a few different voices (female, male, old, young, etc.).
- Navigation speed in the GUI is optimal for the sighted user because of its intuitive display. The sightless user needs an acceptable navigation speed in order to be as efficient as the sighted user.
WHAT CAN WE DO?
We can name each graphical object in the GUI.
We can abbreviate the names of graphical objects in order to read the maximum amount of information in a limited time. For example, instead of "MICROSOFT WORD" we can read the abbreviation "MWORD".
We can associate a set of functions with each word that represents an object. These functions are activated through an interactive, ergonomic keypad, so that the sightless user can navigate the GUI.
We can mark a word-object by pressing a key while it is being pronounced, and then navigate up-down and right-left starting from this position.
We can group the objects of the GUI, which we conventionally call elementary objects, into composed objects built on the basis of multiple criteria.
We can present a parent with all its direct children on a single line. This line can be read sequentially by the speech synthesizer.
We can apply these principles at every level to reorganize the information so as to give the sightless user a reasonable navigation speed in the GUI.
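As an illustration of these principles, here is a minimal sketch in Python; the class and function names (WordObject, read_line, say) are hypothetical and only show how an object could carry a complete name and an abbreviation, and how a parent and its direct children could be read on one line.

    # Minimal sketch (hypothetical names): a GUI object reduced to a word,
    # its abbreviation, and a parent read with its children on one line.

    class WordObject:
        def __init__(self, name, abbreviation, children=None):
            self.name = name                  # complete name, e.g. "MICROSOFT WORD"
            self.abbreviation = abbreviation  # short form read by default, e.g. "MWORD"
            self.children = children or []    # direct children of this object

    def say(text):
        # Placeholder for the speech synthesizer output.
        print(text)

    def read_line(parent):
        # Present the parent followed by all its direct children on one line,
        # so the speech synthesizer can read it sequentially.
        words = [parent.abbreviation] + [child.abbreviation for child in parent.children]
        say(" ".join(words))

    word = WordObject("MICROSOFT WORD", "MWORD")
    excel = WordObject("MICROSOFT EXCEL", "MEXCEL")
    desktop = WordObject("DESKTOP", "DESK", [word, excel])
    read_line(desktop)   # speaks: "DESK MWORD MEXCEL"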
DRACULAvox: A NEW PHILOSOPHY OF NAVIGATION
THE GOAL OF THE PHILOSOPHY is to provide the sightless user with a single, efficient method of navigating GUIs using a speech synthesizer.
THE THEORETICAL BASIS OF THIS PHILOSOPHY. The GUI was designed for sighted users. Please do not try to navigate a GUI with a speech synthesizer as you would a text or DOS environment.
We can imagine a VUI (Voice User Interface) as the counterpart of the GUI, carrying identical information organized in a different way, so that it can be read directly by a speech synthesizer according to its importance.
We can approach this problem by considering that:
- The GUI is a polychrome, single-layer medium carrying graphical information.
- The VUI is a monochrome, multi-layer medium carrying ASCII information.
If we can transform the GUI information into VUI information (a job for a software specialist) and develop a way to navigate with interactive functions directly associated with the words, then the problem is solved. At any given moment the speech synthesizer reads the information of one monochrome layer; the rest of the information, held in the other layers, can then be read with the appropriate interactive function keys.
For instance, one possible organization of the information in the VUI would use parallel, superposed layers with a perpendicular correspondence between them, like this (a sketch follows the list):
- Layer 1: the complete names of the GUI objects.
- Layer 2: the abbreviated names of the objects presented in the GUI.
- Layer 3: the complete information on the object types and their lists of attributes. Etc.
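A minimal sketch of this layered organization, with purely hypothetical object names and layer contents, might look like this in Python:

    # Sketch (hypothetical data): three parallel layers describing the same
    # GUI objects. The synthesizer reads one layer at a time; the function
    # keys switch layers for the current list of objects.

    objects = ["button_ok", "button_cancel", "menu_file"]

    layers = {
        1: {"button_ok": "OK BUTTON", "button_cancel": "CANCEL BUTTON",
            "menu_file": "FILE MENU"},                                    # complete names
        2: {"button_ok": "OK", "button_cancel": "CANC",
            "menu_file": "FILE"},                                         # abbreviations
        3: {"button_ok": "type=button state=enabled",
            "button_cancel": "type=button state=enabled",
            "menu_file": "type=menu items=12"},                           # types and attributes
    }

    def read_layer(layer_number):
        # Read the information of one monochrome layer for all current objects.
        return " ".join(layers[layer_number][obj] for obj in objects)

    print(read_layer(2))   # default reading: the abbreviation layer
    print(read_layer(3))   # the same objects seen through the attribute layer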
THE PRACTICAL BASIS OF THIS PHILOSOPHY. Every object in the GUI is represented by one name-word that gives the object's meaning, or an abbreviation of that meaning.
We can associate with each word spoken by the speech synthesizer a set of standard functions. These functions correspond to actions that can be performed in the GUI and to actions specific to this new philosophy. The functions are activated with the help of an ergonomic keypad.
Starting from the careful analysis and practical experiments reported in Bibliography item 2, we concluded that we can use the same principles of navigation with the speech synthesizer as with the braille display.
Navigation can be done with only 13 key functions grouped into 3 categories: functions that act in the GUI, navigation functions, and functions that make the information explicit; a sketch of the corresponding keypad dispatch follows the list of key functions below.
1. Key functions that act in the GUI
- KcL: key simulating the Left click of the mouse
- KcR: key simulating the Right click of the mouse
- KiO: key function to implode the corresponding elementary Objects back into their composed Object
2. Key functions of navigation. We navigate within Vertical or Horizontal lists or groups of objects. At any moment we can order reading to continue automatically in the direction dictated by the last action. At any moment we can stop the automatic reading and mark the last word read, which becomes the new navigation reference.
We have the following navigation key functions:
- KmVd: key function for moving to the next object Vertically downwards
- KmVu: key function for moving to the next object Vertically upwards
- KbVd: key function for jumping to the beginning of the Vertical object list; all the other objects are below
- KmHr: key function for moving to the next object Horizontally to the right
- KmHl: key function for moving to the next object Horizontally to the left
- KbHr: key function for jumping to the beginning of the Horizontal object list; all the other objects are to the right
- KmAx: key function for moving Automatically in the x direction (down or right)
- KSaM: key function to Stop automatic reading and Mark the last object read.
3. Key functions to make the information explicit
- KzW: key function for zoom Word; reads the full name of the object
- KzA: key function for zoom Attribute; reads the attributes of the object. With the help of KmVd, KmVu and KbVd, we can read through the list of attributes of this object.
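As a sketch of how these 13 key functions could be wired to a navigation state (the Navigator class and its handler names are hypothetical and are not the patented DRACULAvox implementation), a simple dispatch table in Python might look like this:

    # Sketch (hypothetical Navigator class): a dispatch table mapping the 13
    # key functions to handlers. The marked word is the navigation reference
    # and KmAx continues reading in the direction of the last movement.

    class Navigator:
        def __init__(self, grid):
            self.grid = grid             # 2-D list of word-objects (rows x columns)
            self.row, self.col = 0, 0    # current (marked) position
            self.direction = "right"     # direction dictated by the last action

        def current(self):
            return self.grid[self.row][self.col]

        def move_down(self):             # KmVd
            self.row = min(self.row + 1, len(self.grid) - 1)
            self.direction = "down"
        def move_up(self):               # KmVu
            self.row = max(self.row - 1, 0)
        def move_right(self):            # KmHr
            self.col = min(self.col + 1, len(self.grid[self.row]) - 1)
            self.direction = "right"
        def move_left(self):             # KmHl
            self.col = max(self.col - 1, 0)
        def begin_vertical(self):        # KbVd
            self.row = 0
        def begin_horizontal(self):      # KbHr
            self.col = 0
        def move_auto(self):             # KmAx: continue in the x direction (down or right)
            self.move_down() if self.direction == "down" else self.move_right()
        def stop_and_mark(self):         # KSaM: the current word becomes the new reference
            return self.current()
        def zoom_word(self):             # KzW: full name of the object
            return self.current()["name"]
        def zoom_attribute(self):        # KzA: attributes of the object
            return self.current()["attributes"]
        def click_left(self):            # KcL: would be forwarded to the GUI
            pass
        def click_right(self):           # KcR: would be forwarded to the GUI
            pass
        def implode(self):               # KiO: would regroup elementary objects
            pass

    KEYS = {
        "KcL": Navigator.click_left,  "KcR": Navigator.click_right, "KiO": Navigator.implode,
        "KmVd": Navigator.move_down,  "KmVu": Navigator.move_up,    "KbVd": Navigator.begin_vertical,
        "KmHr": Navigator.move_right, "KmHl": Navigator.move_left,  "KbHr": Navigator.begin_horizontal,
        "KmAx": Navigator.move_auto,  "KSaM": Navigator.stop_and_mark,
        "KzW": Navigator.zoom_word,   "KzA": Navigator.zoom_attribute,
    }

    nav = Navigator([[{"name": "FILE MENU", "attributes": "type=menu"},
                      {"name": "EDIT MENU", "attributes": "type=menu"}]])
    KEYS["KmHr"](nav)           # move to the next object to the right
    print(KEYS["KzW"](nav))     # speaks the full name: "EDIT MENU"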
In order to accelerate navigation in the GUI tree, several of the GUI objects that we call elementary objects can be grouped into a composed object. The elementary objects are grouped on the basis of their similarity, their relationship, their proximity, or their co-operation in accomplishing a particular action. Composed objects can in turn be grouped into another composed object.
By exploding a composed object into its set of elementary objects with the left-click function (KcL), and by imploding a group of elementary objects back into a composed object with the imploding function (KiO), we achieve the goal of highly efficient navigation of the GUI.
In this new philosophy, the information is presented as a list of WORDs that constitute a family of objects, preceded by a WORD indicating the beginning of the list. The leading WORD represents the parent, while the listed objects represent the children.
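To make the explode/implode mechanism concrete, here is a short Python sketch with a hypothetical ComposedObject class; exploding (KcL) exposes the children on the parent's line, while imploding (KiO) collapses them back into the single parent WORD:

    # Sketch (hypothetical names): a composed object that can be exploded
    # into its elementary children (KcL) or imploded back into one word (KiO).

    class ComposedObject:
        def __init__(self, abbreviation, children):
            self.abbreviation = abbreviation
            self.children = children      # abbreviations of the elementary children
            self.exploded = False

        def explode(self):                # KcL applied to a composed object
            self.exploded = True

        def implode(self):                # KiO applied to the group of children
            self.exploded = False

        def line(self):
            # The parent WORD alone when imploded, or the parent WORD
            # followed by all its children when exploded.
            if self.exploded:
                return " ".join([self.abbreviation] + list(self.children))
            return self.abbreviation

    toolbar = ComposedObject("TOOLBAR", ["NEW", "OPEN", "SAVE", "PRINT"])
    print(toolbar.line())   # imploded: only "TOOLBAR" is read
    toolbar.explode()
    print(toolbar.line())   # exploded: "TOOLBAR NEW OPEN SAVE PRINT"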
In conclusion, the new DRACULAvox philosophy, which uses an ergonomic keypad to control the WORDs read by the speech synthesizer to represent the GUI objects, accelerates searching through elementary and composed objects. This unified user interface gives the DRACULAvox user the ability to work flexibly in the Windows world and enables the user to work as efficiently as a sighted user.
DRACULAvox: THE SOFTWARE
On the basis of the new philosophy presented above, whose principle has been patented, the engineers of the French company euroBRAILLE, located in Romania and France, have developed the DRACULAvox application. This software will run on Microsoft Windows 95, 98 and NT.
The DRACULAvox software is being developed using Microsoft Active Accessibility and the Microsoft Access Model.
We are starting the usability demonstration tests in France. Currently, DRACULAvox is supported on the DRACULAkey device of the euroBRAILLE company.
We would like to thank the MICROSOFT managers, especially Jeff WITT, Charles OPPERMANN and Luanne LALONDE, for all their kindness and technical support.
BIBLIOGRAPHY:
1. Patent application no. 97.04873, dated 04/21/97: PROCEDE ET DISPOSITIF DE NAVIGATION POUR LES AVEUGLES DANS UNE INTERFACE GRAPHIQUE D'ORDINATEUR (Method and device for navigation by the blind in a graphical computer interface).
2. Conference paper: DRACULAwin, A NEW PHILOSOPHY OF NAVIGATION IN THE GUIS USING BRAILLE EQUIPMENT, presented at CLOSING THE GAP, October 1997, Minneapolis, Minnesota.