
JESTER - A HEAD GESTURE RECOGNITION SYSTEM FOR WINDOWS 95

Costi Perricos
Department of Engineering, University of Cambridge, UK

ABSTRACT

Gestural interfaces to computers are gaining prominence in human-computer interaction research owing to their intuitiveness and their ability to be tailored to a particular user's capabilities. Gestures form a natural part of communication, and they carry valuable information that traditional human-computer interfaces neglect. Head gestures can provide a particularly attractive method of computer access for disabled individuals, as people with severe motor disabilities often retain a degree of head control sufficient for recognition purposes. This paper describes a PC-based head gesture recognition system that has been designed to be easily incorporated in future software applications. The system runs in the Windows 95 environment, and a commercial version should be available by the end of 1996.

BACKGROUND

Substantial research has been done on the computer recognition of gestures, using techniques such as hidden Markov models, neural networks, and finite state machines [1][2][3][4]. These approaches have generally been tailored to particular applications, and have not been flexible enough to accommodate changes in the user's ability to perform gestures. Research into gesture recognition at this organisation originated in 1986, when Harwin [1] identified head gestures as a possible means of computer access for severely disabled individuals. Harwin designed a number of recognition algorithms based on finite state machines and hidden Markov models, and showed that head gesturing could indeed be used to communicate with computers. Harwin's original research led to the author's current work. A head gesture recognition system was designed and built around the following design criteria [5]:

  • The system should be simple to use.
  • It should be inexpensive.
  • It should have the ability to adapt to an improvement or deterioration of the user's ability to perform head gestures.
  • Its gesture recognition accuracy should be high.
  • The system should be trainable on small amounts of data.
  • It should be able to cope with a range of gesture vocabularies.
  • It should provide a simple method of incorporation in the development of computer software.

The prototype system was known as the Head Gesture Recognition System (HGRS) [6]. HGRS ran under DOS, and used a commercially available transducer, the Polhemus 3Space Isotrak [7], to translate the user's movement and position into data understood by the computer. Instead of using recently developed recognition techniques such as hidden Markov models and neural networks, the system used a template-based technique known as Dynamic Time Warping, combined with heuristic rules, to recognise head gestures effectively. This algorithm, known as the Hybrid Recognition Algorithm, had the advantage of being fast, and could easily be trained with one template per gesture [6]. The recognition performance of HGRS was evaluated in a set of user trials performed by six subjects with severe athetosis. The gesture vocabulary used by the subjects consisted of eight gestures (up, down, left, right, yes, no, and two custom gestures), and the recognition rate of the system was found to be 85.5% [6].
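
To make the matching step concrete, the following is a minimal sketch of template comparison by Dynamic Time Warping in C++. The sample format, the Euclidean local cost, and all identifiers are illustrative assumptions; the published Hybrid Recognition Algorithm also applies heuristic rules that are not shown here.

```cpp
#include <algorithm>
#include <cmath>
#include <limits>
#include <vector>

// One head-position reading; three translational axes are assumed here,
// although a transducer may report up to six.
struct Sample { double x, y, z; };

// Euclidean distance between two readings (the local cost).
double localDist(const Sample& a, const Sample& b) {
    double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// DTW distance between an input gesture and a stored template. The
// template whose distance is smallest (subject to the heuristic rules,
// not shown) would be reported as the recognition result.
double dtwDistance(const std::vector<Sample>& input,
                   const std::vector<Sample>& templ) {
    const std::size_t n = input.size(), m = templ.size();
    const double INF = std::numeric_limits<double>::infinity();
    std::vector<std::vector<double>> d(n + 1, std::vector<double>(m + 1, INF));
    d[0][0] = 0.0;
    for (std::size_t i = 1; i <= n; ++i)
        for (std::size_t j = 1; j <= m; ++j)
            d[i][j] = localDist(input[i - 1], templ[j - 1]) +
                      std::min({d[i - 1][j],        // input stretched
                                d[i][j - 1],        // template stretched
                                d[i - 1][j - 1]});  // one-to-one match
    return d[n][m];
}
```

Because the comparison warps the time axis, a gesture performed slowly or with tremor can still match a template recorded at a different speed, which is why a single template per gesture suffices.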

THE JESTER SYSTEM

The promising recognition results from the DOS-based prototype led to the porting of the recognition algorithm to the Windows 95 platform, under the name of Jester. The enhanced display features of a graphical user interface platform enabled the recognition system to acquire a friendlier and more intuitive user interface. In addition, the ability of Windows to link pre-compiled code to applications at run time (through Dynamic Link Libraries, or DLLs) made Windows a suitable environment for developing applications that could use Jester.

Jester consists of a Dynamic Link Library, which can be accessed by an application and deals with all aspects of the gesture recognition process. The structure of Jester is described in figure 1.

Figure 1: The Structure of Jester

The recognition system consists of two basic components: the Jester Core and the Jester Transducer Driver. The core handles all tasks in the recognition process, from segmenting a gesture to providing the application with a recognition result. The transducer driver deals with communication with the transducer, and with the translation of the raw transducer data into a format understood by the Jester Core. These two components are kept separate because a number of different transducers are expected to be used with Jester. The type of transducer used will depend on the kind of gestures being recognised, the movement ability of a particular user, and the finances available to the user. Currently, the only restrictions on a transducer are that it must have at most six movement axes and a minimum sampling rate of 30 samples/second.
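
As a rough illustration of this separation, the sketch below shows the shape the driver-to-Core data contract might take, based only on the constraints stated above. All names, signatures, and stub bodies are assumptions for illustration; the actual interface is defined by the Jester DLL.

```cpp
#include <cstring>

// Constraints stated above: at most six movement axes, at least 30 Hz.
const int kMaxAxes = 6;
const int kMinSampleRateHz = 30;

// One raw reading handed from a driver to the Jester Core.
struct TransducerSample {
    double axes[kMaxAxes]; // e.g. x, y, z, azimuth, elevation, roll
    int    axesUsed;       // how many axes this device actually reports
};

// Entry points a driver DLL might export; the stub bodies stand in for
// real device I/O (a genuine driver would talk to the hardware here).
extern "C" {
    const char* IdentifyTransducer() { return "Hypothetical 2-axis joystick"; }
    int  GetSampleRateHz()           { return 50; }   // must be >= 30
    bool InitialiseTransducer()      { return true; } // open port, configure
    bool GetSample(TransducerSample* out) {
        std::memset(out, 0, sizeof(*out));
        out->axesUsed = 2; // a joystick reports two movement axes
        return true;
    }
}
```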

As can be seen from figure 1, the interactions between the various layers of the Jester structure are simple and well defined.

The communication between the application and the Jester Core is achieved through the Jester Application Programming Interface (JAPI). This is a collection of C++ functions, such as 'Initialise System' and 'Recognise Gesture', which enable software developers to access the Jester system without requiring any knowledge of its complex internal structure. The communication between the transducer driver and the Jester Core is also well defined. In order for a driver to be Jester compatible, it must provide a number of functions that Jester can access. These functions perform tasks such as identifying the transducer, initialising it, and retrieving position data. The transducer driver is itself a DLL, and can therefore be selected from within Jester at run time.
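
The fragment below sketches how an application might drive Jester through such an interface. The paper names functions such as 'Initialise System' and 'Recognise Gesture', but their exact C++ signatures are not published, so the names, parameters, and stub bodies here are illustrative assumptions standing in for the real DLL entry points.

```cpp
#include <iostream>
#include <string>

// Stub versions of hypothetical JAPI entry points; in a real application
// these would resolve to functions exported by the Jester DLL.
namespace japi {
    bool InitialiseSystem(const std::string& userId,
                          const std::string& transducerDll) {
        return true; // real call would load the user's templates and driver
    }
    std::string RecogniseGesture() { return "yes"; } // real call blocks on input
    void ShutdownSystem() {}
}

int main() {
    // Identify the user and select a transducer driver at run time.
    if (!japi::InitialiseSystem("user01", "polhemus.dll")) return 1;

    // Map each recognition result directly onto an application task.
    std::string gesture = japi::RecogniseGesture();
    if (gesture == "yes")      std::cout << "confirm current action\n";
    else if (gesture == "no")  std::cout << "cancel current action\n";
    else if (gesture == "up")  std::cout << "move selection up\n";

    japi::ShutdownSystem();
    return 0;
}
```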

Jester needs to be trained to recognise a user's gesturing characteristics before it can be used in an application. To simplify the training task, an application called Trainer has been developed. Trainer enables a user or carer to adjust a number of user-dependent settings that optimise the system's performance, and to record the gesture templates that the user has chosen for his or her vocabulary. Once trained, Jester can be used by that user in any application that accesses the Jester DLL. During initialisation, the user is requested to enter his or her user ID, which allows the system to be customised to different users' templates and characteristics. In addition, the user is able to select the type of transducer that will be used in that session. As shown in figure 1, Jester has its own interface, in the form of a control panel. The control panel, shown in figure 2, can be used to monitor the user's gesture performance, and also provides an ongoing insight into the state of the recognition algorithm. A sketch of the per-user data that Trainer records is given below.
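
As a rough picture of that per-user state, the sketch below groups one template trace per gesture with the user-dependent settings under a single user ID. The data layout and field names are assumptions for illustration only.

```cpp
#include <map>
#include <string>
#include <vector>

struct Sample { double x, y, z; }; // one movement reading, as before

// Everything Trainer might record for one user: the chosen vocabulary
// with one template trace per gesture, plus user-dependent tuning
// settings that Jester loads when the user enters his or her ID.
struct UserProfile {
    std::string userId; // entered by the user at initialisation
    std::map<std::string, std::vector<Sample>> templates; // gesture -> trace
    double segmentationThreshold; // example of a user-dependent setting
};
```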

Figure 2: The Jester Control Panel

An additional function of the control panel is that it enables the user to adjust some of the recogniser's parameters during the course of an application session. This is useful for users whose movement characteristics, such as tremor, may change frequently enough to affect the recogniser's performance within a session.

JESTERMOUSE - A MOUSE EMULATOR

Jester is still in the development stage, and there are currently no commercially available applications that support it. To provide gestural access to existing Windows applications, a gesture-based mouse emulator called JesterMouse was developed. With JesterMouse, a user can access existing applications that were not designed for use with a head gesture-based input device. This provides indirect access to applications, and is limited by the functionality of a mouse pointer. To exploit gestural access fully, however, the inputs provided by gestures should be linked directly to application tasks. It is expected that a number of Jester compatible applications, as well as Jester-based access programs such as keyboard emulators, will be developed in the future.
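
The sketch below illustrates the kind of gesture-to-pointer mapping an emulator like JesterMouse performs, using the Win32 mouse_event call. The dispatch logic, step size, and the use of the 'yes' gesture as a click are illustrative assumptions, not JesterMouse's actual code; the gesture names are taken from the vocabulary described earlier.

```cpp
#include <windows.h>
#include <string>

const int kStep = 10; // pointer movement per gesture, in pixels (assumed)

// Translate one recognised head gesture into a Win32 mouse action.
void dispatchGesture(const std::string& gesture) {
    if (gesture == "up")         mouse_event(MOUSEEVENTF_MOVE, 0, -kStep, 0, 0);
    else if (gesture == "down")  mouse_event(MOUSEEVENTF_MOVE, 0,  kStep, 0, 0);
    else if (gesture == "left")  mouse_event(MOUSEEVENTF_MOVE, -kStep, 0, 0, 0);
    else if (gesture == "right") mouse_event(MOUSEEVENTF_MOVE,  kStep, 0, 0, 0);
    else if (gesture == "yes") { // e.g. a nod performs a left click
        mouse_event(MOUSEEVENTF_LEFTDOWN, 0, 0, 0, 0);
        mouse_event(MOUSEEVENTF_LEFTUP,   0, 0, 0, 0);
    }
}
```

Because the emulator sits at the level of mouse events, it works with any unmodified Windows application, which is what makes it useful while no Jester compatible applications exist.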

FURTHER WORK

A set of user trials is currently being performed with a beta version of the Jester system. Eight subjects with a range of disabilities are taking part. The aim of these trials is to investigate whether this type of input can be effective in executing computerised tasks. The tasks performed therefore involve accessing existing Windows applications with the help of Jester. In addition, feedback on the design of the system from the users will be used to implement final changes to Jester before its commercial release.

There are two transducer drivers currently available for Jester. The first is a Polhemus driver, which is being used in the current user trials. The second is a joystick driver, which enables gestural input from a PC-compatible analogue joystick. When used with JesterMouse, this driver provides joystick access to the Windows interface. As part of further research, the Jester system will be tested with a range of transducers, in order to examine how well it can cope with movement in a number of different axes. In addition, trials with gestures from other sites of the body, such as the hands and arms, will be performed.

ACKNOWLEDGEMENTS

This work is being supported by Action Research under grant A/P/0525. Additional trial facilities are being provided by the Papworth Group, UK. The author acknowledges the substantial contribution of the late Robin Jackson to this work.

REFERENCES

[1] W Harwin (1991), Computer recognition of the unconstrained and intentional head gestures of physically disabled people, PhD thesis, University of Cambridge.

[2] J Treviranus (1992), Quartering, Halving, Gesturing: Computer Access Using Imprecise Pointing, RESNA 92 Conference Proceedings, pp 374-376.

[3] G Hamman (1990), Two switchless selection techniques for using a head pointing device for graphical user interfaces, RESNA Conference Proceedings, pp 439-440.

[4] AY Cairns (1993), Towards the Automatic Recognition of Gesture, PhD thesis, University of Dundee.

[5] C Perricos, RD Jackson (1994), A Head Gesture Recognition System for Computer Access, Proceedings of the RESNA 94 Conference, pp 92-94.

[6] C Perricos (1995), The Use of Head Gestures as a Means of Human-Computer Communication in Rehabilitation Applications, PhD thesis, University of Cambridge.

[7] Polhemus Navigation Sciences Division (1987), 3Space Users Manual, McDonnell Douglas Electronics Company.

Dr Costi Perricos
Department of Engineering
University of Cambridge
Trumpington Street
Cambridge CB2 1PZ
United Kingdom
email: cp@eng.cam.ac.uk