
Web Posted on: August 24, 1998


THE VOICE PROJECT
Giving a VOICE to the deaf
by developing awareness of VOICE to text recognition capabilities
(Telematics/TIDE Accompanying Measure)


Giuliano Pirelli
European Commission Joint Research Centre
giuliano.pirelli@jrc.it
voice@jrc.it
http://ntsta.jrc.it/voice


1. Summary

The difficulties faced by deaf people go beyond the loss of hearing itself and point to a more general problem of lack of communication. Automatic recognition of speech in conversations, conferences and telephone calls, with its conversion into text on a PC screen, could be a powerful help in reducing the gap between the deaf and the hearing world. This paper presents an overview of the VOICE Project, an EC Telematics Programme Accompanying Measure. The Project proposes to promote new technologies in the field of voice to text recognition and to unite, by means of an Internet VOICE Forum, the Associations, producers and organisations interested in this research.




2. The Joint Research Centre of the European Commission

The Joint Research Centre (JRC) is the EC's own research centre. It was created to share the large investments needed to carry out research on nuclear energy. Over time its tasks have extended into other areas in which a common approach at EU level is necessary. JRC provides neutral and independent advice in support of the formulation and implementation of EU policies. In addition, it offers unique training services and organises workshops in advanced sectors of science.

2.1. JRC's Institute for Systems, Informatics and Safety (ISIS)

The Institute for Systems, Informatics and Safety (ISIS), based at Ispra in Italy, is JRC's impartial centre of expertise in the multi-disciplinary analysis of industrial, socio-technical and environmental systems, the innovative application of information and communication technology, and the science and technology of safety management. The activity areas of ISIS's Unit for Software Technologies and Automation (STA) include dependable software applications (safety critical computing systems, requirements engineering), multimedia network applications (animation in medical imaging, multimedia techniques in training and education), sensor based applications (surveillance techniques, 3D reconstruction of real environments, learning approaches for control), robotics and remote handling.

2.2. JRC-ISIS's Exploratory Research Programme

In 1996, JRC-ISIS's role in the areas listed above was oriented towards the provision of scientific and technical support to EU services and initiatives. Moreover, a levy of 6% of the institutional budget was used to finance Exploratory Research. In 1996 the scientific staff of ISIS submitted a total of 65 proposals. The ISIS Scientific Committee judged the proposals on originality, appropriateness, soundness and cost and produced a shortlist of 16 proposals, 12 of which were then funded. In particular, two projects carried out by the STA Unit concern the interface between life sciences and information technology, with the aim of helping the disabled and the elderly:
Information technology aids for people with special needs - Voice to text conversion for the deaf;
Brain-actuated control - using EEG pattern recognition to help the disabled.




3. Applications of voice to text recognition for the deaf

Although voice to text recognition packages are marketed primarily as a means of allowing people in business to create documents without using the keyboard, they hold great advantages for the hearing impaired, the blind, the physically handicapped and the elderly, as well as for people without special needs. These systems are reaching a very good level of development and are becoming widely available for PCs. Software that until now could only recognise words separated by short pauses is being replaced by new releases that offer very significant improvements and recognise continuously dictated text. The growth of this market, together with greater affordability and user-friendliness, encourages the search for ways of adapting such software for use by disabled people.


3.1. User needs

The difficulties faced by deaf people go beyond the loss of hearing itself and point to a more general problem of lack of communication. One of the main forms of modern communication, the telephone, is of little or no use to this community for oral communication (although it is useful for sending faxes). Other modern means of communication, although not completely useless, generate frustration by providing only part of the information in a form accessible to them. An example is television which, when not subtitled, supplies very limited information.

In some European countries, it is commonly thought that hearing impaired people would have difficulty learning to lip-read and speak and should therefore use sign language and attend special schools. Other countries take a different approach. In Italy the law encourages the integration of deaf children into mainstream schools, with a remedial teacher and without the use of sign language. Some Associations of the deaf, like ALFA in Milan, are getting very good results in helping children along this path, from those joining primary school right through to those finishing university and finding a job afterwards. Although good results are achievable, they demand an enormous effort, which could be greatly reduced through the use of new technologies.


3.2. State of the art

A widely used application is the Teletext subtitling of television transmissions, a very powerful help for deaf people. Its educational importance lies in the fact that subtitles are, for a deaf child, one of the most powerful learning tools, just as a hearing child learns from what it hears. Similarly, subtitling gives hearing impaired adults the opportunity to enrich their vocabulary. Since subtitling relies on the manual preparation of files to be transmitted via Teletext, most subtitled transmissions are films. Subtitling of live programmes and of the news is rarely performed.

Subtitling of conferences, even those addressed to the deaf, is usually not available. Sign language interpreters provide significant help for deaf people who use sign language, but other deaf participants, as well as the partially hearing impaired, the elderly and foreigners, are unable to understand it. Moreover, the interpretation is lost after the conference and is of no use for producing proceedings or abstracts.

In telephone communication, text telephones have already proved vital from a deaf person's point of view. These systems do, however, present one major problem: everyone wishing to contact a deaf person on such a machine must possess one too. This makes such a means of communication awkward and expensive, both for the deaf and for those they wish to call.

3.3. Autonomy and quality of life

As more conferences, meetings and discussions become subtitled, participation by deaf people will grow. With increased subtitling capabilities, television will become a more useful source of information. The use of subtitles in telephone calls, which are part of everyday communication in society, will greatly increase the interaction of the deaf community with others. This participation will increase the influence that their decisions have on their surroundings, which will in turn improve their standard of living. Easier access to schools and universities will allow a more satisfying life, a better choice of work matching personal capabilities and, more broadly, greater economic productivity for society.


3.4. Market situation and prospects

It is worth remembering that the hearing impaired make up between 1% and 5% of the population (depending on the degree of hearing loss considered), which represents millions of people in Europe. This market grows when it also takes into account those who are losing their hearing, those with hearing problems and even people with normal hearing who cannot hear because of noise in their environment. Moreover, a lack of communication similar to that experienced by the deaf also affects the disadvantaged, people living in foreign environments and the elderly. Taken together, these groups make up more than 30% of the total population.

The new products seem well suited to the needs of the deaf. The modifications needed for some tests are limited in extent, but deaf people rarely have the technical awareness and the social influence needed to steer such activities. Nevertheless, this could be an opportunity of great interest for the producers of speech recognition systems, since a deaf user could accept the present limited accuracy of recognition as a complement to lip-reading skills. Even the more limited accuracy of recognition over a telephone line is an interesting starting point for the deaf. The Associations of the deaf are considered both the most interested and the most critical user group for all the possible applications in this area, and thus the most motivated to test a system that will be improved for all users, including in related fields such as video-telephones or on-line television subtitling in several languages.




4. European Dimension

Producers of hardware, software and services hesitate to invest more, since the user needs are not translated into technical specifications and are sometimes not even known. On the other hand, the Associations of the disabled have a limited overview of possible new technical solutions and rarely have the opportunity to participate in the feasibility studies of new projects. What is lacking is a better definition, from a technical point of view, of the needs of the disabled, in order to enhance collaborative work between technicians and non-technicians. Broader co-operation and a European dimension are of great importance, since they bring economies of scale to the study and development of technical aids and ensure a large impact of the results. The multilingual aspects should also be considered at a European level, since most of the Associations concerned operate only at a national level.


4.1. The VOICE Project's first steps

Since the beginning of 1996, JRC-ISIS has undertaken a number of the tasks described here. The first step was the setting up of a VOICE Laboratory provided with the necessary software, hardware and network capabilities. Contacts with producers of speech recognition systems, research centres, telecommunications firms and television broadcasters provided a coherent overview of the state of the art. Regular contacts with the Associations of the deaf gave the opportunity to analyse the special needs that result from difficulties in hearing and in speech, in many aspects of everyday life.

With a view to facilitating contacts and establishing a common goal, JRC-ISIS gave some Associations the opportunity to create a VOICE Forum on the Internet, by allocating space for them on a Web server and providing technical assistance. Since then, the Associations have shown great interest in participating in the Project. The VOICE Forum is becoming a known Internet site, and several Associations of the hearing impaired are adding information to it or expressing their interest in testing the demonstrator and participating in the planned meetings and workshops.


4.2. The VOICE Project - a Telematics Applications' Accompanying Measure

We felt that the activities started at JRC-ISIS with the collaboration of its Italian Partners could gain particular momentum if the tests and the dissemination of the results were organised in several countries. We therefore enlarged our group, first inviting the Educational Endeavour Computer Science for the Blind of the Institute for Computer Science of the Johannes Kepler University in Linz and the Institute for Auditory and Visual Training (IHSB) in Linz to join us.

We prepared a proposal for an Accompanying Measure, which we submitted to the Telematics Applications Programme Call in April 1997. The proposal, "VOICE - Giving a VOICE to the deaf, by developing awareness of VOICE to text recognition capabilities", has been selected and we are at present (March 1998) in the last phase of negotiation before starting the Project.

The Project proposes to continue the activities in this field, extending the contacts to a broader European dimension and spreading awareness of the capabilities of voice to text recognition systems. As an Accompanying Measure, the Project will play a technical and social role in collecting information and presenting it in a coherent way to the producers of speech recognition systems and to researchers. The aim is to disseminate information on how producers may help users with disabilities through limited improvements to their standard products, and on how users with special needs may collect useful information and translate it into technical specifications.

JRC-ISIS is acting as scientific and technical co-ordinator of the Project and is developing several specific aspects of the research. The FBL software house, experienced in applications of speech recognition for the disabled, is developing additional software and integrating it into the demonstrator, in order to turn off-the-shelf voice to text recognition packages into user-friendly programs modelled on the requirements of the users. Each step of the activity is discussed and checked with the ALFA and CECOEV Associations of the deaf in Milan. Kepler University examines the Italian results, verifies their validity in Austria and helps IHSB in the Austrian validation phase.





5. Objectives and strategic approach

The main objectives of the VOICE Project are: to investigate voice to text recognition for the automatic subtitling of conferences, school lessons, television transmissions and telephone conversations; to spread the use of general purpose voice to text recognition systems and to improve the prototypes developed until now; to demonstrate the prototypes to relevant organisations and at international conferences; to use a VOICE Forum on the Internet as a Project tool for collecting and spreading information on technical aids for the deaf.


5.1. Technical aspects of the demonstrator

One of the objectives is to set up a cluster of demonstrator applications related to voice to text recognition, on the basis of a multimedia laboratory prototype. The system could be used for subtitling conferences and live television transmissions. This operational capability involves manipulating both the functions available in the commercial dictation packages and the generated text (converting strings of text into groups of subtitles, positioning them against blank screens, displaying them with video signals and providing various other options). At the same time, the system will help any user to produce a first draft of conference proceedings.
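
The step of converting strings of text into groups of subtitles can be illustrated with a minimal sketch. It is not the Project's demonstrator code: the function name group_subtitles, the line width of 36 characters and the two lines per subtitle are all assumptions chosen for illustration, and the sketch only wraps recognised text into short lines and groups them into successive subtitles.

```python
import textwrap
from typing import List


def group_subtitles(text: str,
                    line_width: int = 36,          # assumed width, not from the source
                    lines_per_subtitle: int = 2    # assumed grouping, not from the source
                    ) -> List[List[str]]:
    """Wrap recognised text into lines, then group the lines into subtitles."""
    # Break the running text at word boundaries so no line exceeds line_width.
    lines = textwrap.wrap(text, width=line_width)
    # Collect consecutive lines into subtitles of lines_per_subtitle lines each.
    return [lines[i:i + lines_per_subtitle]
            for i in range(0, len(lines), lines_per_subtitle)]


if __name__ == "__main__":
    dictated = ("The demonstrator converts dictated speech into text "
                "and shows it to the audience as subtitles.")
    for number, subtitle in enumerate(group_subtitles(dictated), start=1):
        print(f"--- subtitle {number} ---")
        print("\n".join(subtitle))
```

A real subtitling system would also need to decide when each subtitle appears and disappears, which depends on the speaker's pace and is left out here.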

A prototype has been developed for generating subtitles for live television transmissions and broadcasting them via Teletext. A different approach is also being considered for specific television transmissions or radio broadcasts, in order to make the generated subtitles available through the Internet. The subtitles do not necessarily have to be created by the broadcasting companies themselves. Independent members of the public with the right equipment and programs could listen to the radio or television and summarise what is being said into a microphone, and the resulting subtitles would be distributed world-wide over the Internet.

For the use of voice to text recognition with telephones, the basic principle is that a person speaks down the phone line, the speech is passed into a PC at the deaf person's end and the words (via some form of voice to text recognition) are displayed on his or her screen. In this situation only the deaf person needs the appropriate equipment. The application will also include a text to speech system to allow the deaf person to reply (should he or she have difficulties in speaking), which may also be useful in giving the person who is speaking feedback on whether what was said has been recognised correctly.
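
The flow of information in this telephone scenario can be sketched as follows. The sketch is purely illustrative: the class TelephoneRelay and its methods are hypothetical names, and the recognition and speech synthesis engines are passed in as plain functions because the source does not name the packages used. The stand-in engines in the example simply convert between bytes and text.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TelephoneRelay:
    """Hypothetical relay running on the PC at the deaf person's end."""
    recognise: Callable[[bytes], str]    # assumed voice to text engine
    synthesise: Callable[[str], bytes]   # assumed text to speech engine
    screen: List[str] = field(default_factory=list)

    def on_incoming_audio(self, audio: bytes, echo_back: bool = False) -> bytes:
        """Show the caller's words on screen; optionally speak them back."""
        text = self.recognise(audio)
        self.screen.append(text)
        # Echoing lets the caller check what was actually recognised.
        return self.synthesise(text) if echo_back else b""

    def reply(self, typed_text: str) -> bytes:
        """Turn the deaf person's typed reply into audio for the caller."""
        return self.synthesise(typed_text)


if __name__ == "__main__":
    # Stand-in engines for demonstration only: audio is treated as ASCII text.
    relay = TelephoneRelay(recognise=lambda audio: audio.decode("ascii"),
                           synthesise=lambda text: text.encode("ascii"))
    relay.on_incoming_audio(b"are you free tomorrow")
    print(relay.screen)
    print(relay.reply("yes, after three"))
```

Keeping both engines behind simple function interfaces reflects the point that only the deaf person's PC needs special equipment; the caller uses an ordinary telephone.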


5.2. Design for all

The VOICE Project, in line with JRC-ISIS's background and TIDE policy, aims to develop prototype applications using, as far as possible, hardware and software commonly available on the market. This reduces development costs and times as well as the future maintenance effort for the products. Moreover, it helps to improve the quality of products for the general market, for any user, by removing the new barriers that new information technology tools often create.

The cost of some commercial voice to text recognition packages is about 100 ECU (10% of their original price). The approximate cost of the basic platform for the application (a 200 MHz Pentium MMX PC with 64 Mbyte of RAM, a CD-ROM drive and a Sound Blaster 16 sound card) is 1500 ECU. This buys a fully operational PC that can be used in many other useful ways. A further great advantage is that the system does not depend on any particular company or software release.


5.3. Conferences, school lessons

All the phases of the Project will be developed with the continuous and close participation of the users. Several European conferences and workshops will be organised to help them discuss their needs with industry and service providers: ICCHP-98, Vienna and Budapest, August 98; HANDImatica-98, Bologna, November 98; Linz, first semester 99; JRC-Ispra, second semester 99. The demonstrator will be presented and used to generate prototype live subtitling for the deaf people taking part in the conferences. The meetings will not only concern the technical aspects, but will also try to bring the manufacturers and producers closer to the users' needs.

The prototype system has been presented to some schools, where it will be tested in real situations of use for subtitling school lessons for the benefit of deaf students. It will also display the dialogue spoken during foreign language lessons, for the benefit of hearing students, or during lessons in the host country's language, for the benefit of any user and particularly of immigrants. Some tests are also planned for subtitling university lessons and printing summaries.


5.4. VOICE Forum

One of the Project's aims is to stimulate and increase the use of new, widely available technologies, namely the Internet. The objective is to unite, by means of an Internet VOICE Forum, Associations, companies, universities, schools, public administrations and anyone else interested in voice recognition who could benefit from such research. The Forum will become an intermediary between the different groups concerned and will help in collecting information on user needs and on the validation of the prototype demonstrator, as well as in disseminating the results.

At present JRC hosts and maintains the sites of the AFA, ALFA, CECOEV and ENS Associations of the deaf, with information including: statutes, contact numbers and addresses, meetings, electronic copies of a selection of their newspapers, a study of the hours and accuracy of subtitling by the television broadcasters, and a list of their archive of subtitled videocassettes. The current site provides a very strong foundation on which the awareness-raising side of the VOICE Project can be built. This is an important part of the Project itself, since it demonstrates, to all those involved, the effectiveness of this means of communication for the deaf community.


5.5. VOICE Special Interest User Group

The linguistic aspect has been addressed by choosing software packages already available in several European languages. Since most new IT packages are produced in English, JRC-ISIS is testing them in English and the users are testing them in Italian and German, so as to cover different linguistic approaches. The acquired know-how will be made available for applications in other languages. Some contacts have already been established with the University of York and the NDCS Association in the UK, the French ANPEDA and the Belgian APEDAF and TELECONTAC, which have shown interest in following the Project. As a complement to the VOICE Forum, a VOICE Special Interest User Group is being created and will hold its first meeting during the ICCHP-98 Conference in Vienna. It will provide the Project with a larger audience and will participate in the peer review of the deliverables for which this is appropriate.


