
Towards Automatic Generation of Tactile Graphics

Thomas Way and Kenneth Barner
Applied Science and Engineering Laboratories, University of Delaware / Alfred I. DuPont Institute

Abstract

Abundant high-quality computer images are available on the Internet and elsewhere, yet many are virtually inaccessible to the blind computer user. This paper introduces research in the development of a system for automatic conversion of any digitized image into a comprehensible tactile graphic form. An experimental software and hardware system is presented, and preliminary test results are discussed.

Background

The explosive growth of reliance upon the graphical user interface (GUI) has had a significant impact on computer users, nowhere more negatively than among blind users. Some barriers can be overcome through the use of new commercial GUI-friendly screen review and speech synthesis software and hardware, in combination with standbys such as embossed braille printers and braille cell displays. Experimental approaches include multimodal interfaces that use audio and/or tactile output (5, 8), as well as a variety of dynamic tactile display technologies now under vigorous development, though it may be some time before these directly benefit users (1). Haptic perception of tactile graphics improves when issues such as the resolution of the fingertip, image size, exploration mode, image complexity, and the dependence upon simultaneous kinesthetic and cutaneous stimulation are accounted for, as demonstrated in (2, 4, 9). These studies, and others like them, point out the complicated interdependency among the numerous factors involved in producing a tactually legible graphic display. This paper describes research into the development of a system which attempts to take many of these issues into account. The system uses image processing techniques to simplify a complex digitized image, such as a photograph, and then produces a tactile representation through an output medium such as microcapsule paper. The goal of this study, and of the system we are developing, is to identify techniques for producing meaningful tactile graphics from the wealth of on-line visual graphic information, and ultimately to create a usable software and hardware solution to that end.

Research Question

Translation of complex images into tactile graphics which are meaningful to a blind person clearly requires some type of simplifying transformation. There are numerous image processing algorithms that segment images, locate region edges, reduce noise, and generally extract meaningful features. Applying various combinations of these processes produces a broad range of results. Our aim is to determine, in an automatic fashion, the optimal aggregate process, which may even be image dependent. This preliminary study tests the efficacy of a small subset of image processing algorithms by measuring the ability of sightless subjects to recognize identical images from a closed set.

Method

The experimental system consists of both software and hardware. The software is implemented in the C programming language as an extension to the X-windows image processing application "XV", developed at the University of Pennsylvania, running on a SUN Sparcstation 20. Our initial extension adds new algorithms for grayscaling, negation, K-means clustering, and a Sobel edge detector (3, 7). In this application, the K-means algorithm adaptively segments an image into black and white using a statistical analysis of the luminance levels of the pixels. The Sobel edge detector responds to changes in luminance in both the X and Y directions and highlights places where the luminance changes quickly, indicating the presence of an edge in the image.

The hardware component is a Reprotronics Tactile Image Enhancer, which produces raised tactile graphics on Flexi-Paper, a cloth-based microcapsule paper also developed by Reprotronics. Additional components include a 300 dpi Hewlett-Packard IIISi laser printer and a Lanier 6725 photocopier. Note that any combination of laser printer and copier should work equally well, provided that the photocopier can handle the slightly thicker capsule paper without jamming.

We collected eight digital images found on the Internet as our test set. The images are photographs of three different faces, the chimney of a house, a notebook computer, a hot air balloon, and a space shuttle launch, plus an illustration of a human heart. These were selected to provide a variety of image types while preserving similarities among various subsets of the group. That is, the three faces, the hot air balloon, and the human heart illustration have similar (round) overall shapes, while the notebook computer and the house are characterized by straight lines. Each image was processed in five different ways: grayscaling alone; grayscaling with 2-way K-means clustering; grayscaling with Sobel edge detection; and grayscaling with both, applied in each of the two possible orders. For each processing combination, the eight resulting images were printed at the same size (2.5" x 2.5") and arranged randomly on a single blank sheet of paper.
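To make the two operations concrete, the following C fragment sketches the processing chain just described: grayscale conversion, 2-way K-means clustering of luminance values, and Sobel edge detection. It is only an illustrative sketch, not the actual XV extension; the function names, the fixed iteration count, the initial cluster means, and the assumption of an 8-bit RGB source image are ours.

    #include <stdlib.h>
    #include <string.h>
    #include <math.h>

    /* Convert an 8-bit RGB image (3 bytes per pixel) to 8-bit luminance. */
    void to_grayscale(const unsigned char *rgb, unsigned char *gray,
                      int width, int height)
    {
        int i, n = width * height;
        for (i = 0; i < n; i++) {
            int r = rgb[3*i], g = rgb[3*i + 1], b = rgb[3*i + 2];
            gray[i] = (unsigned char)(0.299 * r + 0.587 * g + 0.114 * b);
        }
    }

    /* 2-way K-means: iteratively update two cluster means over the pixel
     * luminances, then assign every pixel to black or white. */
    void kmeans2_binarize(unsigned char *gray, int width, int height)
    {
        int i, iter, n = width * height;
        double m0 = 64.0, m1 = 192.0;           /* initial cluster means (assumed) */
        for (iter = 0; iter < 20; iter++) {     /* fixed iteration count (assumed) */
            double s0 = 0.0, s1 = 0.0;
            long c0 = 0, c1 = 0;
            for (i = 0; i < n; i++) {
                if (fabs(gray[i] - m0) < fabs(gray[i] - m1)) { s0 += gray[i]; c0++; }
                else                                         { s1 += gray[i]; c1++; }
            }
            if (c0) m0 = s0 / c0;
            if (c1) m1 = s1 / c1;
        }
        for (i = 0; i < n; i++)
            gray[i] = (fabs(gray[i] - m0) < fabs(gray[i] - m1)) ? 0 : 255;
    }

    /* Sobel edge detector: combine horizontal and vertical luminance
     * gradients; bright output pixels mark rapid luminance changes (edges).
     * The negation step mentioned above can invert the result if a
     * dark-on-white rendering is preferred for printing. */
    void sobel_edges(const unsigned char *gray, unsigned char *out,
                     int width, int height)
    {
        int x, y;
        memset(out, 0, (size_t)width * height); /* border pixels stay black */
        for (y = 1; y < height - 1; y++) {
            for (x = 1; x < width - 1; x++) {
                const unsigned char *p = gray + y * width + x;
                int gx = -p[-width-1] + p[-width+1] - 2*p[-1] + 2*p[1]
                         - p[width-1] + p[width+1];
                int gy = -p[-width-1] - 2*p[-width] - p[-width+1]
                         + p[width-1] + 2*p[width] + p[width+1];
                int mag = abs(gx) + abs(gy);
                out[y * width + x] = (unsigned char)(mag > 255 ? 255 : mag);
            }
        }
    }

In the best-scoring condition reported below (Sobel followed by K-means), the edge-magnitude image produced by sobel_edges would be passed through kmeans2_binarize to obtain the black-and-white graphic that is finally printed.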

[Five figures: an original grayscale image; K-means(2) without and with Sobel edge detection; Sobel edge detection without and with K-means(2).]

A second sheet was similarly prepared using the same processed images in a different random arrangement. These sheets were then photocopied onto capsule paper and raised using the enhancer. The five figures above illustrate the progression of processing undertaken.

Four sighted subjects were blindfolded and asked to perform a matching task. First, a subject's hand was placed on one processed image on one of the sheets, and the subject was allowed to explore the image freely. The subject's hand was then guided to a random spot on the second, identically processed sheet, and the subject attempted to locate the matching image there. This task was repeated for each of the eight images on each of the five pairs of identically processed sheets.

Results

Table 1 contains the results of this experiment. For each algorithm combination, the columns indicate the type and order of image processing used, the average number of matches (out of eight) per subject, and the corresponding average percentage of matches per subject.

Table 1: Average Images Matched

Image process                 Matches   Pct.
grayscale only                 2.25     28%
grayscale and K-means          6.25     78%
grayscale and Sobel            4.75     59%
grayscale, K-means & Sobel     5.75     72%
grayscale, Sobel & K-means     7.75     97%

Discussion

Clearly, even a small amount of simplification yields a marked improvement in tactile image recognizability. Images which were simpler at the outset were recognized more easily in all cases. In particular, the illustration of the human heart chambers and the photograph of an open notebook computer tended to be distinguishable even with no processing, probably due to their white backgrounds and simple initial representations. There was often confusion among the three images of human faces and the hot air balloon, each of which has an essentially oval shape.

Even though there was a general tendency for recognition to improve when images were simpler, no statistically significant conclusion can be drawn about the efficacy of line drawings versus light and dark region representations, nor can anything be said about identification of the content of the images. However, some interesting anecdotal evidence was gathered. Some subjects reported, upon feeling the processed images, that they thought there was more than one face among the images, though none had any idea ahead of time as to the content of the images. This content identification was not reported upon feeling the original, unprocessed images.

The tactile identification strategies observed during the experiment are also interesting, and intuitive. The tendency among all subjects was to use the outside edges of an image for gross classification. Once a gross classification was made, details found through careful exploration of the interior of the image were used to separate similar images from one another. Overall, we feel that the results strongly indicate that this method is valid and deserves further investigation.

Future work

We plan to extend our study in a number of ways. First, improved algorithms which provide reasonable results for a broad range of images must be utilized or developed. One possibility we are considering is the use of an adaptive clustering algorithm which produces a more accurate segmentation (6), although it is computationally expensive. The use of tactile patterns must also be investigated; patterns may be an efficient way to represent regions of an image that have been segmented. Clearly, further study is warranted incorporating a larger set of images, algorithms, and subjects. Such a study must look not only at matching, but also at content identification and understanding; access to content is, after all, what drives this research. Finally, we want to develop an application which can be invoked by other applications in place of a print routine, and which will automatically analyze and simplify an image and send the result to a printer, as sketched below.
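As a purely hypothetical sketch of that print-replacement idea, and assuming the host application can supply raw RGB pixel data, such an entry point might simply chain the operations from the earlier sketch and hand the simplified raster to a spooler. The names below, including spool_to_printer(), are placeholders rather than existing routines.

    #include <stdlib.h>

    /* Placeholder for whatever raster-submission mechanism the host system
     * provides (e.g. generating a PostScript job for the laser printer in
     * the capsule-paper workflow); not an existing routine. */
    extern int spool_to_printer(const char *queue, const unsigned char *image,
                                int width, int height);

    /* Hypothetical print-replacement entry point: simplify an RGB image with
     * the best-scoring chain from Table 1 (grayscale, Sobel, then 2-way
     * K-means) and send the result to the printer. */
    int tactile_print(const unsigned char *rgb, int width, int height,
                      const char *printer_queue)
    {
        int status = -1;
        unsigned char *gray  = (unsigned char *)malloc((size_t)width * height);
        unsigned char *edges = (unsigned char *)malloc((size_t)width * height);

        if (gray && edges) {
            to_grayscale(rgb, gray, width, height);   /* from the earlier sketch */
            sobel_edges(gray, edges, width, height);
            kmeans2_binarize(edges, width, height);
            status = spool_to_printer(printer_queue, edges, width, height);
        }

        free(gray);
        free(edges);
        return status;
    }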

References

1. Fricke, J. and Baehring, H., Design of a tactile graphic I/O tablet and its integration into a personal computer system for blind users. Electronic proceedings of the 1994 EASI High Resolution Tactile Graphics Conference (http://www.rit.edu/~easi/easisem/easisemtactile.html).

2. Klatzky, R.L., Lederman, S.J., and Metzger, V.A., Identifying objects by touch: An "expert system." Perception & Psychophysics, 37(4), 299-302, 1985.

3. Lindley, C.A., Practical image processing in C, John Wiley & Sons, New York, 1991.

4. Loomis, J.M., and Lederman, S.J., Tactual perception. In K.R. Boff, L. Kaufman, and J.P. Thomas (Eds.), Handbook of perception and human performance: Vol. II, Cognitive processes and performance, pp. 31-1 to 31-41. New York: Wiley, 1986.

5. Mynatt, E.D., Auditory presentation of graphical user interfaces. Proceedings of the 1992 International Conference on Auditory Display. Addison-Wesley Publishing Company.

6. Pappas, T.N., An adaptive clustering algorithm for image segmentation, IEEE Transactions on Signal Processing, 40(4), April 1992.

7. Tou, J.T., and Gonzalez, R.C. Pattern recognition principles, Addison-Wesley Publishing Co., 1974.

8. Vanderheiden, G.C., Dynamic and static strategies for nonvisual presentation of graphic information, transcript, 1994 EASI High Resolution Tactile Graphics Conference (http://www.rit.edu/~easi/easisem/easisemtactile.html).

9. Wiker, S.F., Vanderheiden, G., Lee, S., and Arndt, S., Development of tactile mice for blind access to computers: importance of stimulation locus, object size, and vibrotactile display resolution. Proceedings of the Human Factors Society 35th Annual Meeting, 1991, 708-712.

Acknowledgments

Funding for this project was provided by the National Science Foundation, Grant # HRD-9450019. Additional support has been provided by the Nemours Research Programs.

Thomas Way or Kenneth Barner
Applied Science and Engineering Laboratories
Alfred I. duPont Institute
1600 Rockland Road
Wilmington, DE 19899
Phone: (302) 651-6830
http://www.asel.udel.edu/
http://www.asel.udel.edu/sem
http://www.asel.udel.edu/~way
http://www.asel.udel.edu/~barner

Email: way@asel.udel.edu or barner@asel.udel.edu