Agent-Based Architecture for Implementing Multimodal Learning Environments for Visually Impaired Children

Rami Saarinen, Janne Järvi, Roope Raisamo and Jouni Salo Tampere Unit for Computer-Human Interaction (TAUCHI)

Department of Computer Sciences FIN-33014 University of Tampere, Finland

Tel: +358 3 215 4035

E-mail: {rami.saarinen, janne.jarvi, roope.raisamo, jouni.salo}@cs.uta.fi

ABSTRACT

Visually impaired children are at a great disadvantage in modern society, since their ability to use modern computer technology is limited by inappropriate user interfaces. The aim of the work presented in this paper was to develop a multimodal software architecture and applications to support visually impaired children and to enable them to interact equally with sighted children in learning situations. The architecture is based on software agents and has specific support for visual, auditory and haptic interaction. It has been used successfully with different groups of 7-8-year-old and 12-year-old visually impaired children. In this paper we focus on the enabling software technology and the interaction techniques aimed at realizing our goal.

Categories and Subject Descriptors

H5.2 [Information interfaces and presentation]: User Interfaces - input devices and strategies, auditory (non-speech) feedback, haptic I/O, user-centered design.

General Terms

Design, Human Factors.

Keywords

Multimodal software architectures, visually impaired children, teaching programs, haptics, auditory feedback, navigation, inclusion.

1. INTRODUCTION

In modern society, computers are used to teach several subjects, not just computing and mathematics. This has created a new problem for visually impaired children, since they often cannot take advantage of teaching materials created to be used with a computer. Even if they have up-to-date hardware at home, it is of little real use without appropriate software that supports it.

There has been some development in teaching materials for the blind and visually impaired, but these materials are not widely available or are of limited use. An exception is audio books, which have become popular but lack the interactive quality and possibilities of teaching programs.

Children are a special group of computer users who require their own software and user interfaces depending on their development level. It is already challenging to design user interfaces for young children, but the challenge is much greater when the children are blind or visually impaired. Since it is necessary to support their learning and use of the computer while compensating for their impaired vision with other modalities, the complexity of the user interface design and the supporting architecture is greater than in interfaces for typical computer users.

In studies by Patomäki et al. [17], games and learning environments were built for visually impaired children from 3.5 to 7.5 years of age. It is evident that young children's use of the PHANTOM device (Figure 1) [19] is greatly affected by their motor abilities and development level. When planning the present study, we decided to direct our efforts at pre-school and elementary school children, who have more developed motor skills and are more capable of expressing themselves. The goal of our research was to produce a proactive and multimodal agent-based learning environment that would support children's cognitive development both in learning and play. The pedagogical approach of the system is exploratory learning.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ICMI’05, October 4–6, 2005, Trento, Italy. Copyright 2005 ACM 1-59593-028-0/05/0010…$5.00.

Figure 1. A child using the PHANTOM device.


In exploratory learning, a child can explore phenomena independently, guided by his or her own interests and questions. As hardware we used a Reachin Display [18] with a SensAble PHANTOM Desktop [19] and stereo audio through speakers. There was also a 2D projected view of the virtual environment in case the child had residual sight and could make use of it. The buttons of the SpaceMouse [1] were used for user input.

Our approach is to support learning and navigation with software agents. Agent-based models structure a system as a set of interconnected agents. An agent is an autonomous software component that has a state and communicates with other agents using messages. The agents' role in exploratory learning is to support the children's exploration in the simulation. They do not force the child to take any specific actions; rather, they provide additional information when the child wants to have it. Agents ask rhetorical and guiding questions about the subject being explored. They also encourage the children to ask questions, as well as to form hypotheses and explanations on their own. The questions start from everyday phenomena such as night and day or the seasons, and progress gradually to more complicated topics.

The phenomena chosen for the computer simulation were the Solar System, the interrelations between the Earth and the Sun, the Earth itself, the atmosphere, and the interior layers of the Earth. The preliminary results of the user studies with two groups of visually impaired children support our design solutions, the usability of the interaction techniques and the architecture developed.

2. RELATED WORK

Exploratory learning and the chosen phenomena were previously used in the PICCO project [12]. That software used 2D graphics and its focus group was children with normal eyesight. The experiences from that project were used as a starting point in the present one.

In recent years there has been some research concerning the use of the PHANTOM device in developing software for visually impaired persons. For instance, Jansson and Billberger [11] reported that blind persons can identify 3D objects faster with their hands than with a PHANTOM device. They argued that with practice the performance can improve somewhat. According to Magnusson et al. [13], blind users can recognize quite complex objects, and they are also able to navigate in virtual environments as long as the environment is realistic. Sjöström [20,21] has studied non-visual haptic interaction using the PHANTOM device. In his informal experiments with visually impaired test users he came up with a list of design guidelines for one-point interaction haptics, which include guidelines for navigation, finding objects and understanding objects.

However, there has not been much research on user interfaces for visually impaired and blind children. Patomäki et al. [17] suggest that for young children the objects in the virtual environment should be very simple. In addition, Patomäki et al. used real mockup models of the environment to help the children get familiar with the simulation, which proved to be useful in their study. The results of Patomäki et al. [17] also supported Sjöström's findings.

The GRAB project [6] has developed an architecture which enables visually impaired and blind people to explore three-dimensional virtual worlds using the senses of touch and hearing.

The architecture is based on three tools: a 3D force-feedback haptic interface used with two fingers, an audio interface for audio messages and verbal commands, and a haptic modeler for designing the objects in the environment. The GRAB system was successfully used in a computer game for the blind [23]. The GRAB system uses two-handed interaction, which makes identifying objects and orientation easier for the blind. In one-point interaction, especially when used with only one hand as with the PHANTOM device, 3D objects must be designed carefully. We also believe that the use of such a system should be supported somehow; software agents are a natural way of providing this support.

Agents allow autonomous execution of multiple tasks, making it easier to monitor user actions. For instance, the Open Agent Architecture (OAA) [15] provides a distributed agent architecture especially targeted at multimodal user interfaces. Its main emphasis is on speech recognition, gestures and pen input. The system is built around a central facilitator that handles and forwards tasks that the agents want to have completed. The system allows dividing a task into subtasks and executing each subtask in parallel. The facilitator also provides global data storage and blackboard-like functionality for it. Several applications have been built using OAA; for example, InfoWiz [3], CommandTalk [14] and Multimodal Maps [4] take advantage of OAA's multimodal capabilities. Using a central dispatcher is very common in agent-based systems. In our architecture the MessageChannel plays the role of the dispatcher, and other functionality is left to the agents.

Coutaz [7] has suggested an agent-based approach to dialogue control. Her PAC model (Presentation, Abstraction, Control) describes an interactive system as a hierarchical collection of PAC agents. The Presentation facet of a PAC agent handles input and output behavior and the Abstraction facet contains its functional core. The Control facet manages communication with other agents and between the agent's own facets. The PAC model and its descendant PAC-Amodeus [16] have been used in the fusion of multimodal input modalities. In our system the input is unimodal, as we use only haptic input, but filter agents are used in a similar way to refine low-level input, such as touching a surface, into high-level abstractions concerning exploratory learning.

In the agent-based learning environment EduAgents [10], Hietala and Niemirepo introduced teacher and companion agents that have their own personalities and abilities. The teachers have different ways of teaching, and the companions may try to help the user with the mathematical exercises. Companions were designed to be human-like and thus they could also make mistakes. Part of the learning process was to work with a companion in various exercises and, for example, study the solution offered by the companion.

3. SOFTWARE ARCHITECTURE FOR MULTIMODAL APPLICATIONS

A haptic-auditory-visual simulation environment is very processor intensive, and having other heavy components running on the same machine would render the system almost unusable. Thus it is very helpful to have the application distributed across the local area network so that processor-intensive tasks can be divided between machines. An agent-based architecture is also reusable and easy to extend to meet the requirements of new projects.


Agent-based software engineering also enables iterative development, making it possible to build the most vital components first; new functionality and new agents can be provided in later stages of development. One of our goals was to keep the system as open and extensible as possible so that we could, for example, easily integrate third-party software into the system if necessary. Currently we have successfully integrated the CLIPS rule engine [5] and the SQLite database library [22] into the rest of the agent system.

The architecture has been designed to be portable and transferable across platforms and programming languages. The communication language between agents is plain text structured with XML, and it is transferred using TCP/IP, which is likely to be supported by any potential programming language. Currently we have C++ and Python implementations of the system. Agents from both of these platforms can co-exist and interact with each other in the normal manner. The architecture uses various third-party libraries and APIs as its foundation; Open Source variants have been used where applicable.

Figure 2 displays a layered view of the agent architecture. There are some common components that are used in most other parts of the architecture. These are mostly network-related and provide easy use of TCP/IP and UDP communication using the Common C++ library [9]. The common components also include helper classes for working with XML, implemented using the Xerces-C XML library [24]. Controller, AgentContainer and MessageChannel are distinct functional components of the system.
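To make the messaging concrete, the following minimal Python sketch shows how one agent message might be serialized as plain-text XML and delivered over TCP/IP. The element names, the newline framing and the port number are assumptions made for illustration; the actual message schema of the system is not reproduced here.

    import socket
    import xml.etree.ElementTree as ET

    def build_message(sender, receiver, content):
        # Serialize one agent message as plain-text XML (element names are assumed).
        msg = ET.Element("message", sender=sender, receiver=receiver)
        ET.SubElement(msg, "content").text = content
        return ET.tostring(msg, encoding="utf-8")

    def send_message(host, port, payload):
        # Deliver the message over a TCP connection, assuming a MessageChannel
        # is listening on the given port; a newline acts as a simple frame delimiter.
        with socket.create_connection((host, port)) as sock:
            sock.sendall(payload + b"\n")

    data = build_message("CoordinateAgent", "LoggingAgent", "stylus at (0.1, 0.2, 0.3)")
    send_message("localhost", 9000, data)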

The Controller is built on top of the ReachinAPI [18], which provides the means to interact with the 3D simulation environment and the haptic devices, and on our own simple sound system. The sound system is built on top of the BASS sound library [2].

The basic agent architecture can be divided into three separate functional components (see Figure 3). The system can be seen as a realization of a typical message dispatcher architecture, where the MessageChannel is the central dispatcher providing a centralized way of passing messages between agents around the network. The AgentContainers (and thus the agents) are connected to the agent system via the MessageChannel. The third component is the actual application, which is handled by the Controller. The functionality and structure of the MessageChannel, the AgentContainers and the agents are loosely based on the FIPA agent specifications [8].

3.1 MessageChannel

The MessageChannel registers all the agents and AgentContainers that exist in the system. It also stores information about the services the agents provide; the agents and their services can be queried. The most basic form of communication is to use the exact name of the receiving agent. If the sender only knows the name of the service it wants to reach, it can either query the MessageChannel for all the agents that provide the given service, or it can send the message with the service name as the receiver, in which case every agent providing that service receives a copy of the message. It is also possible to broadcast a message to every agent.
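The routing rules above can be summarized in a short sketch. This is a minimal Python illustration with invented class and method names, not the actual implementation:

    from collections import defaultdict

    class MessageChannel:
        def __init__(self):
            self.agents = {}                     # agent name -> delivery callback
            self.services = defaultdict(set)     # service name -> set of agent names

        def register(self, name, deliver, services=()):
            # Register an agent and the services it advertises.
            self.agents[name] = deliver
            for service in services:
                self.services[service].add(name)

        def dispatch(self, message):
            # Route by exact agent name, by service name, or broadcast to all agents.
            receiver = message["receiver"]
            if receiver == "*":
                targets = list(self.agents)
            elif receiver in self.agents:
                targets = [receiver]
            else:
                targets = self.services.get(receiver, ())
            for name in targets:
                self.agents[name](message)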

3.2 AgentContainer and Agents

The AgentContainer is the container for the agents. For each agent it has a mailbox for incoming messages. The AgentContainer also provides messaging services, such as the means to send messages to other agents and to query the services other agents provide. Agents may open direct communication channels with other agents, but the default way of communicating is via the AgentContainer and the MessageChannel. The AgentContainer also enables an agent to register the services it provides with the MessageChannel. A service is simply a label that an agent may attach to itself so that other agents can locate a certain type of agent.

On start-up the AgentContainer registers itself with the MessageChannel and receives a unique name that is used to pass messages between containers. After registering, the container starts all the agents that have been defined in the configuration file. Each agent receives a unique name that distinguishes it from the others. On creation each agent also receives a list of parameters that the AgentContainer has read from the configuration file. It is also possible to launch new agents dynamically when the container is already running. This can be done by sending a message to the specific container asking it to create a specific agent with given parameters. Mobile agents are not supported in this version of the agent architecture.
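A minimal sketch of this start-up behaviour is given below, assuming a hypothetical XML layout for the configuration file (agent and param elements); the actual file format of the system is not documented here.

    import queue
    import xml.etree.ElementTree as ET

    class Agent:
        def __init__(self, name, params):
            self.name = name
            self.params = params
            self.mailbox = queue.Queue()      # incoming messages for this agent

    class AgentContainer:
        def __init__(self, config_path):
            self.config_path = config_path
            self.agents = {}

        def start(self):
            # Start every agent defined in the configuration file.
            root = ET.parse(self.config_path).getroot()
            for node in root.findall("agent"):
                params = {p.get("name"): p.get("value") for p in node.findall("param")}
                self.create_agent(node.get("name"), params)

        def create_agent(self, name, params):
            # Also used when another agent asks this container to create an agent dynamically.
            agent = Agent(name, params)
            self.agents[name] = agent
            return agent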

Figure 3. Overview of the agent system (MessageChannel, AgentContainers with their agents, and the Controller connected over the LAN).

Figure 2. Layered view of the architecture (common components, ReachinAPI, SoundSystem, MessageChannel, AgentContainer, Agents, Controller).


3.3 Controller

At the user's end there is the PHANTOM Desktop haptic device, which provides three degrees-of-freedom, single-point haptic feedback. The device is shaped like a stylus attached to a robotic arm (see Figure 1). The stylus is used to feel the objects in the 3D scene, which is displayed through a semi-transparent mirror over the robotic arm, providing a co-located work area.

The Controller works as a bridge between the user and the rest of the agent system. It uses a specialized version of the AgentContainer and has some specialized agents that provide the means to get data from the user and give the other agents a way to interact with the user and the main application. The Controller provides the means to navigate within a scene as well as between scenes. It also allows the agents to manipulate the 3D environment and to interact with the user, and it provides data about the state of the world to the rest of the agent system. The system can initiate interaction with the user by playing sounds, moving the user to another micro world or using force feedback to inform the user of various things. Currently the notification is done by playing a sound and tapping or shaking the stylus. The user either accepts or declines to hear the new information. The information may be a question, in which case the system waits for the user's yes/no reply for a certain amount of time.

The Controller has been designed to be easily extended and used, so that one can build an application just by creating the scenes (or micro worlds, as we call them) in VRML format. The ReachinAPI provides a framework for building, for example, customized surfaces and objects that can be used directly in the VRML definition of the micro worlds, allowing the VRML to be extended to fulfill the needs of the project. As with normal VRML code, the programmer can include scripted behavior in each micro world. The scripting language used is Python. The actual application is a combination of the VRML definition of the scene graph, the Python scripts that control some aspects of the application, the agents that provide dynamic content and control for the application, and the Controller that pulls it all together. The Controller offers an interface to pass information between the Python-scripted functionality and the rest of the system (Figure 4).
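As a purely illustrative sketch of this message flow, a micro-world script might react to incoming agent messages and report scene events back through the Controller roughly as follows. Every function and field name here (on_message, send_to_agents, play_sound, guide_stylus_to) is an assumption for the example; the actual scripting interface of the Controller is not shown in this paper.

    def on_message(controller, message):
        # Hypothetical handler called by the Controller when an agent sends
        # a message to this micro world.
        if message.get("type") == "play-sound":
            controller.play_sound(message["file"])
        elif message.get("type") == "move-stylus":
            controller.guide_stylus_to(message["position"])

    def on_planet_touched(controller, planet_name):
        # Hypothetical scene event handler: report a touched planet so that
        # the agents can react, e.g. by offering extra information.
        controller.send_to_agents(receiver="PedagogicAgent",
                                  content="planet-touched " + planet_name)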

4. CLASSES OF AGENTS

The agent system, as the name implies, is a distributed system of concurrently running tasks, the agents. Agents have different roles in our system. For example, all interactions between the system and the user happen through a pedagogic agent: if another agent wants to interact with the user, it sends a corresponding message to the pedagogic agent asking it to do so.

Three other vital roles are present in the current system. First, there is a database agent that collects and maintains all the information in the system. Second, a filter agent inspects the information going through the system, trying to find meaningful information; filter agents may record their findings with the database agent or send the new information to other agents. Finally, perhaps the most important agent is the rule engine agent, which handles most of the interactive functionality. It looks for certain patterns in the user's actions and reacts to them.

4.1 The Database Agent

The database agent is the heart of the current simulation environment. It wraps around the SQLite database library [22]. The database contains facts that the agents can query, modify and observe. An agent can inform the database agent that it wants to be notified when a fact changes, and the database agent will then send a message with the fact and its new value to that agent whenever a change happens. The architecture also provides a convenient object-oriented way of handling individual facts, so that it is easy to modify them and to commit and update them to and from the database agent.
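The following sketch illustrates this query/modify/observe behaviour with Python's sqlite3 module. The table layout, method names and in-process callbacks are assumptions for the example; in the actual system the notifications are delivered as agent messages.

    import sqlite3

    class FactStore:
        def __init__(self, path=":memory:"):
            self.db = sqlite3.connect(path)
            self.db.execute("CREATE TABLE IF NOT EXISTS facts (name TEXT PRIMARY KEY, value TEXT)")
            self.observers = {}               # fact name -> list of callbacks

        def observe(self, name, callback):
            # Register interest in a fact; callback(name, value) runs on every change.
            self.observers.setdefault(name, []).append(callback)

        def set_fact(self, name, value):
            # Update a fact and notify every observer of the new value.
            self.db.execute("INSERT OR REPLACE INTO facts (name, value) VALUES (?, ?)", (name, value))
            self.db.commit()
            for callback in self.observers.get(name, []):
                callback(name, value)

        def get_fact(self, name):
            row = self.db.execute("SELECT value FROM facts WHERE name = ?", (name,)).fetchone()
            return row[0] if row else None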

4.2 The Rule Engine Agent

The rule engine agent is built on top of the CLIPS rule engine [5]. The agent is tightly connected to the database agent so that information can be passed efficiently between these two components. That is, the rule engine agent observes all the facts in the database and updates the corresponding facts in its working memory, and vice versa. The rule engine agent is the brain of the system, providing most of the operations and interaction requests to the user. As a rule engine, the agent has certain prerequisites (time being one) that, once fulfilled, cause some event to happen: a new piece of information being created, or perhaps a request for interaction with the user.
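The actual rules are written for CLIPS; the plain-Python sketch below only illustrates the prerequisite/action idea of reacting to observed facts. The fact name and the single example rule are invented purely for illustration.

    class RuleEngineAgent:
        def __init__(self):
            self.facts = {}      # working-memory copy of the observed database facts
            self.rules = []      # list of (condition, action) pairs

        def add_rule(self, condition, action):
            self.rules.append((condition, action))

        def on_fact_changed(self, name, value):
            # Called when the database agent reports a changed fact; fire any
            # rule whose prerequisites are now fulfilled.
            self.facts[name] = value
            for condition, action in self.rules:
                if condition(self.facts):
                    action()

    engine = RuleEngineAgent()
    engine.add_rule(lambda facts: facts.get("earth_rotated") == "yes",
                    lambda: print("Ask: did you notice how day turns into night?"))
    engine.on_fact_changed("earth_rotated", "yes")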

4.3 Filter Agents

A filter agent is a special type of agent that can contain one or more filters. The motivation behind the filter structure is to provide pipes-and-filters functionality in the system. The filters are usually used to observe very specific events or data and to provide new information based on them. The filters can and should be chained so that information is refined step by step, starting with atomic facts and ending up with high-level information, assumptions and conclusions (Figure 5).

The filter agent sequentially passes each message it receives to the filters it contains. A filter then processes the message and returns a boolean value indicating whether it did something with the message or not.

Figure 4. Message flow within the Controller (messages pass between the Controller, the Python scripts and the VRML scene).

Figure 5. A chain of filters refining messages before they reach the database.

Filters may be bypassed, so that if one of the filters does something with the message, the message is not passed to the rest of the filters. Several filter agents may be running in the system simultaneously. Filters can be used to observe the user's actions and movements, to update facts in the database, and to store messages in a log file for later analysis. Filters are also well suited to building up a system for multimodal input, as they provide several levels of processing. In our system we use several small filters that monitor the log messages going through the system, seeking, for example, information about navigation to other micro worlds. Each micro world also has a specific filter that monitors events specific to that micro world and updates the database accordingly.
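A minimal sketch of such a chain is shown below; the two example filters and the message fields are invented for illustration. Each filter returns True when it has handled the message, in which case the remaining filters are bypassed.

    class FilterAgent:
        def __init__(self, filters):
            self.filters = filters

        def handle(self, message):
            # Pass the message through the chain; stop when a filter consumes it.
            for f in self.filters:
                if f(message):
                    return True
            return False

    def logging_filter(message):
        # Store every message for later analysis; never consumes the message.
        print("log:", message)
        return False

    def navigation_filter(message):
        # Derive a higher-level fact when the user moves to another micro world.
        if message.get("event") == "enter-world":
            print("fact: current_world =", message["world"])
            return True
        return False

    agent = FilterAgent([logging_filter, navigation_filter])
    agent.handle({"event": "enter-world", "world": "Solar System"})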

4.4 Controller Agents There are two special agents in the AgentContainer of the Controller. The mediator agent acts as the representative of the Controller in the agent system and forwards messages sent to it to the Controller for further processing. There is also a simple coordinate agent that sends the user’s coordinates to the logging agent once every second.

5. APPLICATION EXAMPLES: ASTRONOMICAL PHENOMENA

As part of the research project, we have constructed a simulation application that makes use of our architecture. The simulation is aimed at visually impaired children. The children can study natural phenomena related to the Earth, the Sun and the Solar System. The aim was to produce agents that support the concept learning process of visually impaired and normally sighted children. The results of the research concerning the learning processes are outside the scope of this paper; here we concentrate on presenting the applications and some results of the user studies with children from our focus group.

The simulation consists of six micro worlds: the Earth, the Solar System, the Earth Orbit, the Atmosphere, the Earth Internal Layers and the Study Room (Figure 6). The user starts from the Central Station (Figure 6, center). From there he or she can open doors to the other micro worlds. A door can be opened by pushing it with the PHANTOM stylus. The system then selects the corresponding micro world and guides the stylus to a suitable starting point. When navigating from one micro world to another, the user must travel through the Central Station. This lessens the likelihood of getting lost in the environment, because the user is always only one step away from the Central Station. Every micro world has its own representative with a different voice (such as Captain Planet, Andy Astronaut and the Earth Giant). When the user enters a micro world, the representative gives an introduction and tells what the user can do in it. The main task of the agents in the system is to follow the user's steps and paths of exploration and to provide adequate support for the exploration with questions and additional information given through the representative.

In the Solar System (Figure 6, top) the children can explore the Sun, the different planets and the circular orbital system of the planets. The orbits of the planets are implemented as grooves on a black plane that represents the void of space.

The plane helps the child to find the planets by restricting the depth of the scene to the level of the planets, so that the PHANTOM stylus cannot fall below the planets. Whenever the user enters an orbit, the system tells which planet it belongs to. The planets are stationary and implemented with a light magnetic pulling force. When the user finds a planet, the simulation tells its name and gives a brief description of it. If the user pushes the planet with the stylus, the system gives some extra information about it.

In Earth (Figure 6, top-right) the children can explore the surface of the globe. The spherical Earth can be felt three-dimensionally with the stylus. The surface of the globe is implemented with a bump map that gives it a noticeable texture which feels different on oceans, land and mountains. The Earth has a gravity field that can be felt with the stylus as a slight pull towards the Earth. The gravity pull helps the child to locate the planet in empty space. After the child has found the Earth, gravity helps in exploring the globe and prevents the stylus from falling into the empty space around it. When exploring the surface of the Earth the child can hear the names of the continents spoken. There is also ambient background noise that contains people's voices and different vehicle and animal sounds. The sounds of the sea can be heard when exploring the oceans. The user can rotate the Earth by moving the stylus at the far right side of the world. When doing so, the Earth starts to spin slowly and the user can hear the sound of a clock ticking. The agent, for example, directs the child's attention to gravity ("Did you notice…?") or challenges the child's thinking with questions ("What happens to different objects when you throw them in the air or drop them?"). As visual feedback, it is possible to see a view of the spherical Earth and the virtual representation of the PHANTOM stylus.

Figure 6. Navigation structure and the six different applications developed on top of the architecture (Solar System, Earth, Earth Orbit, Atmosphere, Earth Internal Layers and Study Room around the Central Station).

In Earth Orbit (Figure 6, top-left) the children explore the Earth's revolution around the Sun. They learn the relative position of the Earth to the Sun during the different seasons. When the user follows the orbit, the stylus is shown as a small planet Earth, and when the user is outside the Earth's orbit, the stylus is shown as a small space rocket. The system tells the season and plays sounds that are familiar for that season (e.g. rain for autumn). The system also gives some general information, such as the distance of the Earth from the Sun.

The Study Room (Figure 6, bottom-right) contains a room with doors along its walls. When the user presses a door with the stylus, the system presents him or her with a question about the contents of the micro worlds. These questions have simple yes/no answers, and the user answers by pressing the yes or no button on the SpaceMouse. The room has grooves on the floor, and the user can find the doors by following these aids.

In Earth Internal Layers (Figure 6, bottom-left) users are able to explore the internal layers of the planet Earth. The layers are represented as a cross section of the northern half of the Earth and can be freely explored with the PHANTOM stylus. The topmost layer is the hardest and 'rockiest' of all the layers, and as the user moves towards the bottom, the feel gets smoother and smoother. When reaching the Earth's core, the haptic feedback simulates the feel of the Earth's liquid centre. As visual feedback the user can see the cross section and the layers and, of course, the PHANTOM stylus.

In Atmosphere (Figure 6, bottom) children can explore the layers of the atmosphere of the Earth. The different layers are presented as different ambient sounds. The lowest layer contains human sounds, birds and the humming of trees. The next layers contain aeroplane sounds and different kinds of windy air-current sounds. When the user approaches the top border of the application, the sounds get quieter and disappear as the user leaves the atmosphere for space. The haptic feel is very subtle and light, but there is still some damping when moving the stylus, so that the child gets a concrete 'feel of the air'. Visual feedback is presented as a simple background picture that shows how the atmosphere fades into black space.

5.1 Navigation Support

Upon startup the Controller loads a specific XML configuration file that defines the structure and content of the application as a tree. Each node in the tree represents a single application, a so-called micro world, which a child can explore according to his or her interests. Among other things, the configuration file defines the path to the VRML file that defines the micro world in question. The Controller creates a navigation tree that it uses to provide navigation for the user.

There can be two types of transitions between the nodes (micro worlds) of the navigation tree. First, the navigation tree can be traversed using a pie-shaped navigation menu that provides the means to move to the parent or the children of the current node. The navigation menu can be disabled for a specific node so that it cannot be used for navigation. A micro world may also have a set of objects that can be used as push buttons triggering a direct transition to another micro world, which does not have to be the parent or a direct descendant of the current node.

This kind of transition is called a 'route'. The set of objects is defined both in the configuration file and in the VRML file. Each object must have a pushable haptic surface, which the ReachinAPI provides. There is also a third way to navigate through the set of micro worlds: the agents may ask the pedagogic agent to ask the Controller to switch to a specific micro world. This enables a very flexible application structure that could, for example, implement navigation only through agents.

Early in the project we developed several navigation tools for the system. First, we considered a donut-like circular menu with push buttons in it. The idea was to have a similar real-world model of it that the user could use as a navigation aid. We discarded this approach, however, because a navigation tool should be so intuitive to use that no extra support is needed. Our second idea was to use a circular pie menu (Figure 7). The menu had a spring force lightly pulling the stylus towards the center of the menu to make the menu easier to use. The menu worked quite well for adults with normal eyesight, but when we tried it blindfolded it did not work as expected, so we were not confident that it would work for visually impaired children.

Finally, we decided not to use the kind of navigation we had planned first. We ended up using a door metaphor similar to the one by Patomäki et al. [17], and a single center point from which the user is only one step away. Further, we decided that the user should always travel through that one central point when travelling from one node to another. That way getting lost between the nodes is very unlikely. This was accomplished by disabling the navigation menu in every micro world and defining appropriate routes between the micro worlds. The user can get back to the center point by pushing a SpaceMouse button.
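As an illustration of the configuration described above, the following Python sketch builds a small navigation tree with one 'route' transition. The XML element and attribute names (world, route, vrml, menu) are assumptions made for the example; the actual configuration format is not given in the paper.

    import xml.etree.ElementTree as ET

    class WorldNode:
        def __init__(self, name, vrml, menu_enabled):
            self.name = name
            self.vrml = vrml                  # path to the VRML file of the micro world
            self.menu_enabled = menu_enabled  # is the pie menu allowed in this world?
            self.children = []                # parent/child transitions in the tree
            self.routes = []                  # direct 'route' transitions

    def load_tree(element):
        node = WorldNode(element.get("name"), element.get("vrml"),
                         element.get("menu", "yes") == "yes")
        node.routes = [r.get("target") for r in element.findall("route")]
        node.children = [load_tree(child) for child in element.findall("world")]
        return node

    config = """<world name="Central Station" vrml="central.wrl" menu="no">
                  <world name="Earth" vrml="earth.wrl" menu="no">
                    <route target="Central Station"/>
                  </world>
                </world>"""
    root = load_tree(ET.fromstring(config))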

5.2 Navigation Aids

When the user moves from one micro world to another, the stylus is moved to a central position and held there until the new micro world is loaded and displayed. This is done to let the user start exploring from the same point every time he or she enters the micro world, which supports memorization of the scene and strengthens the sense of location. The other reason to guide the stylus to a certain location when moving from scene to scene is that there may be accidental force spikes if the stylus is located within the same space as some solid object of the new micro world. Moving the stylus away from the potential interference area makes the use of the application pleasant and smooth.

Figure 7. Navigation tool.


In the early stages of the project we used a separate magnet that appeared at the same coordinates as the stylus and started to move towards the target location, dragging the stylus along. The early user studies proved this technique to be flawed, as the probability of the user losing the guiding magnet was relatively high. The magnet was soon replaced with a global force vector that provides a much smoother and more reliable way to move the stylus to a given location.

Other haptic elements are used to make navigation within the micro worlds easier. The shape of the Central Station (Figure 6, center) is hexagonal, and the "doors" to the other micro worlds are in the corners of the shape, as they are easier to find that way. The Study Room and the Solar System have grooves that the user can follow to locate something interesting; in the Solar System the grooves are actually the orbits of the planets. Some of the interesting objects are magnetic to give the user a hint of their existence and to guide the user to find them. In some cases, mainly in the Earth micro world (Figure 6, top-right), we use spring forces to restrict the user's movement and to guide him or her to the interesting parts of the micro world. The spring in the Earth micro world also represents the idea of gravity.

For every micro world we also had a corresponding plastic model. By touching it with the fingers, the user can get a general picture of the world he or she is exploring. Each micro world also has a distinct ambient sound or music to support the sense of location.
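One way to realize such a global guiding force is a clamped spring pulling the stylus towards the target position, as in the small sketch below. The gain and force limit are invented values, and in the actual system the force field is computed by the ReachinAPI rather than by code like this.

    def guiding_force(stylus_pos, target_pos, gain=50.0, max_force=1.5):
        # Spring force (newtons) pulling the stylus towards the target position (metres),
        # clamped so that the guiding motion stays gentle.
        force = [gain * (t - s) for s, t in zip(stylus_pos, target_pos)]
        magnitude = sum(f * f for f in force) ** 0.5
        if magnitude > max_force:
            force = [f * max_force / magnitude for f in force]
        return force

    # Example: stylus near the right edge of the scene, target at the centre.
    print(guiding_force((0.10, 0.02, 0.00), (0.0, 0.0, 0.0)))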

5.3 User Studies

The applications built on top of the architecture have been tested in two phases, first with seven 7-8-year-old visually impaired children and then with three 12-year-old visually impaired children and one sighted child of the same age. In the first study the tested micro worlds were the Solar System, Earth and the Study Room; in the second phase they were the Solar System, Earth and Earth Orbit.

When inspecting the usability and accessibility of the applications, especially good results were obtained from the use of the planet orbits in the Solar System: the children could easily follow them and acquire information about the planets. Another solution that worked well was the surface of the Earth and the rotation of the Earth. In the Earth Orbit micro world it was also easy to follow the orbit after it had first been examined once.

Both user studies were aimed at studying learning patterns and learning confidence. These studies are beyond the scope of this paper, but the fact that they were possible at all is already solid evidence for the usability and accessibility of the learning environment presented here.

6. DISCUSSION

The design and implementation of the system has been a huge learning experience for us. Working with the special target group of visually impaired children and making use of specific hardware such as the PHANTOM device poses its own questions, possibilities and limitations.

The children have been very excited and interested in using the system, and they reacted keenly to the questions and thoughts raised by the agents. Allowing the user to choose the information he or she wants to hear has made the new information more interesting for the users.

The system encouraged the children to think about the phenomena presented in the micro worlds. Some users were also eagerly waiting to hear more information. Users coming up with imaginary explanations for phenomena in 3D virtual environments was also reported in [17] and [23]. This could be studied further; for instance, one could try to find out what kind of mental representations the users form.

As Patomäki et al. [17] suggest, we kept the objects in the scene as simple as possible. In addition, we learned that navigation inside the environment should be restricted or at least guided in some way. For example, we used a spring force in the Earth micro world (Figure 6) to guide the user to the area where there was something to study. Patomäki et al. [17] also reported that there are big differences in young children's sensorimotor skills. Our user studies showed that older children can use the PHANTOM and 3D applications adequately when the use is guided and supported by the system. Sjöström [20,21] states that in a virtual haptic environment for the blind there should be reference points which provide navigation help and a sense of location for the user. In our system this is realized within the micro worlds by moving the stylus to the same location every time the user enters a micro world. In the whole simulation the Central Station acts as a reference point when navigating from one micro world to another.

Currently, users can only answer yes/no questions, in addition to exploring the multimodal applications through touching, hearing and seeing. More modalities should also be added on the input side of the interaction. Some research and development should also be spent on developing new navigation methods and user interface components for interacting with the user. A more extensive user interface system that is usable for visually impaired children is one of the future goals. Stylus shaking and tapping is parameterized, but it would require further study to find out which tapping patterns can be distinguished as unique and which ones could be used to give a hint of what kind of information the incoming message contains.

Future systems could integrate various input devices and methods into the system by providing a filter for each input method. Filters could also be refined to work as the preliminary and fundamental information gathering and processing means for the rule engine. We could also create more specialized objects, such as spheres that react to the proximity of the stylus by sending the distance of the stylus from the center of the sphere. Such proximity objects could be used to monitor the user's actions, or they could be used to create 'interest points' in the micro world. The system could monitor the user's distance from the interest points, and if the user were far away from them all, the system could help the user find the areas of interest by using sound and force feedback.

7. CONCLUSIONS

In this paper we presented a multimodal software architecture aimed at supporting visually impaired children. Several teaching applications were implemented on top of the architecture. The initial user studies support the usefulness and applicability of the architecture, as well as the technology used to build it. We were especially pleased to find that the PHANTOM technology used to produce haptic feedback is suitable for visually impaired children of this age group.


As the basic functionality of the system is now complete, it will be used in pedagogic research and studies on learning confidence.

8. ACKNOWLEDGMENTS

This research was funded by the Academy of Finland, Proactive Computing Research Programme (grant 202180). We thank Dr. Marjatta Kangassalo and researchers Eva Tuominen and Kari Peltola from the Department of Teacher Education, Early Childhood Education (University of Tampere), for fruitful collaboration and for their active involvement in the pedagogic design and content development of the system.

9. REFERENCES

[1] 3Dconnexion. http://www.3dconnexion.com/
[2] BASS audio library. http://www.un4seen.com/
[3] Cheyer, A. and Julia, L., InfoWiz: An Animated Voice Interactive Information System. In Third International Conference on Autonomous Agents (Agents'99), Communicative Agents Workshop, 1999. http://www.ai.sri.com/~cheyer/pubs.html
[4] Cheyer, A. and Julia, L., Multimodal Maps: An Agent-based Approach. Multimodal Human-Computer Communication, Lecture Notes in Artificial Intelligence 1374, Springer, 1998, 111-121.
[5] CLIPS: A Tool for Building Expert Systems. http://www.ghg.net/clips/CLIPS.html
[6] Computer GRaphics Access for Blind people through a haptic virtual environment (GRAB). http://www.grab-eu.com
[7] Coutaz, J., PAC, an Object Oriented Model for Dialog Design. In Proceedings of Interact '87. North-Holland, 1987, 431-436.
[8] Foundation for Intelligent Physical Agents. http://www.fipa.org/
[9] GNU Common C++ library. http://www.gnu.org/software/commoncpp/
[10] Hietala, P. and Niemirepo, T., A framework for building agent-based learning environments. In Proceedings of AI-ED 95: Artificial Intelligence in Education. AACE, 1995, 578.
[11] Jansson, G. and Billberger, K., The PHANToM Used without Visual Guidance. In The First PHANToM Users Research Symposium (PURS 99). http://mbi.dkfz-heidelberg.de/purs99/
[12] Kangassalo, M., Exploratory Learning in PICCO Environment. In Proceedings of the 10th European-Japanese Conference on Information Modelling and Knowledge Bases, Pori School of Technology and Economics, Series A, number 28, Pori, 2000, 259-261.
[13] Magnusson, C., Rassmus-Gröhn, K., Sjöström, C. and Danielsson, H., Navigation and recognition in complex haptic virtual environments - reports from an extensive study with blind users. In Proceedings of Eurohaptics 2002, University of Edinburgh, 2002. http://www.eurohaptics.vision.ee.ethz.ch/2002.shtml
[14] Moore, R., Dowding, J., Bratt, H., Gawron, J. M., Gorfu, Y. and Cheyer, A., CommandTalk: A Spoken-Language Interface for Battlefield Simulations. In Proceedings of the Fifth Conference on Applied Natural Language Processing. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1997, 1-7.
[15] Moran, D. B., Cheyer, A. J., Julia, L. E., Martin, D. L. and Park, S., Multimodal User Interfaces in the Open Agent Architecture. In Proceedings of the 2nd International Conference on Intelligent User Interfaces. ACM Press, 1997, 61-68.
[16] Nigay, L. and Coutaz, J., A Generic Platform for Addressing the Multimodal Challenge. In Proceedings of ACM CHI'95. ACM Press, 1995, 98-105.
[17] Patomäki, S., Raisamo, R., Salo, J., Pasto, V. and Hippula, A., Experiences on haptic interfaces for visually impaired young children. In Proceedings of the Sixth International Conference on Multimodal Interfaces (ICMI'04). ACM Press, 2004, 281-288.
[18] Reachin Technologies AB. http://www.reachin.se/
[19] SensAble Technologies Inc. http://www.sensable.com/
[20] Sjöström, C., Designing haptic computer interfaces for blind people. In Proceedings of the Sixth International Symposium on Signal Processing and its Applications (ISSPA 2001). IEEE, 2001, 68-71.
[21] Sjöström, C., Non-visual haptic interaction design: Guidelines and applications. Doctoral dissertation, Certec, Lund Institute of Technology, 2002.
[22] SQLite database library. http://www.sqlite.org/
[23] Wood, J., Magennis, M., Arias, E., Gutierrez, T., Graupp, H. and Bergamasco, M., The Design and Evaluation of a Computer Game for the Blind in the GRAB Haptic Audio Virtual Environment. In Proceedings of Eurohaptics 2003. http://www.eurohaptics.vision.ee.ethz.ch/2003.shtml
[24] Xerces C++ library. http://xml.apache.org/xerces-c/
