collaboard: a remote collaboration groupware device ......collaboard: a remote collaboration...

4
CollaBoard: A Remote Collaboration Groupware Device Featuring an Embodiment-enriched Shared Workspace Martin Kuechler and Andreas M. Kunz ICVR Innovation Center Virtual Reality ETH Zurich Zurich, Switzerland {kuechler, kunz}@inspire.ethz.ch ABSTRACT In this paper we present a mixed presence groupware device called “CollaBoard”. The device improves collaboration between co-located and remote partners by providing a high level of workspace awareness. This is achieved by superimposing a life- size video showing the entire upper body of remote collaborators atop the displayed shared workspace. By doing so, the CollaBoard enriches the shared workspace with embodiments of remote collaborators. It shows pose, gaze and gestures of collaborators to their remote partners, and preserves the meaning of deictic gestures when pointing at displayed shared artifacts. The separate transmission of video and data allows the shared artifacts to remain editable at both conference sites. The functionality of two interconnected CollaBoard prototypes was verified in a public demonstration, a usability test, and a comparative user study. ACM Categories and Subject Descriptors H.5.3 [Information interfaces and presentation]: Group and Organization Interfaces – Computer-supported cooperative work (CSCW) H.4.3 [Information Systems Applications]: Communications applications – Computer conferencing, teleconferencing, and video-conferencing General Terms Design. 1. INTRODUCTION Remote collaboration by using video- and data-conferencing systems becomes increasingly popular. As a result, software clients that were initially designed for video-conferencing nowadays also allow the sharing of a common workspace between conferees for collaboration purposes (data-conferencing). However, working together is much more than just seeing each other and sharing a view of the workspace and the artifacts elaborated (images, drawings, slideshows, etc). In collaboration processes, being co-located or remote, the focus of attention is mainly on the shared artifacts (or more general, task-centered) [3], and collaborators use deictic gestures for referencing to shared artifacts. The importance of body language such as (deictic) gestures, pose, (partial/full/mutual) gaze, and facial expression of remote collaborators has already been identified earlier [4, 5, 6, 10, 16]. However, commercially available conferencing systems do not allow, e.g., a natural use of deictic gestures during combined video- and data-conferencing; by displaying video embodiments and the shared workspace in separate application windows, they break the spatial connection of deictic gestures to the artifacts in the shared workspace. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. GROUP’10, November 7-10, 2010, Sanibel Island, Florida, USA. Copyright 2010 ACM 978-1-4503-0387-3/10/11...$10.00. Figure 1. Remote collaboration using two interconnected CollaBoards.

Upload: others

Post on 20-Jul-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CollaBoard: A Remote Collaboration Groupware Device ......CollaBoard: A Remote Collaboration Groupware Device Featuring an Embodiment-enriched Shared Workspace Martin Kuechler and

CollaBoard: A Remote Collaboration Groupware Device Featuring an Embodiment-enriched Shared Workspace

Martin Kuechler and Andreas M. Kunz

ICVR Innovation Center Virtual Reality ETH Zurich

Zurich, Switzerland {kuechler, kunz}@inspire.ethz.ch

ABSTRACT In this paper we present a mixed presence groupware device called “CollaBoard”. The device improves collaboration between co-located and remote partners by providing a high level of workspace awareness. This is achieved by superimposing a life-size video showing the entire upper body of remote collaborators atop the displayed shared workspace. By doing so, the CollaBoard enriches the shared workspace with embodiments of remote collaborators. It shows pose, gaze and gestures of collaborators to their remote partners, and preserves the meaning of deictic gestures when pointing at displayed shared artifacts. The separate transmission of video and data allows the shared artifacts to remain editable at both conference sites. The functionality of two interconnected CollaBoard prototypes was verified in a public demonstration, a usability test, and a comparative user study.

ACM Categories and Subject Descriptors H.5.3 [Information interfaces and presentation]: Group and Organization Interfaces – Computer-supported cooperative work (CSCW)

H.4.3 [Information Systems Applications]: Communications applications – Computer conferencing, teleconferencing, and video-conferencing

General Terms Design.

1. INTRODUCTION Remote collaboration by using video- and data-conferencing systems becomes increasingly popular. As a result, software clients that were initially designed for video-conferencing nowadays also allow the sharing of a common workspace between conferees for collaboration purposes (data-conferencing). However, working together is much more than just seeing each other and sharing a view of the workspace and the artifacts elaborated (images, drawings, slideshows, etc). In collaboration processes, being co-located or remote, the focus of attention is mainly on the shared artifacts (or more general, task-centered) [3], and collaborators use deictic gestures for referencing to shared artifacts. The importance of body language such as (deictic) gestures, pose, (partial/full/mutual) gaze, and facial expression of remote collaborators has already been identified earlier [4, 5, 6, 10, 16]. However, commercially available conferencing systems do not allow, e.g., a natural use of deictic gestures during combined video- and data-conferencing; by displaying video embodiments and the shared workspace in separate application windows, they break the spatial connection of deictic gestures to the artifacts in the shared workspace.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. GROUP’10, November 7-10, 2010, Sanibel Island, Florida, USA. Copyright 2010 ACM 978-1-4503-0387-3/10/11...$10.00.

Figure 1. Remote collaboration using two interconnected CollaBoards.

Page 2: CollaBoard: A Remote Collaboration Groupware Device ......CollaBoard: A Remote Collaboration Groupware Device Featuring an Embodiment-enriched Shared Workspace Martin Kuechler and

Thus, the development and implementation of conferencing systems that enable rich use of body language in remote collaboration processes is subject of ongoing CSCW research. The paper in hand fits in this research and presents CollaBoard, a conferencing system that gives distant collaborators the sensation of being virtually co-located, and that allows collaborators the use of (deictic) gestures, partial gaze and full gaze (no mutual gaze), and pose for communication.

2. RELATED WORK Previous research in this particular field of CSCW yielded in systems that either attempt to create a virtual face-to-face situation or side-by-side situation among distant collaborators. Examples for systems which give the sensation of being face-to-face with remote collaborators are ClearBoard [4] and HoloPort [8]; those systems are not further discussed here. The presented CollaBoard enables a virtual side-by-side situation among distant collaborators. The CollaBoard resembles VideoArms [14, 15]. Both systems acquire people interacting on shared workspaces (large interactive displays) by means of a camera that is on-axis with the display device. Therefore, the context of hand gestures is preserved, e.g., deictic gestures pointing out a shared artifact can be correctly interpreted by the remote collaborator. However, VideoArms is limited due to the deployed segmentation algorithm. While the system transmits a video showing hands and arms, it fails to transmit the collaborator’s upper body which would mediate additional consequential and inconsequential communication such as pose and gaze that is not presented by the arms. Alternative segmentation algorithms are available [1], but entail time-consuming calculations, which lead to buckling live video embodiments.

E-Chalk enhanced with SIOX [2] provides a pleasing live video embodiment by using an elaborate segmentation technique that includes information from a depth-sensing camera. To mitigate occlusion, the developers of the system chose to display the live video embodiment in a slightly translucent way. Note, however, that E-Chalk was developed to support distant learning, and not remote collaboration sessions. As a consequence, the system provides only unidirectional live video embodiment (showing the teacher), and artifacts are not meant to be editable at both conferencing sites.

3. CONTRIBUTION To overcome the limitations of VideoArms and E-Chalk with SIOX, we designed and implemented the CollaBoard. It provides video- and data-conferencing with live videos showing the full upper bodies of remote collaborators. The artifacts in the shared workspace are editable for all collaborators. By displaying the video embodiments atop the shared workspace, the meaning of collaborators’ (deictic) gestures preserve their meaning relative to the shared artifacts.

3.1 Working Principle With the CollaBoard, video and artifacts are acquired and transmitted separately. The artifacts in the shared workspace (e.g., a presentation slide shown in Figure 3c) are transmitted as application data to remain editable at both conference sites. The collaborators are acquired by a video camera that is positioned as indicated in Figure 2. The acquired video stream is sent to the computer, processed, and transferred to the remote conference site. Video processing includes segmenting the user (foreground) from the background. The segmentation is necessary for subsequent video overlay atop the shared workspace at the remote conference site.

Note that the background of acquired video images consists of the shared workspace, therefore being highly inhomogeneous and also dynamic, which complicates the segmentation. To avoid this problem, we blank the shared workspace to the camera, while keeping it fully visible to the collaborators. A popular method to do so is to use polarized light and matching filters [17]. We use this method for our CollaBoard setup. Figure 3a shows a camera image acquired through the applied linear polarization filter.

For the segmentation, we use an algorithm based on the illumination invariant method [11]. Assuming a now static background, previously captured sequences of the blanked shared workspace without any user in front of it are compared to the actual image. This is done by using a statistic criterion that measures the colinearity of the actual color and expected background in the color space. Pixels identified as background are color-keyed by changing their color to green. The resulting image (Figure 3b) is compressed as part of the video stream, and transferred to the remote conferencing site.

Figure 2. Outline of the presented CollaBoard setup. Figure 3. (a) Image acquired by the camera, (b) segmentation output, (c) shared workspace with artifacts, (d) video overlay.

(a) (b)

(c) (d)

Page 3: CollaBoard: A Remote Collaboration Groupware Device ......CollaBoard: A Remote Collaboration Groupware Device Featuring an Embodiment-enriched Shared Workspace Martin Kuechler and

At the remote conferencing site, the received color-keyed live video stream is superimposed on the shared workspace. This is achieved by displaying the video in full-screen mode in front of the shared workspace and by rendering green pixels in the video as transparent. Note that the video overlay is transparent to interactive user input. The result is shown in Figure 3d.

3.2 Implementation For our research, we implemented two identical CollaBoard prototypes. Each prototype consists of a large display, an interaction module, a camera and a filter, a lighting bar, an audio system, and a computer with network access (see Figure 2).

3.2.1 Display The prototype’s display device is a 65” widescreen liquid crystal display (LCD) featuring a 1920x1080 pixel resolution. LCDs emit linearly polarized light, a feature which we use for blanking the displayed image to the camera while keeping it visible to the users’ eyes. The LCD was wall-mounted in landscape format. After the usability tests, the LCD’s housing was partially disassembled to bring the interaction module closer to the LCD’s electro-optical amplitude modulator (the display pane). By doing so, the interaction offset could be reduced from 16 mm to 8 mm, allowing much more precise touch and pen-based input.

3.2.2 Interaction Module On-screen interaction on the LCD is enabled by a commercially available interaction module that provides a 2000x2000 touch-point resolution. Touch events are registered by digital line scan cameras which constantly scan the interaction surface (a glass panel which protects the LCD’s display pane). The interaction module is mounted directly in front of the LCD; the module’s protective glass panel is replaced by one that preserves the polarization of transmitted light. A pen tray at the lower display border provides pens in four colors, as well as an eraser.

3.2.3 Camera and Polarization Filter We use a standard USB camera to acquire digital color video images. To prevent motion blur, the acquisition time is set to 17ms via the camera driver software. The resulting video features 30 fps and a resolution of 768 x 432 pixels. In front of the camera lens, a high-performance linear-polarization glass filter is mounted in such a way that any light emitted by the LCD is blocked. As a result, the camera “sees” only the user standing in front of the LCD, while the LCD’s display pane remains dark.

3.2.4 Lighting Bar The low transmittance (38%) of the deployed linear-polarization filter minimizes cross-talk, but also limits the brightness and contrast of the acquired images. However, increasing the camera’s gain would result in noisier images. Therefore, a lighting bar is used to illuminate the user(s) standing in front of the LCD. The lighting bar features fully dimmable high-frequency illumination; the high frequency is required to prevent interference visible in the acquired camera images. The brighter camera images also ease subsequent segmentation.

3.2.5 Audio For the microphone, an audio conferencing microphone pad is used. The device features echo cancellation and noise suppression.

The audio is processed by the computer; audio output is made with a common active-loudspeaker stereo system.

3.2.6 Computer Each CollaBoard prototype is operated by a computer running Microsoft Windows 7. The computers are connected to a gigabit ethernet.

3.2.7 Conferencing Software For the audio link between the CollaBoards, Skype [13] audio-conferencing software is used. The video connection is established by running a customized version of the open-source software ConferenceXP [12]. Software customization includes integration of self-programmed modules for video processing capabilities as lens distortion correction, segmentation, and video overlay.

To generate the shared workspace, we first used a customized version of the ConferenceXP Presentation module. However, a usability test showed that the synchronization of the connected shared workspaces was too slow. We therefore developed a new digital whiteboard application called “CollaWhiteBoard” [9]. The shared workspace follows the strict what-you-see-is-what-I-see (WYSISIS) principle: every visible artifact on every display is shared among all collaborators, and all visualizations of the shared workspace look exactly the same and are constantly synchronized. Created artifacts can be edited from both conference sites. The application detects picking a certain pen or the eraser from the interaction module’s pen tray, and changes color or switches to erase mode accordingly. The application synchronizes connected shared workspaces at a high frequency. This is essential for a natural drawing sensation at the remote conference site, since drawing activities must be continuously updated to match with drawing actions visible in the live video, and not only updated on pen-release.

Figure 1 shows a remote collaboration session using the CollaBoard prototypes and the conferencing software.

4. EVALUATION We assessed the usability and performance of the CollaBoard by means of a public demonstration, a usability test, and a comparative user study.

4.1 Public Demonstration The aim of the public demonstration was two-fold. On one hand, the demonstration was to proof the CollaBoards’ ability to create a virtual side-by-side situation between distant collaborators by acquiring, transmitting, and naturally displaying pose, partial and full gaze awareness, and remote gestures. On the other hand, the demonstration was a good opportunity to get feedback from potential users in industry. During the demonstration, the CollaBoard prototypes attracted wide interest; they were said to be a “practical solution of a video and data conferencing system” and they “could easily make the transition to a real-world application”. Nevertheless, the results from the segmentation software did not completely satisfy all demo users.

4.2 Usability Test The usability test was conducted to get feedback from test users in a controlled setting; Kuechler [7] describes the conducted usability test in detail. In general, the feedback was very positive. To identify potential system improvements, our feedback

Page 4: CollaBoard: A Remote Collaboration Groupware Device ......CollaBoard: A Remote Collaboration Groupware Device Featuring an Embodiment-enriched Shared Workspace Martin Kuechler and

questionnaire explicitly encouraged test users to state what disturbed them when using the CollaBoards. The received complaints were grouped into eight categories, i.e., too much delay of the shared whiteboard software, workspace not visible due to inappropriate video overlay, imprecise input due to interaction offset, poor audio quality, missing simultaneous interaction, poor resolution of the superimposed live video embodiment, bad segmentation of the video, and other complaints.

Based on these complaints, we developed the CollaWhiteBoard software (see Subsection 3.2.7), reduced the interaction offset, improved the audio quality, and enabled simultaneous interaction on the shared workspace. All this work resulted in an improved version of the CollaBoard prototypes.

4.3 Performance These improved prototypes were used for a user study that aimed at comparing the task performance enabled by the CollaBoard system with classic conditions like co-located collaboration and remote collaboration using a traditional video conferencing setup. For a detailed description of the conducted user study and findings, see Kunz et al. [9].

5. FUTURE WORK Future work on the CollaBoard should first address the problem of potentially not seeing well the workspace. One of our ideas is to enhance the video segmentation algorithm in such a way that the live video is analyzed to decide whether it is informative (e.g., contains gestures or upper bodies in realistic size) and thus should be displayed, or whether it would only decrease workspace visibility and thus should be discarded. Second, the segmentation should be further improved. The actual implementation using linearly polarized light limits the quality (brightness and contrast) of the live video embodiments. Alternative implementations [1, 2] are promising; other possible approaches like segmenting based on a background mask – generated with infrared background illumination and an additional infrared camera – have not yet been investigated. Generally, combined approaches are promising since they tend to be more robust to noise.

6. CONCLUSIONS In this paper, we presented CollaBoard, a mixed presence groupware device that features an embodiment-enriched shared workspace for remote collaboration. The embodiment is an online-processed live video stream that shows the remote conferees in life size; the video is superimposed on the shared workspace. The CollaBoard is capable of displaying pose, gaze and gestures of remote collaborators. Deictic gestures keep their meaning for remote conference participants, and the artifacts in the shared workspace remain editable for all conferees. An evaluation confirms the principle functionality and performance of the CollaBoard.

7. ACKNOWLEDGMENTS This research work was partially funded by CTI, the Swiss Innovation Promotion Agency (grant no. 9017.1 PFES-ES). We would like to thank Konrad Wegener from inspire AG, Urs Steiner from AVS Group AG, and David Martin from SMART Technologies, Inc. for their financial support in providing expensive hardware. We would also like to thank all people who contributed to the CollaBoard project.

8. REFERENCES [1] Coldefy, F., Louis-dit-Picard, S. Remote Gesture

Visualization for Efficient Distant Collaboration Using Collocated Shared Interfaces. Proc. IASTED HCI 2007, ACTA Press (2007), 37-42.

[2] Friedland, G. Adaptive Audio and Video Processing for Electronic Chalkboard Lectures. Dissertation, Department of Computer Science, Freie Universität Berlin. 2006.

[3] Gaver, W. W., Sellen, A. J., Heath, C., Luff, P. One is Not Enough: Multiple Views in a Media Space. Proc. CHI 1993, ACM Press (1993), 335-341.

[4] Ishii, H., Kobayashi, M. ClearBoard: A Seamless Medium for Shared Drawing and Conversation with Eye Contact. Proc. CHI 1992, ACM Press (1992), 525-532.

[5] Kirk, D., Rodden, T., Stanton Fraser, D. Turn It This Way: Grounding Collaborative Action with Remote Gestures. Proc. CHI 2007, ACM Press (2007), 1039-1048.

[6] Kirk, D., Stanton Fraser, D. Comparing Remote Gesture Technologies for Supporting Collaborative Physical Tasks. Proc. CHI 2006, ACM Press (2006), 1191-1200.

[7] Kuechler, M. Groupware Devices That Support Gaze Awareness and Workspace Awareness to Improve Remote Collaboration. Dissertation, Department of Mechanical and Process Engineering, ETH Zurich. Verlag Dr. Hut, 2010.

[8] Kuechler, M., Kunz, A. M. HoloPort – A Device for Simultaneous Video and Data Conferencing Featuring Gaze Awareness. Proc. IEEE VR 2006, IEEE Computer Society (2006), 81-87.

[9] Kunz, A. M., Nescher, T., Kuechler, M. CollaBoard: A Novel Interactive Electronic Whiteboard for Remote Collaboration with People on Content. Proc. Cyberworlds 2010, IEEE Computer Society (2010).

[10] Louwerse, M. M., Bangerter, A. Focusing Attention with Deictic Gestures and Linguistic Expressions. Proc. CogSci 2005, Lawrence Erlbaum Associates, Inc. (2005), 1331-1336.

[11] Mester, R., Aach, T., Dümbgen, L. Illumination-Invariant Change Detection Using a Statistical Colineary Criterion. Proc. DAGM 2001, Springer (2001), 170-177.

[12] Microsoft Research. ConferenceXP Project. URL http://research.microsoft.com/en-us/projects/conferencexp, accessed August 20, 2010.

[13] Skype, Ltd. Skype. URL http://www.skype.com, accessed August 20, 2010.

[14] Tang, A., Neustaedter, C., Greenberg, S. VideoArms: Embodiments for Mixed Presence Groupware. Proc. BCS HCI 2006, Springer (2006), 85-102.

[15] Tang, A., Neustaedter, C., Greenberg, S. VideoArms: Supporting Remote Embodiment in Groupware. Video Proc. CSCW 2004, ACM Press (2004), Video.

[16] Tang, J. C. Findings from Observational Studies of Collaborative Work. Int. J. Man-Machine Studies 34:2 (1991), 143-160.

[17] Tang, J. C., Minneman, S. L. VideoDraw: A Video Interface for Collaborative Drawing. Proc. CHI 1990, ACM Press (1990), 313-320.