

Gesture Control

Table of Contents

Introduction

LoudSpeaker Revolution

Gesture in the context of Musical Performance

Human Movement

Man-Machine Interaction

Paper Construction

Chapter I Background

Applications in the Past:

Applications in the Present:

Chapter II What Kind of Gesture

Spatial Relationships

Body Movement

Chapter III What Kind of Sensors

Different sensors for measuring body movement

Resistors

Gyroscopes

Location Tracking

Interconnections

System Latency

Chapter IV Design Concepts

Performance Environments

Instrument Design


Chapter V System Design

Construction

Analog-to-Digital conversion

802.11b

PC-104

Chapter VI Software

MAX/MSP

Chapter VII Mapping and the Aesthetic Consideration

My Considerations, How I chose to manipulate data.

Chapter VIII Dance and Music

Chapter IX The Performance

Lexicon, Lighting, Eventide, Spatialization

Engine27

Diagram of the Performance

Chapter X

Perception

Conclusion Future Applications

Wearable computing, augmented reality

Conclusion

Appendices

Resources


Gesture Control

Introduction

Why use gesture to control sound? It is necessary to answer this question by first

defining gesture. Webster’s definition of gesture:

2 : a movement usually of the body or limbs that expresses or emphasizes an idea,

sentiment, or attitude

3 : the use of motions of the limbs or body as a means of expression

This definition holds true in the context of musical performance. The gestures

associated with musical performance have the ability to heighten a musical experience.

However, the gestures associated with musical performance have been removed from the

listening experience. It is necessary to reintroduce the gestures associated with music

performance. It is possible to deconstruct the gestures of a musical performance and re-

associate the gesture with various sound control parameters. This can be accomplished

with the introduction of technology. Before we discuss how technology can be implemented to map gesture to sound control parameters, it is best to first look at the initial cause of the removal of gesture from the musical experience.

LoudSpeaker Revolution

Elliott Schwartz states that the "Loudspeaker Revolution" transformed the nature of the musical experience in profound ways: "Today we may consider this revolution part of a larger phenomenon – evidenced by the increased use of amplification and electric instruments in live performance, the explosive growth of radio and television, and such


recent developments as the compact disc, videocassette recorder, and personal stereo.”

(Schwartz, pg. 26) The “Loudspeaker Revolution” has taken our music listening

experience into the home, the car, and has given us a means to introduce music into any

environment. This shift in environment has disassociated the performer from the

performance. In the past, performances given by musicians provided a visual stimulus to

the listener. The listener’s perspective has shifted in proportion to the environmental

shift. There is no longer a connection between the visual stimulus given by a performer in

a live performance and the listening experience given in our current environment. The

listener’s shift in perspective has led to a disassociation with the visual gesture of music

making.

Gesture in the context of Musical Performance

To understand the visual stimulus provided in a musical performance, it is necessary

to first focus on the interaction between the musical performer and the musical

instrument. Musical instruments transform the actions of one or more performers into

sound. An acoustic instrument consists of an excitation source under the control of the

performer(s) and a resonating system that couples the vibrations of the excitation to the

surrounding atmosphere; the resonating system, in turn, shapes the precise patterns of vibration. In most

acoustic instruments (apart from organs and other keyboard instruments) the separation

between the input and the excitation subsystems is unclear. This separation has been

further increased by the introduction of synthesis and MIDI.


“The separation between gestural controllers and sound generators as standardized by

MIDI has led to the creation and development of many new alternative controllers with

which to explore new creative possibilities." (e.g., Chadabe 1996; Wanderley and Battier

2000). MIDI was implemented as an interconnection between various digital

synthesizers. To further the connection between performer and instrument, Richard

Moore, a computer-music researcher at the University of California at San Diego, has

identified compatibility between performer and instruments as “Control Intimacy”:

“Control intimacy determines the match between the variety of musically desirable

sounds produced and the psycho-physiological capabilities of a practiced performer.” The

match between the variety of desirable sounds and the capabilities of the performer may

be determined differently depending on the terms used to represent the performance

process. “The fact that musical performance can be represented in different ways must be

taken into account when modeling musical performance. Reference to an auditory

process such as “hearing” for instance, implies a different representation action than does

reference to the same process as “listening.” Similarly the terms “moving” and

“gesturing” reflect different representations of the performance process.” (Mulder, LMJ,

1996). Let us broaden our focus and look at gesture in areas other than musical

performance. This will allow us to later investigate the gestures produced by a dancer.

Human Movement

Gesturing can take other forms such as gesticulation and signing, among others. All

forms of gesturing can involve tactile feedback, but very often do not. Similarly, they

may involve target-oriented movements, but very often do not. Adam Kendon, who


conducts research into nonverbal communication, defines gesticulations as “idiosyncratic

spontaneous movements of the hands and arms during speech” and signing “as the use of

a set of gestures and postures for a full-fledged linguistic communication system." Kendon suggests that different forms of gesturing can be ordered according to their use of

linguistic forms. Gesticulation, language-like gestures, pantomime, emblems and signing

are different forms of gesturing that form a continuum, which proceeds from

gesticulation (least structured as a language) to signing (most structured as a language).

"The presence of such a continuum indicates that in human communications, gestures and

speech are part of a single system of channeling more or less linguistic meanings.”

(Mulder, 1996) Let us pull our focus back even further and look at human movement in general, so that we can further our understanding of gesture as a control.

Some of the key concepts used in the exploration of human-motion intention are

taken from Rudolf Laban (1963). In his theory of effort, he notes the dynamic nature of

movement and the relationships between movement, space, and time. Laban’s approach

is an attempt to describe, in a formalized way, the characteristics of human movement

without focusing on a particular kind of movement or dance expression. Effort-theory

principles can be applied to dance and to everyday work practices. At the center of

Laban’s theory is the concept of effort, a property of movement. From an engineering

point of view, we can represent it as a vector of parameters that identifies the quality of

a movement performance.


The most important concept is a description of the quality of movement. Laban's

theory of effort is not concerned with degrees of joint rotation or movement directly; rather, it considers movement as a communication medium and tries to extract parameters related to its expressive power. During a movement performance, the vector describing the motion quality varies in effort space. Laban studied the possible paths followed by this vector and the intentions they can express. Therefore, variations of effort during the movement

performance should be studied. This movement performance is what we will begin to

look at now.

“There is a fundamental connection between human movement and music.

Gesture interfaces reintroduce in music a concern for physicality. Composition and

execution, while remaining clearly two different activities, share characteristics of

performance. Execution can influence structure and form because of the inherent

structural character of movement and space.” (Chabot, 1989) This reintroduction of

physicality can only be made with an introduction of technology.

Man-Machine Interaction

With the introduction of technology into a musical performance it behooves one

to consider the interaction between man and machine. The interaction between man and

machine is important to our discussion because the gestures of man will be controlling the machine. Johannes Goebel states the following about "man-machine interaction": "Quite obviously, people who use the term 'man-machine interaction' do not use it in a way that would imply that man reacts to a machine or that two machines react to each other.


'Man-machine interaction' wants to go beyond the known manipulations taking place between man and machine; it aims for the realm of mutual, reciprocal action between two partners who act on the ground of equal potential. It may not be a misinterpretation if I assume that the implied aim is to engineer the quantitative states of a machine in such a way that in the ultimate state men can act in their environment of natural languages, that they can act out of their trains of thought which they will not have to formalize but which a machine will comprehend by adapting itself to its own world of executable orders.

'To react' has already undergone an extension from the animate to the inanimate: "1)

to act in turn or reciprocally 2) to act in opposition 3) to respond to a stimulus 4) to act

with another substance in producing a chemical change.” “Man-machine reaction”

instead of man-machine interaction? “Man-machine reaction” describes the current

situation more accurately, too accurately – and that is a problem. It could be understood

in pejorative ways: man only reacts to the stimuli produced by a machine and the

machine executes instructions the result of which will need man’s interpretation to be

another reason for his or her reaction which again will make the machine move or not

move. A machine will never know if it reacts or if it interacts, a program will never know

if it is capable of learning. But human beings feel much better if they think of their

working-environment as being interactive instead of reactive. “Man-machine action”

implies the traditional hierarchy between man and tool. 'Man-machine reaction' describes a hierarchy where man reacts upon stimuli presented by a machine (and where the production of such stimuli is a result of man's labor). For obvious reasons we do not like to be put under such conditions, even though and because a large part of western


civilization is based upon such working conditions, even though the production of

machines considered potentially interactive is based upon such conditions." (Goebel, ICMC, 1998) Under the assumption that Goebel is correct about "man-machine interaction," it appears that the best interaction between man and machine lies in the environment the two exist within. Interaction between gesture and control based on an environment-driven, rather than a performance-driven, approach will lead to a more visually and aurally aesthetic end. Now that we have discussed the breakdown of gesture as

related to musical performance, and have started to investigate the nature of “man-

machine interaction”, I would like to discuss how the rest of this paper will be

constructed.

Paper Construction

This paper will be constructed as follows:

In Chapter I, I will briefly discuss the history of gesture controllers, moving from the Theremin up to the present. In Chapter II, I will investigate various forms of

gestures. After discussing the various forms gesture can take, I will briefly describe, in

Chapter III, what kinds of sensors or systems have been used to capture these various

gestures. Chapter IV will be a discussion of various design concepts. My own designs

will be the topic of Chapter V, specifically what kind of gestures dancers can generate

and the challenges involved in capturing their gesture. Chapter VI will discuss different

types of software that can be used to manipulate the data coming in from these sensors.

Chapter VII will discuss the aesthetic decision that can be made in making the connection

between gesture and sound manipulation. Chapter VIII will discuss the connection

between dance and music. Then, in Chapter IX, I will talk about a performance that I


staged, which is a culmination of all the ideas that I will be discussing in this paper.

Following this discussion, Chapter X will briefly discuss the connection between visual

and aural stimulus and how this affects our perception. Finally, I will discuss directions for future applications of my gesture capturing system, and make my

conclusion.

Chapter I Background

Applications in the Past:

Gesture control has been used for all types of applications since the beginning of the 20th Century. In 1920 a Russian physicist named Lev Termen (his name was later

changed to Leon Theremin) was engaged in designing an electronic burglar alarm, in an

early practical application of electrical technology. Theremin’s alarm was designed to

respond to changes in electrical capacitance caused by the approach of a foreign body

(Darreg, Experimental Instruments, 1991). The Theremin is unique in that it is played

without being touched. Two antennas protrude from the Theremin - one controlling pitch,

and the other controlling volume. As a hand approaches the vertical antenna, the pitch

gets higher. Approaching the horizontal antenna makes the volume softer. A musical instrument played not by touch but by gesture alone was historically groundbreaking, and it opened the door to a myriad of other innovations and experiments with other types of musical instruments that could be played with gesture and movement.

In one such experiment, optical light sensors were used in John Cage's Variations V. Variations V was first composed and performed


in 1965 and subsequently toured through Europe. A classic example of electronic

Gesamtkunstwerk, the work was a collaboration between Cage, Merce Cunningham,

musicians Gordon Mumma and David Tudor, Bell Labs engineers Billy Klüver and Max

Mathews (optical light triggers), visual artists Robert Rauschenberg, Nam June Paik and

Jasper Johns (projections) and the filmmaker Stan Vanderbeek. Derived from the

happening, in which the line between art and life is dissolved, the work is a surreal

juxtaposition of “real-life” movement, objects and images, in non-synchronous

combination using chance technique. Source material is taken from home movies and TV

sitcoms. Music is generated from taped sounds and live electronic synthesizers. The

movement of the dancers triggers sound and projections as a result of the optical light

sensors.

Another important innovation in the field of gesture control came from Professor Max Mathews, a distinguished computer music pioneer and an early developer of digital sound synthesis, who is perhaps best known for having co-authored "The Technology of Computer Music" in 1969, a book that practically defined the entire field of computer music.

He is also known for his early work in gestural input for human-computer interaction,

which he did during his long career at Bell Telephone Laboratories. During the 1960s, he

and L. Rosler developed a light-pen interface, which allowed users to trace their musical

intentions on a display screen and see its graphical result before processing it.

Professor Mathews also developed the "GROOVE" system with F. R. Moore at

Bell Labs in 1970. GROOVE was an early hybrid configuration of a computer, organ


keyboard, and analog synthesizer, with a number of input devices including joysticks,

knobs, and toggle switches. The development of this system was important to the greater

field of computer music later on, because it was with this system that Professor Mathews

determined that human performance gestures could be roughly approximated as functions

of motion over time at a sampling rate of 200 hertz. This became the basis for the

adoption of the MIDI (Musical Instrument Digital Interface) standard for computer music

data transmission, by which eight 14-bit values can be transmitted every five

milliseconds. The various applications mentioned above have spawned a whole field of research, which brings us to the applications taking place in the present.

Applications in the Present:

There are too many examples of these new gestural instruments to mention them

all. I will focus mainly on systems that produce music from controlled body movements,

and dance. These systems differ basically in the type of sensors used and in their musical capabilities. For more examples, IRCAM's e-book "Trends in Gestural Control of Music" presents an overview of gesture and motion sensing for music making. I will give two examples of suits that can be worn by a dancer and used to control sound, as this will be the focus of our discussion through the rest of this paper. I will also briefly mention "The Hands," developed by Michel Waisvisz.

Mark Coniglio has built a costume (the "MidiDancer") that hides eight sensors

placed in crucial points, such as elbows and knees. Through the sensors, the movements


of the body are detected by a tiny computer that codifies the signal in such a way that it

can be sent, by means of a radio transmitter, to a receiver/decoder plugged into the computer.

Another such device is Digital Dance, a research project conducted at DIEM

involving hardware and software development as well as the creation of a rule-based

compositional system. The project sprang from a vision that a dancer’s body movements

could control musical processes during a performance. The goal of the project was to

create a computer music composition that allows a dancer to directly influence musical

processes in a meaningful way. In November 1995 Jens Jacobsen, an electrical engineer,

began working at DIEM in collaboration with DIEM’s director, Wayne Siegel, to

research existing interactive dance interfaces and develop a new interface. Advantages

and disadvantages of several systems were considered and a new interface system was

designed in collaboration with the Aarhus School of Engineering. After years of

development and refinement the DIEM Digital Dance System version 2.0 is now being

made commercially available on a limited basis.

A different type of device uses the gesture of just the hands of a performer.

Michel Waisvisz has created a system appropriately called "The Hands." The

instrument consists of a number of sensors and keys mounted on two small keyboards

that are attached to the player’s hands. The combination of many different sensors to

capture the movements of the hands, the fingers, and the arms is still unique and makes "The Hands" one of the most refined and musical MIDI controllers. Waisvisz states

that a growing number of researchers/composers/performers work with gestural controllers but, to his astonishment, he hardly sees a consistent development of systematic thought on the interpretation of gesture into music, or of the notion of musical feedback into gesture.

One can analyze and create distinct relationships between the character changes

of a gesture and the change of musical content - and context - in such a way that one's musical

intentions are clearly grasped by listeners. It is possible to have the freedom to create any

gestural relationship with a vast area of sound. Before we discuss these relationships, it is

necessary to look at the gestures to be captured and then look at how to capture these

gestures.

Chapter II What Kind of Gesture

Different gestures lend themselves to different applications. The movement of the

body can be measured in many ways. There are hundreds of sensors to capture the

movement of the body. Many researchers/composers/performers have built different

systems with various sensors and put a dancer in control of the music. It is necessary to

decide what type of gesture you want to capture so that you can determine what type of

sensor to use. “The category of gesture that has to be captured will determine the kind of

sensor to use.” (Depalle et al. 1997) “Creating a new instrument that responds to gesture

in a single fixed way might result in truly new music, but this originality could not be

sustained indefinitely. The novelty would fade as the instrumental technique became

standardized." (Rubine, CMJ, 1990) So, as Rubine suggests, it is necessary to create a system which is re-configurable and modular. This discussion will focus on the interaction


between dancing and music, and will look specifically at the different types of gestures

that can be produced by a dancer.

Spatial Relationships

The spatial relationship between multiple dancers is one of the most important

gestures of dance. Spatial sensors detect location in space by proximity to hardware sensors

or by location within a projected grid. There is a connection that can be made between

the spatial relationships of a dance performance and the spatial metaphors that play an

important role in characterizing musical material. The concept is that sounds that are

similar to each other are placed proximate to each other in the spatial representation. A

variety of spatial representations that go beyond the one-dimensional layout of a

traditional keyboard have been proposed for pitch. The spatial metaphor that maps

perceptual similarity to spatial proximity in the control interface can be generalized to

more complex musical processes and structures, such as those involved in the generation

of melody, rhythm and complex textures. One notable example is the musical texture

space provided by David Huron (Huron 1989, 2001). Georg Hajdu (1994) has

demonstrated the use of Huron’s texture space as a control structure. It should be noted

that “meaningful spatial control configurations can be produced without the

multidimensional scaling of tediously collected similarity judgments for all pairs of

sounds in the set under consideration.” (Wessel, 2002) The discussion will return to this

topic of spatial relationships.


Body Movement

The dancer’s orientation to the audience, as well as the dancer’s orientation to the

floor can be characterized as another important gesture in dance. Many important

gestures also happen in the legs, arms, head, and torso. This is where most of the

attention of capturing-system designers is focused. On-body sensors measure the angle, force, or position of various parts of the body. For example, piezoresistive and piezoelectric technologies are very common for tracking small body parts such as fingers. We will now

discuss what types of sensors can be used to capture the gestures mentioned above.

Chapter III What Kind of Sensors

Different sensors for measuring body movement

The discussion will now focus on what types of sensors to use for the different

gestures that can be created by a dancer. Human gesture has to be noise filtered,

calibrated, and quantized in order to have a normalized representation. Such data can then

be analyzed in order to extract pertinent information. The calibration process does not necessarily lead to data reduction with an implicit impoverishment of the gestural information, but helps to get homogeneous information from heterogeneous input channels. Studies of behavior from a kinetic approach point out that human beings can hardly reproduce their gestures with precision. Furthermore, in dealing with gesture it is hard to separate the sign of the gesture from the signal it conveys and then to extract its significance.

There is no general formalization of movement that may help us to recognize its

meaning. As detailed by Cadoz (1994), gesture allows humans to interact with their environment to modify it, to get information, and to communicate: the ergotic, epistemic, and semiotic functions associated with the gestural channel of communication (Sapir).

Resistors

To capture these different gestures, it is necessary to use different types of

sensors. A very common sensor used in almost every system is the flex sensor. A flex sensor is a variable resistive strip that changes its output voltage when bent. These

flex sensors are generally placed on the joints of the body to measure the bend angle of

the particular body part you want to capture.

Hardware Example 1.) is a flexible resistive sensor. When this sensor is bent it changes

the voltage that is output. This change in voltage increases the more the sensor is bent.

Hardware Example 1.
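
To make the principle concrete, the following is a minimal sketch, in Python, of how a flex-sensor reading might be converted into a bend value. The circuit values (5 V supply, 10 kOhm fixed resistor, 10-bit converter, 10-20 kOhm sensor range) are assumptions for illustration, not the components actually used in this project.

# Sketch: converting a flex-sensor ADC reading to an approximate bend value.
# Assumed circuit: 5 V supply, 10 kOhm fixed resistor, flex sensor as the
# other half of a voltage divider, 10-bit ADC. All values are illustrative.

V_SUPPLY = 5.0                          # assumed supply voltage
R_FIXED = 10_000.0                      # assumed fixed divider resistor, ohms
ADC_MAX = 1023                          # assumed 10-bit converter
R_FLAT, R_BENT = 10_000.0, 20_000.0     # assumed flat / fully-bent resistance

def adc_to_bend(adc_value):
    """Return a normalized bend value: 0.0 (flat) to 1.0 (fully bent)."""
    v_out = min(adc_value, ADC_MAX - 1) * V_SUPPLY / ADC_MAX   # avoid divide-by-zero
    r_sensor = R_FIXED * v_out / (V_SUPPLY - v_out)            # voltage-divider equation
    bend = (r_sensor - R_FLAT) / (R_BENT - R_FLAT)
    return max(0.0, min(1.0, bend))

print(adc_to_bend(512))   # mid-scale reading corresponds to a nearly flat sensor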

Gyroscopes

To measure orientation it is necessary to use some type of gyroscopic sensor; this sensor will generally output its rate of rotation in degrees per second. These sensors can be placed about the body to measure any rotational joint, such as the hands, head, or legs, or directly on the torso to measure the orientation of the dancer within the room.


Hardware Example 2.) is a ceramic angular velocity sensor.

Hardware Example 2. Murata GyroStar, ceramic angular velocity sensor. This particular

sensor outputs a change in voltage that corresponds to the speed of rotation.
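
Because an angular velocity sensor such as the GyroStar reports the speed of rotation rather than an absolute angle, orientation has to be estimated by integrating successive readings over time. A minimal sketch of that integration follows; the sampling rate and zero-rate offset are illustrative assumptions, not measurements from the sensor used here.

# Sketch: estimating orientation from an angular-velocity sensor by
# integrating its readings over time. The sample rate and offset are
# illustrative values only.

SAMPLE_RATE_HZ = 200.0        # assumed readings per second
ZERO_RATE_OFFSET = 0.0        # assumed deg/s reported when at rest

def integrate_orientation(rate_readings_deg_per_s, start_angle=0.0):
    """Accumulate angular velocity into an angle in degrees (0-360)."""
    angle = start_angle
    dt = 1.0 / SAMPLE_RATE_HZ
    for rate in rate_readings_deg_per_s:
        angle += (rate - ZERO_RATE_OFFSET) * dt
        angle %= 360.0        # wrap around a full rotation
    return angle

# Example: a steady 90 deg/s turn held for one second ends up at 90 degrees.
print(integrate_orientation([90.0] * int(SAMPLE_RATE_HZ)))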

Location Tracking

To measure spatial relationships or track location, it is necessary to have some

sort of fixed transmitter and a movable receiver. Many systems use ultrasound in a sonar-type fashion, whereby an output pulse is sent to the receiver and the travel time of the pulse is used to measure the dancer's distance. In most cases it is necessary to have multiple transmitters, which allows one to triangulate the position within a room.

Other systems use sensors external to the body such as cameras. These systems detect

changes from continuous frames (e.g. Rokeby’s Very Nervous System) or changes

between the current frame and a frame of reference (e.g., STEIM’s BigEye).

Hardware Examples 3. and 4.) show the cricket location tracking pair. Example 3. is the

Beacon/Transmitter of the pair. And Example 4. is the Listener/Receiver of the pair.


Hardware Example 3. Hardware Example 4.

“The Cricket” location tracking system is a very effective way to capture location

within a room. This device is defined as a location-support system for in-building,

mobile, location-dependent applications. It allows applications running on mobile and

static nodes to learn their physical location by using listeners that hear and analyze

information from beacons spread throughout the building. Cricket is the result of several

design goals, including user privacy, decentralized administration, network heterogeneity,

and low cost. Rather than explicitly tracking user location, Cricket helps devices learn

where they are and lets them decide whom to advertise this information to; it does not

rely on any centralized management or control and there is no explicit coordination

between beacons; it provides information to devices regardless of their type of network

connectivity; and each Cricket device is made from off-the-shelf components and costs

less than U.S. $10. “The Cricket” implements randomized algorithms, used by the

beacons to transmit information. Concurrent radio and ultrasonic signals are used to infer distance. The listener uses inference algorithms to overcome multi-path and

interference. The system is very practical; the beacon configuration and positioning techniques help improve accuracy. My experience with "The Cricket" has included several location-dependent applications, such as in-building active maps and device control; these applications can be developed with little effort or manual configuration.
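
The ranging idea behind such a system can be sketched briefly: the radio pulse arrives essentially instantly while the ultrasonic pulse travels at the speed of sound, so the gap between the two arrival times is proportional to the distance from the beacon, and distances to several beacons at known positions constrain the listener's location. The simplified two-dimensional sketch below illustrates this; the actual Cricket inference algorithms are considerably more sophisticated, and the numbers are invented for the example.

# Sketch: Cricket-style ranging and 2-D position estimation (simplified).
# The radio signal is treated as arriving instantly; the ultrasound pulse
# travels at roughly 343 m/s, so the arrival-time gap gives the distance.

import math

SPEED_OF_SOUND = 343.0  # metres per second, roughly, at room temperature

def distance_from_gap(gap_seconds):
    """Distance to a beacon from the RF/ultrasound arrival-time gap."""
    return SPEED_OF_SOUND * gap_seconds

def estimate_position(beacons, distances, step=0.01, span=5.0):
    """Brute-force search for the (x, y) that best matches the measured
    distances to beacons at known positions. Good enough for a sketch."""
    best, best_err = (0.0, 0.0), float("inf")
    steps = int(span / step)
    for ix in range(steps):
        for iy in range(steps):
            x, y = ix * step, iy * step
            err = sum((math.hypot(x - bx, y - by) - d) ** 2
                      for (bx, by), d in zip(beacons, distances))
            if err < best_err:
                best, best_err = (x, y), err
    return best

beacons = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]            # known positions, metres
distances = [distance_from_gap(g) for g in (0.00825, 0.00825, 0.00825)]
print(estimate_position(beacons, distances))              # roughly (2.0, 2.0)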

Interconnections

After having defined what type of sensors are placed where on the body of the

dancer, one can then translate these measurements into meaningful values. These values

can then be sent to sound control parameters. It is important for all the components to be

accurate and well scaled. Most of the sensors mentioned above are analog and output

changes in voltage. Therefore, it is necessary to have an Analog-to-Digital converter

(ADC) to use the data provided by the sensors with a computer. For use within a dance

performance it is necessary for the interconnection between the output of the ADC and

the computer to be wireless. This wireless connection has a few components. One

component is power, the other data transmission. To power the system it is necessary to

use batteries. Transmitting the data has historically been achieved with RF, using an RF transmitter-receiver pair. The receiver is then connected to a computer. Once in the computer, the various sensor values can be mapped to various sound control parameters.

System Latency

One of the major roadblocks in developing a gesture capturing system is latency.

Latency is the elapsed time between a stimulus and the response. Some kinds of latency


are relatively easy to measure: for example, the time between a computer's receipt of a

control message and the beginning of the audio event generated in response.

At this point, I will diverge into the territory of the Linux operating system. I will

explain how Linux can be modified to achieve latency results up to roughly ten times better than an unmodified kernel. What is real-time? I think the IEEE definition

(supplied by Douglass Locke) is a good one, and a clear one. Namely: "a real-time

system is a system whose correctness includes its response time as well as its functional

correctness." The system doesn't just run software to plough through the process; it needs

to do it in a timely manner. So, real-time systems care about timeliness, but just how

timely isn't clear until you add a modifier to "real-time". In other words, real-time is very

much a gray region. "Response to what?" To "events", which are either external events,

communicated to the system via its input/output interfaces (which include things like

Ethernet as well as audio, keyboard, or mouse input), or to timer-triggered events within

the system itself.

I will discuss what modifications I made to the Linux kernel, to run in real-time

on the wearable computer that I developed for my performance. A Linux 2.4.17 kernel

patched with a combination of both preemption and low-latency patches yielded a

maximum scheduler latency of 1.2 milliseconds, a slight improvement over the low-

latency kernel. However, running the low-latency patched kernel for greater than twelve

hours showed that there are still problem cases lurking, with a maximum latency value of

215.2ms recorded. Running the combined patch kernel for more than twelve hours


showed a maximum latency value of 1.5ms. This data seems to indicate that the best

solution for reducing Linux scheduler latency is to combine both sets of patches. Latency

is really a shorthand term for the phrase latent period, which is defined by webster.com to

be "the interval between stimulus and response". In the context of the Linux kernel,

scheduler latency is the time between a wakeup (the stimulus) signaling that an event has

occurred and the kernel scheduler getting an opportunity to schedule the thread that is

waiting for the wakeup to occur (the response). Wakeups can be caused by hardware

interrupts, or by other threads. Large scheduler latencies have numerous causes.

One culprit is drivers that do lots of processing in their entry points or interrupt

service routine (ISR). Another is kernel code that stays in a block of code for a long

period of time, without explicitly introducing scheduling opportunities. Both of these

things cause one problem: the kernel does not have the opportunity to perform scheduling

calculations for a long period of time. Since the scheduler is the mechanism for

determining what process/thread should run, it needs to be run with a relatively high

frequency for the kernel to be responsive. The bottom line goal of someone trying to

reduce scheduler latency is to ensure that opportunities to run the scheduler occur

regularly, so that tasks needed to service events get a chance to run as quickly as possible.
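
The symptom these strategies address can also be observed crudely from user space: ask for a short sleep and measure how much later than requested the wakeup actually arrives. The sketch below measures this timer-wakeup jitter, which includes scheduler latency among other overheads; it is not the kernel instrumentation that produced the figures quoted above.

# Sketch: observing wakeup latency from user space. The process asks to
# sleep for 1 ms and records how much later than requested it actually
# wakes up. The jitter seen here includes scheduler latency plus timer
# and interpreter overhead; it is illustrative, not a kernel measurement.

import time

def measure_wakeup_jitter(iterations=1000, sleep_s=0.001):
    worst = 0.0
    total = 0.0
    for _ in range(iterations):
        start = time.monotonic()
        time.sleep(sleep_s)
        late = (time.monotonic() - start) - sleep_s   # how late the wakeup was
        worst = max(worst, late)
        total += late
    return total / iterations, worst

avg, worst = measure_wakeup_jitter()
print(f"average lateness {avg * 1000:.3f} ms, worst case {worst * 1000:.3f} ms")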

A different strategy for reducing scheduler latency called the low-latency patches was

introduced by Ingo Molnar and is now maintained by Andrew Morton. Rather than

attempting a brute-force approach (a la preemption) in a kernel that was not designed for

it, these patches focus on introducing explicit preemption points in blocks of code that

may execute for long stretches of time. The idea is to find places that iterate over large


data structures and figure out how to safely introduce a call to the scheduler if the loop

has gone over a certain threshold and a scheduling pass is needed (indicated by

need_resched being set). Sometimes this entails dropping a spinlock, scheduling and then

reacquiring the spinlock, which is also known as lock breaking. The ability to modify the

operating system and to have it "react" with low latency was a key step in beginning the investigation into the development of a gesture capturing system. This low latency removed many of the problems that other system designers have encountered.

Real-time computing technology and development of human gesture tracking

systems may enable gesture to be introduced again into the practice of computer music.

The study and the analysis of human gesture in the music area for the control of digital

instruments along with standardization requirements coming from the industry have led

to the birth of MIDI devices and their communication protocol. “This protocol has often

been judged insufficient, because of its narrow bandwidth and its limitations due to the

pianistic gesture production. This is even more true today as the relationship between

composer and technology is no longer exclusive but is extended to other forms of artistic projects in which multimedia interacts with other forms of expression such as dance, theatre, or video art." (Sapir, JNMR, 2002) Now that we have looked at the various forms that gesture can take via dance, and the kinds of sensors we can use to

capture these gestures, it is time to start discussing various concepts of design for an

interactive system.


Chapter IV Design Concepts

Performance Environments

The relationship between interactive environments and compositional process is

too often glossed over, even though a truly interactive performance environment should,

almost by definition, challenge basic compositional thinking. That interactive

environment, whether it is driven by dancers moving within a defined space, or by a

single performer playing a more conventional device, such as a MIDI keyboard or wind

controller, becomes a central part of the composition itself.

Robert Rowe in his excellent “Interactive Music Systems” distinguishes between

“score-driven” and “performance-driven” interactive systems. Score-driven systems are

those in which predetermined, or “scored” materials are compared or matched with

incoming performance data, placing the composer in his or her traditional role as the final

arbiter of the piece. Performance-driven systems, on the other hand, rely on interpreting

typically improvisational performance data, attempting to recognize classes of data, such

as repetition, speed, pitch range, and so on. Richard Povall has advocated the addition of

a third classification, “environment-driven” which in some ways synthesizes Rowe’s

classifications, and in other ways lies entirely outside them. “In an environment-driven

interactive performance system the performer is working within an algorithmic

environment in which there are elements of both score-driven and performance-driven

systems. The environment is listening to the performance data, which in its turn can

trigger predetermined or algorithmic, or even aleatoric processes. By the same token, the


performer is also reacting to the environment, placing herself into a fully interactive

feedback situation." (Povall, JNMR, 1995) Feedback is an advantageous addition to an environment-driven system. Feedback between the performer and the system provides the performer with a much-needed awareness of what is being

manipulated within the system. I tend to agree with Povall on his summation that it is

possible to have an interactive environment which is a combination of the two systems

that Rowe discusses.

Instrument Design

“The conception and design of new musical interfaces is a burgeoning

multidisciplinary area where technological knowledge (sensor technology, sound

synthesis and processing techniques, computer programming, etc.), artistic creation, and a deep understanding of musicians' culture must converge to create new interactive music

making paradigms. If new musical interfaces can be partially responsible for shaping

future music, these new musical paradigms should not be left to improvisation.” (Jorda,

CMJ, 2002) I disagree with Jorda on his point that the paradigms he discusses should not

be left to improvisation. In a properly created environment-driven system,

improvisation with feedback is the key to an aesthetically heightened musical experience.

Often, more difficult-to-master instruments lead to richer and more sophisticated

music (e.g., the piano vs. the kazoo), but expressiveness does not necessarily imply

difficulty. In that sense, one of the obvious research trends in real-time musical

instrument design can be the creation of easy-to-use and, at the same time, sophisticated


and expressive systems (considering that the best way to understand and appreciate any discipline, whether artistic or not, and music is no exception, is by doing and being part of it). More efficient instruments that can subvert the aforementioned effort-to-result quotient will bring new, sophisticated possibilities and the joy of real-time active music creation to non-trained musicians.

Whether following a parallel design or maintaining two independent approaches, the

physical and logical separation of the input device from the sound production necessitates

multiple ways of processing and mapping the information coming from the input device. Mapping therefore becomes an essential element in designing new instruments. We will

come back to this design process when we start the discussion of Aesthetic

Considerations.

Chapter V System Design

Construction

The discussion will now focus on how I developed a gesture capturing system for

use in a performance with three dancers. The system that I developed has the following

components: ten flex sensors placed on one finger of each hand, the wrists, elbows, the

upper thighs, and the backs of the knees. The system also included two angular velocity

sensors mounted perpendicularly to each other, allowing me to capture the dancer's orientation within the room as well as their orientation to the floor. The angular velocity sensors output a voltage proportional to the rate of rotation. The final capturing

component was an indoor location tracking system developed by MIT called “The


Cricket." The Cricket made it possible to track location in two dimensions with an accuracy of two centimeters.

Analog-to-Digital conversion

To do Analog-to-Digital conversion I chose to use a BX-24 micro-controller. The

micro-controller was not only used to do A/D conversion but also allowed me to

multiplex the aforementioned sensors together before transmitting the data. The

multiplexing worked by giving each A/D input on the micro-controller an ID from 1-10.

The IDs were then coupled with the value of the A/D input from each sensor. This data

was then sent in a stream out of the micro-controller’s serial interface.

Hardware Example 5.) is the BX-24 on the left mounted to a development board with a

power regulator on the right.

Hardware Example 5.


After doing A/D conversion, the micro-controller was connected via serial to a

wearable CPU.
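
The multiplexing scheme described above is simple enough to sketch. The BX-24 itself is programmed in a BASIC dialect and its exact byte format is not reproduced here; the Python sketch below, with its "id value" text framing, is an illustrative assumption rather than the original firmware.

# Sketch: multiplexing several A/D channels into one serial stream by
# tagging each value with its channel ID (1-10). The text framing and the
# placeholder ADC read are illustrative assumptions, not the BX-24 code.

def read_adc(channel):
    """Placeholder for an A/D read; returns a 10-bit value (0-1023)."""
    return 512  # a real implementation would sample the hardware here

def build_frame(channel, value):
    return f"{channel} {value}\n".encode("ascii")

def scan_and_send(serial_port, channels=range(1, 11)):
    """Read every channel once and write the tagged values to the port."""
    for channel in channels:
        serial_port.write(build_frame(channel, read_adc(channel)))

# Example with a stand-in for the serial port object:
import io
port = io.BytesIO()
scan_and_send(port)
print(port.getvalue())   # b'1 512\n2 512\n' ... up to channel 10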

802.11b

To transmit the data from the sensors an 802.11b wireless card was used. This

method of transmission was not only more reliable, it also allowed for greater amounts of

bandwidth. The increase in data transmission greatly lowered latency. To add wireless

Ethernet (802.11b) to the system I chose to investigate wearable computing. It was

important to use off-the-shelf components to minimize development time.

PC-104

The decision was made to use a form factor called PC-104; these units are approximately 3" square and have a 104-pin expansion bus. The 104-pin bus allows for

different modules to be plugged in for the expansion of the CPU’s functionality. The PC-

104 CPU module runs on an x86 compatible 66MHz processor, and was expanded with a

PCMCIA module that allowed for the addition of an 802.11 wireless Ethernet card.

Hardware Example 6.) is the top view of the PC-104 with the PCMCIA module and

wireless Ethernet card.


Hardware Example 6.

The system boots from a compact flash card. The compact flash card was loaded

with a custom Linux distribution that allowed me to run various applications for

transmitting the sensor data. The system is powered by a modified laptop battery. All of the components are mounted inside a plexi-glass housing; this housing has straps, similar to a backpack, for ease of motion and wearability. The above configuration allowed me to have a compact, affordable, off-the-shelf wireless system to capture all of the dancers' gestures.

Hardware Example 7.) shows the bottom view of the PC-104 with a 64MB memory

module on the left and a 128MB CompactFlash card on the right.


Hardware Example 7.

Hardware Example 8.) shows the complete system including A/D converter, CPU and

PCMCIA module with Ethernet, and the location tracking system, all mounted inside of

the plexi-glass housing.

Hardware Example 8.


Hardware Example 9.) is a block diagram of the gesture capturing system worn by the dancer: the flex sensors, gyroscopes, and Cricket listener feed the micro-controller, which connects to the PC-104 with 802.11b wireless Ethernet.

Hardware Example 9.

Chapter VI Software

Once the system is operational, the next step is to start considering how the

gesture can be mapped to a sound control parameter. This mapping can be easily

achieved in software. “A software layer is always necessary between gesture acquisition

and sound production: gesture management software and performance software.” (Sapir,

JNMR, 2002) I will now discuss the various software systems that can be used for

mapping gesture to sound control. In a joint effort with Morton Subotnick, Mark Coniglio

has developed "Interactor" a Macintosh software that allows the computer to interpret

data coming from a set of sensors placed on the performer's body or somewhere on the



stage, so allowing the performer herself to directly control different devices such as

synthesizers, theatrical lighting, digital audio effects, video apparatus and even robots.

MAX/MSP

I chose to use the widely available and pseudo-modular MAX/MSP. After having

configured my gesture capturing system to be wireless, it was necessary to configure a

small wireless network between the dancers and a machine running Linux. The Linux machine ran a custom application called SerialOverIP. This software runs in a client-server configuration: the dancer's wearable computer ran the server portion and the Linux machine ran the client portion, letting me bind the serial interfaces on all of my machines to IP addresses. The Linux machine had four serial interfaces, and these were connected to a laptop running MAX/MSP, which used the "serial" object to read the sensor data. Once the sensor data was in MAX/MSP, the ID for each sensor was stripped and the values were routed to the appropriate objects.
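
The stripping and routing performed inside the patch amounts to reading a tagged value, looking at its ID, and handing the value to whichever handler deals with that sensor. The sketch below mirrors that logic in Python, using the same assumed "id value" text framing as the earlier sketch; in the performance this work was done graphically with MAX/MSP objects, not with this code.

# Sketch: de-multiplexing the incoming sensor stream and routing each value
# by its channel ID, mirroring what the serial_routing sub-patch does
# graphically. Framing and handler names are illustrative assumptions.

handlers = {}   # channel ID -> function that consumes the value

def route(channel, value):
    handler = handlers.get(channel)
    if handler is not None:
        handler(value)

def handle_stream(lines):
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue                      # skip malformed frames
        channel, value = int(parts[0]), int(parts[1])
        route(channel, value)

# Example: send flex-sensor channel 1 to a simple printer.
handlers[1] = lambda v: print("channel 1 flex value:", v)
handle_stream(["1 512", "2 300"])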

The following is a run-through of the MAX/MSP patch that I developed to input,

de-multiplex, and scale the sensor data coming from my gesture capturing system.

Software Example 1.) is the main patch where all the modules are connected via sub-

patches.


Software Example 1.

Software Example 2.) shows the serialin sub-patch which takes in the input from the

machine running Linux.

Software Example 2.

Software Example 3.) shows where the data is de-multiplexed with the MAX/MSP patch.

This sub-patch takes the input from the serial object and strips the IDs from the micro-controller via the sub-patch serial_routing; the stripped data is then sent to the scaling sub-patch, which makes the data suitable for MIDI control, and then output to the averaging scheme.


Software Example 3.

Software Example 4.) shows the serial_routing sub-patch, where the IDs from the micro-controller's A/D inputs were stripped from the data stream.

Software Example 4.

Software Examples 5. and 6.) show the scaling sub-patches. Here the data stream was

scaled to MIDI's minimum and maximum controller values of 0-127.


Software Example 5. Software Example 6.
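
The scaling step shown in these sub-patches is a straightforward linear mapping from each sensor's input range onto the controller range, roughly as sketched below; the 10-bit input range is an assumption, and the per-sensor ranges in the actual patch were set by hand.

# Sketch: linearly rescaling a raw sensor value onto the 0-127 range used
# for MIDI controller messages. The default input range is an assumption.

def scale_to_midi(value, in_low=0, in_high=1023, out_low=0, out_high=127):
    value = max(in_low, min(in_high, value))                  # clamp to range
    span_in = in_high - in_low
    span_out = out_high - out_low
    return out_low + round((value - in_low) * span_out / span_in)

print(scale_to_midi(512))    # mid-scale input maps to roughly 64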

Software Example 7.) is a block diagram of the software data flow: serial input from the Linux machine, de-multiplexing, scaling and routing, averaging of sensor values, and output to the MIDI controller.

Software Example 7.

The later portions of this diagram “Averaging of Sensor Values” and “Output to MIDI

controller” will be discussed in the following Chapter.

Chapter VII Mapping and the Aesthetic Consideration

This discussion will focus on the aesthetic consideration given to mapping a

gesture to a sound control parameter. I will discuss the techniques presently used to map

gesture to sound control. The mapping problem is at the center of the effectiveness of an

Serial Input from

Linux Machine

De-Multiplex

Scaling and

Routing

Averaging of

Sensor Values

Output to MIDI

controller


interactive environment. To a certain extent, as described by Winkler (1995), we can

consider human movement in space as a musical instrument if we allow the physicality of

movement to impact musical material and processes, and if we are able to map physical parameters of movement onto high-level musical parameters down to low-level sound parameters.

One characteristic of the mapping from gestural parameters to sound control

parameters, in the case of most traditional instruments, is its simplicity and directness.

The position of a finger on the piano keyboard maps directly to pitch; the downward

velocity of the finger maps directly to amplitude. In these and many other examples, a

single gestural parameter maps directly onto a single sound control parameter. (The

timbre of the piano also changes with velocity, but since the changes are linked

inseparably with the amplitude changes, we consider the pair to be a single sound control

parameter of the piano.) This directness is not lost on the listener, who can often infer the

form of the gesture from the sound generated. Cadoz (1988) states that “the sound

phenomenon produced by a natural object or instrument is an indelible trace of the

gesture” and that the perceptual system of the brain “infers possible causes,” i.e., possible

gestures that made the sound.

For the sake of simplicity many electronic instruments use basic one-to-one linear

mappings. More complex mappings such as one-to-many, many-to-one, and many-to-

many, with optional non-linear relations, memory, and feedback (so that the output is not

only a function of the last input value, but also of previous input and/or output values)


involve deeper design strategies. “Such strategies do not exist in a separated conception

approach. They also require more demanding programming techniques not easily

attainable in the environments commonly used for mapping design purposes, such as

Max and similar data-flow graphical programming languages.” (Puckette 1988; Puckette

and Zicarelli 1990; Winkler 1998)

“Unlike the “one-gesture-to-one-acoustic-event” paradigm, our framework allows

generative algorithms that can produce complex musical structures consisting of many

events. In this model, the performers gestures can guide and control high-level

parameters of the generative algorithms rather than directly triggering each event. One of

our central metaphors for music control is that of driving or flying about in a space of

musical processes. Gestures move through time, as do the musical processes.” (Wessel,

CMJ, 2002)

Many people have chosen to have one-to-one relationships between the gesture and

the type of data manipulation. For instance, a dancer or performer will move their finger

in and out, and a light may turn on or off. This is not the direction that we should be

headed with gesture capturing technology. It is important to break down the gestures, and

then reconstruct a method for how the gesture should manipulate the data.
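
To make the contrast concrete, the sketch below shows a one-to-one mapping (one sensor toggles one light) next to a simple many-to-one mapping, in which several gesture values are blended before driving a single parameter. The sensor names and weights are invented for illustration only.

# Sketch: one-to-one versus many-to-one gesture mapping. Sensor names,
# weights, and the target parameter are invented for illustration.

def one_to_one(finger_flex, threshold=64):
    """One gesture, one result: the light is simply on or off."""
    return "light on" if finger_flex > threshold else "light off"

def many_to_one(sensor_values, weights):
    """Several gesture values blended into a single control parameter."""
    total = sum(w * v for w, v in zip(weights, sensor_values))
    return total / sum(weights)          # weighted average, same 0-127 range

sensors = [90, 40, 70]                   # e.g. finger, elbow, knee flex
print(one_to_one(sensors[0]))            # -> light on
print(many_to_one(sensors, [0.5, 0.3, 0.2]))   # -> one blended value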

My Considerations, How I chose to manipulate data.

A performance was staged as a test bed for the gesture capturing technology that I

developed. This section is going to be a discussion of the mapping decisions used in the


performance. Each dancer had 16 different values associated with them. The values

included the flex of each of their ten joints, 4 different measurements of orientation and

X,Y coordinates for location. Using three dancers, nearly fifty different values were

being generated at any given time. It was necessary to start combining and averaging the

data, which would in turn generate different curves, and distributions. The different

curves and distributions could then affect the entire environment on a global level. These

curves were then mapped to different sound control parameters and effected very subtle

changes in the system. The averaging of the sensor data from the dancers was done as

follows: the values of all the sensors on each arm were averaged; then the values of the legs (thighs and knees) for both legs were averaged. The orientation of the dancer's position to the floor, as well as their orientation within the room, was combined with their location in the room to control the spatialization within the environment. Aesthetically, this type of mapping was much more pleasing conceptually as well as visually and aurally.
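
A rough sketch of that averaging scheme is given below. The grouping of sensors into arms and legs follows the description above, but the data layout is an assumption for illustration; the actual averaging was done inside the MAX/MSP patch shown next.

# Sketch: per-dancer averaging of limb sensors, following the grouping
# described above (arm sensors averaged together, thigh and knee sensors
# averaged together). The dictionary layout is an illustrative assumption.

def average(values):
    return sum(values) / len(values)

def reduce_dancer(sensors):
    """Collapse a dancer's 16 raw values into a few averaged controls."""
    arms = average(sensors["left_arm"] + sensors["right_arm"])
    legs = average(sensors["left_leg"] + sensors["right_leg"])
    return {"arms": arms, "legs": legs,
            "orientation": sensors["orientation"],
            "location": sensors["location"]}

dancer = {
    "left_arm": [60, 70, 80],    # finger, wrist, elbow flex
    "right_arm": [50, 65, 75],
    "left_leg": [30, 40],        # thigh, knee flex
    "right_leg": [35, 45],
    "orientation": [10, 20, 30, 40],
    "location": (1.2, 3.4),
}
print(reduce_dancer(dancer))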

The following is an example of the mapping that took place within MAX/MSP.

Software Example 7.) shows the averaging of the sensor data.


Software Example 7.

Software Example 8.) shows the connection of the output of the above scaled and

averaged sensor data to an Eventide Orville

Software Example 8.


Chapter VIII Dance and Music

In order to achieve a coherent integration of music and dance, we consider their

analogous principles. These include exposition of ideas, links, and transitions between

sections, variation strategies, embellishment methods, and structural issues. In dance, exposition of ideas is achieved through the repetition of movements in different regions of space, combined with variations, jumps, twists, and falls. Such variations incorporate

new movements to those already exposed. The general structure is normally divided into

sections, each with its own expressive criteria, normally characterized by different

scenarios, light, and clothing. “These pose several problems, including what information

should be captured, how to characterize this information, and how to relate this

information in a way that is coherent with the music.” (Morales-Manzanares, 2001)

“Traditionally, music and dance have been complementary arts. However, their

integration has not always been entirely satisfactory. In general, a dancer must conform

movements to a predefined piece of music, leaving very little room for improvisational

creativity.” (Morales-Manzanares, 2001) I disagree with Morales-Manzanares on this

point; the room for improvisation can be immense when combining a dancer with an environment-driven system that uses gesture as the control.


Human gesture has always been at the center of the musical creation process either

in the written form of a score or in instrumental and conducting performance. Conversely, listening to music often leads to emotions which reverberate at the body level: a phenomenon which is at the basis of dance and choreography.

It is very natural for a listener to put the gesture back into the listening experience; many listeners play air-guitar or air-drums to "get into" the music and feel what it is

like to play the instrument, thus adding visual stimulus back into the music.

Chapter IX The Performance

The designer of the performance introduces sound and music knowledge into the

system, along with the compositional goals and the aspects of integration between music

and gesture (including a model of interpretation of gestures). “This may imply an

extension of the music language toward action, gesture languages, and visual languages.

This example raises important issues about new perspectives of the integration of music

and movement languages. For example, the performance designer in a live electronics

configuration now cooperates with dancers and performers.” (Vidolin 1997)

Lexicon, Lighting, Eventide, Spatialization

I wrote a vocal quintet in three movements; this particular piece was written to be manipulated by reverb and delay once it was recorded. I found a very interesting space called Engine27 in which to perform the vocal work with dance. Engine27 has a 16-


speaker system with many high-end effects processing units, including a Lexicon 960L

with 16 ins and outs, as well as an Eventide Orville with 8 ins and outs. The 16-speaker

system allows for sound to be moved from any speaker to any speaker.

Engine27

I will now discuss my aesthetic decisions on connecting three dancers into Engine27's facilities. First, it was necessary to get a recording of my vocal piece; this recording was made in Pro Tools. I took the Pro Tools recording and ran it through the Eventide and the Lexicon; this effected mix then ran through MAX/MSP, where the music was spatialized through the 16-speaker system. I ran a laptop that could take the input from the dancers, transform the input, and then output MIDI to the Eventide, the Lexicon, Engine27's Horizon Lighting System, and the MAX/MSP

spacialization system. The arms and legs were averaged for each dancer and a peak and

trough were generated from their movements, these low and high values were used to

create a curve between the minimum and maximum values. The orientation was tracked

in a histogram, and was used in conjunction with the location data to generate different

random distributions for spacialization. The lighting system was also effected by time

and location. At different points in time a lighting cue was selected by the dancers

location. The general effect that I wanted to get was a global change with multiple

variables instead of specific change with one variable.
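To make the mapping stage concrete, the fragment below is a minimal C sketch of the peak-and-trough idea described above; it is not the code used in the performance, and the function names, the sample data, and the 0-127 MIDI target range are my own illustrative assumptions.

#include <stdio.h>

/* Illustrative sketch only: track the trough and peak of an averaged limb
   reading and rescale each new reading into the 0-127 MIDI range. */
typedef struct {
    float trough;  /* lowest value seen so far  */
    float peak;    /* highest value seen so far */
} range_t;

static void range_init(range_t *r, float first) {
    r->trough = r->peak = first;
}

static int map_to_midi(range_t *r, float sample) {
    if (sample < r->trough) r->trough = sample;  /* widen the range downward */
    if (sample > r->peak)   r->peak   = sample;  /* widen the range upward   */
    if (r->peak == r->trough) return 0;          /* avoid dividing by zero   */
    return (int)(127.0f * (sample - r->trough) / (r->peak - r->trough));
}

int main(void) {
    float readings[4] = { 0.21f, 0.47f, 0.90f, 0.35f };  /* made-up sensor data */
    range_t limb;
    int i;
    range_init(&limb, readings[0]);
    for (i = 0; i < 4; i++)
        printf("reading %.2f -> MIDI %d\n", readings[i], map_to_midi(&limb, readings[i]));
    return 0;
}

The same curve-between-minimum-and-maximum idea applies whether the target is a MIDI controller value, a lighting cue index, or a spatialization parameter.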

Performance Example 1.) is the Max patch used to control spatialization.


Performance Example 1.

Performance Example 2.) is the Eventide input/output portion of the spatialization patch.

Performance Example 2.


Performance Example 3.) is the Lexicon and Eventide Spatialization output portion of the

patch.

Performance Example 3.

Performance Example 4.) Shows the setup for each movement of the vocal composition.

Performance Example 4.

Performance Example 5.) Shows the sub-patch pan_sub; this sub-patch received location and orientation tracking information from the dancers to control what sound would be placed in each speaker.


Performance Example 5.

Performance Example 6.) Shows the sub-patch quikPan.

Performance Example 6. This patch was developed by Dafna Naphtali. The input can be sent to any of the 16 speakers; this is controlled by the coll object, which holds a matrix for the different speaker combinations.
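The coll object essentially acts as a lookup table of speaker combinations. As a rough, hypothetical C analogue (the preset names and gain values below are invented for illustration and do not reproduce Dafna Naphtali's patch), the routing idea looks like this:

#include <stdio.h>

#define NUM_SPEAKERS 16
#define NUM_PRESETS  3

/* Each preset row holds a gain for each of the 16 outputs; an index selects
   which speaker combination the input signal is routed to. */
static const float presets[NUM_PRESETS][NUM_SPEAKERS] = {
    { 1,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0 },       /* preset 0: speaker 1 only       */
    { 0.5f,0.5f,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0 }, /* preset 1: split between 1 and 2 */
    { 0,0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,1 },       /* preset 2: speaker 16 only      */
};

/* Route one input sample to the 16 outputs according to the chosen preset. */
static void route(float in, int preset, float out[NUM_SPEAKERS]) {
    int i;
    for (i = 0; i < NUM_SPEAKERS; i++)
        out[i] = in * presets[preset][i];
}

int main(void) {
    float out[NUM_SPEAKERS];
    int i;
    route(1.0f, 1, out);
    for (i = 0; i < NUM_SPEAKERS; i++)
        printf("speaker %2d: %.2f\n", i + 1, out[i]);
    return 0;
}

In this analogy, selecting a different row of the matrix corresponds to sending a new index to the coll object.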

Diagram of the Performance

Performance Example 7.) is the block-diagram of the entire performance setup.


Performance Example 7.

A DVD is included with this paper. The DVD includes the performance, titled “Epistemology,” mixed in 5.1, as well as the above MAX/MSP patches.

(Components shown in the Performance Example 7 block diagram: three Wearable Gesture Capture Systems, a Wireless Hub, a Linux Machine, a Laptop, Pro-Tools playback and output, the Orville, the Lexicon, the Max Spatial System, Sound Web amplifiers, the Horizon Lighting System, and the Audio Output.)


Chapter X

Perception

The perceived control of the parameters affected by the dancers was not an obvious one-to-one connection. The mapping and aesthetic decisions made in the above-mentioned performance helped to draw the audience into the piece, piquing the audience's awareness. That awareness was keyed into linking the dancers' movement to the music: listening more intently to the changes in the music effected by the dancers, and watching the dancers' movement to hear the change. This heightening of awareness was a necessary component of the performance.

The technological description raises the compositional issues posed at the head of this paper. Certainly the composer is forced to approach his or her composition in a different way. There is no possibility here of building a score that can be triggered stepwise in some way by the performers (Rowe's "score-driven" interactive system), nor are the performers simply extending material provided for them or playing some kind of extended instrument (Rowe's "performance-driven" system) (Rowe 1994). In the case of my performance, and others like it, the performers are entering a virtual environment that has precomposed, probabilistic, algorithmic, and aleatoric elements. The virtual environment is as much an instrument to be played as a score to be read. Traditional notions of form (subject, development, repetition) become less important, even somewhat impossible, in a world of indeterminacy and eccentricity. Yet this is not a truly indeterminate music, nor is it purely improvisational. Central to the compositional design of the piece is the algorithmic manipulation of data: manipulation that may not necessarily mirror its input in any recognizable way (Povall).

Michel Waisvisz, designer of THE HANDS (Waisvisz 1985), has provided one of the best counter-examples to the notion that electronic music is inherently lifeless. Following Waisvisz's example, designers of new controllers should attempt to maintain or even expand the bandwidth and parallelism of expression beyond that of traditional instruments. Furthermore, new controllers should allow the performer to flexibly redirect expression at whatever functional level is appropriate. For example, one might want to maintain careful control of the timbre of a cello in a quartet environment or during a solo, but then redirect one's expressive capabilities to the control of orchestral timbre during a finale. In the latter context, the orchestra can be thought of as a large, expressive instrument.

Gestural expression and listening are constrained by a context based on stylistic, syntactic, and semantic conventions, including those that involve qualities such as emotional “depth.” A musical instrument designer may want to implement a relation between a sound parameter or musical quality described in terms of one representation and a movement parameter or gestural quality described by means of another. To implement such relations, the designer will first need to define a way to transform the first representation into the second. Failure to do so implies a return to a single representation of musical performance.
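As a loose illustration of such a transformation, the sketch below assumes a movement representation based on acceleration and a sound representation based on reverb decay time; both quantities, the window size, and the linear mapping are my own assumptions rather than a general prescription.

#include <stdio.h>
#include <math.h>

#define WINDOW 4

/* Movement representation: describe the gesture as "energy", the mean
   absolute acceleration over a short window of samples. */
static float gesture_energy(const float accel[WINDOW]) {
    float sum = 0.0f;
    int i;
    for (i = 0; i < WINDOW; i++)
        sum += fabsf(accel[i]);
    return sum / WINDOW;
}

/* Sound representation: transform normalized energy into a reverb decay
   time in seconds. Calm movement gives long decays, energetic movement
   gives short ones; the range 0.5-8.0 s is an arbitrary choice. */
static float energy_to_decay(float energy) {
    float min_decay = 0.5f, max_decay = 8.0f;
    if (energy > 1.0f) energy = 1.0f;  /* clamp normalized energy */
    return max_decay - energy * (max_decay - min_decay);
}

int main(void) {
    float accel[WINDOW] = { 0.1f, 0.2f, 0.15f, 0.05f };  /* made-up samples */
    float e = gesture_energy(accel);
    printf("energy %.2f -> decay %.2f s\n", e, energy_to_decay(e));
    return 0;
}

The point of the sketch is that the transformation itself is an explicit, designed object: changing it changes the relation between the two representations without altering either representation on its own.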


How does the re-association of visual stimulus affect our perception of a performance? The visual stimulus provided by a performance gives the listener a heightened sense of awareness. This heightened awareness draws the audience into the music that is being performed and is ultimately more satisfying.

Conclusion Future Applications Wearable computing, augmented reality

I would like to expand my knowledge of gesture-capturing systems into many different areas. Firstly, I plan on condensing many of the interconnections associated with this system. I am already developing a MAX/MSP object that will allow me to go directly from the wearable computer into a laptop. This will allow me to remove all of the serial connections from the system. Also under consideration for the expansion of the system, I would like to discontinue the usage of MAX/MSP and move to a real-time system such as Roger Dannenberg's AuraRT. AuraRT will allow me to synthesize sound directly off a laptop running Linux. A revision of the hardware used for the wearable computer is in progress as well. I want to further decrease the size of the wearable computer. This will be accomplished by the use of Stanford University's Matchbox PC, which is 5 cubic inches and weighs 3.3 oz. This will be the direction for further development of my device. I have also considered different applications for my device.

The gesture-capturing system that I developed could be used as a video-game controller, allowing the player to map any body movement to a movement within the game. I would like to expand on this idea by first getting the opportunity to do another performance of my vocal quintet in which video, as well as audio and lighting, could be manipulated in real time. I would also like to explore how the gesture-capturing system could be used on the Internet, utilizing its 802.11 capabilities, possibly controlling Shockwave or Flash movies. Once connected to the Internet, the system could be used to manipulate data in a database. I would like to hook the system into a database and do relational data manipulations using multi-dimensional object (M-OLAP) transformations. This transformed data could be used to manipulate video and audio data streams coming off a streaming server running Java servlets. The system could also be used for motion capture. This could be further explored by connecting the gesture system, via the Python scripting language, to the open-source real-time 3D rendering application Blender. There are applications in augmented reality that could be exploited as well: the CPU module has the capability to drive an LCD display, thus allowing one to connect retinal-scanning glasses. These glasses would allow a performer to view the real world while other data is mapped over the top of it. This would give the performer more feedback about the environmental changes that they are effecting.

Conclusion

To conclude, gesture can and should be used to control sound. The concept of gesture control has already defined itself as a field within music composition and music technology. Technology has progressed far enough for us to implement real-time computing, with the use of highly accurate sensors, to develop affordable gesture-capturing systems. With low latency and high accuracy, these gesture-capturing systems have applications throughout various fields outside of music composition and music technology. The decision of how to map gesture to sound control parameters is then the area where further compositional control can be exercised. Gesture mapping is a burgeoning field which, with these advances in technology, will hopefully now begin to be further explored. Once the mapping of gesture to sound control has been clearly defined, highly effective compositions can be created. These compositions can make gesture, utilizing the ideas of an environmentally-driven performance, a new instrument. These performances will allow composers to re-introduce the visual stimulus that has been missing from musical performance for the past fifty years.


Appendices

Code for SerialOverIP. This application was written in C and is being ported to MAX/MSP.

/*
 * ----------------------------------------------------------------------------
 * serialoverip
 * Utility for transport of serial interfaces over UDP/IP
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU General Public License as published by
 * the Free Software Foundation; either version 2 of the License, or
 * (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * along with this program; if not, write to the Free Software
 * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA
 *
 */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <fcntl.h>
#include <termios.h>
#include <signal.h>

#define MAXMESG 2048

char *pname;
int s[2], st[2];


void help(){
    fprintf(stderr,"\
SerialOverIP comes with ABSOLUTELY NO WARRANTY. This is free software,\n\
and you are welcome to redistribute it under GNU General Public License.\n\
Usage: %s <source1> <source2>\n\
where <source1> and <source2> are one of the following:\n\
 -s <IP> <port>       UDP server on IP:port\n\
 -c <IP> <port>       UDP client for server IP:port\n\
 -d <device> sss-dps  local serial device\n\
                      sss is speed (50,..,230400)\n\
                      d is data bits (5,6,7,8)\n\
                      p is parity type (N,E,O)\n\
                      s is stop bits (1,2)\n\
",pname);
    return;
}

int setserial(int s,struct termios *cfg,int speed,int data,unsigned char parity,int stopb){
    cfmakeraw(cfg);
    switch(speed){
        case 50     : { cfsetispeed(cfg,B50)    ; cfsetospeed(cfg,B50)    ; break; }
        case 75     : { cfsetispeed(cfg,B75)    ; cfsetospeed(cfg,B75)    ; break; }
        case 110    : { cfsetispeed(cfg,B110)   ; cfsetospeed(cfg,B110)   ; break; }
        case 134    : { cfsetispeed(cfg,B134)   ; cfsetospeed(cfg,B134)   ; break; }
        case 150    : { cfsetispeed(cfg,B150)   ; cfsetospeed(cfg,B150)   ; break; }
        case 200    : { cfsetispeed(cfg,B200)   ; cfsetospeed(cfg,B200)   ; break; }
        case 300    : { cfsetispeed(cfg,B300)   ; cfsetospeed(cfg,B300)   ; break; }
        case 600    : { cfsetispeed(cfg,B600)   ; cfsetospeed(cfg,B600)   ; break; }
        case 1200   : { cfsetispeed(cfg,B1200)  ; cfsetospeed(cfg,B1200)  ; break; }
        case 1800   : { cfsetispeed(cfg,B1800)  ; cfsetospeed(cfg,B1800)  ; break; }
        case 2400   : { cfsetispeed(cfg,B2400)  ; cfsetospeed(cfg,B2400)  ; break; }


        case 4800   : { cfsetispeed(cfg,B4800)  ; cfsetospeed(cfg,B4800)  ; break; }
        case 9600   : { cfsetispeed(cfg,B9600)  ; cfsetospeed(cfg,B9600)  ; break; }
        case 19200  : { cfsetispeed(cfg,B19200) ; cfsetospeed(cfg,B19200) ; break; }
        case 38400  : { cfsetispeed(cfg,B38400) ; cfsetospeed(cfg,B38400) ; break; }
        case 57600  : { cfsetispeed(cfg,B57600) ; cfsetospeed(cfg,B57600) ; break; }
        case 115200 : { cfsetispeed(cfg,B115200); cfsetospeed(cfg,B115200); break; }
        case 230400 : { cfsetispeed(cfg,B230400); cfsetospeed(cfg,B230400); break; }
    }
    switch(parity|32){
        case 'n' : { cfg->c_cflag &= ~PARENB; break; }
        case 'e' : { cfg->c_cflag |= PARENB; cfg->c_cflag &= ~PARODD; break; }
        case 'o' : { cfg->c_cflag |= PARENB; cfg->c_cflag |= PARODD ; break; }
    }
    cfg->c_cflag &= ~CSIZE;
    switch(data){
        case 5 : { cfg->c_cflag |= CS5; break; }
        case 6 : { cfg->c_cflag |= CS6; break; }
        case 7 : { cfg->c_cflag |= CS7; break; }
        case 8 : { cfg->c_cflag |= CS8; break; }
    }
    if(stopb==1) cfg->c_cflag &= ~CSTOPB;
    else cfg->c_cflag |= CSTOPB;
    return tcsetattr(s,TCSANOW,cfg);
}

void gotint(int x){
    if(st[0]&2){ tcflush(s[0],TCIOFLUSH); close(s[0]); }
    if(st[1]&2){ tcflush(s[1],TCIOFLUSH); close(s[1]); }
    printf("%s exiting.\n",pname);
    exit(1);
}


int main(int argc,char**argv){
    int i,n,w,clen[2],nonblock[2],speed,data,stopb;
    unsigned char c,buf[MAXMESG],*p,parity;
    struct termios cfg;
    struct sockaddr_in addr[4][4];
    struct sigaction newact,oldact;

    pname=argv[0];
    if(argc!=7){ help(); return 1; }
    for(i=0;i<2;i++){
        st[i]=0;
        switch(argv[3*i+1][1]){
            case 's':
                st[i]=1;
            case 'c':
                bzero((char *) &(addr[i][0]), sizeof(addr[i][0]));
                addr[i][0].sin_family = AF_INET;
                addr[i][0].sin_addr.s_addr = inet_addr(argv[3*i+2]);
                addr[i][0].sin_port = htons(atoi(argv[3*i+3]));
                bzero((char *) &(addr[i][1]), sizeof(addr[i][1]));
                addr[i][1].sin_family = AF_INET;
                addr[i][1].sin_addr.s_addr = 0;
                addr[i][1].sin_port = htons(0);
                if((s[i]=socket(AF_INET,SOCK_DGRAM,0))<0){
                    fprintf(stderr,"%s: can't open datagram socket",pname);
                    return 3;
                }
                if(bind(s[i],(struct sockaddr*)&addr[i][!st[i]],sizeof(addr[i][!st[i]]))<0){
                    fprintf(stderr,"%s: can't bind local address",pname);
                    return 4;
                }
                break;
            case 'd':
                st[i]=2;
                if((s[i]=open(argv[3*i+2],O_RDWR|O_NDELAY))<0){
                    fprintf(stderr,"%s: could not open device %s\n",
                            pname,argv[3*i+2]);
                    return -1;
                }


                n=sscanf(argv[3*i+3],"%d-%d%c%d",&speed,&data,&parity,&stopb);
                if(n<4){
                    fprintf(stderr,"%s: invalid argument %1d from %s\n",
                            pname,i+1,argv[3*i+3]);
                    return 3;
                }
                if(setserial(s[i],&cfg,speed,data,parity,stopb)<0){
                    fprintf(stderr,"%s: could not initialize device %s\n",
                            pname,argv[3*i+2]);
                    return 7;
                }
                break;
            default:
                help();
                return 2;
        }
        clen[i]=sizeof(addr[i][1]);
        nonblock[i]=!(st[i]&1);
    }
    signal(SIGINT,gotint);
    i=0;
    /* Shuttle data back and forth between the two endpoints. */
    while(1){
        if(st[i]&2) n=read(s[i],buf,MAXMESG);
        else{
            n=recvfrom(s[i],buf,MAXMESG,nonblock[i]*MSG_DONTWAIT,
                       (struct sockaddr*)&addr[i][st[i]],&clen[i]);
            nonblock[i]=1;
        }
        p=buf;
        while(n>0){
            if(st[!i]&2) w=write(s[!i],p,n);
            else w=sendto(s[!i],p,n,0,
                          (struct sockaddr*)&addr[!i][st[!i]],clen[!i]);
            if(w>0){ n-=w; p+=w; }
            else{
                fprintf(stderr,"%s: write error\n",pname);
                break;
            }
        }
        i=!i;
    }
    return 0;
}


The following is an excerpt, in patch form, from the documentation included with the Linux kernel preemptible-kernel and low-latency patches.

+A preemptible kernel creates new locking issues. The issues are the same as

+those under SMP: concurrency and reentrancy. Thankfully, the Linux preemptible

+kernel model leverages existing SMP locking mechanisms. Thus, the kernel

+requires explicit additional locking for very few additional situations.

+

+This document is for all kernel hackers. Developing code in the kernel

+requires protecting these situations.

+

+

+RULE #1: Per-CPU data structures need explicit protection

+

+

+Two similar problems arise. An example code snippet:

+

+ struct this_needs_locking tux[NR_CPUS];

+ tux[smp_processor_id()] = some_value;

+ /* task is preempted here... */

+ something = tux[smp_processor_id()];

+

+First, since the data is per-CPU, it may not have explicit SMP locking, but

+require it otherwise. Second, when a preempted task is finally rescheduled,

+the previous value of smp_processor_id may not equal the current. You must

+protect these situations by disabling preemption around them.

+

+

+RULE #2: CPU state must be protected.

+

+

+Under preemption, the state of the CPU must be protected. This is arch-

+dependent, but includes CPU structures and state not preserved over a context

+switch. For example, on x86, entering and exiting FPU mode is now a critical

+section that must occur while preemption is disabled. Think what would happen

+if the kernel is executing a floating-point instruction and is then preempted.

+Remember, the kernel does not save FPU state except for user tasks. Therefore,

+upon preemption, the FPU registers will be sold to the lowest bidder. Thus,


+preemption must be disabled around such regions.

+

+Note, some FPU functions are already explicitly preempt safe. For example,

+kernel_fpu_begin and kernel_fpu_end will disable and enable preemption.

+However, math_state_restore must be called with preemption disabled.

+

+

+RULE #3: Lock acquire and release must be performed by same task

+

+

+A lock acquired in one task must be released by the same task. This

+means you can't do oddball things like acquire a lock and go off to

+play while another task releases it. If you want to do something

+like this, acquire and release the task in the same code path and

+have the caller wait on an event by the other task.

+

+

+SOLUTION

+

+

+Data protection under preemption is achieved by disabling preemption for the

+duration of the critical region.

+

+preempt_enable() decrement the preempt counter

+preempt_disable() increment the preempt counter

+preempt_enable_no_resched() decrement, but do not immediately preempt

+preempt_get_count() return the preempt counter

+

+The functions are nestable. In other words, you can call preempt_disable

+n-times in a code path, and preemption will not be reenabled until the n-th

+call to preempt_enable. The preempt statements define to nothing if

+preemption is not enabled.

+

+Note that you do not need to explicitly prevent preemption if you are holding

+any locks or interrupts are disabled, since preemption is implicitly disabled

+in those cases.

+

+Example:

+

+ cpucache_t *cc; /* this is per-CPU */

+ preempt_disable();

+ cc = cc_data(searchp);

+ if (cc && cc->avail) {

+ __free_block(searchp, cc_entry(cc), cc->avail);

+ cc->avail = 0;

+ }


+ preempt_enable();

+ return 0;

+

+Notice how the preemption statements must encompass every reference of the

+critical variables. Another example:

+

+ int buf[NR_CPUS];

+ set_cpu_val(buf);

+ if (buf[smp_processor_id()] == -1) printf(KERN_INFO "wee!\n");

+ spin_lock(&buf_lock);

+ /* ... */

+

+This code is not preempt-safe, but see how easily we can fix it by simply

+moving the spin_lock up two lines.


Resources

Bauer, Will; Foss, Bruce. GAMS: An integrated media controller system. Computer Music Journal, Vol. 16, Issue 1, spring 1992, pp. 19-24. ISSN: 0148-9267.

Bongers, Bert. An interview with Sensorband. Computer Music Journal, Vol. 22, Issue 1, spring 1998, pp. 13-24. ISSN: 0148-9267.

Camurri, Antonio; Coletta, Paolo; Ricchetti, Matteo; Volpe, Gualtiero. Expressiveness and physicality in interaction. Journal of New Music Research, Vol. 29, Issue 3, Sept 2000, pp. 187-198. ISSN: 0929-8215.

Camurri, Antonio; Hashimoto, Shuji; Ricchetti, Matteo; Ricci, Andrea; Trocca, Riccardo; Volpe, Gualtiero. EyesWeb: Toward gesture and affect recognition in interactive dance and music systems. Computer Music Journal, Vol. 24, Issue 1, spring 2000, pp. 57-69. ISSN: 0148-9267.

Causse, Rene. Recherche et facture instrumentale: Logiques de l'innovation. [Research into the making of musical instruments: The logic of innovation.] Les cahiers de l'IRCAM: Recherche et musique, Issue 7, 3rd quarter 1995, pp. 67-76. ISSN: 1242-8493.

Cerana, Carlos. Gesture control for musical processes: A MAX environment for Buchla's "Lightning". Organised Sound: An International Journal of Music Technology, Vol. 5, Issue 1, Apr 2000, pp. 3-7. ISSN: 1355-7718.

Cerana, Carlos. Gesture control of musical processes: A MAX environment for the Lightning. Porto Alegre, Brazil: Sociedade Brasileira de Computacao, pp. 169-178, 1999.

Chafe, Chris. Tactile audio feedback. San Francisco: International Computer Music Association, pp. 76-79, 1993.

Damiani, Furio; Manzolli, Jonatas; Mendes, Gilberto. Controle parametrico MIDI usando interface gestual ultrasonica. [Parametric control in MIDI using an ultrasonic gestural interface.] Porto Alegre, Brazil: Sociedade Brasileira de Computacao, pp. 55-60, 1998.

Darreg, Ivor; Hopkin, Bart. Still nothing else like it: The theremin. Experimental Musical Instruments, Vol. 8, Issue 3, Mar 1993, pp. 22-26. ISSN: 0883-0754.

Downes, Pat. Motion sensing in music and dance performance. New York: Audio Engineering Society, pp. 165-172, 1987.

Ghazala, Qubais Reed. Circuit-bending and living instruments: The odor box. Experimental Musical Instruments, Vol. 8, Issue 2, Dec 1992, pp. 17-23. ISSN: 0883-0754.

Harada, Tsutomu; Sato, Akio; Hashimoto, Shuji; Ohteru, Sadamu. Real time control of 3D sound space by gesture. San Francisco: International Computer Music Association, pp. 85-88, 1992.

Hobus, Steffi. Wittgenstein uber Expressivitat: Der Ausdruck in Korpersprache und Kunst. [Wittgenstein on expressivity: Expression in body language and art.] Hannover: Internationalismus-Verlag, 211 p. (PhD diss., U. Bielefeld, 1994). ISBN: 3-922218-67-9.

Keislar, Douglas [Editor]. Computer Music Journal XXVI/3 (fall 2002): New performance interfaces. Computer Music Journal, Vol. 26, Issue 3, fall 2002, pp. 11-76. ISSN: 0148-9267.

Keislar, Douglas [Editor]. Computer Music Journal XXV/4 (winter 2001): Sound in space. Computer Music Journal, Vol. 25, Issue 4, winter 2001, pp. 21-90. ISSN: 0148-9267.

Logemann, George W. Experiments with a gestural controller. San Francisco, CA: Computer Music Association, pp. 184-185, 1989.

Manzolli, Jonatas. The development of a gesture interfaces laboratory. Porto Alegre, Brazil: Sociedade Brasileira de Computacao, pp. 81-84, 1995.

Modler, Paul. Interactive control of musical structures by hand gestures. Porto Alegre, Brazil: Sociedade Brasileira de Computacao, pp. 143-150, 1998.

Moog, Robert A. Position and force sensors and their applications to keyboards and related control devices. New York: Audio Engineering Society, pp. 173-181, 1987.

Mulder, Axel. Getting a grip on alternate controllers: Addressing the variability of gestural expression in musical instrument design. Leonardo Music Journal, Vol. 6, 1996, pp. 33-40. ISSN: 0961-1215.

Mulder, Axel. Sound sculpting: Performing with virtual musical instruments. Porto Alegre, Brazil: Sociedade Brasileira de Computacao, pp. 151-164, 1998.

Povall, Richard. Compositional methods in interactive performance environments. Journal of New Music Research, Vol. 24, Issue 2, June 1995, pp. 109-120. ISSN: 0929-8215.

Rubine, Dean; McAvinney, Paul. Programmable finger-tracking instrument controllers. Computer Music Journal, Vol. 14, Issue 1, spring 1990, pp. 26-41. ISSN: 0148-9267.

Rubine, Dean; McAvinney, Paul. The VideoHarp. Koln, Germany: Feedback Studio, pp. 49-55, 1988.

Settel, Zack; Holton, Terry; Zicarelli, David. Remote control applications using "smart controllers" in versatile hardware configurations. San Francisco: International Computer Music Association, pp. 156-159, 1993.

Tanaka, Atau. Musical technical issues in using interactive instrument technology with application to the BioMuse. San Francisco: International Computer Music Association, pp. 124-126, 1993. Abstract: The BioMuse is a bioelectrical musical controller used in composition and concert performance. (author)

Tobenfeld, Emile. A system for computer assisted gestural improvisation. San Francisco: International Computer Music Association, pp. 93-96, 1992.

Ungvary, Tamas; Vertegaal, Roel. Designing musical cyberinstruments with body and soul in mind. Journal of New Music Research, Vol. 29, Issue 3, Sept 2000, pp. 245-255. ISSN: 0929-8215.

Wanderley, Marcelo M.; Battier, Marc; Depalle, Philippe; Dubnov, Shlomo; Hayward, Vincent. Gestural research at IRCAM: A progress report. Publications du Laboratoire de Mecanique et d'Acoustique, Issue 148, May 1998, pp. D2.1-D2.8. ISSN: 1159-0947.

Wessel, David L. Instruments that learn, refined controllers, and source model loudspeakers. Computer Music Journal, Vol. 15, Issue 4, winter 1991, pp. 82-86. ISSN: 0148-9267.