Biol. Cybern. 81, 39–60 (1999)

Computational nature of human adaptive control during learning of reaching movements in force fields

Nikhil Bhushan, Reza Shadmehr

Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21205, USA

Received: 01 January 1998 / Accepted in revised form: 26 January 1999

Correspondence to: R. Shadmehr, Department of Biomedical Engineering, Johns Hopkins School of Medicine, 720 Rutland Ave/419 Traylor, Baltimore, MD 21205-2195, USA (e-mail: [email protected] Tel.: +1-410-614-2458, Fax: +1-410-614-9890)


Abstract. Learning to make reaching movements in force fields was used as a paradigm to explore the system architecture of the biological adaptive controller. We compared the performance of a number of candidate control systems that acted on a model of the neuromuscular system of the human arm and asked how well the dynamics of the candidate system compared with the movement characteristics of 16 subjects. We found that control via a supra-spinal system that utilized an adaptive inverse model resulted in dynamics that were similar to those observed in our subjects, but lacked essential characteristics. These characteristics pointed to a different architecture where descending commands were influenced by an adaptive forward model. However, we found that control via a forward model alone also resulted in dynamics that did not match the behavior of the human arm. We considered a third control architecture where a forward model was used in conjunction with an inverse model and found that the resulting dynamics were remarkably similar to those observed in the experimental data. The essential property of this control architecture was that it predicted a complex pattern of near-discontinuities in hand trajectory in the novel force field. A nearly identical pattern was observed in our subjects, suggesting that generation of descending motor commands was likely through a control system architecture that included both adaptive forward and inverse models. We found that as subjects learned to make reaching movements, adaptation rates for the forward and inverse models could be independently estimated and the resulting changes in performance of subjects from movement to movement could be accurately accounted for. Results suggested that the adaptation of the forward model played a dominant role in the motor learning of subjects. After a period of consolidation, the rates of adaptation in the internal models were significantly larger than those observed before the memory had consolidated. This suggested that consolidation of motor memory coincided with freeing of certain computational resources for subsequent learning.

1 Introduction

The electric fish relies on its ability to sense the weak electrical fields generated by other animals to identify prey and predator. Unfortunately, this field is influenced not only by the motion of other animals, but also by the motion of the electric fish itself. Therefore, the animal needs to be able to take into account self-generated changes to the surrounding electric field and subtract them from the measured field in order to estimate the external world. The problem might be solved if the animal could send signals from the motor centers to the sensory areas and provide a negative image of the predicted sensory changes that should be detected as a consequence of the just-programmed motor output. Such a negative image would then add to the actual sensory input and result in an estimate of the external world (Sperry 1950). In fact, such signals have been detected in the electric fish (Bell 1981), shown to be adaptable (Montgomery and Bodznick 1994), and coded in a cerebellum-like structure of the animal (Bell et al. 1997).

The computation involved in predicting sensory consequences of a motor command, as exemplified by the electric fish, is termed a forward model¹ (Jordan and Rumelhart 1992). There are now a number of studies that have suggested that a forward model may also be used by the human central nervous system (CNS) to estimate sensory consequences of motor actions (Wolpert et al. 1993, 1995; Flanagan and Wing 1997). For example, during precision grasping of a small object, the grip forces change in concert with load forces that act to move the object (Johansson and Westling 1984; Flanagan and Wing 1997), even if the load forces are generated by one hand and the grip forces are generated by the other (Blakemore et al. 1998). Due to the delay inherent in the sensory-motor loop (Johansson and Westling 1984; Blakemore et al. 1998), a close synchrony between grip and load forces is possible only if the brain could predict the motion-dependent nature of the load forces from a forward model of the dynamics of the limb and the load.

¹ In the control literature, this type of model is called an observer.

If a forward model of the dynamics of the limb is available to the CNS, then an interesting use of such a model might be to provide a means by which the CNS could gauge the consequences of the just-programmed motor commands without having to wait for the arrival of the corresponding sensory signals (Miall et al. 1993; Darlot et al. 1996; Miall and Wolpert 1996; Ronco 1998). Gauging, in this case, means a comparison of the predicted sensory consequences of the motor commands with the desired behavior. This comparison would allow for estimation of an error signal, i.e., a measure of how far the arm is predicted to be from a desired state. Motor commands can then be modified to reduce this predicted error in advance of the corrections that would be possible if the brain had to wait for the transmission-delayed afferent information from the moving arm, delays which may exceed 100 ms (Lee and Tatton 1975; Cordo et al. 1994).

In the control literature, such ideas are found, for example, in Luenberger observers and Smith predictors, which are forward models that are used for control of linear systems with time delays (Astrom and Wittenmark 1984), and in metric observers for control of non-linear systems (Lohmiller and Slotine 1996). Whether the dynamics of the time-delayed system is linear or non-linear, the idea remains the same: use the forward model to feed back the predicted response of the remote system immediately, as well as the error in this prediction when the real response becomes available via the transmission channel. Obviously, stability of these systems will strongly rely on the accuracy of the forward model and the ability to cancel the response from the remote system. If the human CNS uses a forward model to control the arm, then by manipulating the dynamics of the arm one can introduce a condition where the hypothetical forward model would be grossly inappropriate for the task. Here, we take this approach and ask how the resulting behavior of the human arm compares with that which would be expected if a forward model was being used for programming the descending motor commands.
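The predictor idea described above can be illustrated with a minimal sketch. It assumes a scalar, discrete-time linear plant rather than the arm model of this paper; the names `simulate`, `a_plant`, `a_model`, the five-step delay, and the feedback gain of 0.5 are all hypothetical choices made for illustration. The forward model takes the delayed measurement and rolls it forward through the commands issued since then, so the controller acts on an estimate of the current state.

```python
from collections import deque


def simulate(a_plant, a_model, b=1.0, delay=5, steps=40):
    """Return the largest |true state - predicted state| seen during the run.

    Plant (assumed for illustration): x[k+1] = a_plant * x[k] + b * u[k].
    The controller only sees x[k - delay], and uses a forward model with
    parameter a_model to estimate x[k].
    """
    x = 0.0
    states = deque([0.0] * (delay + 1), maxlen=delay + 1)  # x[k-delay] .. x[k]
    commands = deque([0.0] * delay, maxlen=delay)          # u[k-delay] .. u[k-1]
    worst = 0.0
    for _ in range(steps):
        delayed_x = states[0]  # the measurement that arrives now
        # Forward model: replay the stored commands on top of the delayed state.
        x_hat = delayed_x
        for u_past in commands:
            x_hat = a_model * x_hat + b * u_past
        worst = max(worst, abs(x - x_hat))
        u = 0.5 * (1.0 - x_hat)  # feedback on the predicted state, target = 1
        x = a_plant * x + b * u
        states.append(x)
        commands.append(u)
    return worst


error_accurate_model = simulate(a_plant=0.9, a_model=0.9)
error_wrong_model = simulate(a_plant=0.9, a_model=0.7)
```

With an accurate model the prediction tracks the true state, while a mismatched model produces a prediction error. That mismatch is the situation the force-field manipulation is meant to create in the experiments.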

While the use of a forward model may be particularly relevant for feedback control of time-delayed systems, an alternate approach is through the use of an inverse model (Atkeson 1989; Kawato 1989; Shadmehr 1990; Gomi and Kawato 1992; Katayama and Kawato 1993; Schweighofer et al. 1998). Whereas a forward model predicts the sensory consequences of motor commands, an inverse model predicts motor commands that are appropriate for a desired behavior. Inverse models are generally not considered for control of time-delayed systems because they would seem to exclude the ability of the controller to respond to errors, resulting in an open-loop controller. However, as a recent approach has illustrated (Niemeyer and Slotine 1991), if a local feedback controller stationed at the remote system is available, then reacting to the delayed error information received by the up-stream controller may be possible in a way that does not result in instability. In the case of the human arm, such a local feedback controller is thought to be present in the form of spring-like muscles and spinal reflex loops (Massaquoi and Slotine 1996). Therefore, at least in theory, adaptive control of the arm in the face of a novel mechanical load may take place through learning of a forward, an inverse, or perhaps a mixture of both types of controllers. Do the data on the behavior of the human arm when coupled to a novel load allow us to differentiate between these possibilities?

In the current report we examine the behavior of the human arm as the hand is coupled to a novel dynamical system during generation of reaching movements. We consider a reasonably realistic model of the arm's inertial and muscle dynamics, local reflexes, and delays in the various communication channels. We initially consider control of the arm via an adaptive inverse model. We find that, while the simulation results with the inverse model resemble the actual behavior of our subjects, there are also important features in hand trajectories that cannot be explained despite systematic variations in the parameters of the model. These features suggest that the arm is being controlled by factors other than those accounted for in the adaptive inverse model controller. We next consider performance of the system under the control of an adaptive forward model and find certain limitations to this approach. We demonstrate that this controller also fails to precisely account for the characteristics of the biological controller. These simulations give insights into characteristics of these two approaches and suggest a third approach, one where there is an interaction between the inverse and forward models during the process of control. We demonstrate that the behavior of the human arm when coupled with a novel mechanical system is very similar to the behavior that results when the controller relies on a combination of forward and inverse models. It appears that the biological controller relies on state estimates from a forward model which, in turn, provide an error signal that through an inverse model modulates descending motor commands.

If the biological controller is composed of a combination of forward and inverse models, then it is important to ask whether the rates of learning of these two adaptive models are the same. Can we estimate the adaptation rate of each model from the actual performance of the human subjects? An intriguing idea put forth by Jordan and Rumelhart (1992) and others (Wada and Kawato 1993; Miall and Wolpert 1996) is that if a forward model is available, it can effectively serve as a model of the controlled system. Using the forward model, the brain may simulate dynamics of the controlled system during an "off-line" period and teach itself an inverse model. The learning that may take place during the off-line period may result in a fundamentally different control system than was apparent during the initial practice of the task. We are intrigued by this idea because it may be related to the functional (Brashers-Krug et al. 1996; Shadmehr and Brashers-Krug 1997) and neural changes (Shadmehr and Holcomb 1997, 1999) that we have observed during consolidation of motor memories. During the hours after completion of practice, the perturbation response characteristics of the human adaptive controller appear to gradually change, becoming remarkably more stable. Here, we explore this change in terms of the elements of a control system that includes adaptive forward and inverse models. We ask how the perturbation response of the arm should change as a function of adaptation rates in the forward and inverse models. We conclude with an estimate of the actual rates of adaptation for each model from data of 16 subjects that were examined during the period of consolidation.

2 Plant dynamics

Here, we consider the merits of various theoretical control mechanisms that may act on a time-delayed system resembling the human arm. The psychophysical data that will be used for this comparison were gathered from subjects that made reaching movements while holding a robotic manipulandum (Fig. 1). The movements were in the horizontal plane.

The inverse dynamics of the robot (while not being held by the subject) is described by:

$$T_r = D_r(\Phi)\,\ddot{\Phi} + C_r(\Phi,\dot{\Phi})\,\dot{\Phi}$$

$$D_r = \begin{bmatrix} k_{r1} & k_{r3}\cos(\phi_2 - \phi_1) \\ k_{r3}\cos(\phi_2 - \phi_1) & k_{r2} \end{bmatrix}, \qquad C_r = \begin{bmatrix} 0 & -k_{r3}\sin(\phi_2 - \phi_1)\,\dot{\phi}_2 \\ k_{r3}\sin(\phi_2 - \phi_1)\,\dot{\phi}_1 & 0 \end{bmatrix}$$

where $T_r$ is joint torques on the robot due to its motion and parameters $k_r$ are constants that depend on link lengths and mass distributions of the links that make up the robot. Similarly, the inverse dynamics of the human arm (not holding the robot) can be written as:

$$T_s = D_s(\Theta)\,\ddot{\Theta} + C_s(\Theta,\dot{\Theta})\,\dot{\Theta},$$

where matrices $D_s$ and $C_s$ are similar to those of the robot except that their parameters $k_s$ depend on mass distribution and lengths of the human arm. For the robot, $k_r = (0.3189, 0.0938, 0.1262)$ kg·m², and link lengths $r_1 = 0.46$ m, $r_2 = 0.44$ m. For the human arm, $k_s = (0.265, 0.052, 0.0844)$ kg·m², and link lengths $l_1 = 0.33$ m, $l_2 = 0.32$ m (Jordan et al. 1994).
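As a rough numerical illustration of the inverse dynamics above, the following sketch evaluates the inertia and Coriolis/centripetal matrices and the resulting joint torques. It assumes that the two angles are absolute link angles (which the $\phi_2 - \phi_1$ form suggests) and uses the parameter triples quoted in the text; the function names are hypothetical.

```python
import math

K_ARM = (0.265, 0.052, 0.0844)      # k_s from the text, kg*m^2
K_ROBOT = (0.3189, 0.0938, 0.1262)  # k_r from the text, kg*m^2


def inertia_matrix(k, q1, q2):
    """D(q) for a two-link planar mechanism with parameters k = (k1, k2, k3)."""
    k1, k2, k3 = k
    coupling = k3 * math.cos(q2 - q1)
    return [[k1, coupling], [coupling, k2]]


def coriolis_matrix(k, q1, q2, dq1, dq2):
    """C(q, dq) with the sign pattern given in the text."""
    s = k[2] * math.sin(q2 - q1)
    return [[0.0, -s * dq2], [s * dq1, 0.0]]


def inverse_dynamics(k, q, dq, ddq):
    """Joint torques T = D(q) * ddq + C(q, dq) * dq."""
    D = inertia_matrix(k, q[0], q[1])
    C = coriolis_matrix(k, q[0], q[1], dq[0], dq[1])
    return [
        D[i][0] * ddq[0] + D[i][1] * ddq[1] + C[i][0] * dq[0] + C[i][1] * dq[1]
        for i in range(2)
    ]


# Example: links at right angles, so the inertial coupling term vanishes.
torque = inverse_dynamics(K_ARM, q=(0.0, math.pi / 2), dq=(1.0, 2.0), ddq=(0.0, 0.0))
```

The same functions apply to the robot by passing `K_ROBOT`; only the parameter triple changes.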

At the interaction port at the hand, where a force transducer is housed, the two systems are coupled. The interaction force acting on the hand of the subject as a result of the motion of the robot is $F_x = -(J_r^T)^{-1} T_r$. If we express the dynamics of the robot, represented by $\Phi$, in terms of the kinematics of the human arm $\Theta$, and include the possibility that the robot may, in addition to its passive dynamics, impose an active force field $F(x,\dot{x})$ on the subject's hand, then the overall inverse dynamics of the system can be expressed as torques acting on the subject's arm:

$$T = A(\Theta)\,\ddot{\Theta} + B(\Theta,\dot{\Theta})\,\dot{\Theta} + J_s^T F(x,\dot{x}),$$

$$A = D_s + J_s^T (J_r^T)^{-1} D_r J_r^{-1} J_s, \tag{1}$$

$$B = C_s + J_s^T (J_r^T)^{-1} C_r J_r^{-1} J_s + J_s^T (J_r^T)^{-1} D_r J_r^{-1} \left(\dot{J}_s - \dot{J}_r J_r^{-1} J_s\right),$$

where $J_s = dx/d\Theta$ and $J_r = dx/d\Phi$ are the Jacobians of the arm and the robot, respectively.

We next modeled the dynamics of some of the muscles attached to the arm. The major muscles acting on the human arm during reaching movements in the current configuration include: anterior deltoid, posterior deltoid, brachialis/brachioradialis, triceps (short and long head) and biceps. To model these muscles, we considered a simplification that consisted of three muscle pairs acting around the shoulder, elbow and both joints, respectively. Since the extent of the reaching movements was small (10 cm), we assumed that the moment arms of the shoulder, elbow and double-joint muscles were constant with respect to the absolute shoulder angle, relative elbow angle, and the absolute elbow angle, respectively. The two muscles in each pair had a flexor-extensor configuration and were assumed to be identical to each other. The values for the moment arm (Murray et al. 1995) and maximum force (Karniel and Inbar 1997) of each muscle were estimated as follows: we assumed a moment arm of 5 cm for the anterior and posterior deltoids and 3 cm for all other muscles. $F_{max}$ was 800 N for the anterior and posterior deltoids, 700 N for brachialis and the short head of the triceps, and 1000 N for the biceps and long head of the triceps. These values are also provided in the appendix.

Fig. 1. Schematic of the experimental setup. The figure notes the position of the robot's base and end effector (handle) with respect to a coordinate system centered on the shoulder of the subject. The positions are the same as those used for actual experiments and simulations

We represented the force response of a single muscle fiber to an electrical impulse as the product of a normalized force response $h_i(t)$ and a constant $c$, where $\int_{0}^{\infty} h_i(t)\,dt = 1$. The net force $F(t)$, produced by a muscle composed of $n_m$ fibers, each having an activation-force impulse response $c_n h_i$ and receiving electrical impulse activity $A_n$, is:

$$F(t) = \sum_{n=1}^{n_m} \left(h_i * c_n A_n\right)(t) = \sum_{n=1}^{n_m} \int_{0}^{t} h_i(\tau)\, c_n A_n(t - \tau)\, d\tau = \left(h_i * \sum_{n=1}^{n_m} c_n A_n\right)(t),$$

where $*$ denotes the convolution operation. The maximum force $F_{max}$ that can be generated by the muscle is equal to the maximum value for $\sum_{n=1}^{n_m} c_n A_n$, because $\int_{0}^{\infty} h_i(t)\,dt = 1$. Therefore, if we define a mean normalized electrical activity $R = \sum_{n=1}^{n_m} c_n A_n / F_{max}$, then we can write

$$F(t) = F_{max}\,(h_i * R)(t) = F_{max}\,N(t).$$

$N(t)$ is the filtered normalized electrical activity to the muscle, equal to $(h_i * R)$ and having a value between 0 and 1. $N$ directly controls the force produced by the whole muscle and, hence, will be used as the variable to represent the central motor command to the muscle. The activation/force impulse response function of the muscle was modeled as:

$$h_i(t) = \frac{t^{3}\, e^{-t/0.01}}{6 \times 10^{-8}}. \tag{2}$$
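A quick numerical check can make the normalization of Eq. (2) concrete. The sketch below assumes time in seconds and a time constant of 0.01 s, as read from the exponent; the helper names `h` and `trapezoid` are hypothetical. Since $\int_0^\infty t^3 e^{-t/\tau}\,dt = 6\tau^4$, the denominator $6 \times 10^{-8}$ is exactly what makes the impulse response integrate to one for $\tau = 0.01$ s.

```python
import math

TAU = 0.01  # time constant in seconds, read off the exponent of Eq. (2)


def h(t):
    """Activation/force impulse response of Eq. (2)."""
    return t ** 3 * math.exp(-t / TAU) / 6e-8


def trapezoid(f, t_end, dt=1e-4):
    """Integrate f from 0 to t_end with the trapezoidal rule."""
    n = int(round(t_end / dt))
    total = 0.5 * (f(0.0) + f(n * dt))
    for i in range(1, n):
        total += f(i * dt)
    return total * dt


area = trapezoid(h, 0.5)          # close to 1: the filter is normalized
step_at_60ms = trapezoid(h, 0.06)  # response to a unit step in R after 60 ms
```

Setting the derivative of $t^3 e^{-t/\tau}$ to zero gives a peak at $t = 3\tau = 30$ ms, so a step change in $R$ takes several tens of milliseconds to appear in the filtered activation $N$.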

The results above are derived for an isometric muscle and hence $F$ is the isometric force produced by the muscle for a given neural activation. The force produced by an active muscle also depends on muscle length and velocity. The force/length relation was modeled as an active elastic element where stiffness changed proportionately with neural activation (Shadmehr and Arbib 1992). If we denote the length of the muscle as $x_m$ and the operating length of our isometric muscle as $x_{m0}$, then we can represent the length-modulated force output $F_a$ as

$$F_a = F + F\,C_m\,(x_m - x_{m0}).$$

The value of $C_m$ of each muscle was derived from measured joint stiffness of the arm in intact (Gomi and Kawato 1996) and deafferented subjects (Sanes and Shadmehr 1995). We had found that in patients with large fiber sensory neuropathy, stiffness of the arm was approximately 50% of the value that we had recorded in the normal population. We therefore assumed that the intrinsic stiffness of the muscles was 50% of that measured by Gomi and Kawato (1996), with the remaining stiffness contributed via a stretch reflex loop (to be described below).

We modeled the force-velocity relation of each muscle by a Hill parameterized model (Krylow et al. 1995):

$$F_t = \frac{b F_a + a\,\dot{x}_m}{b - \dot{x}_m}, \qquad \dot{x}_m \le 0 \ \text{(shortening)}$$

$$F_t = \frac{b' F_a - (a' + 2F_a)\,\dot{x}_m}{b' - \dot{x}_m}, \qquad \dot{x}_m > 0 \ \text{(lengthening)}$$

$$a' = -0.4\,F_a, \qquad b' = -b\,\frac{a' + F_a}{a + F_a}$$

$F_t$ is the velocity-modulated force of the muscle given a length-modulated force $F_a$ and muscle velocity $\dot{x}_m$. $a$ and $b$ are constants that govern viscosity of the muscle. The value of $b'$ is derived to ensure continuity at $\dot{x}_m = 0$. For estimating $a$ and $b$ for each muscle, we assumed $(b F_{max})/(a\,x_{m0}) = 10$ (Zajac 1989). This resulted in a muscle viscosity that was 15–20% of the muscle stiffness.
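The two branches of the force-velocity relation can be checked numerically. The sketch below is a direct transcription of the equations above, with hypothetical values for $F_a$, $a$ and $b$ (the text does not list them here), and it verifies that the choice of $b'$ makes both the force and its slope continuous where the branches meet at $\dot{x}_m = 0$.

```python
def hill_force(Fa, xdot, a, b):
    """Velocity-modulated muscle force Ft; xdot > 0 means lengthening."""
    if xdot <= 0.0:
        # Shortening branch (classic Hill hyperbola).
        return (b * Fa + a * xdot) / (b - xdot)
    a_prime = -0.4 * Fa
    b_prime = -b * (a_prime + Fa) / (a + Fa)
    # Lengthening branch.
    return (b_prime * Fa - (a_prime + 2.0 * Fa) * xdot) / (b_prime - xdot)


# Hypothetical values chosen only for illustration.
FA, A, B = 100.0, 25.0, 0.5  # N, N, m/s

EPS = 1e-6
slope_shortening = (hill_force(FA, 0.0, A, B) - hill_force(FA, -EPS, A, B)) / EPS
slope_lengthening = (hill_force(FA, EPS, A, B) - hill_force(FA, 0.0, A, B)) / EPS
```

With these equations the force falls to zero at a shortening velocity of $b F_a / a$ and approaches $1.6\,F_a$ for fast lengthening, which follows from $a' = -0.4\,F_a$.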

The overall input/output relation for the muscle, relating the filtered activation $N$ with the force $F_t$, can be represented by a function $f_M$:

$$F_t = F + K_m(F, x_m)\,x_m + B_m(F, x_m, \dot{x}_m)\,\dot{x}_m = F_{max} N + K_m(N, x_m)\,x_m + B_m(N, x_m, \dot{x}_m)\,\dot{x}_m = f_M(N, x_m, \dot{x}_m), \tag{3}$$

where $K_m$ represents the non-linear stiffness due to the force/length relationship and $B_m$ the non-linear viscosity due to the force/velocity relationship for the muscle.

We modeled the spinal component of the stretch reflex loop as a linear approximation of the non-linear model proposed by Gielen and Houk (1987). The activation to the muscle through the spinal reflex pathway, $N_r$, was based on reciprocal inhibition of a muscle pair and was the solution of the following simultaneous equations:

$$N_{r1} - N_{r2} = K_s\,(x_m - x_{ms}) + B_s\,(\dot{x}_m - \dot{x}_{ms})$$

$$N_{rc} = \sqrt{N_{r1} N_{r2}}$$

where $x_{ms}$, $\dot{x}_{ms}$ were the set-point muscle length and velocity, $K_s$ was the reflex stiffness, $B_s$ was the reflex viscosity, and $N_{rc}$ was the reciprocal inhibition constant. The ratio of $B_s$ to $K_s$ has been suggested to be approximately 0.1 (Gielen and Houk 1987). The reflex pathway was modeled with a time delay of 0.03 s (Rothwell 1990). The stiffness of this pathway, $K_s$, was set at 50% of the muscle stiffness derived from measures of Gomi and Kawato (1996). The muscle-load-spinal feedback system is summarized in Fig. 2.
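The two simultaneous reflex equations have a closed-form solution, sketched below. Writing $d$ for the right-hand side of the first equation and substituting $N_{r2} = N_{r1} - d$ into $N_{r1} N_{r2} = N_{rc}^2$ gives a quadratic whose positive root is used here. The numerical values of $K_s$, $B_s$ and $N_{rc}$ are hypothetical (only the ratio $B_s/K_s \approx 0.1$ comes from the text), and the 0.03 s pathway delay is not modeled in this fragment.

```python
import math


def reflex_activations(x, xdot, x_set, xdot_set, Ks, Bs, Nrc):
    """Solve Nr1 - Nr2 = d and sqrt(Nr1 * Nr2) = Nrc for non-negative Nr1, Nr2."""
    d = Ks * (x - x_set) + Bs * (xdot - xdot_set)
    root = math.sqrt(d * d + 4.0 * Nrc * Nrc)
    Nr1 = 0.5 * (root + d)
    Nr2 = 0.5 * (root - d)
    return Nr1, Nr2


# Hypothetical gains with Bs / Ks = 0.1; muscle stretched 5 cm past its set point.
Nr1, Nr2 = reflex_activations(
    x=0.05, xdot=0.0, x_set=0.0, xdot_set=0.0, Ks=2.0, Bs=0.2, Nrc=0.1
)
```

When the muscle pair sits at its set point ($d = 0$) both activations equal $N_{rc}$; a stretch raises one and lowers the other while their geometric mean stays fixed.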

The three pairs of muscles acting on the simulated arm are redundant for the purpose of joint torque generation. In order to assign a neural activation so that a desired torque could be generated, we made the further assumption that elbow torque was distributed equally between the single-joint and double-joint muscles and that, similarly, the shoulder torque was produced by an equal contribution from the single-joint and double-joint muscles.

To summarize, the essential components of the system are as follows:

1. The inertial dynamics of the arm are described via a two-joint planar model that interacts with a two-joint robotic manipulandum.

2. Muscles produce passive force via a zero-delay stiffness and viscosity mechanical response, and active force in response to neural activation via a transformation that is characterized by the impulse response of Eq. (2).

3. The spinal stretch reflex provides added stiffness and viscosity but at a time delay of 30 ms.

4. State of the arm, i.e., its position and velocity, is made available to the brain after a delay of 120 ms.

5. The motor commands have a delay of 60 ms from the brain to the generation of a measurable force in the muscle.

3 Forward and inverse models for control

Here, we consider a number of controllers and demonstrate their performance on the system described above. To simulate performance of the system under various controllers, we assumed that the desired trajectory for the arm is minimum jerk (Flash and Hogan 1985) with a movement time of 0.5 s to targets placed at 10 cm in eight equally spaced directions about a starting point. We further considered two conditions: first, movements in a null field, i.e., the subject's arm is unloaded; second, movements in a curl force field $F = B_i\,\dot{x}$, with the field described by $B_1 = \begin{pmatrix} 0 & 13 \\ -13 & 0 \end{pmatrix}$ N·s·m$^{-1}$ or $B_2 = -B_1$.
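The desired trajectory and the curl field can be written down in a few lines. The sketch below uses the standard minimum-jerk polynomial for a point-to-point movement (Flash and Hogan 1985) with the 0.5 s duration and 10 cm extent given above; the function names are hypothetical, and the field matrix is taken in units of N·s/m so that $F = B\dot{x}$ yields newtons.

```python
import math

B1 = ((0.0, 13.0), (-13.0, 0.0))                    # curl field, N*s/m
B2 = tuple(tuple(-v for v in row) for row in B1)    # opposite curl field


def min_jerk(t, duration=0.5, distance=0.10):
    """Distance travelled (m) and speed (m/s) along the line to the target."""
    s = min(max(t / duration, 0.0), 1.0)
    position = distance * (10 * s ** 3 - 15 * s ** 4 + 6 * s ** 5)
    speed = distance / duration * (30 * s ** 2 - 60 * s ** 3 + 30 * s ** 4)
    return position, speed


def target_directions(n=8):
    """Unit vectors for n equally spaced movement directions."""
    return [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
            for i in range(n)]


def curl_force(B, vx, vy):
    """Force applied by the field for hand velocity (vx, vy)."""
    return (B[0][0] * vx + B[0][1] * vy, B[1][0] * vx + B[1][1] * vy)


# Peak speed occurs at mid-movement; evaluate the field for a downward reach.
_, peak_speed = min_jerk(0.25)
fx, fy = curl_force(B1, 0.0, -peak_speed)
```

Because $B_1$ is antisymmetric, the force is always perpendicular to the hand velocity: the field pushes the hand sideways and does no work along the direction of motion.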

3.1 Feedforward control via inverse models

We initially consider a simple approach where assignments of neural activations are arrived at via an inverse muscle model so that at a desired position and velocity, the net torque acting on each joint is zero. In effect, we assign neural activations so that the equilibrium position of the system moves along the desired trajectory. The desired trajectory in hand coordinates was transformed to muscle space and the set point to the spinal feedback system was assigned to this desired trajectory. The schematic for this system is shown in Fig. 3, as is the performance of the system in the null field and force field $B_1$. While the system is stable and reaches the desired target, its performance only approximates the desired behavior (a minimum jerk trajectory to the target) because the controller does not attempt to compensate for the dynamics of the arm or the field. The fairly good performance of the system when it is not coupled to a force field (Fig. 3A) demonstrates that the CNS can do quite well in controlling the arm in unloaded situations without explicit compensation for the inertial dynamics of the limb. This is in agreement with a previous simulation result (Gribble et al. 1998). It is clear, however, that precise tracking of the desired trajectory requires compensation for the inertial dynamics of the arm and any loads that may be attached to it.

The performance of the feedforward approach can be improved if, in addition to an inverse muscle model, a model is available to compensate for the inertial dynamics of the limb. The schematic of such a controller is provided in Fig. 4. In this approach, the inverse model of the inertial dynamics of the limb transforms the desired trajectory ($x_{md}$, $\dot{x}_{md}$) into a desired joint torque $T_d$, which is then converted to the desired forces in the individual muscles $F_{td}$. The inverse muscle model then provides the motor commands $N$:

$$N = f_M^{-1}(F_{td}, x_{md}, \dot{x}_{md}).$$

This type of control system has been applied by a number of investigators to the problem of generating reaching movements (Atkeson 1989; Shadmehr 1990; Katayama and Kawato 1993; Stroeve 1997). The performance of this system in the null field and in field $B_1$ is shown in Fig. 4. By compensating for inertial dynamics of the arm, the hand almost precisely follows the desired trajectory in the null field. The reason for the small error is that the dynamics associated with the neural activation filter $h_i$ (for each muscle) cannot be exactly inverted and is approximated here. The corrective response of the system to unmodeled dynamics of the force field, as shown in the bottom right chart in Fig. 4, is due to the spring-like mechanical properties of the muscles and the feedback dynamics of the reflex loop.

3.2 Comparison with control of the human arm

How does the performance of the controller (Fig. 4) compare with the human arm? We will show that while there are significant similarities between the two, there are crucial differences which suggest that it is unlikely that the human arm is controlled only via this feedforward system. We trained 16 subjects in the null field (800 targets), then in field $B_1$ (576 targets). Subjects were then presented with field $B_2$ (384 targets). We had previously observed that, after about 200 targets, performance reached a plateau and movements converged to the trajectories that were observed before introduction of the force field (Shadmehr and Mussa-Ivaldi 1994). Movements were highly correlated to a minimum jerk trajectory (0.975 ± 0.004, mean correlation coefficient ± SD). We were interested in quantifying the performance of the subjects after adaptation to field $B_1$ in field $B_2$, i.e., the response to large changes in system dynamics.

Fig. 2. Block diagram of the muscle, load, spinal feedback system. $N_c$ is the descending neural command, $N_r$ is the contribution of the spinal stretch reflex, modeled as a linear feedback controller, and $\theta_{sp}$ is the set point for the reflex loop in joint coordinates. D indicates delays in the communication channels. $h_i$ is the filter transforming neural activation into activation dynamics of the muscle (Eq. 2)

In Fig. 5 we plotted the performance of a typical subject after she had extensively trained in field $B_1$, but was suddenly presented with field $B_2$. Initially, let us consider a single movement downward. This movement appears segmented, i.e., there are points where there are sudden changes in both the derivative of hand speed and the direction of hand velocity. We identified the segmentation points, $S_i$, by finding places where a local minimum in the hand speed profile coincided with a maximum in the derivative of direction of velocity signal.
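The segmentation criterion can be sketched as a short routine over sampled hand velocity. This is only one possible reading of the criterion: the coincidence window (`window`, in samples) and the minimum turning rate used to ignore numerical noise (`min_turn_rate`, in rad/s) are assumptions introduced here, not values from the text, and the function name is hypothetical.

```python
import math


def segmentation_points(vx, vy, dt, window=1, min_turn_rate=1.0):
    """Indices where a local speed minimum coincides with a peak in turning rate."""
    n = len(vx)
    speed = [math.hypot(vx[i], vy[i]) for i in range(n)]
    angle = [math.atan2(vy[i], vx[i]) for i in range(n)]

    def wrapped(d):
        # Map an angle difference into [-pi, pi).
        return (d + math.pi) % (2.0 * math.pi) - math.pi

    # Magnitude of the central-difference derivative of velocity direction.
    turn = [0.0] * n
    for i in range(1, n - 1):
        turn[i] = abs(wrapped(angle[i + 1] - angle[i - 1])) / (2.0 * dt)

    turn_peaks = [
        i for i in range(2, n - 2)
        if turn[i] > min_turn_rate and turn[i] > turn[i - 1] and turn[i] >= turn[i + 1]
    ]
    points = []
    for i in range(1, n - 1):
        is_speed_min = speed[i] < speed[i - 1] and speed[i] <= speed[i + 1]
        if is_speed_min and any(abs(i - j) <= window for j in turn_peaks):
            points.append(i)
    return points
```

On real data the velocity signal would normally be low-pass filtered first, since both local extrema and numerical derivatives are sensitive to measurement noise.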

To compare the subject's data with that of our controller (Fig. 4), we assume that, after extensive training in field $B_1$, performance of the subject has improved due to adaptation of the inverse model. When the inverse model is of field $B_1$ and the system is coupled to field $B_2$, the resulting simulated hand trajectory is as shown in Fig. 5A. While both the subject (Fig. 5B) and the controller show stability and arrive at the target, the behavior of the two cases is very different about the segmentation points. Some of the parameters that identify this behavior are labeled in Fig. 5B. These parameters are: $k_i$ (angles of the hand's trajectory about a segmentation point), $d_i$ (distance between segmentation points $i-1$ and $i$), $t_i$ (time between segmentation points $i-1$ and $i$), $|v|_i$ (hand speed at segmentation point $i$), and $N_s$ (number of segmentation points in the entire trajectory).

In Fig. 6, we have quantified these parameters in all 16 of our subjects for their first downward movement in field $B_2$. We chose this direction of movement for illustration of results because, due to the shape of the arm's stiffness (Mussa-Ivaldi et al. 1985), the effect of perturbing forces is strongest for this direction of movement, resulting in the greatest position errors. This figure also shows the performance of the controller using a sensitivity-based approach where model parameters were varied. We considered a ±15% change in muscle viscosity, a ±15% change in inertia of the arm and link lengths, as well as a ±50% change in feedback gains and stiffness of the arm. We further assumed a ±10% variation in desired movement time, based on peak speed measurements for subjects. We found that, for parameters that described the behavior of the arm up until the first segmentation point, $k_1$, $d_1$, and $t_1$, performance of the system of Fig. 4 almost exactly matched that of the subjects. However, beyond the first segmentation point the controller of Fig. 4 could not account for the experimental data.

Fig. 3A,B. Block diagram illustrating a feedforward controller that utilizes an inverse muscle model for assignment of neural activations to the muscles, without accounting for dynamics of the limb. A1 and B1 show simulation results of hand trajectories in a null field and in force field B1, respectively. All movements are center-out. The dotted straight line is an ideal minimum-jerk motion while the other paths are the outputs of the controller. A2 and B2 show velocity of the hand for a movement to the bottom-most target (at −90°). The gray line is velocity in the direction parallel to the direction of target (i.e., along the y-axis of Fig. 1), and the black line is the velocity in a direction perpendicular to that of target (i.e., along the x-axis of Fig. 1)


Based on this result, it appears that while the oscillations observed in the real arm after the first segmentation point are partly due to the intrinsic visco-elastic properties of muscles and the spinal loops, the behavior of the arm cannot be explained solely by this feedback system. In particular, note that at the first segmentation point, the arm does not move toward the target, but, on average, at $\lambda_2 = 17°$ away from the target. Further note that the real arm has higher frequencies in its response to unmodeled dynamics than our controller (compare second row of Fig. 5). We will show that this is precisely the behavior that one would expect if the controller used a forward model to predict the state of the arm from time-delayed sensory feedback, and then acted to reduce that estimated error via an inverse-model-based controller.

3.3 Feedback control via a forward model

A forward model refers to a hypothetical computational network that can predict the change in the state of the arm from inputs that provide a copy of the descending motor command and an estimate of the current state. If the distal system is linear, the forward model may be embedded in a Smith predictor (Astrom and Wittenmark 1984; Miall et al. 1993) to predict the state of the arm at the current time, given delayed feedback about the actual state and the motor commands up to the current time. Alternatively, if the distal system is non-linear, as is the case here, a non-linear observer must be formulated to exactly model the forward dynamics of the system. Let us express the inertial dynamics of the arm and the dynamics of the muscles by a function $f_p$ (Fig. 4) as follows:

$$
\Sigma:\quad \ddot{x}(t) = f_p\big(N(t), x(t), \dot{x}(t)\big), \qquad y(t) = [x, \dot{x}](t - t_0)
$$

where $N$ is the motor command input to the system, $x$ is the position of the system, and $y$ is the measured output of the system, i.e., delayed position and velocity of the limb. We have to design an observer that can estimate the current state $x(t)$ from the delayed state $x(t - t_0)$, given a particular history of descending motor commands $N_C(t)$. The observer design is as follows:

Fig. 4A,B. Block diagram illustrating a feedforward controller that utilizes an inverse dynamics model of the inertial properties of the limb, as well as an inverse muscle model, for assignment of neural activations to the muscles. A1 and B1 show hand trajectory of the system in a null field and in field B1. A2 and B2 show velocity of the hand for a movement to the bottom-most target. The gray line is velocity along the y-axis, and the black line is the velocity along the x-axis


Fig. 5A–C. Trajectories are in field B2 after subject and controllers had adapted to field B1. A Hand trajectories for the controller of Fig. 4, where only an inverse model is used. B Trajectories for a typical subject. C Trajectories for a controller corresponding to Fig. 11 (switch set to 1) which used a forward model in conjunction with an inverse model. First row, hand paths for eight movement directions. Second row, velocity along the y-axis (gray line, parallel to the direction of target) and x-axis (black line, perpendicular to the direction of target) for a movement toward a target at −90°. Third row, hand speed and segmentation points $S_i$ for a movement toward −90°. Fourth row, derivative of velocity direction and corresponding segmentation points for a movement toward −90°. Fifth row, segmentation of the hand's trajectory

Fig. 6. Trajectory characteristics during a reaching movement toward the bottom-most target (−90°) for 16 subjects in force field B2 after adaptation to field B1 (middle bar, dark gray). We have also plotted the results of 29 simulations of the inverse model controller (light gray, corresponding to the controller in Fig. 4) and 35 simulations of the forward-inverse model feedback controller (black, corresponding to the controller in Fig. 11, switch set to 1) for the same movement. The trajectory parameters refer to the segmentation shown in Fig. 5. $\lambda_i$ is the angle about a segmentation point, $t_i$ is the time to reach the $i$th segmentation point, $d_i$ is the distance to the $i$th segmentation point, $|v|_i$ is the hand speed at the segmentation point, and $N_s$ is the number of segmentation points in the trajectory. The value printed at the top of each bar triplet is the value at the mean for the highest bar in the triplet. Note that for $\lambda_1$, $d_1$, and $t_1$, i.e., the initial part of the movement, performance of both controllers closely matched that of the experimental data. However, in later stages of the movement only the forward-model-based controller of Fig. 11 continued to accurately predict the experimental data


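$$
\Sigma_0:\quad
\begin{aligned}
\ddot{\tilde{x}}(t) &= \hat{f}_p\big(N_C(t), \tilde{x}(t), \dot{\tilde{x}}(t)\big) \\
[\tilde{x}, \dot{\tilde{x}}](t - t_0) &= y(t) \\
\dot{\tilde{x}}(t - t_0 + i\Delta) &= \dot{\tilde{x}}\big(t - t_0 + (i-1)\Delta\big) + \int_{t - t_0 + (i-1)\Delta}^{t - t_0 + i\Delta} \ddot{\tilde{x}}(T)\, dT \\
\tilde{x}(t - t_0 + i\Delta) &= \tilde{x}\big(t - t_0 + (i-1)\Delta\big) + \int_{t - t_0 + (i-1)\Delta}^{t - t_0 + i\Delta} \dot{\tilde{x}}(T)\, dT \\
\hat{x}(t) &= \tilde{x}(t), \qquad \hat{\dot{x}}(t) = \dot{\tilde{x}}(t) \\
i &= 1, \ldots, t_0/\Delta
\end{aligned}
$$

where $\hat{x}$ and $\hat{\dot{x}}$ are the outputs of the forward model, and $\tilde{x}$ and $\dot{\tilde{x}}$ are intermediate variables used by the forward model. The above equations represent the iterative solution of the non-linear differential equation $f_p$ at time $t$, given the initial state of the system $y$ and the input $N_C$ during the time interval $t - t_0$ to $t$. $\Delta$ is the discretized iteration time interval, which should ideally be infinitesimally small. The value of $\Delta$ can be determined from the frequency response of the system and, for simulations in the current study, was chosen to be 0.004 s.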
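The observer iteration amounts to integrating the forward model from the delayed measurement up to the present, using the stored history of motor commands. The following Python sketch shows one way this could be done with explicit Euler steps of length Δ. The function name `predict_current_state`, the scalar state, and the toy dynamics `f_hat` (a unit mass with a spring and damper) are hypothetical stand-ins; the actual arm and muscle model is not reproduced here.

```python
def predict_current_state(x_delayed, v_delayed, commands, f_hat, delta=0.004):
    """Estimate the current state from a delayed measurement.

    x_delayed, v_delayed: measured position and velocity at time t - t0.
    commands: motor commands N_C issued between t - t0 and t, one per step.
    f_hat: forward model returning acceleration from (command, position, velocity).
    delta: integration step in seconds (0.004 s in the simulations described).
    """
    x, v = x_delayed, v_delayed            # intermediate variables start at y(t)
    for n_c in commands:                   # one pass per step i = 1 ... t0/delta
        a = f_hat(n_c, x, v)               # acceleration predicted by the model
        x, v = x + v * delta, v + a * delta  # explicit Euler update
    return x, v                            # observer output: estimate at time t


# Hypothetical toy plant: unit mass pulled toward the commanded position.
def f_hat(n_c, x, v, stiffness=400.0, damping=40.0):
    return stiffness * (n_c - x) - damping * v
```

Each Euler step depends on the result of the previous one, which is why a network implementation needs a cascade of copies of the forward model rather than a single evaluation.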

A network to implement the forward-model-based observer is presented in Fig. 7. It requires multiple copies of the forward model because of the iterative nature of the method that must be used to arrive at the solution to the nonlinear differential equation describing the dynamics of the muscles/limb/load. As is evident from this figure, formulating an accurate forward model is a surprisingly intensive process. This is because dynamics of a non-linear system are state dependent and arrival of delayed sensory feedback triggers a re-estimation of the current state (see footnote 2). Here we arbitrarily represented the computation time as an 8 ms delay in the control system. The effect of this delay is to impose a limit on the gain of the feedback loop that will act on the output of the forward model.

For simulations of the forward model (observer) presented here, $t_0$ in the description of $\Sigma_0$ has a value of 210 ms. This is the look-ahead period for which the forward model is integrated. This value represents the sum of the 150-ms delay in receiving sensory feedback from the limb and the 60-ms delay in the transmission of descending motor commands to the muscles, including the delay in activation force response of the muscle. The output of the observer represents the best knowledge of the current state of the hand. Three different modalities of control, based on three coordinate systems in which the observer might estimate the state of the arm, are considered here. First, we consider control via an observer that estimates muscle positions and velocities. Second, we consider an observer in joint coordinates. Third, we consider an observer in Cartesian coordinates of the hand.

Consider the control system of Fig. 8 with the switch set to position 2. The forward model provides an estimate of the current muscle lengths and velocities based on a copy of descending commands and time-delayed sensory feedback. This estimate is compared with the desired trajectory in muscle space, resulting in an error estimate. A command $N_C$ is computed as:

$$
N_C = K_p (x_m - x_{md}) + K_v (\dot{x}_m - \dot{x}_{md}),
$$

which acts as a linear error feedback controller. When the forward model perfectly describes the dynamics of the muscle/arm system, and when the gains $K_p$ and $K_v$ are sufficiently large, the effect of the forward model coupled with the linear feedback controller is to approximate an inverse model of the plant. To see this, consider the very simple linear system of Fig. 9. Here, we wish to control dynamics of a system, specified by $G$, via a forward model $\hat{G}$. Note that $y/x = 1/(1 + \hat{G})$. Therefore, $y/x \approx \hat{G}^{-1}$ when $\hat{G} \gg 1$, i.e., an inverse model of the plant.
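As a rough check on how large the loop gain must be, the closed-loop relation of Fig. 9 can be rewritten to expose the relative error of the inverse approximation. This is a short worked step that assumes a positive scalar gain, written here as $\hat{G}$ for the forward model; the numerical values are illustrative only.

```latex
\frac{y}{x} \;=\; \frac{1}{1+\hat{G}}
        \;=\; \hat{G}^{-1}\,\frac{\hat{G}}{1+\hat{G}}
        \;=\; \hat{G}^{-1}\left(1 - \frac{1}{1+\hat{G}}\right)
```

The loop therefore falls short of the exact inverse by a fraction $1/(1+\hat{G})$: about 9% for $\hat{G} = 10$ and about 1% for $\hat{G} = 100$. This is why a close approximation to the inverse requires high gain.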

In the case of Fig. 8, where the system is non-linear, the main question is with regard to the gains $K_p$ and $K_v$ in the linear feedback loop. The gains need to be high if we wish to closely follow the desired trajectory, but as the gains increase, the system edges closer to instability. This is for two reasons. First, there is a small delay in the computations of the forward model and, therefore, in the feedback loop. Second, with high gains, the system becomes very sensitive to unmodeled dynamics, i.e., it will easily become unstable when the arm is coupled to an unknown load. In contrast, if the gains are low, the system barely follows the desired trajectory but is robust to unmodeled dynamics. In examining this control system, we found that even with a perfect forward model, no combination of gains on the linear feedback system could be found so that the system closely followed the desired trajectory and remained stable in field B1.

Fig. 7. Block diagram showing how a forward model $\hat{f}_p$ of a non-linear system $f_p$ can be used to construct an observer for a time-delayed nonlinear system where the state at time $t$ is estimated from the measured state at time $t - t_0$ and the orderly cascade of descending commands $N_C$ from time $t - t_0$ to $t$

2 To reduce computational complexity, one approach is to re-estimate the current state $\hat{x}(t)$ only when there is a larger than threshold difference between $\hat{x}(t - t_0)$ and $x(t - t_0)$. Effectively, this changes the feedback system through the forward model to an intermittent feedback controller (Ronco 1998). Application of a similar idea to tracking movements has recently been demonstrated (Hanneton et al. 1997).


The main problem with the control scheme of Fig. 8 (switch set to 2) is that even with a perfect forward model, the transformation from an error in position to neural commands is a non-linear map that depends on the inertial dynamics of the arm and force-activation dynamics of the muscles. The performance of this system could be improved if an inverse muscle model was available to transform the error commands into neural activations. This control system is shown in Fig. 10, where the forward model estimates the state of the limb in joint coordinates. We arrived at the gains $K_p = 30$ N/rad and $K_v = 3$ N·s⁻¹·rad⁻¹ in the linear error feedback loop of Fig. 10 by initially setting the forward model to approximate the dynamics of the arm/muscle in the null field, then finding the gains that made the system marginally stable in field B1. $K_p$ and $K_v$ were set at 50% of this value. Performance of the resulting control system is shown in Fig. 10 for three conditions: first, when the forward model is expecting a null field and the arm moves in the null field; second, when the forward model is expecting a null field but the arm moves in field B1; third, when the forward model is expecting B1 and the arm moves in field B1. The results demonstrate that even with a perfect forward model, because of the trade-off between gain and susceptibility to unmodeled dynamics, the arm trajectories in B1 are far from the desired trajectory.

Similar results were found if the estimate of error in position for the forward model was transformed to neural commands through the use of both an inverse limb model and an inverse muscle model. The control system is shown in Fig. 11 (switch set to 2), where now the forward model estimates the state of the limb in hand coordinates. We again found that with the gain on the feedback loop set to a high level, the system closely followed the desired trajectory, but was very sensitive to unmodeled dynamics. As a compromise, we set the gain at a level that was half as high as would make the system marginally stable. This resulted in a system that was stable in field B1 but did not closely follow the desired trajectory when the forward model correctly estimated the state of the limb.

In summary, if our control system is driven by only an error signal from the forward model, then several factors limit the gain of the feedback loop, which in turn prevents the system from closely following the desired trajectory even when the forward model is accurate. These factors include the non-zero computational time of the forward model (here assumed to be 8 ms), and the desire to keep the system stable when the arm is in contact with an unknown load.

3.4 Control using feedforward and feedback pathways

We next considered the performance of a system where both a feedforward pathway (consisting of the inverse model) and a feedback pathway (via the forward model) were used to generate descending motor commands. This corresponds to the case where the switch is set to position 1 in Figs. 8, 10, and 11. In this approach, the role of the forward model is not to provide the driving input to the system, but to respond only if there are dynamics in the distal plant that are not compensated for in the actions of the inverse models. We tested each controller with the gains $K_p$ and $K_v$ unchanged from above. Obviously, when the inverse model is perfect, the forward model's pathway makes no contribution to the system and the arm moves along the desired trajectory.

Fig. 8. A control method that uses a forward model in muscle coordinates. The switch is in position 2 for feedback control using only the forward model, and in position 1 when a forward model is used in conjunction with an inverse model

Fig. 9. A simple linear system that uses a forward model $\hat{G}$ of system dynamics $G$. Note that $y/x = 1/(1 + \hat{G}) \approx \hat{G}^{-1}$, i.e., the feedback loop effectively approximates an inverse of the plant dynamics $G$


However, the main question is, how does the system behave when there are unmodeled dynamics? To explore this, we re-examined the data from our subjects during the condition where they had trained in field B1, but were suddenly presented with B2. This presents the largest error in expected dynamics because B2 = −B1, providing us with the best opportunity to ask whether the behavior of the biological controller could be explained by the influence of the forward model.

The data from a typical subject was shown in Fig. 5 and we had previously concluded that performance of the controller in Fig. 4 could not account for this behavior. The resulting trajectories for the controller of Fig. 11 (switch set to 1) are shown in Fig. 5C. Here, we assumed that after training in B1, both the forward and inverse models accurately represented the dynamics of B1. Without any modification to the parameters of the system, we found a remarkable similarity between the actual and simulated trajectories when the field was changed to B2. In particular, note the higher frequencies in the response of the system (second row of Fig. 5C) and the behavior about the segmentation points. We performed a sensitivity analysis by varying the parameters of the model: we considered a ±15% change in muscle viscosity, a ±15% change in inertia of the arm and link lengths, as well as a ±50% change in feedback gains (both the spinal loop and the linear controller attached to the forward model) and stiffness of the arm. We further assumed a ±10% variation in desired movement time based on peak speed measurements for subjects. The results of this approach were quantified via the effect of model parameter variations on movement parameters, and are summarized in Fig. 6. This figure allows for a comparison of the data from all our subjects with the simulation results. Every parameter appears to be accurately predicted by the behavior of the control system in Fig. 11 (switch at position 1). Similar results were found when the control architectures in Figs. 8 or 10 were used. For this reason, for the remainder of this report we will concentrate on the system of Fig. 11 as a prototype.

Fig. 10A–C. A control scheme that uses a forward model in joint coordinates. The switch is in position 2 for control via only the forward model (FM), and in position 1 for control via both the forward and inverse models (IM). Trajectories are for switch in position 2. In A1 and A2, FM expects null field, arm moves in the null field. In B1 and B2, FM expects null field, arm moves in force field B1. In C1 and C2, FM expects B1, arm moves in B1. Hand paths are represented as dots at 20-ms intervals (the straight path is the desired trajectory). Hand velocities are for a downward movement. Gains on the linear feedback error controller, $K_p = 30$ N/rad and $K_v = 3$ N·s⁻¹·rad⁻¹, were set at 50% of the value for which the system was marginally stable. Even with a perfect forward model, the system is not able to follow the desired trajectory

Why does the arm behave as it does about the segmentation points? When only an inverse model is available (Fig. 4), the muscles and reflex pathways are programmed based on the desired trajectory of the arm. This implies that the equilibrium position for the muscles is the desired target at $t = 500$ ms and the corrective action for $t > 500$ ms is like a visco-elastic system pulling the arm directly toward the target. However, when a forward model is used in conjunction with the inverse model (Fig. 11), the descending neural commands rely on the estimated trajectory instead of the desired trajectory for the system. Furthermore, the corrective actions taken by the system rely on both the spinal loop/muscle visco-elastic properties, and the predicted error signal from the forward model. In this situation, there are three reasons for the behavior about the segmentation points in Fig. 5C: first, the external force field B2 pushes the hand in an anti-clockwise direction; second, the forward model incorrectly anticipates a clockwise field B1 and generates position estimates accordingly; third, the inverse model incorrectly generates additional torques in the anti-clockwise direction to counteract the clockwise field B1. Therefore, both the wrong inverse and forward models are contributing to the behavior about the segmentation points. Which is more important?

Fig. 11A–C. A control scheme that uses a forward model in hand coordinates. The switch is in position 2 for feedback control via the forward model, and in position 1 for control via both the forward and inverse models. Simulated trajectories for switch in position 2. In A1 and A2, FM expects null field, arm moves in the null field. In B1 and B2, FM expects null field, arm moves in force field B1. In C1 and C2, FM expects B1, arm moves in B1. Gains on the linear feedback error controller, $K_p = 500$ s⁻² and $K_v = 50$ s⁻¹, were set at 50% of the value for which the system was marginally stable. Even with a perfect forward model, the system is not able to follow the desired trajectory

To assess the relative role of forward and inverse models in the controller of Fig. 11, simulations in force field B2 were carried out for three conditions: first, we assumed that after training in B1, only the forward model might have adapted to field B1, i.e., the inverse model continued to expect a null field; second, we assumed that only the inverse model might have adapted to B1, i.e., the forward model expected a null field; third, both models had adapted to B1. The results of simulations in field B2 are shown in Fig. 12. We note that if, after training, only the inverse model has adapted, we fail to see a significant segmentation pattern. The pattern is observed only when the forward model has adapted, regardless of the state of the inverse model. This establishes that the segmentation behavior is mainly a result of adaptation of the forward model to force field B1 and is not significantly affected by the state of the inverse model.

It is now a question of how the forward model contributes to the behavior about the segmentation points. The state estimates are used as input to detect errors in position and velocity, and provide state input to the inverse model. Which is the main cause of the segmentation? We simulated the behavior of the system in field B2 with the forward model expecting field B1 and the inverse model correctly modeling field B2. The results of the simulation are plotted in Fig. 13. The estimated trajectories are shown along with the desired and actual trajectories in Fig. 13A–C. It is difficult to visualize the control process through only these estimates; therefore, in Fig. 13D–F the desired, corrective and actual acceleration signals are plotted as vectors. It is immediately apparent from Fig. 13E that the cause of the segmentation behavior is inappropriate feedback from the forward model that tries to accelerate the hand in an anti-clockwise direction away from the target. This establishes that through error feedback, the forward model generates incorrect state estimates which are the main reason behind the behavior about the segmentation points.

3.5 Robustness to measurement noise

While the primary purpose of our report is to account for the behavior of the human arm, it is worth noting some of the other properties of the controller of Fig. 11 from a purely practical perspective. In the design of a control system that relies on a forward model (or observer), two concerns are paramount: robustness to unmodeled dynamics, and robustness to measurement noise. In the above discussion we presented the behavior of the system when the distal dynamics were substantially unmodeled and found the system to be stable. Moreover, the proposed control system appeared to closely account for the behavior of the biological controller. What happens if measurements of state are noisy? A forward model is expected to be particularly susceptible to measurement noise because it is attempting to predict the future state of a non-linear process from some initial conditions. If these initial conditions are measured by noisy sensors, how will the performance of the control system be affected?

Fig. 12A–C. Trajectories for the controller of Fig. 11 (switch set to 1). All movements are in field B2. We considered three different states of adaptation of the inverse model (IM) and the forward model (FM). In A1 and A2, IM = B1, FM = B1. In B1 and B2, IM = null field, FM = B1. In C1 and C2, IM = B1, FM = null field. The term null implies that the model compensates for only the inertial dynamics of the limb. Note that the segmentation behavior and the high frequency in the response of the system are present regardless of the state of adaptation of the inverse model. However, the segmentation is present only if the forward model has adapted


To address this concern, we note that it has been shown that for systems with no delay in feedback, addition of an element that provides appropriate local error feedback on the plant can provide for an accurate velocity estimation from only position measurements in an inertial system (Lohmiller and Slotine 1996). Here, we demonstrate that for the system of Fig. 11, the design of the musculo-skeletal system allows the forward model to be remarkably robust to noise in velocity measurements.

We injected a random 5-Hz noise into the velocity signal received by the forward model in order to observe the behavior of the system. At its maximum, the magnitude of this noise was equal to that of the actual velocity signal. In Fig. 14, the hand paths and hand velocity signals are plotted for the actual movement trajectory, the measured movement trajectory and the estimated movement trajectory for each of these cases. Despite the fact that there is substantial noise in velocity measurements, the estimated velocity profile, i.e., the output of the forward model, is almost exactly the same as the actual one.

The reason for this remarkable robustness of the forward model to measurement noise is that the descending commands are not specifying a particular torque. Rather, the commands are specifying an equilibrium-like state for the distal visco-elastic system. The forward model integrates the descending neural commands over a 200-ms period given some initial state of the system. Therefore, even when the initial states are incorrect due to noise, the output from the forward model tends towards the actual state of the system because of the equilibrium properties of the system that is being controlled.
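The argument that an error in the initial state is forgotten during integration can be illustrated with a toy example. The Python sketch below integrates a unit mass attached to a spring and damper from two initial velocities, one correct and one corrupted by an error as large as the signal itself, over 50 steps of 4 ms (200 ms). The stiffness and damping values are hypothetical, chosen only so that the toy system is critically damped; they are not the arm parameters used in the simulations.

```python
def integrate_model(x0, v0, target, steps=50, delta=0.004,
                    stiffness=400.0, damping=40.0):
    """Euler-integrate a unit mass with spring-damper dynamics toward `target`."""
    x, v = x0, v0
    for _ in range(steps):
        a = stiffness * (target - x) - damping * v   # equilibrium-like command
        x, v = x + v * delta, v + a * delta
    return x, v


# Same command history, different initial velocity estimates.
true_x, true_v = integrate_model(0.0, 0.2, target=0.1)
noisy_x, noisy_v = integrate_model(0.0, 0.4, target=0.1)   # 100% velocity error

velocity_error_start = 0.4 - 0.2
velocity_error_end = abs(noisy_v - true_v)
position_error_end = abs(noisy_x - true_x)
```

In this toy case the velocity error at the end of the window is a small fraction of the initial error, because both trajectories are drawn toward the same equilibrium. A command that specified a fixed force instead of an equilibrium would leave the initial velocity error undiminished.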

Fig. 13A–C. Simulation results for the controller of Fig. 11 (switch set to 1) for movements in field B2 for a movement in a downward direction. The inverse model correctly expects B2 while the forward model expects B1. A Hand velocity parallel to the direction of target for the actual trajectory of the arm (gray line), estimated trajectory (black line, i.e., output of the forward model), and desired trajectory (dotted line). B Hand paths for actual trajectory (gray dots) and estimated trajectory (black dots). C Similar to A, except for a plot of the hand velocity perpendicular to the direction of target. D–F Desired, estimated and actual acceleration signals plotted as vectors at 20-ms time points on the actual hand trajectory. The largest acceleration vector in the three plots has a magnitude of 4.6 m/s² and all other vectors are scaled relative to that

Fig. 14. Simulation results for a movement downward in a condition where measurements of velocity are very noisy. 1 Actual hand path (top row) and velocity (bottom row) of the arm. Hand velocity perpendicular to the direction of target is shown in gray. Hand velocity parallel to the direction of target is shown in black. The velocity signal is noise free. 2 Behavior of the controller when measured velocity is inaccurate. The actual hand path is very similar to the case when there was no noise in the velocity measurements. The controller is robust to measurement noise in velocity. 3 Output of the forward model during the control process. Estimated hand path and velocity are plotted. Estimated velocity is very robust to measurement noise


4 Rates of adaptation of the internal models

While certain features of the biological controller, e.g., behavior about the segmentation points, suggest that generation of descending commands relies on a combination of forward and inverse models (Fig. 11), we have yet to determine how these models might change during a practice session with a novel force field. Of particular concern is the question of whether the data from our 16 subjects who practiced in novel force fields allow us to differentiate between the rates of adaptation of the models. In other words, during practice in a force field, how fast does each of the models adapt? Is there any evidence that the two models adapt at different rates?

4.1 Learning of field B1

To illustrate our approach, we begin with an extreme example. Consider the behavior of the controller in Fig. 11 (switch set to 1) when the arm is initially exposed to field B1. How much improvement in performance can we expect if the inverse model completely adapts to B1 but the forward model does not? How much improvement in performance can we expect if only the forward model adapts? To answer these questions, we assumed a decaying exponential change in either the forward or the inverse models and plotted key parameters of the performance of the system for the sequence of targets that were presented to our subjects (Fig. 15). There are two lines in each sub-figure, one corresponding to the change in a particular parameter as the forward model adapts (black line), and the other corresponding to the change as the inverse model adapts (gray line). We initially assumed an exponential learning rate of 0.02 per movement, i.e., a time constant of 50 movements. This implies that by the 50th movement, each model accounts for 63% of the dynamics of the force field. The two cases clearly predict different adaptation curves for most movement parameters. When the forward model is adapting but the inverse model is not, performance improves dramatically. In contrast, when only the inverse model is adapting, there are much smaller improvements in performance. Therefore, the performance of the system is highly dependent on the rate of adaptation of the forward model, and much less so on the rate of adaptation of the inverse model.

Now consider the possibility that during practice, both the inverse and the forward models are adapting but with possibly different rates. Assume that these rates are $r_{fm}$ for the forward model and $r_{im}$ for the inverse model.

$$
\begin{aligned}
FM(n) &= FM(0) + \Delta\{FM\}\,\big(1 - e^{-n\, r_{fm}}\big) \\
IM(n) &= IM(0) + \Delta\{IM\}\,\big(1 - e^{-n\, r_{im}}\big)
\end{aligned}
$$

Fig. 15. Adaptation curves for movement parameters during learning of field B1 in two cases: (1) dotted line, only the inverse model adapts exponentially to B1 at a rate of $r_{im} = 0.02$, $r_{fm} = 0$; (2) solid line, only the forward model adapts to B1 at a rate of $r_{im} = 0$, $r_{fm} = 0.02$. The desired trajectory was a 10-cm minimum jerk motion performed in 0.5 s. Jerk ratio is the ratio of cumulated squared jerk in a movement with respect to the minimum jerk possible for a movement of the same peak speed. The correlation coefficient is with respect to the minimum jerk motion. The perpendicular distance refers to the distance of the hand from the minimum jerk motion at 150 ms into the movement. Perp. Power refers to the power in the frequency spectrum of the velocity of the hand along a direction perpendicular to the direction of target. $d_i$, $t_i$ and $\lambda_i$ refer to the distance to, time to, and angle at the $i$th segmentation point. $N_s$ is the number of segmentation points, and $|v|_{sp1}$ is the hand speed at the first segmentation point


$FM(n)$ and $IM(n)$ are the adaptation states of the two models at movement number $n$ in the force field and represent the time course of adaptation for the two models. $FM(0)$ and $IM(0)$ are the initial states of the forward and inverse models at the beginning of the force field training. $\Delta\{FM\}$ and $\Delta\{IM\}$ are the differences between the initial state of the models and the force field being learned. The equations are obtained by considering a rate of learning of the models that is proportional to the difference between the model and the field at any instant of time. When subjects begin the training in field B1, they initially expect the null field. Hence, the initial states of both the forward and inverse models are set to the dynamics appropriate for the null field, i.e., the coupled dynamics of the human arm and the robot with no forces being produced by the robot's motors. This is referred to as $FM(0) = IM(0) = \text{null}$. Furthermore, $\Delta\{FM\} = \Delta\{IM\} = B1$. Here, we tried to find a best estimate of the two learning rates by comparing the performance of the model system at a given rate of adaptation to the performance of the 16 subjects. We considered six values for $r_{im} = \{0.0003, 0.003, 0.01, 0.03, 0.1, 0.3\}$, and five values for $r_{fm} = \{0.003, 0.01, 0.03, 0.1, 0.3\}$, for a total of 30 combinations.
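The exponential adaptation law and the grid of candidate rates can be written out in a few lines of Python. This is a minimal sketch in which the state of a model is reduced to a single number between 0 (null field) and 1 (field B1 fully learned); that scalar representation and the names `adaptation_state`, `R_IM`, `R_FM` and `RATE_PAIRS` are assumptions made for illustration.

```python
import math
from itertools import product


def adaptation_state(n, rate, initial=0.0, delta=1.0):
    """State of an internal model after n movements:
    initial + delta * (1 - exp(-n * rate))."""
    return initial + delta * (1.0 - math.exp(-n * rate))


# Candidate rates listed in the text for the inverse and forward models.
R_IM = [0.0003, 0.003, 0.01, 0.03, 0.1, 0.3]
R_FM = [0.003, 0.01, 0.03, 0.1, 0.3]
RATE_PAIRS = list(product(R_IM, R_FM))      # 6 x 5 = 30 combinations

# Rate of 0.02 per movement, evaluated at the 50th movement.
fraction_learned = adaptation_state(50, 0.02)
```

At a rate of 0.02 per movement the exponent reaches 1 at the 50th movement, so the learned fraction is 1 − 1/e, roughly 63%. A rate of 0.01 reaches the same fraction at the 100th movement.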

To compare performance of the controller at a givenrates of adaptation with the performance of our sub-jects, we initially quanti®ed each movement of eachsubject with 16 parameters. These parameters weremovement time, movement distance (total length of amovement), peak speed, jerk ratio (the ratio of the cu-mulative squared jerk of a movement with respect to thecumulative squared jerk for a minimum jerk movementof the same peak speed), correlation with a minimumjerk movement, perpendicular displacement of the handfrom a straight line to the target at 150 ms into themovement, power in the frequency spectrum of the ve-locity of the hand along a vector perpendicular to thedirection of target, di (distance to the ith segmentationpoint), ti (time to the ith segmentation point), ki (angleof the hand trajectory about the ith segmentation point,see Fig. 5), Ns (number of segmentation points) jvjsp1(speed at the ®rst segmentation point). Let us refer tothese parameters with variable p � 1 � � �m, m � 16.

These movement parameters were quantified for each of the 576 movements of each subject. Let us refer to the movements with the variable $n = 1, \ldots, 576$. The resulting "learning curves" for all subjects are shown as average ± SD in Fig. 17.

We next quantified the same movement parameters $p$ for a control system with particular learning rates $r_{im}$ and $r_{fm}$. To exactly simulate the experimental conditions faced by our subjects, we used the same sequence of targets (movement directions) that was experienced by the subjects. This procedure was repeated for all 30 combinations of $r_{im}$ and $r_{fm}$. Let us label each pair of rates by the variable $q = 1, \ldots, 30$. The next step was to find the one pair of rates that resulted in a control system whose movement characteristics most resembled the data of our subjects. To do this, an error measure $e$ was defined for each movement parameter $p$, as measured over $n$ movements for adaptation rates $q$:

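$$e_{pq} = \frac{\sum_{n} \left( |y_{pqn} - \mu_{pn}| / \sigma_{pn} \right)}{\sum_{q} \sum_{n} \left( |y_{pqn} - \mu_{pn}| / \sigma_{pn} \right)} \qquad (4)$$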

where $y_{pqn}$ is the value of the simulated movement parameter $p$ at the $n$th movement corresponding to the adaptation rate pair $q$, $\mu_{pn}$ is the mean of the parameter value for our 16 subjects at movement $n$, and $\sigma_{pn}$ is the corresponding standard deviation. The numerator in the equation is almost equal to the average area between the simulated and experimental adaptation curves. This error is normalized by the denominator, which is the sum of the errors for the 30 different rates of adaptation for the inverse and forward models, making it independent of parameter units and values. To combine the information from the different movement parameters $p$, the errors were summed and divided by the number of parameters $m$ to give a net error $E_q$ for a particular pair of adaptation rates $q$,

$$E_q = \frac{\sum_{p} e_{pq}}{m} \qquad (5)$$
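The two error measures can be sketched with array operations. This is a hedged illustration, not the authors' code: it assumes that each deviation from the subject mean is scaled by the subjects' standard deviation before summing over movements, which is one reading of Eq. (4), and the function name and array layout are hypothetical.

```python
import numpy as np


def normalized_errors(y, mu, sigma):
    """Per-parameter and net errors in the spirit of Eqs. (4) and (5).

    y     : (m, Q, N) simulated value of parameter p, rate pair q, movement n
    mu    : (m, N) subject mean of parameter p at movement n
    sigma : (m, N) subject standard deviation (assumed non-zero)
    """
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)

    # Assumption: deviation from the subject mean, scaled by the subject SD
    deviation = np.abs(y - mu[:, None, :]) / sigma[:, None, :]
    summed = deviation.sum(axis=2)                    # sum over movements n -> (m, Q)
    e = summed / summed.sum(axis=1, keepdims=True)    # normalize over rate pairs q
    E = e.mean(axis=0)                                # average over parameters p -> (Q,)
    return e, E


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, Q, N = 16, 30, 576
    mu = rng.normal(size=(m, N))
    sigma = rng.uniform(0.5, 1.5, size=(m, N))
    y = mu[:, None, :] + rng.normal(size=(m, Q, N))
    y[:, 7, :] = mu                                   # pair 7 matches the subjects exactly
    e, E = normalized_errors(y, mu, sigma)
    print("best pair index:", int(np.argmin(E)))
```

Because every row of `e` sums to one over the rate pairs, parameters with different units contribute on the same scale to `E`.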

This net error is plotted as a function of $r_{fm}$ and $r_{im}$ in Fig. 16. The region surrounded by the thick line outlines the minima. The error is at its lowest for $r_{fm} = 0.01$. This means that by the 100th movement, a typical subject's forward model accounted for 63% of the dynamics of the force field. This value lies at the bottom of a sharply defined region, suggesting a high degree of sensitivity, and therefore confidence, in estimating the rate of adaptation of the forward model. In contrast, the rate of adaptation for the inverse model cannot be precisely estimated because the minimum lies in a fairly shallow valley. There are no significant differences in the error measure whether the rate of learning of the inverse model is $r_{im} = 0.003$ or $r_{im} = 0.03$. The reason for the shallowness of the valley for $r_{im}$ is the much weaker dependence of movement parameters on the rate of adaptation of the inverse model.

Fig. 16A,B. Normalized error in matching performance of the adaptive controller of Fig. 11 (switch set to 1) with that of 16 subjects that practiced in field B1 for 572 movements. A The error is plotted as a function of rates of adaptation of the forward model $r_{fm}$ and inverse model $r_{im}$. B The region of minimum error is highlighted by the thick black line in the two-dimensional projection. Note that while the error surface is sharply defined in terms of changes in the forward model, it is fairly flat to variations in the learning rate of the inverse model.

To visualize how well the rates of adaptation of the inverse and forward models accounted for the pattern of learning in our subjects, we compared changes in performance of the model with that of the subjects. Figure 17 shows the changes in various movement parameters for the 16 subjects (mean ± SD) as they practiced in the force field, together with the changes in movement parameters of the adaptive controller for a particular combination of adaptation rates of the forward and inverse models, $r_{fm} = 0.01$, $r_{im} = 0.01$. The performance of the adapting model controller accurately captures the trajectory of changes in the performance of our subjects. The simulation results even mimic the set structure, which is due to the sequence of movement directions for the experimental data.

In summary, theoretical results suggest that significant improvements in performance can be achieved with adaptation of only the forward model. In contrast, adaptation of only the inverse model results in much more modest improvements in performance. While this might suggest that a reasonable policy for an adaptive controller might be to allocate resources mostly to learning of the forward model, the actual performance of our subjects does not provide convincing evidence to support this hypothesis. The psychophysical data suggest that the inverse and the forward models adapt at fairly comparable rates, accounting for approximately 63% of the dynamics of the field by the 100th movement. We note, however, that our estimation of the rate of adaptation in the inverse model is much more tentative than that of the forward model because of the relative insensitivity of the movement parameters (for the task studied here) to changes in the inverse model.

4.2 Learning of field B2

A fundamental observation in the way humans learn force fields is that the ability of subjects to learn a counter example (field B2) of a previously learned field (B1) depends on the time that has passed since adaptation to that field (Brashers-Krug et al. 1996). During this period, significant changes appear to occur in the functional properties of the motor memory (Shadmehr and Brashers-Krug 1997), changes which coincide with shifts in the neural correlates of the memory (Shadmehr and Holcomb 1997). Here, we applied the above computational framework and quantified the rate of adaptation of the forward and inverse models under two conditions: first, when subjects were exposed to B2 5 min after completing 572 targets in B1; second, when another group of subjects was exposed to B2 6 h after completion of practice in B1.

Fig. 17. Changes in movement parameters during learning of field B1 for the adapting controller (black) with adaptation constants $r_{fm} = 0.01$, $r_{im} = 0.01$, and for experimental data from 16 subjects (gray) plotted as the mean and standard deviation. All values are represented as change in the movement parameter with respect to values recorded after subjects/models had adapted to movements in the null field. Movement parameters are as in Fig. 15. [Panels, each plotted against movement number: Movement Time (s), Movement Dist. (m), Jerk Ratio, Correlation Coeff., Perp. Disp. (m) at 0.15 s, Perp. Velocity Power, d1 (m), t1 (s), λ2 (rad), d2 (m), t2 (s), λ3 (rad), NS, |v|SP1 (m/s)]

The estimation procedure was as described above, except that at the start of practice in field B2 we assumed that the forward and inverse models had fully adapted to B1. This was based on the result that the best estimate of the adaptation rate in field B1 was 0.01 for both models, i.e., by the 300th target, both the inverse and forward models could account for 95% of the dynamics of the force field (subjects received 572 targets).
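The percentages quoted for a rate of 0.01 can be checked with a short calculation. It assumes that the fraction of the field captured after $n$ movements is $1 - e^{-rn}$, the form implied by a learning rate proportional to the remaining model-field difference:

```latex
\begin{align*}
1 - e^{-0.01 \times 100} &= 1 - e^{-1} \approx 0.63 \\
1 - e^{-0.01 \times 300} &= 1 - e^{-3} \approx 0.95 \\
1 - e^{-0.01 \times 572} &= 1 - e^{-5.72} \approx 0.997
\end{align*}
```

Under this assumption, less than 1% of the initial difference remains after 572 targets, which is consistent with treating both models as fully adapted to B1.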

We simulated performance of the system (Fig. 11, switch set to 1) in field B2 with the following rates of adaptation: $r_{fm}$ = [0.008, 0.01, 0.015, 0.025, 0.04] and $r_{im}$ = [0.001, 0.003, 0.005, 0.008, 0.01]. An error measure, as defined in Eq. (4), was estimated for each movement parameter and adaptation rate. We used Eq. (5) to arrive at a normalized error measure for all parameters at a given adaptation rate. The results are plotted in Fig. 18. We found that in learning of field B2 at 5 min after completion of practice in B1, the minimum in this error function was at $r_{im} = 0.0028$ and $r_{fm} = 0.015$. This suggested that the forward model was adapting at a rate that was approximately 5.4 times that of the inverse model.

When the procedure was repeated for the data of the 6-h group, the normalized error was minimized when $r_{im} = 0.004$ and $r_{fm} = 0.020$. Remarkably, in this group, the forward model also adapted at approximately 5.0 times the rate of the inverse model. To our knowledge, this is the first evidence that in learning to control the arm, the human adaptive controller may rapidly learn a forward model but acquire an inverse model at a much slower rate.

The results also show that when subjects were presented with field B2, the rates of adaptation in both the inverse and the forward models were faster in the 6-h group than in the 5-min group. This suggests that as the memory of B1 consolidated, certain computational resources that might have been used in adaptation of the internal models once again became available, resulting in more rapid rates of adaptation at 6 h as compared to 5 min.
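The ratios and group differences quoted above follow directly from the fitted rates. The short script below only redoes that arithmetic; the dictionary layout and function name are illustrative.

```python
# Best-fit adaptation rates reported in the text for learning field B2
best_fit = {
    "5 min": {"r_im": 0.0028, "r_fm": 0.015},
    "6 h": {"r_im": 0.004, "r_fm": 0.020},
}


def forward_to_inverse_ratio(rates):
    """How many times faster the forward model adapts than the inverse model."""
    return rates["r_fm"] / rates["r_im"]


# Forward/inverse ratio within each group
ratios = {group: forward_to_inverse_ratio(r) for group, r in best_fit.items()}

# Rate at 6 h relative to the rate at 5 min, for each model
speedup = {
    key: best_fit["6 h"][key] / best_fit["5 min"][key]
    for key in ("r_im", "r_fm")
}

if __name__ == "__main__":
    for group, ratio in ratios.items():
        print(f"{group}: forward/inverse = {ratio:.2f}")
    for key, value in speedup.items():
        print(f"{key}: 6 h / 5 min = {value:.2f}")
```

Both models have a 6 h to 5 min ratio above one, which is the arithmetic behind the statement that adaptation was faster after consolidation.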

5 Discussion

Here, we approached the problem of controlling a time-delayed mechanical system similar to the human arm. The task that we considered was reaching movements in novel force fields. We demonstrated that essential characteristics of the arm's trajectory could not be accounted for if the supra-spinal controller was an open-loop system composed of a model of the inverse dynamics of the arm (Fig. 4). These characteristics were related to the frequency response of the system to a perturbation and behavior about segmentation points, i.e., points where both the derivative of hand speed and the direction of hand velocity changed rapidly.

Fig. 18A,B. Normalized error in matching performance of the adaptive controller of Fig. 11 (switch set to 1) with that of 16 subjects that learned field B2 after training in B1. A1 and A2 Error as a function of adaptation rates in the forward and inverse models for subjects that trained in B2 at 5 min after completion of training in B1. The position with minimum error value is at $r_{im} = 0.0028$ and $r_{fm} = 0.015$. That is, the forward model appeared to adapt about 5.4 times faster than the inverse model. B1 and B2 Subjects that trained in B2 at 6 h after B1. The position with minimum error value is at $r_{im} = 0.004$ and $r_{fm} = 0.020$. Again, the forward model appeared to adapt about five times faster than the inverse model. Furthermore, at 6 h both the forward and inverse models appear to be adapting significantly faster than at 5 min.

We next considered the design of the supra-spinal controller with a forward model (also known as an observer). The forward model predicted the position and velocity of the arm from a copy of descending motor commands and the latest sensory feedback from the moving arm. Descending neural commands were generated through a comparison of the predicted state of the system and its desired state. It was found that psychophysical data could not be suitably modeled with this system. First, because of small delays inherent in the computations of the forward model, gains of the feedback loop that generated an error signal could not be set large enough to effectively provide an inverse model of the dynamics of the distal system. Second, the gains were further limited because of the instability that resulted when there were unmodeled dynamics in the distal system, for example, when the arm was coupled to a novel force field. This suggested that it was unlikely that the biological controller generated descending motor commands solely through the use of a forward model.
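The way a loop delay caps the usable feedback gain can be illustrated with a toy system. The sketch below is not the arm model of this report: it integrates the scalar delay equation dx/dt = -gain * x(t - delay), which is known to be stable only when gain * delay is below π/2. The 50 ms delay, the two gains, and the function name are assumptions chosen for illustration.

```python
import numpy as np


def simulate_delayed_feedback(gain, delay, dt=0.001, duration=2.0, x0=1.0):
    """Integrate dx/dt = -gain * x(t - delay) with forward Euler.

    A toy first-order plant, not the arm model of the paper. The state is
    held at x0 for t <= 0 so the delayed term is defined from the start.
    """
    n_delay = int(round(delay / dt))
    n_steps = int(round(duration / dt))
    x = np.full(n_steps + n_delay + 1, x0, dtype=float)
    for i in range(n_delay, n_delay + n_steps):
        # Feedback acts on the state measured n_delay samples ago
        x[i + 1] = x[i] - dt * gain * x[i - n_delay]
    return x[n_delay:]


if __name__ == "__main__":
    for gain in (10.0, 60.0):
        x = simulate_delayed_feedback(gain, delay=0.05)
        tail = np.max(np.abs(x[-200:]))
        print(f"gain={gain:5.1f}  gain*delay={gain * 0.05:.2f}  late amplitude={tail:.3g}")
```

With the lower gain the response decays, while the higher gain produces growing oscillations, which is the qualitative reason a delayed loop cannot simply be given a very large gain.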

We next considered the possibility that descending commands were generated through a control architecture that included both an inverse and a forward model. In this scenario, the forward model did not provide the driving input to the system, but an input only when the feedforward pathway (which consisted of the inverse model) could not precisely account for dynamics of the distal system. In this architecture, the essential idea was that the entire control system, including the set points for the spinal reflex loops and state-dependent maps in the inverse model, relied on an estimate of the current state, i.e., the output of the forward model, rather than the desired state trajectory for the system. We showed that using this architecture the behavior of the human arm in a force field could be almost precisely accounted for. The response of the system in a novel force field had dynamics which included segmentation points. The behavior about these points was remarkably similar to that observed in our subjects. This suggested that the trajectory of the human arm about its segmentation points was due to the action of a supra-spinal feedback system that used a forward model which, in turn, provided an error signal that, through an inverse model, resulted in modification of descending commands. These results were found to be robust to changes in model parameters, including inertial properties of the limb, muscle visco-elastic properties, joint stiffness, and gains of the spinal and supra-spinal control loops.

A number of previous theoretical studies have represented the architecture of the biological controller via either an adaptive inverse model based system (Kawato 1989; Shadmehr 1990; Katayama and Kawato 1993; Barto et al. 1998; Schweighofer et al. 1998), or an adaptive forward model based system (Miall et al. 1993; Darlot et al. 1996). Our results show that an architecture that is more consistent with our experimental data is one where both systems play a role in the generation of descending motor commands.

Why should the CNS have both a forward and an inverse model of the dynamics of a system? If we assume that motor memory is a collection of internal models, then an important problem facing the controller is that of selecting an appropriate model when the hand comes into contact with the environment. If a collection of both forward and inverse models is present, then the behavior of the arm in the environment could be immediately compared to the outputs of a collection of forward models (Wolpert and Kawato 1998). The model that most closely predicted the behavior of the arm could then be used for identification of the appropriate inverse model. In contrast, if motor memory consisted of only inverse models, each would have to be tried out by actually controlling the arm using that model and measuring its performance. It appears that a controller that used both models would have a distinct advantage in unstructured environments.
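The selection step described above can be sketched in a few lines. This is a simplified illustration of the idea, not an implementation of Wolpert and Kawato (1998): the one-dimensional point-mass forward models, the viscous gain of -10, and all function names are hypothetical, and the inverse models are represented only by labels.

```python
import numpy as np


def make_forward_model(field_gain, mass=1.0, dt=0.01):
    """Return a one-step predictor for a point mass in a viscous field.

    field_gain scales a velocity-dependent force; 0.0 means the null field.
    """
    def predict(state, command):
        pos, vel = state
        acc = (command + field_gain * vel) / mass
        return np.array([pos + dt * vel, vel + dt * acc])
    return predict


def select_internal_model(forward_models, inverse_models,
                          state, command, observed_next_state):
    """Pick the inverse model paired with the best-predicting forward model."""
    errors = {
        name: float(np.linalg.norm(fm(state, command) - observed_next_state))
        for name, fm in forward_models.items()
    }
    best = min(errors, key=errors.get)
    return best, inverse_models[best], errors


if __name__ == "__main__":
    forward_models = {
        "null": make_forward_model(0.0),
        "viscous": make_forward_model(-10.0),
    }
    inverse_models = {"null": "inverse_null", "viscous": "inverse_viscous"}
    state, command = np.array([0.0, 0.3]), 1.0
    observed = forward_models["viscous"](state, command)  # arm is in the viscous field
    best, inverse, errors = select_internal_model(
        forward_models, inverse_models, state, command, observed)
    print(best, inverse, errors)
```

All forward models are evaluated on the same observed movement, so no candidate inverse model has to be tried on the arm before one is selected.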

5.1 Rates of learning of the two models

From an adaptive control perspective, an interesting prediction can be made regarding the behavior of the proposed control system of Fig. 11. When this controller needs to make reaching movements in a novel force field, performance is much more sensitive to changes in the forward model than in the inverse model (Fig. 15). If there were limited resources available for adaptation, it would be more important to quickly adapt the forward model than to adapt the inverse model (obviously, adaptation in both models is required to completely eliminate errors). Does this actually happen when subjects are learning force fields? We found that when the dynamics of the arm changed from the null field to B1, subjects appeared to learn a forward model at a rate that was approximately the same as the rate of learning in their inverse model (Fig. 16). However, when the dynamics changed from B1 to B2 (a magnitude of change that resulted in significantly more errors in performance, possibly straining learning resources), there was a tendency for a more rapid adaptation of the forward model than the inverse model (Fig. 18). In this situation, the forward model appeared to adapt at approximately five times the rate of the inverse model.

This can only be taken as preliminary evidence because, for the reaching movements considered here, the ability to estimate the rate of adaptation in the inverse model was hampered by the relative insensitivity of movement parameters to changes in this part of the adaptive controller. Nevertheless, it is intriguing that the data from our subjects show that the forward model adapts more rapidly than the inverse model when there are very large errors in performance. This is in agreement with a crucial prediction of the theory that suggests that the CNS may rapidly learn a forward model in order to use the acquired knowledge for off-line learning of an inverse model (Miall and Wolpert 1996). Further experiments are necessary to determine whether the adaptive state of the inverse model changes during this off-line period. However, we note that, in the current experiment, we found evidence in support of the idea that passage of time during an off-line period affected states of the models; the rates of learning in both models substantially accelerated when the memory of field B1 had been allowed to consolidate before B2 was introduced.

5.2 Noise sensitivity of the forward model

From a control perspective, the use of a forward model can lead to serious problems because estimating the current state of a non-linear system from delayed sensory feedback depends strongly on the quality of the sensory information. Actions taken based on erroneous sensory data might easily destabilize the system. In effect, the further in time we need to estimate a non-linear process, the more sensitive we might be to its initial conditions. However, here we found the surprising result that the forward model was, in fact, quite accurate even when there were larger errors in state measurements (of velocity). This, we hypothesize, is because of the nature of the descending commands to the distal system. Whereas in a robotic system the descending commands might represent a torque about a joint, in the biological arm, the descending commands loosely correspond to force fields, i.e., torques that are position and velocity dependent. These neural commands drive the system toward an implicit equilibrium position and velocity (Mussa-Ivaldi and Giszter 1992). If a forward model can accurately describe the dynamics of the arm, then a copy of the neural commands will similarly drive the estimated dynamics of the system toward this equilibrium position and velocity, providing robustness to errors in measurements of state. Therefore, it appears that a forward model of the biological arm can function adequately despite significant noise in proprioceptive velocity sensors.
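This robustness argument can be illustrated with a one-dimensional point mass standing in for the arm. The sketch below is an assumption-laden toy, not the model of this report: the stiffness, damping, equilibrium position, 200 ms prediction horizon, and velocity error of 0.05 m/s are illustrative values, and the function names are hypothetical. It compares how an error in the measured initial velocity propagates when the command is a fixed force versus a position- and velocity-dependent force field.

```python
def predict_state(x0, v0, command, horizon=0.2, dt=0.001, mass=1.0):
    """Integrate a point mass forward from (x0, v0).

    command(x, v) returns the force produced by the descending command.
    """
    x, v = x0, v0
    for _ in range(int(round(horizon / dt))):
        v += dt * command(x, v) / mass
        x += dt * v
    return x, v


def constant_force(x, v):
    """Torque-like command: the same force regardless of state."""
    return 1.0


def field_command(x, v, stiffness=100.0, damping=20.0, x_eq=0.1):
    """Force-field-like command pulling the state toward an equilibrium."""
    return -stiffness * (x - x_eq) - damping * v


def prediction_error(command, velocity_error=0.05):
    """Error in the predicted state caused by a wrong initial velocity."""
    true_x, true_v = predict_state(0.0, 0.0, command)
    est_x, est_v = predict_state(0.0, velocity_error, command)
    return abs(est_x - true_x), abs(est_v - true_v)


if __name__ == "__main__":
    print("constant force:", prediction_error(constant_force))
    print("force field   :", prediction_error(field_command))
```

With the state-independent command the velocity error persists and the position error grows with the horizon, whereas the field-like command pulls both the true and the estimated state toward the same equilibrium, so the error shrinks.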

5.3 Neural representation of internal models

The current report suggests that acquiring a motor skill likely involves learning two different kinds of computational models, and that the states of these models may be influenced by the passage of time. What regions of the brain might be involved in the implementation of the forward and inverse models? Many reports have speculated that there may be a role for the cerebellum in representing the forward (Miall et al. 1993), inverse (Kawato and Gomi 1992; Shidara et al. 1993; Houk and Wise 1995; Barto et al. 1998; Schweighofer et al. 1998), or both models (Wada and Kawato 1993; Miall and Wolpert 1996; Wolpert et al. 1998b). The cerebellum of the electric fish provides an excellent example of the neural implementation of a forward model (Bell et al. 1997). In primates, there are also some experimental data to support a role for the cerebellum in representation of the inverse and forward models. For example, simple spike activity of Purkinje cells during generation of eye and arm movements can be closely fitted to an inverse dynamics representation of the movements of the eye (Gomi et al. 1998) and the hand (Ebner and Fu 1997). Alternatively, if the cerebellum is involved in representation of a forward model, then during the delay period before initiation of a movement, simple spike activity may represent the predicted sensory outcome of a movement (Miall et al. 1998). If this predicted outcome differs from the desired behavior, complex spikes representing predicted error should be detected soon afterwards. In fact, there is evidence for this temporal order in the firing activity of some Purkinje cells (Miall et al. 1998).

Lesion studies in humans, however, are not currently in agreement with the view that the forward model is represented in the cerebellum. To show this, in Fig. 11 consider a switch that would allow for cutting off of descending commands, $N_c$, to the spinal cord. In this situation, the supra-spinal controller can generate the neural commands (output of the inverse model) and observe its sensory consequences (output of the forward model) without actually making the movement. This is one way by which a forward model can be used to mentally simulate a movement. In fact, humans are able to accurately predict the consequences of imagined movements of the hand (Sirigu et al. 1996). Damage to the parietal cortex adversely affects this ability (Sirigu et al. 1996), while damage to the motor cortex (Sirigu et al. 1996) and the basal ganglia (Dominey et al. 1995) does not. This would suggest that the parietal cortex must be involved in some aspect of using the output, or storing the contents, of the forward model. Indeed, a recent report of a parietal patient with deficits in maintaining state estimates of his arm supports this view (Wolpert et al. 1998a). Accordingly, if motor imagery is affected in the parietal patients because a pathway for interpreting the output of the forward model has been lost, then perhaps the forward model resides in the cerebellum and its output is provided to the parietal cortex via thalamic pathways. However, a recent report has shown that patients with cerebellar lesions can predict consequences of imagined arm movements as accurately as normal subjects (Kagerer et al. 1998). Therefore, this casts some doubt on the idea that, in humans, the forward model is stored in the cerebellum. Current lesion studies only implicate the parietal cortex in the neural machinery that might represent or use the forward model.

Our observation in this report has been that learning of reaching movements likely involves adaptation of two distinct internal models, possibly at different rates. Let us consider the possibility that one of these models, perhaps the inverse model, resides in the cerebellum, while the other does not. Note that if this is the case, when there is damage to the neural representation of the inverse model, some improvement in performance will still take place through adaptation of the forward model. However, if the damage to the neural representation of the inverse model is temporary, when it comes back on-line the performance of the system would suddenly decline, requiring relearning. In fact, it has been observed that cats can learn to move a manipulandum and retain this motor memory despite inactivation of their cerebellar nuclei (Wang et al. 1998). However, they have to relearn the task when the cerebellum comes back on-line. It is possible that this occurs because of a mismatch between an adapted forward model that might reside outside the cerebellum and the inverse model in the cerebellum.

A glance at the design of the forward model suggests that a number of functions must be closely synchronized in order for the forward model to learn the dynamics of a force field. Perhaps the most important function is to have at any time $t$ a copy of the efferent commands from time $t - t_0$ until $t$. This would allow the forward model to estimate the current position of the arm by integrating the descending commands from an initial condition set by the sensory measurements taken at time $t - t_0$. In effect, there should be a place where a copy of the descending commands is held for up to 200 ms. While short-term retention of sensorimotor information is often associated with the dorsolateral prefrontal cortex (Fuster 1997), there is no evidence to suggest that damage to this part of the brain results in a movement disorder. The neural basis of this very short-term motor memory remains to be explored.
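The buffering requirement described above can be sketched as a fixed-length queue of efference copies. This is a hedged illustration: the arm is replaced by a one-dimensional point mass whose dynamics the estimator is assumed to know exactly, and the class and method names are hypothetical.

```python
from collections import deque


class DelayCompensatingEstimator:
    """Estimate the current state from delayed feedback plus buffered commands."""

    def __init__(self, delay_steps, dt, mass=1.0):
        self.dt = dt
        self.mass = mass
        # Efference copies issued between t - t0 and t; older ones drop out
        self.commands = deque(maxlen=delay_steps)

    def record_command(self, force):
        """Store a copy of the command sent at the current time step."""
        self.commands.append(force)

    def estimate(self, delayed_pos, delayed_vel):
        """Integrate buffered commands forward from the delayed measurement."""
        pos, vel = delayed_pos, delayed_vel
        for force in self.commands:
            vel += self.dt * force / self.mass
            pos += self.dt * vel
        return pos, vel


if __name__ == "__main__":
    dt, delay_steps = 0.001, 200            # 200 samples of 1 ms = 200 ms
    estimator = DelayCompensatingEstimator(delay_steps, dt)
    pos = vel = 0.0
    history = [(pos, vel)]
    for i in range(500):
        force = (i % 7) * 0.1               # arbitrary command sequence
        estimator.record_command(force)
        vel += dt * force
        pos += dt * vel
        history.append((pos, vel))
    delayed = history[500 - delay_steps]    # what the sensors report now
    print("estimate:", estimator.estimate(*delayed))
    print("true    :", history[500])
```

The queue length sets how far back the commands are retained, so a 200 ms sensory delay at a 1 ms step needs 200 stored commands.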

We have demonstrated here that the human motor-control system has an architecture that likely includes both a forward and an inverse model. When there are large errors in performance, improvements during training occur because of rapid adaptation in the forward model but a much slower rate of adaptation in the inverse model.

Acknowledgements. This work was part of a Master's thesis submitted by N.B. to the Biomedical Engineering Department at Johns Hopkins University. The thesis, which includes more complete details of the model parameters and further experimental results, is available from www.bme.jhu.edu/~reza/nbthesis.pdf. We are very grateful to Kurt Thoroughman who provided us with the human subject data used in this report. Our work has been greatly enriched by our interactions with Maurice Smith, Kurt Thoroughman, and Steve Wise. This work was funded in part by grants from the U.S. Office of Naval Research, the Whitaker Foundation, and a Multi-University Research Initiative from the U.S. Department of Defense.

References

Astrom KJ, Wittenmark B (1984) Computer controlled systems. Prentice-Hall, Englewood Cliffs, NJ

Atkeson CG (1989) Learning arm kinematics and dynamics. Annu Rev Neurosci 12:157–183

Barto AG, Fagg AH, Sitkoff N, Houk JC (in press) A cerebellar model of timing and prediction in the control of reaching. Neural Comput

Bell CC (1981) An efference copy which is modified by reafferent input. Science 214:450–453

Bell CC, Han VZ, Sugawara Y, Grant K (1997) Synaptic plasticity in a cerebellum-like structure depends on temporal order. Nature 387:278–281

Blakemore SJ, Goodbody SJ, Wolpert DM (1998) Predicting the consequences of our own actions: the role of sensorimotor context estimation. J Neurosci 18:7511–7518

Brashers-Krug T, Shadmehr R, Bizzi E (1996) Consolidation in human motor memory. Nature 382:252–255

Cordo P, Carlton L, Bevan L, Carlton M, Kerr GK (1994) Proprioceptive coordination of movement sequences: role of velocity and position information. J Neurophysiol 71:1848–1861

Darlot C, Zupan L, Etard O, Denise P, Maruani A (1996) Computation of inverse dynamics for the control of movements. Biol Cybern 75:173–186

Dominey P, Decety J, Broussolle E, Chazot G, Jeannerod M (1995) Motor imagery of a lateralized sequential task is asymmetrically slowed in hemi-Parkinson's patients. Neuropsychologia 33:727–741

Ebner TJ, Fu Q (1997) What features of visually guided arm movements are encoded in simple spike discharge of cerebellar Purkinje cells? Prog Brain Res 114:431–447

Flanagan JR, Wing AM (1997) The role of internal models in motion planning and control: evidence from grip force adjustments during movements of hand-held loads. J Neurosci 17:1519–1528

Flash T, Hogan N (1985) The coordination of arm movements: an experimentally confirmed mathematical model. J Neurosci 5:1688–1703

Fuster JM (1997) The prefrontal cortex: anatomy, physiology, and neuropsychology of the frontal lobe. Lippincott-Raven, Philadelphia

Gielen CCAM, Houk JC (1987) A model of the motor servo: incorporating nonlinear spindle receptor and muscle mechanical properties. Biol Cybern 57:217–231

Gomi H, Kawato M (1992) The cerebellum and VOR/OKR learning models. Trends Neurosci 15:445–453

Gomi H, Kawato M (1996) Equilibrium-point control hypothesis examined by measured arm stiffness during multijoint movement. Science 272:117–120

Gomi H, Shidara M, Takemura A, Inoue Y, Kawano K, Kawato M (1998) Temporal firing patterns of Purkinje cells in the cerebellar ventral paraflocculus during ocular following responses in monkeys I. Simple spikes. J Neurophysiol 80:818–831

Gribble PL, Ostry DJ, Sanguineti V, LaBoissiere R (1998) Are complex control signals required for human arm movement? J Neurophysiol 79:1409–1424

Hanneton S, Berthoz A, Droulez J, Slotine JJE (1997) Does the brain use sliding variables for the control of movements? Biol Cybern 77:381–393

Houk JC, Wise SP (1995) Distributed modular architectures linking basal ganglia, cerebellum, and cerebral cortex: their role in planning and controlling action. Cerebral Cortex 5:95–110

Johansson RS, Westling G (1984) Roles of glabrous skin receptors and sensorimotor memory in automatic-control of precision grip when lifting rougher or more slippery objects. Exp Brain Res 56:550–564

Jordan MI, Rumelhart DE (1992) Forward models: supervised learning with a distal teacher. Cogn Sci 16:307–354

Jordan MI, Flash T, Arnon Y (1994) A model of the learning of arm trajectories from spatial deviations. J Cogn Neurosci 6:359–376

Kagerer FA, Bracha V, Wunderlich DA, Stelmach GE, Bloedel JR (1998) Ataxia reflected in the simulated movements of patients with cerebellar lesions. Exp Brain Res 121:125–134

Karniel A, Inbar GF (1997) A model for learning human reaching movements. Biol Cybern 77:173–183

Katayama M, Kawato M (1993) Virtual trajectory and stiffness ellipse during multijoint arm movement predicted by neural inverse models. Biol Cybern 69:353–362

Kawato M (1989) Adaptation and learning in control of voluntary movement by the central nervous system. Adv Robot 3:229–249

Kawato M, Gomi H (1992) A computational model of four regions of the cerebellum based on feedback-error learning. Biol Cybern 68:95–103

Krylow AM, Sandercock TG, Rymer WZ (1995) Muscle models. In: Arbib MA (ed) The handbook of brain theory and neural networks. MIT Press, Cambridge, Mass, pp 609–613

Lee RG, Tatton WG (1975) Motor responses to sudden limb displacements in primates with specific CNS lesions and in human patients with motor system disorders. Can J Neurol Sci 2:285–293

Lohmiller W, Slotine JJE (1996) On metric observers for nonlinear systems. Proc IEEE Int Conf Control App 320–326

Massaquoi SG, Slotine JJE (1996) The intermediate cerebellum may function as a wave-variable processor. Neurosci Lett 215:60–64

Miall RC, Wolpert DM (1996) Forward models for physiological motor control. Neural Netw 9:1265–1279

Miall RC, Weir DJ, Wolpert DM, Stein JF (1993) Is the cerebellum a Smith predictor? J Mot Behav 25:203–216

Miall RC, Keating JG, Malkmus M, Thach WT (1998) Simple spike activity predicts occurrence of complex spikes in cerebellar Purkinje cells. Nat Neurosci 1:13–15

Montgomery JC, Bodznick D (1994) An adaptive filter that cancels self-induced noise in the electrosensory and lateral line mechanosensory systems of the fish. Neurosci Lett 174:145–148


Murray WM, Delp SL, Buchanan TS (1995) Variation of muscle moment arms with elbow and forearm position. J Biomech 28:513–525

Mussa-Ivaldi FA, Giszter SF (1992) Vector field approximation: a computational paradigm for motor control and learning. Biol Cybern 67:491–500

Mussa-Ivaldi FA, Hogan N, Bizzi E (1985) Neural, mechanical and geometric factors subserving arm posture in humans. J Neurosci 5:2732–2743

Niemeyer G, Slotine JJE (1991) Stable adaptive teleoperation. IEEE J Oceanic Eng 16:152–162

Ronco E (in press) Open-loop intermittent feedback optimal control: a probable human motor control strategy. IEEE Trans Man Machine Cybern

Rothwell JC (1990) Long latency reflexes of human arm muscles in health and disease. In: Rossini PM, Mauguiere F (eds) New trends and advanced techniques in clinical neurophysiology. Elsevier, Amsterdam, pp 251–263

Sanes JN, Shadmehr R (1995) Sense of muscular effort and somesthetic afferent information in humans. Can J Physiol Pharmacol 73:223–233

Schweighofer N, Arbib MA, Kawato M (1998) Role of the cerebellum in reaching movements in humans. I. Distributed inverse dynamics control. Eur J Neurosci 10:86–94

Shadmehr R (1990) Learning virtual equilibrium trajectories for control of a robot arm. Neural Comput 2:436–446

Shadmehr R, Arbib MA (1992) A mathematical analysis of the force-stiffness characteristics of muscles in control of a single joint system. Biol Cybern 66:463–477

Shadmehr R, Brashers-Krug T (1997) Functional stages in the formation of human long-term motor memory. J Neurosci 17:409–419

Shadmehr R, Holcomb HH (1997) Neural correlates of motor memory consolidation. Science 277:821–825

Shadmehr R, Holcomb HH (1999) Inhibitory control of motor memories: a PET study. Exp Brain Res (in press)

Shadmehr R, Mussa-Ivaldi FA (1994) Adaptive representation of dynamics during learning of a motor task. J Neurosci 14:3208–3224

Shidara M, Kawano K, Gomi H, Kawato M (1993) Inverse dynamics model eye movement control by Purkinje cells in the cerebellum. Nature 365:50–52

Sirigu A, Duhamel J-R, Cohen L, Pillon B, Dubois B, Agid Y (1996) The mental representation of hand movements after parietal cortex damage. Science 273:1564–1568

Sperry RW (1950) Neural basis of spontaneous optokinetic response produced by visual inversion. J Comp Physiol Psychol 43:482–489

Stroeve S (1997) A learning feedback and feedforward neuromuscular control model for two degrees of freedom human arm movements. Hum Move Sci 16:621–651

Wada Y, Kawato M (1993) A neural network model for arm trajectory formation using forward and inverse dynamics models. Neural Netw 6:919–932

Wang JJ, Shimansky Y, Bracha V, Bloedel JR (1998) Effects of cerebellar nuclear inactivation on the learning of a complex forelimb movement in cats. J Neurophysiol 79:2447–2459

Wolpert DM, Miall RC, Kerr GK, Stein JF (1993) Ocular limit cycles induced by delayed retinal feedback. Exp Brain Res 96:173–180

Wolpert DM, Kawato M (1998) Multiple paired forward and inverse models for motor control. Neural Netw 11:1317–1329

Wolpert DM, Ghahramani Z, Jordan MI (1995) An internal model for sensorimotor integration. Science 269:1880–1882

Wolpert DM, Goodbody SJ, Husain M (1998a) Maintaining internal representations: the role of the human superior parietal lobe. Nat Neurosci 1:529–533

Wolpert DM, Miall RC, Kawato M (1998b) Internal models in the cerebellum. Trends Cogn Sci 2:338–347

Zajac FE (1989) Muscle and tendon: properties, models, scaling, and application to biomechanics and motor control. Crit Rev Biomed Eng 17:359–411
