mcadams bregman streams


  • 7/27/2019 McAdams Bregman Streams

    1/21

    Hearing Musical Streams
    Author(s): Stephen McAdams and Albert Bregman
    Source: Computer Music Journal, Vol. 3, No. 4 (Dec., 1979), pp. 26-43+60
    Published by: The MIT Press
    Stable URL: http://www.jstor.org/stable/4617866

    Accessed: 27/09/2011 13:53


    The MIT Press is collaborating with JSTOR to digitize, preserve and extend access to Computer Music Journal.


    Hearing Musical Streams

    Stephen McAdams
    Hearing and Speech Sciences
    Stanford University School of Medicine
    Stanford, California 94305

    Albert Bregman
    Department of Psychology
    McGill University
    Montreal, Quebec, Canada

    Introduction

    The perceptual effects of a sound are dependent upon the musical context in which that sound is imbedded. That is, a given sound's perceived pitch, timbre, and loudness are influenced by the sounds that precede it, coincide with it, and even follow it in time. Thus, this context influences the way a listener will associate the sound with various melodic, rhythmic, dynamic, harmonic, and timbral structures within the musical sequence. It thus behooves the composer and interpreter to understand the various perceptual organizing principles that affect the derivation of musical context from sequences of acoustic events. We include the interpreter here because several musical dimensions, such as timbre, attack and decay transients, and tempo, are often not specified exactly by the composer and are controlled by the performer.

    In this article we shall discuss principles that describe how various musical dimensions affect the perceived continuity of music. Leon van Noorden has stated that "in sequences where the tones follow one another in quick succession, effects are observed which indicate that the tones are not processed individually by the perception system. On the one hand we find various types of mutual interaction between successive tones, such as forward and backward masking, loudness interactions and duration interactions. On the other hand, a kind of connection is found between the successive perceived tones." [28, p. 1].

    As for simultaneous sonic events, Bregman [5, 9] has suggested that different sounds are extracted according to various perceptual and cognitive organizational mechanisms from the superimposed acoustic vibrations. While some researchers would like to attribute these phenomena to mechanisms in the peripheral or early central nervous system (which are surely involved to some extent, see [24, 33]), prominent members of this more psychophysically oriented "school" are beginning to think the organization is too complex to be so easily explained. However, rather than delve into theoretical explanations of these phenomena, it will suffice for the present purpose to describe them in general and to briefly quantify some salient parameters that have intrigued us with compositional possibilities. This article thus presents, in a tutorial fashion, a review of research (including our own) which has direct implications for musicians, especially for composers working with computer music.

    In all of our research and in most other research cited, computers (predominantly PDP-11 minicomputers) were used to synthesize the sounds presented. They were also used for presentation of sound stimuli, collection of responses from the listeners, and analysis of the data. For thorough summaries and theoretical treatments of this area of research, see [2, 5, 9].

    What Is An Auditory Stream?

    Auditory stream formation theory is concerned with how the auditory system determines whether a sequence of acoustic events results from one, or more than one, "source." A physical "source" may be considered as some sequence of acoustic events emanating from one location. A "stream" is a psychological organization that mentally represents such a sequence and displays a certain internal consistency, or continuity, that allows that sequence to be interpreted as a "whole."

    By way of example, two possible perceptual organizations of a repeating six-tone sequence are illustrated in Figure 1. Time is represented on the horizontal axis and frequency is represented on the vertical axis. The dotted lines connecting the tones in the figure indicate the stream percepts. In the first configuration, six tones are heard one after the other in a continuously repeating cycle (Taped Illustration 1a)(1); it is easy to follow the entire melodic pattern. In the second percept, though, one might hear two separate three-tone patterns which appear to have little relationship to each other (Taped Illustration 1b). It is difficult in this case to follow the original six-tone pattern. Note that in the first example one stream is heard, and in the second, two are heard.

    © 1979 by Stephen McAdams and Albert Bregman

    (1) A tape containing sound examples is available from Mr. McAdams. Descriptions of the taped illustrations are found in the Appendix.

    Page 26   Computer Music Journal, Box E, Menlo Park, CA 94025   Volume 3 Number 4


    [Figure 1. Two panels plotting frequency against time: (a) One Stream, (b) Two Streams.]

    Figure 1. A repeating 6-tone sequence composed of interspersed high and low tones can result in different percepts. In Figure 1a, with high and low tones alternating at a tempo of 5 tones/sec., one perceptual stream is heard. In Figure 1b, at a tempo of 10 tones/sec., the high tones perceptually segregate from the low tones to form two streams (cf. Taped Illustration 1).
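The stimulus described for Figure 1 can be laid out as a simple event list. The following is a minimal sketch, not the authors' synthesis program: the function name `six_tone_cycle` and the six frequencies are hypothetical, chosen only so that three high tones are interspersed with three low tones. The two tempi (5 and 10 tones/sec) are the ones given in the caption.

```python
def six_tone_cycle(high, low, tempo, cycles=1):
    """Return (onset_seconds, frequency_hz) pairs for a repeating six-tone
    cycle in which three high tones are interspersed with three low tones."""
    if len(high) != 3 or len(low) != 3:
        raise ValueError("expected three high and three low tones")
    if tempo <= 0:
        raise ValueError("tempo must be positive (tones per second)")
    # Interleave the tones: high, low, high, low, high, low.
    cycle = [freq for pair in zip(high, low) for freq in pair]
    step = 1.0 / tempo  # time between successive tone onsets, in seconds
    return [(i * step, cycle[i % len(cycle)])
            for i in range(cycles * len(cycle))]


# Hypothetical frequencies in Hz; the article does not list the values used.
HIGH = [2500.0, 2000.0, 1600.0]
LOW = [550.0, 430.0, 350.0]

slow = six_tone_cycle(HIGH, LOW, tempo=5, cycles=2)   # tempo of Figure 1a
fast = six_tone_cycle(HIGH, LOW, tempo=10, cycles=2)  # tempo of Figure 1b
```

Only the onset spacing differs between `slow` and `fast`; the order and frequencies of the tones are identical, which is the point of the demonstration.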

    [Figure 2. Two panels plotting frequency against time, each with the rhythmic notation of the High Stream and the Low Stream beneath it.]

    Figure 2. Due to the competition among stream organizations, tone F may be perceived as belonging to either the higher stream or the lower stream but not to both. The organization of the streams changes, among other things, the perceived rhythmic structure, as indicated under each diagram (cf. Taped Illustration 2).


    In the rest of this section we will discuss some of the properties exhibited by a stream. For example, it is possible to focus one's attention on a stream and follow it through time [6, 28]. In the first taped example one can follow the six-tone pattern without any trouble. Thus a stream must exhibit a certain coherence over time. However, in the second example, it is difficult to follow the six-tone pattern, but it is easy to follow either three-tone pattern. Notice that one can pay attention to either the higher or lower stream, switching between them at will, but that it is not possible to attend to both simultaneously (Taped Illustration 1b). Indeed, each three-tone pattern in Figure 1b constitutes a separate stream and maintains its own temporal coherence. While a listener is paying attention to one coherent stream, other acoustic information is perceptually relegated to the background. If one group of sounds is distinct enough, the foreground-background relation may be almost involuntary and it may require a great deal of attentional effort to focus on streams initially relegated to the background.

    The information-processing nature of the stream segregation process is suggested by the observation that the segregation of a sequence into smaller streams takes time to occur [4, 13]. In Figure 1b, notice that one can hear a six-tone pattern for the first few cycles before it segregates into two separate streams (Taped Illustration 1b). It thus appears that the perceptual system assumes things are coming from one source until it acquires enough information to suggest an alternate interpretation.

    When temporal coherence is lost in a sequence, it becomes more difficult to order the events of that sequence in time [6, 11, 14, 28]. In Figure 1a, it would be easy to tell the order of tones A, B, and C. But as this larger stream breaks down into the smaller streams of Figure 1b, i.e., as temporal coherence is lost, it becomes more difficult to judge the order of these tones. In such a case one might notice that tone A comes before tone C but it would be hard to tell whether tone B came before A, between A and C, or after C. This statement should be qualified by noting that in the tone sequences mentioned all of the tones are of equal loudness and timbre. The tone sequences are furthermore continually recycling and are faded in so that the listener cannot label tone A as being the first tone in the sequence. Information in a musical context, such as the first beat being emphasized as a downbeat, may give the listener an anchor point against which to relate the temporal positions of other tones. Thus one can judge the order of events in time within a given perceptual stream but not necessarily across streams.

    Accompanying this is the observation that different streams can appear to overlap in time even when they do not. Again in Figure 1b, notice that one can hear two three-tone patterns apparently going on at the same time even though the tones are alternating (Taped Illustration 1b).

    If an event is potentially a member of more than one "competing" stream, one may perceive it as belonging to one stream or another but not to both simultaneously [3, 10, 11]. This is not to say that a musician cannot hear several simultaneous lines. The point is that it is impossible to use several parsing schemes at the same time. Figure 2 illustrates the effect on our second example of moving the higher triplet into closer frequency proximity to the lower triplet. We can find an intermediate position where the lowest tone, tone F, can group with either the higher or lower stream (Taped Illustration 2). Notice that this regrouping results in a rhythmic transformation as shown under each configuration. Some non-musical examples of this phenomenon are the face-vase illusion and the reversible Necker cubes; Escher and Vasarely have produced art works based on such principles of perceptual organization.

    Finally, this brings up the relationship between "source" and "stream." A stream is perceived as emanating from a single source. So, in the first example, the pattern was fairly continuous and was easily recognized as coming from one source, but in Figure 1b, the large frequency distance between the two three-note groups introduced a sort of discontinuity that caused the perceptual system to interpret the sequence as resulting from two sources. Since at any given moment the composite pressure variations stimulating the ear result from several sources, the auditory system needs a battery of heuristics to parse, or segregate, the information into separate streams. It thus needs to build a description of the acoustic environment from separate descriptions of the various streams and the relationships between them [2].

    Factors which the perceptual system uses to build descriptions of streams, and subsequently sources, are frequency, rate of occurrence of events (or tempo), intensity, timbre, and attack/decay transients. In the rest of the paper these will be discussed in some detail. Of course it is obvious that sounds are assigned perceptually to different sources when the physical sources are at different spatial positions. In this case, intensity, spectral, and temporal cues are all utilized to parse the sound into separate sources. However, we will primarily confine our discussion to the illusion of many sources which occurs due to the organizations within a single emanation of sound.

    Frequency and Tempo

    Consider that a repetitive cycle of tones spread over a certain frequency range may be temporally coherent, or integrated, at a particular tempo. It is possible to gradually increase the tempo until certain tones group together into separate streams on the basis of frequency, as discussed above. The faster the tempo, the greater the degree of breakdown or decomposition into narrower streams until ultimately every given frequency might be beating along in its own stream. This last possibility is dependent upon a number of other factors which will be discussed later. Figure 3 illustrates the possible stages of perceptual decomposition for a recycling six-tone pattern as one gradually increases the tempo (Taped Illustration 3). Note that streams per se are not tracked beyond a certain point, but a texture or timbre is perceived since the ability to temporally resolve the individual tones degenerates altogether. Of course one could hold the tempo constant and gradually expand the frequency relationships to achieve a similar streaming effect [3, 6, 14, 17], but the musical consequences would be vastly different as one can hear in Taped Illustration 4.

    Dowling [1 used simple melodies to illustrate this frequency-based streaming principle. He interleaved two melodies in the same frequency range, thereby making it very difficult, without prior knowledge of the melodies, to separate them perceptually. But as they were pulled apart in frequency, i.e., when all the tones of one of the melodies were transposed upward, each melody became apparent (Taped Illustration 5).
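The interleaving and transposition just described can be sketched in a few lines. This is an illustrative sketch only: the two melodies are hypothetical MIDI note numbers, not the tunes Dowling used, and the range-overlap check is merely a convenient way to see when the two melodies occupy the same frequency range. It is not a model of what a listener hears.

```python
def transpose(melody, semitones):
    """Shift every note of a melody by the same number of semitones."""
    return [note + semitones for note in melody]


def interleave(first, second):
    """Alternate the notes of two equal-length melodies:
    first[0], second[0], first[1], second[1], ..."""
    if len(first) != len(second):
        raise ValueError("melodies must have the same number of notes")
    return [note for pair in zip(first, second) for note in pair]


def ranges_overlap(first, second):
    """True if the pitch ranges of the two melodies share any pitch."""
    return min(first) <= max(second) and min(second) <= max(first)


# Hypothetical melodies as MIDI note numbers (60 = middle C).
MELODY_A = [60, 62, 64, 60]
MELODY_B = [62, 60, 59, 62]

mixed = interleave(MELODY_A, MELODY_B)      # same range: hard to separate
shifted_b = transpose(MELODY_B, 12)         # one melody moved up an octave
pulled_apart = interleave(MELODY_A, shifted_b)
```

In `mixed` the two pitch ranges overlap, while in `pulled_apart` they no longer do, which corresponds to the condition under which each melody is reported to become apparent.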


    [Figure 3. Diagram with a time axis showing successive stages of decomposition; the final stage is labeled Timbre/Texture.]

    Figure 3. This figure illustrates the decomposition of an acoustic sequence into smaller and smaller perceptual streams as the frequency separation between the tones or the tempo of the sequence increases. In the latter case, a point is ultimately reached where one can no longer perceive individual tonal events; a texture or timbre is heard instead (cf. Taped Illustrations 3 and 4).

    The frequency range within which the perceptual system groups tones on the basis of frequency proximity is not constant; the grouping can vary with the particular pattern of frequencies presented. For example (see Figure 4), two tones, A and B, are arranged with given frequency and temporal separations such that they will always stream together when no other tones are present. We can create a similar organization with tones X and Y in another frequency range. These two groups will each form a stream as can be heard in Taped Illustration a. By bringing the pairs into the same frequency range, new streams can be formed, such as A-X and B-Y, on the basis of an alternative proximity organization [3] (Taped Illustration b). Thus, the particular relationships between frequencies in a tonal pattern, and not just the frequency separation between adjacent tones, play a vital role in the formation of streams.
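The dependence of grouping on frequency context can be illustrated with a deliberately crude toy: sort the tones by pitch and split them into two groups at the largest pitch gap. This is an assumption made for illustration, not the grouping mechanism proposed in the article, and the frequencies assigned to A, B, X, and Y are hypothetical. It only shows that the same A and B can end up together or apart depending on where X and Y lie.

```python
import math


def semitones(freq_hz, ref_hz=440.0):
    """Pitch of a frequency in semitones relative to a reference."""
    return 12.0 * math.log2(freq_hz / ref_hz)


def split_at_largest_gap(tones):
    """Toy grouping rule (an assumption, not the article's model).

    tones: dict mapping a tone name to its frequency in Hz.
    Returns (lower_group, upper_group), each a sorted list of names,
    obtained by cutting the pitch-ordered tones at the widest gap.
    """
    if len(tones) < 2:
        raise ValueError("need at least two tones")
    ordered = sorted(tones, key=lambda name: tones[name])
    pitches = [semitones(tones[name]) for name in ordered]
    gaps = [pitches[i + 1] - pitches[i] for i in range(len(pitches) - 1)]
    cut = gaps.index(max(gaps)) + 1
    return sorted(ordered[:cut]), sorted(ordered[cut:])


# Hypothetical frequencies in Hz. A and B are identical in both cases.
isolated = {"A": 1400.0, "B": 1000.0, "X": 300.0, "Y": 220.0}
absorbed = {"A": 1400.0, "B": 1000.0, "X": 1500.0, "Y": 950.0}

print(split_at_largest_gap(isolated))
print(split_at_largest_gap(absorbed))
```

With X and Y far below, the widest gap falls between the two pairs, so A and B stay together. With X moved next to A and Y next to B, the widest gap falls between A and B themselves, giving the A-X and B-Y grouping described in the text.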

    It appears from our examples that there is an essentially inverse and strictly interdependent relationship between tempo and frequency relationships among individual tones [26, 28]: the faster the tones follow one another, the smaller the frequency separation at which they segregate into separate perceptual streams. Conversely, the greater the frequency separation, the slower the tempo at which segregation occurs. This illustrates yet another aspect of music which, with the aid of the computer, can come under the composer's control.

    [Figure 4. Two panels plotting frequency against time: "A & B Isolated" and "A & B Absorbed," showing tones A, B, X, and Y.]

    Figure 4. This figure illustrates the effect of frequency context as opposed to frequency separation on stream formation. In the first part, tones A and B form one perceptual stream which is unaffected by the simultaneous stream composed of tones X and Y. When tones X and Y are moved into the proximity of tones A and B, which remain unchanged, an alternate perceptual interpretation results whereby tones A and B are assigned to separate streams on the basis of a new frequency context (cf. Taped Illustration ).

    Suppose we make a graph, as in Figure 5, which relates frequency separation of two alternating tones on one axis to the rate of alternation of the tones on the other axis. Note that the horizontal axis indicates increasing tone repetition time, which corresponds to decreasing tempo. One can draw boundaries on this graph indicating the frequency-tempo regions in which the tones cohere as a single stream and those in which they segregate into two simultaneous streams of different frequencies. There are two such boundaries. Leo Van Noorden [28] has termed these the fission boundary and the temporal coherence boundary and noted that they are slightly different for each person. In Figure 5 the upper curve is the temporal coherence boundary. Above this boundary it is impossible to integrate the two alternating tones into one stream. Below this boundary lies the temporal coherence region, where it is possible to integrate events into a single perceptual stream. The lower boundary is the fission boundary. Below this it is impossible to hear more than one stream. Note that the regions of fission and coherence also overlap, creating an ambiguous region where either percept may be heard.

    As the tempo decreases, temporal coherence can be maintained at greater frequency separations. However, the frequency separation required for fission remains fairly constant with decreasing tempo. The temporal coherence and fission boundaries above and below a given tone are symmetrical with respect to pitch. This indicates that the phenomena of temporal coherence and fission may occur depending on the absolute musical interval between the tones but irrespective of the direction of pitch change. While there are substantial quantitative differences in these boundaries between listeners, the qualitative trends tend to be similar [28]. Note that the difference between boundaries is increasingly substantial for tempi below about 10 tones/sec. (tone repetition time = 100 msec.). This means that at very slow tempi of, say, five tones/sec., a separation of more than a minor 10th is necessary to induce streaming. Below this it becomes virtually impossible to induce streaming. In the limiting case, if the temporal distance between tones is too great, they do not seem to be connected at all but sound as isolated events.

    [Figure 5. Graph with tone repetition time (msec, 0 to 400) on the horizontal axis and the corresponding tempo (tones/sec) marked along the top; regions labeled Always Segregated, Ambiguous Region, and Always Coherent, separated by the temporal coherence boundary and the fission boundary.]

    Figure 5. Boundaries of temporal coherence (upper curve) and fission (lower curve) define three perceptual regions in the stream relationship between two alternating tones each lasting 40 msec. This relationship is a function of both the tempo of alternation and the frequency separation between the tones (after [28]).
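The three regions bounded by the temporal coherence and fission boundaries can be expressed as a small classifier. The sketch below rests on stated assumptions: the fission boundary is taken as a constant 3 semitones and the temporal coherence boundary as a straight line, both hypothetical shapes rather than values read from Figure 5. The only anchor taken from the text is that at five tones/sec (200 msec) the coherence boundary lies at about a minor 10th (15 semitones). Real boundaries differ from listener to listener.

```python
import math


def tone_repetition_time_ms(tempo_tones_per_sec):
    """Convert a tempo in tones/sec to a tone repetition time in msec."""
    if tempo_tones_per_sec <= 0:
        raise ValueError("tempo must be positive")
    return 1000.0 / tempo_tones_per_sec


def interval_semitones(freq_a_hz, freq_b_hz):
    """Absolute interval in semitones; the direction of change is discarded."""
    return abs(12.0 * math.log2(freq_b_hz / freq_a_hz))


# Hypothetical boundary shapes, NOT measurements from the article.
FISSION_BOUNDARY_ST = 3.0  # assumed constant, in semitones


def coherence_boundary_st(repetition_ms):
    """Assumed linear boundary passing through 15 semitones at 200 msec."""
    return max(FISSION_BOUNDARY_ST, 15.0 * repetition_ms / 200.0)


def classify(separation_st, repetition_ms):
    """Name the perceptual region for a given separation and repetition time."""
    if separation_st > coherence_boundary_st(repetition_ms):
        return "always segregated"
    if separation_st < FISSION_BOUNDARY_ST:
        return "always coherent"
    return "ambiguous"


repetition = tone_repetition_time_ms(5)  # five tones/sec
for separation in (2.0, 10.0, 16.0):
    print(separation, classify(separation, repetition))
```

Because `interval_semitones` discards the sign of the interval, an ascending and a descending leap of the same size are treated identically, in line with the symmetry with respect to pitch noted above.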

    A musically relevant aspect of these boundaries should be mentioned. The region between the two boundaries may be considered to be an ambiguous region since either a segregated or an integrated percept may be heard. The primary determining factor in this region is attention. In other words, it is possible to shift one's attention back and forth between the two percepts when the sequence falls in this region. For example, one may focus either on a whole stream percept or on smaller individual streams. It appears that the closer the sequence lies to one of the boundaries, the easier it is to focus on the percept which is predominant beyond that boundary. Conversely, it is very difficult in this situation to shift one's attention back to the other percept. Once the physical values go beyond either of these boundaries, attaining the complementary percept may be considered to be impossible. The role of attention will be discussed in more detail later.

    Frequency Trajectories

    Another frequency-based effect involves frequency trajectories. These are important on two levels. The first involves trajectories between tones (see Figure 6). Bregman and Dannenbring [7] have found that tones that are connected by glissandi are much less likely to segregate under given conditions of tempo and frequency separation than those which make abrupt frequency transitions. Intermediate situations beget intermediate results. Yet even when using a sine wave for frequency modulation of a sine tone, it is possible to discern a sort of streaming effect of the higher and lower peaks of the modulation at certain modulation frequencies (Taped Illustration 7). At lower modulating frequencies one can track the modulation, but higher modulating frequencies result in the effect of a texture or timbre. This continuum is found for modulation involving discrete changes in frequency as well.

    [Figure 6: frequency transitions between tones plotted against Time; panel labels: Ramped, Semi-Ramped, Discrete; segment labels: Steady-State, Transition, Steady-State.]

    Figure 6. These are the 3 types of frequency transitions between tones used by Bregman and Dannenbring (1973). In the top section the tones are completely connected by a frequency glissando. In the middle section an interrupted glissando is directed towards the succeeding tone. No frequency glide occurs in the bottom section; the first tone ends on one frequency and the next tone begins on another (after [7]). The unconnected tones segregate more readily than the others.
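The three kinds of transition between tones (a full glissando, an interrupted glissando, and an abrupt jump) can be sketched as frequency-versus-time functions. This is a rough illustration and not the stimulus specification of Bregman and Dannenbring: the function name, the logarithmic glide shape, the `glide_fraction` parameter, the use of silence after an interrupted glide, and the choice to hold the starting frequency in the discrete case are all assumptions made here.

```python
from typing import Optional


def transition_frequency(kind: str, t: float, f_start: float, f_end: float, *,
                         steady_s: float, transition_s: float,
                         glide_fraction: float = 0.5) -> Optional[float]:
    """Instantaneous frequency (Hz) at time t (s) within one segment made of a
    steady state followed by a transition. None stands for silence."""
    if not 0.0 <= t <= steady_s + transition_s:
        raise ValueError("t lies outside the segment")
    if t <= steady_s:
        return f_start                       # steady-state portion
    x = (t - steady_s) / transition_s        # progress through transition, 0..1
    glide = f_start * (f_end / f_start) ** x  # assumed log-frequency glide
    if kind == "ramped":
        return glide                         # tones fully connected
    if kind == "semi-ramped":
        # Glide heads toward the next tone but is interrupted (assumed: silence).
        return glide if x <= glide_fraction else None
    if kind == "discrete":
        return f_start                       # no glide; jump at segment end
    raise ValueError(f"unknown transition kind: {kind}")
```

Sampling this function over one segment for each `kind` gives three contours of the sort compared in the experiment, with the "discrete" contour being the one reported to segregate most readily.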

    The second type of trajectory might be called a melodic trajectory. The basic rule goes: large jumps and sudden changes in direction of a melody produce discontinuity in that melody. In terms of stream formation, one or two tones in a melody that are removed from the melodic continuity of the rest could be perceived as coming from a different source and would not be integrated into the melody as a whole, perhaps leaving a rhythmic gap in the phrase depending on the nature of the main sequence. It has been reported, however, that the excluded tones are sometimes noticed in the background, with their absence having little effect on the main melody line [23]. Implied polyphony in a solo compound melody line, as in the suites for solo instruments by Bach, is a compositional use of this principle.

    Schouten [30] reported that if an ascending and descending major scale fragment played with sine tones is continually repeated, temporal coherence is maintained up to a tempo of about 20 tones/sec. However, if these same tones are arranged at random, the maximum tempo at which coherence still occurs is reduced to about 5-10 tones/sec. It might be inferred that the reduction of predictability reduced the pitch boundary within which the auditory system could successfully integrate the incoming information. But van Noorden [28] found that previous knowledge of the order of tones had no effect on the coherence boundary. It might be that the small frequency jumps in Schouten's demonstration are effective in holding the stream together.

    Heise and Miller [23] investigated four melodic contours: a V-shaped contour, an inverted V, and rising and falling scale patterns; the V-shaped patterns which change direction can be thought of as being less "predictable" than the scale patterns which move in only one direction. Each pattern was eleven tones long and the frequency of the middle tone was variable. They found that the degree to which the variable tone could be separated in frequency from the rest of the pattern before it segregated into its own stream was a function both of the shape and of the steepness of the pattern. The steepness was varied by keeping the tempo constant and varying the interval between successive tones. As the rate of frequency change for the entire pattern increases over time, so does the amount of frequency separation of the middle tone required to produce segregation. Less separation of the middle tone is required to produce segregation with the V-shaped patterns than with the scale patterns, possibly indicating that the perceptual system can follow "predictable" patterns more quickly than "unpredictable" ones. (However, certain anomalies in their data, which will not be discussed here, might suggest other interpretations.)
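The structure of these eleven-tone patterns can be made concrete with a short sketch. It is an illustration only: the function names are hypothetical, pitches are expressed as semitone offsets rather than the frequencies actually used, and placing the vertex of the V-shaped contours on the middle tone is an assumption.

```python
def contour(shape: str, step: float, n: int = 11) -> list:
    """Pitch offsets (in semitones) for an n-tone pattern whose successive
    tones are `step` semitones apart; a larger step gives a steeper pattern."""
    mid = n // 2
    if shape == "rising":
        return [i * step for i in range(n)]
    if shape == "falling":
        return [-i * step for i in range(n)]
    if shape == "V":                 # descends to the middle tone, then ascends
        return [abs(i - mid) * step for i in range(n)]
    if shape == "inverted-V":        # ascends to the middle tone, then descends
        return [-abs(i - mid) * step for i in range(n)]
    raise ValueError(f"unknown shape: {shape}")


def with_displaced_middle(pattern: list, offset: float) -> list:
    """Copy of the pattern with its middle (variable) tone shifted by `offset`."""
    out = list(pattern)
    out[len(out) // 2] += offset
    return out
```

In these terms the experiment amounts to finding, for each shape and each `step`, the smallest `offset` at which the middle tone is heard as its own stream.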

    Van Noorden found similar results for patterns with as few as three tones [28]. He investigated the relative temporal coherence of so-called linear and angular three-tone sequences. For linear sequences, i.e., those with two tone intervals in the same direction, the temporal coherence boundary occurs at faster tempi than were found with angular sequences in which the melodic pattern changed direction at the middle tone. In this latter case the first and third tones are more contiguous in frequency than are the first and second, or second and third tones, which facilitates a loss of temporal coherence similar to that found by Schouten. It should be noted that while these melodic trajectory effects do seem to play a role in musical stream formation beyond the simple frequency separation effect, they are certainly confounded by other contextual organizations.

    Both of these trajectory examples illustrate a principle of perceptual organization that uses pattern continuity, in some form or another, as a criterion for "source" distinction. Figure 7 illustrates, however, that frequency proximity may sometimes compete with trajectory organization; Deutsch found that simultaneous ascending and descending scale patterns presented to opposite ears segregate into upright and inverted V-shaped melodic contours [16]. Each contour is heard as if being presented to one ear. The same result was found by Halpern [22] for simultaneous ascending and descending sine tone glissandi, as can be heard in Taped Illustration 8; in this case both glissandi were presented to both ears. In these two examples a stream boundary is established at the pitch where the two lines cross.

    [Figure 7: crossing ascending and descending scales plotted against Time, with the High Contour and Low Contour outlined.]

    Figure 7. This stimulus, used by Deutsch (1975), is composed of ascending and descending scales presented simultaneously to opposite ears. Two V-shaped patterns (high and low contours outlined with dashes) are perceived rather than the complete ascending and descending scale patterns.

    Loudness and Continuity Effects

    Fission can be obtained by alternating sequences of tones similar in frequency and timbre but differing in intensity. Figure 8 illustrates the range of percepts achieved as the amplitude level of tone A is varied relative to that of tone B in the alternating sequence ABAB... . The reference level for tone B used by van Noorden in these experiments [28] was 35 db SPL; the frequency of both tones was 1 KHz and each tone lasted 40 msec. If the level of tone A is below the auditory threshold (approximately [...] db SPL at 1 KHz) only the B stream is heard at half tempo, as might be expected (see Figure 8a). When tone A is loud enough to be heard and is at least 5 db below tone B, two separate streams of different loudness can be perceived, each at half tempo, with A being the softer stream (see Figure 8b). When A is within 5 db of B, a "pulsing" stream is heard and neither the A nor the B stream can be heard independently, i.e., temporal coherence is inevitable in this range (see Figure 8c). As the level of the A tone is increased above that of the B tone, three different percepts may result, depending on the alternation tempo of A and B. If this tempo is less than about 13 tones/sec., fission is the next percept heard. This time B is the softer stream (see Figure 8d). Thus a certain degree of loudness difference allows us to focus on either stream using only this information.

    [Figure 8: schematic panels a-g of the alternating A and B tones against time, labeled A Below Threshold, Fission, Coherence, Fission, Roll, Continuity, Masking.]

    Figure 8. This figure illustrates the range of possible percepts found by van Noorden (1975) for two alternating sine tones differing in each case only in their intensities. Tone B was kept at 35 db SPL throughout; both tones were 40 msec. long and had a frequency of 1 KHz. a) The level of tone A is below the auditory threshold. b) Tone A is at least 5 db below tone B. c) Tone A is within 5 db of tone B. d) Tone A is louder than tone B with an alternation tempo less than about 13 tones/sec. e) Tone A is louder than tone B with more than 13 tones/sec. f) Tone A is about 18-30 db louder than tone B and the tempo is still above 13 tones/sec. g) Tone A is more than 30 db louder than tone B. The arrows indicate the percepts reported; a more complete description is given in the text.

    If the tempo is greater than about 12.5-13 tones/sec., the percept encountered is the "roll" effect discovered by van Noorden. It sounds as if stream A, the louder stream, were pulsing at half tempo as in the fission percept, but the B stream sounds as if it were pulsing at full tempo (see Figure 8e). Thus the A stream may be heard independently, but not the B stream. In other words, it is as if the A tones consisted of two parts: one that combines with the B stream to give a full tempo roll, and another, at half tempo, which can be perceived separately. At a tempo of about 13 tones/sec. another effect emerges when the level of A is about 18-30 db above that of B. This is the continuity effect shown in Figure 8f, so named because tone B is not heard as pulsing but rather as a fairly soft, continuous tone under the louder, pulsing A stream; this is an example of a class of effects that will be discussed below. Finally, if the level of the A stream is incremented still further, this stream completely masks the B stream (Figure 8g). Again, this set of loudness-based phenomena exhibits an ambiguous region between the coherence and fission boundaries where one might pay attention to either the A or B streams individually or to the AB stream as a whole. There are thus three perceptual regions for alternating tones at the same frequency where the tempo is above about 12.5 tones/sec.: the roll region, the continuity region, and the temporal coherence region.
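The sequence of percepts described for Figure 8 can be summarized as a small decision rule. This is a deliberately crude sketch: the function and parameter names are hypothetical, the "about" values in the text are treated as hard cutoffs, the ambiguous regions where attention decides the percept are ignored, masking is assumed to apply at any tempo, and the auditory threshold is left as a required argument because no single value is assumed here.

```python
def percept(level_a_db: float, tempo_tones_per_sec: float, *,
            threshold_db: float, level_b_db: float = 35.0) -> str:
    """Label the percept for the alternating sequence ABAB... (cf. Figure 8 a-g)."""
    if level_a_db < threshold_db:
        return "B only"                      # a) A is inaudible
    diff = level_a_db - level_b_db           # level of A relative to B, in db
    if diff <= -5.0:
        return "fission (A softer)"          # b) A at least 5 db below B
    if diff < 5.0:
        return "coherence"                   # c) A within 5 db of B
    if diff > 30.0:
        return "masking"                     # g) A more than 30 db above B
    if tempo_tones_per_sec < 13.0:
        return "fission (B softer)"          # d) A louder, slower tempo
    if diff >= 18.0:
        return "continuity"                  # f) A 18-30 db above B, fast tempo
    return "roll"                            # e) A louder, fast tempo
```

For example, with tone B at its 35 db SPL reference level, a tone A at 45 db would be labeled "fission (B softer)" at 10 tones/sec. but "roll" at 15 tones/sec.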

    Van Noorden made quantitative measurements of the fission boundary (see Figure 9). For tempi of about 2.5 to 10 tones/sec., the fission boundary is more or less horizontal, i.e., the intensity difference (ΔL) necessary for a segregated percept does not change with tempo over this range, but lies about 2 to 4 db on either side of the reference tone level. For tempi less than 2.5 tones/sec., the minimum level difference for fission increases with decreasing tempo (or longer inter-tone intervals of silence) and is symmetric about ΔL = 0 (no difference in level). For tempi greater than 10 tones/sec. the level difference at which fission occurs increases with increasing tempo but the situation is not symmetric [...]

    [Figure: tempo (tones/sec) plotted against the level of tone A relative to the 35 db SPL reference level; labeled regions: Masking, Continuity, Roll, Fission (LA > LB), Inevitable Coherence, Fission.]

    Figure 21. One result of McAdams' study (1977) suggested that there is an interaction between pitch "height" and timbral "sharpness" (see text). In a repeating 4-tone sequence, one of the pairs of tones was selectively enriched by adding the third harmonic. A greater degree of segregation of the high and low streams was found for the former case. The dashed lines indicate the two-stream percepts and the dotted lines indicate a potential one-stream percept. The vertical solid lines represent the fusion and timbre of the 2-tone complex.


    (or frequency) organizations and simultaneous (or timbral) organizations in the formation of auditory streams. The stimulus used by Bregman and Pinker was a sine tone alternating with a two-tone complex. In Figure 22, tones A and B would represent the sequential organization and tones B and C would represent the simultaneous organization. The harmonicity and synchronicity of tones B and C in the complex were varied as was the frequency separation between the sine tone A and the upper component B of the complex. The rationale for varying these two parameters was as follows.

    Tones with frequency relationships derived from simple ratios, i.e. those that exhibit "consonance," should tend to fuse more readily than combinations considered to be dissonant. While the evidence for this was very weak in the Bregman and Pinker study, work currently in progress in Bregman's laboratory strongly suggests that this is indeed the case. The new evidence further suggests that the effect of harmonicity itself is relatively weak and may be overridden by stronger factors such as frequency contiguity and synchronicity of attack and decay.

    Tones with synchronous and identically shaped attack and decay ramps are more likely to fuse than those with asynchronous or dissimilar attacks and decays [12, 15]. This may be a major cue in being able to parse out the different instruments playing together in an orchestra since they all have substantially different attack characteristics. In addition, there is a very low probability of several people precisely synchronizing the attacks.

    In light of this work, one might make the following predictions for the perception of the stimuli used by Bregman and Pinker: 1) Sequential streaming is favored by the frequency proximity of tones A and B (as we have illustrated in the earlier examples). 2) The simultaneous (or timbral) fusion of tones B and C is favored by the synchrony of their attacks. 3) These two effects "compete" for tone B's membership in their respective perceptual organizations. 4) Finally, when tone B is "captured" by tone A, it is removed from the timbral structure and tone C sounds less rich. Thus, it is reasoned that if the two simultaneous sine tones B and C are perceived as belonging to separate streams, they should be heard as sine tones. But if they are heard as one stream, they should sound like one rich tone.

    [Figure 22: the Stimulus (tones A, B and C against Time) and two Percepts against Time: "A & B Stream, C Pure" and "A & B Segregate, C Rich".]

    Figure 22. The competition between sequential and simultaneous organizations in the formation of auditory streams is shown here. Tone B can belong either to the sequential organization with tone A or to the simultaneous organization with tone C but not to both at the same time. Bregman and Pinker (1978) varied the frequency separations between tones A and B and between tones B and C and also varied the relative synchrony of onset of tones B and C. The dotted lines in the figure indicate the stream percepts and the vertical solid lines represent the fusion of tones B and C (cf. Taped Illustration 15).

    It would be appropriate to introduce the notion of "belongingness" at this point, since we talk of tones belonging to streams, and of frequency components and the timbre resulting from their interaction belonging to a perceived tonal event. "Belongingness" (a term used in the perceptual literature of Gestalt psychology) may be considered as a principle of sensory organization which serves to reconstruct physical "units" into perceptual events by grouping sensory attributes of those events into unified percepts. As Bregman [2] points out, "belongingness is a necessary outcome of any process which decomposes mixtures, since any sensory effect must be assigned to some particular source." In this case, when the simultaneous tones B and C are segregated, the timbre resulting from their interaction still exists and can be heard if one listens for it, but it is not perceptually assigned to either of the tones B or C, and thus does not affect the perception of them. The nature of a stream is such that its qualities are due to the perceptual features assigned to it.

    In Taped Illustration 15 one can hear a case in which A is close to B in frequency and C is asynchronous with B. Then a case is heard where A is further away from B in frequency and C is synchronous with B. Tones B and C have the same frequencies in both cases. Listen for both the A-B stream and the richness of tone C. The listeners in Bregman and Pinker's study reported perceiving C as being richer when B and C were synchronous, and this judged richness dropped off with an increase in asynchrony, i.e. as C either preceded or followed B by 29 or 58 msec. As the frequency separation between A and B was increased, C was reportedly perceived as being increasingly rich.
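One cycle of a stimulus of this kind (tone A followed by the B and C complex, with C shifted in time relative to B) can be laid out as a list of onsets. This is a sketch under assumptions: the class and function names and the `soa_ms` parameter (the onset-to-onset time from A to B) are hypothetical; only the asynchrony values of 29 and 58 msec. come from the text, and tone durations are not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToneEvent:
    name: str        # "A", "B" or "C"
    onset_ms: float  # onset time within the sequence
    freq_hz: float   # frequency of the sine tone


def stimulus_cycle(cycle_start_ms: float, f_a: float, f_b: float, f_c: float, *,
                   soa_ms: float, c_asynchrony_ms: float) -> list:
    """One A / (B+C) cycle. A negative c_asynchrony_ms means C leads B,
    a positive value means C lags B, and zero means synchronous onsets."""
    b_onset = cycle_start_ms + soa_ms
    return [
        ToneEvent("A", cycle_start_ms, f_a),
        ToneEvent("B", b_onset, f_b),
        ToneEvent("C", b_onset + c_asynchrony_ms, f_c),
    ]


# Asynchronies mentioned in the text: C synchronous with B, or leading or
# lagging it by 29 or 58 msec.
ASYNCHRONIES_MS = (-58.0, -29.0, 0.0, 29.0, 58.0)
```

Varying `f_a` relative to `f_b` changes how strongly A can capture B, while varying `c_asynchrony_ms` changes how strongly B fuses with C, which is the competition the listeners' richness judgments reflect.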

    The Role of Context in Determining Timbre

    These findings indicate that the perceived complexity of a moment of sound is context-dependent (see [19] as another example of the trend toward viewing timbre as depending on context). Context may be supplied by a number of alternative organizations that compete for membership of elements not yet assigned. Timbre is a perceived property of a stream organization rather than the direct result of a particular waveform, and is thus context-dependent. In other words, two frequency components whose synchronous and harmonic relationships would cause them to fuse under isolated conditions may be perceived as separate sine tones if another organization presents stronger evidence that they belong to separate sequential streams.

    A very compelling demonstration of the decomposition of a timbre organization by alternate frequency streaming organizations is illustrated in Figure 23. When tone A is presented by itself it elicits a timbre, denoted TA. If this tone is preceded by tone B eliciting timbre TB, and succeeded by tone C eliciting timbre TC, one notices that timbre TA completely disappears and is replaced by timbres TB and TC. Here the highest and lowest components of tone A are streamed with those of tone B and subsequently assume a timbre identical to that of tone B. Also, the inner components of tone A stream with those of tone C and assume a like timbre (Taped Illustration 16).
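How the components of tone A are shared out between tones B and C can be shown with a small sketch. The function names are hypothetical, and the harmonic numbers and fundamental used in the usage note are illustrative assumptions, since the text only specifies that tone A consists of four harmonics.

```python
def split_components(harmonic_numbers: list) -> tuple:
    """Split the components of tone A into those of tone B (highest and
    lowest) and those of tone C (the inner ones)."""
    ordered = sorted(harmonic_numbers)
    tone_b = [ordered[0], ordered[-1]]   # outer components
    tone_c = ordered[1:-1]               # inner components
    return tone_b, tone_c


def component_frequencies(f0_hz: float, harmonic_numbers: list) -> list:
    """Frequencies of the given harmonics of a fundamental f0_hz."""
    return [f0_hz * k for k in harmonic_numbers]
```

With an assumed tone A made of harmonics 1 to 4 of a 200 Hz fundamental, `split_components([1, 2, 3, 4])` assigns harmonics 1 and 4 to tone B and harmonics 2 and 3 to tone C; together the two tones account for every component of A, which is why no component is left to carry the timbre TA.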

    An important question concerning the assignment of timbre and pitch to a tonal event arises. Both timbre and pitch have been found to be context dependent [10, 21, respectively]. But each may be determined by different ongoing contextual organizations; as such they may be considered to be associated perceptual dimensions of a sound but may not be inextricably bound to one another. How relevant to music theory, then, are studies that deal with the perceived pitch and timbre of tones in isolation? This question is not meant to insinuate that sensory and psychophysical experimentation are useless. Far from it. If we think in terms of investigating the experience of music at different levels of processing (Figure 24), we see how important all of these areas of research are in building the whole picture. The raw physical input is modified by the limits of the sense organ whose output is still further modified by cognitive processes. But by studying steady-state, or at least relatively simple signals, we can find the limits and interactions of the sense organs and peripheral processes, such as temporal and spectral resolution, lateral inhibition, and masking, which limits affect the final percept. Beyond these, the central perceptual processes such as pitch extraction, timbre buildup, and coherence and fission modify the initial neural result of stimulation of the sensory system. Further interactions in higher brain processes such as attentional processes, memory and comparison of pitch, timbre and loudness, context extraction

    One

    Timbre

    Two

    Timbres

    :TA

    versus

    I I :

    1 2

    11

    A

    B

    A

    C

    Figure

    23.

    The

    first

    part

    of

    this

    figure

    shows

    a

    repeating

    one

    (A)

    consisting

    of

    4

    harmonics.

    This

    tone would

    elicit

    a certain

    timbre

    percept,

    TA.

    In

    the

    second

    part

    this

    tone

    is

    preceded

    by

    tone

    B,

    consisting

    of the

    top

    and bottom

    harmonics,

    and is

    suc-

    ceeded

    by

    tone C

    consisting

    of the

    two

    inner

    harmonics.Tone

    B

    elicits timbre

    TB

    and

    tone C

    elicits

    timbre

    TC.

    However,

    due to

    the

    streaming

    of tone

    A's

    components

    with

    those of

    tones B

    and

    C,

    TA

    totally

    disappears

    nd

    is

    replacedby

    TB

    and

    TC

    (cf.

    Taped

    Illustration

    16).

    Page

    40

    Computer

    Music

    Journal,

    Box

    E,

    Menlo

    Park,

    CA

    94025

    Volume3 Numb

  • 7/27/2019 McAdams Bregman Streams

    17/21

    Physical

    Environment

    Levels

    of

    Processing

    Temporal

    and

    Spectral

    Resolution

    Lateral

    nhibition

    "Sensory"

    Masking

    etc.

    Pitch

    Extraction

    Timbre

    Buildup

    "Perceptual"

    Coherenceand FissionLimits

    etc.

    Attention

    Memory

    Context

    Extraction

    Form andTextureIntegration

    etc.

    Figure

    24.

    This block

    diagram

    uggests

    a

    possible

    arrangement

    of

    the

    processing

    of

    acoustic

    information

    at

    different

    inter-

    connected levels

    of

    the

    auditory

    system.

    tion,and ormand exturentegration,culpt hetransduced

    information

    nto

    meaningful ercepts.

    t is felt

    thatall

    of

    these

    levels

    of

    processing

    eed

    into each

    other n a

    sort

    of

    heter-

    archical

    as opposed

    o

    a

    hierarchical)

    ystem.

    The

    point

    being

    made

    s that

    in

    the

    framework

    f

    music

    where

    all of

    these

    complex

    nteractionsre of

    great

    mportance,

    he

    context

    that

    s created

    may

    be

    the

    essential

    eterminantf

    the

    musical

    resultof

    a

    given

    ound.

    One

    sound s

    potentially erceivable

    n

    a

    great

    number

    f

    ways,

    depending

    n its

    context.
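    The regrouping described in the caption of Figure 23 can be sketched in a few lines. This is only an illustration: the 200 Hz fundamental and the choice of harmonic numbers 1 to 4 for tone A are assumptions, and only the split into outer and inner harmonics comes from the caption.

```python
# Illustrative sketch of the Figure 23 stimulus. F0 and the harmonic
# numbers are assumptions; only the outer/inner split is from the caption.
F0 = 200.0  # hypothetical fundamental in Hz


def harmonics(numbers, f0=F0):
    """Return the set of component frequencies for the given harmonic numbers."""
    return {n * f0 for n in numbers}


tone_a = harmonics([1, 2, 3, 4])  # the full four-harmonic tone A
tone_b = harmonics([1, 4])        # top and bottom harmonics of A
tone_c = harmonics([2, 3])        # the two inner harmonics of A

# Every component of A also occurs in B or in C, and B and C share none,
# so A's components can be captured entirely by the two neighbouring tones.
assert tone_b | tone_c == tone_a
assert tone_b & tone_c == set()

for name, tone in (("A", tone_a), ("B", tone_b), ("C", tone_c)):
    print(name, sorted(tone))
```

    Because B and C together account for every component of A, nothing is left over to carry a separate timbre TA once A's components stream with their neighbours.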

    Melody

    This leads us to believe that the fundamental perceptual element in music may be the "melody" rather than the isolated tone. Or in the terminology of auditory perception, the fundamental structure is the auditory stream. This is not, of course, a new notion; but an empirical approach may allow us to clarify and delimit the concept to the extent that we may predict the perceptual results. That, in the mind of the first author, is the primary concern of the composer. Let us, then, examine melody and its relation to attention.

    For our purposes, we can think of melody as a connected and ordered succession of tones [28]. It follows then that temporal coherence is necessary for a sequence of tones to be perceived as a whole. On the other hand, a sequence of tones may segregate into two or more separate streams which are individually coherent. In this case, we would perceive several simultaneous melodies rather than one.
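    As a rough way to picture segregation into individually coherent streams, the toy sketch below assigns each tone to the stream whose most recent tone is nearest in pitch. It is not the authors' model or any published algorithm: the function names, the frequencies, and the 7-semitone threshold are all hypothetical, and tempo, which matters greatly in the text, is ignored.

```python
import math


def semitone_distance(f1, f2):
    """Absolute pitch distance between two frequencies, in semitones."""
    return abs(12.0 * math.log2(f1 / f2))


def split_into_streams(freqs, max_jump=7.0):
    """Toy heuristic: append each tone to the stream whose last tone is
    nearest in pitch, or start a new stream if every jump exceeds max_jump.
    The threshold is an arbitrary illustrative value, not a measured limit."""
    streams = []
    for f in freqs:
        best = None  # (distance, stream) of the closest admissible stream
        for s in streams:
            d = semitone_distance(f, s[-1])
            if d <= max_jump and (best is None or d < best[0]):
                best = (d, s)
        if best is None:
            streams.append([f])
        else:
            best[1].append(f)
    return streams


# Hypothetical alternating high/low sequence (frequencies in Hz).
sequence = [2500, 350, 2000, 430, 1600, 550]
print(split_into_streams(sequence))
```

    With a very large `max_jump` the same sequence stays in a single stream, which loosely mirrors the contrast between one coherent melody and several simultaneous ones.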

    If such an operational definition of melody is tenable, we must then question how it is that some of the extension elements other than pitch for "melodic" material are valid perceptually. For example, when a composer uses timbre as thematic material, do we still perceive the sequence as maintaining temporal continuity? Sometimes coherence is maintained by sensitive performers and sometimes it can be very difficult to perceive. Of course, we have not included other elements which affect the perception of melody, such as underlying harmonic structure, but we are only attempting to convey the notion that temporal coherence should be considered essential to melody formation. Conversely, one may use the principles of fission to develop rules for creating polyphony and counterpoint in sequences of acoustic events.

    Attention and Musical Structure

    It has become apparent to the first author that musical structure, as it is perceived in real-time, is inextricably bound to attentional processes. A bit of introspection will reveal that there are at least two kinds of attentional processes, which we might call active or willful attention, and passive or automatic attention. One might willfully direct one's attention to some object or sequence of events, such as listening to particular events within a piece of music. Or, some unusual event might attract one's attention unexpectedly, such as the honking horn of an oncoming car you had not noticed as you stepped into the street absorbed deep in thought. That horn demands your attention and in all probability gets it in a hurry.

    In particular, both kinds of attention may participate in the process of listening to auditory streams. For instance, in Figure when a sequence of tones lies above the temporal coherence boundary, no amount of active attention can extract the percept of one coherent stream. Here perception is limited by passive attentional processes [28]. This has important consequences for composers who intend to use fast melodic sequences, since it suggests that there are tempi at which the listener may not be able to follow as a melody the sequence you have constructed, regardless of the attentional will power invoked. An example of these effects may be found in the sequences presented at different rates in Charles Dodge's Earth's Magnetic Fields. Without pretending to know the composer's intent, it can be amusing to listen to the same sequence decompose and re-integrate itself during various tempo changes. In a multi-streamed sequence one can relax attentional effort with the result that attention might randomly alternate among the available streams. Or one might selectively focus attention on any one of them individually and even play them against one another.

    Van Noorden reported that the temporal coherence boundary (the boundary below which all tones may belong to one stream) is not affected by previous knowledge of the sequence, and considered it to be a function of a passive attentional mechanism, given that the listener "wants to hear coherence." However, what happens between the boundaries of temporal coherence and fission depends upon a number of factors, such as context, and seems to be under the influence of attention. Dowling [17], for example, reported that if listeners knew beforehand which melodies were being interleaved, they could, with a bit of practice, extract the appropriate melody even at very small separations of the ranges traversed by each melody. It is currently assumed that this ability would degenerate at faster tempi.

    In addition, van Noorden found an effect at very fast tempi where it was virtually impossible to hear sequences within a range of two or three semitones as other than temporally coherent. At very fast tempi (about 12.5 tones/sec.) the tones of such narrow patterns are not heard as separate members of a sequence, but actually merge into a continuously rippling texture, as one can hear in Taped Illustration 17.
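    The figures quoted here translate into simple quantities. As a rough sketch, assuming evenly spaced tone onsets and equal-tempered semitones (neither is stated in the text), the onset interval at a given tempo and the frequency ratio spanned by a given number of semitones can be computed as follows; the function names are illustrative.

```python
def onset_interval_ms(tones_per_sec):
    """Time between successive tone onsets, in milliseconds."""
    return 1000.0 / tones_per_sec


def semitone_ratio(semitones):
    """Frequency ratio spanned by a number of equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)


print(onset_interval_ms(12.5))      # 80.0
print(round(semitone_ratio(2), 4))  # 1.1225
print(round(semitone_ratio(3), 4))  # 1.1892
```

    Under these assumptions, the rippling texture arises when tones follow one another about every 80 ms within a band whose highest frequency is roughly 12 to 19 percent above its lowest.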

    There are thus attentional limits in the ability of the auditory system to track a sequence of events. When events occur too quickly in succession, the system uses the various organizational rules discussed in this article to reorganize the events into smaller groups. It may then track events within a particular group if the listener is paying attention to it, but this narrowing of focus necessarily causes a loss of information. One result is the inability to make fine temporal order judgments between streams. These organizational mechanisms reflect the tendency of the auditory system to simplify things in the face of excessive complexity. In the example where the fast sequence of tones merges into a continuous "ripple," the auditory system is unable to successfully integrate all of the incoming information into a temporal structure and simplifies the situation by interpreting it as texture (see also [31]). Thus the auditory system, beyond certain tempi, may interpret the sequence as a single event and assign to it the texture or timbre created by its spectral and temporal characteristics.

    An understanding of (or intuition about) these organizational processes can lead to new dimensions of control over musical structure. For example, one might construct contrapuntal sequences that play across various stream boundaries and through different borderline regions between temporal coherence and fission. (It may be that composers such as Bach were already using perceptual ambiguity consciously in their work.) Any or all of the relevant musical parameters might be used to accomplish this. Then, an appropriate use of events that vie for or demand the listener's attention can be used by the composer to "sculpt" the attentional processes of the listener. Since some events seem more striking to some persons than to others, this attentional sculpture in time would lead different listeners through different paths of auditory experience. Further, perception is bound to vary from time to time within a single person, so the experience would be different with each listening. A composition of sufficient, controlled complexity might thus be perceptually infinite for a given listener.

    Conclusion

    An attempt has been made here to point out that composers and music theorists should thoroughly examine the relationship between the "musical" principles they use and espouse, and the principles of sensory, perceptual, and cognitive organization that operate in the human auditory system. Many of the principles discussed in this article extend to higher-level perceptual analysis of musical context and structure and may well represent a scientific counterpart to some extant music theoretical principles. In other cases, though, this group of phenomena suggests perceptual organizations which have little relation to methods currently used to construct or analyze musical structure. To ignore the evidence from the real life system in developing a theory of music or a musical composition is to take the chance of relegating one's work to the realm of what might be termed "paper music."

    Acknowledgments

    This paper is based on a workshop entitled "The Perceptual Factoring of Acoustic Sequences into Musical Streams" delivered at the 1978 International Computer Music Conference, Evanston, Illinois, and published in the Proceedings of the Conference. The authors would like to thank Dr. Leon van Noorden for his many helpful suggestions and valuable criticisms of the manuscript; Figures , 8, 9, 10 and 20 were redrawn, with permission, from Dr. van Noorden's thesis. The present version of this article was prepared while Mr. McAdams was a Graduate Fellow of the National Science Foundation. Dr. Bregman's research has been supported by grants from the National Research Council of Canada, the Quebec Ministry of Education, and the McGill University Faculty of Graduate Studies and Research. Much of the research used the facilities of the Computer-Based Laboratory of the McGill University Department of Psychology. The editors of Computer Music Journal would like to thank Mr. McAdams for preparing the illustrations in this article.

    Appendix

    Description of Taped Illustrations

    1. A repeating sequence of three high tones (1600, 2000, 2500 Hz.) is interspersed with three low tones (350, 430, 550 Hz.) used by Bregman and Campbell (1971). a) At a tempo of five tones/sec. the sequence is perceived as one stream of alternating high and low tones (cf. Figure a). b) At a tempo of ten tones/sec. the sequence segregates perceptually into one stream of high tones and one stream of low tones (cf. Figure b).

    2. Another repeating six-tone sequence is played at a tempo of ten tones/sec., but the higher triplet is closer in frequency to the lower. Tone F may be perceived as belonging to either the high stream or the low stream depending on the listener's focus. Note that tone F cannot belong to both streams at once (cf. Figure ).

    3. A repeating six-tone sequence starts at a slow tempo. As the tempo is gradually increased, the sequence is progressively decomposed into smaller perceptual streams until it is no longer possible to follow the tonal events, which merge into the percept of timbre or texture (cf. Figure ).

    4. Using the same initial sequence as in Taped Illustration 3, the frequency separation between temporally adjacent tones is gradually increased. The same sort of decomposition into smaller streams occurs. The limits in this example are determined not by temporal resolution but by the audible frequency range (cf. Figure ).

    5. The tones of two familiar melodies are interleaved in the same frequency