
Cluster Analysis Applied to the NSW Meat Processing Industry

Alison Sheridan-Nethery¹

Kate o weIll

1 Department of Agricultural Economics & Business Management, University of New England

2 The Rural Development Centre, University of New England

A contributed paper to the 35th Annual Conference of the Australian Agricultural Economics Society, 11 to 14 February 1991, University of New England

The authors would like to thank Ray Cooksey of the Psychology Department at UNE for his valuable comments.


Introduction

With limited data available relating to the by-products sector of the Australian meat industry, very little research has been directed towards understanding the fundamental workings of this sector. This has impeded the development of coherent policies concerning by-products. The larger study that prompted this paper is specifically concerned with redressing this problem, by building up a comprehensive understanding of the by-products sector from the firm level (abattoir level).

With more than 400 abattoirs and slaughterhouses operating in Australia, it was not possible to examine each works to then draw conclusions about the total sector. Focusing on a sample of firms was the only viable approach. With limited research funds available, only those works in NSW could be considered. How to select the appropriate firms to study in order to capture the important features of the industry is a critical issue when using case studies as a research tool. In this case, the problem was how to select abattoirs in NSW that capture the important features of the total by-products sector.

When tackling this issue, cluster analysis seemed an appropriate tool to use. Cluster analysis is a generic term encompassing a range of techniques that seek to classify a set of entities into a number of classes such that the entities within classes have greater similarity to each other than they have to entities in other classes (Everitt, 1981). Applying cluster analysis to the abattoirs in NSW resulted in solutions which initially appeared to be fairly robust and intuitively appealing. However, during the analysis certain issues came to light that were considered worth examining in more detail. These related to the different formation rules that clustering algorithms apply in generating clusters and the concept of similarity. Both issues are discussed in the paper and possible solutions offered.

The Clustering Concept

There are essentially 5 steps involved in any classification procedure (Vogel, 1975). These are:

* the selection of the parameters to be used for the classification

* the choice of the similarity measure

* the choice of the cluster algorithm to employ

* the decision as to the appropriate number of clusters

* an evaluation of the usefulness of the resulting classification (the clusters)

As noted in Everitt (1981, p.103), "it is generally impossible a priori to anticipate what combination of variables, similarity measures, and clustering techniques are likely to lead to interesting and informative classifications". Often then, the analysis involves several stages at which the researcher may intervene to make changes to the variables, select an alternative similarity measure, focus on different sub-sets of entities and make other minor adjustments to the clustering to ensure that the clustering results in useful groups. It is important then, that the researcher recognise the heuristics of cluster analysis. "By doing so, the user is far less likely to make the mistake of reifying the cluster solution" (Aldenderfer & Blashfield, 1984, p.14).

The Choice and Measurement of Variables

The initial choice of variables to include in cluster analysis is a judgement by the researcher as to which attributes are considered important for the purposes of classification. As such, the final classifications arrived at are, in a sense, arbitrary as the classes produced by the clustering procedure are, naturally, greatly influenced by the variables employed. The clusters should, therefore, "only be valued with respect to their purposes and uses" (Man, 1989, p.27). In this study, the primary objective of clustering was to identify classes of abattoirs in which the members are relatively alike in their production processes.

The information available concerning the 61 licensed abattoirs currently operating in NSW was fairly limited as production information is considered commercially sensitive. Of the information that could be attained relating to the works, the following variables were considered relevant for sorting abattoirs into relatively homogeneous groups.

VARIABLE                  STATE

Volume of throughput      High / Medium / Low
Type of license           Export / Class 1 / Class 2
Rendering facilities      Presence / Absence
Species slaughtered       Cattle / Sheep / Other
Part of a company         Yes / No


The data concerning these variables for the abattoirs in NSW were obtained from personal communication with staff of the New South Wales Meat Industry Authority (NSWMIA) and the Department of Primary Industries and Energy, as well as information published in the NSWMIA's Annual Report (1989). The definition of each of the variables is generally self-explanatory. With respect to the type of licenses held, the essential distinction between the three licenses is concerned with the health and hygiene standards that the works have to meet. The requirements for the Export license are more stringent than for the Class 1 license, which in turn are more stringent than for the Class 2 license. The 'other' species slaughtered are pigs and goats.

All the variables were treated as dichotomous, and so were coded 1 or 0 to indicate the presence or absence of the feature, respectively. For the variables "volume of throughput" and "type of license", as outlined above, there were more than two states possible. However, by including only two states for each variable, then if both were coded 0, by implication the variable must be in the third state. For instance, by including only high and medium as the states possible for "volume of throughput", and if both are coded 0, the abattoir must have a low throughput. Similarly, by including only the states Export and Class 1 for the variable "type of license", and both are coded 0, by implication the abattoir has a Class 2 license.

If a works processes a low volume of throughput, has a Class 1 license and rendering facilities, slaughters cattle and sheep but no other species, and does not belong to a company, then it would be coded as follows:

H   M   X   C1  RF  CA  SH  OT  Co
0   0   0   1   1   1   1   0   0

where
H  = High throughput
M  = Medium throughput
X  = Export license
C1 = Class 1 license
RF = Rendering facilities
CA = Cattle slaughtered
SH = Sheep slaughtered
OT = Other species slaughtered
Co = Part of a company
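The coding rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `code_abattoir`, its argument names and the string values ("high", "class1" and so on) are assumptions introduced here, not part of the original study.

```python
# Illustrative sketch only: the helper name and argument values are hypothetical.
VARIABLES = ["H", "M", "X", "C1", "RF", "CA", "SH", "OT", "Co"]


def code_abattoir(throughput, license_type, rendering, species, company):
    """Return the nine dichotomous codes for one works, in VARIABLES order."""
    return [
        int(throughput == "high"),
        int(throughput == "medium"),    # low throughput is implied by H = M = 0
        int(license_type == "export"),
        int(license_type == "class1"),  # a Class 2 license is implied by X = C1 = 0
        int(rendering),
        int("cattle" in species),
        int("sheep" in species),
        int("other" in species),
        int(company),
    ]


# The works described in the text: low throughput, Class 1 license, rendering
# facilities, cattle and sheep only, not part of a company.
example = code_abattoir("low", "class1", True, {"cattle", "sheep"}, False)
print(dict(zip(VARIABLES, example)))
```

Dropping the third state of each three-state variable keeps every column dichotomous while still identifying the omitted state uniquely.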

Each of the 61 abattoirs was coded in this manner to create the data set. Tabachnick and Fidell (1989, p.87) provide a useful checklist that the researcher should follow when screening a data set to ensure there are no problems within the data that will adversely affect analyses. Carrying out checks for the abattoir data set, the data screening identified 2 of the 61 abattoirs as multivariate outliers. As these 2 abattoirs slaughtered neither cattle nor sheep, they were not part of the population the larger study is concerned with, and so were deleted. The data set remaining was made up of 59 abattoirs with 9 dichotomous variables.

To regard entities as similar or dissimilar is the basis for classification. Similarity, however, cannot be easily defined. The problem of similarity is not simply whether entities are alike or not alike, but instead in the way these concepts are expressed and implemented in research (Aldenderfer & Blashfield, 1984, p.17). Depending on the particular circumstances with which one is dealing, the expressions for similarity can vary greatly. As clustering has become increasingly popular, and applied to diverse fields of research, a number of techniques have been developed to measure similarity between cases to reflect particular circumstances. A vital issue to be addressed when carrying out cluster analysis, then, is what similarity measure should be used to gauge the degree of likeness between the entities with which the researcher is dealing.

When the data are dichotomous, similarity and dissimilarity between two cases is represented in the 'presence' or 'absence' of features, as depicted in the following two-way association table:

                      CASE 1
                    1       0
CASE 2      1       a       b       a+b
            0       c       d       c+d
                   a+c     b+d      p

where
a = joint presence of a feature
b = absence of feature for Case 1 and presence of feature for Case 2
c = presence of feature for Case 1 and absence of feature for Case 2
d = joint absence of a feature
p = a+b+c+d

A number of alternatives are available for expressing the similarity between entities when the data are dichotomous (Wishart 1987). However, most are simply variations on three basic measures of association:

simple matching     $(a+d)/(a+b+c+d)$
Jaccard             $a/(a+b+c)$
binary euclidean    $(b+c)/(a+b+c+d)$

Both simple matching and Jaccard are similarity coefficients that focus on the commonality between two entities. They differ only in their treatment of the joint absence of an attribute. With simple matching, joint absences indicate similarity. Jaccard's coefficient, which ignores the joint absence of features, was developed to measure similarity in cases where it is inappropriate to consider two entities as similar because both lack an attribute. Binary euclidean is a distance coefficient and differs from the similarity coefficients in that it focuses on the number of attributes that two entities do not have in common. Whether the researcher wishes to focus on shared or non-shared attributes depends on the nature of the problem being addressed.

For the purposes of this paper the simple matching coefficient was adopted, since on all variables similarity was implicit in joint absences.
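As a minimal sketch of the three association measures discussed in this section, the following Python computes the cell counts a, b, c and d of the two-way association table and the coefficients derived from them. The function names and the two example codings are illustrative assumptions, and returning NaN for an undefined Jaccard coefficient is a convention chosen here, not one taken from the paper.

```python
def association_counts(case1, case2):
    """Return (a, b, c, d) for two equally long 0/1 vectors."""
    a = b = c = d = 0
    for x1, x2 in zip(case1, case2):
        if x1 and x2:
            a += 1          # joint presence
        elif not x1 and x2:
            b += 1          # absent in case 1, present in case 2
        elif x1 and not x2:
            c += 1          # present in case 1, absent in case 2
        else:
            d += 1          # joint absence
    return a, b, c, d


def simple_matching(case1, case2):
    a, b, c, d = association_counts(case1, case2)
    return (a + d) / (a + b + c + d)


def jaccard(case1, case2):
    a, b, c, _ = association_counts(case1, case2)
    # Undefined when neither case has any attribute present; NaN is an assumed convention.
    return a / (a + b + c) if (a + b + c) else float("nan")


def binary_euclidean(case1, case2):
    a, b, c, d = association_counts(case1, case2)
    return (b + c) / (a + b + c + d)


# Two hypothetical works coded on the nine variables H, M, X, C1, RF, CA, SH, OT, Co.
works_x = [0, 0, 0, 1, 1, 1, 1, 0, 0]
works_y = [0, 1, 0, 1, 1, 1, 0, 0, 0]
print(association_counts(works_x, works_y))
print(simple_matching(works_x, works_y), jaccard(works_x, works_y),
      binary_euclidean(works_x, works_y))
```

For any pair of cases the simple matching and binary euclidean values sum to one, which shows that they carry the same information viewed as similarity and as distance respectively.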

In choice of algorithm, the proposition that clustering provides an avenue for selecting representative firms suggested two highly desirable attributes that an algorithm should possess. First, that it maximise the similarity between entities within groups and the distance (dissimilarity) between groups. From the large array of algorithms available those that had a tendency to find tight hyperspherical clusters were considered the most appropriate. The nature of these clusters implies that entities are as similar as possible over all dimensions (variables). This tendency has been criticised in some disciplines for its persistence in applying hyperspheres regardless of the underlying shape (Everitt, pp.95-97). For example, hyperspheres may be considered inappropriate where entities can have a wide range over one or more dimensions but still be considered similar. That is, real clusters may be elongated rather than hyperspherical. While hypersphericity is considered appropriate in this study and indeed conceivably in many economic applications, it may be considered inappropriate in some instances.


The second, and more generally desirable, attribute is that clustering algorithms should generate optimal solutions. While a number of algorithms have been developed (Everitt), few can readily be applied to dichotomous data. Two exceptions, including the monothetic divisive technique, are available in the Wishart (1987) suite of programs. In the initial stages of analysis the uses of these algorithms were explored, but as solutions were at odds with other results produced, they were dropped from the analysis. The monothetic divisive technique (Wishart, pp.79-90) is re-evaluated later in the analysis, where the coefficient presented in the final section of the paper was applied to the data.

As an alternative to optimising algorithms one may adopt a clustering strategy which computes an initial solution using algorithms in general use and subjects these solutions to an optimising algorithm. The resulting cluster represents a local optimum only. To increase the likelihood that a global optimum has been reached requires computation from a number of starting points (selected either randomly or computed through other algorithms). As software limitations precluded the use of this strategy, the approach that was settled upon for this study was to analyse the data using a number of suitable hierarchical clustering algorithms. To the degree that each algorithm generated similar solutions at the level required, the resulting clusters could be regarded as robust although they might not constitute an optimum.

The preceding discussion has outlined two cluster attributes considered by the authors to be highly desirable in clustering algorithms and which are the subject of ongoing investigation. The issues associated with these attributes represent only a few of those which need addressing if clustering is to be fruitfully utilised in economic analyses.

Within the hierarchical agglomerative method five techniques are in general use. Four of these were considered. These are:

Complete linkage method, which clusters by the rule that any entity to be included into an existing cluster has to be within a certain level of similarity to all members of that cluster (Aldenderfer and Blashfield, p.40), that is, it tries to minimise the intra-class variance. This method has been shown to have a tendency to find compact, hyperspherical clusters made up of very similar cases.

Ward's method is designed to minimise the variance within clusters. At each stage in the clustering, the fusion of all possible pairs of clusters is assessed and the two clusters whose union leads to the smallest increase in the variance are combined (Everitt, p.91). Ward's method generally develops clusters which are roughly equal in size and are hyperspherical in shape.

The average linkage method operates by deriving an average value for the similarity of the case being considered relative to all cases in the existing cluster, and the case joins the cluster if a level of similarity is attained using this average value. This algorithm is particularly useful in separating outliers.

The centroid method depicts the clusters in euclidean space and as clusters are formed they are replaced by the coordinates of their centroid. The groups with the smallest distance between their centroids are then fused. This method has been popular because it is space-conserving and is relatively simple (Lance & Williams, 1967).

No one technique is considered superior to the others in all circumstances. However, only two, Ward's and complete linkage, are likely to satisfy the requirement of tight hyperspheres.
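To make the joining rules concrete, the sketch below implements a naive agglomerative procedure for two of the techniques described above, complete linkage and average linkage, on a dissimilarity matrix. It is a teaching sketch under stated assumptions rather than the software used in the study: the function names, the four example cases and the use of the proportion of mismatched variables (one minus simple matching) as the dissimilarity are all illustrative choices.

```python
def mismatch(x, y):
    """Proportion of variables on which two 0/1 codings differ (1 - simple matching)."""
    return sum(1 for xi, yi in zip(x, y) if xi != yi) / len(x)


def agglomerate(dist, n_clusters, linkage="complete"):
    """Fuse clusters pairwise until n_clusters remain; returns lists of case indices."""
    clusters = [[i] for i in range(len(dist))]

    def between(c1, c2):
        pair_d = [dist[i][j] for i in c1 for j in c2]
        if linkage == "complete":
            return max(pair_d)                # furthest pair decides
        if linkage == "average":
            return sum(pair_d) / len(pair_d)  # mean over all pairs
        raise ValueError("unknown linkage: " + linkage)

    while len(clusters) > n_clusters:
        # Tied dissimilarities are resolved by index order here, which is arbitrary.
        _, p, q = min(
            (between(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]


# Four hypothetical cases coded on five dichotomous variables.
cases = [
    [1, 0, 1, 1, 1],
    [1, 0, 1, 1, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 1],
]
dist = [[mismatch(x, y) for y in cases] for x in cases]
print(agglomerate(dist, 2, "complete"))
print(agglomerate(dist, 2, "average"))
```

The comment on tie handling matters: when several pairs share the smallest dissimilarity, the fusion order depends on an arbitrary rule, so different programs can return different hierarchies from the same matrix.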

With hierarchical clustering techniques, the researcher must decide what level of clustering is most significant. That is, what number of clusters best captures the natural groupings of the abattoirs. Various stopping rules have been proposed but none are hard and fast and few are applicable across all clustering algorithms. However, at the minimum clusters should be interpretable within the context of the data.

A practical method of deciding when to terminate clustering, that has gained some acceptance, is plotting the classification criterion against the number of clusters, where a sharp step in this plot indicates the number of classes (Gower, 1975). However, as this method is recognised as being subjective and has been found to be inapplicable in a number of studies, it must be used cautiously. When the clustering algorithms were applied to the data two decisive steps were apparent, one between the three and the four cluster solutions and the other between the seven and eight cluster solutions (as seen in Appendix 1).
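The step-in-the-plot heuristic can be sketched as a search for the largest changes in the classification criterion between successive numbers of clusters. The criterion values below are invented for illustration and are not the values plotted in the paper; the function name `largest_steps` is likewise hypothetical.

```python
def largest_steps(criterion, top=2):
    """Return the (k, k+1) pairs with the largest drop in the criterion.

    criterion maps a number of clusters k to the criterion value at k clusters.
    """
    ks = sorted(criterion)
    steps = []
    for k, k_next in zip(ks, ks[1:]):
        jump = criterion[k] - criterion[k_next]
        steps.append((jump, (k, k_next)))
    steps.sort(key=lambda s: s[0], reverse=True)
    return [pair for _, pair in steps[:top]]


# Hypothetical criterion values (NOT the study's data), indexed by number of clusters.
criterion = {2: 5.0, 3: 4.6, 4: 2.4, 5: 2.1, 6: 1.9, 7: 1.7, 8: 0.6, 9: 0.5}
print(largest_steps(criterion))
```

With these made-up values the two largest steps fall between three and four clusters and between seven and eight clusters. As the text cautions, such a rule is subjective: it ranks steps but cannot say whether any of them is large enough to be meaningful.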

The final step in a clustering procedure is validation of the solutions. Everitt, in discussing the problems associated with judging the validity of clusters, suggests that one means of validating the clusters formed is that of running a variety of clustering algorithms based on different assumptions, and "only clusters formed by all or the majority of the methods be accepted" (Everitt, p.75). This approach was taken, and the clustering algorithms average linkage and centroid were retained to validate the Ward's and complete linkage solutions.

The resulting clusters differed somewhat between algorithms at low levels of fusion but converged at higher levels. The four cluster solutions for the Ward's and centroid algorithms were identical, and there was only a difference between the handling of three abattoirs by these algorithms and the average linkage and complete linkage solutions (the three abattoirs were handled slightly differently in each of the latter two algorithms). As there is such a small difference between the solutions, the clusters generated by the Ward's clustering process were judged to be fairly robust.

The four clusters given by Ward's method are listed in Appendix 2. A cluster analysis of abattoirs will only be relevant for this study if the clusters are different in most of the principal characteristics. The distinguishing characteristics of the clusters are discussed below.

The first group is made up of 14 abattoirs which have Export licenses. The works in this category all slaughter cattle (with most slaughtering at least one other species) and they all have rendering facilities. There are a few export works that slaughter cattle only, a feature not found in any works just servicing the local market.

The distinguishing feature of the second group is that each of the five works only slaughters sheep. Three of the five works have an export license and belong to companies. Four of the works handle a low volume of throughput; the export works handling a high volume of throughput has only recently been completed and is far more sophisticated than the other works.

There are 18 abattoirs grouped in the third cluster, 15 of which have a Class 1 license. The three works with a Class 2 license are included in this category because they slaughter a medium volume of throughput, an atypical feature for Class 2 works. Only three of the works in this category belong to a company and all bar one slaughter all species.

The final category is the largest and is made up of the remaining 22 Class 2 works. Each works handles only a small volume of throughput and does not belong to a company. The vast majority of these slaughter all species. This group represents the small works, those generally servicing their regional market, hence the low volume of throughput.


While the four cluster solution satisfied both the analytical and resource requirements of the wider study, this solution was not optimal in that it did not represent the probable 'natural' clusters that exist in the data. As such an optimal solution would exist at a lower level in the clustering hierarchy, the instability of solutions at lower fusion levels warranted investigation. In the next section the underlying causes of the instability are discussed.

The instability of cluster membership at lower order cluster solutions mentioned in the previous section highlights a number of problems associated with both choice of cluster algorithm and clustering strategies and with the concept and application of similarity to dichotomous data.

Two sources of instability were identified. The more common and disruptive source lies in the different joining rules that clustering algorithms apply in forming clusters. Unless the data contain distinct groups of entities, allocation of fringe or intermediate entities can vary according to the algorithm employed. For example, given a relatively uniform elongated scatter of entities, single linkage (one of the hierarchical family), which joins on a nearest neighbour rule, will probably produce a single loose cluster. Ward's, on the other hand, seeks to minimise intra-cluster variance and would therefore tend to produce two or more tight clusters. While this problem has been investigated by many researchers, no generally applicable rules as to when a particular algorithm should be applied have been developed.

The second source of instability was found to be ties in the similarity coefficients. Ties occur when two entities have identical similarity to a third entity but differ from each other. To illustrate, the similarity between the following three entities was calculated using simple matching.

[Table: codings of Cases 1 to 3 on the variables, followed by their simple matching similarity matrix. Only one matrix entry, 77.8 for entity 2, is legible in the source.]

If entities 1 and 3 are treated as separate clusters, clearly it is impossible to distinguish with which cluster entity 2 should join. This predicament is not unique to simple matching. Since all binary association measures are computed from the two-way association table, ties can occur regardless of which is adopted. Where ties are very few the problem may not be severe, but when the matrix has a number of ties, cluster solutions may become highly unstable. In the next section an alternative method for calculating binary similarity coefficients is proposed which largely overcomes this problem.
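A tie of the kind described here is easy to reproduce. The three codings below are hypothetical, chosen only so that entity 2 is equally similar to entities 1 and 3 under simple matching while entities 1 and 3 differ from each other; they are not the cases tabulated in the paper.

```python
def simple_matching(x, y):
    """Share of variables on which two 0/1 codings agree."""
    matches = sum(1 for xi, yi in zip(x, y) if xi == yi)
    return matches / len(x)


# Hypothetical codings on nine dichotomous variables (illustration only).
case_1 = [1, 0, 0, 1, 0, 0, 1, 0, 0]
case_2 = [1, 0, 0, 1, 1, 1, 1, 0, 0]
case_3 = [1, 0, 0, 1, 1, 1, 1, 1, 1]

s12 = simple_matching(case_1, case_2)
s23 = simple_matching(case_2, case_3)
s13 = simple_matching(case_1, case_3)
print(round(100 * s12, 1), round(100 * s23, 1), round(100 * s13, 1))
```

Entity 2 agrees with each neighbour on seven of nine variables, so no similarity-based joining rule can prefer one neighbour over the other, and the outcome depends on processing order or on an arbitrary tie-breaking rule.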

In the preceding section equivalent differences, or ties, in the similarity between cases was one of the conditions identified as causing instability in cluster membership. The likelihood of this condition occurring is higher for dichotomous data than for continuous or multi-state data as dichotomous data are limited to two state values. Consequently, instability in lower order hierarchical cluster solutions can be expected to occur more frequently when similarity is calculated from dichotomous data. In this section a new approach to calculating similarity from binary data, which is designed to reduce the likelihood of ties, is reported.

Doreian (1982) and others have explored the application of algebraic topology in the analysis of the structure of social relations within a group of individuals. The data that are the subject of analysis in this context are often dichotomous. For example, the data may be in the form of an incidence matrix recording each individual's participation in a series of activities. The application of algebraic topology involves analysis of the co-correspondence matrix which is formed by the product of the incidence matrix and its transpose (Doreian, pp.219-222). The co-correspondence matrix records the frequency with which any two individuals jointly participate in an activity. A row (or column) of this matrix is termed a 'simplicial complex' and records the frequency with which an individual jointly participates in activities with each of the other individuals. The identification of the social structure of the group involves analysis of the network of simplicial complexes.

From another perspective, each element of the co-correspondence matrix can be interpreted as a measure of the similarity between two individuals. In other words, each element of a simplicial complex represents the similarity between two individuals in terms of the frequency with which they jointly participate in the activities.

Another possibility, which does not appear to have been explored, is to treat the simplicial complexes, in toto, as a measure of similarity. In other words, the pattern of co-correspondence between each individual and all other individuals becomes the basis of similarity. If two individuals possess identical simplicial complexes then the similarity between those two individuals is exact. The more the simplicial complexes differ, the more dissimilar are two individuals.

The employment of the simplicial complex as a measure of similarity when data are dichotomous has the effect of diminishing the likelihood of ties in similarity. The reduction in ties arises from two sources. First, the values of a simplicial complex are multi-state rather than dichotomous. Second, use of the complex involves, implicitly, comparisons between entities across attributes and across all entities in the data. Conventional measures of similarity involve comparisons between entities across attributes only.

To operationalise the use of the simplicial complex as a basis for measuring similarity, the following similarity measure was developed:

[equation lost in the source]     $i, j, k = 1, \ldots, n$

where $n$ is the number of individuals or cases, and the $v_{ij}$ denote element $j$ of simplicial complex $i$, normalised by the primary simplex for case $i$. Algebraically,

$v_{ij} = w_{ij} / w_{ii}$

where $w_{ij}$ is element $j$ of simplicial complex $i$.
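The construction of the normalised simplicial complexes can be sketched as follows. Two points are assumptions made here and should be read as such: the combining formula is taken to be a plain euclidean distance between normalised complexes, on the strength of the paper's remark that the coefficient can be interpreted as a euclidean distance between two simplicial complexes, and the paper's exact expression (including any scaling that gives it its bounds) may differ. The function names and the three example cases are hypothetical.

```python
import math


def co_correspondence(incidence):
    """W = A A^T: w[i][j] counts the attributes present in both case i and case j."""
    n = len(incidence)
    return [
        [sum(a * b for a, b in zip(incidence[i], incidence[j])) for j in range(n)]
        for i in range(n)
    ]


def normalised_complexes(incidence):
    """Divide each row of W (a simplicial complex) by its diagonal element w_ii."""
    w = co_correspondence(incidence)
    v = []
    for i, row in enumerate(w):
        if row[i] == 0:
            raise ValueError("case %d has no attribute present, so w_ii = 0" % i)
        v.append([w_ij / row[i] for w_ij in row])
    return v


def complex_distance(v, i, j):
    """ASSUMED form: euclidean distance between normalised complexes i and j."""
    return math.sqrt(sum((vi - vj) ** 2 for vi, vj in zip(v[i], v[j])))


# Three hypothetical cases on nine dichotomous variables; under simple matching
# the middle case is equally similar (7/9) to the other two.
cases = [
    [1, 0, 0, 1, 0, 0, 1, 0, 0],
    [1, 0, 0, 1, 1, 1, 1, 0, 0],
    [1, 0, 0, 1, 1, 1, 1, 1, 1],
]
v = normalised_complexes(cases)
print(co_correspondence(cases))
print(complex_distance(v, 0, 1), complex_distance(v, 1, 2))
```

In this small example the two distances involving the middle case are no longer equal, which illustrates how comparing whole complexes, rather than attribute by attribute, can separate cases that are tied under a conventional binary coefficient.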


The resulting measure is a dissimilarity coefficient possessing both upper and lower bounds; the coefficient can be interpreted as measuring the euclidean distance between two simplicial complexes (Everitt, p.17).

When the coefficient was employed to measure similarity between abattoirs the number of ties was significantly reduced. Using the simple matching coefficient there were 3715 ties present in the matrix. In comparison, there were 433 ties present in the matrix when the proposed coefficient was adopted, a reduction of 88 per cent. The coefficient clearly offers promise as a means of reducing the effect of ties on cluster instability.

To test whether the proposed measure improves stability between algorithms the two similarity matrices were used to form clusters using Ward's method and complete linkage. Stability, measured in terms of common classification of cases, was improved at the six cluster solution but reduced at the four cluster solution. At the six cluster solution 73 per cent of cases were placed in the same classification using the proposed coefficient while 69 per cent of cases were placed in the same classification when both algorithms were applied to the simple matching coefficient. At the four cluster solution the results were 61 and 78 per cent for the proposed and matching coefficients respectively.
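The paper does not spell out how "common classification of cases" was computed. One plausible reading, sketched below purely as an assumption, is to match the cluster labels of the two solutions one-to-one in the most favourable way and report the share of cases that then fall in the same class. The function name and the example labellings are hypothetical.

```python
from itertools import permutations


def percent_agreement(labels_a, labels_b):
    """Largest percentage of cases classified together under any one-to-one
    matching of the cluster labels of two solutions with equal numbers of clusters."""
    names_a = sorted(set(labels_a))
    names_b = sorted(set(labels_b))
    if len(names_a) != len(names_b):
        raise ValueError("solutions must have the same number of clusters")
    best = 0
    for perm in permutations(names_a):
        mapping = dict(zip(names_b, perm))  # relabel solution b in terms of solution a
        agree = sum(1 for a, b in zip(labels_a, labels_b) if mapping[b] == a)
        best = max(best, agree)
    return 100.0 * best / len(labels_a)


# Hypothetical three cluster solutions for eight cases from two algorithms.
solution_wards = [1, 1, 1, 2, 2, 3, 3, 3]
solution_complete = ["x", "x", "y", "y", "y", "z", "z", "z"]
print(percent_agreement(solution_wards, solution_complete))
```

Trying every label permutation is practical only for small numbers of clusters, such as the four to seven cluster solutions discussed here; larger problems would call for an assignment algorithm instead.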

The decrease in stability of cluster solutions at broader classification levels using the proposed coefficient probably arises from two sources. First, the lower the level of the hierarchical solution, the more important is the role played by ties since differences in joining rules (that is, clustering algorithms) are more likely to generate differences in cluster membership. Second, the range of state values of conventional binary similarity coefficients is limited to the number of variables. In contrast the range for the proposed coefficient is a function of the number of variables, the number of cases and the diversity of simplicial complexes. Therefore, when a conventional coefficient is used different clustering algorithms are likely to converge more quickly to the same clustering solutions at high levels in the hierarchy. This behaviour emphasises the importance of choosing the correct level in the hierarchy as the appropriate clustering solution. It also highlights the need for more robust optimising clustering algorithms to be developed to assist researchers in making this choice. By way of example, the optimising procedure, monothetic divisive clustering (mentioned earlier), was employed on the data. The treatment of the data by this procedure most closely resembles, of all the clustering procedures, the manner in which data are treated by the proposed coefficient. This procedure generated an optimum solution of seven clusters and 90 per cent of cases were placed in the same classifications that are obtained in a seven cluster solution using Ward's method. For complete linkage, a 70 per cent agreement with the seven cluster solution resulted, while the concordance between the Ward's and complete solutions at seven clusters was 76 per cent. This compares with a 66 per cent agreement between Ward's and complete when a seven cluster solution is generated using the simple matching coefficient.

The similarity measure proposed in this paper shows promise as a means of measuring similarity when data are dichotomous. Further work is necessary to refine the concepts presented here and to expand the application of the coefficient. Additional work is being undertaken to develop an optimising clustering algorithm for use with the coefficient.

The purpose of applying cluster analysis to the NSW meat processing industry was to establish some means of categorising abattoirs into relatively homogeneous groups. From each of these groups a sample firm could then be drawn which would capture the range of works currently operating. In so far as the clusters generated in the initial analysis are fairly robust and interpretable, this objective was satisfied.

While the four cluster solution satisfied both the analytical and resource requirements of the wider study, this solution was not optimal in that it did not represent the probable 'natural' clusters that exist in the data. As such an optimal solution would exist at a lower level in the clustering hierarchy, the instability of solutions at lower fusion levels warranted investigation. Two sources of instability were identified: the differences in joining rules that cluster algorithms apply in forming clusters, and the presence of ties in the data. The former source has been the focus of much research but resolution of the issues involved is unlikely for some time. It is, therefore, largely left to the researcher to heuristically justify their choice of algorithm. With respect to measuring similarity when data are dichotomous, a new approach to calculating similarity was presented. The attributes of the proposed similarity coefficient largely overcome the problem with ties. Use of the coefficient is the subject of continuing research as it appears it may be of value in addressing a number of other issues concerning similarity.

Despite the shortcomings discussed in this paper the authors feel cluster analysis has great potential in quantitative economic analysis. Apparent, though, is the need to apply the technique with caution and an urgent need for further investigation into the problems and concepts of clusters as they apply in economic analyses.


Appendix 1

[Figure: clustering coefficient plotted against the number of clusters, in two panels: WARD'S COEFFICIENT and CENTROID COEFFICIENT. Horizontal axis: Clusters.]

Appendix 2

CLUSTER 1
1. Blayney Abattoir P/L
2. Cudgegong (Abattoir) County Council
3. Gunnedah Shire Council
4. Metro Meat (Moss Vale)
5. R.J. Gilbertson P/L
6. Wingham Abattoir P/L
7. The Mid Coast Meat Co P/L
8. North West Exports P/L
9. Northern Co-op Meat Co Ltd
10. The Aberdeen Beef Co
11. Guyra Meat Packing P/L
12. Riverstone Meat Co P/L
13. Lachley Meats P/L
14. Metro Meat (Wagga) Ltd

CLUSTER 2
15. Australian Superior Meat Packers P/L
16. Australian Specialised Meat Products P/L
17. Fletcher International Exports P/L
27. Staddmn P/L
59. SlPay Meat Co.

CLUSTER 3
18. Beers Butcheries P/L
19. A.J. Bush & Sons P/L
20. Castlereagh Regional Abattoir
21. F.O. Nichols (Abattoir)
22. Tamworth City Abattoir
23. T. & J. Wadland P/L
24. Cowra Abattoir Ltd
25. PJP'rish Meat Supplies
26. G.M. Scott P/L
28. S. Morrow & Sons
29. Griffith Abattoir
30. Burrangong Abattoirs
31. R.N. Taylor P/L
32. ~\1ndra Abattoir
33. Hastings Meat Supply P/L
34. I.R. Burnett
35. Eden Meats P/L
39. Wollondilly Central Killing Co-op

CLUSTER 4
36. ()llJUblgai Killing Centre
37. Wandael Pastoral Company
38. ~y Porahy & Sons
40. Swans Butchery
41. D. & S. Afflick
42. Deringuna Meats
43. Wyalong Meatworks
44. Frederickton Meat Service
45. Yo~oP~
46. BablaraaJ'/L
47. Murwillumbah Abattoir P/L
48. Moore & Latimore
49. Cunnyngham Bros
50. Tancred & Constable
51. Southern Riverina Meat Supply P/L
52. R.G. Uat1oW
53. C.P. Ba}1 Butchery & Co
54. C.O. Perry & Co
55. R. & N. Stenhouse
56. P.R. & R.N. Andrews
57. Morgans Butchery
58. Macleay Valley Meats


References

Aldenderfer, M.S. & Blashfield, R.K. (1984), Cluster Analysis, Sage Publications, Beverly Hills.

Clifford, H.T. & Stephenson, W. (1975), An Introduction to Numerical Classification, Academic Press, New York.

Doreian, P. (1982), 'On the Delineation of Small Group Structure', in Hudson, H. & Associates, Classifying Social Data, Jossey-Bass Inc, San Francisco.

Everitt, B. (1981), Cluster Analysis, 2nd edn, Halsted Press, New York.

Gower, J.C. (1971), A General Coefficient of Similarity and Some of its Properties, Biometrics 27, 857-872.

Green, P.E., Tull, D.S. & Albaum, G. (1988), Research For Marketing Decisions, Prentice-Hall International Edition, New York.

Jardine, N. & Sibson, R. (1968), The Construction of Hierarchic and Non-Hierarchic Classifications, Computer Journal 11, 177-184.

Rummel, R.J. (1970), Applied Factor Analysis, Northwestern University Press, Evanston, Ill.

Sneath, P.H.A. & Sokal, R.R. (1973), Numerical Taxonomy, W.H. Freeman & Co, San Francisco.

Tabachnick, B.G. & Fidell, L.S. (1989), Using Multivariate Statistics, 2nd edn, Harper & Row Publishers, New York.

Wishart, D. (1987), Clustan User Manual, 4th Edition, Computing Laboratory, University of St Andrews, Edinburgh.