JOURNAL GEOLOGICAL SOCIETY OF INDIA
Vol.74, November 2009, pp.573-578
Multivariate Statistical Analysis of Geochemical Data of Groundwater in Veeranam Catchment Area, Tamil Nadu
M. SUVEDHA1, B. GURUGNANAM2, M. SUGANYA3 and S. VASUDEVAN4
1Department of Geology, Alagappa Govt. Arts College, Karaikudi - 630 003
2Department of Earth Sciences, Annamalai University, Annamalai Nagar - 608 002
3No.572, P.S.P. Street, Soodamani Nagar, Karaikudi - 630 003
4Department of Geology, Bharathidasan University, Trichy
Email: [email protected]
Abstract: The study of hydrogeochemistry of the Mio-Pliocene sedimentary rock aquifer system in the Veeranam catchment area produced a large geochemical dataset. Groundwater samples were collected at 52 sites over an area of 963.86 km² and analyzed for major ions. Such a large volume of data can lead to difficulties in the integration, interpretation and representation of the results. Two multivariate statistical methods, Hierarchical cluster analysis (HCA) and Factor analysis (FA), were applied to a subgroup of the dataset to evaluate their usefulness in classifying the groundwater samples and in identifying the geochemical processes controlling groundwater geochemistry. Hydrochemical data for 52 groundwater samples were subjected to Q- and R-mode factor and cluster analysis. R-mode analysis reveals the inter-relations among the variables studied and Q-mode analysis reveals the inter-relations among the samples studied. The R-mode factor analysis shows that Ca, Mg and Cl, together with HCO3, account for most of the electrical conductivity, total dissolved solids and total hardness of groundwater. The 'single dominance' nature of the majority of the factors in the R-mode analysis indicates non-mixing or partial mixing of different types of groundwater. Both Q-mode factor and Q-mode cluster analyses indicate an exchange between the river water and the groundwater in the vicinity. Rock-water interaction with the silty clayey flood basin back swamp deposits is the major cause of the cluster II classification. The cluster classification map reveals that 58% of the study area comes under the cluster II classification.
Keywords: Groundwater, Multivariate statistical analysis, Geochemical data, Tamil Nadu.
INTRODUCTION
The objective of the study is to identify the processes controlling the geochemical evolution of groundwater by using two proven methods of multivariate analysis of geochemical data sets, namely Hierarchical cluster analysis (HCA) and Factor analysis (FA). Given the relatively complex setting and geological history of the study area, the use of HCA and FA aims at distinguishing the respective roles of geological and hydrogeological factors in this hydrochemical evolution. We also assessed the relative applicability and complementarity of the HCA and FA methods compared to conventional geochemical grouping in achieving the scientific evaluations.
Multivariate statistical analysis has been successfully applied in a number of hydrogeochemical studies. Steinhorst and Williams (1985) used multivariate statistical analysis of water chemistry data in two field studies to identify groundwater sources. Usunoff and Guzmán-Guzmán (1989) demonstrated the usefulness of the approach in hydrogeochemical investigations for understanding the geological and hydrogeological state of the aquifer. Multivariate treatment of environmental data is also widely used to characterize and evaluate groundwater quality (Vengosh and Keren, 1996; Suk and Lee, 1999; Helena et al. 2000; Reghunath, 2002; Lambrakis et al. 2004; Panagopoulos et al. 2004; Vincent Cloutier et al. 2008). It is also useful for identifying temporal and spatial variations caused by natural and human factors linked to seasonality.
STUDY AREA
The study area, the Veeranam catchment, occupies an area of 963.86 km², falling in parts of Cuddalore and Perambalur districts, Tamil Nadu. It lies between North latitudes 11°05'56" - 11°26' and East longitudes 79°15'30" - 79°32'10" (Fig.1). Physiographically, the area is flat with a gentle slope and experiences high rainfall from the north-east monsoon. Geologically, the area is underlain by alluvial deposits of Early to Middle Pleistocene. The nature and character of the alluvium have been studied, based on the geological sections prepared from the well logs of the tube wells in the area. From these logs, it is evident that the alluvial deposits which form the potential aquifers primarily consist of thick deposits of mottled sandstone, clay and lignite of Mio-Pliocene age. The Quaternary formations are restricted to the alluvium of Cauvery, Kollidam and their distributaries, which occurs as isolated remnant patches over the Cuddalore Formation. The area is bounded by the river Vellar in the north and Kollidam in the south, running along the eastern to northwestern and northern to northeastern boundaries.

Fig.1. Map showing water sample locations.

0016-7622/2009-74-5-573/$ 1.00 © GEOL. SOC. INDIA
MATERIALS AND METHODS
GEOCHEMISTRY
Fifty-two water samples were collected in May 2006 from different shallow dug wells and deep bore wells, which are almost uniformly distributed over the study area. Only those wells that are in constant use and approachable were selected for sampling. After half an hour of discharge from the tube wells, the samples were collected in airtight bottles with stoppers and subjected to chemical analysis to see the variations in quality parameters. Major cations and anions were estimated by the titration method. Residual sodium carbonate (RSC) was calculated by subtracting (Ca+Mg) from the sum of carbonates and bicarbonates, all expressed as epm (Eaton, 1950). Sodium adsorption ratio (SAR) was calculated by dividing sodium by the square root of half of (Ca+Mg), all expressed as epm (Richard, 1954).

Unit (Table 1): Concentration in ppm except pH, EC (µS cm⁻¹), RSC and SAR (meq l⁻¹).
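The two indices defined above can be sketched in a few lines of Python. This is only an illustration of the stated definitions: the equivalent weights are standard textbook values rather than figures quoted in the paper, the function names are hypothetical, and the sample concentrations are invented for the example.

```python
import math

# Equivalent weights (mg per meq) for converting ppm (mg/l) to epm (meq/l).
# Standard textbook values; they are an assumption, not quoted from the paper.
EQUIVALENT_WEIGHT = {"Na": 22.99, "Ca": 20.04, "Mg": 12.15, "CO3": 30.00, "HCO3": 61.02}


def ppm_to_epm(ppm, ion):
    """Convert a concentration in ppm to equivalents per million."""
    return ppm / EQUIVALENT_WEIGHT[ion]


def residual_sodium_carbonate(co3, hco3, ca, mg):
    """RSC = (CO3 + HCO3) - (Ca + Mg), all terms in epm."""
    return (co3 + hco3) - (ca + mg)


def sodium_adsorption_ratio(na, ca, mg):
    """SAR = Na / sqrt((Ca + Mg) / 2), all terms in epm."""
    return na / math.sqrt((ca + mg) / 2.0)


# Hypothetical sample (ppm); not one of the 52 wells in the study.
sample_ppm = {"Na": 46.0, "Ca": 40.0, "Mg": 12.0, "CO3": 0.0, "HCO3": 244.0}
epm = {ion: ppm_to_epm(value, ion) for ion, value in sample_ppm.items()}

rsc = residual_sodium_carbonate(epm["CO3"], epm["HCO3"], epm["Ca"], epm["Mg"])
sar = sodium_adsorption_ratio(epm["Na"], epm["Ca"], epm["Mg"])
print(f"RSC = {rsc:.2f} epm, SAR = {sar:.2f}")
```

The sketch returns the raw difference for RSC; whether negative values were reported as zero in the study is not stated, so no clamping is applied here.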
MULTIVARIATE STATISTICAL ANALYSIS
Factor Analysis (FA)
Multivariate techniques can help to simplify and organize large data sets and to make useful generalizations that can lead to meaningful insight (Laaksoharju et al. 1999). Cluster and factor analyses are efficient ways of displaying complex relationships among many objects (Davis, 1986). The two modes of cluster and factor analysis, i.e. Q- and R-mode analyses, have been applied to the data generated. R-mode analysis reveals the interrelations among the variables studied and Q-mode analysis reveals the interrelations among the samples studied. Software packages such as the Statistical Package for Social Sciences (SPSS) and STATISTICA 6 have been used to carry out the analysis. The data have been standardized by using standard statistical procedures.
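The paper does not specify the standardization procedure. A common choice, assumed here, is the z-score, which rescales each variable to zero mean and unit standard deviation so that parameters measured on very different scales (for example EC and pH) carry equal weight. The following minimal sketch uses hypothetical values and helper names.

```python
from statistics import mean, stdev


def zscore(values):
    """Rescale values to zero mean and unit (sample) standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def standardize_columns(rows):
    """Standardize each variable (column) of a samples-by-variables table."""
    columns = list(zip(*rows))                  # one tuple per variable
    z_columns = [zscore(col) for col in columns]
    return [list(r) for r in zip(*z_columns)]   # back to one row per sample


# Hypothetical concentrations (ppm) for four samples; columns: Na, Ca, Cl.
data = [
    [20.0, 40.0, 60.0],
    [35.0, 55.0, 90.0],
    [50.0, 70.0, 150.0],
    [95.0, 110.0, 300.0],
]
z = standardize_columns(data)

# R-mode works on the variables (columns); Q-mode works on the samples (rows).
r_mode_objects = list(zip(*z))   # 3 variables, each described by 4 samples
q_mode_objects = z               # 4 samples, each described by 3 variables
```

The sample standard deviation (n - 1 denominator) is used here; statistical packages may differ on this point.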
Hierarchical Cluster Analysis (HCA)
Cluster analysis comprises a series of multivariate methods which are used to find true groups of data. In clustering, the objects are grouped such that similar objects fall into the same class (Danielsson et al. 1999). Hierarchical clustering joins the most similar observations, and then successively the next most similar observations. The levels of similarity at which observations are merged are used to construct a dendrogram. In this study, a standardized-space Euclidean distance (Davis, 1986) is used. A small distance shows that two objects are similar or "close together", whereas a large distance indicates dissimilarity.

Table 1. Mean and standard deviation of the chemical parameters of groundwater
Parameter   Valid N   Mean     Minimum   Maximum   Std.Dev
Na          52        51.98    5.52      358.8     56.81
K           52        11.92    0.39      117.3     24.39
Ca          52        64.17    6.01      170.34    42.67
Mg          52        17.02    1.22      46.21     11.97
Cl          52        131.41   17.73     425.52    102.85
HCO3        52        205.12   42.71     842.08    149.06
SO4         52        5.27     0         96.06     14.34
pH          52        6.89     6         7.7       0.43
EC          52        713.65   140       2480      482.90
TDS         52        382.28   72.96     1375.27   264.68
TH          52        230.28   20.02     565.45    146.56
RSC         52        0.29     0         6.8       1.04
SAR         52        1.52     0         8.34      1.29
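The Euclidean distance in standardized space mentioned in the hierarchical clustering description can be sketched as follows. The input rows are hypothetical standardized values (one row per sample), and the function names are illustrative only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two samples described by the same variables."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(rows):
    """Symmetric matrix of pairwise distances between all samples."""
    n = len(rows)
    return [[euclidean(rows[i], rows[j]) for j in range(n)] for i in range(n)]


# Hypothetical standardized values for three samples and two variables.
standardized = [
    [-1.0, -0.5],
    [-0.8, -0.4],
    [1.8, 0.9],
]
d = distance_matrix(standardized)
# Samples 0 and 1 are "close together"; sample 2 is dissimilar to both.
```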
RESULTS AND DISCUSSION
The chloride, calcium, sodium and bicarbonate contents show a significant difference between the median and maximum values, the mean values being near a quarter of the maximum values. This suggests local contamination of the groundwater system. The wide range of bicarbonate contents, from 42.71 to 842.1 ppm, is the result of the lateral geological variations of the layers. Box plots of the chemical concentrations show that bicarbonate, calcium, chloride and TDS have the largest dispersions (Fig.2). The enrichment of chloride and TDS, from 17.73 to 425.52 ppm and from 72.96 to 1375.27 ppm respectively, is observed in the groundwater on the eastern side of the study area. It is attributed to enrichment from the flood basin back swamp deposits of silty clayey formation. The increase in the salt concentration could be associated with different mechanisms such as water-rock interaction processes.
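The comparison between mean and maximum values can be checked directly against the figures reported in Table 1. The short sketch below computes the mean-to-maximum ratio for the four ions named above; the dictionary layout and variable names are illustrative.

```python
# (mean, maximum) in ppm, as reported in Table 1.
table1 = {
    "Na": (51.98, 358.8),
    "Ca": (64.17, 170.34),
    "Cl": (131.41, 425.52),
    "HCO3": (205.12, 842.08),
}

# Ratio of the mean to the maximum for each ion.
ratios = {ion: avg / maximum for ion, (avg, maximum) in table1.items()}

for ion, ratio in ratios.items():
    print(f"{ion}: mean/max = {ratio:.2f}")
```

With these figures the ratios fall between roughly 0.14 (Na) and 0.38 (Ca), with bicarbonate closest to one quarter.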
Fig.2. Box and Whisker plot for chemical parameters of groundwater samples.

R - Mode Factor Analysis
R-mode factor analysis of the different chemical constituents of the groundwater of the Veeranam catchment area has been carried out. All cations and anions, TDS, EC, pH and hardness have been considered for the present analysis. The analysis generated five factors which together account for 95.08% of the variance. The rotated loadings, eigenvalues, percentage of variance and cumulative percentage of variance of all five factors are given in Table 2. The first eigenvalue is 7.69, which accounts for 59.2% of the total variance and constitutes the first and main factor. The second and third eigenvalues are 2.38 and 1.01, and these account for 18% and 7.84%, respectively, of the total variance. Each of the remaining eigenvalues constitutes less than 10% of the total variance. The first factor (which accounts for 59.2% of the total variance) is characterised by very high loadings of Ca, Mg, Cl and EC, and moderate to high loadings of bicarbonate and pH. This factor reveals that the EC and TDS in the study area are mainly due to Ca, Mg and Cl, though bicarbonate also plays a substantial role in determining EC and TDS. This factor accounts for the temporary hardness of the water. The second factor (which accounts for 18% of the total variance) is mainly associated with very high loadings of Na, Cl and bicarbonate, and also with a moderate loading of TDS. This factor accounts for the temporary salinity of the water. The loading of bicarbonate is similar to that in the first factor. Factors 3-5 are each characterized by the dominance of only one variable, namely SO4 (factor 3), HCO3 (factor 4) and K (factor 5), and together these three factors account for about 17% of the total variance.
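The link between the eigenvalues and the percentages of variance quoted above can be reproduced with simple arithmetic. The sketch assumes that the analysis was run on 13 standardized variables (the 13 parameters listed in Table 2), so that the total variance equals 13; this is an inference from the reported numbers, not a statement made in the paper.

```python
from itertools import accumulate

# Eigenvalues of the five retained factors as reported in Table 2
# (fourth value taken as 0.664, consistent with the reported 5.107 %).
eigenvalues = [7.699, 2.385, 1.019, 0.664, 0.592]

# Assumption: 13 standardized variables (Na, K, Ca, Mg, Cl, HCO3, SO4, pH,
# EC, TDS, TH, RSC, SAR), so the total variance to be shared equals 13.
N_VARIABLES = 13

percent_variance = [100.0 * ev / N_VARIABLES for ev in eigenvalues]
cumulative_percent = list(accumulate(percent_variance))

for k, (ev, pct, cum) in enumerate(
    zip(eigenvalues, percent_variance, cumulative_percent), start=1
):
    print(f"Factor {k}: eigenvalue {ev:.3f}, {pct:.2f} % of variance, cumulative {cum:.2f} %")
```

Under this assumption the first three factors come out at about 59.2%, 18.3% and 7.8%, and the five factors together at about 95.1%, in line with Table 2; small differences in the last digit are expected from rounding in the published values.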
Table 2. R-mode factor analysis with Varimax normalized rotation
Parameter          Factor 1   Factor 2   Factor 3   Factor 4   Factor 5
Na                 0.444      0.868      0.090      0.018      0.150
K                  0.140      0.412      0.025      0.071      0.886
Ca                 0.972      -0.035     0.118      0.089      -0.052
Mg                 0.751      0.093      -0.007     0.576      0.132
Cl                 0.875      0.354      0.084      -0.209     0.229
HCO3               0.566      0.538      0.073      0.601      0.086
SO4                0.119      0.125      0.949      0.000      0.001
pH                 0.541      0.120      0.489      0.327      0.278
EC                 0.820      0.499      0.099      0.179      0.186
TDS                0.801      0.516      0.145      0.136      0.224
TH                 0.959      0.006      0.084      0.258      0.007
RSC                -0.128     0.879      0.126      0.283      0.206
SAR                0.172      0.935      0.046      -0.038     0.218
Eigenvalue         7.699      2.385      1.019      0.664      0.592
% Total Variance   59.223     18.34      7.84       5.107      4.560
Cumulative %       59.22      77.57      85.41      90.524     95.08

Q - Mode Factor Analysis
The rotated loadings, eigenvalues, percentage of variance and cumulative percentage of variance of the first three factors are given in Table 3. Q-mode factor analysis of the 52 groundwater samples generated three factors which together accounted for 99.87% of the total variance. The three factors obtained in this way were rotated using the Varimax procedure (Knudson et al. 1977) so that they could be more easily interpreted. The first factor (which explains 37.96% of the total variance) was considered the major factor controlling the relative proportions of major elements in the groundwater samples and had high loadings for almost all the samples except those from location nos. 28 and 46.

On the other hand, groundwater samples from the three locations 28, 46 and 51 had high loadings in the second factor. As mentioned earlier, the Q-mode factor analysis described the relative proportions of these major elements in groundwater samples. Therefore, the relative proportions of major elements in these groundwater samples were controlled almost entirely by the three factors, which together explain 99.87% of the total variance. The distribution of wells explained by factors 2 and 3 does not conform to any kind of spatial pattern. However, the majority of the samples within factor 1 fall on either side of the main course of the river system. This strongly suggests that there is an exchange between the river water and groundwater in the vicinity. This has also been discussed by Reghunath et al. (2002).
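One way to read the Q-mode loadings is to assign each sample to the factor on which it loads most strongly. The sketch below does this for a handful of samples, using the loadings listed for them in Table 3. The assignment rule itself is an illustrative assumption, not a procedure described in the paper.

```python
# (Factor 1, Factor 2, Factor 3) loadings for selected samples, from Table 3.
loadings = {
    1: (0.637, 0.614, 0.466),
    6: (0.501, 0.525, 0.685),
    28: (0.447, 0.754, 0.481),
    46: (0.447, 0.754, 0.481),
    51: (0.524, 0.727, 0.444),
}


def dominant_factor(row):
    """Return the 1-based index of the factor with the largest loading."""
    return max(range(len(row)), key=lambda i: row[i]) + 1


def communality(row):
    """Sum of squared loadings: share of a sample's variance that is explained."""
    return sum(value ** 2 for value in row)


dominant = {sample: dominant_factor(row) for sample, row in loadings.items()}
explained = {sample: communality(row) for sample, row in loadings.items()}
```

For these rows the samples from locations 28, 46 and 51 fall under factor 2, as stated in the text, and each communality is close to 1, consistent with three factors explaining nearly all of the variance.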
Table 3. Q-mode factor analysis with Varimax normalized rotation
S.No               Factor 1   Factor 2   Factor 3
1                  0.637      0.614      0.466
2                  0.634      0.546      0.548
3                  0.711      0.521      0.472
4                  0.551      0.489      0.676
5                  0.539      0.618      0.571
6                  0.501      0.525      0.685
7                  0.63       0.627      0.458
8                  0.563      0.589      0.578
9                  0.516      0.687      0.511
10                 0.685      0.585      0.435
11                 0.705      0.532      0.469
12                 0.551      0.689      0.47
13                 0.545      0.705      0.453
14                 0.765      0.522      0.375
15                 0.547      0.537      0.642
16                 0.533      0.696      0.481
17                 0.635      0.649      0.418
18                 0.669      0.572      0.475
19                 0.578      0.665      0.472
20                 0.528      0.651      0.545
21                 0.563      0.589      0.578
22                 0.556      0.589      0.586
23                 0.502      0.524      0.687
24                 0.546      0.538      0.642
25                 0.583      0.578      0.57
26                 0.473      0.619      0.626
27                 0.566      0.542      0.621
28                 0.447      0.754      0.481
29                 0.711      0.521      0.472
30                 0.547      0.537      0.642
31                 0.652      0.643      0.401
32                 0.521      0.708      0.477
33                 0.672      0.564      0.479
34                 0.589      0.638      0.496
35                 0.679      0.524      0.514
36                 0.665      0.542      0.51
37                 0.648      0.588      0.484
38                 0.665      0.533      0.522
39                 0.762      0.383      0.52
40                 0.719      0.455      0.525
41                 0.708      0.513      0.484
42                 0.703      0.591      0.394
43                 0.55       0.71       0.44
44                 0.671      0.587      0.418
45                 0.548      0.71       0.441
46                 0.447      0.754      0.481
47                 0.685      0.584      0.436
48                 0.711      0.5        0.494
49                 0.765      0.522      0.375
50                 0.55       0.71       0.44
51                 0.524      0.727      0.444
52                 0.77       0.411      0.486
Eigenvalue         19.74      18.501     13.694
% Total Variance   37.962     35.58      26.335
Cumulative %       37.962     73.541     99.876

HIERARCHICAL CLUSTER ANALYSIS
The HCA is a data classification technique. There are different clustering techniques, but hierarchical clustering is the one most widely applied in Earth sciences (Davis, 1986), and it is often used in the classification of hydrogeochemical data (Steinhorst and Williams, 1985; Schot and van der Wal, 1992; Ribeiro and Macedo, 1995; Güler et al., 2002). The result of the hierarchical cluster analysis is given as a dendrogram (Fig.3). For this project, the Euclidean distance was chosen as the distance measure, or similarity measurement, between sampling sites. The sampling sites with the greatest similarity are grouped first. Next, groups of samples are joined with a linkage rule, and the steps are repeated until all observations have been classified. With this geochemical dataset, Ward's method was more successful in forming clusters that are more or less homogeneous and geochemically distinct from other clusters, compared to other methods such as the weighted pair-group average. Ward's method is distinct from other linkage rules because it uses an analysis of variance approach to evaluate the distances between clusters (StatSoft Inc., 2004). Other studies have also used Ward's method in their cluster analysis (Adar et al. 1992; Schot and van der Wal, 1992). Güler et al. (2002) also found that using the Euclidean distance as a distance measure and Ward's method as a linkage rule produced the most distinctive groups.

Fig.3. Dendrogram of the hierarchical cluster analysis using the Ward method.
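The agglomerative procedure described above, with Ward's variance-based linkage rule, can be sketched in plain Python. This is a naive illustration on hypothetical two-variable data, not the STATISTICA implementation used in the study. At each step it merges the pair of clusters whose union gives the smallest increase in within-cluster sum of squares.

```python
def centroid(points):
    """Mean position of a group of points."""
    n = len(points)
    return [sum(p[k] for p in points) / n for k in range(len(points[0]))]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if groups a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    squared_distance = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * squared_distance


def ward_clusters(points, n_clusters):
    """Merge clusters step by step until n_clusters remain; return index lists."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost(
                    [points[k] for k in clusters[i]],
                    [points[k] for k in clusters[j]],
                )
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]   # merge the cheapest pair
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical standardized data: six samples forming three obvious groups.
samples = [
    (0.0, 0.0), (0.1, 0.2),
    (5.0, 5.0), (5.2, 4.9),
    (10.0, 0.0), (10.1, 0.3),
]
groups = ward_clusters(samples, n_clusters=3)
```

Recording the cost of each merge, rather than stopping at a fixed number of clusters, would give the linkage distances needed to draw a dendrogram.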
There are three major clusters, as shown in Fig.3. Clusters 1, 2 and 3 correspond to factors 1, 2 and 3 respectively. The similarity of the Q-mode cluster analysis to the Q-mode factor analysis confirms the interpretations made using the Q-mode factor analysis. To understand the spatial distribution of the various cluster classes, the results were taken into a GIS platform, in which a spatial distribution map was prepared (Fig.4). The salient findings of the spatial distribution map are given in Table 4.
Table 4. Results of Cluster Classification spatial distribution map
Grid code   Cluster classification   Area in km²
1           Cluster I                127.27
2           Cluster II               533.8
3           Cluster III              302.79

Fig.4. Spatial distribution map of cluster classes.

CONCLUSION
The scientific evaluation of the raw data by FA and HCA leads to the conclusion that the water-rock interaction process is the major mechanism responsible for the groundwater salinity in the study area. The water samples are mainly of calcium-bicarbonate type, pointing to an aquifer lithology dominated by calcareous sandstone and clayey formation. The factor analysis reveals that the calcium and magnesium concentrations are the major sources of the hardness of groundwater. Anthropogenic contamination was identified in both the aquifers, due to local pollution inputs. The Q-mode factor and cluster analyses indicate that exchange between the river water and the groundwater plays a dominant role in the hydrochemical evolution of groundwater. The cluster classification map reveals that 58% of the study area comes under the cluster II classification.

References
ADAR, E.M., ROSENTHAL, E., ISSAR, A.S. and BATELAAN, O. (1992) Quantitative assessment of the flow pattern in the southern Arava Valley (Israel) by environmental tracers and a mixing cell model. Jour. Hydrology, v.136, pp.333-352.
DANIELSSON, A., CATO, I., CARMAN, R. and RAHM, L. (1999) Spatial clustering of metals in the sediments of the Skagerrak/Kattegat. Applied Geochemistry, v.14, pp.689-706.
EATON, F.M. (1950) Significance of carbonate in irrigation water. Soil Sci., v.69, pp.123-133.
GÜLER, C., THYNE, G.D., MCCRAY, J.E. and TURNER, A.K. (2002) Evaluation of graphical and multivariate statistical methods for classification of water chemistry data. Hydrogeology Jour., v.10, pp.455-474.
HELENA, B., PARDO, B., VEGA, M., BARRADO, E., FERNANDEZ, J.M. and FERNANDEZ, L. (2000) Temporal evolution of groundwater composition in an alluvial aquifer (Pisuerga River, Spain) by principal component analysis. Water Res., v.34(3), pp.807-816.
KNUDSON, E.J., DUEWER, D.L., CHRISTIAN, G.D. and LARSON, T.V. (1977) Application of factor analysis to the study of rain chemistry in the Puget Sound region. In: B.R. Kowalski (Ed.), Chemometrics: Theory and Application. ACS Symposium Series, Washington, DC, pp.80-116.
LAMBRAKIS, N., ANTONAKOS, A. and PANAGOPOULOS, G. (2004) The use of multicomponent statistical analysis in hydrogeological environmental research. Water Res., v.38, pp.1862-1872.
PANAGOPOULOS, G., LAMBRAKIS, N., TSOLIS-KATAGAS, P. and PAPOULIS, D. (2004) Cation exchange processes and human activities in unconfined aquifers. Environ. Geol., v.46, pp.542-552.
REGHUNATH, R., SREEDHARA, M.T.R. and RAGHAVAN, B.R. (2002) The utility of multivariate statistical techniques in hydrogeochemical studies: an example from Karnataka, India. Water Res., v.36(10), pp.2437-2442.
RIBEIRO, L. and MACEDO, M.E. (1995) Application of multivariate statistics, trend and cluster analysis to groundwater quality in the Tejo and Sado aquifer. In: Groundwater Quality: Remediation and Protection. Proceedings of the Prague Conference, May 1995. IAHS Publ. No.225, pp.39-47.
SCHOT, P.P. and VAN DER WAL, J. (1992) Human impact on regional groundwater composition through intervention in natural flow patterns and changes in land use. Jour. Hydrology, v.134, pp.297-313.
STATSOFT INC. (2004) STATISTICA (Data Analysis Software System), Version 6.
STEINHORST, R.K. and WILLIAMS, R.E. (1985) Discrimination of groundwater sources using cluster analysis, MANOVA, canonical analysis and discriminant analysis. Water Resources Res., v.21, pp.1149-1156.
SUK, H. and LEE, K. (1999) Characterization of a ground water hydrochemical system through multivariate analysis: clustering into ground water zones. Ground Water, v.37(3), pp.358-366.
VENGOSH, A. and KEREN, R. (1996) Chemical modifications of groundwater contaminated by recharge of treated sewage effluent. Contam. Hydrol., v.23, pp.347-360.
VINCENT CLOUTIER, RENE LEFEBVRE, RENE THERRIEN and MARTINE M. SAVARD (2008) Multivariate statistical analysis of geochemical data as indicative of the hydrogeochemical evolution of groundwater in a sedimentary rock aquifer system. Jour. Hydrology, v.353, pp.294-313.

(Received: 28 April 2008; Revised form accepted: 20 June 2009)