Plant leaf roughness analysis by texture classification with
generalized Fourier descriptors in a dimensionality reduction
context
Journaux, L.1*, Simon, J.-C.1, Destain, M.F.2, Cointault, F.3, Miteran, J.4 and Piron, A.2
1AgroSupDijon, Engineering Sciences, 26 Bd Dr Petitjean BP 87999, 21079 Dijon Cedex,
France,
2FUSAGX, Unité de Mécanique et Construction, 2 passage des déportés, 5030 Gembloux,
Belgium,
3AgroSupDijon, Agroengineering Sciences, 26 Bd Dr Petitjean BP 87999, 21079 Dijon
Cedex, France,
4Université de Bourgogne, Le2i, Avenue Alain Savary BP 47870, 21078 Dijon Cedex, France
*[email protected]; phone: 0033/380.39.23.71; fax: 0033/380.39.27.51
Abstract In the context of plant leaf roughness analysis for precision spraying, this study
explores the capability and the performance of some combinations of pattern recognition and
computer vision techniques to extract the roughness feature. The techniques merge feature
extraction, linear and nonlinear dimensionality reduction techniques, and several kinds of
methods of classification. The performance of the methods is evaluated and compared in
terms of the error of classification. The results for the characterization of leaf roughness by
generalized Fourier descriptors for feature extraction, kernel-based methods such as support
vector machines (SVM) for classification and kernel discriminant analysis for dimensionality
reduction were encouraging. These results pave the way to a better understanding of the
adhesion mechanisms of droplets on leaves that will help to reduce and improve the
application of phytosanitary products and lead to possible modifications of sprayer
configurations.
Keywords Texture classification ∙ Precision spraying ∙ Motion descriptors ∙ Dimensionality
reduction ∙ Leaf roughness ∙ Kernel discriminant analysis
Introduction
Since the development of precision agriculture (Robert 1999), much research has been
done on the optimization of inputs in the field to reduce the environmental impact and to
increase the yield, which is of benefit to farmers. Two specific activities have been focused
upon especially: fertilizer application (mineral or organic spreading) (Hijazi et al. 2008;
Villette et al. 2008) and spraying for appropriate weed control (Yun and Yong 2006). In
research on precision spraying, in particular, one objective is to minimize the volume of
phytosanitary products applied to reduce environmental effects by using more effective plant
treatments. The main goal is to ensure that the sprayed products reach their target, to reduce
losses that occur at the time of application. The mechanisms of losses by drift are now well
known, but those due to runoff from leaves are still poorly understood. The latter are related
to the adhesion mechanisms of liquids on a surface. Specific models have been developed
(Forster et al. 2005) that showed that the predominant factor is leaf roughness, on which little
robust research has been done. For example, with a hydrophobic surface the ‘lotus effect’ can
appear, as in Fig. 1.
Fig. 1 Representation of the ‘lotus effect’ (photograph by William Thielicke1)
For natural images, texture in the form of colour information is a fundamental
characteristic usually used in pattern and object recognition in different domains such as
medical and biological imaging, biometry, earth observation and industrial control by
computer vision (Fig. 2).
1 http://wthielicke.gmxhome.de/bionik/indexuk.htm
Fig. 2 Different domains in which texture analysis is applied (from left to right): medical,
biological, biometric, earth observation and industrial control
A successful texture classification or segmentation requires an efficient method for feature
extraction. The major difficulty, however, is that textures in the real world are often not
uniform due to changes in orientation, scale, illumination conditions, or other visual effects.
In our case we consider the invariant features (scale, illumination and rotation invariant)
called generalized Fourier descriptors (GFD) (Smach et al. 2007), which extract robust but
high-dimensional texture feature vectors.
Unfortunately, in a classification context, high-dimensional vectors are often redundant and
strongly correlated, and they suffer from the ‘Hughes phenomenon’ (Hughes 1968),
which results in inaccurate classification (Fig. 3).
Fig. 3 Illustration of the Hughes phenomenon (vertical axis: classification performance; horizontal axis: data complexity, i.e. dimensionality)
To improve classification performance of the original features we combine the
classification steps with a selection of 13 linear and nonlinear dimensionality reduction (DR)
techniques (from classical principal component analysis (PCA) to more recent methods such
as Laplacian eigenmaps (LE) or kernel discriminant analysis (KDA)), which transform high-
dimensional data into a meaningful representation of reduced dimensionality.
Numerous studies have aimed to compare DR algorithms, usually using synthetic data
such as a swissroll (Lee et al. 2004; Lee and Verleysen 2007), but less so for natural data
such as hyperspectral images as in Journaux et al. (2006) or Niskanen and Silven (2003)
(Fig. 4). However, it is important to note that the goal of DR algorithms is to explore
the intrinsic structure of high-dimensional data, for example by unfolding data in the case of
the swissroll or reducing high-dimensional natural signal data as for hyperspectral images.
Fig. 4 (a) Swissroll manifold frequently used as synthetic data (Lee and Verleysen 2007) and
(b) natural data from a hyperspectral image (Short)
We propose to characterize the leaf roughness of different plant leaf images (1242
texture images) by taking a computer vision approach with a combination of spatio-frequency
texture feature extraction and six methods of classification. Previous research into plant leaf
identification or recognition generally used only one method of classification, as in Wu
et al. (2007), or focused on morphological features (Tzionas et al. 2005). However, some
studies such as that of Backes and Bruno (2009) explored texture analysis to classify plant
leaves.
Our aim is to be able to characterize the hydrophilic or hydrophobic behavior of a leaf
with signal and/or image processing so as to adapt the sprayer settings and the application of
phytosanitary products to crops.
The advantages in the operational contexts of precision agriculture or precision
spraying relate to two main aspects. First, for agricultural equipment manufacturers, knowledge of
the optimum characteristics of the spraying throws necessary to maximize the proportion of
the product deposited on leaves is essential to optimize the equipment and to reduce the
environmental impacts and costs of phytosanitary products. Secondly, it helps
phytopharmaceutical firms to understand the adhesion mechanisms of their products on
leaves for the development of sprays, especially of the adjuvant type (surfactants, oils,
humectants).
Finally, within the EcoPhyto 2018 French program this research could help to
optimize previous agronomic models and help land users, especially winegrowers, to reduce
their inputs. This work is currently under investigation with the BIVB2 organization.
This paper briefly describes the GFD used as texture feature extraction tools, the
methods of classification used and the DR methods that are compared with one another. These
methods are combined with a feature selection approach to complete the comparison. Two
texture datasets (one synthetic and one natural) are compared, and the most efficient
combination is highlighted and discussed. Ideas for future work are included at the end of the
results.
2 Bureau Interprofessionnel des Vins de Bourgogne (Interprofessional Bureau of the Burgundy Wines)
Generalized Fourier descriptors (GFD) and methods of classification
Texture characterization using generalized Fourier descriptors (GFD)
The main goal of texture analysis is to formalize the texture feature by mathematical
parameters. Several methods have already been proposed in the literature to extract the texture
features and tested in practice. There are five main families of methods to extract textural
features (Jain and Tuceryan 1993): structural, statistical and spatio-frequency approaches, and
methods based on form recognition and fractals.
The GFD are defined as follows. Let $f$ be a square summable function on the plane.
The Fourier transform is then
$$\hat{f}(\xi) = \int_{\mathbb{R}^2} f(x)\,\exp(-j\,\langle \xi, x\rangle)\,dx. \tag{1}$$
If $(\lambda, \theta)$ are the polar coordinates of point $\xi$, we denote $\hat{f}(\lambda, \theta)$ as the Fourier transform of
$f$ at $(\lambda, \theta)$. Gauthier et al. (1991) defined the mapping $D_f$ from $\mathbb{R}_+$ into $\mathbb{R}_+$
by
$$D_f(\lambda) = \int_0^{2\pi} \left|\hat{f}(\lambda, \theta)\right|^2 d\theta, \tag{2}$$
where $D_f$ is the GFD feature vector, extracted as in Fig. 5, that describes each texture image.
Fig. 5 Procedure to find GFD texture vectors (feature vector from the image of centered 2D
Fourier transform of the texture image)
This GFD vector will be used as an input to the supervised classification method and will be
reduced by DR methods. The GFD features, calculated according to Eq. 2, have several
properties that are useful for object recognition: they are translation-, rotation- and reflection-
invariant.
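The extraction of Eq. 2 can be illustrated with a minimal sketch. It assumes that the integral over the angle is approximated by summing the squared magnitude of the centered discrete 2D Fourier transform over rings of integer radius, and that one descriptor is kept per radius up to half the image size (an assumption consistent with m=32 for 64 × 64 samples). The function name `gfd` and the parameter `n_radii` are hypothetical, not taken from the authors' implementation.

```python
import numpy as np

def gfd(image, n_radii=32):
    """Discrete sketch of D_f(lambda): sum of |F|^2 over rings of radius lambda."""
    img = np.asarray(image, dtype=float)
    # Centered 2D Fourier transform and its power spectrum
    power = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    h, w = img.shape
    y, x = np.indices((h, w))
    # Distance of every frequency sample to the zero-frequency position
    radius = np.hypot(y - h // 2, x - w // 2)
    ring = np.rint(radius).astype(int)  # assumed binning: nearest integer radius
    sums = np.bincount(ring.ravel(), weights=power.ravel())
    return sums[:n_radii]

rng = np.random.default_rng(0)
texture = rng.random((64, 64))
descriptor = gfd(texture)
# Circular translation and 90 degree rotation leave the descriptor unchanged
same_after_shift = np.allclose(descriptor, gfd(np.roll(texture, (5, 9), axis=(0, 1))))
same_after_rot = np.allclose(descriptor, gfd(np.rot90(texture)))
```

Because only the modulus of the spectrum is used, a circular shift of the image changes nothing, and a 90° rotation only permutes samples within each ring.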
Methods of classification
Classification is a central problem in pattern recognition (Duda et al. 2001) and many
approaches to solve it have been proposed, such as the connectionist approach (Bishop 1995),
metric-based methods such as k-nearest neighbours (k-nn) and kernel-based methods such as
support vector machines (SVM) (Vapnik 1998). In our experiments, the average performances
of the dimensionality reduction methods and of one basic feature selection method applied to
the GFD features have to be evaluated. In this context, we have chosen and evaluated six
efficient classification approaches from four families of classification: the boosting (AdaBoost)
family (Schapire 1990) using three weak classifiers (hyperplane, hyperinterval and
hyperrectangle), the hyperrectangle (polytope) method (Miteran et al. 1994), the SVM method
(Vapnik 1998; Abe 2005) and the neural network family with a multilayer perceptron (MLP)
(Rumelhart et al. 1986). To validate the classification performance and estimate the average
error for each method, we performed 20 iterative experiments with a 10-fold cross validation
procedure (Witten and Eibe 2005).
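The validation protocol can be sketched as follows, under the assumption that "20 iterative experiments" means 20 independent repetitions of a 10-fold cross validation with a fresh random partition each time. A 1-nearest-neighbour rule is swapped in as a stand-in classifier to keep the example self-contained; it is not one of the six classifiers of the study, and the function names are hypothetical.

```python
import numpy as np

def one_nn_predict(X_train, y_train, X_test):
    """Stand-in classifier: label of the closest training vector."""
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=-1)
    return y_train[d2.argmin(axis=1)]

def repeated_kfold_error(X, y, predict=one_nn_predict,
                         n_repeats=20, n_folds=10, seed=0):
    """Average test error (%) over n_repeats runs of n_folds cross validation."""
    rng = np.random.default_rng(seed)
    n = len(y)
    errors = []
    for _ in range(n_repeats):
        # New random partition of the samples for each repetition
        folds = np.array_split(rng.permutation(n), n_folds)
        for test_idx in folds:
            train_mask = np.ones(n, dtype=bool)
            train_mask[test_idx] = False
            y_pred = predict(X[train_mask], y[train_mask], X[test_idx])
            errors.append(np.mean(y_pred != y[test_idx]))
    return 100.0 * float(np.mean(errors))

rng = np.random.default_rng(1)
X_demo = np.vstack([rng.normal(0.0, 0.1, (50, 2)), rng.normal(5.0, 0.1, (50, 2))])
y_demo = np.repeat([0, 1], 50)
demo_error = repeated_kfold_error(X_demo, y_demo)
```

Any classifier exposing the same `predict(X_train, y_train, X_test)` signature could be passed in place of the stand-in.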
Dimensionality reduction methods
The GFD provide features that have great potential in pattern recognition, but they result
in very high-dimensional data that are difficult to handle and comprise redundant information.
Moreover, the computational cost of elaborate data processing tasks may be prohibitive.
Therefore, dimensionality reduction (DR) techniques are used to transform high-dimensional
data into a meaningful representation of reduced dimensionality to improve classification
performance.
Let $X = (x_1, \ldots, x_n)^T$ be an n × m data matrix, where n is the number of image examples in
each texture dataset and m is the dimension of vector $x_i$, corresponding to the discrete
computation of $D_f$ from Eq. 2. Ideally, the reduced representation has a dimensionality that
corresponds to the intrinsic dimensionality of the data (Camastra and Vinciarelli 2002). One
of our working hypotheses is that, although the data (all texture images) are points in $\mathbb{R}^m$,
there is a p-dimensional manifold $M = (y_1, \ldots, y_n)^T$ that can suitably approximate the space
spanned by the data points. The so-called intrinsic dimension (ID) of X in $\mathbb{R}^m$ is the smallest
possible value of p (p<m) for which the approximation of X by M is reasonable. In other
words, the ID is defined as the number of variables that is sufficient to represent the signal. To
determine the ID of our data, we used a geometric approach that estimates the equivalent to
the fractal dimension (Camastra and Vinciarelli 2002).
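One common geometric estimator of a fractal dimension is the correlation dimension, obtained from the slope of log C(r) against log r, where C(r) is the fraction of point pairs closer than r. The sketch below illustrates that idea only; whether it matches the exact procedure used in the study is an assumption, and the percentile range used to choose the radii is a hypothetical tuning choice.

```python
import numpy as np

def correlation_dimension(X, q_low=1.0, q_high=10.0, n_radii=10):
    """Slope of log C(r) versus log r over a range of small radii."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    pair_d = dist[np.triu_indices(n, k=1)]  # each pair counted once
    # Radii taken between two low percentiles of the pairwise distances
    r_lo, r_hi = np.percentile(pair_d, [q_low, q_high])
    radii = np.geomspace(r_lo, r_hi, n_radii)
    C = np.array([(pair_d <= r).mean() for r in radii])
    slope, _ = np.polyfit(np.log(radii), np.log(C), 1)
    return float(slope)

# A flat 2D sheet rotated inside a 5D space should give an estimate near 2
rng = np.random.default_rng(0)
sheet = np.hstack([rng.random((500, 2)), np.zeros((500, 3))])
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))
id_estimate = correlation_dimension(sheet @ Q)
```

On finite samples the estimate is typically slightly below the true dimension because of boundary effects, so it is usually rounded to an integer before being used as p.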
The DR methods can be classified according to three characteristics:
Linearity: This describes the type of transformation applied to the data matrix,
mapping it from $\mathbb{R}^m$ to $\mathbb{R}^p$.
Scale analysis (local or global): This reflects the kind of properties the transformation
preserves. In most nonlinear methods, there is a trade-off between the preservation of
local topological relationships between data points and that of the global structure of X.
Metric (Euclidean or geodesic): This defines the distance function used to estimate
whether two data points are close to each other in $\mathbb{R}^m$, and should consequently
remain close in $\mathbb{R}^p$ after the DR transformation. It is important to note that we
kept the metrics generally used for these methods in the literature, but there are also
other metrics in Euclidean space such as the Minkowski or Chebyshev distance (Deza and
Deza 2006).
Based on these criteria, we retained 13 methods: 4 linear and 9 nonlinear, see Table 1. To
complete this review of DR methods, we compared them with one classical feature selection
method to determine which approaches are the most relevant.
Linear methods
Principal components analysis (PCA)
Principal components analysis is the best-known DR method. It finds a linear transformation
that retains the subspace with the largest variance. It can be shown that the reconstruction
error, $J_{\mathrm{PCA}}$, is minimized for the leading eigenvectors, $u_i$, of the covariance matrix of X. It is
interesting to note that PCA is close to the classical multidimensional scaling (MDS)
introduced by Shepard (1962) and Kruskal (1964) when Euclidean distance is used, as
described in Fodor (2002). This relation between MDS and PCA is important because MDS
underpins other nonlinear DR methods such as ISOMAP (Tenenbaum et al. 2000). Principal
components analysis is a linear, global and Euclidean technique.
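As an illustration, PCA can be sketched through the eigendecomposition of the sample covariance matrix. The function name `pca` and its return values are hypothetical choices made for this example.

```python
import numpy as np

def pca(X, p):
    """Project the rows of X onto the p leading eigenvectors of the covariance."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                 # center the data
    cov = np.cov(Xc, rowvar=False)          # m x m sample covariance
    eigval, eigvec = np.linalg.eigh(cov)    # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:p]    # keep the p largest
    U = eigvec[:, order]
    return Xc @ U, U, eigval[order]

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])
Y_demo, U_demo, var_demo = pca(X_demo, 2)
```

The variance of each projected coordinate equals the corresponding eigenvalue, which is why keeping the largest eigenvalues retains the subspace with the largest variance.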
Second-order blind identification (SOBI)
Second-order blind identification (Belouchrani et al. 1997) relies on stationary second-order
statistics that are based on a joint diagonalization of a set of covariance matrices. The set X is
considered as a mixed set of independent signals $X_i(t)$ (t corresponds to time), and the p
features of the destination space we are searching for are assimilated to a fixed number of original
sources $S_i(t)$, corresponding here to the intrinsic dimensionality. Each $X_i(t)$ is assumed to be a
linear mixture of n unknown components (sources) $S_i(t)$, through the unknown ‘mixing’ matrix
$A$:
$$X(t) = A\,S(t). \tag{3}$$
Second-order blind identification is also a linear, global and Euclidean method.
Projection pursuit (PP)
This method of Friedman and Tukey (1974), linked to the independent component analysis
method (ICA) (Comon 1994), is based on the optimization of a cost function, which finds its
optimum by a gradient descent method. For our experiment, we used the fast-ICA algorithm
(Hyvärinen 1999) that allows new components to be estimated one by one by deflation. The
symmetric decorrelation of the vectors at each iteration was replaced by a Gram-Schmidt
orthogonalization procedure. When p components, $w_1, \ldots, w_p$, have been estimated, the
algorithm determines $w_{p+1}$. After each iteration, the projections $(w_{p+1}^T w_j)\,w_j$ ($j = 1, \ldots, p$) onto the p
previously estimated vectors are subtracted from $w_{p+1}$. Then, $w_{p+1}$ is normalized, according
to Eq. 4:
$$w_{p+1} = w_{p+1} - \sum_{j=1}^{p} (w_{p+1}^T w_j)\, w_j, \qquad w_{p+1} = \frac{w_{p+1}}{\sqrt{w_{p+1}^T w_{p+1}}}. \tag{4}$$
It is important to note that there is a fundamental difference between PP and ICA. For the ICA
approach, the problem is solved globally and all components are evaluated at the same time.
Conversely, in the PP method each component is evaluated independently in an iterative
procedure. For each iteration, the algorithm finds a new component and readjusts data in such
a way as to hide the data structure in this new component. Finally, the difference between
these two approaches is based on the iterative ‘deflation’ approach of PP and the ‘global’
resolution of ICA. The algorithm stops when p components according to ID number have
been estimated. The projection pursuit method is also linear, global and Euclidean.
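The deflation step of Eq. 4 (Gram-Schmidt orthogonalization followed by normalization) can be sketched as below. Only this step is shown; the fast-ICA update that proposes each new candidate vector is replaced here by a random vector, which is an assumption made purely for illustration, and the function name `deflate` is hypothetical.

```python
import numpy as np

def deflate(w_new, W_prev):
    """Remove from w_new its projections on the rows of W_prev, then normalize.

    W_prev is a (p, m) array holding the p previously estimated unit vectors.
    """
    w = np.asarray(w_new, dtype=float)
    # Subtract sum_j (w^T w_j) w_j, as in the first part of Eq. 4
    w = w - W_prev.T @ (W_prev @ w)
    # Normalize to unit length, as in the second part of Eq. 4
    return w / np.sqrt(w @ w)

rng = np.random.default_rng(0)
W_demo = np.empty((0, 4))
for _ in range(3):
    candidate = rng.normal(size=4)       # placeholder for a fast-ICA update
    W_demo = np.vstack([W_demo, deflate(candidate, W_demo)])
```

After three iterations the rows of `W_demo` form an orthonormal set, which is the property the deflation scheme maintains as components are added one by one.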
Nonlinear methods: Global approaches
Sammon's mapping (Sammon)
Sammon's mapping (Sammon 1969) is a DR method that tries to preserve the neighbourhood
topology of data by preserving distances between points. To evaluate how well the topology is
preserved, we use the following stress function, minimized by gradient descent:
$$J_{\mathrm{sam}} = \frac{1}{\sum_{i \neq j} d^m_{i,j}} \sum_{i \neq j} \frac{(d^m_{i,j} - d^p_{i,j})^2}{d^m_{i,j}}, \tag{5}$$
where the sums run over $i, j = 1, \ldots, n$, and $d^m_{i,j}$ and $d^p_{i,j}$ are the distances between the ith and jth points in $\mathbb{R}^m$ and
$\mathbb{R}^p$, respectively. This function
allows the distances in the projection space to be conserved compared to the initial space.
Sammon’s mapping is a nonlinear, global and Euclidean method.
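The stress of Eq. 5 can be evaluated directly for a given pair of configurations. This sketch only computes the cost (the gradient descent that minimizes it is omitted) and assumes that no two input points coincide, so that no input distance is zero. Function names are hypothetical.

```python
import numpy as np

def pairwise_distances(Z):
    """Euclidean distances between all rows of Z."""
    Z = np.asarray(Z, dtype=float)
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def sammon_stress(X, Y):
    """Sammon stress between the input points X and their low-dimensional images Y."""
    iu = np.triu_indices(len(X), k=1)   # each pair once; the ratio is unchanged
    d_m = pairwise_distances(X)[iu]     # distances in the input space
    d_p = pairwise_distances(Y)[iu]     # distances in the projection space
    return float(((d_m - d_p) ** 2 / d_m).sum() / d_m.sum())

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 3))
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
stress_rotation = sammon_stress(X_demo, X_demo @ Q)       # distances preserved
stress_truncated = sammon_stress(X_demo, X_demo[:, :2])   # one coordinate dropped
```

Dividing each squared difference by the input distance is what makes small distances weigh more than large ones in the total.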
Isometric feature mapping (Isomap)
Isomap (Tenenbaum et al. 2000) estimates the geodesic distance between data points in a
manifold using the shortest path in the nearest neighbours’ graph. It then searches for a low-
dimensional representation that approximates those geodesic distances in the least squares
sense. The steps are:
(1) Form $D_m(X)$, the all-pairs distance matrix.
(2) Create a graph from X (k nearest neighbours). For a given point $X_i$ in $\mathbb{R}^m$, a
neighbour is either one of the k nearest data points from $X_i$ or one for which $d^m_{ij} < \varepsilon$.
(3) Form the all-pairs geodesic distance matrix $\Delta_m(X)$, using Dijkstra’s shortest path
algorithm.
(4) Use classical MDS to find the transformation from $\mathbb{R}^m$ to $\mathbb{R}^p$ that minimizes
$$J_{\mathrm{ISOMAP}}(X, p) = \sum_{i,j}^{n} (\delta^m_{ij} - \delta^p_{ij})^2. \tag{6}$$
Isomap is nonlinear, global and geodesic.
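The procedure described above can be sketched compactly. Two simplifications are assumed: the graph uses the k-nearest-neighbour rule only, and the Floyd-Warshall algorithm is swapped in for Dijkstra's algorithm because it is shorter to write with arrays (both return the same shortest-path distances). The graph must be connected for the result to be finite.

```python
import numpy as np

def isomap(X, p, k):
    """Minimal Isomap sketch: kNN graph, shortest paths, classical MDS."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    # All-pairs Euclidean distances
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    # Symmetric k-nearest-neighbour graph (inf means "no edge")
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    nn = np.argsort(D, axis=1)[:, 1:k + 1]
    rows = np.repeat(np.arange(n), k)
    G[rows, nn.ravel()] = D[rows, nn.ravel()]
    G = np.minimum(G, G.T)
    # Geodesic distances by Floyd-Warshall (stand-in for Dijkstra)
    for mid in range(n):
        G = np.minimum(G, G[:, [mid]] + G[[mid], :])
    # Classical MDS on the geodesic distances
    H = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * H @ (G ** 2) @ H
    eigval, eigvec = np.linalg.eigh(B)
    top = np.argsort(eigval)[::-1][:p]
    return eigvec[:, top] * np.sqrt(np.maximum(eigval[top], 0.0))

# Points on a half circle: a curved 1D manifold in the plane
t = np.linspace(0.0, np.pi, 50)
arc = np.column_stack([np.cos(t), np.sin(t)])
unrolled = isomap(arc, p=1, k=2)
```

On this toy curve the single output coordinate follows the arc length, which is the "unfolding" behaviour expected from a geodesic method.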
Kernel methods (K-PCA, K-Isomap, KDA)
Recently, several well-known algorithms for the reduction of dimensionality of manifolds
have been reformulated within the kernel machine approach (Ham et al. 2004; Shawe-Taylor
and Cristianini 2004). We retain here the three best known: kernel-PCA (K-PCA)
(Schölkopf et al. 1998), kernel isomap (K-Isomap) (Choi and Choi 2007) and kernel
discriminant analysis (KDA) (Liang et al. 2006). Non-linearity is introduced by mapping the
data from the input space $\mathbb{R}^m$ to a feature space. The projection methods (PCA, isomap
or discriminant analysis) are then applied to this new feature space, expressed by a kernel, K,
in terms of a Mercer kernel function (Schölkopf et al. 1999). For our experiment, we used the
following classical Gaussian kernel, as for the SVM classification method:
$$K(x, y) = e^{-\frac{\|x - y\|^2}{\sigma}}. \tag{7}$$
All kernel methods are nonlinear and global, but K-PCA and KDA use the Euclidean metric
and K-isomap uses the geodesic one.
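To make the kernel approach concrete, the sketch below builds a Gaussian kernel matrix and applies kernel PCA to it. It assumes the kernel has the form exp(-||x - y||² / sigma); the exact normalization of the width parameter in Eq. 7 is an assumption, and the value of `sigma` in the demo is arbitrary. KDA and K-Isomap are not shown.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma):
    """Gaussian kernel matrix, assuming the form exp(-||x - y||^2 / sigma)."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / sigma)

def kernel_pca(X, p, sigma):
    """Coordinates of the training points on the p leading kernel components."""
    n = len(X)
    K = gaussian_kernel(X, X, sigma)
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H                          # centering in the feature space
    eigval, eigvec = np.linalg.eigh(Kc)
    top = np.argsort(eigval)[::-1][:p]
    # Projection of sample i on component k is sqrt(lambda_k) * v_k[i]
    return eigvec[:, top] * np.sqrt(np.maximum(eigval[top], 0.0))

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 4))
K_demo = gaussian_kernel(X_demo, X_demo, sigma=2.0)
Y_demo = kernel_pca(X_demo, p=2, sigma=2.0)
```

The data are never mapped explicitly to the feature space: every quantity is computed from the n × n kernel matrix, which is the point of the kernel formulation.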
Nonlinear methods: Local approaches
Local linear embedding (LLE)
The LLE algorithm (Roweis and Saul 2000) estimates the local coordinates of each data point
on the basis of its nearest neighbours, then searches for a low-dimensional coordinate system.
The 3 steps are:
(1) Find the neighbourhood graph (see steps 1 and 2 of isomap).
(2) Compute the weights $W_{ij}$ that best reconstruct $x_i$ from its neighbours, i.e. that
minimize the reconstruction error $\|x_i - \hat{x}_i\|$, where $\hat{x}_i = \sum_j W_{ij}\, x_j \approx x_i$.
(3) Compute the vectors $y_i$ in $\mathbb{R}^p$ reconstructed by the weights $W_{ij}$. Solve for all
$y_i$ simultaneously:
$$y_i \approx \sum_j W_{ij}\, y_j. \tag{8}$$
This algorithm finds the local affine structure of the data manifold and the best projection of
data points in $\mathbb{R}^p$. The LLE is nonlinear, local and Euclidean.
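The three steps can be sketched as follows. Two details are assumptions borrowed from the standard formulation of Roweis and Saul (2000) rather than from the text above: the weights of each point are constrained to sum to one, and a small regularization term `reg` stabilizes the local linear system. Function names are hypothetical.

```python
import numpy as np

def lle_weights(X, k, reg=1e-3):
    """Weights that best reconstruct each point from its k nearest neighbours."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    nn = np.argsort(sq, axis=1)[:, 1:k + 1]   # skip the point itself
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nn[i]] - X[i]                   # neighbours centered on x_i
        C = Z @ Z.T                           # local k x k Gram matrix
        C += reg * np.trace(C) * np.eye(k) / k
        w = np.linalg.solve(C, np.ones(k))
        W[i, nn[i]] = w / w.sum()             # enforce sum of weights = 1
    return W

def lle_embedding(W, p):
    """Coordinates y_i that are best reconstructed by the same weights."""
    n = len(W)
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    eigval, eigvec = np.linalg.eigh(M)
    return eigvec[:, 1:p + 1]                 # drop the constant eigenvector

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 3))
W_demo = lle_weights(X_demo, k=5)
Y_demo = lle_embedding(W_demo, p=2)
```

Solving for all $y_i$ simultaneously amounts to taking the eigenvectors of $(I - W)^T (I - W)$ with the smallest nonzero eigenvalues.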
Laplacian eigenmaps (LE)
The Laplacian eigenmaps method finds a low-dimensional data representation by preserving
local properties of the manifold (Belkin and Niyogi 2003). The three steps of the algorithm
are:
(1) Create the non-oriented symmetric neighbourhood graph.
(2) Associate a positive weight $W_{ij}$ to each link of the graph (constant weights, $W_{ij} = 1/k$,
or exponentially decreasing, $W_{ij} = \exp(-\|x_i - x_j\|^2 / \sigma^2)$).
(3) Obtain the final coordinates $y_i$ of the points in $\mathbb{R}^p$ by minimizing the cost function
$$J_{\mathrm{LE}} = \sum_{ij} W_{ij}\, \frac{\|y_i - y_j\|^2}{\sqrt{D_{ii} D_{jj}}}, \tag{9}$$
where D is the diagonal matrix $D_{ii} = \sum_j W_{ij}$. The LE is a nonlinear, local, Euclidean method.
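A minimal sketch of Laplacian eigenmaps is given below. It follows the standard formulation of Belkin and Niyogi (2003), in which the coordinates are obtained from the generalized eigenproblem L y = λ D y with L = D - W, solved here through the symmetric normalized Laplacian. It is not claimed to be the exact minimizer of the normalized cost written above; that correspondence is an assumption. Exponentially decreasing weights are used, and the graph must be connected.

```python
import numpy as np

def laplacian_eigenmaps(X, p, k, sigma):
    """Embedding from the bottom eigenvectors of the normalized graph Laplacian."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    # Non-oriented symmetric k-nearest-neighbour graph
    nn = np.argsort(sq, axis=1)[:, 1:k + 1]
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), k), nn.ravel()] = True
    A = A | A.T
    # Exponentially decreasing weights on the links
    W = np.where(A, np.exp(-sq / sigma ** 2), 0.0)
    d = W.sum(axis=1)                                  # D_ii = sum_j W_ij
    L_sym = np.eye(n) - W / np.sqrt(np.outer(d, d))
    eigval, eigvec = np.linalg.eigh(L_sym)
    # Skip the trivial eigenvector and map back: y = D^{-1/2} v
    return eigvec[:, 1:p + 1] / np.sqrt(d)[:, None]

t = np.linspace(0.0, np.pi, 50)
arc = np.column_stack([np.cos(t), np.sin(t)])
coords = laplacian_eigenmaps(arc, p=1, k=2, sigma=0.5)
```

On this chain-like toy graph the first nontrivial coordinate orders the points along the curve, reflecting the preservation of local neighbourhood relations.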
Curvilinear components analysis (CCA) and curvilinear distances analysis (CDA)
The CCA is an evolution of nonlinear multidimensional scaling (MDS) and Sammon’s
mapping algorithms (Demartines and Hérault 1997). Instead of optimizing a
reconstruction error, CCA aims to preserve the distance matrix while projecting data onto a
lower dimensional manifold. Let $D_m(X)$ be the n × n matrix of distances between pairs of
points in X:
$$D_m(X) = (d^m_{ij}), \quad \text{where } d^m_{ij} = \|x_i - x_j\|.$$
After DR transformation to $\mathbb{R}^p$, we also have:
$$D_p(X) = (d^p_{ij}), \quad \text{where } d^p_{ij} = \|y_i - y_j\|.$$
The CCA aims to find the most suitable transformation by minimizing
$$J_{\mathrm{CCA}}(p, X) = \sum_{i,j=1}^{n} (d^m_{ij} - d^p_{ij})^2\, F(d^p_{ij}), \tag{10}$$
where F is a decreasing, positive weighting function that gives more importance to the
preservation of small distances. The CCA is nonlinear, local and Euclidean.
The CDA is a refinement of CCA (Lee, Lendasse et al. 2004) that minimizes
$$J_{\mathrm{CDA}}(p, X) = \sum_{i,j=1}^{n} (\delta^m_{ij} - d^p_{ij})^2\, F(d^p_{ij}), \tag{11}$$
where $\delta^m_{ij}$ measures the geodesic distance between $x_i$ and $x_j$, as in Isomap. The CDA is
nonlinear, local and geodesic.
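The two costs differ only in the input distance matrix, which the following sketch makes explicit. The weighting function is assumed to be a decreasing exponential, F(d) = exp(-d / lam); this specific choice and the parameter name `lam` are hypothetical, since the text only requires F to be positive and decreasing. The optimization itself is not shown.

```python
import numpy as np

def pairwise_distances(Z):
    """Euclidean distances between all rows of Z."""
    Z = np.asarray(Z, dtype=float)
    diff = Z[:, None, :] - Z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def curvilinear_cost(D_input, Y, lam):
    """Cost of Eq. 10 (Euclidean D_input) or Eq. 11 (geodesic D_input)."""
    d_p = pairwise_distances(Y)          # distances in the output space
    F = np.exp(-d_p / lam)               # assumed weighting: favors small d_p
    return float(((D_input - d_p) ** 2 * F).sum())

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 3))
D_euclid = pairwise_distances(X_demo)
cost_identity = curvilinear_cost(D_euclid, X_demo, lam=1.0)
cost_truncated = curvilinear_cost(D_euclid, X_demo[:, :2], lam=1.0)
```

Passing a matrix of graph shortest-path distances as `D_input` instead of `D_euclid` turns the same function into the CDA cost.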
Table 1 The 13 dimensionality reduction methods
| | Global | Local |
|---|---|---|
| Linear | Principal component analysis; Linear discriminant analysis; Second-order blind identification; Projection pursuit | |
| Nonlinear | Sammon mapping; Isomap; Kernel isomap; Kernel-PCA; Kernel discriminant analysis | Local linear embedding; Laplacian eigenmaps; Curvilinear component analysis; Curvilinear distance analysis |

Metric: Euclidean, except Isomap, Kernel isomap and Curvilinear distance analysis (geodesic)
Although some of the methods are neither completely global nor local, to simplify their
description we have classified them in the way usually encountered in the literature.
Feature selection method
Feature selection with an exhaustive search is impractical because of the large number of
possible feature subsets. To select the 5 best features (the number given by the intrinsic dimensionality
estimate), we used the sequential forward selection (SFS) method (Kittler 1978), which
performs better when the optimal subset has a small number of features. The criterion
function for selection was the average correct rate of classification over all classes, obtained
by quadratic discriminant analysis (QDA) on all observations. The QDA approach was chosen
because it does not depend on features other than the observations and its aim is a measure of
efficiency of the feature subset and not the optimal rate of classification. At the end of the
process, the 5 best features were selected.
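Sequential forward selection with a QDA-based criterion can be sketched as follows. The QDA scorer is a bare-bones version written for this example (Gaussian class models with a small covariance regularization `reg`, evaluated on all observations); it is an assumption about how the criterion could be computed, not the authors' code.

```python
import numpy as np

def qda_score(X, y, reg=1e-6):
    """Average per-class correct rate of a Gaussian (QDA) classifier on all data."""
    classes = np.unique(y)
    scores = np.empty((len(X), len(classes)))
    for c_idx, c in enumerate(classes):
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        cov = np.atleast_2d(np.cov(Xc, rowvar=False)) + reg * np.eye(X.shape[1])
        diff = X - mu
        maha = np.einsum('ij,ij->i', diff @ np.linalg.inv(cov), diff)
        _, logdet = np.linalg.slogdet(cov)
        scores[:, c_idx] = -0.5 * (maha + logdet) + np.log(len(Xc) / len(X))
    pred = classes[scores.argmax(axis=1)]
    return float(np.mean([np.mean(pred[y == c] == c) for c in classes]))

def sfs(X, y, n_select, criterion=qda_score):
    """Greedily add the feature that most improves the criterion."""
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < n_select:
        best = max(remaining, key=lambda f: criterion(X[:, selected + [f]], y))
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy data: only features 2 and 4 carry class information
rng = np.random.default_rng(0)
y_demo = np.repeat([0, 1], 300)
X_demo = rng.normal(size=(600, 6))
X_demo[y_demo == 1, 2] += 2.0
X_demo[y_demo == 1, 4] += 2.0
chosen = sfs(X_demo, y_demo, n_select=2)
```

With m candidate features and 5 features to select, this greedy search evaluates on the order of 5m subsets instead of all possible combinations.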
Texture Image databases
To test our texture classification protocol, the experiments included images from two
different sources:
The main grey level texture images (Fig. 6) used in this study were provided for
our agronomic application. These images were acquired with a scanning electron microscope (SEM) and
represent different kinds of leaf surfaces from six plant species. For each class of leaf texture,
150 to 200 images were acquired. Each image is at a scale of 100 µm, with a resolution of
512 × 512 pixels; this scale was adapted to our biological application. There were 1242 texture
images in six classes.
Fig. 6 The six classes of leaf texture images: (a) tomato, (b) rye grass (Lolium perenne), (c)
mature wheat, (d) pea, (e) young wheat and (f) horsetail
The well-known Brodatz texture dataset (Brodatz 1966), cited in more than 500
relevant papers over the past 20 years (Fig. 7).
Fig. 7 Samples of the 32 Brodatz textures used in the experiments
The Brodatz dataset comprises 32 different textures. The original grey level images have a
resolution of 256 × 256 pixels, but here they were cropped to 16 disjoint 64 × 64 samples. To
evaluate scale and rotation invariance, three additional samples were generated per original
sample (90° rotation, 64 × 64 scaling, and the combination of rotation and scaling). Finally,
the set contained 2048 images, with 64 samples per texture.
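As a quick check of the dataset size, assuming that each 256 × 256 image yields a 4 × 4 grid of crops and that each crop appears in four versions (original, rotated, scaled, rotated and scaled):
$$\left(\frac{256}{64}\right)^2 = 16 \text{ crops per texture}, \qquad 16 \times 4 = 64 \text{ samples per texture}, \qquad 32 \times 64 = 2048 \text{ images}.$$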
Results and discussion
From the two datasets described above, we obtained n=2048 vectors in m=32 dimensions
by GFD extraction from the Brodatz texture database and n=1034 vectors in m=254
dimensions from the plant leaf texture database. These datasets represent 32 and 6
classes of texture surfaces, respectively. According to the geometric approach proposed by Camastra and
Vinciarelli (2002), we estimated and fixed the intrinsic dimensionality of our two datasets as
being p=5.
The classification performance and average error rate for each classification method
were compared. The classification error rate corresponds to the percentage of
misclassification of the signals of test samples in the cross validation procedure. For the
SVM, we used the classic Gaussian kernel for which we determined the optimum. For the
Brodatz texture dataset (Table 2), the best results on classification error using the original
feature space (not reduced) were obtained using SVM (e=2.65%).
Table 2: Classification results on the Brodatz dataset (% error rate)
| Methods | Boosting: Hyperplane | Boosting: Hyperinterval | Boosting: Hyperrectangle | Hyperrectangle | SVM | MLP |
|---|---|---|---|---|---|---|
| Original features | 17.3 | 12.2 | 22.5 | 15.5 | 2.7 | 20.4 |
| Selection | 21.3 | 19.7 | 13.4 | 8.3 | 3.06 | 15.3 |
| Dimensionality reduction | 23.4 | 18.5 | 13.2 | 7.4 | 8.4 | 11.2 |
| | 46.6 | 27.1 | 25.7 | 24.8 | 10.5 | 16.4 |
| | 84.0 | 82.0 | 69.0 | 75.0 | 61.4 | 73.0 |
| | 23.8 | 21.2 | 12.9 | 15.6 | 7.8 | 13.3 |
| | 23.4 | 19.5 | 12.9 | 7.3 | 6.6 | 12.1 |
| | 22.5 | 23.4 | 15.3 | 8.0 | 4.5 | 15.9 |
| | 23.7 | 20.7 | 13.9 | 9.1 | 5.7 | 18.2 |
| | 22.6 | 20.0 | 15.3 | 7.4 | 3.9 | 15.6 |
| | 16.7 | 11.8 | 14.0 | 5.6 | 1.2 | 10.2 |
| | 23.6 | 19.1 | 14.2 | 7.2 | 6.7 | 11.8 |
| | 24.5 | 15.3 | 12.2 | 6.6 | 0.8 | 12.3 |
| | 21.3 | 17.7 | 15.1 | 6.13 | 1.9 | 9.6 |

The best combination is in bold and highlighted; all other combinations in bold
correspond to results that are better than the result obtained with the original
features.
All the other methods gave poorer results (from 12.2 to 22.5%). Their performance is
generally improved by DR: the optimum error is obtained by combining KDA and SVM
(e=0.8%, i.e. the error is divided by a factor of about 3 compared to the classification without DR
methods and with the original high-dimensional features). The combination of LE with SVM gives
similar results. One can note that the use of kernel methods in combination with DR generally
improves performance compared to the standalone DR approach (isomap vs. K-isomap, PCA
vs. K-PCA). In the group of fast decision methods, the best result is obtained using
Hyperrectangle combined with KDA. These results are generally confirmed by the
experiments with the plant leaf dataset (Table 3), although the dimension of the original space
is significantly higher than in the previous case (254 vs. 32) and the number of classes is
smaller (6 vs. 32). In this case, the gain factor is 3 (comparing SVM classification with original
high dimensional features, and the combination of SVM/KDA).
Table 3: Classification results on the plant leaf dataset (% error rate)
Columns: Boosting (Hyperplane, Hyperinterval, Hyperrectangle), Hyperrectangle, SVM, MLP
Original features 6.5 3.3 16.9 27.6 1.5 35.7
Selection 18.2 14.9 3.7 10.5 5.7 9.7
Dimensionality reduction
3.9 8.6 9.7 2.4 11.9
4.9 9.6 13.2 4.8 15.8
85.6 87.5 84.5 82.0 81.2
25.8 10.9 10.1 5.5 13.0
5.1 4.3 7.8 2.3 11.2
17.9 7.8 8.3 1.9 14.1
17.5 5.2 9.9 2.9 16.2
7.6 4.9 9.8 1.9 15.4
2.5 10.4 8.1 1.2 7.8
13.2 11.8 11.5 1.9 13.9
12.1 9.4 11.7 0.4 12.5
3.9 11.4 6.3 1.3 10.2
The best combination is in bold and highlighted; all other combinations in bold
correspond to results that are better than the result obtained with the original
features.
The combination of GFD and KDA appears to provide sufficient information to
characterize plant leaf roughness. In particular, this solution to texture classification enables
us to separate our six types of agronomic image into six different clusters as shown in Fig. 8.
This is further validated by the small difference detected between two wheat clusters that
differ from one another in terms of growth stage only. In fact, it is important to note that the
difference between young and old wheat is characterized by the loss of water that transforms
the structure of the leaf surface. This transformation results in slightly different features. This
result allows us to consider future applications in order to follow the evolution of wheat
maturity.
Fig. 8 3D projection of the first three components of kernel discriminant analysis for the plant leaf
dataset (each axis corresponds to a reduced coordinate of the original features by KDA)
The results are acceptable and the proposed method can be used as a robust tool for
roughness analysis. Nevertheless, the experiments on the agronomic dataset were done on
small samples, although the results obtained on the Brodatz dataset are pertinent. To
improve on these results, we will increase the number of leaf texture surfaces by including
other leaf species (vines and other crops) at different growth stages, in order to follow the
evolution of the crops.
From the image processing viewpoint, comparisons are currently being made with other
texture features such as spatio-frequential and statistical parameters (Gabor filters,
co-occurrence matrices and so on) and with combinations of colour-texture analysis that
provide high-dimensional data by the concatenation of the textural features and the spectral
information, as in Cointault et al. (2008).
Finally, the combination of KDA and SVM could be used to detect other agronomic
properties, such as hydrophobic or hydrophilic surfaces, or to recognize monocotyledons and
dicotyledons. Artificial leaf textures could be modeled to control the hydrophobicity of the
surface for laboratory experiments so that sprays could be developed or adapted accordingly
in future research. For this research, comparisons with spectral tools will be necessary.
For the same crop, we are now able to distinguish between different growth stages (e.g.
wheat, Fig. 8) in order to optimize the inputs, but also to detect diseases earlier. A disease
modifies the structure of the leaf and its texture, and these changes can be identified by this
type of analysis.
We are also currently developing research in precision viticulture, and the
results and methods presented in this paper will help us to model the evolution of vine leaf
roughness. This will be combined with optical approaches, such as particle tracking
velocimetry sizing (PTVS), to study the behaviour of the droplets on the leaves. This type of
research is of interest to sprayer manufacturers and to phytopharmaceutical firms.
Conclusion
A better understanding of the droplet adhesion mechanisms on leaves is an essential
step in evaluating the amount of phytosanitary product absorbed by the leaf and the amount
lost to the environment. Reducing the effect of sprays is a global objective in the context of
precision spraying, and the discrimination and modeling of leaf surface roughness is a
necessary stage towards it. This evaluation can be done with image or signal processing tools.
Our research has shown that the SVM classifier outperforms all other classification
methods using the original feature space. Moreover, we have demonstrated experimentally
that some DR methods can improve the final classification performance further. The best
combination is KDA with an SVM classifier, which gives a classification error of 0.4% on the
plant leaf dataset.
This study has proved that image processing provides an accurate way of determining
plant leaf roughness, one of the most important properties for understanding phytosanitary
product losses during spraying.
The scientific spin-off can be seen in at least two different but complementary ways: a
better qualitative and quantitative understanding of fluid spraying on natural surfaces and a
better understanding of the microstructure effect on the impact and adhesion phenomenon of
droplets on leaves.
References
Abe, S. (2005). Support Vector Machines for Pattern Classification. London: Springer-
Verlag.
Backes, A. R., & Bruno, O. M. (2009). Plant leaf identification using multi-scale fractal
dimensions. In P. Foggia, C. Sansone & M. Vento (Eds.), Image Analysis and Processing -
ICIAP (pp. 143-150). Berlin/Heidelberg: Springer.
Belkin, M., & Niyogi, P. (2003). Laplacian eigenmaps for dimensionality reduction and data
representation. Neural Computation, 15, 1373-1396.
Belouchrani, A., Abed-Meraim, K., Cardoso, J. F., & Moulines, E. (1997). A blind source
separation technique using second order statistics. IEEE Transactions on Signal Processing,
45, 434-444.
Bishop, C. M. (1995). Neural Networks for Pattern Recognition. Oxford: Oxford University
Press.
Brodatz, P. (1966). Textures: A Photographic Album for Artists and Designers. New York:
Dover Publications.
Camastra, F., & Vinciarelli, A. (2002). Estimating the intrinsic dimension of data with a
fractal-based method. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24,
1404-1407.
Choi, H., & Choi, S. (2007). Robust kernel Isomap. Pattern Recognition, 40, 853-862.
Cointault, F., Guérin, D., Guillemin, J. P., & Chopinet, B. (2008). In-field wheat ears
counting using color-texture image analysis. New Zealand Journal of Crop and Horticultural
Science, 36, 117-130.
Comon, P. (1994). Independent component analysis, a new concept? Signal Processing, 36,
287-314.
Demartines, P., & Hérault, J. (1997). Curvilinear component analysis: A self-organizing
neural network for nonlinear mapping of data sets. IEEE Transactions on Neural Networks,
8, 148-154.
Deza, E. & Deza, M. (2006). Dictionary of Distances. Amsterdam: Elsevier.
Duda, R. O., Hart, P. E., & Stork, D. G. (2001). Pattern Classification (2nd Edition). New
York: Wiley Interscience Publication.
Fodor, I. K. (2002). A Survey of Dimension Reduction Techniques, Lawrence Livermore
National Laboratory technical report.
Forster, W. A., Zabkiewicz, J. A., & Kimberley, M. O. (2005). A universal spray droplet
adhesion model. Transactions of the ASAE, 48, 1321-1330.
Friedman, J. H., & Tukey, J. W. (1974). A projection pursuit algorithm for exploratory data
analysis. IEEE Transactions on Computers, C-23, 881-890.
Gauthier, J.-P., Bornard, G., & Silbermann, M. (1991). Harmonic analysis on motion groups
and their homogeneous spaces. IEEE Transactions on Systems, Man and Cybernetics, 21,
159-172.
Ham, J., Lee, D. D., Mika, S., & Schölkopf, B. (2004). A kernel view of the dimensionality
reduction of manifolds. In Carla E. Brodley (Ed.). Twenty First International Conference on
Machine Learning (pp. 369–376). Banff, Canada: ACM International Conference Proceeding
Series.
Hijazi, B., Cointault, F., Yang, F., & Paindavoine, M. (2008). High-speed motion estimation of
fertilizer granules with Gabor filters. In K. Harald & G. Martha Patricia Butron (Eds.),
Proceedings of the 28th SPIE International Congress on High-Speed Imaging and Photonics,
vol. 7126. Canberra, Australia.
Hughes, G. F. (1968). On the mean accuracy of statistical pattern recognizers. IEEE
Transactions on Information Theory, 14, 55-63.
Hyvärinen, A. (1999). Fast and Robust Fixed-Point Algorithms for Independent Component
Analysis. IEEE Transactions on Neural Networks, 10, 626-634.
Jain, A. K., & Tuceryan, M. (1993). Texture analysis. In C. H. Chen & P. S. P. Wang (Eds.),
Handbook of pattern recognition and computer vision (pp. 235-276). Singapore: World
Scientific.
Journaux, L., Foucherot, I., & Gouton, P. (2006). Reduction of the number of spectral bands
in Landsat images: a comparison of linear and nonlinear methods. Optical Engineering, 45,
067002.
Kittler, J. (1978). Feature set search algorithms. In C. H. Chen (Ed.), Pattern Recognition and
Signal Processing (pp. 41-60). Alphen aan den Rijn, Netherlands: Sijthoff and Noordhoff.
Kruskal, J. B. (1964). Non-metric multidimensional scaling: a numerical method.
Psychometrika, 29, 115-129.
Lee, J. A., Lendasse, A., & Verleysen, M. (2004). Nonlinear projection with curvilinear
distances: Isomap versus curvilinear distance analysis. Neurocomputing, 57, 49-76.
Lee, J. A., & Verleysen, M. (2007). Nonlinear Dimensionality Reduction. London: Springer.
Liang, Z., Zhang, D., & Shi, P. (2006). Robust kernel discriminant analysis and its application
to feature extraction and recognition. Neurocomputing, 69, 928-933.
Miteran, J., Gorria, P., & Robert, M. (1994). Geometric classification by stress polytopes.
Performances and integrations. Traitement du signal, 11, 393-407.
Niskanen, M., & Silven, O. (2003). Comparison of dimensionality reduction methods for
wood surface inspection. In K. W. Tobin & F. Meriaudeau (Eds.), Proceedings of the 6th
International Conference on Quality Control by Artificial Vision (pp. 178-188). Gatlinburg,
Tennessee, USA: SPIE.
Robert, P. C. (1999). Precision agriculture: research needs and status in the USA. In J. V.
Stafford (Ed.). Precision Agriculture '99. London, UK: SCI.
Roweis, S. T., & Saul, L. K. (2000). Nonlinear dimensionality reduction by locally linear
embedding. Science, 290, 2323-2326.
Rumelhart, D. E., & McClelland, J. L. (1986). Parallel Distributed Processing. Cambridge,
Mass: MIT Press.
Sammon, J. W. (1969). A nonlinear mapping for data analysis. IEEE Transactions on
Computers, C-18, 401-409.
Schapire, R. E. (1990). The strength of weak learnability. Machine Learning, 5, 197-227.
Schölkopf, B., Burges, C. J. C., & Smola, A. J. (1999). Advances in Kernel Methods - Support
Vector Learning. Cambridge, MA: MIT Press.
Schölkopf, B., Smola, A. J., & Müller, K.-R. (1998). Nonlinear component analysis as a
kernel eigenvalue problem. Neural Computation, 10, 1299-1319.
Shawe-Taylor, J., & Cristianini, N. (2004). Kernel methods for pattern analysis. Cambridge:
Cambridge University Press.
Shepard, R. N. (1962). The analysis of proximities: Multidimensional scaling with an
unknown distance function. Part 1. Psychometrika, 27, 125–140.
Short, N. M. Remote sensing tutorial. Retrieved 04/14/2010, from
http://rst.gsfc.nasa.gov/Sect13/Sect13_9.html.
Smach, F., Lemaître, C., Gauthier, J.-P., Miteran, J., & Atri, M. (2007). Generalized Fourier
Descriptors with Applications to Objects Recognition in SVM Context. Journal of
Mathematical Imaging and Vision, 30, 43-71.
Tenenbaum, J. B., de Silva, V., & Langford, J. C. (2000). A Global Geometric Framework for
Nonlinear Dimensionality Reduction. Science, 290, 2319-2323.
Tzionas, P., Papadakis, S., & Manolakis, D. (2005). Plant Leaves Classification Based on
Morphological Features and a Fuzzy Surface Selection Technique. In D. Manolakis & A.
Gogoussis (Eds.), 5th International Conference on Technology and Automation (pp. 365-370).
Thessaloniki, Greece: IEEE Computer Society.
Vapnik, V. (1998). Statistical learning theory. New York: Wiley Interscience Publication.
Villette, S., Cointault, F., Piron, E., Chopinet, B., & Paindavoine, M. (2008). Simple imaging
system to measure velocity and improve the quality of fertilizer spreading in agriculture.
Journal of Electronic Imaging, 17, 1109-1119.
Witten, I. H., & Frank, E. (2005). Data Mining: Practical Machine Learning Tools and
Techniques (2nd ed.). Morgan Kaufmann Series in Data Management Systems. San Francisco:
Morgan Kaufmann.
Wu, S. G., Bao, F. S., Xu, E. Y., Wang, Y. X., Chang, Y.-F., & Xiang, Q.-L. (2007). A Leaf
Recognition Algorithm for Plant Classification Using Probabilistic Neural Network. IEEE
International Symposium on Signal Processing and Information Technology (pp. 11-16).
Cairo, Egypt. Giza: IEEE Computer Society.
Yun, Z., Yong, H., Kexin, X., Qingming, L., Da, X., Alexander, V. P., & Valery, V. T.
(2006). Crop/weed discrimination using near-infrared reflectance spectroscopy (NIRS),
Progress in biomedical optics and imaging, 7(2), 37.