Plant leaf roughness analysis by texture classification with

generalized Fourier descriptors in a dimensionality reduction

context

Journaux, L.1*, Simon, J.-C.1, Destain, M.F.2, Cointault, F.3, Miteran, J.4 and Piron, A.2

1AgroSupDijon, Engineering Sciences, 26 Bd Dr Petitjean BP 87999, 21079 Dijon Cedex, France

2FUSAGX, Unité de Mécanique et Construction, 2 passage des déportés, 5030 Gembloux, Belgium

3AgroSupDijon, Agroengineering Sciences, 26 Bd Dr Petitjean BP 87999, 21079 Dijon Cedex, France

4Université de Bourgogne, Le2i, Avenue Alain Savary BP 47870, 21078 Dijon Cedex, France

*[email protected]; phone: 0033/380.39.23.71; fax: 0033/380.39.27.51

Abstract In the context of plant leaf roughness analysis for precision spraying, this study

explores the capability and the performance of some combinations of pattern recognition and

computer vision techniques to extract the roughness feature. The techniques merge feature

extraction, linear and nonlinear dimensionality reduction techniques, and several kinds of

methods of classification. The performance of the methods is evaluated and compared in

terms of the error of classification. The results for the characterization of leaf roughness by

generalized Fourier descriptors for feature extraction, kernel-based methods such as support

vector machines (SVM) for classification and kernel discriminant analysis for dimensionality

reduction were encouraging. These results pave the way to a better understanding of the

adhesion mechanisms of droplets on leaves that will help to reduce and improve the

application of phytosanitary products and lead to possible modifications of sprayer

configurations.

Keywords Texture classification ∙ Precision spraying ∙ Motion descriptors ∙ Dimensionality

reduction ∙ Leaf roughness ∙ Kernel discriminant analysis

Introduction

Since the development of precision agriculture (Robert 1999), much research has been

done on the optimization of inputs in the field to reduce the environmental impact and to


increase the yield, which is of benefit to farmers. Two specific activities have received

particular attention: fertilizer application (mineral or organic spreading) (Hijazi et al. 2008;

Villette et al. 2008) and spraying for appropriate weed control (Yun and Yong 2006). In

research on precision spraying, in particular, one objective is to minimize the volume of

phytosanitary products applied to reduce environmental effects by using more effective plant

treatments. The main goal is ensure that the sprayed products reach their target, to reduce

losses that occur at the time of application. The mechanisms of losses by drift are now well

known, but those due to runoff from leaves are still poorly understood. These latter are related

to the adhesion mechanisms of liquids on a surface. Specific models have been developed

(Forster et al. 2005) that showed that the predominant factor is leaf roughness, for which

little robust research has been done. For example, with a hydrophobic surface the ‘lotus

effect’ can appear, as in Fig. 1.

Fig. 1 Representation of the ‘lotus effect’ (photograph by William Thielicke1)

For natural images, texture in the form of colour information is a fundamental

characteristic usually used in pattern and object recognition in different domains such as

medical and biological imaging, biometry, earth observation and industrial control by

computer vision (Fig. 2).

1 http://wthielicke.gmxhome.de/bionik/indexuk.htm


Fig. 2 Different domains in which texture analysis is applied (from left to right): medical,

biological, biometric, earth observation and industrial control

A successful texture classification or segmentation requires an efficient method for feature

extraction. The major difficulty, however, is that textures in the real world are often not

uniform due to changes in orientation, scale, illumination conditions, or other visual effects.

In our case we consider the invariant features (scale, illumination and rotation invariant)

called generalized Fourier descriptors (GFD) (Smach et al. 2007), which extract robust but

high-dimensional texture feature vectors.

Unfortunately, in a classification context, high-dimensional vectors are often redundant,

strongly correlated and suffer from the ‘Hughes phenomenon’ (Hughes 1968),

which results in inaccurate classification (Fig. 3).

Fig. 3 Illustration of the Hughes phenomenon (classification performance as a function of

data complexity, i.e. dimensionality)


To improve classification performance of the original features we combine the

classification steps with a selection of 13 linear and nonlinear dimensionality reduction (DR)

techniques (from classical principal component analysis (PCA) to more recent methods such

as Laplacian eigenmaps (LE) or kernel discriminant analysis (KDA)), which transform high-

dimensional data into a meaningful representation of reduced dimensionality.

Numerous studies have aimed to compare DR algorithms, usually using synthetic data

such as a swissroll (Lee et al. 2004; Lee and Verleysen 2007), but less so for natural data

such as hyperspectral images, as in Journaux et al. (2006) or Niskanen and Silven (2003)

(Fig. 4). However, it is important to note that the goal of DR algorithms is to explore

the intrinsic structure of high-dimensional data, for example by unfolding data in the case of

the swissroll or reducing high-dimensional natural signal data as for hyperspectral images.


Fig. 4 a Swissroll manifold frequently used as synthetic data (Lee and Verleysen 2007) and

b natural data from a hyperspectral image (Short)
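
For readers who wish to reproduce such synthetic benchmarks, the Swiss roll can be generated directly with scikit-learn; this is a sketch with arbitrary sample size and noise level:

```python
from sklearn.datasets import make_swiss_roll

# 2000 points sampled from a 2-D manifold rolled up in 3-D; t is the
# position along the roll, often used to colour the embedding
X, t = make_swiss_roll(n_samples=2000, noise=0.05)
```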

We propose to characterize the leaf roughness of different plant leaf images (1242

texture images) by taking a computer vision approach with a combination of spatio-frequency

texture feature extraction and six methods of classification. Previous research into plant leaf

identification or recognition generally used only one method of classification, as in Wu

et al. (2007), or focused on morphological features (Tzionas et al. 2005). However, some


studies such as that of Backes and Bruno (2009) explored texture analysis to classify plant

leaves.

Our aim is to be able to characterize the hydrophilic or hydrophobic behavior of a leaf

with signal and/or image processing so as to adapt the sprayer settings and application of

phytosanitary products to crops.

The advantages in the operational contexts of precision agriculture or precision

spraying relate to two main aspects. First, for agricultural equipment manufacturers,

knowledge of the optimum characteristics of the spraying throws needed to maximize the

proportion of product deposited on leaves is essential to optimize the equipment, reducing the

environmental impacts and costs tied to phytosanitary products. Second, phytopharmaceutical

firms need to understand the adhesion mechanisms of their products on leaves in order to

develop sprays, especially of the adjuvant type (surfactants, oils, humectants).

Finally, within the EcoPhyto 2018 French program this research could help to

optimize previous agronomic models and help land users, especially winegrowers, to reduce

their inputs. This work is currently under investigation with the BIVB2 organization.

This paper describes briefly the GFD used as texture feature extraction tools, the

methods of classification used and the DR methods that will be cross-compared. These

methods are combined with a feature selection approach to complete the comparison. Two

texture datasets (one synthetic and one natural) are compared, and the most efficient

combination is highlighted and discussed. Ideas for future work are included at the end of the

results.

2 Bureau Interprofessionnel des Vins de Bourgogne (Interprofessional Bureau of the Burgundy Wines)


Generalized Fourier descriptors (GFD) and methods of classification

Texture characterization using generalized Fourier descriptors (GFD)

The main goal of texture analysis is to formalize the texture feature by mathematical

parameters. Several methods for extracting texture features have already been proposed in the

literature and tested in practice. There are five main families of methods to extract textural

features (Jain and Tuceryan 1993): structural, statistical and spatio-frequency approaches, and

methods based on form recognition and fractals.

The GFD are defined as follows. Let f be a square summable function on the plane.

The Fourier transform is then

\hat{f}(\xi) = \int_{\mathbb{R}^2} f(x) \exp(-j \xi \cdot x) \, dx.   (1)

If (\lambda, \theta) are the polar coordinates of the point \xi, we denote \hat{f}(\lambda, \theta) the Fourier transform of

f at (\lambda, \theta). Gauthier et al. (1991) defined the mapping D_f from \mathbb{R}^+ into \mathbb{R} by

D_f(\lambda) = \int_0^{2\pi} |\hat{f}(\lambda, \theta)|^2 \, d\theta,   (2)

where D_f is the GFD feature vector, extracted as in Fig. 5, that describes each texture image.


Fig. 5 Procedure to find GFD texture vectors (feature vector from the image of centered 2D

Fourier transform of the texture image)

This GFD vector will be used as an input to the supervised classification method and will be

reduced by DR methods. The GFD features, calculated according to Eq. 2, have several

properties that are useful for object recognition: they are translation-, rotation- and reflection-

invariant.
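
As an illustration, a minimal NumPy sketch of the discrete computation of Eq. 2 is given below; the number of radial bins and the normalization by the zero-frequency ring are our assumptions rather than details taken from Smach et al. (2007):

```python
import numpy as np

def gfd(image, n_radii=64):
    """Discrete version of Eq. 2: sum the squared magnitude of the centred
    2D Fourier spectrum over thin rings, one ring per discrete radius."""
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    power = (np.abs(spectrum) ** 2).ravel()
    h, w = image.shape
    yy, xx = np.indices((h, w))
    r = np.hypot(yy - h // 2, xx - w // 2).ravel()   # radius of each frequency bin
    edges = np.linspace(0.0, r.max() + 1e-9, n_radii + 1)
    ring = np.clip(np.digitize(r, edges) - 1, 0, n_radii - 1)
    d = np.bincount(ring, weights=power, minlength=n_radii)
    return d[1:] / d[0]   # normalizing by the DC ring is our assumption
```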

Methods of classification

Classification is a central problem in pattern recognition (Duda et al. 2001) and many

approaches to solve it have been proposed such as the connectionist approach (Bishop 1995)

or metric-based methods such as k-nearest neighbours (k-nn), and kernel-based methods such as

support vector machines (SVM) (Vapnik 1998). In our experiments, the average performances

of the dimensionality reduction methods and of one basic feature selection method applied to

the GFD features have to be evaluated. In this context, we have chosen and evaluated six

efficient classification approaches from four families of classification: the boosting (adaboost)

family (Schapire 1990) using three weak classifiers (hyperplane, hyperinterval and

hyperrectangle), the hyperrectangle (polytope) method (Miteran et al. 1994), the SVM method

(Vapnik 1998; Abe 2005) and the neural network family with a multilayer perceptron (MLP)

(Rumelhart et al. 1986). To validate the classification performance and estimate the average


error for each method, we performed 20 iterative experiments with a 10-fold cross validation

procedure (Witten and Eibe 2005).
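
As a sketch of this validation protocol (assuming a feature matrix X and label vector y, and using scikit-learn; the RBF SVM shown here stands for any of the six classifiers):

```python
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def mean_error(X, y):
    """Average % error of an RBF SVM over 20 repetitions of 10-fold CV."""
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=20, random_state=0)
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    return 100 * (1 - cross_val_score(clf, X, y, cv=cv).mean())
```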

Dimensionality reduction methods

The GFD provide features that have great potential in pattern recognition, but they result

in very high-dimensional data that are difficult to handle and comprise redundant information.

Moreover, the computational cost of elaborate data processing tasks may be prohibitive.

Therefore, dimensionality reduction (DR) techniques are used to transform high-dimensional

data into a meaningful representation of reduced dimensionality to improve classification

performance.

Let X = (x_1, \ldots, x_n)^T be an n \times m data matrix, where n is the number of image examples in

each texture dataset and m is the dimension of vector x_i, corresponding to the discrete

computation of D_f from Eq. 2. Ideally, the reduced representation has a dimensionality that

corresponds to the intrinsic dimensionality of the data (Camastra and Vinciarelli 2002). One

of our working hypotheses is that, although the data (all texture images) are points in \mathbb{R}^m,

there is a p-dimensional manifold M = (y_1, \ldots, y_n)^T that can suitably approximate the space

spanned by the data points. The so-called intrinsic dimension (ID) of X in \mathbb{R}^m is the smallest

possible value of p (p < m) for which the approximation of X by M is reasonable. In other

words, the ID is defined as the number of variables that is sufficient to represent the signal. To

determine the ID of our data, we used a geometric approach that estimates the equivalent of

the fractal dimension (Camastra and Vinciarelli 2002).
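
A common geometric estimate of this kind is the correlation dimension; the following sketch (our illustration, not necessarily the exact estimator of Camastra and Vinciarelli) fits the slope of log C(r) against log r:

```python
import numpy as np
from scipy.spatial.distance import pdist

def correlation_dimension(X, radii):
    """Slope of log C(r) versus log r, where C(r) is the fraction of
    point pairs closer than r (Grassberger-Procaccia style)."""
    radii = np.asarray(radii, dtype=float)
    d = pdist(X)                                  # all pairwise distances
    C = np.array([np.mean(d < r) for r in radii])
    ok = C > 0                                    # keep radii with non-empty counts
    slope, _ = np.polyfit(np.log(radii[ok]), np.log(C[ok]), 1)
    return slope
```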

The DR methods can be classified according to three characteristics:

Linearity: This describes the type of transformation applied to the data matrix,

mapping it from \mathbb{R}^m to \mathbb{R}^p.


Scale analysis (local or global): This reflects the kind of properties the transformation

preserves. In most nonlinear methods, there is a trade-off between the preservation of

local topological relationships between data points and the preservation of the global structure of X.

Metric (Euclidean or geodesic): This defines the distance function used to estimate

whether two data points are close to each other in \mathbb{R}^m, and should consequently

remain close in \mathbb{R}^p after the DR transformation. It is important to note that we

retained the metrics generally used for these methods in the literature, but there are also

other metrics in Euclidean space such as Minkowski or Chebyshev distance (Deza and

Deza 2006).

Based on these criteria, we retained 13 methods: 4 linear and 9 nonlinear, see Table 1. To

complete this review of DR methods, we compared them with one classical feature selection

method to determine which approaches are the most relevant.

Linear methods

Principal components analysis (PCA)

Principal components analysis is the best-known DR method. It finds a linear transformation

that retains the subspace with the largest variance. It can be shown that the reconstruction

error, J_{PCA}, is minimized for the eigenvectors, u_i, of the covariance matrix of X. It is

interesting to note that PCA is close to the classical multidimensional scaling (MDS)

introduced by Shepard (1962) and Kruskal (1964), where Euclidean distance is used as

described in Fodor (2002). This relation between MDS and PCA is important because MDS

underpins other nonlinear DR methods such as ISOMAP (Tenenbaum et al. 2000). Principal

components analysis is a linear, global and Euclidean technique.
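
For reference, a minimal NumPy sketch of PCA by eigendecomposition of the covariance matrix (p is the target dimensionality):

```python
import numpy as np

def pca(X, p):
    """Project the centred data onto the p eigenvectors of the covariance
    matrix with the largest eigenvalues (largest retained variance)."""
    Xc = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    U = vecs[:, np.argsort(vals)[::-1][:p]]   # top-p principal axes
    return Xc @ U
```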


Second-order blind identification (SOBI).

Second-order blind identification (Belouchrani et al. 1997) relies on stationary second-order

statistics that are based on a joint diagonalization of a set of covariance matrices. The set X is

considered as a mixed set of independent signals X_i(t) (t corresponds to time), and the p

features of the destination space we are searching for are assimilated to a fixed number of

original sources S_i(t), corresponding here to the intrinsic dimensionality. Each X_i(t) is

assumed to be a linear mixture of n unknown components (sources) S_i(t), through the

unknown ‘mixing’ matrix A:

X(t) = A s(t).   (3)

Second-order blind identification is also a linear, global and Euclidean method.
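
Full SOBI jointly diagonalizes covariance matrices at several lags; the sketch below shows the simplified single-lag variant (often called AMUSE), assuming the rows of X are time samples:

```python
import numpy as np

def amuse(X, p=5, lag=1):
    """Single-lag simplification of SOBI (AMUSE): whiten the mixtures,
    then diagonalize one symmetrized time-lagged covariance matrix.
    Full SOBI jointly diagonalizes a whole set of lagged covariances."""
    Xc = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    W = vecs[:, -p:] / np.sqrt(vals[-p:])        # whitening, p largest directions
    Z = Xc @ W
    C = (Z[:-lag].T @ Z[lag:]) / (len(Z) - lag)  # lagged covariance of whitened data
    _, V = np.linalg.eigh((C + C.T) / 2)         # symmetrize, then diagonalize
    return Z @ V                                 # estimated source signals
```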

Projection pursuit (PP)

This method of Friedman and Tukey (1974), linked to the independent component analysis

method (ICA) (Comon 1994), is based on the optimization of a cost function, which finds its

optimum by a gradient descent method. For our experiment, we used the fast-ICA algorithm

(Hyvärinen 1999) that allows new components to be estimated one by one by deflation. The

symmetric decorrelation of the vectors at each iteration was replaced by a Gram-Schmidt

orthogonalization procedure. When p components w_1, \ldots, w_p have been estimated, the

algorithm determines w_{p+1}. After each iteration, the projections (w_{p+1}^T w_j) w_j (j = 1, \ldots, p) of the p

previously estimated vectors are subtracted from w_{p+1}. Then, w_{p+1} is standardized according

to Eq. 4:

w_{p+1} = w_{p+1} - \sum_{j=1}^{p} (w_{p+1}^T w_j) w_j, \qquad w_{p+1} = \frac{w_{p+1}}{\sqrt{w_{p+1}^T w_{p+1}}}.   (4)


It is important to note that there is a fundamental difference between PP and ICA. For the ICA

approach, the problem is solved globally and all components are evaluated at the same time.

Conversely, in the PP method each component is evaluated independently in an iterative

procedure. For each iteration, the algorithm finds a new component and readjusts data in such

a way as to hide the data structure in this new component. Finally, the difference between

these two approaches is based on the iterative ‘deflation’ approach of PP and the ‘global’

resolution of ICA. The algorithm stops when p components, according to the estimated ID, have

been estimated. The projection pursuit method is also linear, global and Euclidean.
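
In practice this deflation scheme is available in scikit-learn's FastICA (the whiten argument shown requires scikit-learn >= 1.1; parameter values are illustrative):

```python
from sklearn.decomposition import FastICA

def projection_pursuit(X, p=5):
    """Deflation-mode FastICA: components are extracted one at a time,
    each decorrelated (Gram-Schmidt style) from those already found,
    matching the scheme of Eq. 4."""
    ica = FastICA(n_components=p, algorithm="deflation",
                  whiten="unit-variance", random_state=0)
    return ica.fit_transform(X)
```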

Nonlinear methods: Global approaches

Sammon's mapping (Sammon)

Sammon's mapping (Sammon 1969) is a DR method that tries to preserve the neighbourhood

topology of data by preserving distances between points. To evaluate the preservation

of topology, we use the following stress function, minimized by gradient descent:

J_{sam} = \frac{1}{\sum_{i<j} d^m_{ij}} \sum_{i<j} \frac{(d^m_{ij} - d^p_{ij})^2}{d^m_{ij}},   (5)

where d^m_{ij} and d^p_{ij} are the distances between the ith and jth points in \mathbb{R}^m and \mathbb{R}^p,

respectively. This function

allows the distances in the projection space to be conserved compared to the initial space.

Sammon’s mapping is a nonlinear, global and Euclidean method.
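
A direct sketch of Eq. 5 (X holds the original points, Y a candidate projection; a full Sammon mapping would minimize this value by gradient descent):

```python
import numpy as np
from scipy.spatial.distance import pdist

def sammon_stress(X, Y):
    """Stress of Eq. 5: relative mismatch between pairwise distances in
    the original space (X) and in the projection (Y)."""
    dm, dp = pdist(X), pdist(Y)
    ok = dm > 0                       # ignore coincident original points
    return np.sum((dm[ok] - dp[ok]) ** 2 / dm[ok]) / dm[ok].sum()
```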

Isometric feature mapping (Isomap)

Isomap (Tenenbaum et al. 2000) estimates the geodesic distance between data points in a

manifold using the shortest path in the nearest neighbours’ graph. It then searches for a low-

dimensional representation that approximates those geodesic distances in the least squares

sense. The steps are:

1. Form D^m(X), the all-pairs distance matrix, and create a graph from X (k nearest

neighbours). For a given point x_i in \mathbb{R}^m, a neighbour is either one of the k nearest

data points from x_i or one for which d^m_{ij} < \varepsilon.

2. Form the all-pairs geodesic distance matrix \Delta^m(X), using Dijkstra’s shortest path

algorithm.

3. Use classical MDS to find the transformation from \mathbb{R}^m to \mathbb{R}^p that minimizes

J_{ISOMAP}(X, p) = \sum_{i,j} (\delta^m_{ij} - \delta^p_{ij})^2.   (6)

Isomap is nonlinear, global and geodesic.
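
A minimal usage sketch with scikit-learn (k = 10 neighbours is an arbitrary choice):

```python
from sklearn.manifold import Isomap

def isomap_embed(X, p=5, k=10):
    """k-NN graph, shortest-path geodesic distances, then MDS (Eq. 6)."""
    return Isomap(n_neighbors=k, n_components=p).fit_transform(X)
```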

Kernel methods (K-PCA, K-Isomap, KDA)

Recently, several well-known algorithms for the reduction of dimensionality of manifolds

have been developed to take the kernel machine approach (Ham et al. 2004; Shawe-Taylor

and Cristianini 2004). We retain here the three that are known best: kernel-PCA (K-PCA)

(Schölkopf et al. 1998), kernel isomap (K-Isomap) (Choi and Choi 2007) and kernel

discriminant analysis (KDA) (Liang et al. 2006). Non-linearity is introduced by mapping the

data from the input space \mathbb{R}^m to a feature space. The projection methods (PCA, isomap

or discriminant analysis) are then applied to this new feature space, expressed by a kernel, K,

in terms of a Mercer kernel function (Schölkopf et al. 1999). For our experiment, we used the

following classical Gaussian kernel, as for the SVM classification method:

K(x, y) = e^{-\|x - y\|^2}.   (7)

All kernel methods are nonlinear and global, but K-PCA and KDA use the Euclidean metric

and K-isomap uses the geodesic one.
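
Kernel PCA is available directly in scikit-learn (sklearn.decomposition.KernelPCA), but KDA is not; below is a minimal sketch of the usual kernelized Fisher-discriminant formulation, with gamma and the regularization term reg as assumed hyperparameters:

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.metrics.pairwise import rbf_kernel

def kda(X, y, p, gamma=1.0, reg=1e-3):
    """Kernel (Fisher) discriminant analysis sketch: between- and
    within-class scatter expressed through the Gram matrix K."""
    y = np.asarray(y)
    K = rbf_kernel(X, X, gamma=gamma)           # Gaussian kernel of Eq. 7
    n = K.shape[0]
    M = np.zeros((n, n))                        # between-class scatter
    N = np.zeros((n, n))                        # within-class scatter
    m_all = K.mean(axis=1, keepdims=True)
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)
        Kc = K[:, idx]
        m_c = Kc.mean(axis=1, keepdims=True)
        M += len(idx) * (m_c - m_all) @ (m_c - m_all).T
        N += Kc @ (np.eye(len(idx)) - 1.0 / len(idx)) @ Kc.T
    vals, A = eigh(M, N + reg * np.eye(n))      # generalized eigenproblem
    A = A[:, np.argsort(vals)[::-1][:p]]        # top-p discriminant directions
    return K @ A                                # reduced coordinates of X
```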


Nonlinear methods: Local approaches

Local linear embedding (LLE)

The LLE algorithm (Roweis and Saul 2000) estimates the local coordinates of each data point

on the basis of its nearest neighbours, then searches for a low-dimensional coordinate system.

The 3 steps are:

(1) find the neighbourhood graph (see step 1 of Isomap).

(2) compute the weights W_{ij} that best reconstruct x_i from its neighbours, minimizing

the reconstruction error \|x_i - \hat{x}_i\|, where \hat{x}_i = \sum_j W_{ij} x_j,

(3) compute the vectors y_i in \mathbb{R}^p reconstructed by the weights W_{ij}, solving for all

y_i simultaneously:

y_i = \sum_j W_{ij} y_j.   (8)

This algorithm finds the local affine structure of the data manifold and the best projection of

data points in \mathbb{R}^p. The LLE is nonlinear, local and Euclidean.
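
A usage sketch with scikit-learn (neighbourhood size again arbitrary):

```python
from sklearn.manifold import LocallyLinearEmbedding

def lle_embed(X, p=5, k=10):
    """Local reconstruction weights from the k-NN graph, then a global
    p-dimensional embedding preserving those weights (Eq. 8)."""
    return LocallyLinearEmbedding(n_neighbors=k, n_components=p).fit_transform(X)
```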

Laplacian eigenmaps (LE)

The Laplacian eigenmaps method finds a low-dimensional data representation by preserving

local properties of the manifold (Belkin and Niyogi 2003). The three steps of the algorithm

are:

1. Create the non-oriented symmetric neighbourhood graph.

2. Associate a positive weight W_{ij} to each link of the graph (constant weights

(W_{ij} = 1/k), or exponentially decreasing (W_{ij} = \exp(-\|x_i - x_j\|^2 / \zeta^2))).

3. Obtain the final coordinates y_i of the points in \mathbb{R}^p by minimizing the cost function:

J_{LE} = \sum_{ij} W_{ij} \|y_i - y_j\|^2 / (D_{ii} D_{jj}),   (9)

where D is the diagonal matrix D_{ii} = \sum_j W_{ij}. The LE is a nonlinear, local, Euclidean method.
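
Laplacian eigenmaps are exposed in scikit-learn as SpectralEmbedding; the RBF affinity corresponds to the exponentially decreasing weights above:

```python
from sklearn.manifold import SpectralEmbedding

def le_embed(X, p=5):
    """Laplacian eigenmaps via the graph Laplacian of an RBF affinity."""
    return SpectralEmbedding(n_components=p, affinity="rbf").fit_transform(X)
```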

Curvilinear components analysis (CCA) and curvilinear distances analysis (CDA)

The CCA is an evolution of nonlinear multidimensional scaling (MDS) and Sammon’s

mapping algorithms (Demartines and Hérault 1997). Instead of the optimization of a

reconstruction error, CCA aims to preserve the distance matrix while projecting data onto a

lower dimensional manifold. Let D_m(X) be the n \times n matrix of distances between pairs of

points in X: D_m(X) = (d^m_{ij}), where d^m_{ij} = \|x_i - x_j\|. After DR transformation to \mathbb{R}^p,

we also have D_p(X) = (d^p_{ij}), where d^p_{ij} = \|y_i - y_j\|. The CCA aims to find the most

suitable transformation by minimizing

J_{CCA}(p, X) = \sum_{i,j=1}^{n} (d^m_{ij} - d^p_{ij})^2 F(d^p_{ij}),   (10)

where F is a decreasing, positive weighting function that gives more importance to the

preservation of small distances. The CCA is nonlinear, local and Euclidean.

The CDA is a refinement of CCA (Lee et al. 2004), minimizing:

J_{CDA}(p, X) = \sum_{i,j=1}^{n} (\delta^m_{ij} - d^p_{ij})^2 F(d^p_{ij}),   (11)

where \delta^m_{ij} measures the geodesic distance between x_i and x_j, as in Isomap. The CDA is

nonlinear, local and geodesic.
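
As a sketch of the criterion of Eq. 10 (the choice F(d) = exp(-d/lam) is ours; CCA itself minimizes this cost by stochastic gradient descent):

```python
import numpy as np
from scipy.spatial.distance import pdist

def cca_cost(X, Y, lam=1.0):
    """Cost of Eq. 10 with F(d) = exp(-d / lam), a decreasing positive
    weight that favours the preservation of small projected distances."""
    dm, dp = pdist(X), pdist(Y)
    return np.sum((dm - dp) ** 2 * np.exp(-dp / lam))
```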

Table 1 The 13 dimensionality reduction methods

               Global                                  Local

Linear         Principal component analysis
               Linear discriminant analysis
               Second-order blind identification
               Projection pursuit

Nonlinear      Sammon mapping                          Local linear embedding
               Isomap                                  Laplacian eigenmaps
               Kernel isomap                           Curvilinear component analysis
               Kernel-PCA                              Curvilinear distance analysis
               Kernel discriminant analysis

Metric: Euclidean or geodesic

Although some of the methods are neither completely global nor local, to simplify their

description we have classified them in the way usually encountered in the literature.

Feature selection method

Feature selection with an exhaustive search is impractical because of the large number of

possible feature subsets. To select the 5 best features (the subset size given by the intrinsic

dimensionality estimate), we used the sequential forward selection method (SFS) (Kittler 1978), which

performs better when the optimal subset has a small number of features. The criterion

function for selection was the average correct rate of classification over all classes, obtained

by quadratic discriminant analysis (QDA) on all observations. The QDA approach was chosen

because it depends on nothing other than the observations, and its aim is to measure the

efficiency of the feature subset, not to reach the optimal rate of classification. At the end of the

process, the 5 best features were selected.
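
A sketch with scikit-learn's SequentialFeatureSelector; note that it scores candidate subsets by internal cross-validation rather than on all observations as described above:

```python
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.feature_selection import SequentialFeatureSelector

def select_5(X, y):
    """Greedy forward selection of 5 features, scored with QDA."""
    sfs = SequentialFeatureSelector(QuadraticDiscriminantAnalysis(),
                                    n_features_to_select=5,
                                    direction="forward")
    return sfs.fit_transform(X, y)
```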

Texture Image databases

To test our texture classification protocol, the experiments included images from two

different sources:

The main texture grey-level images (Fig. 6) used in this study were provided for

our agronomic application. These images were acquired with a scanning electron microscope

(SEM) and represent different kinds of leaf surfaces from six plant species. For each class of

leaf texture, 150-200 images were acquired. Each image is at a scale of 100 µm, with a resolution of

512 × 512 pixels; this scale was adapted to our biological application. There were 1242 texture

images in six classes.


Fig. 6 The six classes of leaf texture images: (a) tomato, (b) rye grass (Lolium perenne), (c)

mature wheat, (d) pea, (e) young wheat and (f) horsetail

The well-known Brodatz texture dataset (Brodatz 1966) has been cited in more than 500

relevant papers over the past 20 years (Fig. 7).

Fig. 7 Samples of the 32 Brodatz textures used in the experiments

The Brodatz dataset comprises 32 different textures. The original grey-level images have a

resolution of 256 × 256 pixels, but here they were cropped into 16 disjoint 64 × 64 samples. To

evaluate scale and rotation invariance, three additional samples were generated per original

sample (90° rotation, rescaling to 64 × 64, and a combination of rotation and scaling). Finally,

the set contained 2048 images, with 64 samples per texture.
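
A sketch of how such variants might be generated (the exact rescaling protocol is not specified above, so the halve-then-restore step is an assumption):

```python
import numpy as np
from scipy.ndimage import zoom

def variants(sample):
    """Three extra views of a 64 x 64 sample: rotated, rescaled, and both."""
    rot = np.rot90(sample)
    scaled = zoom(zoom(sample, 0.5), 2.0)   # down- then up-sample to 64 x 64
    return rot, scaled, np.rot90(scaled)
```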

Results and discussion

From the two datasets described above, we obtained n=2048 vectors in m=32 dimensions

by GFD extraction from the Brodatz texture database and n=1034 vectors in m=254


dimensions from the plant leaf texture database. These datasets represent 32 and 6

classes of texture surfaces, respectively. According to the geometric approach proposed by Camastra and

Vinciarelli (2002), we estimated and fixed the intrinsic dimensionality of our two datasets as

being p=5.

The classification performance and average error rate for each classification method

were compared. The classification error rate corresponds to the percentage of

misclassified test samples in the cross-validation procedure. For the

SVM, we used the classic Gaussian kernel, for which we determined the optimum parameters. For the

Brodatz texture dataset (Table 2), the best results on classification error using the original

feature space (not reduced) were obtained using SVM (e=2.65%).
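
The optimum kernel parameters can be found by a grid search, sketched below with illustrative grid values (not those used in the study):

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

def tuned_svm_error(X, y):
    """Cross-validated grid search over the Gaussian-kernel width and C."""
    grid = GridSearchCV(SVC(kernel="rbf"),
                        {"gamma": [1e-3, 1e-2, 1e-1, 1.0, 10.0],
                         "C": [1.0, 10.0, 100.0]},
                        cv=10)
    grid.fit(X, y)
    return 100 * (1 - grid.best_score_)   # % error with the best kernel
```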

Table 2: Classification results on the Brodatz dataset (% error rate)

                          Boosting                                        Hyperrectangle   SVM    MLP
                          Hyperplane   Hyperinterval   Hyperrectangle
Original features         17.3         12.2            22.5              15.5             2.7    20.4
Selection                 21.3         19.7            13.4              8.3              3.06   15.3
Dimensionality reduction  23.4         18.5            13.2              7.4              8.4    11.2
                          46.6         27.1            25.7              24.8             10.5   16.4
                          84.0         82.0            69.0              75.0             61.4   73.0
                          23.8         21.2            12.9              15.6             7.8    13.3
                          23.4         19.5            12.9              7.3              6.6    12.1
                          22.5         23.4            15.3              8.0              4.5    15.9
                          23.7         20.7            13.9              9.1              5.7    18.2
                          22.6         20.0            15.3              7.4              3.9    15.6
                          16.7         11.8            14.0              5.6              1.2    10.2
                          23.6         19.1            14.2              7.2              6.7    11.8
                          24.5         15.3            12.2              6.6              0.8    12.3
                          21.3         17.7            15.1              6.13             1.9    9.6

The best combination is shown in bold and highlighted; all other combinations in bold correspond to results better than those obtained with the original features.

All the other methods gave poorer results (from 12.2 to 22.5%). Their performance is

generally improved by DR: the optimum error is obtained by combining KDA and SVM

(e=0.8%, i.e. the error is divided by a factor of 3 compared to the classification without DR


methods and original high-dimensional features). The combination of LE with SVM gives

similar results. One can note that the use of kernel methods in combination with DR generally

improves performance compared to the standalone DR approach (isomap vs. K-isomap, PCA

vs. K-PCA). In the group of fast methods of decision, the best result is obtained using

Hyperrectangle combined with KDA. These results are generally confirmed by the

experiments with the plant leaf dataset (Table 3), although the dimension of the original space

is significantly higher than in the previous case (254 vs. 32) and the number of classes is

smaller (6 vs. 32). In this case, the gain factor is 3 (comparing SVM classification with original

high-dimensional features and the combination of SVM/KDA).

Table 3: Classification results on the plant leaf dataset (% error rate)

                          Boosting                                        Hyperrectangle   SVM    MLP
                          Hyperplane   Hyperinterval   Hyperrectangle
Original features         6.5          3.3             16.9              27.6             1.5    35.7
Selection                 18.2         14.9            3.7               10.5             5.7    9.7
Dimensionality reduction  -            3.9             8.6               9.7              2.4    11.9
                          -            4.9             9.6               13.2             4.8    15.8
                          -            85.6            87.5              84.5             82.0   81.2
                          -            25.8            10.9              10.1             5.5    13.0
                          -            5.1             4.3               7.8              2.3    11.2
                          -            17.9            7.8               8.3              1.9    14.1
                          -            17.5            5.2               9.9              2.9    16.2
                          -            7.6             4.9               9.8              1.9    15.4
                          -            2.5             10.4              8.1              1.2    7.8
                          -            13.2            11.8              11.5             1.9    13.9
                          -            12.1            9.4               11.7             0.4    12.5
                          -            3.9             11.4              6.3              1.3    10.2

The best combination is shown in bold and highlighted; all other combinations in bold correspond to results better than those obtained with the original features.

The combination of GFD and KDA appears to provide sufficient information to

characterize plant leaf roughness. In particular, this solution to texture classification enables

us to separate our six types of agronomic image into six different clusters as shown in Fig. 8.


This is further validated by the small difference detected between two wheat clusters that

differ from one another in terms of growth stage only. In fact, it is important to note that the

difference between young and old wheat is characterized by the loss of water that transforms

the structure of the leaf surface. This transformation results in slightly different features. This

result allows us to consider future applications in order to follow the evolution of wheat

maturity.

Fig. 8 3D projection onto the first three components of kernel discriminant analysis for the plant leaf

dataset (each axis corresponds to the reduced coordinates of the original features by KDA)

The results are acceptable and the proposed method can be used as a robust tool for

roughness analysis. Nevertheless, the experiments were done on small samples for the

agronomic dataset, although the results obtained on the Brodatz dataset are pertinent. To

improve on these results, we will increase the number of leaf texture surfaces with different

leaf species (vines and other crops) at different growth stages to follow the evolution of the

crops.

From the image processing viewpoint, comparisons are currently being made with other

texture features such as spatio-frequency and statistical parameters (Gabor filters, co-

occurrence matrices and so on) and combinations of colour-texture analysis that provide high-


dimensional data by the concatenation of the textural features and the spectral information as

in Cointault et al. (2008).

Finally, the combination of KDA and SVM could be used for the detection of other

agronomic properties such as hydrophobic or hydrophilic surfaces, or monocotyledon/

dicotyledon recognition. Artificial leaf textures could be modeled to control the

hydrophobicity of the surface for laboratory experiments so that sprays could be developed or

adapted accordingly in future research. For this research, comparisons with spectral tools will

be necessary.

For the same crop we are now able to distinguish between different growth stages (e.g.

wheat, Fig. 8) in order to optimize the inputs, but also to detect diseases earlier. A disease will

modify the structure of the leaf and its texture and these can be identified by our types of

analysis.

Finally, we are currently conducting research in precision viticulture, and the

results and methods presented in this paper will help us to model the evolution of vine leaf

roughness. This will be combined with the use of optical approaches, such as particle tracking

velocimetry sizing (PTVS) to study the behaviour of the droplets on the leaves. This type of

research is of interest to sprayer manufacturers and also to phytopharmaceutical firms.

Conclusion

A better understanding of the droplet adhesion mechanisms on leaves is an essential

step to evaluate the amount of phytosanitary product absorbed by the leaf and the amount of

product lost in the environment. Reducing the effect of sprays is a global objective in the

context of precision spraying. Discrimination and modeling of leaf surface roughness is a

necessary stage of this. Its evaluation can be done by image or signal processing tools.

Our research has shown that the SVM classifier outperforms all other classification

methods using the original feature space. Moreover, we have demonstrated experimentally


that some DR methods can improve the final classification performance further. The best DR

method is KDA combined with SVM classifiers with an error of classification for the plant

leaf dataset of 0.4%.

This study has proved that image processing provides an accurate way of determining

plant leaf roughness, one of the most important properties for understanding phytosanitary

product losses during spraying.

The scientific spin-off can be seen in at least two different but complementary ways: a

better qualitative and quantitative understanding of fluid spraying on natural surfaces and a

better understanding of the microstructure effect on the impact and adhesion phenomenon of

droplets on leaves.

References

Abe, S. (2005). Support Vector Machines for Pattern Classification. London: Springer-

Verlag.

Backes, A. R., & Bruno, O. M. (2009). Plant leaf identification using multi-scale fractal

dimensions. In P. Foggin, C. Sansome & M. Vento (Eds.), Image Analysis and Processing -

ICIAP (pp. 143 - 150). Berlin/ Heidelberg: Springer.

Belkin, M., & Niyogi, P. (2003). Laplacian eigenmaps for dimensionality reduction and data

representation. Neural Computation, 15, 1373-1396.

Belouchrani, A., Abed-Meraim, K., Cardoso, J. F., & Moulines, E. (1997). A blind source

separation technique using second order statistics. IEEE Transactions on Signal Processing,

45, 434-444.

Bishop, C. M. (1995). Neural Networks for Pattern Recognition. Oxford: Oxford University

Press.

Brodatz, P. (1966). Textures: A Photographic Album for Artists and Designers. New York:

Dover Publications

Camastra, F., & Vinciarelli, A. (2002). Estimating the intrinsic dimension of data with a

fractal-based method. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24,

1404-1407.

Choi, H., & Choi, S. (2007). Robust kernel Isomap. Pattern Recognition, 40, 853-862.


Cointault, F., Guérin, D., Guillemin, J. P., & Chopinet, B. (2008). In-field wheat ears

counting using color-texture image analysis. New Zealand Journal of Crop and Horticultural

Science, 36, 117-130.

Comon, P. (1994). Independent component analysis, a new concept? Signal Processing, 36,

287-314.

Demartines, P., & Hérault, J. (1997). Curvilinear component analysis: A self-organizing

neural network for nonlinear mapping of data sets. IEEE Transactions on neural networks,

8,148-154.

Deza, E. & Deza, M. (2006). Dictionary of Distances. Amsterdam: Elsevier.

Duda, R. O., Hart, P. E., & Stork, D. G. (2001). Pattern Classification (2nd Edition). New

York: Wiley Interscience Publication.

Fodor, I. K. (2002). A Survey of Dimension Reduction Techniques, Lawrence Livermore

National Laboratory technical report.

Forster, W. A., Zabkiewicz, J. A., & Kimberley, M. O. (2005). A universal spray droplet

adhesion model. Transactions of the ASAE, 48, 1321-1330.

Friedman, J. H., & Tukey, J. W. (1974). A projection pursuit algorithm for exploratory data

analysis. IEEE Transactions on computers, C23, 881-890.

Gauthier, J.-P., Bornard, G., & Silbermann, M. (1991). Harmonic analysis on motion groups

and their homogeneous spaces. IEEE Transactions on Systems, Man and Cybernetics, 21,

159-172

Ham, J., Lee, D. D., Mika, S., & Schölkopf, B. (2004). A kernel view of the dimensionality

reduction of manifolds. In Carla E. Brodley (Ed.). Twenty First International Conference on

Machine Learning (pp. 369–376). Banff, Canada: ACM International Conference Proceeding

Series.

Hijazi, B., Cointault, F., Yang, F., & Paindavoine, M. (2008). High-speed motion estimation

of fertilizer granules with Gabor filters. In K. Harald & G. Martha Patricia Butron (Eds.),

Proceedings of the 28th SPIE International Congress on High-Speed Imaging and Photonics

(vol. 7126). Canberra, Australia

Hughes, G. F. (1968). On the mean accuracy of statistical pattern recognizers. IEEE

Transactions on Information Theory, 14, 55-63.

Hyvärinen, A. (1999). Fast and Robust Fixed-Point Algorithms for Independent Component

Analysis. IEEE Transactions on Neural Networks, 10, 626-634.

Jain, A. K., & Tuceryan, M. (1993). Texture analysis. In C. H. Chen & P. S. P. Wang (Eds.),

Handbook of pattern recognition and computer vision (pp.235 - 276). Singapore: World

Scientific


Journaux, L., Foucherot, I., & Gouton, P. (2006). Reduction of the number of spectral bands

in Landsat images: a comparison of linear and nonlinear methods. Optical Engineering, 45,

067002

Kittler, J. (1978). Feature set search algorithms. In C. H.Chen (Ed.), Pattern Recognition and

Signal Processing (pp. 41-60). Alphen aan den Rijn, Netherlands: Sijthoff and Noordhoff.

Kruskal, J. B. (1964). Non-metric multidimensional scaling: a numerical method.

Psychometrika, 29, 115-129.

Lee, J. A., Lendasse, A., & Verleysen, M. (2004). Nonlinear projection with curvilinear

distances: Isomap versus curvilinear distance analysis. Neurocomputing, 57, 49-76.

Lee, J. A., & Verleysen, M. (2007). Nonlinear Dimensionality Reduction. London: Springer.

Liang, Z., Zhang, D., & Shi, P. (2006). Robust kernel discriminant analysis and its application

to feature extraction and recognition. Neurocomputing, 69, 928-933.

Miteran, J., Gorria, P., & Robert, M. (1994). Geometric classification by stress polytopes.

Performances and integrations. Traitement du signal, 11, 393-407.

Niskanen, M., & Silven, O. (2003). Comparison of dimensionality reduction methods for

wood surface inspection. In K. W. Tobin & F. Meriaudeau (Eds.), Proceeding of the 6th

International Conference on Quality Control by Artificial Vision, (pp. 178-188), Gatlinburg,

Tennessee, USA, SPIE

Robert, P. C. (1999). Precision agriculture: research needs and status in the USA. In J. V.

Stafford (Ed.). Precision Agriculture '99. London, UK: SCI.

Roweis, S. T., & Saul, L. K. (2000). Nonlinear dimensionality reduction by locally linear

embedding. Science, 290, 2323-2326.

Rumelhart, D. E., & McClelland, J. L. (1986). Parallel Distributed Processing. Cambridge,

Mass: MIT Press.

Sammon, J. W. (1969). A nonlinear mapping for data analysis. IEEE Transactions on

Computers, C18, 401-409.

Schapire, R. E. (1990). The strength of weak learnability. Machine Learning, 5, 197-227.

Schölkopf, B., Burges, J. C. C., & Smola, A. J. (1999). Advances in Kernel Methods - Support

Vector Learning. Cambridge, MA: MIT Press.

Schölkopf, B., Smola, A. J., & Müller, K.-R. (1998). Nonlinear component analysis as a

kernel eigenvalue problem. Neural Computation, 10, 1299-1319.

Shawe-Taylor, J., & Cristianini, N. (2004). Kernel methods for pattern analysis. Cambridge:

Cambridge University Press.

Shepard, R. N. (1962). The analysis of proximities: Multidimensional scaling with an

unknown distance function. Part 1. Psychometrika, 27, 125–140.


Short, N. M. Remote sensing tutorial. Retrieved 04/14/2010, from

http://rst.gsfc.nasa.gov/Sect13/Sect13_9.html.

Smach, F., Lemaître, C., Gauthier, J.-P., Miteran, J., & Atri, M. (2007). Generalized Fourier

Descriptors with Applications to Objects Recognition in SVM Context. Journal of

Mathematical Imaging and Vision, 30, 43-71.

Tenenbaum, J. B., de Silva, V., & Langford, J. C. (2000). A Global Geometric Framework for

Nonlinear Dimensionality Reduction. Science, 290, 2319-2323.

Tzionas, P., Papadakis, S., & Manolakis, D. (2005). Plant Leaves Classification Based on

Morphological Features and a Fuzzy Surface Selection Technique. In D. Manolakis and A.

Gogoussis (Eds.) 5th International Conference on Technology and Automation, (pp. 365-

370), Thessaloniki, Greece: IEEE Computer society.

Vapnik, V. (1998). Statistical learning theory. New York: Wiley Interscience Publication.

Villette, S., Cointault, F., Piron, E., Chopinet, B., & Paindavoine, M. (2008). Simple imaging

system to measure velocity and improve the quality of fertilizer spreading in agriculture.

Journal of Electronic imaging, 17, 1109-1119.

Witten, I. H., & Eibe, F. (2005). Data Mining: Practical Machine Learning Tools and

Techniques, second edition. Morgan Kaufmann Series in Data Management Systems. Morgan

Kaufmann. San Francisco: Elsevier

Wu, S. G., Bao, F. S., Xu, E. Y., Wang, Y. X., Chang, Y.-F., & Xiang, Q.-L. (2007). A Leaf

Recognition Algorithm for Plant Classification Using Probabilistic Neural Network. IEEE

International Symposium on Signal Processing and Information Technology (pp. 11-16).

Cairo, Egypt. Giza: IEEE Computer society.

Yun, Z., Yong, H., Kexin, X., Qingming, L., Da, X., Alexander, V. P., & Valery, V. T.

(2006). Crop/weed discrimination using near-infrared reflectance spectroscopy (NIRS).

Progress in Biomedical Optics and Imaging, 7(2), 37.