DETECTION & PREDICTION OF PESTS/DISEASES USING DEEP LEARNING

1. INTRODUCTION

Deep learning can accurately detect the presence of pests and diseases in farms. On top of this, the machine learning algorithm CART (classification and regression tree) can predict the chance of future disease and pest attacks. Manual monitoring cannot accurately estimate the extent and intensity of a pest or disease attack, so the farmer cannot tell how much fertilizer or pesticide is enough to eliminate it. An artificial perceptron can instead estimate these values and recommend the amount of pesticide or fertilizer to be sprayed at specified target areas. The aim of the project is to help farmers protect their farms from pest and disease attacks and eliminate them without disturbing the soil or the unaffected parts of other plants. In India, most farmers rely on manual monitoring and on a few apps that have severe database limitations and are restricted to detection. Since prevention is better than cure, our project aims to predict future pest and disease attacks so that farmers can prevent them.

Technology is playing a crucial role in developing farms and agro-based industries. Today, technology makes it possible to grow crops even in deserts, and it has penetrated deep into the agriculture sector. Automation is currently the most in-demand technology in agriculture. Many companies have introduced solutions based on Machine Learning and Artificial Intelligence, transforming agriculture into Digital Agriculture. Many tests have shown that deploying technology in farms increases crop yield and, in turn, the farmer’s revenue. This paper discusses and tests the implementation of Deep Learning technology in agriculture.

Diagnosis is always a concern for farmers in India. At the same time, for fear of pest and disease attacks, farmers spray pesticides and fertilizers uniformly over the whole farm, which may damage the soil as well as the plants. The aim of this project is to enable the farmer to spray a limited but sufficient amount of pesticide or fertilizer only at the target areas where a pest or disease is present or where an attack is likely in future. This helps farmers both to prevent attacks on their farms and to eliminate existing ones, spraying limited amounts without polluting the soil or other parts of the plants. The major advantages are an increase in the farmer’s annual revenue and a reduction in crop loss caused by pest and disease attacks.

2. LITERATURE SURVEY

In India, Agri-Tech is changing drastically, yet most farmers do not use the latest technology on their farms. IoT-based agriculture appears often in journals, but little of it has been properly adopted in Indian farms. There is a huge gap between technology and farmers in India. Many start-ups have emerged to bridge this gap, and even many MNCs are now investing in Agri-Tech in India. Food demand is increasing rapidly due to the rise in population. The era of tractors and heavy machinery is giving way to smart technology such as the Internet of Things, Artificial Intelligence and Machine Learning. In American farms, heavy machinery is being replaced by smart sensors. Farmers are using technology such as temperature and moisture sensors, drones, smart irrigation, terrain contour mapping, and self-driving, GPS-enabled tractors and rovers to produce food more sustainably. According to “The Economist”, farmers are being “teched up” when it comes to growing crops more sustainably and profitably. Pest and disease attacks on crops are frequently reported, and they steadily reduce the food supply. By 2050, the world’s population is expected to grow to 9.7 billion, so a clear rise in food demand is foreseeable.

Modern Agriculture faces tremendous challenges. Today, the agricultural sector has

grown into a highly competitive and globalized industry, where farmers and other actors have

to consider local climatic and geographic aspects as well as global ecological and political

factors in order to guarantee economic survival and sustainable production. Feeding a

growing world population asks for continuous increases in food production, but arable land

remains a limited resource. New requests for bio energy or changing diet preferences put

additional strains on agricultural production, while settlement and transport consume

increasing shares of land. Expected and observable changes in global climate, shifting rainfall

patterns, global warming, droughts, or the increasing frequency and duration of extreme

weather events endanger traditional production areas and bring new risks and uncertainties

for global harvest yields. To cope with these challenges, Agriculture requires a continuous

and sustainable increase in productivity and efficiency on all levels of agricultural

production, while resources like water, energy, fertilizers etc. need to be used carefully and

efficiently in order to protect and sustain the environment and the soil quality of the arable

land. The complexity of the challenge is increased by other short-term events which are

difficult to predict, such as epidemics, financial crisis, or price volatility for agricultural raw

materials and products.

From the AI point of view, Agriculture offers a vast application area for all kinds of

AI core technologies: Mobile, autonomous agents operating in uncontrolled environments,

stand-alone or in collaborative settings, allow to investigate, test and exploit technologies

from robotics, computer vision, sensing, and environment interaction. Integrating multiple

partners and their heterogeneous information sources leads to application of semantic

technologies. The complexity of the agricultural production asks for progress in modelling

capabilities, handling of uncertainty, and in the algorithmic and usability aspects of location-

and context-specific decision support. The growing interest in reliable predictions as a basis

for planning and control of agricultural activities requires the interdisciplinary cooperation

with domain experts e.g. from agricultural research. Modern agricultural machines shall use

self-configuring components and shall be able to collaborate and exhibit aspects of self-

organization and swarm intelligence.

Automation is the technology that Indian start-ups focus on most. Automated drones and bots are deployed in farms to monitor and tend the crop. The practice is also steadily shifting from uniform spraying to targeted spraying of pesticides and fertilizers. Artificial Intelligence, Machine Learning and Deep Learning algorithms are adopted to monitor the crops precisely, detect the affected areas in the farm, and spray a corrective solution only in those areas.

Several start-ups in India have launched automation products for the agricultural sector, mostly drones and digital apps designed to improve crop yield. Drones are deployed to spray the entire field. The latitude and longitude boundaries of the farm are marked on a map, and RADAR is used to maintain a constant height between the drone and the crop to avoid collisions. This technology sprays pesticides and fertilizers over the entire farm in wide swaths within a few minutes, saving the farmer’s time and protecting the crop.

Several digital apps are designed to help farmers identify the diseases that have attacked their farms. Some even estimate the NPK (Nitrogen, Phosphorus and Potassium) values of the plant to monitor its health.

Many MNCs are investing heavily in agricultural technology. Artificial Intelligence, Machine Learning, Deep Learning and IoT technologies are adopted by start-ups and tech companies to boost crop yield.

Some apps are designed to predict weather and soil conditions and to recommend which crop should be sown so that it survives until harvest under present and expected conditions. A great deal of background research is under way to study plants comprehensively, mostly for predictive analysis, in order to design Machine Learning and Deep Learning algorithms. The goal of this research is to perform a complete analysis of a plant from a single photograph, and it is also gathering the database needed to do so.

All technical papers surveyed gave us a first view on this challenging interplay

between AI and Agriculture. Taking profit from state-of-the-art sensing and actuator

technologies the contribution on DATA MINING AND PATTERN RECOGNITION IN

AGRICULTURE addresses challenges and potentials of appropriate methods in Agriculture.

Motivated by the need for increased resource efficiency, the paper on ROBOTS FOR FIELD

OPERATION WITH COMPREHENSIVE MULTILAYER CONTROL summarizes work on the

development of autonomous agricultural machines. A contribution to better understanding

between multiple cooperating actors is proposed in a submission on ONTOLOGY-BASED

MOBILE COMMUNICATION. Optimizing the operation of a harvesting logistics chain, consisting

of multiple cooperating vehicles in the field, will profit from the application of dynamic route

planning algorithms, as presented in a paper on SPATIAL-TEMPORAL CONSTRAINT PLANNING.

While the report on the iGREEN project spans from support for sharing and exchange among

agricultural operators to decision support and application control, the report on TOWARDS

SUPPORTING MOBILE BUSINESS PROCESSES focuses on the uncertainty encountered in the non-

deterministic agricultural environment and the application of agent technology to cope with

that. Innovative ways for agricultural agents to see and perceive their environment are

described in DETECTION OF FIELD STRUCTURES, which combines laser scanners and computer

vision with sophisticated modeling capabilities to enable the intended structure recognition.

In addition, progress has resulted in successful and interesting doctoral dissertation work. In order

to enable self-organized sensor integration in modular machines, BIO-INSPIRED SENSOR DATA

MANAGEMENT took inspiration from ant colonies and similar observations. MECHATRONIC

SYSTEMS investigate the adaptation of the operating parameters of a modern agricultural

machine to the current context and task details in the field.

Finally, after interacting with several farmers, our survey concluded that most farmers prefer apps because they are free and available online 24x7. The farmer does not need to do much: he simply takes a photograph of a plant and uploads it to the cloud. The backend processing performs the complete analysis of the photograph, including predictive analysis, and returns a detailed report to the farmer. All of this is possible only if a sufficiently large and accurate database is available to train the system. Hence, current research concentrates on algorithms rather than on hardware, since the algorithm is the backbone of the report.

3. HARDWARE AND SOFTWARE REQUIREMENTS

As of now, we are focusing only on software algorithms. We are using MATLAB R2017b to develop the algorithms. In future, we will use TensorFlow and a Python IDE to turn this algorithm into a complete product.

3.1 MATLAB:

The software currently in use is MATLAB. MATLAB, short for “Matrix Laboratory”, is a multi-paradigm numerical computing environment and proprietary

programming language developed by MathWorks. MATLAB allows matrix manipulations,

plotting of functions and data, implementation of algorithms, creation of user interfaces, and

interfacing with programs written in other languages, including C, C++, C#, Java, Fortran and

Python.

Although MATLAB is intended primarily for numerical computing, an optional

toolbox uses the MuPAD symbolic engine, allowing access to symbolic computing abilities.

An additional package, Simulink, adds graphical multi-domain simulation and model-based

design for dynamic and embedded systems.

The MATLAB application is built around the MATLAB scripting language. Common

usage of the MATLAB application involves using the Command Window as an interactive

mathematical shell or executing text files containing MATLAB code. Variables are defined

using the assignment operator, =. MATLAB is a weakly typed programming language because types are implicitly converted. It is an inferred typed language because variables can be assigned without declaring their type, except if they are to be treated as symbolic objects, and because their type can change. Values can come from constants, from computation involving values of other variables, or from the output of a function.

MATLAB provides several toolboxes, such as the Image Processing Toolbox, Bioinformatics Toolbox, Signal Processing Toolbox, Fuzzy Logic Toolbox, Neural Network Toolbox, and Statistics and Machine Learning Toolbox. Each toolbox contains predefined functions. The main advantage of MATLAB over other software is the ease of calling functions from a toolbox: once a toolbox is installed, its functions can be used directly. The toolboxes also include several examples that help beginners.

Fig.3.1 Editor & Command window in Matlab

Our project mainly uses the Bioinformatics Toolbox, the Neural Network Toolbox, the Statistics and Machine Learning Toolbox, and the Image Processing and Computer Vision toolboxes.

3.2 Neural Network Toolbox:

Neural Network Toolbox™ provides algorithms, pretrained models, and apps to create,

train, visualize, and simulate both shallow and deep neural networks. You can perform

classification, regression, clustering, dimensionality reduction, time-series forecasting, and

dynamic system modelling and control.

Deep learning networks include convolutional neural networks (ConvNets, CNNs),

directed acyclic graph (DAG) network topologies, and autoencoders for image classification,

regression, and feature learning. For time-series classification and regression, the toolbox

provides long short-term memory (LSTM) deep learning networks. You can visualize

intermediate layers and activations, modify network architecture, and monitor training

progress.

For small training sets, you can quickly apply deep learning by performing transfer learning with pretrained deep network models (including Inception-v3, ResNet-50, ResNet-101, GoogLeNet, AlexNet, VGG-16, and VGG-19) and models imported from TensorFlow™ Keras or Caffe.

To speed up training on large datasets, you can distribute computations and data across

multicore processors and GPUs on the desktop (with Parallel Computing Toolbox™), or scale

up to clusters and clouds, including Amazon EC2® P2, P3, and G3 GPU instances (with

MATLAB® Distributed Computing Server™).

Our project uses the Neural Network Toolbox to build an artificial perceptron with several hidden layers, which is trained over several epochs (passes over the training data), for feature extraction and accuracy. Features are extracted stage by stage: the output of each hidden layer is the input to the next hidden layer.
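The layer-to-layer flow described above can be sketched in a few lines. The example below is written in Python (the language the report names for its future implementation) rather than MATLAB, and it is only an illustration: the weights, biases, layer sizes and the choice of a ReLU activation are all hypothetical, not values taken from the project.

```python
def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum per neuron plus its bias."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        for neuron_weights, bias in zip(weights, biases)
    ]


def relu(values):
    """Rectified linear activation: negative values are clipped to zero."""
    return [max(0.0, v) for v in values]


def forward(inputs, layers):
    """Feed the inputs through every layer; each output is the next input."""
    activations = inputs
    for weights, biases in layers:
        activations = relu(dense(activations, weights, biases))
    return activations


# Hypothetical network: 2 inputs -> hidden layer of 2 neurons -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [-0.5]),
]
print(forward([2.0, 1.0], layers))  # [2.0]
```

Training would adjust the weights and biases over many epochs; this sketch shows only the forward pass through already fixed layers.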

Fig.3.2 Hidden Layers Model

3.3 Statistics and Machine Learning Toolbox:

Statistics and Machine Learning Toolbox™ provides functions and apps to describe,

analyze, and model data. You can use descriptive statistics and plots for exploratory data

analysis, fit probability distributions to data, generate random numbers for Monte Carlo

simulations, and perform hypothesis tests. Regression and classification algorithms let you

draw inferences from data and build predictive models.

For multidimensional data analysis, Statistics and Machine Learning Toolbox provides

feature selection, stepwise regression, principal component analysis (PCA), regularization, and

other dimensionality reduction methods that let you identify variables or features that impact

your model.

The toolbox provides supervised and unsupervised machine learning algorithms,

including support vector machines (SVMs), boosted and bagged decision trees, k-nearest

neighbour, k-means, k-medoids, hierarchical clustering, Gaussian mixture models, and hidden

Markov models. Many of the statistics and machine learning algorithms can be used for

computations on data sets that are too big to be stored in memory.

SVM, decision trees, and the K-means and fuzzy C-means algorithms are integral to our project and together produce its output. This toolbox provides the predictive analysis in the algorithm.

3.4 Bioinformatics Toolbox:

Bioinformatics Toolbox™ provides algorithms and apps for Next Generation

Sequencing (NGS), microarray analysis, mass spectrometry, and gene ontology. Using toolbox

functions, you can read genomic and proteomic data from standard file formats such as SAM,

FASTA, CEL, and CDF, as well as from online databases such as the NCBI Gene Expression

Omnibus and GenBank®. You can explore and visualize this data with sequence browsers,

spatial heatmaps, and clustergrams. The toolbox also provides statistical techniques for

detecting peaks, imputing values for missing data, and selecting features. In this project, the toolbox is used to extract features, such as GLCM texture features, from the plant leaves and to analyse them biologically.

Fig.3.3 K-Means Clustering model

3.5 Image Processing Toolbox:

Image Processing Toolbox™ provides a comprehensive set of reference-standard

algorithms and workflow apps for image processing, analysis, visualization, and algorithm

development. You can perform image segmentation, image enhancement, noise reduction,

geometric transformations, image registration, and 3D image processing.

Image Processing Toolbox apps let you automate common image processing

workflows. You can interactively segment image data, compare image registration techniques,

and batch-process large data sets. Visualization functions and apps let you explore images, 3D

volumes, and videos; adjust contrast; create histograms; and manipulate regions of interest

(ROIs).

This toolbox is the backbone of the project: image input and image analysis for further evaluation are done through it. Multiple images are stored in an array and converted to grayscale and HSV formats for feature extraction. The input image is normally an RGB image in which each channel of every pixel takes a value from 0 to 255.
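As an illustration of the colour conversions mentioned above, the following Python sketch converts 8-bit RGB pixels to grayscale and to HSV using only the standard library. The tiny 2x2 image is hypothetical, and the luminance weights (0.299, 0.587, 0.114) are the common ITU-R BT.601 ones, assumed here rather than taken from the project.

```python
import colorsys


def rgb_to_gray(pixel):
    """Luminance of an 8-bit RGB pixel using the ITU-R BT.601 weights."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def rgb_to_hsv(pixel):
    """Convert an 8-bit RGB pixel to (hue, saturation, value), each in [0, 1]."""
    r, g, b = (channel / 255.0 for channel in pixel)
    return colorsys.rgb_to_hsv(r, g, b)


# Hypothetical 2x2 RGB image stored as rows of (R, G, B) tuples.
image = [
    [(34, 139, 34), (255, 255, 255)],
    [(0, 0, 0), (139, 69, 19)],
]

gray = [[rgb_to_gray(p) for p in row] for row in image]
hsv = [[rgb_to_hsv(p) for p in row] for row in image]
```

Grayscale keeps only brightness, which suits texture and edge features, while HSV separates colour (hue) from brightness, which can help when lighting varies between photographs.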

3.6 System Configuration:

Since we are using deep learning algorithms for image analysis, an NVIDIA GTX 1050 Ti GPU is generally recommended for greater processing speed. Our project was developed on an Intel i7 processor with Intel graphics, which also supports deep learning image analysis. For our workload, the processing speed of the Intel graphics hardware was comparable to that of an NVIDIA GPU. Since the algorithm deals with a large dataset of high-resolution images, the iterations (epochs) take some seconds before the output is displayed.

4. PROPOSED SOLUTION

Our project is broadly divided into two algorithms:

(1) Artificial Neural Networks algorithm

The ANN is used to detect plant swelling (moisture content), burn symptoms, diseases and pests, along with soil analysis. Datasets of plant leaf, disease, pest and soil images are used for training in MATLAB and are grouped into clusters, each corresponding to a label. An input test image is assigned the label of the nearest cluster, based on Euclidean distance. The fuzzy C-means algorithm is used to identify the pest or disease present in the farm, and a Convolutional Neural Network is designed for accurate analysis. Unsupervised learning is used for classification since the input image is unknown and new to the algorithm.
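The nearest-label assignment by Euclidean distance can be sketched as follows. This is a minimal Python illustration, not the project's MATLAB code: the two-dimensional feature vectors, the centroid values and the label names are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_label(features, centroids):
    """Return the label whose cluster centroid is closest to the features."""
    return min(centroids, key=lambda label: euclidean(features, centroids[label]))


# Hypothetical cluster centroids in a two-feature space.
centroids = {
    "Healthy": [0.10, 0.90],
    "Disease 1": [0.60, 0.40],
    "Pest 1": [0.85, 0.15],
}

test_features = [0.55, 0.45]
print(nearest_label(test_features, centroids))  # Disease 1
```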

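Most real-time applications need unsupervised learning, since the input is always unknown to the algorithm. Both fuzzy C-means and K-means are unsupervised clustering algorithms, but K-means assigns each sample to exactly one cluster, whereas fuzzy C-means assigns each sample a degree of membership in every cluster. Hence, in our application fuzzy C-means gives a more accurate output than K-means.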
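To make the contrast between the two clustering methods concrete, the Python sketch below computes fuzzy C-means membership degrees for one sample next to the single hard assignment that K-means would make. It uses the standard membership formula with fuzzifier m = 2; the centroids and the sample point are hypothetical, and the centroid update step of either algorithm is omitted.

```python
import math


def fuzzy_memberships(point, centroids, m=2.0):
    """Fuzzy C-means membership of one point in each cluster (sums to 1)."""
    distances = [math.dist(point, c) for c in centroids]
    zeros = distances.count(0.0)
    if zeros:
        # A point lying exactly on a centroid belongs fully to that cluster.
        return [1.0 / zeros if d == 0.0 else 0.0 for d in distances]
    power = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** power for d_k in distances)
        for d_i in distances
    ]


def hard_assignment(point, centroids):
    """K-means style assignment: index of the single nearest centroid."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))


# Hypothetical centroids of two clusters and one sample between them.
centroids = [[0.0, 0.0], [4.0, 0.0]]
sample = [1.0, 0.0]

memberships = fuzzy_memberships(sample, centroids)
cluster = hard_assignment(sample, centroids)
```

For this sample the memberships come out as roughly 0.9 and 0.1, while the hard assignment reports only cluster 0. The soft memberships retain the information that the sample is partly similar to the second cluster.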

(2) Machine Learning algorithm

Based on the extracted feature parameters, the algorithm predicts whether the crop will suffer pest or disease attacks in future. The machine learning algorithm uses CART (classification and regression tree) to predict the future condition of the plant from the training data.
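The core step of CART is choosing, for a feature, the threshold that best separates the classes. The Python sketch below finds that threshold for a single numeric feature by minimising the weighted Gini impurity. The moisture readings and outcome labels are invented for illustration, and a full tree would apply this search recursively over all features.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Threshold on one feature that minimises the weighted Gini impurity."""
    pairs = sorted(zip(values, labels))
    best_threshold, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between two equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [label for value, label in pairs if value <= threshold]
        right = [label for value, label in pairs if value > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical training data: relative moisture level and later outcome.
moisture = [0.30, 0.35, 0.40, 0.70, 0.75, 0.80]
outcome = ["no attack", "no attack", "no attack", "attack", "attack", "attack"]

threshold, impurity = best_split(moisture, outcome)
```

With this toy data the best threshold lies midway between 0.40 and 0.70 and separates the two outcomes completely, so the impurity is zero.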

Starting from the basic image-reading process, we supply a dataset with roughly 500-600 images in each category: healthy plants, plants infected by different pests, and plants infected by different diseases. Such a large set is gathered to improve accuracy. We extract the features of each image in each category, namely GLCM texture features and, through edge detection, the moisture content level. These features describe the actual condition of the plant based on its pixel values. Moisture content is one of the major features used in the analysis.

Generally, when a plant receives excess water, its leaves swell beyond their normal size. Likewise, when the water content is low, its leaves shrink. In both cases the edges of the leaf reflect the moisture content, so edge detection is used to estimate the moisture content of the plant. Both scenarios make the plant prone to disease attacks; therefore, the moisture level plays a vital role in predicting them.
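The report does not give the exact relation used to turn leaf edges into a moisture estimate, so the Python sketch below shows only one plausible reading of the idea: find the leaf edges in each image row as large intensity jumps, take the widest edge-to-edge span as the leaf width, and compare it with the width of a reference leaf. The threshold, the tiny grayscale images and the interpretation of the ratio are all assumptions.

```python
def row_edges(row, threshold=50):
    """Column indices where the jump to the next pixel exceeds the threshold."""
    return [i for i in range(len(row) - 1) if abs(row[i + 1] - row[i]) > threshold]


def leaf_width(gray_image, threshold=50):
    """Largest distance between the first and last edge found in any row."""
    widths = []
    for row in gray_image:
        edges = row_edges(row, threshold)
        if len(edges) >= 2:
            widths.append(edges[-1] - edges[0])
    return max(widths, default=0)


def width_ratio(test_image, reference_image, threshold=50):
    """Leaf width of the test image relative to the reference leaf."""
    reference = leaf_width(reference_image, threshold)
    if reference == 0:
        raise ValueError("no leaf edges found in the reference image")
    return leaf_width(test_image, threshold) / reference


# Hypothetical grayscale images: bright background (255), dark leaf (60).
reference_leaf = [
    [255, 255, 255, 60, 60, 255, 255, 255],
    [255, 255, 60, 60, 60, 60, 255, 255],
]
swollen_leaf = [
    [255, 255, 60, 60, 60, 60, 255, 255],
    [255, 60, 60, 60, 60, 60, 60, 255],
]

ratio = width_ratio(swollen_leaf, reference_leaf)
```

Under this assumed reading, a ratio above 1 would suggest a swollen leaf (excess moisture) and a ratio below 1 a shrunken one, provided both photographs are taken at the same scale.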

The extracted features, comprising GLCM texture features (the first four) and pixel-intensity statistics (the rest), include:

• Contrast

• Homogeneity

• Correlation

• Energy

• Skew

• Kurtosis

• RMS

• Standard Deviation

• Mean

• Variance

These parameters are texture features computed from pixel values. Together with the moisture content, they are computed for every category of the image dataset, which contains images of healthy plants, partially infected plants, and completely infected plants (for both diseases and pests). The feature values are used for training and are clustered by category, so that healthy, partially infected and completely infected plants fall under separate labels. Training follows an unsupervised learning approach, which is why the fuzzy C-means algorithm is employed. The K-means algorithm is used to cluster the data and create centroids against which the input test image is compared.
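For readers unfamiliar with the gray-level co-occurrence matrix (GLCM), the Python sketch below builds a normalised GLCM for horizontally adjacent pixels and derives contrast, homogeneity, energy and correlation from it. The formulas follow the common definitions (the ones documented for MATLAB's `graycoprops`); the 4x4 image with four gray levels is a hypothetical example, and real images would first be quantised to a small number of levels.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[value / total for value in row] for row in counts]


def glcm_features(p):
    """Contrast, homogeneity, energy and correlation of a normalised GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * v for i, _, v in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * v for _, j, v in cells))
    if sd_i == 0 or sd_j == 0:
        correlation = float("nan")  # undefined for a constant image
    else:
        covariance = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
        correlation = covariance / (sd_i * sd_j)
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "correlation": correlation,
    }


# Hypothetical 4x4 image already quantised to gray levels 0..3.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

features = glcm_features(glcm(image, levels=4))
```

Low contrast together with high homogeneity and energy indicates a smooth, uniform texture; lesions or pest damage would be expected to shift these values, which is what makes them usable as classification features.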

We now describe how the algorithm was designed for this application. After the image dataset is gathered, the data is classified into labels according to their defined names, such as:

Label 0: Healthy        Label 0: Healthy
Label 1: Disease 1      Label 1: Pest 1
Label 2: Disease 2      Label 2: Pest 2
...                     ...
Label N: Disease N      Label N: Pest N

The trained network forms several clusters, which an SVM classifier separates in the X-Y feature plane. Features are extracted from the input image, called the test image, and each feature is compared with the features of the training data. The centroid of the nearest corresponding feature cluster is selected, and the label of that cluster is displayed, for example “Disease X and Pest Y”. This is done by the K-means clustering mechanism and constitutes the disease/pest detection phase. The neural network part is a Convolutional Neural Network (CNN), used here for non-linear regression models. We create an artificial neural network with several hidden layers; Fig.4.1 and Fig.4.2 give a glimpse of the CNN and its hidden layers. In this project each hidden layer performs several processing steps, such as feature extraction. The number of hidden layers is chosen by the developer based on the size of the dataset. In general, more hidden layers and a larger dataset tend to give higher accuracy in detecting a feature or the status of a given input. The number of epochs (iterations) is also chosen by the developer, but training stops automatically once the maximum accuracy is reached. Suppose we supply 100 images and allow 200 iterations: if the algorithm classifies all the images correctly at the 67th iteration, it does not run the 68th or any later iteration. This early stopping saves processing time and is one of the advantages of using MATLAB.
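The stopping behaviour described above (an upper limit on epochs, with training ending as soon as every training sample is classified correctly) can be illustrated with a single-neuron perceptron. This Python sketch is not the project's network: the four two-feature samples, the learning rate and the epoch limit are hypothetical.

```python
def predict(weights, bias, x):
    """Binary perceptron output: 1 if the weighted sum is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train_perceptron(samples, labels, max_epochs=200, learning_rate=0.1):
    """Train until an epoch produces no errors or max_epochs is reached."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for x, target in zip(samples, labels):
            error = target - predict(weights, bias, x)
            if error != 0:
                errors += 1
                weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
                bias += learning_rate * error
        if errors == 0:
            return weights, bias, epoch  # early stop: everything classified
    return weights, bias, max_epochs


# Hypothetical two-feature samples: label 0 = healthy, label 1 = infected.
samples = [[0.2, 0.1], [0.3, 0.2], [0.8, 0.9], [0.9, 0.7]]
labels = [0, 0, 1, 1]

weights, bias, epochs_used = train_perceptron(samples, labels)
```

On this small, linearly separable data set the loop ends long before the 200-epoch limit, which is the same effect as the 67th-iteration example in the text.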

The innovative part of the project is the decision tree built with the machine learning algorithm, for which the same clusters and classifiers are used. Each parameter, from healthy to completely infected plants, is analysed, and calculations are carried out to build the decision tree. The values obtained from partially and completely infected plants play the key role in building it. Trial and error is carried out on those values, and the algorithm is trained to make its own decisions in predicting, from the training data, the chance of future pest and disease attacks and the number of days until they occur. This phase is the innovative, research-oriented part of the project compared with the other phases.

Fig.4.1 CNN model

Fig.4.2 Neural Network model

Fig.4.3 Decision Making Tree model

Block Diagram:

Fig.4.4 Block Diagram

Fig.4.4 shows the block diagram of our proposed solution. The first diagram gives the broad layout of the solution and the second gives a detailed description; together they cover both the detection and the prediction parts.

5. RESULTS & DISCUSSIONS

So far, we have completed the training phase on the image dataset. In addition, we have classified the input dataset into its corresponding labels and trained the algorithm to classify it using K-means clustering. We have categorised the images using the bag-of-features function built into MATLAB. The moisture content estimation using edge detection is also complete, along with an estimate of how many days the crop can last without external attacks in the post-harvest period. This part is very useful for agro-based industries because, according to a survey, 20% of the post-harvest crop is destroyed because optimum moisture levels are not maintained.

Fig.5.1 Moisture Content extraction with Sparsity levels

Fig.5.2 Moisture Content & Status of the leaf

Figures 5.1 and 5.2 show the moisture content of the input test image compared with that of a leaf with ideal moisture. We developed a mathematical relation for comparing the moisture levels through the edge detection algorithm. After trial and error, we obtained the approximate moisture level of the input and also estimated how many days the crop can last without external infection. We took the edges of the leaf along the X-axis and Y-axis of the image and calculated its width. As mentioned above, if the moisture is higher than normal the leaf swells, and if it is lower the leaf shrinks.

Hence, this parameter is also used to predict disease attacks from the present moisture content. For example, if the moisture is high, there is a chance of fungal attack. Therefore, it is one of the key parameters in predicting attacks by external agents.

Fig.5.3 Image Features Extraction using Bag-Of-Features & Iterations using K-Means Clustering

Figure 5.3 shows feature extraction using MATLAB’s built-in bag-of-features function. The function automatically clusters the image features using K-means clustering. As the figure shows, the algorithm extracted 53055 features and formed 500 clusters. Once these fundamentals are in place, the clusters are ready for training and for allocation to labels for the detection phase. We used the disease names of the plants as category names. After clustering, the image data falls under the corresponding labels for identification. This forms the X and Y axes in the SVM plane. In this way we have created a perceptron in which image feature extraction is done in the hidden layers. The epochs are the number of iterations; we specified 500 iterations. As mentioned above, however, the algorithm stops iterating automatically when it reaches its highest accuracy for the given image dataset.
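The bag-of-features idea can be sketched in two steps: cluster local feature descriptors with K-means to obtain a vocabulary of "visual words", then describe each image by how often its descriptors fall into each word. The Python sketch below uses plain Lloyd iterations with a deterministic start; the two-dimensional descriptors are hypothetical stand-ins for the high-dimensional descriptors, and the two clusters stand in for the 500 reported in the figure.

```python
import math


def assign(point, centroids):
    """Index of the centroid nearest to the point."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm, started from the first k points for repeatability."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(max_iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[assign(p, centroids)].append(p)
        updated = [
            [sum(dim) / len(cluster) for dim in zip(*cluster)] if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:
            break  # assignments are stable, so the centroids have converged
        centroids = updated
    return centroids


def bag_of_features(descriptors, vocabulary):
    """Normalised histogram of visual-word occurrences for one image."""
    histogram = [0] * len(vocabulary)
    if not descriptors:
        return histogram
    for d in descriptors:
        histogram[assign(d, vocabulary)] += 1
    return [count / len(descriptors) for count in histogram]


# Hypothetical descriptors pooled from all training images.
training_descriptors = [[0, 0], [10, 10], [0, 1], [1, 0], [10, 11], [11, 10]]
vocabulary = kmeans(training_descriptors, k=2)

# Hypothetical descriptors extracted from one test image.
histogram = bag_of_features([[0, 0.5], [9, 9], [10, 10], [11, 11]], vocabulary)
```

The resulting fixed-length histogram is what a classifier such as an SVM would then be trained on, regardless of how many descriptors each image produced.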

Fig.5.4 GLCM Texture Extraction

Figure 5.4 shows the GLCM texture features, which are based on the pixel values of the training data and test data. This step extracts the author-defined parameters, as opposed to those produced by MATLAB’s built-in function. Several parameters can be defined from the pixel intensity of the image and used for detection and predictive analysis. The training data contains several such parameters and is classified into labels. The features of the input test image are extracted and compared with the trained values, and the nearest corresponding centroid “K” is calculated and displayed. This is done for the detection process.
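As an illustration of parameters defined directly from pixel intensity, the Python sketch below computes the mean, variance, standard deviation, RMS, skewness and kurtosis of a list of gray values. Population (divide-by-n) formulas are assumed here, since the report does not say which convention was used, and the eight pixel values are hypothetical.

```python
import math


def intensity_statistics(pixels):
    """First-order statistics of a flat list of pixel intensities."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(variance)
    rms = math.sqrt(sum(p * p for p in pixels) / n)
    if std == 0:
        skewness = kurtosis = float("nan")  # undefined for a constant image
    else:
        skewness = sum((p - mean) ** 3 for p in pixels) / n / std ** 3
        kurtosis = sum((p - mean) ** 4 for p in pixels) / n / std ** 4
    return {
        "mean": mean,
        "variance": variance,
        "std": std,
        "rms": rms,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Hypothetical gray values from a small image patch.
stats = intensity_statistics([2, 4, 4, 4, 5, 5, 7, 9])
```

For this patch the mean is 5 and the standard deviation is 2; the positive skewness reflects the few brighter pixels in the tail.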

Fig.5.5 Training Data

Figure 5.5 shows the training data for the given image dataset. As the figure shows, normal image categories are formed for detection, and the processing for prediction is also carried out. An SVM classifier is used as the multi-class classifier in the predictive analysis. The training data is created using machine learning algorithms and is further used in creating the CART clusters.

To date, the following parts of the project are complete:

• Image classification using K-Means clustering

• Image categorisation using Bag-Of-Features

• Image feature extraction using GLCM texture extraction

• Training image dataset

• Moisture content calculation and prediction of how long the crop can withstand

Environment Safety:

Since we process the plants purely through images, without disturbing their environment, the proposed project poses no environmental hazard. It is therefore completely safe and in fact helps the plants grow more effectively at zero cost.

6. CONCLUSION & FUTURE SCOPE

6.1 Conclusion

We have completed image classification, image categorisation, feature extraction and
creation of the training data. The entire algorithm was developed in MATLAB, using the
Statistics and Machine Learning Toolbox, the Neural Network Toolbox and the Image
Processing Toolbox. The outputs so far are the training data in the form of image categories,
image classification using K-Means clustering, and the moisture content together with a
prediction of how long the plant can withstand it. The algorithm trains on and classifies the
given image dataset; a test input image is then compared with the trained data for detection
and prediction. We are focusing on unsupervised learning to improve accuracy. For example,
take training data from Indian rice plants and a test input from an African rice plant. With
supervised training the accuracy would be low because of the slight difference in appearance.
We expect an unsupervised method such as the Fuzzy C-Means algorithm to reach about 99%
accuracy on this example. As the name suggests, cluster membership is fuzzy (each sample
belongs to every cluster to some degree), yet the final classification can still be precise.
We are therefore avoiding supervised learning techniques.
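To make the idea of fuzzy membership concrete, the following is a minimal hand-written sketch of the standard Fuzzy C-Means update rules. It is not the project's implementation: the data are synthetic, the fuzzifier m = 2 and the stopping tolerance are assumed values, and the code relies on implicit expansion (MATLAB R2016b or later).

```matlab
rng(1);  % reproducible initial memberships

% Synthetic feature vectors: two groups with some spread (illustrative only).
X = [randn(20, 2) * 0.4;          % group around (0, 0)
     randn(20, 2) * 0.4 + 3];     % group around (3, 3)

c = 2;          % number of clusters
m = 2;          % fuzzifier (> 1); larger values give softer memberships
maxIter = 100;
tol = 1e-6;

% Random initial membership matrix U (n-by-c), each row summing to 1.
n = size(X, 1);
U = rand(n, c);
U = U ./ sum(U, 2);

for iter = 1:maxIter
    Um = U .^ m;
    % Cluster centres: membership-weighted means of the samples.
    centres = (Um' * X) ./ sum(Um, 1)';
    % Squared Euclidean distance from every sample to every centre.
    D2 = sum(X .^ 2, 2) + sum(centres .^ 2, 2)' - 2 * (X * centres');
    D2 = max(D2, eps);            % guard against division by zero
    % Membership update: closer centres receive higher membership.
    W = D2 .^ (-1 / (m - 1));
    Unew = W ./ sum(W, 2);
    converged = max(abs(Unew(:) - U(:))) < tol;
    U = Unew;
    if converged
        break;
    end
end

% Hard labels, if needed, come from the largest membership in each row.
[~, hardLabel] = max(U, [], 2);
disp(centres);
```

Each row of U holds the degrees to which one sample belongs to each cluster, which is what allows a sample that looks slightly different from the training plants to still receive a high membership in the closest group.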

6.2 Future Scope

We plan to develop the project from a prototype into a complete end-user product.
This can be done with the TensorFlow library in Python, running on high-performance
processors (NVIDIA hardware is recommended). The end product would identify diseases and
pests and accurately predict their attacks. A larger dataset would be provided for training
the network. The whole algorithm would be developed in TensorFlow for better processing
performance, with OpenCV used for image analysis in the role that the Image Processing
Toolbox plays in MATLAB. The farmer would then only have to take a photo of the leaf and
upload it to the cloud, where back-end processing would run the detection and prediction
analysis and return corrective measures for preventing and eliminating pests and diseases.


7. BIBLIOGRAPHY

1. MachineLearning: What it is and why it matters, 09 2016, [online] Available:

www.sas.com

2. R. E. Schapire, "The boosting approach to machine learning: An overview" in

Nonlinear estimation and classification, New York:Springer, pp. 149-171, 2003.

3. A. Ben-Hur, D. Horn, H. T. Siegelmann, V. Vapnik, "Support vector

clustering", Journal of machine learning research, vol. 2, pp. 125-137, Dec 2001.

4. J. R. Otukei, T. Blaschke, "Land cover change assessment using decision trees support

vector machines and maximum likelihood classification algorithms", International

Journal of Applied Earth Observation and Geoinformation, vol. 12, pp. S27-S31, 2010.

5. Panigrahi, S. & Ting K.C. (1998) Artificial Intelligence for Biology and Agriculture,

Kluwer Academic Press).

6. Mukhopadhyay S.C. (2012) Smart Sensing Technology for Agriculture and

Environmental Monitoring. Vol. 146, Springer Berlin Heidelberg.

7. German, L., Ramisch, J.J. & Verma R. (2010) Beyond the Biophysical, Knowledge,

Culture, and Power in Agriculture and Natural Resource Management, Springer Publ.

8. Jun Wu, Anastasiya Olesnikova, Chi-Hwa Song, Won Don Lee (2009). The

Development and Application of Decision Tree for Agriculture Data. IITSI :16-20.

9. Leemans, V., Destain, M.F.,2004.A real-time grading method of apples based on

features extracted from defects. J. Food Eng. 61, 83-89.

10. Quinlan, J.R.(1985b). Decision trees and multi-valued attributes. In J.E. Hayes & D.

Michie (Eds.), Machine intelligence 11. Oxford University Press (in press).

11. Zelu Zia (2009). An Expert System Based on Spatial Data Mining used Decision Tree

for Agriculture Land Grading. Second International Conference on Intelligent

Computation Technology and Automation. Oct10-11, China

12. Y. Hayami, V. W. Ruttan, Agricultural development: an international perspective,

Baltimore, London:The Johns Hopkins Press, 1971.

13. S. Ray, Essentials of Machine Learning Algorithms (with Python and R Codes)“ in

Analytics Vidhya, 2015, [online] Available:

https://www.analyticsvidhya.com/blog/2015/08/common-machine-learning-

algorithms/.

http://www.sas.com/

https://www.analyticsvidhya.com/blog/2015/08/common-machine-learning-algorithms/

https://www.analyticsvidhya.com/blog/2015/08/common-machine-learning-algorithms/

25

14. S. Veenadhari, D. Bharat Mishra, D. C. Singh, "Soybean Productivity Modelling Using

Decision Tree Algorithms", International Journal of Computer Applications, vol. 27,

no. 7, pp. 11-15, Aug. 2011.

15. M. S. Dahikar, D. V. Rode, "Agricultural Crop Yield Prediction Using Artificial Neural

Network Approach", INTERNATIONAL JOURNAL OF INNOVATIVE RESEARCH IN

ELECTRICAL ELECTRONICS INSTRUMENTATION AND CONTROL

ENGINEERING, vol. 2, no. 1, pp. 683-686, 2014.

16. M. Somvanshi, P. Chavan, "A review of machine learning techniques using decision

tree and support vector machine", 2016 International Conference on Computing

Communication Control and automation (ICCUBEA), pp. 1-7, 2016.

26


9. APPENDICES

Codes:

1. Moisture Content

clc;
close all;
clear all;

% Reference (healthy) leaf
testimag = imread('C:\matlab projects\cornhealthy.jpg');
I1 = rgb2gray(testimag);
subplot(2,2,1); image(testimag); disp('input healthy rice leaf');
subplot(2,2,2); image(I1); disp('gray scale image');
BW = edge(I1);
[r1,c1] = find(BW);
subplot(2,2,3); spy(BW); disp('sparsity pattern of healthy rice');
% Row and column extent of the detected edges
x2 = max(r1); x1 = min(r1); X1 = x2 - x1
y2 = max(c1); y1 = min(c1); Y1 = y2 - y1
subplot(2,2,4); imshow(BW); disp('edge detection');

% Test leaf (replace the path with the image of the leaf under test)
figure;
testimag = imread('C:\matlab projects\cornhealthy.jpg');
I2 = rgb2gray(testimag);
subplot(2,2,1); image(testimag); disp('input healthy rice leaf');
subplot(2,2,2); image(I2); disp('gray scale image');
BW = edge(I2);
[r2,c2] = find(BW);
subplot(2,2,3); spy(BW); disp('sparsity pattern of healthy rice');
x3 = max(r2); x4 = min(r2); X2 = x3 - x4
y3 = max(c2); y4 = min(c2); Y2 = y3 - y4
subplot(2,2,4); imshow(BW); disp('edge detection');

% A smaller edge extent than the reference leaf is read as lower moisture
if ((X2 < X1) || (Y2 < Y1))
    disp('less moisture');
else
    disp('more moisture');
end
if ((X1 == X2) && (Y1 == Y2))
    disp('healthy leaf');
end

% Pixel-by-pixel comparison of the two gray-scale images
c = isequal(I1,I2);
if (c == 1)
    disp('healthy');
else
    disp('not healthy');
end

2. GLCM Feature Extraction

imds1 = imread('C:\matlab projects\Training\ObjectCategories\Planthopper\p-1.jpg');
I1 = rgb2gray(imds1);

% Texture statistics from the gray-level co-occurrence matrix
GLCM4 = graycomatrix(I1);
stats1 = graycoprops(GLCM4,{'contrast','homogeneity','correlation','Energy'});
disp(stats1);
V1 = rms(GLCM4);
disp(V1);

% Histogram of the hue channel
hsv_im = rgb2hsv(imds1);
h1 = hsv_im(:,:,1);
[pixelCount, grayLevels] = hist(h1(:), 360);
pixelCount(20:30) = 0;

% Get the number of pixels in the histogram.
numberOfPixels1 = sum(pixelCount);
disp(numberOfPixels1);

% Get the mean gray level.
meanGL1 = sum(grayLevels .* pixelCount) / numberOfPixels1;
disp(meanGL1);

% Get the variance, which is the second central moment.
varianceGL1 = sum((grayLevels - meanGL1) .^ 2 .* pixelCount) / (numberOfPixels1 - 1);
disp(varianceGL1);

% Get the standard deviation.
sd1 = sqrt(varianceGL1);
disp(sd1);

% Get the skew.
skew1 = sum((grayLevels - meanGL1) .^ 3 .* pixelCount) / ((numberOfPixels1 - 1) * sd1^3);
disp(skew1);

% Get the kurtosis.
kurtosis1 = sum((grayLevels - meanGL1) .^ 4 .* pixelCount) / ((numberOfPixels1 - 1) * sd1^4);
disp(kurtosis1);


3. Training Image Data Set

rootFolder = 'C:\matlab projects\Manu Disease Dataset';
categories = {'Alternaria Alternata','Anthracnose','Bacterial Blight','Cercospora Leaf Spot','Healthy Leaves'};
imds = imageDatastore(fullfile(rootFolder, categories), 'LabelSource', 'foldernames');

% Use 5 randomly chosen images per category for training; the rest form the test set
[trainingSet,testSet] = splitEachLabel(imds,5,'randomize');

% Build the visual vocabulary and train the category classifier
bag = bagOfFeatures(trainingSet);
categoryClassifier = trainImageCategoryClassifier(trainingSet,bag);

% Predict labels for the held-out images and display them
[labelIdx, score] = predict(categoryClassifier,testSet);
categoryClassifier.Labels(labelIdx)
