DETECTION & PREDICTION OF PESTS/DISEASES USING
DEEP LEARNING
1. INTRODUCTION
Deep learning can accurately detect the presence of pests and diseases on farms. Building on
this, the machine learning algorithm CART (classification and regression tree) can predict the
likelihood of future disease and pest attacks. Manual monitoring cannot accurately estimate the
extent and intensity of a pest or disease attack, and therefore cannot determine the correct
amount of pesticide or fertilizer to spray. An artificial perceptron can instead estimate this
value and recommend how much pesticide or fertilizer to spray at specified target areas. The
aim of the project is to help farmers protect their farms from pest and disease attacks and
eliminate them without disturbing the soil or the unaffected parts of other plants. In India,
most farmers rely on manual monitoring and on a few apps that have severe database limitations
and are restricted to detection. Since prevention is better than cure, our project aims to
predict future pest and disease attacks so that farmers can prevent them.
Technology is playing a crucial role in developing farms and agro-based industries. Today,
technology makes it possible to grow crops even in deserts, and it has penetrated deep into the
agriculture sector. Automation is currently the most in-demand tool in agriculture. Many
companies have introduced solutions based on machine learning and artificial intelligence,
transforming agriculture into digital agriculture. Many trials have shown that deploying
technology on farms increases crop yield and, in turn, farmers' revenue. This paper discusses
and tests the implementation of deep learning in agriculture.
Diagnosis is always a concern for farmers in India. At the same time, for fear of pest and
disease attacks, farmers spray pesticides and fertilizers uniformly over the whole farm, which
may damage both soil and plants. The aim of this project is to enable the farmer to spray a
limited but sufficient amount of pesticide or fertilizer at a specified target area where a pest
or disease is present or where an attack may occur in future. This helps farmers prevent such
attacks and eliminate any that are present by spraying limited amounts, without polluting the
soil or other parts of the plants. The major advantages are an increase in the farmer's annual
revenue and a reduction in crop loss caused by pest and disease attacks.
2. LITERATURE SURVEY
In India, agri-tech is changing rapidly, yet most farmers do not use the latest technology on
their farms. IoT-based agriculture appears often in journals, but little of it has been properly
adopted on Indian farms. There is a huge gap between technology and farmers in India. Many
start-ups have emerged to bridge this gap, and many MNCs are now investing in Indian agri-tech.
Food demand is rising rapidly as the population grows. The era of tractors and heavy machinery
is giving way to smart technology such as the Internet of Things, artificial intelligence and
machine learning. On American farms, smart sensors are replacing heavy machinery. Farmers are
using technology such as temperature and moisture sensors, drones, smart irrigation, terrain
contour mapping, and self-driving, GPS-enabled tractors and rovers to produce food more
sustainably. According to “The Economist”, farmers are being “teched up” when it comes to
growing crops more sustainably and profitably. Pests and diseases attack crops and thereby
reduce the food supply. By 2050, the world's population is expected to grow to 9.7 billion, so
a clear rise in food demand is foreseeable.
Modern Agriculture faces tremendous challenges. Today, the agricultural sector has
grown into a highly competitive and globalized industry, where farmers and other actors have
to consider local climatic and geographic aspects as well as global ecological and political
factors in order to guarantee economic survival and sustainable production. Feeding a
growing world population asks for continuous increases in food production, but arable land
remains a limited resource. New requests for bio energy or changing diet preferences put
additional strains on agricultural production, while settlement and transport consume
increasing shares of land. Expected and observable changes in global climate, shifting rainfall
patterns, global warming, droughts, or the increasing frequency and duration of extreme
weather events endanger traditional production areas and bring new risks and uncertainties
for global harvest yields. To cope with these challenges, Agriculture requires a continuous
and sustainable increase in productivity and efficiency on all levels of agricultural
production, while resources like water, energy, fertilizers etc. need to be used carefully and
efficiently in order to protect and sustain the environment and the soil quality of the arable
land. The complexity of the challenge is increased by other short-term events which are
difficult to predict, such as epidemics, financial crisis, or price volatility for agricultural raw
materials and products.
From the AI point of view, Agriculture offers a vast application area for all kinds of
AI core technologies: Mobile, autonomous agents operating in uncontrolled environments,
stand-alone or in collaborative settings, make it possible to investigate, test and exploit technologies
from robotics, computer vision, sensing, and environment interaction. Integrating multiple
partners and their heterogeneous information sources leads to application of semantic
technologies. The complexity of the agricultural production asks for progress in modelling
capabilities, handling of uncertainty, and in the algorithmic and usability aspects of location-
and context-specific decision support. The growing interest in reliable predictions as a basis
for planning and control of agricultural activities requires the interdisciplinary cooperation
with domain experts e.g. from agricultural research. Modern agricultural machines shall use
self-configuring components and shall be able to collaborate and exhibit aspects of self-
organization and swarm intelligence.
Automation is the technology that Indian start-ups focus on most. Automated drones and bots
are deployed on farms to monitor and service the crop. The technology is also shifting from
blanket spraying to targeted spraying of pesticides and fertilizers. Artificial intelligence,
machine learning and deep learning algorithms are adopted to monitor crops precisely, detect
affected areas on the farm, and spray a corrective solution only in those target areas.
Several start-ups in India have launched automation products for the agricultural sector,
mostly drones and digital apps designed to improve crop yield. Drones are deployed to spray the
entire field using RADAR. The latitude and longitude of the farm boundary are marked on a map,
and RADAR is used to maintain a constant height between the drone and the crop to avoid
collisions. This technology sprays pesticides and fertilizers over the entire farm, across wide
areas, within a few minutes. This saves the farmer's time and protects the crop.
Several digital apps are designed to help farmers identify the diseases affecting their
farms. Some even estimate the NPK (nitrogen, phosphorus and potassium) values of the plant to
monitor its health.
Many MNCs are investing heavily in agricultural technology. Artificial intelligence,
machine learning, deep learning and IoT technologies are being adopted by start-ups and tech
companies to boost crop yield.
Some apps predict weather and soil conditions and recommend which type of crop should be
sown so that it survives until harvest under present and future conditions. A great deal of
background research is under way to study plants comprehensively, mostly for predictive
analysis, in order to design algorithms using machine learning and deep learning. The goal is a
complete analysis of a plant from a single photograph, and research is ongoing to gather the
necessary database.
All technical papers surveyed gave us a first view on this challenging interplay
between AI and Agriculture. Taking profit from state-of-the-art sensing and actuator
technologies the contribution on DATA MINING AND PATTERN RECOGNITION IN
AGRICULTURE addresses challenges and potentials of appropriate methods in Agriculture.
Motivated by the need for increased resource efficiency, the paper on ROBOTS FOR FIELD
OPERATION WITH COMPREHENSIVE MULTILAYER CONTROL summarizes work on the
development of autonomous agricultural machines. A contribution to better understanding
between multiple cooperating actors is proposed in a submission on ONTOLOGY-BASED
MOBILE COMMUNICATION. Optimizing the operation of a harvesting logistics chain, consisting
of multiple cooperating vehicles in the field, will profit from the application of dynamic route
planning algorithms, as presented in a paper on SPATIAL-TEMPORAL CONSTRAINT PLANNING.
While the report on the iGREEN project spans from support for sharing and exchange among
agricultural operators to decision support and application control, the report on TOWARDS
SUPPORTING MOBILE BUSINESS PROCESSES focuses on the uncertainty encountered in the non-
deterministic agricultural environment and the application of agent technology to cope with
that. Innovative ways for agricultural agents to see and perceive their environment are
described in DETECTION OF FIELD STRUCTURES, which combines laser scanners and computer
vision with sophisticated modeling capabilities to enable the intended structure recognition.
In addition, progress has resulted in successful and interesting doctoral dissertation work: in order
to enable self-organized sensor integration in modular machines, BIO-INSPIRED SENSOR DATA
MANAGEMENT took inspiration from ant colonies and similar observations. MECHATRONIC
SYSTEMS investigate the adaptation of the operating parameters of a modern agricultural
machine to the current context and task details in the field.
Finally, after interacting with several farmers, our survey concluded that most farmers
prefer apps because they are free and available online 24x7. The farmer needs to do very
little: he simply takes a photograph of a plant and uploads it to the cloud. The backend
processing analyses the photograph and returns a detailed report to the farmer, including
predictive analysis. All of this is possible only if an adequate and accurate database is
available to train the system. Hence, research is increasingly focused on algorithms rather
than hardware, since the algorithm is the backbone of the report.
3. HARDWARE AND SOFTWARE REQUIREMENTS
At present, we are focusing only on software algorithms. We are using MATLAB R2017b to
develop them. In future, we will use TensorFlow and a Python IDE to turn this algorithm into a
complete product.
3.1 MATLAB:
The software currently in use is MATLAB. MATLAB (short for “Matrix Laboratory”) is a
multi-paradigm numerical computing environment and proprietary programming language
developed by MathWorks. MATLAB allows matrix manipulations,
plotting of functions and data, implementation of algorithms, creation of user interfaces, and
interfacing with programs written in other languages, including C, C++, C#, Java, Fortran and
Python.
Although MATLAB is intended primarily for numerical computing, an optional
toolbox uses the MuPAD symbolic engine, allowing access to symbolic computing abilities.
An additional package, Simulink, adds graphical multi-domain simulation and model-based
design for dynamic and embedded systems.
The MATLAB application is built around the MATLAB scripting language. Common
usage of the MATLAB application involves using the Command Window as an interactive
mathematical shell or executing text files containing MATLAB code. Variables are defined
using the assignment operator, = . MATLAB is a weakly typed programming language
because types are implicitly converted. It is an inferred typed language because variables can
be assigned without declaring their type, except if they are to be treated as symbolic
objects,[13] and that their type can change. Values can come from constants, from computation
involving values of other variables, or from the output of a function.
Several toolboxes are available for MATLAB, such as Image Processing Toolbox, Bioinformatics
Toolbox, Signal Processing Toolbox, Fuzzy Logic Toolbox, Neural Network Toolbox, and
Statistics and Machine Learning Toolbox. Each toolbox contains predefined functions. The main
advantage of MATLAB over other software is the ease of calling functions from a given toolbox:
once a toolbox is installed, its functions can be used directly. The toolboxes also include
several examples to help beginners.
Fig.3.1 Editor & Command window in Matlab
Our project mainly uses the Bioinformatics Toolbox, the Neural Network Toolbox, the Statistics
and Machine Learning Toolbox, and the Image Processing and Computer Vision toolboxes.
3.2 Neural Network Toolbox:
Neural Network Toolbox™ provides algorithms, pretrained models, and apps to create,
train, visualize, and simulate both shallow and deep neural networks. You can perform
classification, regression, clustering, dimensionality reduction, time-series forecasting, and
dynamic system modelling and control.
Deep learning networks include convolutional neural networks (ConvNets, CNNs),
directed acyclic graph (DAG) network topologies, and autoencoders for image classification,
regression, and feature learning. For time-series classification and regression, the toolbox
provides long short-term memory (LSTM) deep learning networks. You can visualize
intermediate layers and activations, modify network architecture, and monitor training
progress.
For small training sets, you can quickly apply deep learning by performing transfer
learning with pretrained deep network models (including Inception-v3, ResNet-50, ResNet-101,
GoogLeNet, AlexNet, VGG-16, and VGG-19) and models imported from TensorFlow™
Keras or Caffe.
To speed up training on large datasets, you can distribute computations and data across
multicore processors and GPUs on the desktop (with Parallel Computing Toolbox™), or scale
up to clusters and clouds, including Amazon EC2® P2, P3, and G3 GPU instances (with
MATLAB® Distributed Computing Server™).
Our project uses Neural Network Toolbox to create an artificial perceptron with several
hidden layers for feature extraction. The network is trained over a number of epochs, that is,
passes over the training data. Features are extracted stage by stage: the output of each hidden
layer is the input to the next hidden layer.
Fig.3.2 Hidden Layers Model
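The layer-to-layer flow described above can be illustrated with a minimal sketch in plain Python. It is not the project's MATLAB network: the sigmoid activation, the layer sizes and all weight values are assumptions chosen only to show how each layer's output becomes the next layer's input.

```python
import math

def dense_layer(inputs, weights, biases):
    """One fully connected layer followed by a sigmoid activation."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs

def forward(inputs, layers):
    """Feed the output of each layer into the next one."""
    activation = inputs
    for weights, biases in layers:
        activation = dense_layer(activation, weights, biases)
    return activation

# Hypothetical weights: 3 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
output = forward([0.2, 0.4, 0.6], layers)
print(output)
```

In a real network the weights would be learned during training rather than written by hand.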
3.3 Statistics and Machine Learning Toolbox:
Statistics and Machine Learning Toolbox™ provides functions and apps to describe,
analyze, and model data. You can use descriptive statistics and plots for exploratory data
analysis, fit probability distributions to data, generate random numbers for Monte Carlo
simulations, and perform hypothesis tests. Regression and classification algorithms let you
draw inferences from data and build predictive models.
For multidimensional data analysis, Statistics and Machine Learning Toolbox provides
feature selection, stepwise regression, principal component analysis (PCA), regularization, and
other dimensionality reduction methods that let you identify variables or features that impact
your model.
The toolbox provides supervised and unsupervised machine learning algorithms,
including support vector machines (SVMs), boosted and bagged decision trees, k-nearest
neighbour, k-means, k-medoids, hierarchical clustering, Gaussian mixture models, and hidden
Markov models. Many of the statistics and machine learning algorithms can be used for
computations on data sets that are too big to be stored in memory.
SVM, decision tree, K-means and fuzzy C-means algorithms are integral to our project, and
together they produce its output. This toolbox supplies the predictive analysis in the
algorithm.
3.4 Bioinformatics Toolbox:
Bioinformatics Toolbox™ provides algorithms and apps for Next Generation
Sequencing (NGS), microarray analysis, mass spectrometry, and gene ontology. Using toolbox
functions, you can read genomic and proteomic data from standard file formats such as SAM,
FASTA, CEL, and CDF, as well as from online databases such as the NCBI Gene Expression
Omnibus and GenBank®. You can explore and visualize this data with sequence browsers,
spatial heatmaps, and clustergrams. The toolbox also provides statistical techniques for
detecting peaks, imputing values for missing data, and selecting features. In this project, the
toolbox is used to extract features such as GLCM texture features from plant leaves and to
analyse them biologically.
Fig.3.3 K-Means Clustering model
3.5 Image Processing Toolbox:
Image Processing Toolbox™ provides a comprehensive set of reference-standard
algorithms and workflow apps for image processing, analysis, visualization, and algorithm
development. You can perform image segmentation, image enhancement, noise reduction,
geometric transformations, image registration, and 3D image processing.
Image Processing Toolbox apps let you automate common image processing
workflows. You can interactively segment image data, compare image registration techniques,
and batch-process large data sets. Visualization functions and apps let you explore images, 3D
volumes, and videos; adjust contrast; create histograms; and manipulate regions of interest
(ROIs).
This toolbox is the backbone of the project: image input and image analysis for further
evaluation are done through it. Multiple images are stored in an array and converted to
grayscale and HSV formats for feature extraction. The input image is normally an RGB image in
which each channel of each pixel takes a value from 0 to 255.
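As an illustration of the conversions mentioned above, the following sketch converts a tiny RGB image to grayscale and HSV using only the Python standard library. The 2x2 image is made up for the example, and the grayscale weights are the common ITU-R BT.601 luma weights, which is an assumption about the conversion rather than a statement of what the project uses.

```python
import colorsys

def to_gray(r, g, b):
    """Luma-weighted grayscale value in 0..255 (BT.601 weights)."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def to_hsv(r, g, b):
    """Hue, saturation and value, each scaled to 0..1."""
    return colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)

# Hypothetical 2x2 RGB image; every channel value lies in 0..255.
image = [
    [(34, 139, 34), (255, 255, 255)],
    [(0, 0, 0), (255, 0, 0)],
]
gray = [[to_gray(*pixel) for pixel in row] for row in image]
hsv = [[to_hsv(*pixel) for pixel in row] for row in image]
print(gray)
print(hsv[1][1])
```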
3.6 System Configuration:
Since we are using deep learning algorithms for image analysis, an NVIDIA GPU (1050i
version) is generally recommended for greater processing speed. Our project was carried out on
an Intel i7 processor with Intel graphics. Intel also supports deep learning image analysis
through its graphics cards, and the processing speed of an Intel graphics card for this purpose
is comparable to that of an NVIDIA processor. Since the algorithm deals with a large dataset of
high-resolution images, the iterations (epochs) take some seconds before the output is
displayed.
4. PROPOSED SOLUTION
Our project is broadly classified into 2 algorithms:
(1) Artificial Neural Networks algorithm
ANN is used to detect plant swelling (moisture content), burning, disease and pests, along
with soil analysis. The dataset of plant leaf, disease, pest and soil images is trained in
MATLAB and grouped into clusters, each corresponding to a label. The input test image is
compared against these clusters and assigned the label of the nearest one based on Euclidean
distance. The fuzzy C-means algorithm is used to identify the pest and disease present on the
farm. A convolutional neural network is designed for accurate analysis. Unsupervised learning
is used since the input image is unknown and new to the algorithm.
Most real-time applications need unsupervised learning since the input is always unknown to
the algorithm. Both K-means and fuzzy C-means are unsupervised clustering methods. Fuzzy
C-means assigns each sample a degree of membership in every cluster rather than a single hard
label, which is why it is preferred here over K-means.
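To make the idea of graded cluster membership concrete, here is a minimal sketch of fuzzy C-means in plain Python. It follows the standard textbook update rules; the fuzzifier m = 2, the fixed iteration count, the sample points and the initial centers are all assumptions for illustration and are not taken from the project.

```python
import math

def memberships(points, centers, m=2.0):
    """Membership of every point in every cluster; each row sums to 1."""
    result = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # The point coincides with a center: full membership there.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
        else:
            row = [d ** (-2.0 / (m - 1.0)) for d in dists]
        total = sum(row)
        result.append([v / total for v in row])
    return result

def update_centers(points, u, m=2.0):
    """Each center is the mean of all points weighted by membership**m."""
    dims = len(points[0])
    centers = []
    for k in range(len(u[0])):
        weights = [row[k] ** m for row in u]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dims)))
    return centers

def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    for _ in range(iterations):
        u = memberships(points, centers, m)
        centers = update_centers(points, u, m)
    return centers, memberships(points, centers, m)

# Hypothetical 2-D feature vectors forming two loose groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centers, u = fuzzy_c_means(points, [(0.0, 0.0), (6.0, 6.0)])
print(centers)
print(u[0])
```

Unlike a hard assignment, each row of `u` gives the degree to which a sample belongs to every cluster, so a borderline sample can be reported as partly belonging to two labels.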
(2) Machine Learning algorithm
Based on the extracted features, the algorithm predicts whether the crop is likely to suffer
pest or disease attacks in future. It uses CART (classification and regression tree) to predict
the future condition of the plant from the training data.
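The core step of CART is choosing the feature and threshold that best separate the classes. The sketch below shows that single step using Gini impurity, in plain Python. The feature columns (moisture level, contrast), the values and the class names are hypothetical and serve only to illustrate the split search; a full tree would apply the same search recursively to each side.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) of the best binary split."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        # Candidate thresholds lie midway between consecutive feature values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best

# Hypothetical training rows: (moisture level, GLCM contrast).
rows = [(0.30, 0.10), (0.35, 0.40), (0.40, 0.20),
        (0.70, 0.15), (0.75, 0.35), (0.80, 0.30)]
labels = ["healthy", "healthy", "healthy", "at_risk", "at_risk", "at_risk"]
feature, threshold, score = best_split(rows, labels)
print(feature, threshold, score)
```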
Starting with image reading, we supply a dataset of roughly 500-600 images for each
category: healthy plants, each pest infection and each disease infection. The large set is
acquired for accuracy. We extract features from each image in each category: GLCM texture
features, plus edge detection for the moisture content level. These features describe the
condition of the plant based on pixel values. One of the major features for analysis is the
moisture content.
Generally, when a plant receives excess water, its leaves swell beyond their normal size.
Likewise, when water is scarce, its leaves shrink. In both cases the edges of the leaf reflect
the moisture content, so edge detection is used to estimate the moisture content of the plant.
Both scenarios can lead to disease attacks. Therefore, the moisture level plays a vital role in
predicting disease attacks.
The texture features include the GLCM features (the first four below) and first-order
statistics of the pixel values (the rest):
• Contrast
• Homogeneity
• Correlation
• Energy
• Skew
• Kurtosis
• RMS
• Standard Deviation
• Mean
• Variance
These parameters are texture features computed from pixel values. Together with the
moisture content, they are analysed for every category in the image dataset. The dataset
comprises images of completely infected, partially infected and healthy plants. The feature
values are trained and clustered by category, so that healthy, partially infected and
completely infected plants are assigned separate labels. The values are trained with an
unsupervised learning algorithm, here fuzzy C-means. The K-means algorithm is used to cluster
the data and create centroids for comparison with the input test image.
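For readers unfamiliar with GLCM features, the following plain-Python sketch builds a gray-level co-occurrence matrix for horizontally adjacent pixels and derives contrast, homogeneity, energy and correlation from it using their usual definitions. The 4x4 image with four gray levels is a made-up example, and the single horizontal offset is an assumption; the project's actual offsets and quantisation are not stated.

```python
import math

def glcm(image, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix for pixel pairs offset by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                counts[image[y][x]][image[y2][x2]] += 1
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]

def texture_features(p):
    """Contrast, homogeneity, energy and correlation of a normalised GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * v for i, _, v in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * v for _, j, v in cells))
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        # Correlation is undefined for a constant image.
        "correlation": cov / (sd_i * sd_j) if sd_i * sd_j > 0 else float("nan"),
    }

# Hypothetical 4x4 image quantised to gray levels 0..3.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = texture_features(glcm(image, levels=4))
print(features)
```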
The algorithm for this application is designed as follows. After the image dataset is
gathered, the data is classified into labels according to their defined names, such as:
Label 0: Healthy        Label 0: Healthy
Label 1: Disease 1      Label 1: Pest 1
Label 2: Disease 2      Label 2: Pest 2
...                     ...
Label N: Disease N      Label N: Pest N
The trained network forms several clusters using an SVM classifier; they can be visualised
in the X-Y plane. After features are extracted from the input (test) image, each feature is
compared with the features of the training data, and the cluster whose centroid is nearest is
selected. The label of that nearest cluster is displayed, for example “Disease X and Pest Y”.
This is done by the K-means clustering mechanism and constitutes the disease/pest detection
phase. The neural network part is a convolutional neural network (CNN) used for non-linear
regression models. We create an artificial neural network with several hidden layers. The
figures below (Fig.4.1 and Fig.4.2) give a glimpse of the CNN and its hidden layers. Each
hidden layer performs processing such as feature extraction in this use case. The number of
hidden layers is chosen by the developer based on the size of the dataset. In general, more
hidden layers and a larger dataset tend to improve the accuracy of detecting a feature or the
status of a given input. The number of epochs (iterations) is also chosen by the developer, but
training stops automatically once it reaches its maximum accuracy. Suppose we supply 100 images
and 200 iterations. If the algorithm classifies all the images correctly at the 67th iteration,
it stops there and does not run the 68th. This early stopping saves processing time, which is
one of the advantages of using MATLAB.
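The stopping behaviour described above can be sketched with a single perceptron trained in plain Python: the loop is allowed up to 200 epochs but returns as soon as one full pass produces no misclassifications. The training data (a logical AND), the learning rate and the function name are assumptions for illustration only; the project's MATLAB training routine is not shown in the report.

```python
def predict(weights, bias, x):
    """Step-activation perceptron output (0 or 1)."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train_perceptron(samples, labels, max_epochs=200, lr=0.1):
    """Train until an epoch has no errors or max_epochs is reached."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for x, y in zip(samples, labels):
            pred = predict(weights, bias, x)
            if pred != y:
                errors += 1
                step = lr * (y - pred)
                weights = [w + step * xi for w, xi in zip(weights, x)]
                bias += step
        if errors == 0:
            # Every sample is classified correctly: stop early.
            return weights, bias, epoch
    return weights, bias, max_epochs

# Hypothetical, linearly separable training set (logical AND).
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 0, 1]
weights, bias, epochs_used = train_perceptron(samples, labels)
print(epochs_used)
```

Here `epochs_used` ends up far below the 200-epoch budget, which mirrors the 67-of-200 example in the text.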
The innovative part of the project is the decision tree built with the machine learning
algorithm, for which the same clusters and classifiers are used. Each parameter, from healthy
to completely infected plants, is analysed, and calculations are performed to create the
decision tree. The values obtained from partially and completely infected plants play the key
role in building it. Thresholds are tuned by trial and error on those values, and the algorithm
is trained to make its own decision in predicting the chance of future pest and disease
attacks, and in how many days, based on the training data. This phase is the innovative,
research-oriented part of the project compared with the other phases.
Fig.4.1 CNN model
Fig.4.2 Neural Network model
Fig.4.3 Decision Making Tree model
Block Diagram:
Fig.4.4 Block Diagram
Fig.4.4 shows the block diagram of our proposed solution. The first figure gives the broad
layout of the solution and the second gives a detailed description. It covers both the
detection and the prediction parts.
5. RESULTS & DISCUSSIONS
So far we have completed the training phase on the image dataset. We have also classified
the input dataset into its corresponding labels and trained the algorithm to classify it using
K-means clustering. We have categorised the images using the bag-of-features function provided
by MATLAB. Moisture content estimation using edge detection is also complete, along with an
estimate of how many days the crop will last without external attacks in the post-harvest
period. This part is very useful for agro-based industries because, according to a survey, 20%
of the post-harvest crop is destroyed because optimum moisture levels are not maintained.
Fig.5.1 Moisture Content extraction with Sparsity levels
Fig.5.2 Moisture Content & Status of the leaf
Figures 5.1 and 5.2 show the moisture content of the input test image compared with that of
a leaf with ideal moisture. The authors developed a mathematical relation for comparing
moisture levels through an edge detection algorithm. After trial and error, we obtained the
approximate moisture level of the input image and an estimate of how many days the crop will
last without external infection. The edges of the leaf along the X and Y axes of the image are
taken and the width is calculated. As mentioned above, if the moisture is higher than normal
the leaf swells, and if it is lower the leaf shrinks.
Hence, this parameter is also used to predict disease attacks based on the present moisture
content. For example, if the moisture is high, there is a chance of fungal attack. Therefore,
this is one of the key parameters in predicting attacks by external agents.
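The report does not give the mathematical relation used for moisture, so the following plain-Python sketch shows only the width-comparison idea under stated assumptions: the leaf is already segmented into a binary mask, its width is taken from the bounding box of the leaf pixels, and the ratio to a reference leaf is used as a rough swelling indicator. The masks, the function names and the use of a simple ratio are all hypothetical.

```python
def leaf_extent(mask):
    """Width and height in pixels of the bounding box around the leaf pixels."""
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        raise ValueError("mask contains no leaf pixels")
    return max(xs) - min(xs) + 1, max(ys) - min(ys) + 1

def relative_width(test_mask, reference_mask):
    """Ratio above 1 suggests swelling, below 1 suggests shrinking."""
    return leaf_extent(test_mask)[0] / leaf_extent(reference_mask)[0]

# Hypothetical binary masks (1 = leaf pixel) for a reference and a test leaf.
reference = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
test = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
ratio = relative_width(test, reference)
print(ratio)
```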
Fig.5.3 Image Features Extraction using Bag-Of-Features & Iterations using K-Means
Clustering
Figure 5.3 shows feature extraction using MATLAB's bag-of-features function. The algorithm
automatically clusters the image features using K-means. As the image shows, the algorithm
extracted 53055 features and formed 500 clusters. With these in place, the clusters are ready
for training and for allocation to labels for the detection phase. We used the plant disease
names as category names. After clustering, the image data falls under the corresponding labels
for identification. This forms the X and Y axes in the SVM plane. Hence, we have created a
perceptron in which image feature extraction is done in the hidden layers. The epochs are the
number of iterations; the authors specified 500. As mentioned above, however, the algorithm
stops iterating automatically once it reaches its highest accuracy for the given image dataset.
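The clustering step can be illustrated with a minimal K-means (Lloyd's algorithm) in plain Python. This is not MATLAB's bag-of-features implementation: the two-dimensional points, the two initial centroids and the iteration cap are assumptions chosen to keep the example small, whereas the project clusters 53055 features into 500 clusters.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda k: math.dist(p, centroids[k]))
            clusters[nearest].append(p)
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[k]  # keep the centroid of an empty cluster
            for k, members in enumerate(clusters)
        ]
        if updated == centroids:
            break  # converged before using all iterations
        centroids = updated
    return centroids, clusters

# Hypothetical 2-D feature vectors and initial centroids.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
          (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids)
```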
Fig.5.4 GLCM Texture Extraction
Figure 5.4 shows the GLCM texture features, which are based on the pixel values of the
training and test data. These are author-defined parameters extracted in addition to those from
the MATLAB built-in function. Several parameters can be defined from the pixel intensity of the
image and used for detection and prediction. The training data contains several parameters and
is classified into labels. The features of the input test image are extracted and compared with
the trained values. The nearest centroid “K” is calculated and displayed. This is done for the
detection process.
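The comparison of test features with trained values can be sketched as a nearest-centroid lookup using Euclidean distance, in plain Python. The label names, the three-value feature vectors (contrast, homogeneity, energy) and the centroid values are hypothetical placeholders, not results from the project.

```python
import math

def nearest_label(features, centroids):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))

# Hypothetical trained centroids: (contrast, homogeneity, energy) per label.
centroids = {
    "Healthy": (0.20, 0.90, 0.60),
    "Disease 1": (0.65, 0.70, 0.30),
    "Pest 1": (0.90, 0.50, 0.15),
}
test_features = (0.60, 0.72, 0.33)
label = nearest_label(test_features, centroids)
print(label)
```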
Fig.5.5 Training Data
Figure 5.5 shows the training data for the image dataset. Image categories are formed for
detection, and prediction processing is also done, as the image shows. An SVM is used as a
multi-class classifier in the predictive analysis. The training data is created using machine
learning algorithms and is further used to create the CART model.
To date, we have completed the following:
• Image classification using K-Means clustering
• Image categorisation using Bag-Of-Features
• Image feature extraction using GLCM texture extraction
• Training of the image dataset
• Moisture content calculation and prediction of how long the crop can last
Environment Safety:
Since we process only images of the plants, without disturbing their environment, the
proposed project poses no environmental hazard. It is therefore completely safe, and it helps
the plants grow more effectively at no additional cost.
6. CONCLUSION & FUTURE SCOPE
6.1 Conclusion
We have completed image classification, image categorisation, feature extraction and
generation of training data. The algorithm was developed entirely in MATLAB, using several
toolboxes including Statistics and Machine Learning Toolbox, Neural Network Toolbox and Image
Processing Toolbox. The outputs so far are the training data in the form of image categories,
image classification using K-means clustering, and moisture content together with a prediction
of how long the crop will last. Training and classification of the image dataset are complete.
The test input image is compared with the trained data for detection and prediction. We are
using unsupervised learning for accuracy. For example, take training data of Indian rice plants
and a test input of an African rice plant. With supervised learning the accuracy would be low
because of the slight difference in appearance. Hence, we are focusing on unsupervised
learning; we expect this example to reach 99% accuracy using the fuzzy C-means algorithm. As
the name suggests, the data may be fuzzy, but the result is still precise. Therefore, we are
avoiding supervised learning techniques.
6.2 Future Scope
In future, we plan to turn the project from a prototype into a complete end-user product.
This can be done using the TensorFlow library in a Python IDE on high-performance processors
(NVIDIA recommended). The end product would accurately predict disease and pest attacks as well
as identify them. A larger dataset would be provided for training the network. The whole
algorithm would be developed using TensorFlow for better processing, with OpenCV used for image
analysis in place of Image Processing Toolbox in MATLAB. The farmer then simply has to take a
photograph of the leaf and upload it to the cloud, where the backend will perform prediction
and detection and suggest corrective measures for preventing and eliminating external pests.
7. BIBLIOGRAPHY
1. Machine Learning: What it is and why it matters, 09 2016, [online] Available:
www.sas.com
2. R. E. Schapire, "The boosting approach to machine learning: An overview" in
Nonlinear estimation and classification, New York:Springer, pp. 149-171, 2003.
3. A. Ben-Hur, D. Horn, H. T. Siegelmann, V. Vapnik, "Support vector
clustering", Journal of machine learning research, vol. 2, pp. 125-137, Dec 2001.
4. J. R. Otukei, T. Blaschke, "Land cover change assessment using decision trees support
vector machines and maximum likelihood classification algorithms", International
Journal of Applied Earth Observation and Geoinformation, vol. 12, pp. S27-S31, 2010.
5. Panigrahi, S. & Ting K.C. (1998) Artificial Intelligence for Biology and Agriculture,
Kluwer Academic Press.
6. Mukhopadhyay S.C. (2012) Smart Sensing Technology for Agriculture and
Environmental Monitoring. Vol. 146, Springer Berlin Heidelberg.
7. German, L., Ramisch, J.J. & Verma R. (2010) Beyond the Biophysical, Knowledge,
Culture, and Power in Agriculture and Natural Resource Management, Springer Publ.
8. Jun Wu, Anastasiya Olesnikova, Chi-Hwa Song, Won Don Lee (2009). The
Development and Application of Decision Tree for Agriculture Data. IITSI :16-20.
9. Leemans, V., Destain, M.F., 2004. A real-time grading method of apples based on
features extracted from defects. J. Food Eng. 61, 83-89.
10. Quinlan, J.R.(1985b). Decision trees and multi-valued attributes. In J.E. Hayes & D.
Michie (Eds.), Machine intelligence 11. Oxford University Press (in press).
11. Zelu Zia (2009). An Expert System Based on Spatial Data Mining used Decision Tree
for Agriculture Land Grading. Second International Conference on Intelligent
Computation Technology and Automation. Oct 10-11, China.
12. Y. Hayami, V. W. Ruttan, Agricultural development: an international perspective,
Baltimore, London:The Johns Hopkins Press, 1971.
13. S. Ray, "Essentials of Machine Learning Algorithms (with Python and R Codes)" in
Analytics Vidhya, 2015, [online] Available:
https://www.analyticsvidhya.com/blog/2015/08/common-machine-learning-algorithms/.
14. S. Veenadhari, D. Bharat Mishra, D. C. Singh, "Soybean Productivity Modelling Using
Decision Tree Algorithms", International Journal of Computer Applications, vol. 27,
no. 7, pp. 11-15, Aug. 2011.
15. M. S. Dahikar, D. V. Rode, "Agricultural Crop Yield Prediction Using Artificial Neural
Network Approach", International Journal of Innovative Research in Electrical, Electronics,
Instrumentation and Control Engineering, vol. 2, no. 1, pp. 683-686, 2014.
16. M. Somvanshi, P. Chavan, "A review of machine learning techniques using decision
tree and support vector machine", 2016 International Conference on Computing
Communication Control and automation (ICCUBEA), pp. 1-7, 2016.
8. WEBLIOGRAPHY
https://in.mathworks.com/help/matlab/ref/rgb2gray.html
http://matlab.izmiran.ru/help/toolbox/images/enhanc15.html
https://in.mathworks.com/help/images/ref/graycomatrix.html
https://in.mathworks.com/help/images/ref/graycoprops.html
https://in.mathworks.com/help/vision/examples/image-category-classification-using-deep-learning.html
https://in.mathworks.com/matlabcentral/fileexchange/22187-glcm-texture-features
https://in.mathworks.com/help/matlab/ref/categorical.html
https://in.mathworks.com/help/stats/fitcecoc.html
https://in.mathworks.com/matlabcentral/answers/15307-image-operations-skewness-and-kurtosis
https://in.mathworks.com/matlabcentral/answers/112344-how-to-get-the-pixel-value-of-histogram
http://matlab.izmiran.ru/help/toolbox/images/graycomatrix.html
https://in.mathworks.com/matlabcentral/fileexchange/37197-dem--diffused-expectation-maximisation-for-image-segmentation
https://in.mathworks.com/matlabcentral/answers/uploaded_files/98784/Classify_RGB_Image.m
https://in.mathworks.com/matlabcentral/answers/uploaded_files/98783/kmeans_color_segmentation.m
https://www.researchgate.net/post/how_to_calculate_Energy_entropy_correlation_using_GLCM
https://in.mathworks.com/help/nnet/ug/divide-data-for-optimal-neural-network-training.html
https://in.mathworks.com/matlabcentral/answers/106010-how-to-train-data-in-neural-network
https://in.mathworks.com/matlabcentral/fileexchange/62990-deep-learning-tutorial-series
https://in.mathworks.com/videos/using-feature-extraction-with-neural-networks-in-matlab-1492009542601.html
https://in.mathworks.com/help/vision/examples/image-category-classification-using-bag-of-features.html
https://in.mathworks.com/help/vision/ref/trainimagecategoryclassifier.html
https://in.mathworks.com/help/vision/ug/image-classification-with-bag-of-visual-words.html
https://in.mathworks.com/help/stats/k-means-clustering.html#bq_679x-19
https://in.mathworks.com/help/images/examples/color-based-segmentation-using-k-means-clustering.html
https://in.mathworks.com/matlabcentral/answers/137750-how-to-input-train-data-and-test-data-features-of-images-using-svm-calssifier
http://dipwm.blogspot.com/2013/01/svm-support-vector-machine-with-matlab.html
https://in.mathworks.com/help/stats/classificationsvm.html
https://in.mathworks.com/help/vision/ref/imageset-class.html
https://in.mathworks.com/help/nnet/ref/network.html
https://in.mathworks.com/help/nnet/examples/create-simple-deep-learning-network-for-classification.html
https://in.mathworks.com/help/nnet/ug/layers-of-a-convolutional-neural-network.html#bvobklb-4
https://in.mathworks.com/help/nnet/ref/trainnetwork.html
https://in.mathworks.com/help/nnet/ref/nnet.cnn.layer.layer.html
https://in.mathworks.com/help/nnet/ref/regressionlayer.html
9. APPENDICES
Codes:
1. Moisture Content
clc; close all; clear all;

% Reference image: healthy leaf
testimag = imread('C:\matlab projects\cornhealthy.jpg');
I1 = rgb2gray(testimag);
subplot(2,2,1); image(testimag); disp('input healthy rice leaf');
subplot(2,2,2); image(I1); disp('gray scale image');
BW = edge(I1);
[r1,c1] = find(BW);
subplot(2,2,3); spy(BW); disp('sparsity pattern of healthy rice');
% Height and width of the region spanned by the detected edges
x2 = max(r1); x1 = min(r1); X1 = x2 - x1
y2 = max(c1); y1 = min(c1); Y1 = y2 - y1
subplot(2,2,4); imshow(BW); disp('edge detection');

% Test image: set this path to the leaf under test
figure;
testimag = imread('C:\matlab projects\cornhealthy.jpg');
I2 = rgb2gray(testimag);
subplot(2,2,1); image(testimag); disp('input healthy rice leaf');
subplot(2,2,2); image(I2); disp('gray scale image');
BW = edge(I2);
[r2,c2] = find(BW);
subplot(2,2,3); spy(BW); disp('sparsity pattern of healthy rice');
x3 = max(r2); x4 = min(r2); X2 = x3 - x4
y3 = max(c2); y4 = min(c2); Y2 = y3 - y4
subplot(2,2,4); imshow(BW); disp('edge detection');

% A smaller edge extent than the reference is read as moisture loss
if ((X2<X1)||(Y2<Y1))
    disp('less moisture');
else
    disp('more moisture');
end
if ((X1==X2)&&(Y1==Y2))
    disp('healthy leaf');
end

% Pixel-wise comparison of the two grayscale images
c = isequal(I1,I2);
if (c==1)
    disp('healthy');
else
    disp('not healthy');
end
2. GLCM Feature Extraction
imds1 = imread('C:\matlab projects\Training\ObjectCategories\Planthopper\p-1.jpg');
I1 = rgb2gray(imds1);

% Texture features from the gray-level co-occurrence matrix
GLCM4 = graycomatrix(I1);
stats1 = graycoprops(GLCM4,{'contrast','homogeneity','correlation','Energy'});
disp(stats1);
V1 = rms(GLCM4);
disp(V1);

% Histogram of the hue channel with 360 bins
hsv_im = rgb2hsv(imds1);
h1 = hsv_im(:,:,1);
[pixelCount, grayLevels] = hist(h1(:), 360);
pixelCount(20:30) = 0;
% Get the number of pixels in the histogram.
numberOfPixels1 = sum(pixelCount);
disp(numberOfPixels1);
% Get the mean gray level.
meanGL1 = sum(grayLevels .* pixelCount) / numberOfPixels1;
disp(meanGL1);
% Get the variance, which is the second central moment.
varianceGL1 = sum((grayLevels - meanGL1) .^ 2 .* pixelCount) / (numberOfPixels1-1);
disp(varianceGL1);
% Get the standard deviation.
sd1 = sqrt(varianceGL1);
disp(sd1);
% Get the skew.
skew1 = sum((grayLevels - meanGL1) .^ 3 .* pixelCount) / ((numberOfPixels1 - 1) * sd1^3);
disp(skew1);
% Get the kurtosis.
kurtosis1 = sum((grayLevels - meanGL1) .^ 4 .* pixelCount) / ((numberOfPixels1 - 1) * sd1^4);
disp(kurtosis1);
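For reference, the histogram statistics computed in the listing above can be written out as formulas. The symbols are introduced here for illustration only: g_i stands for the bin centers (grayLevels), n_i for the bin counts (pixelCount) and N for their sum (numberOfPixels1).

```latex
\begin{align*}
N &= \sum_i n_i, &
\mu &= \frac{1}{N}\sum_i g_i\, n_i, \\
\sigma^2 &= \frac{1}{N-1}\sum_i (g_i-\mu)^2\, n_i, &
\sigma &= \sqrt{\sigma^2}, \\
\text{skew} &= \frac{\sum_i (g_i-\mu)^3\, n_i}{(N-1)\,\sigma^3}, &
\text{kurtosis} &= \frac{\sum_i (g_i-\mu)^4\, n_i}{(N-1)\,\sigma^4}.
\end{align*}
```

The skew measures how asymmetric the hue distribution is, and the kurtosis how heavy its tails are, so the two together describe the shape of the histogram beyond its mean and spread.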
3. Training Image Data Set
rootFolder = 'C:\matlab projects\Manu Disease Dataset';
categories = {'Alternaria Alternata','Anthracnose','Bacterial Blight','Cercospora Leaf Spot','Healthy Leaves'};
imds = imageDatastore(fullfile(rootFolder, categories), 'LabelSource', 'foldernames');

% Use 5 randomly chosen images per category for training and keep the rest for testing
% (assumes every category folder holds more than 5 images)
[trainingSet,testSet] = splitEachLabel(imds,5,'randomize');

% Build the visual vocabulary and train the classifier on the training images
bag = bagOfFeatures(trainingSet);
categoryClassifier = trainImageCategoryClassifier(trainingSet,bag);

% Predict the category of each held-out image and show the predicted labels
[labelIdx, score] = predict(categoryClassifier,testSet);
categoryClassifier.Labels(labelIdx)
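One way to put a number on classification accuracy is a confusion matrix over held-out images. The sketch below is only an illustration in base MATLAB: the label indices trueIdx and predIdx are hypothetical values made up for the example, standing in for the true categories and the indices a classifier would return for the five categories used above.

```matlab
% Hypothetical true and predicted category indices for 10 held-out images
trueIdx = [1 1 2 2 3 3 4 4 5 5];
predIdx = [1 1 2 3 3 3 4 4 5 1];
numClasses = 5;

% confMat(i,j) counts images of true class i that were predicted as class j
confMat = accumarray([trueIdx(:) predIdx(:)], 1, [numClasses numClasses]);

% Overall accuracy: correctly classified images divided by all images
accuracy = sum(diag(confMat)) / sum(confMat(:));
disp(confMat);
disp(accuracy);
```

The off-diagonal entries show which categories are confused with each other, which is more informative than a single accuracy figure when some diseases look alike.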