smartfarm: improving agriculture sustainability using ...ckrintz/papers/dsfew16.pdf · smartfarm:...

5
SmartFarm: Improving Agriculture Sustainability Using Modern Information Technology Chandra Krintz, Rich Wolski, Nevena Golubovic, Benji Lampel, Varun Kulkarni Dept. of Computer Science UC Santa Barbara Balaji Sethuramasamyraja*, Bruce Roberts** *Dept. of Industrial Technology **Dept. of Plant Sciences Fresno State University Bo Liu Dept. of BioResource and Agricultural Engineering Cal Poly University, San Luis Obispo ABSTRACT In this paper, we overview our work on an open source, hybrid cloud approach to agriculture analytics for enabling sustainable farming practices. SmartFarm integrates dis- parate environmental sensor technologies into an on-farm, private cloud software infrastructure that provides farm- ers with a secure, easy to use, low-cost data analysis sys- tem. SmartFarm couples data from external cloud sources (weather predictions, satellite imagery, state and national datasets, etc) with farm-local statistics, provides an inter- face into which custom analytics apps can be plugged, and ensures that all private data remain under grower control. 1. INTRODUCTION Ecological sustainability depends critically on the ability of world food production to manage increasingly limited nat- ural resources (e.g. arable land and water) with new tech- niques that both enhance environmental stewardship and in- crease farm productivity. We believe that the key to future food security, food safety, and ecological sustainability lies in the use of customized data analytics by individual growers, agricultural workers, and food producers. For analytics to become the farm implements of the twenty-first century, new Information Technology (IT) developments that are acceler- ating e-commerce (e.g. cloud computing, machine learning, mobile client-server systems) must (i) be made inexpensive, accessible to, and beneficial for, a vast diversity of the popu- lation, (ii) address a number of disparate problems including increasing yields, conserving water, and ensuring soil and plant health, and (iii) facilitate new sustainable agriculture science. Today smallholder agriculture and their commu- nities (as opposed to the industrial-scale farming concerns) are strikingly underserved by modern IT. The goal of our research is to achieve these goals via a new open source, IT system called UCSB SmartFarm. Smart- Farm is hybrid cloud technology designed to enable small- Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full cita- tion on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or re- publish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]. © 2016 ACM. ISBN 978-1-4503-2138-9. DOI: 10.1145/1235 holder growers and other agricultural (ag) professionals, re- searchers, and students, to use analytics to improve envi- ronmental sustainability and efficiencies in food production. SmartFarm combines data aggregation for disparate ag- related data sensors and sources, integrates multiple analyt- ics technologies to facilitate a wide range of data inference, prediction, and visualization, and gives growers full control over the privacy, security, and sharing of their most valuable farm asset: their data. Although many precision farming and ag decision support systems have emerged recently e.g. [8, 12, 18, 19, 21, 29], ex- tant solutions have not received wide spread use to date. The reasons for this, which we have discovered by work- ing with multiple growers and ag technology vendors over the past year, center around concerns related to data pri- vacy, security, and control. First, extant solutions employ a cloud model to facilitate scalable customer acquisition and software maintenance. To use these solutions, growers must transmit potentially vast amounts of data over expensive and low bandwidth network links (for which the growers pay) to a remote cloud application owned and controlled by the vendor. Such connectivity can be very costly and is infeasible in many rural communities. Second, this model requires that growers reveal private in- formation and relinquish ownership and control of their data to vendors that supply their sensors, equipment, or other ag products. Data from these technologies reveals private and personal information about grower practices, crop input (chemicals, fertilizers, pesticides), and farm implement use, purchasing and sales details, water use, disease problems, etc., that define a grower’s business and competitiveness. The next issue concerns lock-in – vendors make it very dif- ficult and in some cases impossible for growers to get data out of these systems, e.g. to stop using a vendor’s service or to move data to another vendor. In addition, vendors increasingly charge a recurring fee to host and export ac- cess to the data to growers (via their web browsers), and in many cases, sell the data or advertising opportunities to other companies and/or use the data to target marketing and sales campaigns. Finally, the use of centralized, net- worked systems for deployment of technologies makes grow- ers vulnerable to security breaches, data loss, and interrup- tion since they expose a large attack surface, introduce single points of failure, and require continuous, high quality Inter- net connectivity. SmartFarm presents an alternative approach to ag an-

Upload: others

Post on 20-May-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: SmartFarm: Improving Agriculture Sustainability Using ...ckrintz/papers/dsfew16.pdf · SmartFarm: Improving Agriculture Sustainability Using Modern Information Technology Chandra

SmartFarm: Improving Agriculture Sustainability UsingModern Information Technology

Chandra Krintz, RichWolski, Nevena

Golubovic, Benji Lampel,Varun Kulkarni

Dept. of Computer ScienceUC Santa Barbara

BalajiSethuramasamyraja*,

Bruce Roberts***Dept. of Industrial Technology

**Dept. of Plant SciencesFresno State University

Bo LiuDept. of BioResource andAgricultural Engineering

Cal Poly University, San LuisObispo

ABSTRACTIn this paper, we overview our work on an open source,hybrid cloud approach to agriculture analytics for enablingsustainable farming practices. SmartFarm integrates dis-parate environmental sensor technologies into an on-farm,private cloud software infrastructure that provides farm-ers with a secure, easy to use, low-cost data analysis sys-tem. SmartFarm couples data from external cloud sources(weather predictions, satellite imagery, state and nationaldatasets, etc) with farm-local statistics, provides an inter-face into which custom analytics apps can be plugged, andensures that all private data remain under grower control.

1. INTRODUCTIONEcological sustainability depends critically on the ability

of world food production to manage increasingly limited nat-ural resources (e.g. arable land and water) with new tech-niques that both enhance environmental stewardship and in-crease farm productivity. We believe that the key to futurefood security, food safety, and ecological sustainability lies inthe use of customized data analytics by individual growers,agricultural workers, and food producers. For analytics tobecome the farm implements of the twenty-first century, newInformation Technology (IT) developments that are acceler-ating e-commerce (e.g. cloud computing, machine learning,mobile client-server systems) must (i) be made inexpensive,accessible to, and beneficial for, a vast diversity of the popu-lation, (ii) address a number of disparate problems includingincreasing yields, conserving water, and ensuring soil andplant health, and (iii) facilitate new sustainable agriculturescience. Today smallholder agriculture and their commu-nities (as opposed to the industrial-scale farming concerns)are strikingly underserved by modern IT.

The goal of our research is to achieve these goals via a newopen source, IT system called UCSB SmartFarm. Smart-Farm is hybrid cloud technology designed to enable small-

Permission to make digital or hard copies of all or part of this work for personal orclassroom use is granted without fee provided that copies are not made or distributedfor profit or commercial advantage and that copies bear this notice and the full cita-tion on the first page. Copyrights for components of this work owned by others thanACM must be honored. Abstracting with credit is permitted. To copy otherwise, or re-publish, to post on servers or to redistribute to lists, requires prior specific permissionand/or a fee. Request permissions from [email protected].

© 2016 ACM. ISBN 978-1-4503-2138-9.

DOI: 10.1145/1235

holder growers and other agricultural (ag) professionals, re-searchers, and students, to use analytics to improve envi-ronmental sustainability and efficiencies in food production.SmartFarm combines data aggregation for disparate ag-related data sensors and sources, integrates multiple analyt-ics technologies to facilitate a wide range of data inference,prediction, and visualization, and gives growers full controlover the privacy, security, and sharing of their most valuablefarm asset: their data.

Although many precision farming and ag decision supportsystems have emerged recently e.g. [8, 12, 18, 19, 21, 29], ex-tant solutions have not received wide spread use to date.The reasons for this, which we have discovered by work-ing with multiple growers and ag technology vendors overthe past year, center around concerns related to data pri-vacy, security, and control. First, extant solutions employ acloud model to facilitate scalable customer acquisition andsoftware maintenance. To use these solutions, growers musttransmit potentially vast amounts of data over expensiveand low bandwidth network links (for which the growerspay) to a remote cloud application owned and controlledby the vendor. Such connectivity can be very costly and isinfeasible in many rural communities.

Second, this model requires that growers reveal private in-formation and relinquish ownership and control of their datato vendors that supply their sensors, equipment, or otherag products. Data from these technologies reveals privateand personal information about grower practices, crop input(chemicals, fertilizers, pesticides), and farm implement use,purchasing and sales details, water use, disease problems,etc., that define a grower’s business and competitiveness.

The next issue concerns lock-in – vendors make it very dif-ficult and in some cases impossible for growers to get dataout of these systems, e.g. to stop using a vendor’s serviceor to move data to another vendor. In addition, vendorsincreasingly charge a recurring fee to host and export ac-cess to the data to growers (via their web browsers), andin many cases, sell the data or advertising opportunities toother companies and/or use the data to target marketingand sales campaigns. Finally, the use of centralized, net-worked systems for deployment of technologies makes grow-ers vulnerable to security breaches, data loss, and interrup-tion since they expose a large attack surface, introduce singlepoints of failure, and require continuous, high quality Inter-net connectivity.

SmartFarm presents an alternative approach to ag an-

Page 2: SmartFarm: Improving Agriculture Sustainability Using ...ckrintz/papers/dsfew16.pdf · SmartFarm: Improving Agriculture Sustainability Using Modern Information Technology Chandra

farm  office  

security  boundary   ac2onable  

analy2cs,  alerts  

D  A  T  A  

External  data    sources  (ingress      only)  

UAVs  

Private  Cloud  

Aerial  imagery  

Internet  data  

climate  data  

servers  

many  sources  of      farm  sensor  data    

APIs  

Public  Clouds  

!"#$%

Data Privacy and Management Layer

Google App Engine APIs

NoSQL% MySQL%

Analytics Engines

AppSca

le%

Euca

lyptu

s or

Am

azo

n A

WS%

Extensibility APIs

Self-Managing and Fault Tolerance Support

SmartFarm Analytics and Visualization Apps%

HDFS% Backup/ Restore%

Hybrid

Clo

ud

Support%

Sensor Support%

Figure 1: SmartFarm Vision (left) and the SmartFarm Software Architecture (right)

alytics and decision support. It provides growers with ahybrid, distributed architecture (hardware and software sys-tem) that provides growers with an on-farm “analytics ap-pliance”. SmartFarm aggregates data for growers and auto-matically applies analysis algorithms to implement decisionsupport. If Internet connectivity is available, SmartFarmcan download data analytics and visualization applications(apps) from an Ag App Store (otherwise apps can be in-tegrated via DVD mailers). Thus, SmartFarm “moves thecode to the data” rather than “moving the data to the code”,thereby significantly reducing network/mailer use.

Because SmartFarm runs on-farm (but can interoperatewith remote data sources if necessary), we have designed itto be very easy to use and to operate autonomously with orwithout Internet connectivity. Moreover, the system ensuresthat no private data is leaked and provides robustness andsecurity via its distributed black-box design. In the sectionsthat follow, we overview the design and implementation ofSmartFarm. We then overview our multiple design choicesand their implications. Finally, we show how SmartFarmcan be used to extract actionable insights for precision har-vesting and root cause analysis.

2. UCSB SMARTFARMA depiction of our overarching vision for SmartFarm is

shown in Figure 1 (left). SmartFarm integrates disparateenvironmental sensor technologies into an on-premise, pri-vate cloud software infrastructure that provides farmers witha secure, easy to use, low-cost data analysis system. Smart-Farm couples data from external sources (if/when available)with farm-local statistics, provides an interface into whichcustom analytics tools can be plugged and automaticallydeployed, and ensures that all data and analyses remain un-der the control of farmers. SmartFarm enables farmers toextract actionable insights from their data, to quantify theimpact of their decisions and environmental changes, and toidentify opportunities for increasing farm productivity.

Figure 1 (right) presents the SmartFarm software archi-tecture. SmartFarm extends open source technologies fromthe domains of cloud computing, big data analytics, andIoT. While these freely available software technologies are

powerful, they are designed for use primarily in commerciale-commerce and Information Technology (IT) settings. As aresult, they are robust but cannot be easily used off-the-shelfbecause they require significant expertise, IT support, andexpensive staffing to setup, manage, and maintain – whichare not available to smallholder farming concerns. Smart-Farm integrates and leverages these technologies to con-struct an open source data appliance specifically designed toenable individual growers to practice precision agriculture.

To enable this, SmartFarm leverages the AppScale cloudplatform (PaaS) and the Eucalyptus cloud infrastructure(IaaS) to provide portability across clouds and autonomousIT management. Both of these systems are commercial-grade platforms in wide-spread enterprise use today. App-Scale is an open source distributed runtime system thatmakes it very easy to write and deploy network-accessibleapplications (including mobile device back-ends) in high-level languages. AppScale is commercially supported, freelyavailable, and is API-compatible with Google’s cloud plat-form App Engine. This means that applications that run onApp Engine, also run on AppScale and thus SmartFarm,without modification. AppScale runs on public cloud infras-tructures (Amazon Elastic Compute Cloud (EC2), GoogleCompute Engine, Microsoft Azure) or on local cluster re-sources within virtual machine instances over Eucalyptus.Eucalyptus is an open source cloud infrastructure that runson local cluster resources and provides highly available, faulttolerant virtual machine management. Eucalyptus is API-compatible with Amazon EC2 and Simple Storage Service(S3) so that any EC2 instance/service can run on Euca-lyptus without modification. AppScale and Eucalyptus to-gether in SmartFarm provide us with a production-ready,highly available private cloud system and our users with avast ecosystem of freely available cloud software from Googleand Amazon.

Our software architecture for SmartFarm extends thisprivate cloud system in multiple ways. First it integratespopular, open source analytics engines from the “Big Data”community. In particular, the SmartFarm platform sup-ports Spark [5], Hadoop [13], Cassandra [6], postgreSQL [20],MatLab [17], and R [9]. We integrate these disparate tech-nologies into a single distributed system in order to permit

Page 3: SmartFarm: Improving Agriculture Sustainability Using ...ckrintz/papers/dsfew16.pdf · SmartFarm: Improving Agriculture Sustainability Using Modern Information Technology Chandra

Figure 2: Prototype SmartFarm appliance for on-farm de-ployment. Top half contains compute elements runningSmartFarm software stack. Bottom half contains networkswitch and uninterruptable power supply (UPS).

a wide variety of applications to be executed over Smart-Farm. That is, we envision a model in which developers, re-searchers, and industry vendors develop analytics and datavisualization “apps” that rely on these technologies for theirimplementation. Growers then download the apps from an“App Store” (for free or for a fee) and execute them on theSmartFarm appliance using data owned and controlled bythe grower. There are numerous such apps available to-day for general purpose (non-ag) data analysis, visualiza-tion, and machine learning [4, 14–16,23,26].

Figure 2 shows a picture of an on-farm prototype appli-ance. The SmartFarm software implementation (the soft-ware “stack”) runs on a set of small, durable computing“bricks.” Each brick, in this example, contains an 3.1 GHzIntel i7 dual core CPU, 16GB of memory, and 250GB of SSDstorage, an 802.11 wireless interface, and a 1 Gb CAT-V eth-ernet interface. The bricks are “headless” meaning that theycan operate without an attached console or keyboard. Theycommunicate with each other using the wired network viaa switch (located in the bottom half of the appliance) andall components are plugged into an uninterruptable powersupply (UPS) also located in the bottow half.

Because these components can operate at relatively hightemperatures (120 degree F or more) and they contain nomoving parts, they are well suited to deployment on-farm inan out building or office where there is 120V AC and wire-less connectivity. Its firewall enables growers to control fullywhat data, if any, they share with remote entities. More-over, SmartFarm apps can be used to anonymize shareddata according to the growers’ needs and interests. Finally,SmartFarm employs a private wired or wireless network forconnectivity to on-farm sensors, for full isolation from webaccessibility if required.

Our experience with this early SmartFarm prototypeshows that it is feasible to integrate these technologies effi-ciently and effectively and that it is possible to operate thiscombination effectively both at the large scale for which itwas defined and, more interestingly, at the small scale whichwill be typical of on-farm deployments. Thus SmartFarmessentially provides a universal, user-controlled cloud plat-form capable of leveraging the predominant public cloud andpopular analytics technologies and applications.

3. DATA INGRESSWe use the term sensor herein to describe any device or

system that serves as a data source for the analytics or vi-sualization apps executing in a SmartFarm deployment.Sensors under this definition include physical devices that

measure environmental phenomena or record events and ac-tivities of farm agents (human, animal, computer, or me-chanical) that are germane to precision farming. Addition-ally, “files” that have been gathered by the grower, extantdatabases, and Internet accessible web services (e.g. weatherdata sites) that can provide measurements are modeled andtreated as “sensors” by the SmartFarm system.

Our goal with SmartFarm is to support as“farm sensors”a variety of data sources including historical records (handwritten and digital), sensing devices, the Internet (if/whenavailable), and imagery data. Historical records includethose for input (pesticides, water, fertilizers) applications,disease, crop variability and yields, animal health and wel-fare, and manual information extraction (BRIX tests – sugarcontent in solution), Anthocyanin (pigment), pressure cham-bers (plant water status), among others. The sensing de-vices include those that measure water pump performance/-pressure, irrigation processes, and soil moisture/quality aswell as weather stations, energy meters (e.g. GreenBut-ton [10]), and devices that collect farm implement (planters,harvesters, etc.) activity and animal behavior. Internet datais data that is made available via web APIs and include thosefrom social networks, area weather stations (e.g. CIMIS [7],WeatherUnderground), and web services such as those forgeo-location, news, and stock reports. Finally, imagery datainclude multi-spectral (that reveal plant health, disease, ani-mal activity, water use, etc.) video and still images collectedon-farm manually or via UAVs, e.g. [3, 22, 28], and off-farmvia fixed-wing aircraft [27] and satellites [11,24].

No extant system today provides such unified data ag-gregation. The challenges in doing so include configuringand deploying (i.e. interoperating with) such a diversity ofoptions. Moreover, many vendors hold their sensor deviceAPI as proprietary for commercial purposes. In addition,individual vendors are not commercially incentivized or in-terested in providing a unifying data integration service thatworks with the devices of their competitors. Thus, such aninfrastructure and feature set must come from the researchand open source communities.

The most mature step in this direction is with the an opensource middleware for the Internet-of-Things (IoT) calledGlobal Sensor Network [1, 2] (GSN). As such, we have in-tegrated GSN into SmartFarm and have extended it withcustomized support for agriculture settings. GSN is a ro-bust, widely used technology for deploying sensor networksand processing data produced by them in a distributed set-ting. Because GSN is a general purpose system, it providesmechanisms for integrating arbitrary sensors. It does so bydefining a “virtual sensor” in software that represents andinteroprates with one or more physical sensing (data pro-ducing) devices. The GSN community has contributed anumber of different virtual sensor specifications and imple-mentations for popular IoT devices (home monitoring de-vices, photographs, weather stations, and others).

Each GSN virtual sensor consists of a component that in-terfaces to the physical sensor or group of sensors and a com-ponent that collects and persists the data from the sensorin the data management layer of SmartFarm . These com-ponents are implemented as part of the GSN virtual sensorsoftware stack consists of a sensor descriptor, data process-ing logic, and a software wrapper that communicates withone or more instances of a physical sensor. This interfaceis sufficiently abstract to support a wide range of different

Page 4: SmartFarm: Improving Agriculture Sustainability Using ...ckrintz/papers/dsfew16.pdf · SmartFarm: Improving Agriculture Sustainability Using Modern Information Technology Chandra

sensor types (including GSN itself to link multiple GSN de-ployments if necessary).

To date we have developed GSN virtual sensors for lo-cal multiple soil moisture sensors, local temperature sen-sors, and local records including multi-spectral imagery andcomma separated value (CSV) files. We store images in abucket store and their metadata in GSN. In addition wehave developed virtual sensors for remote (web/cloud appli-cation programming interfaces (APIs)) data sources (whenavailable). The technologies that we have integrated includethose from GreenButton (electricity use), CIMIS (weatherand irrigation-related data), WeatherUnderground (commu-nity weather stations), PowWow Energy (imagery), Tule(evapotranspiration), Irrometer (soil moisture and temper-ature), as well as DropBox and Box (generic image/CSVfile storage). We have also extended GSN with support fordirect JSON queries (only vanilla SQL was supported pre-viously). In addition, we have developed multiple tools thatautomate virtual sensor specification and GSN deployment.Our extensions simplify sensor development and GSN useto make GSN more accessible to non-experts. In addition,they enable GSN users to ingress a wide variety of disparatedatasets from a vast diversity of physical sensors and datasources. All of our extensions and tools are available viaGitHub at https://github.com/UCSB-CS-RACELab/gsn.

In all settings, deployed sensor data is automatically in-gressed by GSN and stored in SQL database tables. GSNprovides visualization tools to visualize the data over timeand SmartFarm apps access the data via SQL queries orby transforming the data to HDFS for access via Hadoopand Spark analytics and NoSQL for large scale distributedrange queries.

4. DATA EGRESS AND ANALYSISOur goal with SmartFarm is to help farmers and ranchers

improve environmental stewardship, while increasing yields,profitability, and animal welfare. To enable this, we makeSmartFarm available as open source to researchers andsoftware developers to investigate new sustainability scienceand engineering in the areas of precision agriculture, agro-nomics, and bio-resource and agriculture engineering in theform of SmartFarm applications (apps). The SmartFarmplatform exports a set of APIs and tools that simplify appdeployment and once running, automatically manages theirexecution. The platform is also able to communicate andcontrol farm-local sensors and equipment based on the out-come of app analytics.

As an example, we have developed a SmartFarm appfor wine grape growers called RootRApp. Our foundationRootRApp employs anthocyanin sampling via Near Infrared(NIR) sensing in combination with Ordinary Kriging to im-plement automated differential harvesting [25]. This use ofsampling and Kriging constitutes an example of statisticalanalytics used as a control mechanism since the harvesteruses the Kriging interpolations to separate grapes as it picksthem into two different quality batches automatically.

SmartFarm makes it possible to use the same data setwith different analytics also to implement decision supportfor the grower. RootRApp uses samples of anthocyanin (col-lected for the original study) and attempts to find a“point oforigin” for a potential root cause associated with the grapesthat had fruit readings lower than 0.87mg/g (low quality).

Figure 3 (a) reproduces the map in our original work with

respect to anthocyanin readings. In the original study, thegrower selected a threshold value of 0.87 to differentiate be-tween grape qualities. The figure shows regions where theharvester used a value interpolated from the sample to de-cide whether the grapes were Quality A (dark red regions)or Quality B (yellow regions).

Using the same dataset, RootRApp also computes theα = 0.01 confidence region for the Quality B grapes basedon an assumption of bivariate normality. If the reason forquality difference has a point of origin then a bivariate Nor-mal distribution (as a function of distance) is a reasonablefirst-order model to investigate as a dispersion model. Fig-ure 3 (b) shows the results of this preliminary investigation.Observing the data, it is possible to discern (based on theproximity to the tree and water derrick marked in each)where the likely point-of-origin is located. From these twofigures it appears that such a point would lie in the interiorof the yellow region shown in the center of Figure 3 (a).

SmartFarm makes it easy to extract or egress data fromthe system so that it can be used by online tools and hybridcloud services. In this example, we use Google Earth to en-able the growers to zoom in and out of SmartFarm data(i.e. “fly” through the annotated image) in real time. Multi-ple levels of detail combined with analytics calculation allowthe grower to pinpoint the exact region in which a point-of-origin is likely to occur. By storing the data and performingthe analysis locally, SmartFarm provides growers with thesame outcome that public cloud solutions provide while con-serving both bandwidth and Internet data plan charges andpreserving grower privacy.

In particular, no information about the farm (other thanthe GPS locations of the samples) is shared with Google, i.e.no information about anthocyanin or BRIX (which indicatepotential productivity for the plot) is shared. RootRAppprecludes the need to upload these latter measurements toGoogle when producing the map for the grower. In this way,an adversary gaining access to Google’s infrastructure mightlearn that a grower is interested in a particular plot of land(based on latitude and longitude) but could not discern whatinformation beyond that were part of the analysis. Becausethe app is customized to a specific growing region, it is alsopossible to determine whether a datum falls within the lat-itude and longitude under study and if it does not, block itfrom being egressed.

Notice also that RootRApp is purely an analytics calcu-lation. The root cause could be lack of plant stress, disease,insect infestation, soil contamination, etc. That is, we canapply it to a wide range of problem sets and sensor datatypes. The analytics combined with high-quality and inter-active imaging allow the grower to focus his or her effortsto investigate specific plants and areas in the vineyard. Forexample, if the grower wishes to investigate whether the rea-son for the lower levels of anthocyanin were due to a waterleak that reduced plant stress, the level of detail providedby the interactive capability provided by Google Earth com-bined with the analytics calculation showing the confidenceregion provides a useful guide on where to look for the leak.

5. CONCLUSIONS AND FUTURE WORKIn this paper, we overview the vision, software architec-

ture, and our on-going work surrounding UCSB Smart-Farm. In addition, we describe a new SmartFarm applica-tion (app), called RootRApp, that makes use of farm sensor

Page 5: SmartFarm: Improving Agriculture Sustainability Using ...ckrintz/papers/dsfew16.pdf · SmartFarm: Improving Agriculture Sustainability Using Modern Information Technology Chandra

(a) (b)Figure 3: Example SmartFarm app: Harvester control and origin analysis using the same data and hybrid cloud (local analysisand Google Earth). Figure (a) is an interpolated anthocyanin map from [25]. Points indicate vines that were sampled. Darkred regions indicate anthocyanin readings > 0.87 (Quality A grapes) and yellow regions corresponds to readings <= 0.87(Quality B grapes). Figure (b) shows the anthocyanin samples (red points) from [25] and point-of-origin using the bivariateNormal for probability 0.99. The white ellipse delineates the α = 0.01 confidence region.

data to perform root cause analysis for wine grape qualitydifferences. RootRApp, summarizes and anonymizes dataso that it can be shared and visualized with a cloud geo-spatial mapping service (Google Earth) running locally orat Google.

As part of future work, we are developing techniques thatwill enable autonomous management of the SmartFarmprivate cloud (greater than 1 year) by leveraging the product-ion-quality mechanisms for high availability that Eucalyptusprovides. We are also working on SmartFarm data modelsand HDFS interoperation to facilitate efficient data queryand processing (via the analytics engines). Finally, we areinvestigating a wide range of SmartFarm apps for frostdamage mitigation, irrigation scheduling, and yield predic-tion as well as a number of new farm sensor technologies.We hope that SmartFarm enables growers to ask “what if”questions like the point-of-origin question dynamically, inreal time, using the data they have harvested locally – at amuch greater frequency and efficacy than is possible today.Closing the loop between grower experience, data gathering,and data analysis will enable growers to leverage their owndata assets to improve yields more sustainably.

This work is funded in part by NSF (CCF-1539586, CNS-1218808, CNS-0905237, ACI-0751315), NIH (1R01EB014877-01), ONR NEEC (N00174-16-C-0020), Huawei Technologies,and the California Energy Commission (PON-14-304).

6. REFERENCES[1] K. Aberer, M. Hauswirth, and A. Salehi. A middleware for fast

and flexible sensor network deployment. Technical Report2006-006, LSIR - EPFL, 2006.

[2] K. Aberer, M. Hauswirth, and A. Salehi. A middleware for fastand flexible sensor network deployment. In InternationalConference on Very Large Data Bases, 2006.

[3] AgEagle. http://www.ageagle.com [Online; accessed2-April-2014].

[4] Apache Mahout. http://mahout.apache.org/ [Online; accessed14-Feb-2015].

[5] Apache Spark. https://spark.apache.org/ [Online; accessed14-Feb-2015].

[6] Cassandra. http://cassandra.apache.org/.

[7] CIMIS. California Irrigation Management Information System.http://wwwcimis.water.ca.gov/cimis/welcome.jsp [Online;accessed 2-April-2014].

[8] Climate Corporation (A Monsanto Company), 2014.http://www.climate.com/ [Online; accessed 2-April-2014].

[9] CRAN R Project. http://cran.r-project.org.

[10] Department of Energy Green Button Intiative.http://energy.gov/data/green-button. [Online; accessed25-March-2015].

[11] Google earth. http://www.google.com/earth/ [Online; accessed14-Feb-2015].

[12] gThrive, 2014. http://www.gthrive.com/ [Online; accessed2-April-2014].

[13] Hadoop MapReduce. http://hadoop.apache.org/ [Online;accessed 14-Feb-2015].

[14] Lumify: open source big data analysis and visualizationplatform. http://lumify.io/ [Online; accessed 14-Feb-2015].

[15] Machine Learning Library. http://spark.apache.org/mllib/[Online; accessed 14-Feb-2015].

[16] Machine Learning Open Source Software.http://jmlr.org/mloss/ [Online; accessed 14-Feb-2015].

[17] MatLab. http://www.mathworks.com/products/matlab/[Online; accessed 14-Feb-2015].

[18] MyAgCentral. MyAgCentral Launches Cloud-Based SolutionFor UAV Imagery, Feb. 2014. http://www.myagcentral.com/http://www.precisionag.com/equipment/myagcentral-launches-cloud-based-solution-for-uav-imagery/[Online; accessed 2-April-2014].

[19] OnFarm: SaaS-based Data Visualization Product.www.onfarm.com. [Online; accessed 24-March-2014].

[20] PostgreSQL. http://www.postgresql.org/. [Online; accessed25-March-2015].

[21] Pow Wow Energy: SaaS-based Smart Leak Detection.https://www.powwowenergy.com/learnMore/. [Online; accessed24-March-2014].

[22] PrecisionHawk. http://www.precisionhawk.com [Online;accessed 2-April-2014].

[23] Prediction IO. http://prediction.io/ [Online; accessed14-Feb-2015].

[24] Satellite imaging corp. http://www.satimagingcorp.com/applications/natural-resources/agriculture/ [Online; accessed14-Feb-2015].

[25] B. Sethuramasamyraja, H. Singh, andG. Mazhuvancheriparambath. Geospatial modeling of winegrape quality (anthocyanin) for optimum sampling strategy inmechanized on-the-go differential harvesting programs.International Journal of Engineering Science and Technology,2(10), 2010.

[26] Spark Applications.https://github.com/databricks/reference-apps [Online; accessed14-Feb-2015].

[27] Terravion. http://www.terravion.com/ [Online; accessed14-Feb-2015].

[28] VineRangers. http://www.vinerangers.com/ [Online; accessed14-Feb-2015].

[29] WatrHub, 2014. http://www.watrhub.com/ [Online; accessed2-April-2014].