Development of a model of data-flows for precision agriculture based on a collaborative research project

Edward Nash a,∗, Frank Dreger b, Jürgen Schwarz b, Ralf Bill a, Armin Werner b

a Institute for Management of Rural Areas, Faculty of Agricultural and Environmental Sciences, Rostock University, Germany
b Leibniz-Centre for Agricultural Landscape Research (ZALF), Department of Land Use Systems, Müncheberg, Germany

Computers and Electronics in Agriculture 66 (2009) 25–37. Available at www.sciencedirect.com. Journal homepage: www.elsevier.com/locate/compag

Article history: Received 21 July 2008. Received in revised form 7 November 2008. Accepted 10 November 2008.

Keywords: Precision agriculture, Data-flows, Information model

Abstract

The understanding of actual and potential data-flows in the practice of precision agriculture (PA) is an essential prerequisite for the optimisation and automation of information management in this field. As a contribution to this process, this paper presents an analysis of the data-flows within the work of a collaborative research project concerned with the testing of the methods developed within the project on two demonstration fields. This work provides a good case study for the modelling of a range of data-flows covering a broad spectrum of PA techniques.

Using the notation and software tools for the Unified Modeling Language (UML), a complete model of all identified data-flows was created. Individual data-streams relating to particular source or product datasets were then extracted from this model. These data-streams present a practical application of the model in identifying the benefit that may be obtained from a particular gathered dataset (e.g. yield data) or in identifying the data that must be gathered to generate a particular product dataset (e.g. sustainability indicators). Whilst the current model is focussed on one particular research project, it has potential to be extended to cover more generally the common practice of precision agriculture. Such a model may then be used by farmers as a roadmap for the adoption of precision agriculture by allowing them to determine what datasets are available to them or may be easily collected and what products they may generate from these, or vice versa to identify what datasets they must obtain in order to generate a particular dataset of interest.

© 2008 Elsevier B.V. All rights reserved.

1. Introduction

In this paper we present the results of the modelling of the work of the pre agro project relating to two demonstration fields during the final year of this collaborative research project. The work on the demonstration fields was intended as a means of testing the methods and software which had been developed during the preceding years of the project. The modelling of this work concentrated mainly on the data-flows within the project and the individual datasets gathered, processed or created by the individual sub-projects for their work on the demonstration fields.

∗ Corresponding author at: Institute for Management of Rural Areas, Faculty of Agricultural and Environmental Sciences, Rostock University, Justus-von-Liebig-Weg 6, 18059 Rostock, Germany. Tel.: +49 381 498 3212.
E-mail addresses: [email protected] (E. Nash), [email protected] (F. Dreger), [email protected] (J. Schwarz), [email protected] (R. Bill), [email protected] (A. Werner).
0168-1699/$ – see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.compag.2008.11.005




The modelling of data-flows in precision agriculture (PA)¹ is not new: e.g. Fountas et al. (2006) present an analysis of information flows in decision-making. Whilst that work concentrates on the information and its context at the point a decision is made, the current paper has a more general focus of analysing the (potential) relationships and transformations between the datasets which will then be used either for derivation of further datasets or as the basis for decision-making. We therefore wish to consider not only the datasets upon which decisions are based but more generally all datasets which are required at some point in providing information upon which decisions are based and the relationships between these datasets. The analysis of actual decision-making and processing methods for PA is however outside the scope of this paper.

It should also be noted at this point that the model presented here is necessarily incomplete as the application fields were used for demonstrating methods under investigation in one particular research project: pre agro. There are many valid methods in PA which were not part of the research in pre agro and for which the datasets and data models are not part of this model.

This paper will first introduce the motivation for the research described here as well as the background of the pre agro project and the work on two demonstration fields which underlies this paper before presenting the methodology used for modelling data-flows. The results are then presented, first as a whole model of all data-flows identified and thereafter as individual data-streams highlighting how individual datasets were used or generated. These data-streams have been chosen as being widely relevant to PA: sustainability, and assessment thereof, is a persistent topic, as is the generation and use of management zones as a form of site-specific agriculture, whilst yield mapping and remote sensing are two of the most widely used data sources in PA. The paper finishes with a discussion of the results presented here, their limitations and their relevance to PA and PA research.

1.1. Motivation

As well as providing a document of the work of a collaborative research project on PA, which we hope is an interesting subject in its own right, this model may serve as a basis for the better understanding of the data-flows for any farm practising PA. Such an understanding is an essential prerequisite for a future optimisation of these data-flows, e.g. as part of a service-oriented architecture (SOA) for PA in which each individual dataset or processing task may be provided or performed by a web-service offered by a different specialist organisation (Nash et al., 2007b). Whilst such a SOA may be a medium- to long-term vision, the clear identification of required datasets and the links between them in this model may also assist in the short term in the further development of standardised data transfer formats such as agroXML (KTBL, 2008) for the exchange of PA data between farmers and their agricultural partners by identifying the data which such formats must be capable of representing. Even for the successful adoption of decision-support systems (DSS) for PA, an understanding of the relationships between different datasets is essential: if it is clear what source data are being used in what way, it should enable a better evaluation of the credibility of a method, which is an important factor in the adoption of agricultural DSS (Matthews et al., 2008; Carberry et al., 2002; McCown, 2002).

¹ Note that pre agro focused on crop production technologies. Hence, the term “precision agriculture” in this paper refers only to crop production. Any processes linked to precision livestock farming or other ‘precision’ technologies are outside the scope of this paper.

Although many of the methods and algorithms used as part of a scientific project such as pre agro are not yet sufficiently developed for widespread practical use by farmers, many of the datasets used, generated and exchanged are identical with those commonly used in normal practice. A further expected benefit of this model is therefore in the identification of what results may be obtained based on which datasets, and which datasets are necessary for the generation of which results, i.e. to answer the questions “what can I do with dataset X?” and “what datasets do I need to generate Y?”. Farmers are prepared to collect, finance and manage geographically referenced data when the uses and benefits of the data are clear (Vogel et al., 2008), and this aspect of the model may therefore be a means of increasing acceptance of PA among farmers.

1.2. The pre agro project

pre agro was an interdisciplinary collaborative research project involving 26 partner institutions. The project consisted of 22 sub-projects which split into 4 project domains (Table 1), not all of which were directly involved in the work modelled here. The project reported on here, running from 2005 to 2007, was the second phase, with the first phase having run from 1999 to 2003. Further information, in both German and English, is available on the project website.²

PA offers new options in gathering data and information in crop production. The aims of the second project phase lay not only in the development of ‘traditional’ PA, i.e. the characterisation of soils and crop stands as well as site-specific applications, but also in the development of an information-driven crop production, with the farm as one node in a network of geospatial and agricultural data management, as well as in the acceptance of PA in the agricultural community (e.g. Reichardt & Jürgens, in press), and hence in encouraging the actual utilisation of precision agriculture applications. The ongoing endeavours to standardise interfaces between machinery and software (ISOBUS, agroXML, EDI-Teelt, eDAPLOS, etc.) enable new options for the documentation of production processes and for the communication with the value chain of food production as well as with other stakeholders. It was another target of the project to develop a sustainability certificate to be applied on single farms using, wherever possible, sustainability indicators derived from datasets gathered by precision agriculture applications (Schaffner et al., 2008; Schaffner and Hövelmann, in press).

² http://www.preagro.de.


Table 1 – Structure of the pre agro project.

Sub-project (TP) | Domain | Title
1 | Value chain of food | Sustainable added value chain food
2 | Value chain of food | Shaping technology by participation along the agro-food chain
3 | Cross-domain | Economic evaluation of precision agriculture in a farm-wide context
4 | Site-specific cropping | Nature conservation
5 | Value chain of food | Economical analysis
6 | Value chain of food | The acceptance of precision agriculture
7 | Information management | Geodata infrastructure for precision farming
8 | Information management | Integration of automated process data acquisition in information flows
9 | Site-specific cropping | Integrated crop management
10 | Site-specific cropping | Site-specific plant protection
11 | Site-specific cropping | Research of methods for operationally on-farm experiments in precision farming
12 | Site-specific cropping | Remote sensing of plant diseases
13 | Characterisation of sites and crop stands | Model-based analyses with remote sensing
14 | Characterisation of sites and crop stands | Development of an integrative site analysis
15 | Characterisation of sites and crop stands | Model supported generation of yield expectation maps
16 | Characterisation of sites and crop stands | Potential root growth
17 | Information management | Information processing in business
18 | Information management | Office-software
19 | Information management | agroXML as electronic data interchange language for precision farming
20 | Cross-domain | Transfer and education
21 | Cross-domain | Overall coordination of the project’s content
22 | Cross-domain | Project information system

1.3. The demonstration fields

The demonstration fields are located on farms in Saxony-Anhalt and Lower Saxony, Germany (Fig. 1).

Fig. 1 – Location of the demonstration farms in Germany.

The farm WIMEX, located in Wulfen/Saxony-Anhalt, has a size of about 7000 ha. The average temperature in recent years was ca. 9.0 °C and the annual precipitation 474 mm. The climate is characterised by an early summer drought. Main crops are winter wheat, corn, spring barley and canola (or winter rape). The parent materials of soils are Pleistocene glacial and periglacial sediments of Saale and Weichsel glacials, deposited on top of Mesozoic to Cenozoic sedimentary rocks. The dominant soil types (according to the World Reference Base, IUSS Working Group WRB) are Chernozems. The spatial variability of the soils can be estimated using the German soil classification system (‘Reichsbodenschätzung’ (RBS) from 1934). This system has a range from 0 (very poor soil) up to 100 (very good soil). The soils at the WIMEX farm are very heterogeneous with a range in the RBS from 16 to 98 with an area-weighted mean of 70.6.

The farm Täger-Farny, located in Groß-Twülpstedt/Lower Saxony, has a size of about 450 ha. The average temperature in recent years was ca. 8.4 °C and the annual precipitation 600 mm. Main crops are winter wheat, sugar beets and canola (winter rape). The parent materials of soils are Pleistocene glacial and periglacial sediments of the Saale glacial deposited on top of Middle and Upper Jurassic claystone or older moraine material of the Weichsel glaciation. The dominant soil types are Vertisols and Stagnosols. For the farm Täger-Farny the RBS values range from 32 to 63 with an area-weighted mean of 48.4.

The demonstration fields were selected on the basis of (i) the crop (winter wheat), (ii) a sufficient size for the experimental sites, (iii) good accessibility for measurements, and (iv), last but not least, the use of the field at the farm Täger-Farny for a field demonstration day (Feldtag).

2. Methodology

The aim of the model presented here is to represent all datasets used during the work on the demonstration fields, whether these datasets were pre-existing and obtained from external sources, collected within the project or generated based upon other datasets (the ‘status’ of the dataset), which sub-project (TP) was responsible for the dataset, and the relationships between the datasets. No attempt has been made to record the methods and algorithms used for collection or generation of datasets, although the model could perhaps be extended to include this information. Three facets were chosen as the primary basis for the model:

1. A representation of the complete data-flow.
2. A representation of selected complete data-streams relating to individual datasets as a means of answering the questions introduced in Section 1.1.
3. The identification of data-flows for individual sub-projects as a means of gathering feedback during the development of the model and for identification of locally relevant and intermediate datasets.

Each of these facets is of course interlinked, and the complete model may be regarded as the sum of the individual data-flows or the individual data-streams as extracts from the complete model. Since different users are interested in different aspects of the model, and the complete model is inherently complex, the simplification offered by the individual facets improves the usability of the model.

Note that we use two similar but distinct terms for different aspects of the model: data-flows and data-streams. A data-flow is defined as any single or multiple transition between any two related datasets, whereas a data-stream is a sequence of data-flows which has one or more distinct source and target datasets which will be directly collected or used in PA. That is to say, a data-stream links the recognised inputs and desired outputs for a particular analysis or set of analyses in PA whereas a data-flow may represent nothing more than an intermediate processing step.

Fig. 2 – Symbology used to represent model of data-flows.

For the recording and presentation of the model, the international standard Unified Modeling Language (UML) was chosen due to its flexibility and the availability of software tools capable of handling complex models and generating graphical views of subsets of these models. In particular, the state diagram was chosen as the main means of representing the model, with each dataset representing a state, and the dataset source being represented as the class to which that state refers. The data-flows are therefore represented by the transitions between these states, whilst the stereotype mechanism was used to represent the status of each dataset. Where a particular dataset was used on only one of the demonstration fields, generally due to the differing data availability between the farms, this is indicated as a condition applying to the transition.
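The mapping described above (datasets as states, data-flows as transitions, status as stereotype, farm as condition) can be sketched as a small data structure. This is only an illustrative sketch and not the Enterprise Architect model used in the project: the class names `Dataset` and `DataFlow`, the helper `flows_for_farm` and the two example flows are assumptions made for illustration. The dataset names, sources and statuses are taken from Table 2.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Dataset:
    """One dataset, modelled as a UML state (hypothetical helper class)."""
    name: str
    source: str  # sub-project or repository, e.g. "TP13" or "preMIS"
    status: str  # "pre-existing", "collected" or "generated"


@dataclass(frozen=True)
class DataFlow:
    """One data-flow, modelled as a UML transition between two states."""
    source: Dataset
    target: Dataset
    farm: Optional[str] = None  # condition; None means the flow applies to both farms


def flows_for_farm(flows: List[DataFlow], farm: str) -> List[DataFlow]:
    """Return the flows that apply to the given farm (unconditional ones included)."""
    return [f for f in flows if f.farm is None or f.farm == farm]


dtm = Dataset("DTM", "preMIS", "collected")
historical_yield = Dataset("Historical yield", "preMIS", "collected")
expected_yield = Dataset("Expected yield map", "TP13", "generated")

# Illustrative flows only: which datasets feed which, and the farm
# condition on the second flow, are assumptions and not read from Fig. 3.
flows = [
    DataFlow(dtm, expected_yield),
    DataFlow(historical_yield, expected_yield, farm="WIMEX"),
]

print(len(flows_for_farm(flows, "WIMEX")))
print(len(flows_for_farm(flows, "Täger-Farny")))
```

Keeping the farm as an optional attribute of the flow, rather than of the dataset, mirrors the choice in the text to express farm-specific use as a condition on the transition.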

The graphical representation used is presented in Fig. 2. This is a small subset of UML notation and we will only explain the symbols used here; for a complete coverage of UML, see e.g. Booch et al. (1999). Each dataset is represented as a rounded rectangle (the UML state symbol). Within, the status of the dataset as used in the project (pre-existing, collected or generated) is shown in the first line in guillemets (« », the UML stereotype notation). This is followed by the source of the dataset within the project (usually a specific sub-project (TP from the German ‘Teilprojekt’)) before the double-colon (the UML object name notation) and the name of the dataset after the double-colon (the UML state name notation). Individual data-flows are indicated using arrows from a source dataset to a target dataset (the UML transition notation). Where a data-flow was only present in relation to a particular farm, the arrow is annotated with the name of the farm in square brackets (the UML condition notation).
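As a rough textual analogue of this notation, the labels could be rendered as follows. The function names and the `-->` arrow are hypothetical stand-ins for the graphical symbols; only the guillemets, the double-colon and the square-bracket condition follow the description above.

```python
from typing import Optional


def state_label(status: str, source: str, name: str) -> str:
    """Label of a dataset: «status» on the first line, then source::name."""
    return f"«{status}»\n{source}::{name}"


def transition_label(source_name: str, target_name: str,
                     farm: Optional[str] = None) -> str:
    """Textual arrow for a data-flow, with the farm as an optional [condition]."""
    condition = f" [{farm}]" if farm else ""
    return f"{source_name} --> {target_name}{condition}"


print(state_label("generated", "TP13", "Expected yield map"))
# The farm condition in this example flow is an assumption for illustration.
print(transition_label("Historical yield", "Expected yield map", farm="WIMEX"))
```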

The model was created using the software Enterprise Architect (SparxSystems, 2007), which allows the model to be stored in a database and appropriate diagrams to be generated to represent different aspects. Although we have used the state diagram in a non-standard way, the software and method were found to be eminently suitable for the task.

An initial draft of the model was prepared based on the documented aims and initial work-plan for the demonstration fields. Individual extracts of this model were then prepared for each sub-project to allow them to provide feedback as to which datasets they used and generated in reality. The feedback from each sub-project was then matched and incorporated into the model before it was again sent out for feedback. This iterative process, with each sub-project able to concentrate on the data-flows around their individual work, allowed the model to be rapidly developed and improved. Once the model was finalised, individual data-streams were identified to illustrate how extracts of the necessarily complex complete model may be used to highlight relationships between different datasets.

3. Results

The complete model, as expected, rapidly increased in complexity such that the representation of it in a single legible diagram became nearly impossible (Fig. 3). The large number of data-flows highlights the good coordination and cooperation required within a large interdisciplinary collaborative research project and achieved in pre agro, which will hopefully inform future projects as to what may be expected. More generally, the complexity reflects the fact that many datasets used in PA may be required for many different purposes, and that many generated datasets are based on a large number of datasets, which may in turn be based on further datasets. Individual data-streams may be deduced from Fig. 3 by identifying the relevant source or target dataset and then repeatedly working through the outgoing or incoming data-flows to determine the other datasets forming part of that data-stream. This is however an onerous task and so selected data-streams have been identified and are presented in Figs. 4–8.

Fig. 3 – Data-flow diagram using UML State Diagram notation for all work on the demonstration fields.

It can be seen in Fig. 3 that some datasets form central nodes with a large number of incoming and/or outgoing data-flows. Datasets which are generated based on a large number of sources (e.g. ‘Sustainability’ and ‘Cost-Effectiveness’) have many incoming data-flows, and those which are used as a source for the generation of many other datasets have many outgoing arrows (e.g. ‘N-Fertilisation’). Other datasets have significant numbers of both incoming and outgoing data-flows (e.g. ‘Yield Target Map’ and ‘Crop Protection Application’) whereas further generated datasets have few incoming data-flows and collected datasets few outgoing data-flows. Although from this it can probably be concluded that some datasets require a more widespread data collection to generate than others and that some datasets are more ‘multi-purpose’ than others, the importance of each dataset cannot be deduced from this model: a dataset may be demanding to generate despite being based on a single source or may have a single use which is essential for correct application of a particular PA method.
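Both observations above, the counting of incoming and outgoing data-flows and the deduction of a data-stream by repeatedly following outgoing data-flows, amount to simple graph operations. The following is a minimal sketch under stated assumptions: the edge list `FLOWS` is invented for illustration and is not read from Fig. 3, and the function names are hypothetical. The dataset names themselves appear in the text or in Table 2.

```python
from collections import defaultdict, deque

# Illustrative (source, target) data-flows; NOT the flows of the real model.
FLOWS = [
    ("DTM", "Expected yield map"),
    ("Historical yield", "Expected yield map"),
    ("Expected yield map", "N-Fertilisation"),
    ("N-Fertilisation", "Cost-effectiveness"),
    ("N-Fertilisation", "Sustainability"),
    ("Farm financial data", "Sustainability"),
]


def degrees(flows):
    """Count incoming and outgoing data-flows for every dataset."""
    incoming, outgoing = defaultdict(int), defaultdict(int)
    for src, dst in flows:
        outgoing[src] += 1
        incoming[dst] += 1
    return dict(incoming), dict(outgoing)


def downstream(flows, source):
    """All datasets reachable from `source` by following outgoing data-flows."""
    successors = defaultdict(list)
    for src, dst in flows:
        successors[src].append(dst)
    seen, queue = set(), deque([source])
    while queue:
        for nxt in successors[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


incoming, outgoing = degrees(FLOWS)
print(incoming["Sustainability"], outgoing["N-Fertilisation"])
print(sorted(downstream(FLOWS, "DTM")))
```

The forward traversal corresponds to the question “what can I do with dataset X?”. As the text cautions, a high count of incoming or outgoing data-flows says nothing about how important a dataset is.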

Even the legible representation of individual data-streams showing which original source (collected or pre-existing) datasets were exploited for the generation of result datasets was in many cases problematic. Nevertheless, the following sections present some of these data-streams, each focussed either on one particular source or one particular target dataset. For the presentation of data-streams from source datasets, the datasets generated from these sources, and subsequent generations, are shown. The relations to other prerequisite datasets for the generated datasets are not represented, even where these datasets are shown in the diagram. For the presentation of data-streams to target datasets, the immediate prerequisite datasets of the target and all their prerequisite datasets, and the relations between them, are represented, together with the datasets for which the target dataset is an immediate prerequisite. For these latter datasets, relations to other datasets (i.e. other prerequisites) are not shown, even for datasets which already appear in the diagram. The diagrams can therefore be read as showing all possible uses of source datasets and all prerequisite datasets for target datasets, i.e. answering the questions presented in Section 1.1.

Fig. 4 – Data-stream leading to generation of sustainability indicators.

Fig. 5 – Data-stream leading to generation of management zones.

Fig. 6 – Data-stream for the exploitation of yield data.

Fig. 7 – Data-stream for the exploitation of remote sensing data.

3.1. Datasets collected, used and generated

In total, 60 distinct datasets, summarised in Table 2, are present in the model. Of these, 7 were pre-existing datasets (status ‘E’), 14 collected as part of the pre agro project (status ‘C’) and 34 generated based on other datasets (status ‘G’). In some cases (e.g. ‘Infestation Monitoring’), certain datasets were used to decide exactly where, when and what data to collect; in these cases the new dataset is indicated as collected rather than generated as this reflects the status of the primary information contained. The source column indicates where the dataset was obtained from within the project. Where only a number is given this is the number of the sub-project within pre agro (cf. Table 1). A number of datasets were used which were collected as part of the first phase of the pre agro project. Although the status of these datasets could be determined, the source was not any sub-project within the second phase of the project and they are therefore indicated as having the source preMIS, the pre agro Management and Information System (Korduan, 2004), which is the central data repository for the project. The dataset ‘Market Prices’ was obtained from the ‘REPRO’ software system for producing sustainability indicators (Christensen et al., 2008). The ‘Biotope Mapping’ was obtained directly from the relevant regional government agencies. Some of the datasets taken from preMIS were also originally obtained from such external sources (e.g. ‘RBS’ and ‘Topographic Map 1:25000’) and so could also have been considered to have source ‘external’.

Fig. 8 – Data-streams for the exploitation of soil data and RBS (“Reichsbodenschätzung” soil classification maps).

3.2. Data-stream for generating sustainability indicators

Sub-project 1 was concerned with the development of sustainability indicators, both at the level of the individual field and at the level of the farm. More information about the work in this sub-project and the sustainability indicators can be found in Schaffner et al. (2008) and Schaffner and Hövelmann (in press). The generation of such indicators requires a holistic view of the agricultural, economic and ecological aspects of crop production and farm management, necessitating the use of a large number of input datasets. This is well illustrated in Fig. 4, which shows the data-stream used to generate the sustainability indicators. As well as data related to individual on-field and part-field processes (e.g. fertilisation, crop protection and yield), information regarding the management of the farm (e.g. labour and financial data and QA systems in use) and external factors (e.g. consumer expectations) are all required. Some datasets may more-or-less directly be used to assess particular sustainability indicators (those shown with a data-flow directly to ‘Sustainability’ in Fig. 4), whereas others are only used indirectly (e.g. ‘DTM’), as the source for other datasets which may be used to assess sustainability. Without such a data-stream analysis the relevance of these latter datasets to sustainability assessments may not be apparent.
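The distinction between directly and indirectly used datasets can be made explicit by walking the data-flows backwards from the target dataset. The sketch below answers the question “what datasets do I need to generate Y?” under stated assumptions: the edge list is invented for illustration (the real prerequisites of ‘Sustainability’ are those shown in Fig. 4), and `prerequisites` is a hypothetical helper, not part of the pre agro tooling.

```python
from collections import defaultdict

# Illustrative (source, target) data-flows; NOT the flows shown in Fig. 4.
FLOWS = [
    ("DTM", "Expected yield map"),
    ("Historical yield", "Expected yield map"),
    ("Expected yield map", "N-Fertilisation"),
    ("N-Fertilisation", "Sustainability"),
    ("Farm financial data", "Sustainability"),
]


def prerequisites(flows, target):
    """Split the prerequisites of `target` into direct and indirect ones."""
    predecessors = defaultdict(set)
    for src, dst in flows:
        predecessors[dst].add(src)
    # Direct prerequisites have a data-flow straight into the target.
    direct = set(predecessors[target])
    # Indirect prerequisites are reached only via other datasets.
    indirect, stack = set(), list(direct)
    while stack:
        for src in predecessors[stack.pop()]:
            if src != target and src not in direct and src not in indirect:
                indirect.add(src)
                stack.append(src)
    return direct, indirect


direct, indirect = prerequisites(FLOWS, "Sustainability")
print(sorted(direct))
print(sorted(indirect))
```

In this invented example ‘DTM’ ends up in the indirect set, which is the kind of relevance that, as noted above, may not be apparent without a data-stream analysis.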

3.3. Data-stream for generating management zones

The procedure used for automated generation of prototype management zones within pre agro is described in more detail in Nash et al. (2007a). Essentially, a hill-climbing clustering algorithm (Rubin, 1967) was applied to allow any relevant raster dataset to be used as a factor in the delineation of ad hoc zones. The ease-of-use of such an automated approach may allow use of specific management zones based on data relevant to a particular application rather than using the same zones for all site-specific applications in one or more growing seasons. The range of source datasets shown in Fig. 5 illus-

Table 2 – Datasets included in the data-flow model.

Dataset | Source | Status | Description
Banded over-seed fertilisation | 9 | G | Constant fertilisation applied to maize crop
Biotope mapping | External | E | Official habitat mapping from regional environmental agency
Camera measurements | 12 | C | Targeted camera measurements for crop damage monitoring
Clay content map | 14 | G | Map of clay content of soil in %
Consumer expectations | 2 | C | Expectations from consumers as to the outcomes of agriculture regarding product, environment, etc.
Corporate social responsibility | 1 | C | Involvement of the farm in the local community (e.g. sponsorship and open days)
Cost-benefit analysis | 5 | G | Analysis of benefits and drawbacks of use of PA technology in terms of product, economy, environment, etc.
Cost-effectiveness | 3 | G | Evaluation of cost-effectiveness of PA/site-specific farming compared to constant rate
Crop protection agent costs | 9 | E | Costs for herb- and pesticides
Crop protection application | 10 | G | Site-specific herb- and pesticide application
Crop rotation | 9 | G | Planned crop rotation (2 years)
Crop vitality map | 13 | G | Map of relative crop vitality based on fluorescence
Data model in FMIS | 18 | G | Model used to manage farm data in a commercial farm management information system
DTM | preMIS | C | High-resolution digital terrain model
ECa mapping | preMIS | G | Interpolated apparent electrical conductivity maps
ECa measurement | 14 | C | Apparent electrical conductivity
Environmental protection quality targets | 4 | G | Goals for environmental management based on local habitat and context
Environmental protection quality targets | preMIS | G | Goals for environmental management based on local habitat and context
Expected nutrient removal | 3 | G | Total nutrient removal from main and by-products of one crop rotation
Expected yield map | 13 | G | Estimated site-specific yield for the current crop
Farm financial data | 1 | C | Economic data from the farm including balance, credit, turnover, profit/loss, etc.
Farm labour data | 1 | C | Personnel data including wages, training, holiday days, etc.
Fertiliser/crop protection types | 18 | E | Classification based on licensed agent types
Fertiliser costs | 9 | E | Costs for fertiliser
Fuel consumption | 8 | G | Fuel consumption per operation from board computer data
Ground truth | 9 | C | Data collected in-field for validating remotely sensed data
Heterogeneity indicator | 5 | G | Measure of the diversity within a crop stand
Historical yield | preMIS | C | Raw yield for previous years
Hyperspectral remote sensor data | 13 | C | Collected from an airborne platform
Infestation monitoring | 12 | C | Observations of infestation and fungal

infectionInterpolated yield Premis G Yield mapping from filtered and

interpolated dataK-fertilisation 9 G Site-specific potassium fertilisationLAI map 13 G Leaf area index mappingManagement zones 7 G Ad hoc partfield zonesMarket prices 3 C Market prices for crops and consumablesMarket prices Repro E Market prices for crops and consumables

Page 10: Development of a model of data-flows for precision agriculture based on a collaborative research project

34 c o m p u t e r s a n d e l e c t r o n i c s i n a g r i c u l t u r e 6 6 ( 2 0 0 9 ) 25–37

Table 2 – (Continued )

Dataset Source Status Description

N-fertilisation 9 G Site-specific nitrogen fertilisationN-sensor data 9 C Measurements from YARA N-SensorNutrients in harvest residues 3 G Expected nutrient content of harvest

residues left in situOperation documentation 8 G Documentation of individual site-specific

operations carried outPenetrometer resistance 14 C Results of in-field penetrometer

measurementsP-fertilisation 9 G Site-specific phosphorus fertilisationPotential root depth 16 G Map of theoretical maximum root depth

for plants based on local geologicalconditions

PRI map 15 G Photochemical reflectance index mappingProcess data 8 C Status-monitoring data continuously

gathered by board computerQA systems 1 C Quality assurance systems in use on the

farmRaw yield 9 C Yield measured using on-board sensorsRBS Premis E “Reichsbodenschätzung” (German

national agricultural land class dataset)Soil compression map 14 G Map of soil compression based on soil

information and penetrometer resistanceSoil data Premis C Results from soil probesSoil probes 3 C Results of laboratory analysis of soil coresSowing map 9 G Site-specific sowingSustainability 1 G Indicators for social, economic and

ecological sustainability of the farmTopographic map 1:25,000 Premis E Standard official topographic mapTotal nutrient requirement 3 G Expected nutrient removal for a crop

rotationTrial evaluation 11 G Verification of on-farm experimental

methodologyTWI Premis G Topographic wetness indexYield analysis 3 G Initial inspection of yield dataYield target map 15 G Theoretical maximum site-specific yield

capacity

Yield zones Premis

trates some potential inputs to the generation of managementzones: both data which requires some existing commitmentto PA (yield mapping, preferably from multiple years) and datawhich may be available from external sources (digital terrainmodels) or rapidly collected using local or remote sensors(DTM, ECa and hyperspectral remote sensing). Particularly theuse of the latter two categories of input may allow the gener-ation of management zones as an initial stage of PA adoptionwith a low initial investment: presenting farmers with sucha data-stream analysis may help illustrate how their exist-ing data, or data which may be easily and cheaply acquired,could be used for PA. Some of these datasets however requirea degree of pre-processing in order to be used for generationof management zones (e.g. extraction of a topographic wet-ness index from a DTM)—in this case the data-stream analysismay be used by software providers to identify what automaticprocessing may be possible and/or required.

3.4. Data-stream for exploitation of yield data

Yield data plays a central role in many decisions in PA(Blackmore, 2000; Werner et al., 2002). Two sets of yield data

G High/middle/low expected yield zonesgenerated from multi-year yield mapping

were used during the work on the demonstration fields. Thedata-streams for the exploitation of both of these datasets areshown in Fig. 6. On the one hand, the yield maps from pre-vious years were used as a basis for the generation of manydatasets relating to planned applications (management zones,N-fertilisation, and crop protection). On the other hand, theraw yield data collected for the current year was used as thebasis for datasets evaluating aspects of the management of thefield during the year (cost-effectiveness, sustainability), oftenin combination with the historical yield. The significance ofthese data-streams are analysed further in Section 4.
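Extracting the data-stream for the exploitation of a source dataset amounts to following the data-flows ‘downstream’ from it. The sketch below is a hedged illustration of that idea: the names `FLOWS` and `downstream_products` are hypothetical, and the flows listed are an assumed subset built around the dataset names in Table 2, not the full content of Fig. 6.

```python
from collections import defaultdict, deque

# Illustrative data-flows (source dataset -> product dataset); an assumed
# subset for this example rather than the full content of Fig. 6.
FLOWS = [
    ("Historical yield", "Interpolated yield"),
    ("Interpolated yield", "Yield zones"),
    ("Yield zones", "Management zones"),
    ("Yield zones", "N-fertilisation"),
    ("Interpolated yield", "Crop protection application"),
    ("Raw yield", "Yield analysis"),
    ("Raw yield", "Cost-effectiveness"),
    ("Historical yield", "Cost-effectiveness"),
    ("Cost-effectiveness", "Sustainability"),
]


def downstream_products(flows, source):
    """Map each dataset derivable from `source` to its number of processing steps."""
    products = defaultdict(list)
    for src, dst in flows:
        products[src].append(dst)

    steps = {}
    queue = deque([(source, 0)])
    while queue:  # breadth-first walk along the data-flows
        dataset, depth = queue.popleft()
        for product in products.get(dataset, ()):
            if product not in steps and product != source:
                steps[product] = depth + 1
                queue.append((product, depth + 1))
    return steps


historical = downstream_products(FLOWS, "Historical yield")
raw = downstream_products(FLOWS, "Raw yield")
for name, depth in sorted(historical.items(), key=lambda item: item[1]):
    print(f"{depth} step(s): {name}")
```

A step count of 1 corresponds to a direct use of the source dataset, and larger counts to indirect uses via intermediate datasets.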

3.5. Data-stream for exploitation of remote-sensing data

Remote-sensing (RS) data are of interest for PA as they give a spatially high-resolution continuous dataset which may be repeatedly captured at varying intervals without having any influence on the growth of the crop (e.g. due to soil compaction). In the pre agro project the airborne AVIS-2 system, described in Mauser (2003), was used for collecting hyperspectral data at desired points in time. From this hyperspectral data a number of bio-physical parameters (photochemical reflectance index, leaf-area index, relative crop vitality and expected yield) were derived using a radiative transfer model (Begiebing et al., in press). These parameters were then used within pre agro for the generation of further datasets (Fig. 7). With this approach, similar or identical indices at varying spatial and temporal resolutions may be equally well generated using other hyper- or multispectral air- or space-borne remote sensors, and the increasing interest in low-cost unmanned aerial vehicles (e.g. Strombaugh et al., 2003; Rydberg et al., 2007) means that RS may in the future become an increasingly common source of data for PA. Fig. 7 shows which bio-physical parameters it should be possible to generate from these new RS platforms, with the data-flows ‘downstream’ from these parameters being generally applicable for all RS data sources.

4. Discussion

One area in which the modelling was found to be problematic was in the distinction between planned and as-applied application maps, e.g. for fertiliser or pesticide application. Generally these represent the end of one data-stream for generating an application map followed by the collection of a new dataset, the as-applied data, during the application, which may then be used in further data-streams. However, feedback indicated that in the majority of cases the planned application maps were used in preference to the as-applied data due to problems with the reliability of collecting the latter. Application maps were therefore treated as a single dataset, implicitly assuming that the planned application was carried out perfectly. The only datasets collected using on-board equipment during field operations are therefore the process data and the raw yield data, neither of which is directly part of the application planning.

In some areas of the model, alternative methods or input data were used to generate datasets during the work on the demonstration fields as compared with other work in the project or the standard farm practice. Where a significant improvement in the scope or detail of the model could be achieved through including data-flows not actually used on the demonstration fields then these were included in the model. Examples are in the generation of site-specific P- and K-fertiliser application maps, which were not used on the application fields where a constant P- and K-fertilisation was practised.

Due to differing availability of data on each of the two farms, different input data were in some cases used on each demonstration field. In these cases, the transition arrow is annotated with the name of the farm on which the data-flow occurred; either WIMEX or Täger-Farny. To generalise the model it would be appropriate to consider any or all of these data-flows as being valid.

The work presented here has shown the data-flows within a research project on PA. Although it is hoped that the model produced is generally applicable, it is inevitable that it reflects the specific work on the demonstration fields in the pre agro project rather than general data-flows in PA research or practice. One instance of this bias can perhaps be found in the relative use of soil data and yield data: in practice soil probe data is regarded as more useful (Fountas et al., 2003) and is more widely used than yield mapping (Reichardt and Jürgens, 2006). As Fig. 8 shows, soil data were not widely used in the pre agro work, with data from probes being used to generate two datasets. More widely used was the German RBS soil dataset which due to nationwide availability is a common data source in German PA and was directly used for generation of two datasets and indirectly a further four. This compares with the yield data which, as shown in Fig. 6 and ignoring the simple production of derived yield datasets, was used directly in the generation of nine datasets and indirectly a further three. This apparent divergence is probably the result of the availability of data and choice of methods for use on the demonstration fields; it does however illustrate well that the data-flow model presented here does not necessarily represent widespread practice.

A further aspect in which this model is specific to pre agro is in the sources and statuses of the datasets as shown in Figs. 3–8 and Table 2. Although the statuses of the datasets may be considered as being generally applicable (with the consideration that it may be possible, e.g. to directly collect datasets which were generated or pre-existing for pre agro), the sources of the datasets cannot be considered general. However, since each sub-project was concerned with a different aspect of information-intensive agriculture (as can be seen in the sub-project titles in Table 1), the different sources presented here could be used as a guide as to the separation into different organisations which may provide collection or processing of the datasets or the range of services and service providers which may be expected in a future SOA for PA. Such a separation into different ‘real-life’ organisations as sources for the datasets would require further research around commercial PA practitioners and the structure of the commercial PA ‘ecosystem’.

A useful extension to the work presented here would therefore be the continuation of the modelling based on other regions, and including both research projects and commercial farms, with an indication of whether the datasets and data-flows are those currently the subject of research or whether they are already in (widespread) commercial use. As the literature on PA however shows, there are many different methods which, depending on the circumstances, deliver credible results and which are based on many different datasets. Attempting to model every potential data-flow, and when it may or should occur, would therefore likely result in a model so large as to be unusable with many parallel data-streams. An approach based on individual case-studies (such as the one presented here) resulting in a number of isolated models which may then be compared in an attempt to produce a general model is therefore to be preferred.
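One possible way to compare isolated case-study models, sketched here under the assumption that each model is stored as a set of (source dataset, product dataset) flows, is to separate the flows shared by all models from those specific to one case study. The function `compare_models` and both example models are hypothetical; in particular the second model is invented purely for illustration.

```python
def compare_models(models):
    """Return the flows shared by all models and the flows unique to each.

    `models` maps a case-study name to an iterable of (source, product)
    flows and must contain at least one model.
    """
    flow_sets = {name: set(flows) for name, flows in models.items()}
    common = set.intersection(*flow_sets.values())
    unique = {}
    for name, flows in flow_sets.items():
        others = set().union(*(f for n, f in flow_sets.items() if n != name))
        unique[name] = flows - others
    return common, unique


# Both models are small assumed examples, not published data-flow models.
MODELS = {
    "pre agro": [
        ("DTM", "TWI"),
        ("Historical yield", "Yield zones"),
        ("Hyperspectral remote sensor data", "LAI map"),
    ],
    "hypothetical commercial farm": [
        ("DTM", "TWI"),
        ("Historical yield", "Yield zones"),
        ("Soil probes", "Clay content map"),
    ],
}

common, unique = compare_models(MODELS)
print("candidate general flows:", sorted(common))
for name, flows in unique.items():
    print(f"specific to {name}:", sorted(flows))
```

Flows found in every case study are candidates for a general model, whereas flows unique to one study would need further evidence before being generalised.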

Nevertheless, the work presented in this paper should help to achieve a better general understanding of data-flows in agricultural practice. In particular, the repeated usage of particular datasets, e.g. for the generation of seeding, fertilisation and pesticide application can be analysed. A better understanding of the usage of the various datasets which are collected during agricultural processes should help to optimise the data collection, e.g. it may be more economically viable to collect multi-purpose datasets. Additionally, the requirement to use data from external sources, e.g. regional administrations, for particular tasks can be seen. Highlighting the importance and wide range of potential usage of such datasets in agriculture may influence external agencies to make this data more easily accessible, for instance via spatial data infrastructures.

In the future, different farm types may benefit from such a data-flow model. The necessary immediate inputs for specific PA techniques are generally well-known and so with some adjustments to take account of regional and other differences, a data-flow model may be used as the basis for a faster and easier adoption of PA. An additional benefit of such a general data-flow model may be gained by software companies producing applications for PA. The model should help identify where data must be exchanged between different programs, or even different organisational units within or outside the farm, e.g. with government administrations, consultants, and contractors, and assist in the development of standardised data transfer formats and mechanisms to ensure interoperability. Particularly with an extension of the data-flow model to e.g. livestock farming, a better integration of all on-farm processes should be possible. Some aspects of the model are also relevant to farmers for whom PA is not of interest, e.g. the production of sustainability indicators may also be done on a whole-field or whole-farm basis using similar datasets to those presented here but collected at a more general level.
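The identification of exchange points can be sketched by comparing the source recorded for the two ends of each data-flow. In the minimal example below the dataset sources are taken from Table 2, whereas the function `exchange_points`, the variable names and the choice of flows are assumptions made for illustration (the flow from ‘Raw yield’ to ‘Yield analysis’ in particular is assumed).

```python
# Source of some datasets (sub-project number or 'Premis'), taken from Table 2.
SOURCE = {
    "Hyperspectral remote sensor data": "13",
    "LAI map": "13",
    "PRI map": "15",
    "DTM": "Premis",
    "TWI": "Premis",
    "Raw yield": "9",
    "Yield analysis": "3",
}

# Illustrative data-flows; the raw yield flow is an assumption for this sketch.
FLOWS = [
    ("Hyperspectral remote sensor data", "LAI map"),
    ("Hyperspectral remote sensor data", "PRI map"),
    ("DTM", "TWI"),
    ("Raw yield", "Yield analysis"),
]


def exchange_points(flows, source_of):
    """Return the flows whose input and product come from different sources."""
    return [
        (src, dst, source_of[src], source_of[dst])
        for src, dst in flows
        if source_of[src] != source_of[dst]
    ]


exchanges = exchange_points(FLOWS, SOURCE)
for src, dst, provider, consumer in exchanges:
    print(f"{src} ({provider}) -> {dst} ({consumer})")
```

Each flow reported here crosses a boundary between sources and is therefore a place where an agreed transfer format or service interface would be needed.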

5. Conclusions

In this paper we have presented a data-flow model for the work on the demonstration fields in the pre agro collaborative research project. The aim of this model is to show the relationships between datasets as a means of understanding the role of different data in PA. From the model presented here it is possible to determine the datasets which were gathered in order to generate a dataset of interest and which datasets are derived products of the original source datasets. Although the model is inevitably restricted to the processes carried out within one research project, it may be used as a base for a more generalised model of data-flows for precision farming. Such a generalised model may on the one hand improve the transparency of data management in PA, potentially aiding adoption, whilst on the other hand providing the basis for an automation of data handling using service-based distributed systems.

Acknowledgements

This work was carried out as part of pre agro, a collaborative research project funded by the German Federal Ministry of Education and Research (BMBF). The work of the authors was funded under grant numbers 0330663 and 0339740/2. The authors would like to take this opportunity to thank all colleagues from the pre agro project who provided feedback which has assisted in the development of the model presented in this paper.

References

Begiebing, S., Bach, H., Bobert, J., Wehrhan, M., Mauser, W., in press. Observation of wheat growth based on hyperspectral data and a canopy reflectance model. Precision Agriculture.

Blackmore, S., 2000. The interpretation of trends from multiple yield maps. Computers and Electronics in Agriculture 26 (1), 37–51.

Booch, G., Rumbaugh, J., Jacobson, I., 1999. The Unified Modelling Language User Guide. Addison-Wesley, ISBN 0-201-57168-4.

Carberry, P.S., Hochman, Z., McCown, R.L., Dalgliesh, N.P., Foale, M.A., Poulton, P.L., Hargreaves, J.N.G., Hargreaves, D.M.G., Cawthray, S., Hillcoat, N., Robertson, M.J., 2002. The FARMSCAPE approach to decision support: farmers’, advisers’, researchers’ monitoring, simulation, communication and performance evaluation. Agricultural Systems 74 (1), 141–177.

Christensen, O., Wagner, B., Hülsbergen, K.-J., 2008. REPRO, http://www.oelb.uni-halle.de/repro/.

Fountas, S., Ess, D.R., Sorensen, C.G., Hawkins, S.E., Pedersen, H.H., Blackmore, B.S., Lowenberg-Deboer, J., 2003. Information sources in precision agriculture in Denmark and the USA. In: Stafford, J., Werner, A. (Eds.), Precision Agriculture, Proceedings of the 4th ECPA. Berlin, Germany, June 15–19, 2003. Wageningen Academic Publishers, The Netherlands, pp. 211–216, ISBN 9076998213.

Fountas, S., Wulfsohn, D., Blackmore, B.S., Jacobsen, H.L., Pedersen, S.M., 2006. A model of decision-making and information flows for information-intensive agriculture. Agricultural Systems 87 (2), 192–210.

Korduan, P., 2004. Metainformationssysteme für Precision Agriculture. Doctoral Thesis completed at the Faculty of Agricultural and Environmental Science, Rostock University, Internal report 17. ISBN 3-66009-282-0.

KTBL (Kuratorium für Technik und Bauwesen in der Landwirtschaft), 2008. agroXML, http://www.agroxml.de.

Matthews, K.B., Schwarz, G., Buchan, K., Rivington, M., Miller, D., 2008. Wither agricultural DSS? Computers and Electronics in Agriculture 61 (2), 149–159.

Mauser, W., 2003. The airborne visible/infrared imaging spectrometer AVIS-2—multiangular und hyperspectral data for environmental analysis. In: Proceedings of IGARSS 2003, Toulouse, France, July 21–25, 2003, pp. 2020–2022.

McCown, R.L., 2002. Changing systems for supporting farmers’ decisions: problems, paradigms and prospects. Agricultural Systems 74 (1), 179–220.

Nash, E., Bobert, J., Wenkel, K.-O., Mirschel, W., Wieland, R., 2007a. Geocomputing made simple: service-chain based automated geoprocessing for precision agriculture. In: Demsar, U. (Ed.), Proceedings of GeoComputation 2007. Maynooth, Ireland. National University of Ireland, Maynooth.

Nash, E., Korduan, P., Bill, R., 2007b. Optimising data flows in precision agriculture using open geospatial web services. In: Stafford, J.V. (Ed.), Precision agriculture’07, Proceedings of the 6th ECPA. Skiathos, Greece, June 3–6, 2007. Wageningen Academic Publishers, The Netherlands, pp. 753–759, ISBN 9789086860241.

Reichardt, M., Jürgens, C., 2006. The farmers view on the usability of precision farming in Germany—results of a multitemporal survey. In: Agricultural Engineering for a Better World: Proceedings of XVI CIGR World Congress, VDI Verlag GmbH, Düsseldorf, Germany, ISBN 3-18-091958-2.

Reichardt, M., Jürgens, C., in press. Adoption and future perspectives of precision farming in Germany: results of several surveys among different agricultural target groups. Precision Agriculture.

Rubin, J., 1967. Optimal classification into groups: an approach for solving the taxonomy problem. Journal of Theoretical Biology 15 (1), 103–144.

Rydberg, A., Söderström, M., Hagner, O., Börjesson, T., 2007. Field specific overview of crops using UAV (Unmanned Aerial Vehicle). In: Stafford, J.V. (Ed.), Precision agriculture’07, Proceedings of the 6th ECPA. Skiathos, Greece, June 3–6, 2007. Wageningen Academic Publishers, The Netherlands, pp. 357–364, ISBN 9789086860241.

Schaffner, A., Hövelmann, L., in press. Der DLG-Nachhaltigkeitsstandard ‘nachhaltige Landwirtschaft—zukunftsfähig’. DBU-Schriftenreihe Initiativen zum Naturschutz. Deutsche Bundesstiftung Umweltschutz, Osnabrück, Germany.

Schaffner, A., Hövelmann, L., Reinicke, F., Christen, O., 2008. Produktionsinformationen aus Precision Farming als Basis für die Nachhaltigkeitszertifizierung landwirtschaftlicher Betriebe. Landtechnik 1/2008, 58.

SparxSystems, 2007. Enterprise Architect 7.0, http://www.sparxsystems.com.au.

Strombaugh, T., Simpson, A., Jacobs, J., Mueller, T., 2003. A low cost platform for obtaining remote sensed imagery. In: Stafford, J., Werner, A. (Eds.), Precision Agriculture, Proceedings of the 4th ECPA. Berlin, Germany, June 15–19, 2003. Wageningen Academic Publishers, The Netherlands, pp. 665–670, ISBN 9076998213.

Vogel, K., Blumenrath, S., Lipski, A., 2008. Beurteilung der Programmfunktionen durch potenzielle Anwender. In: von Haaren, C., Hülsbergen, K.-J., Hachmann, R. (Eds.), Naturschutz im landwirtschaftlichen Betriebsmanagement: EDV-Systeme zur Unterstützung der Erfassung, Bewertung und Konzeption von Naturschutzleistungen landwirtschaftlicher Betriebe. Ibidem-Verlag, Stuttgart, Germany, pp. 219–228 (ISBN 3898218767).

Werner, A., Kettner, E., Pauly, J., Reining, E., Roth, R., Kühn, J., Bobert, J., Schmidhalter, U., Hufnagel, J., 2002. Yield potentials of sub-units within fields as a key input for crop management in precision agriculture. In: Werner, A., Jarfe, A. (Eds.), Precision Agriculture: Herausforderung an integrative Forschung, Entwicklung und Anwendung in der Praxis. KTBL, Darmstadt, Germany, pp. 197–200, ISBN 3980827909.