datacamp user guide

Upload: stefan-urbanek

Post on 30-May-2018


  • 8/14/2019 Datacamp User Guide

    1/41

    [email protected]

    www.knowerce.sk

    Datacamp Users and Administrators Guide

    January 2010

    knowerce|consulting


    Document information

    Creator Knowerce, s.r.o.

    [email protected]

    www.knowerce.sk

    Author Stefan Urbanek, [email protected]

    Date of creation 2009-11-30

    Document revision 1.2

    Document Restrictions

    Copyright (C) 2009 Knowerce, s.r.o., Stefan Urbanek

    This document is distributed under Creative Commons License: Attribution-Noncommercial-Share Alike 3.0

    http://creativecommons.org/licenses/by-nc-sa/3.0/

    Date        Revision  Author          Changes
    2010-02-02  1.2       Stefan Urbanek  added record status, appendices (search engine, external sources)
    2010-01-15  1.1       Stefan Urbanek  added import
    2009-11-30  1.0       Stefan Urbanek  document created


    Contents

    Introduction
    Project Page and Sources
        Support
    Data Users
    Main Screen
        Session bar
    Data Browsing
        Data Catalogue
        Dataset Display
        Record Details
        Adding to Favourites
        Sharing
    Searching
        Global Search
        Dataset Advanced Search
    User's Profile
        Profile
        Favourites
        Comments
    Application Programming Interface
        API key
        Requests
        Errors
        API Command Line Tool
    Data Management
    Datasets and Records
        Record Status
    Record Management
        Create Record
        Edit Record
        Record Status
    Import Data
        File Selection
        Field Mapping
    Administration
    Dataset Management
        Dataset Categories
        Create Dataset
        Inspecting and Edit Dataset Description
        Create Dataset Field
        Edit Dataset Field
        Derived Fields
        Data Format
    User Management
        Create User
        Edit User
        Roles and Rights
    Appendices
    A. Dataset Implementation
        Datasets
        Fields
        Summary
    B. External Sources and ETL
    C. Search Engine
        Predicates
    D. API Shell Script


    Introduction

    Datacamp is a Web application for publishing, searching and managing data in the form of datasets. Each chapter of this guide presents major features and examples of how to use the application. The guide is split into three sections: a guide for data users, a guide for data managers and a guide for application administrators.


    Project Page and Sources

    The project page with sources can be found at:

    http://github.com/Stiivi/datacamp

    Wiki Documentation:

    http://wiki.github.com/Stiivi/datacamp/

    Related project Datacamp-ETL:

    http://github.com/Stiivi/Datacamp-ETL

    Support

    General Discussion Mailing List

    http://groups.google.com/group/datacamp

    Development Mailing List (recommended for Datacamp-ETL project):

    http://groups.google.com/group/datacamp-dev


    Data Users

    This section describes how to browse data, search for data and discuss data.


    Main Screen

    (a) main menu: navigation through the application

    (b) session bar: log-in, log-out, user related pages

    (c) search field: global database search

    (d) data catalogue: list of available dataset categories

    (e) information pages: information about the service, provider and other related information

    What you can do here:

    search all datasets using the search field (c); please refer to the chapter about Searching to learn more

    view datasets in a desired category through (d)

    log in, log out or change user's preferences in (b)

    switch application language in (b)

    Session bar

    (a) current user's name

    (b) user's account: preferences, favourites, comments

    (c) log-out

    (d) language switcher


    Data Browsing

    Data Catalogue

    When you open the Data Catalogue you will see a list of all published datasets. The datasets are grouped by dataset category.

    (a) dataset category

    (b) list of datasets and their descriptions

    What you can do here:

    view dataset by clicking on the dataset name or by pressing the button


    Dataset Display

    On the dataset page you see:


    (a) dataset name and description (if present)

    (b) dataset menu bar

    (c) pages with record listings

    (d) dataset table with sortable columns

    What you can do here:

    view a record by pressing the button

    browse dataset data by flipping through pages

    sort data by a given column by clicking on the column name

    add the dataset to your favourites by clicking on the button

    get more information about the dataset, such as data provider, update frequency, data sources

    add comments to the current dataset

    perform advanced search (more information in the chapter about searching)

    Record Details

    What you can do here:

    discuss the record by adding a new comment or replying to another comment

    add record to favourites

    Adding to Favourites

    You can add any record or dataset to the list of your favourites by pressing the add to favourites button under the dataset or record description:

    You will be asked to add a short note to the record or dataset you are about to add to your favourites.

    Sharing

    Any dataset or record can be shared by clicking on the Share link under a dataset or record description:


    Searching

    There are two ways to search for data: global search and advanced dataset search. The global search looks for the given search query in all datasets and fields. Advanced dataset search allows you to specify search criteria more precisely, but you are limited to one dataset only.

    Global Search

    To start searching through all datasets you can use either the front page search:

    or you can type your query into the search field that is present on the right side of the menu bar all the time:

    Search Query

    The database is searched for all words or expressions you type in the search field. Search examples:

    john smith             searches for both words john and smith
    public television      searches for two words: public and television; matches all records containing both words in any of the fields, the words might be separated
    "public television"    searches for the whole phrase; matches only records that contain the exact phrase as a part of a field

    To exclude a word from the search query, add a minus sign in front of the word or phrase:

    john -smith            searches for records which contain john, but not smith

    Pattern matching

    To match partial words, such as prefixes or suffixes, use the asterisk (*) symbol to denote the missing part of a word:

    *tech    matches all fields that end with tech, such as microtech or macrotech, but not technology
    tech*    matches all fields that start with tech, such as technology, but not microtech or macrotech
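    The wildcard rules above can be illustrated with a short Python sketch. This is not the Datacamp search engine; it only mimics the described behaviour with the standard fnmatch module, and the helper name field_matches is hypothetical. Note that the pattern is tested against the whole field value, not against single words in it.

```python
from fnmatch import fnmatchcase


def field_matches(pattern: str, field_value: str) -> bool:
    """Return True if the wildcard pattern matches the whole field value.

    Assumption: only the asterisk is used as a wildcard, as in the guide.
    fnmatch would also interpret '?' and '[...]' in a pattern.
    """
    return fnmatchcase(field_value, pattern)


if __name__ == "__main__":
    for value in ("microtech", "macrotech", "technology", "microtech ltd"):
        print(value, field_matches("*tech", value), field_matches("tech*", value))
```

    The last value, "microtech ltd", matches neither pattern, because the field as a whole neither starts nor ends with tech.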

    Advanced Query

    Advanced users might want to refine their search by using advanced queries:

    dataset:procurements    search only in datasets containing the word procurements in their name
    -dataset:donations      exclude datasets that contain the word donations in their name
    field:name              search only in fields that have the word name in their title or identifier
    -field:city             exclude fields that have the word city in their title or identifier
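    To make the query syntax concrete, here is a minimal Python sketch that splits a query into plain words, dataset: and field: predicates, and exclusions. It is an illustration only, not the parser used by Datacamp; the names Term and parse_query are hypothetical, and treating a quoted phrase as one term is an assumption.

```python
import shlex
from dataclasses import dataclass


@dataclass
class Term:
    kind: str      # "word", "dataset" or "field"
    value: str     # searched word, phrase or predicate argument
    exclude: bool  # True when the term was prefixed with a minus sign


def parse_query(query: str) -> list:
    """Split a search query into terms (hypothetical helper)."""
    terms = []
    # shlex keeps a quoted phrase together as a single token
    for token in shlex.split(query):
        exclude = token.startswith("-")
        if exclude:
            token = token[1:]
        prefix, separator, rest = token.partition(":")
        if separator and prefix in ("dataset", "field"):
            terms.append(Term(prefix, rest, exclude))
        else:
            terms.append(Term("word", token, exclude))
    return terms


if __name__ == "__main__":
    for term in parse_query("john -smith dataset:procurements -field:city"):
        print(term)
```

    shlex.split raises ValueError on unbalanced quotes, so a real parser would need its own handling for queries containing a stray quote or apostrophe.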

    Known Current Search Limitations

    The search engine has some limitations that are planned to be removed in the future:

    search is performed at field granularity, not word granularity. That means that if you search for a part of a word, such as its beginning or end, it is matched against the whole field, not against a word in that field

    field titles are currently ignored, only identifiers are used

    Dataset Advanced Search

    To search a dataset with more precise criteria, open the dataset you want to search and click on the search tab. You will get a screen where you specify conditions for searching:

    (a) searched field

    (b) operator

    (c) value

    (d) add/remove condition

    (e) start search

    First select the field you want to search in. Then select an operator; for example, for text fields the options are:

    You can add more conditions by pressing the add button. To remove a condition, press the remove button.


    User's Profile

    Profile

    The profile tab is used to change basic information about the user, change display name, email

    address or password.

    Favourites

    You can browse records you have marked as favourites in the Favourites tab.

    To display a favourite, click on the record reference. To show the dataset that contains the record, click on the dataset name. To delete the favourite, click on the trash can icon.

    Comments

    You can see all comments you have made on records or datasets. If you view the profile of another user, you see all comments that user has made.


    Application Programming Interface

    Datacamp has an Application Programming Interface (API) for accessing raw data and metadata. An API request has the form:

    DATACAMP_BASE_URL/api/method?api_key=key&other_arguments

    For example:

    http://my-datacamp.org/api/dataset_description?api_key=abc123&dataset_id=1
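    As a sketch of how a client could assemble such a request, the following Python function builds the URL from its parts. The function name build_api_url is hypothetical; the base URL, method name, key and argument are taken from the example above.

```python
from urllib.parse import urlencode


def build_api_url(base_url: str, method: str, api_key: str, **arguments) -> str:
    """Build DATACAMP_BASE_URL/api/method?api_key=key&other_arguments."""
    # api_key goes first, other arguments follow in the order they were given
    query = urlencode({"api_key": api_key, **arguments})
    return f"{base_url.rstrip('/')}/api/{method}?{query}"


if __name__ == "__main__":
    print(build_api_url("http://my-datacamp.org", "dataset_description",
                        "abc123", dataset_id=1))
```

    urlencode also escapes characters that are not safe in a query string, which matters if an argument value contains spaces or ampersands.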

    API key

    You get your API key in the web application: go to your profile (top-right corner of the page) and select the API tab. There you will find your API key. If you think that your API key is being abused by someone else, you can generate another key.

    Requests

    Request              Arguments              Format            Description
    version              none                   text or Ruby XML  return API version number
    datasets             none                   Ruby XML          return list of all datasets
    dataset_description  dataset_id             Ruby XML          dataset information, list of dataset fields and their properties
    dataset_dump         dataset_id             CSV               raw dataset table dump (with system metadata)
    record               dataset_id, record_id  Ruby XML          return a record
    dataset_records      dataset_id             CSV               dataset records

    Ruby XML means Ruby object serialized to XML by object.to_xml

    Note that all requests are restricted by API key. That means that only objects and their fields that can

    be accessed with given key are returned.

    Errors

    If an error occurs during a request, an error reply is returned in XML format:


    Code                    HTTP status  Description
    internal_inconsistency  500          Something went wrong in the application internally. Development team should be contacted
    unknown_request         400          Unknown or not implemented request
    invalid_argument        400          Wrong number, format or value of arguments
    object_not_found*       404          Requested object referenced by id (record, dataset, ...) was not found
    access_denied           401          Invalid API key or key owner has no access to requested method or object

    * The object_not_found error is returned only when a concrete id of an object is expected and the object with the provided id does not exist in the database. There is no error reply when one is searching for an object using search criteria and no object was found: the searching operation succeeded and found nothing.
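    A client might keep the documented error codes in a lookup table. The sketch below only restates the codes and HTTP statuses listed above; the helper name describe_error is hypothetical, and the layout of the XML error reply itself is not assumed here.

```python
# Error codes and HTTP statuses as listed in the guide
ERROR_HTTP_STATUS = {
    "internal_inconsistency": 500,
    "unknown_request": 400,
    "invalid_argument": 400,
    "object_not_found": 404,
    "access_denied": 401,
}


def describe_error(code: str) -> str:
    """Return a short, human readable summary for an error code."""
    status = ERROR_HTTP_STATUS.get(code)
    if status is None:
        return f"{code}: not a documented Datacamp error code"
    side = "server" if status >= 500 else "client"
    return f"{code}: HTTP {status} ({side}-side error)"


if __name__ == "__main__":
    print(describe_error("object_not_found"))
    print(describe_error("internal_inconsistency"))
```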

    API Command Line Tool

    The Datacamp application comes with a command line tool, datacamp, located in the tools/ directory of the Datacamp sources. The source code of the script is also listed in the appendix API Shell Script.

    Usage:

    datacamp [-h] [OPTIONS] REQUEST [ARGUMENTS]

    Send REQUEST to a Datacamp application and return server reply.

    Options:

    -b url specify base URL for Datacamp. Default: http://localhost:3000

    -k api_key specify API key for accessing Datacamp data

    -f format request different format, if available. Options are: xml

    -g get_method method of accessing the datacamp: curl (default), wget

    Environment variables:

    DATACAMP_BASE_URL

    DATACAMP_API_KEY

    DATACAMP_FORMAT

    DATACAMP_GET_METHOD

    Command line options override environment variables.

    Example:

    export DATACAMP_API_KEY=my_api_key
    export DATACAMP_BASE_URL=http://my-datacamp.org
    datacamp version
    datacamp datasets
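    The precedence rule (command line options override environment variables, which override defaults) can be sketched in Python as follows. This is not the code of the datacamp script; resolve_setting is a hypothetical helper, and only the default base URL is taken from the option list above.

```python
import os

DEFAULT_BASE_URL = "http://localhost:3000"  # default documented for the -b option


def resolve_setting(cli_value, env_name, default=None, environ=None):
    """Pick a setting: command line value, then environment variable, then default."""
    environ = os.environ if environ is None else environ
    if cli_value is not None:
        return cli_value
    return environ.get(env_name, default)


if __name__ == "__main__":
    # No -b option given: DATACAMP_BASE_URL is used if set, otherwise the default
    print(resolve_setting(None, "DATACAMP_BASE_URL", DEFAULT_BASE_URL))
```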


    Data Management

    This section is about creating and editing records and importing data from a file.


    Datasets and Records

    Datacamp is based on datasets and records. Unlike in other data publishing applications, the datasets are not closed sets with a finite number of records; they are rather refillable containers for data of similar nature, structure and meaning. Records are not static: once they are put into datasets, they might be updated or corrected. This makes it possible to start with datasets containing missing, incomplete or wrong information and to subsequently improve the quality of the data.

    Record Status

    Records are live entities that might change over time. Datacamp has tools to manage the status of records, which can be compared to the status of a service customer or the status of an article in a content management system. There are five record statuses:

    Status                         Visibility                                       Description
    loaded                         data managers                                    record was loaded to datastore by ETL process. ETL process uses this status to know which records were actually imported to be able to do additional finalisation of new records. This status should never be seen in the application.
    new                            data managers                                    manually created record or record created by CSV import. Record requires review before publishing.
    active (published)             anyone                                           record is live, viewable and searchable by anyone
    suspended (hidden)             data managers                                    there are some issues with the records that might confuse potential viewers or there are other reasons for not publishing the records, such as quality issues or trust of the source
    deleted (closed)               data managers, requires explicit filter to list  record has no further use in the database, either because of redundancy or relevancy; or it might be obsolete
    destroyed (not actual status)  no one                                           pseudo-status. The record does not exist in database any more and all references and dependencies to this record were removed

    Records should be persistent and should be destroyed (deleted from the database) only when really necessary. Reasons for destroying a record might be failed loading, multiple import of the same file or another serious problem.
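    The visibility rules from the status table can be summarised in a small Python sketch. It is an illustration of the table, not Datacamp code; the function name is_visible and its parameters are hypothetical.

```python
def is_visible(status: str, is_data_manager: bool = False,
               include_deleted: bool = False) -> bool:
    """Tell whether a record with the given status is listed for a viewer."""
    if status == "active":
        return True                      # published records are visible to anyone
    if not is_data_manager:
        return False                     # everything else is for data managers only
    if status in ("loaded", "new", "suspended"):
        return True
    if status == "deleted":
        return include_deleted           # requires an explicit filter to list
    return False                         # e.g. "destroyed": the record no longer exists


if __name__ == "__main__":
    print(is_visible("active"))
    print(is_visible("suspended", is_data_manager=True))
    print(is_visible("deleted", is_data_manager=True))
```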


    The following diagram shows record statuses and possible transitions between the statuses:

    [Diagram: record statuses and transitions. Staging record status: loaded. Dataset record statuses: new, active (published), suspended (hidden), deleted (closed). Transitions: ETL import, CSV import, ETL finalize, check/publish, suspend, publish, delete, undelete, destroy.]


    Record Management

    Create Record

    To create a record:

    1. open the dataset you want to add a record to

    2. click on create record at the top of the dataset display:

    3. fill in the form you are presented with

    4. when done, press the save button

    Edit Record

    To edit a record you have to:

    1. find and open the record you want to edit

    2. click on Edit in the top right corner of the record display:

    3. change record fields

    4. press save button when done

    Record Status

    Each record has a publishing status which can be one of these:

    new

    published (active)

    hidden (suspended)

    deleted (closed)

    Records should be deleted from the database only in exceptional cases, such as:

    incorrectly imported

    redundantly imported

    un-intentional redundancy


    Status Overview

    Status              Visible to                                Description
    new                 data managers                             record was just created, either manually, by importing from a file or by a background loading process
    published (active)  anyone                                    viewable by anyone
    hidden (suspended)  data managers                             records that are not intended to be published because of quality issues, controversy, redundancy, uncertainty or any other reason
    deleted (closed)    data managers, when explicitly requested  records removed from the database


    Import Data

    To import data:

    1. open data dictionary

    2. click on import in the menu bar:

    The import is done in three steps:

    1. file selection and specification

    2. field mapping preview

    3. actual processing of imported file

    File Selection

    1. Select the dataset description to which you want to import new data:

    2. Select the file to be imported:

    3. You can optionally specify the title of the file and its source, for the record

    4. Choose a file template: currently there are two templates available, plain CSV and CSV with more header rows; for example, one header row might contain human readable column titles and the other field identifiers for automatic field matching

    5. You can optionally specify the format of the CSV file: separator of columns, number of header lines
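    The two format settings, column separator and number of header lines, can be illustrated with Python's csv module. This is a sketch of the idea, not the Datacamp importer; the function read_csv_rows and the sample data are hypothetical.

```python
import csv
import io


def read_csv_rows(text: str, separator: str = ",", header_lines: int = 1):
    """Split CSV text into header rows and data rows."""
    rows = list(csv.reader(io.StringIO(text), delimiter=separator))
    return rows[:header_lines], rows[header_lines:]


if __name__ == "__main__":
    # Hypothetical file with two header rows: titles, then field identifiers
    sample = "Name;City\nname;city\nJohn;Bratislava\n"
    headers, data = read_csv_rows(sample, separator=";", header_lines=2)
    print(headers)
    print(data)
```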

    Field Mapping

    After confirming the file you want to import, mapping screen will be displayed:



    (a) change settings of import: specify another dataset, change file format

    (b) guess field mapping from file headers (see below)

    (c) revert to predefined column mapping

    Columns in the file are matched to the dataset based on settings specified in Dataset Description Import Settings. You can override the mappings by choosing dataset fields from the field lists.

    If you want to get the field mapping from the file, press the guess button (b). This action will try to match the file header to dataset field identifiers.

    Press the revert button (c) if you have messed up the mappings and want to revert to the predefined dataset mapping.

    When you are satisfied with the file to dataset mapping, confirm the import to load the records from the file into the specified dataset.
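    Guessing the mapping from the file header can be pictured as below. The exact matching rules used by Datacamp are not described in this guide, so the sketch assumes a simple case-insensitive comparison with surrounding whitespace ignored; guess_field_mapping is a hypothetical name.

```python
def guess_field_mapping(header, field_identifiers):
    """Map each file column to a dataset field identifier, or None if unmatched."""
    known = {identifier.lower(): identifier for identifier in field_identifiers}
    return [known.get(column.strip().lower()) for column in header]


if __name__ == "__main__":
    # "phone" has no matching dataset field, so it stays unmapped
    print(guess_field_mapping(["name", " City ", "phone"], ["name", "city"]))
```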

    Technical note: all imports are stored in the database, even though there is no user interface for them. All records refer to the batch they come from, therefore you can identify which records came from which file.


    Administration

    This section is for application administrators and is about creating new datasets and fields, managing users, and assigning user rights and roles.


    Dataset Management

    Datasets are managed in the Data Dictionary section:

    On this page you see descriptions for all datasets in the database.


    (a) data dictionary actions

    (b) dataset category

    (c) list of datasets

    (d) remove dataset button

    Dataset Categories

    To create a category, press the corresponding button in the data dictionary actions. You will be asked for a title for the new category:

    Type the category title and press the submit button.


    To edit a category name or remove a category hover mouse over category name to show category

    actions and press the desired action:

    Note: If you remove category, the datasets will not be removed. They will become uncategorised.

    Create Dataset

    To create a new dataset, go to the data dictionary and choose New Dataset from the menu:

    A dataset creation form will be opened:

    (a) language for dataset description.

    (b) title of the dataset. This field is required for the first language in the language list.

    (c) dataset identifier in the database (see technical note below)

    (d) choose whether the dataset will be published or hidden

    (e) submit creation form and create dataset


    Inspecting and Edit Dataset Description

    To inspect or edit dataset:

    1. open data dictionary

    2. click on a dataset you want to edit

    You might also edit dataset description when you have dataset open:

    Dataset description page looks like this:

    (a) dataset title and description

    (b) tabs containing parts of dataset information

    (c) list of dataset fields

    (d) dataset status


    Field Descriptions

    This tab contains list of all fields in the dataset. For more information, please read section about field

    descriptions.

    Information

    This tab shows basic information about the dataset:

    Note: The identifier can be changed only by superuser.

    Create Dataset Field

    To create a new field:

    1. Open a dataset in data dictionary

    2. Select field descriptions tab:

    3. press Add field description button at the bottom of the page

    You will get New field description page:


    category: field grouping category (optional)

    title: required

    description: description of the dataset field

    identifier: how the field will be referenced in the datastore (see technical note below)

    data type: type of data contained in the field. Current possibilities are: string, text, decimal (floating point number), integer and date

    derived and derived value: whether the field is represented by existing data or derived through a formula or a transformation. See section about derived fields for more information.

    data format: how the data is displayed. See section about data format for more information.


    Edit Dataset Field

    To edit a dataset field:

    1. Open the dataset

    2. Click on the Field descriptions tab:

    3. Click on the field's name

    Derived Fields

    You might have fields that are derived from other fields; for example, you might combine name and surname into one field. To create a derived field, check the Derived checkbox and write a derive expression:

    The derive expression is an SQL expression. At this moment it depends on the SQL server used for storing the datasets. You can use other fields' identifiers in the expression. Derived fields cannot be used to derive other fields at the moment.

    Example: Create a derived field named Full Name and enter the derive expression:

    CONCAT(name, ' ', surname)
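    Because the expression depends on the SQL server, the same derived field may have to be written differently on another database. The Python sketch below evaluates an equivalent expression on an in-memory SQLite database, where string concatenation is written with the || operator. The table ds_people, its data and the helper derived_values are hypothetical; this is not how Datacamp itself evaluates derived fields.

```python
import sqlite3


def derived_values(connection, table: str, expression: str):
    """Evaluate a derive expression for every row of a dataset table.

    The expression is inserted into the SQL text as is, so it must come
    from a trusted administrator, never from untrusted input.
    """
    cursor = connection.execute(f"SELECT {expression} AS derived FROM {table}")
    return [row[0] for row in cursor.fetchall()]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE ds_people (name TEXT, surname TEXT)")
    connection.execute("INSERT INTO ds_people VALUES ('John', 'Smith')")
    # SQLite spelling of the Full Name example
    print(derived_values(connection, "ds_people", "name || ' ' || surname"))
```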

    Data Format

    You can specify how the data are formatted in the application. This functionality is very similar to the functionality in spreadsheet applications. The data format does not affect the actual stored data, only their presentation to the user.


    Available data formats

    Default
        Description: No formatting is done. The data is displayed as provided by the database.
        Argument: none
        Example: 123456,789 is displayed as 123456,789

    Number
        Description: Number with localised thousand and decimal separators
        Argument: none
        Example: 123456,789 is displayed as 123 456,79

    Currency
        Description: Number with localised thousand and decimal separators
        Argument: currency symbol
        Example: 123456,789 is displayed as 123 456,79 Sk

    Percentage
        Description: Value is treated as percentage ratio, where 1 is 100%. Number is displayed with localised thousand and decimal separators
        Argument: none
        Example: 0,5 is displayed as 50%

    Size in bytes
        Description: Value is converted to human readable size in bytes with number order adjusted.
        Argument: none
        Example: 12345678 is displayed as 11,77 mb

    Date
        Description: Localized date format
        Argument: none
        Example: 2009-10-15 is displayed as 15.10.2009

    Text
        Description: Force no formatting.
        Argument: none
        Example: text is displayed as text

    URL
        Description: Create clickable URL link from data
        Argument: none

    e-mail
        Description: Create mailto URL link from the email-address.
        Argument: none

    Flag
        Description: Convert boolean value into words. By default: yes, no for english locale. Argument contains comma separated values for true and false, for example: contains, does not contain
        Argument: comma separated values for true and false
        Example: 1 is displayed as yes
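    A few of these formats can be sketched in Python to show what the conversion amounts to. This is an illustration, not Datacamp code: the function names are hypothetical, the separators (space for thousands, comma for decimals) are hard-coded to mimic the examples above rather than taken from a locale, and the unit names are an assumption.

```python
def format_number(value: float, decimals: int = 2) -> str:
    """Format with a space as thousand separator and a comma as decimal separator."""
    text = f"{value:,.{decimals}f}"          # e.g. "123,456.79"
    return text.replace(",", " ").replace(".", ",")


def format_currency(value: float, symbol: str) -> str:
    """Number format followed by the currency symbol given as argument."""
    return f"{format_number(value)} {symbol}"


def format_percentage(value: float) -> str:
    """Treat the value as a ratio, where 1 is 100%."""
    return f"{value * 100:.0f}%"


def format_size(value: int) -> str:
    """Convert a size in bytes to a human readable size."""
    size = float(value)
    for unit in ("B", "kB", "MB", "GB"):
        if size < 1024 or unit == "GB":
            break
        size /= 1024
    return f"{format_number(size)} {unit}"


def format_flag(value, argument: str = "yes,no") -> str:
    """Convert a boolean value into words given as 'true word,false word'."""
    true_word, false_word = (word.strip() for word in argument.split(","))
    return true_word if value else false_word


if __name__ == "__main__":
    print(format_number(123456.789))
    print(format_currency(123456.789, "Sk"))
    print(format_percentage(0.5))
    print(format_size(12345678))
    print(format_flag(1))
```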


    User Management

    Users are managed through Settings > Users:

    Create User

    1. Open user management page

    2. Click on New User button

    3. Fill-out form:

    4. Set-up user roles and rights (see section about Roles and Rights)

    5. Confirm new user data


    Edit User

    1. open User Management page

    2. click on a user

    3. change user properties

    4. submit changes

    Roles and Rights

    Some parts of the application require particular rights to access them or to perform certain actions. Each user can be assigned a number of rights.

    There are many rights in the application. To simplify right assignment, some rights are grouped into roles, and each user can be assigned any number of roles. Roles are preferred to assigning separate rights. Administrators should assign roles and then fine-tune them by assigning additional rights.

    The Roles and Rights Page:


    List of Rights and Roles

    The application uses the following user rights:

    Roles: Data Editor, Datastore Manager, Power User, User Manager

    Right Category | Right

    Records | Edit record

    Records | Create record

    Records | Edit record metadata

    Records | Edit locked record

    Records | Import from file

    Data store | Edit dataset description

    Data store | Create dataset

    Data store | Destroy dataset

    Users | Manage users

    Users | Block users

    Users | Grant rights

    Information | View hidden fields

    Information | View hidden records

    Information | Search in hidden fields

    Information | Search in hidden records

    System | Use features under development

    Super-user

    There is a special kind of user called the super-user. A super-user does not need any rights or roles assigned and is allowed to do and access anything in the application. Actions available only to super-users:

    make another user a super-user

    change dataset or field identifier

    destroy record


    Appendices

    Additional information, such as technical notes and concepts.


    A. Dataset Implementation

    Datasets

    Currently, datasets are created as tables in a relational database, in the datastore schema. The table name is constructed from the prefix ds_ and the dataset identifier. For example, a dataset with the identifier public_procurements has the table name ds_public_procurements.
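    The naming rule can be written as a one-line Bash helper. The function name is hypothetical and the sketch only restates the current convention, which may change:

```bash
#!/bin/bash
# Hypothetical helper: build the table name from a dataset identifier
# by adding the ds_ prefix described above.
dataset_table_name() {
    echo "ds_$1"
}

dataset_table_name public_procurements   # prints ds_public_procurements
```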

    Metadata for dataset records is stored in the same table as the records. This might change in the future.

    Fields

    Fields are implemented as relational table columns.

    Summary

    Object/Concept | Implementation | Reference

    data store | database schema | database name

    dataset | table | table name is the prefixed dataset identifier

    field | table column | table column is the same as the dataset field

    record | table row | dataset unique record id number (in the _record_id column)

    record metadata | dataset table columns | metadata identifier in the same dataset table and same row as the record

    Important note: The implementation of datasets might change in the future, so do not rely on this structure. Use the Datastore API for all dataset and record operations.


    B. External Sources and ETL

    Data from external sources is loaded into the application using Datacamp ETL. With the current database implementation of datasets, the process looks as depicted in the following diagram:

    See Datacamp ETL project for more information:

    http://github.com/Stiivi/Datacamp-ETL

    [Diagram: ETL process. Stores: External Sources (foreign database, web), Staging Data Store (temporary files, temporary tables, tables from external sources, result table), Dataset Store (dataset table with metadata, data dictionary and system compliant). Steps: Extraction, Transformation, Loading.]


    C. Search Engine

    Datacamp contains a simple predicate-based search engine. Each search uses a query composed of predicates.

    The search engine is a separate module, so it can be replaced as needed, either by a more sophisticated engine or by an engine that is part of another kind of datastore.

    Predicates

    Data Type | Allowed Predicates

    integer | greater, less, greater or equal, less or equal, equal, not equal

    date | within last days, within last weeks, within last months, greater, less, greater or equal, less or equal, equal, not equal

    string | contains, begins with, ends with, does not contain, matches

    any | is set, is not set
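    The predicate table can be expressed as a small Bash lookup, for example to validate a query before sending it. This is a sketch only: the function name is hypothetical, it is not part of the search engine, and it assumes that the "any" predicates apply to every data type.

```bash
#!/bin/bash
# Hypothetical lookup: print the predicates allowed for a data type,
# following the table above. Not part of Datacamp itself.
allowed_predicates() {
    case "$1" in
    integer)
        echo "greater, less, greater or equal, less or equal, equal, not equal"
        ;;
    date)
        echo "within last days, within last weeks, within last months, greater, less, greater or equal, less or equal, equal, not equal"
        ;;
    string)
        echo "contains, begins with, ends with, does not contain, matches"
        ;;
    *)
        echo "ERROR: Unknown data type $1" >&2
        return 1
        ;;
    esac
    # Assumption: predicates listed for "any" are valid for all types.
    echo "is set, is not set"
}

allowed_predicates string
```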


    D. API Shell Script

    Full listing of the API shell script (Bash source):

    #!/bin/bash
    #
    # Datacamp API Tool
    #
    # Type: datacamp -h for more information

    DATACAMP_BASE_URL=${DATACAMP_BASE_URL:-http://localhost:3000}
    DATACAMP_GET_METHOD=${DATACAMP_GET_METHOD:-curl}

    # Build the request URL from the API method name and key=value arguments
    function datacamp_request_url() {
        METHOD=$1
        shift

        # Append the response format as an extension, if one is set
        if [ "${DATACAMP_FORMAT}x" != "x" ]; then
            METHOD="${METHOD}.${DATACAMP_FORMAT}"
        fi

        ARGS="api_key=${DATACAMP_API_KEY}"

        # Join the remaining arguments with "&"
        if [ $# -gt 0 ]; then
            while [ $# -gt 0 ]; do
                ARG="$1"
                shift
                ARGS="${ARGS}&${ARG}"
            done
        fi

        CALL_URL="${DATACAMP_BASE_URL}/api/${METHOD}"
        if [ "$ARGS" != "" ]; then
            URL="${CALL_URL}?${ARGS}"
        else
            URL="${CALL_URL}"
        fi
        echo "$URL"
    }

    # Fetch the URL with the tool selected by DATACAMP_GET_METHOD
    function datacamp_request() {
        URL="$(datacamp_request_url "$@")"

        # FIXME: implement wget

        case $DATACAMP_GET_METHOD in
        curl)
            COMMAND="curl"
            ;;
        wget)
            COMMAND="wget -q -O - "
            ;;
        *)
            echo "ERROR: Unknown get method ${DATACAMP_GET_METHOD}" >&2
            exit 1
            ;;
        esac

        echo "REQUEST: $URL" >&2
        $COMMAND "$URL"
    }

    function print_help() {
        cat >&2 [...]

    [...] >&2
        exit 1
    fi

    datacamp_request "$@"
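    To show what URL the script requests, the sketch below condenses the datacamp_request_url function from the listing into a self-contained form and calls it once. The API key, the method name datasets and the parameter limit=10 are placeholders chosen for illustration; they are assumptions, not documented Datacamp API calls.

```bash
#!/bin/bash
# Condensed from datacamp_request_url in the listing above.
DATACAMP_BASE_URL=${DATACAMP_BASE_URL:-http://localhost:3000}

request_url() {
    local method="$1"
    shift
    # Append the response format as an extension, if one is set
    if [ -n "${DATACAMP_FORMAT}" ]; then
        method="${method}.${DATACAMP_FORMAT}"
    fi
    # The API key always comes first; other arguments are joined with "&"
    local args="api_key=${DATACAMP_API_KEY}"
    local arg
    for arg in "$@"; do
        args="${args}&${arg}"
    done
    echo "${DATACAMP_BASE_URL}/api/${method}?${args}"
}

# Hypothetical values: the key, method name and parameter are placeholders.
DATACAMP_API_KEY=SECRET
DATACAMP_FORMAT=json
request_url datasets limit=10
# With the default base URL this prints:
# http://localhost:3000/api/datasets.json?api_key=SECRET&limit=10
```

    The original script then passes this URL to curl or wget, depending on DATACAMP_GET_METHOD.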
