Incremental Load

Post on 28-Oct-2015

  • Incremental Load using QVD files

  • Incremental Load

    Is sometimes called Incremental Load, Differential Load, or Delta Load.

  • Incremental Load

    Goal: Load only the new or changed records from the database. The rest should already be available, one way or another.

  • Comments on Buffer Load

    Buffer (Incremental) Load is a solution only for log files (text files), not for databases.

    Buffer (Stale after 7 days) Select is not a good solution: it makes a full load after 7 days and loads nothing in between.

  • Incremental Load

    Load new data from the database table (slow, but few records).

    Load old data from the QVD file (many records, but fast).

    Create a new QVD file.

    The procedure must be repeated for each table.
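
The three steps on this slide can be sketched outside QlikView. The Python below is only an illustration under stated assumptions: the function name `incremental_reload`, the `modified` field, and the integer timestamps are hypothetical, and a plain list stands in for the QVD file.

```python
def incremental_reload(db_rows, cached_rows, last_exec_time):
    """Return the new cache: rows changed since last_exec_time plus the old cache."""
    # Step 1: query the slow source, but only for rows added since the last run.
    new_rows = [row for row in db_rows if row["modified"] >= last_exec_time]
    # Step 2: reuse the rows stored by the previous run (the role of the QVD file).
    # Step 3: the combined list is what would be stored as the new cache.
    return new_rows + list(cached_rows)


# Hypothetical data: integer timestamps stand in for real datetimes.
db_table = [
    {"pk": 1, "modified": 1},
    {"pk": 2, "modified": 2},
    {"pk": 3, "modified": 7},
]
cache = [{"pk": 1, "modified": 1}, {"pk": 2, "modified": 2}]
new_cache = incremental_reload(db_table, cache, last_exec_time=5)
print([row["pk"] for row in new_cache])
```

Only one row crosses the slow connection here; the other two come from the cache. This simple version is correct only when the source never updates or deletes rows.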

  • Different DB changes

    If the source allows:

    1) Append only. (Log files)

    2) Insert only. (No Update or Delete)

    3) Insert and Update. (No Delete)

    4) Insert, Update and Delete.

  • 1) Append only

    Must be a log file.

    Loads records added at the end of the file.

  • 1) Append only

    Buffer (Incremental) Load * From LogFile.txt
    (ansi, txt, delimiter is '\t', embedded labels);

    Done! But it should be renamed to Buffer (Append) Load

  • 2) Insert only

    Can be any DB.

    Loads INSERTed records.

    Needs a modification timestamp field (ModificationTime in the script).

  • 2) Insert only

    QV_Table:
    SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
    WHERE ModificationTime >= #$(LastExecTime)#;

  • 2) Insert only

    QV_Table:
    SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
    WHERE ModificationTime >= #$(LastExecTime)#;

    Concatenate LOAD PrimaryKey, X, Y FROM File.QVD;

  • 2) Insert only

    QV_Table:
    SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
    WHERE ModificationTime >= #$(LastExecTime)#;

    Concatenate LOAD PrimaryKey, X, Y FROM File.QVD;

    STORE QV_Table INTO File.QVD;

    Almost done! But there is a small chance that a record gets loaded twice.

  • 2) Insert only

    QV_Table:
    SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
    WHERE ModificationTime >= #$(LastExecTime)#
    AND ModificationTime < #$(BeginningThisExecTime)#;

    Concatenate LOAD PrimaryKey, X, Y FROM File.QVD;

    STORE QV_Table INTO File.QVD;

    Done!
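
The slides do not spell out how a record could be loaded twice. One plausible case is a row inserted after the script has noted its start time but before the query runs: it satisfies `ModificationTime >= LastExecTime` in this run and again in the next. The Python sketch below illustrates that reading. The function name `select_modified` and the integer timestamps are hypothetical.

```python
def select_modified(db_rows, start, end=None):
    """Rows with modified >= start and, if end is given, modified < end."""
    return [
        row for row in db_rows
        if row["modified"] >= start and (end is None or row["modified"] < end)
    ]


# Hypothetical timeline: run 1 starts at t=10, run 2 at t=20.
# Row 2 is inserted at t=11, just after run 1 started but before its query ran.
db_table = [{"pk": 1, "modified": 5}, {"pk": 2, "modified": 11}]

# Lower bound only: row 2 is picked up by both runs.
unbounded = select_modified(db_table, 0) + select_modified(db_table, 10)
print([row["pk"] for row in unbounded])

# Half-open window [start, end): each row falls into exactly one run.
bounded = select_modified(db_table, 0, 10) + select_modified(db_table, 10, 20)
print([row["pk"] for row in bounded])
```

Because consecutive windows share a boundary but never overlap, every modification time belongs to exactly one run.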

  • 3) Insert and Update

    Can be any DB.

    Loads INSERTed and UPDATEd records.

    Needs a modification timestamp field (ModificationTime in the script) and the field PrimaryKey.

  • 3) Insert and Update

    QV_Table:
    SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
    WHERE ModificationTime >= #$(LastExecTime)#;

    Concatenate LOAD PrimaryKey, X, Y FROM File.QVD
    WHERE NOT Exists(PrimaryKey);

    STORE QV_Table INTO File.QVD;

    Done!
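
The `WHERE NOT Exists(PrimaryKey)` clause keeps a cached row only if its key was not already loaded from the database, so an updated row replaces its stale copy. A minimal Python sketch of that merge follows. The name `merge_insert_update` is hypothetical, and the sketch assumes keys are unique within each input.

```python
def merge_insert_update(changed_rows, cached_rows):
    """Changed rows win; cached rows are kept only for keys not seen in changed_rows."""
    changed_keys = {row["pk"] for row in changed_rows}
    kept = [row for row in cached_rows if row["pk"] not in changed_keys]
    return changed_rows + kept


# Hypothetical data: row 2 was updated and row 3 inserted since the last run.
cached = [{"pk": 1, "x": "a"}, {"pk": 2, "x": "old"}]
changed = [{"pk": 2, "x": "new"}, {"pk": 3, "x": "c"}]
merged = merge_insert_update(changed, cached)
print(sorted((row["pk"], row["x"]) for row in merged))
```

The order matters: the changed rows must be loaded first, because the existence check only sees keys that are already in the table.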

  • 4) Insert, Update and Delete

    Can be any DB.

    Loads INSERTed and UPDATEd records.

    Removes DELETEd records.

    Needs a modification timestamp field (ModificationTime in the script) and the field PrimaryKey.

    Tricky to implement.

  • 4) Insert, Update and Delete

    QV_Table:
    SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
    WHERE ModificationTime >= #$(LastExecTime)#;

    Concatenate LOAD PrimaryKey, X, Y FROM File.QVD
    WHERE NOT EXISTS(PrimaryKey);

    Inner Join SQL SELECT PrimaryKey FROM DB_TABLE;

    STORE QV_Table INTO File.QVD;

    OK, but slow
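
The inner join removes every merged row whose key is no longer present in the source table. The Python sketch below illustrates the idea. The name `drop_deleted` is hypothetical. It also suggests why the approach is slow: the full list of current keys has to be fetched from the database on every run, even when nothing was deleted.

```python
def drop_deleted(merged_rows, current_keys):
    """Keep only rows whose primary key still exists in the source table."""
    alive = set(current_keys)
    return [row for row in merged_rows if row["pk"] in alive]


# Hypothetical data: row 2 has been deleted from the source since the last run.
merged = [{"pk": 1}, {"pk": 2}, {"pk": 3}]
keys_in_source = [1, 3]  # result of selecting every PrimaryKey from the table
remaining = drop_deleted(merged, keys_in_source)
print([row["pk"] for row in remaining])
```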

  • 4) Insert, Update and Delete

    ListOfDeletedEntries:
    SQL SELECT PrimaryKey AS Deleted FROM DB_TABLE
    WHERE DeletionFlag = 1 AND ModificationTime >= #$(LastExecTime)#;

    QV_Table:
    SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
    WHERE ModificationTime >= #$(LastExecTime)#;

    Concatenate LOAD PrimaryKey, X, Y FROM File.QVD
    WHERE NOT Exists(PrimaryKey) AND NOT Exists(Deleted, PrimaryKey);

    Drop Table ListOfDeletedEntries;

    STORE QV_Table INTO File.QVD;

    OK, but needs a DeletionFlag
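
With a deletion flag, only the recently changed rows are needed to learn what was deleted, so no full key scan is required. The Python sketch below illustrates this. The name `merge_with_deletion_flag` and the `deleted` field are hypothetical. It also filters flagged rows out of the freshly selected rows, which is an assumption that goes beyond the script on the slide.

```python
def merge_with_deletion_flag(changed_rows, cached_rows):
    """Merge changed rows into the cache, dropping keys flagged as deleted."""
    deleted_keys = {row["pk"] for row in changed_rows if row["deleted"]}
    # Assumption beyond the slide: flagged rows are not kept as fresh data.
    fresh = [row for row in changed_rows if not row["deleted"]]
    fresh_keys = {row["pk"] for row in fresh}
    kept = [
        row for row in cached_rows
        if row["pk"] not in fresh_keys and row["pk"] not in deleted_keys
    ]
    return fresh + kept


# Hypothetical data: row 1 was flagged as deleted, row 3 was inserted.
cached = [{"pk": 1, "deleted": False}, {"pk": 2, "deleted": False}]
changed = [{"pk": 1, "deleted": True}, {"pk": 3, "deleted": False}]
result = merge_with_deletion_flag(changed, cached)
print(sorted(row["pk"] for row in result))
```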

  • LastExecutionTime & Error handling

    Let ThisExecTime = Now();

    { Load sequence }

    If ScriptErrorCount = 0 then
      Let LastExecTime = ThisExecTime;
    End If
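
The point of this pattern is that the stored timestamp advances only after a successful run, so a failed run is retried over the same window next time. The Python sketch below shows the same idea. The names `run_reload`, `state`, and `load_sequence` are hypothetical, and an exception stands in for a non-zero ScriptErrorCount.

```python
from datetime import datetime


def run_reload(state, load_sequence, now=datetime.now):
    """Run load_sequence and advance state['last_exec_time'] only on success."""
    this_exec_time = now()  # taken before loading, like ThisExecTime
    try:
        load_sequence(state["last_exec_time"], this_exec_time)
    except Exception:
        # Leave last_exec_time untouched so the next run covers the same window.
        return False
    state["last_exec_time"] = this_exec_time
    return True


def failing_load(start, end):
    raise RuntimeError("simulated database error")


# Hypothetical usage with fixed integer clocks instead of real datetimes.
state = {"last_exec_time": 1}
first_ok = run_reload(state, failing_load, now=lambda: 2)
after_failure = state["last_exec_time"]
second_ok = run_reload(state, lambda start, end: None, now=lambda: 3)
print(first_ok, after_failure, second_ok, state["last_exec_time"])
```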

  • Final Script

    Let ThisExecTime = Now();

    QV_Table:
    SQL SELECT PrimaryKey, X, Y FROM DB_TABLE
    WHERE ModificationTime >= #$(LastExecTime)#
    AND ModificationTime < #$(ThisExecTime)#;

    Concatenate LOAD PrimaryKey, X, Y FROM File.QVD
    WHERE NOT EXISTS(PrimaryKey);

    Inner Join SQL SELECT PrimaryKey FROM DB_TABLE;

    If ScriptErrorCount = 0 then
      STORE QV_Table INTO File.QVD;
      Let LastExecTime = ThisExecTime;
    End If

  • Summary 1

    Incremental Load possible for:

    1) Append only. (Log files): Yes!

    2) Insert only. (No Update or Delete): Yes!

    3) Insert and Update. (No Delete): Yes!

    4) Insert, Update and Delete: Slow, or demands DeletionFlag

  • Summary 2

    Incremental Load is normally not equivalent to Buffer (Incremental) Load.

  • Comment

    The solutions above are (alone) probably not robust enough. In addition, the complete table should probably be reloaded regularly, perhaps once a month.
