Version Control Thesis
This is the thesis presentation for my master's degree, dated 31-12-2009.
Version Control
Dr. Ahmed Abou El-Fetouh Saleh
Information System Department, Faculty of Computers and Information Systems
Mansoura University
Dr. Samir El-Desouky El-Mougy
Computer Science Department, Faculty of Computers and Information Systems
Mansoura University
By Researcher: Waleed Mohamed Mahmoud Al-Adrousy
Computer Science Department, Faculty of Computers and Information Systems
Mansoura University
2
Agenda
● Version Control introduction
● Objectives
● Previous work
● Applied Algorithms and Technologies:
– Suggested load balancing architecture
– Suggested Differencing Algorithm
● Testing results
● Future work
4
Version Control Definition
● Network-based system.
● Controls access to computer files.
● Tracks modifications to current and backup files.
● Tracks history.
● Synchronizes concurrent access to files.
5
How Does Version Control Work?
6
Without Version Control
7
With version control (Lock-Modify-Unlock) Model
8
With Version Control (Copy-Modify-Merge) Model
9
(Copy-Modify-Merge) Model (Cont.)
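The merge step of the Copy-Modify-Merge model can be illustrated with a small three-way merge. This is only a minimal sketch under a strong assumption: all three versions have the same number of lines, so only in-place edits are handled. The function name `three_way_merge` and the sample lines are hypothetical and are not taken from the thesis.

```python
def three_way_merge(base, mine, theirs):
    """Merge two edited copies of `base`, line by line.

    Simplifying assumption: all three versions have the same number of
    lines, so only in-place modifications are handled (no inserts/deletes).
    """
    if not (len(base) == len(mine) == len(theirs)):
        raise ValueError("this sketch only handles equal-length versions")
    merged, conflicts = [], []
    for i, (b, m, t) in enumerate(zip(base, mine, theirs)):
        if m == t:        # both copies agree (or neither changed the line)
            merged.append(m)
        elif m == b:      # only "theirs" changed this line
            merged.append(t)
        elif t == b:      # only "mine" changed this line
            merged.append(m)
        else:             # both changed it differently: keep base, report it
            merged.append(b)
            conflicts.append(i)
    return merged, conflicts


base = ["a = 1", "b = 2", "c = 3"]
mine = ["a = 10", "b = 2", "c = 3"]      # first developer edits line 0
theirs = ["a = 1", "b = 2", "c = 30"]    # second developer edits line 2
merged, conflicts = three_way_merge(base, mine, theirs)
```

Here the two edits touch different lines, so both survive in `merged` and `conflicts` stays empty. When both developers change the same line differently, the line index is reported so a human can resolve it.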
10
Version Control Types
12
Objectives
● Part 1
– Better load balancing based on behavior analysis.
– Optimization of the synchronization process.
– Dynamic clustering of work.
– A compromise between the centralized and distributed models.
● Part 2
– Grammar-based difference calculation.
– Faster difference computation.
– On-line support for syntax differencing.
– Application to the Java language.
14
Previous Work
Technologies:
● File Sharing Protocols:
– FTP
● File Synchronization Protocols:
– WebDAV
– DeltaV
– RSync Algorithm
– IP-RSync
● These basically need support at the network protocol level.
15
Previous Work (Cont.)
● Dick Grune in 1986 (CVS)
● CollabNet Inc. in 2000 (Subversion)
● Peer-to-peer (P2P) technologies in the late 1990s
16
Previous Work (Cont.)
● Language modeling is an internal representation of source code for processing.
● Some well-known language modeling techniques:
– Famix model
– JavaML standard: an XML representation standard for Java source code
19
First Part: Semi-Distributed Version Control Using Web Data Mining
20
Part 1 Objectives
● Better load balancing based on behavior analysis.
● Optimization of the synchronization process.
● Dynamic clustering of work.
● Getting the advantages of both the centralized and distributed models.
21
Suggested Architecture
22
Case Study
23
Web Data Mining
● Definition...
● 3 Types of Algorithms:
– Centrality and Closeness
– Ranking
– Clustering
24
Graph
● Definition: a set of vertices and a set of edges. Each edge is specified as a pair (v1, v2), where v1 and v2 are two vertices in the graph. A vertex can also have a weight, sometimes also called a cost.
● Types
– Directed → like project dependencies
– Undirected → like communication between developers
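The graph definition above can be sketched as a small adjacency-set structure. The class name `Graph`, its methods, and the sample vertex names are assumptions made for illustration, not the data structure used in the thesis. The `degree` helper is included as one very simple centrality-style measure.

```python
class Graph:
    """A set of vertices plus a set of edges given as pairs (v1, v2)."""

    def __init__(self, directed=False):
        self.directed = directed
        self.adj = {}       # vertex -> set of neighbouring vertices
        self.weight = {}    # vertex -> weight (also called cost)

    def add_vertex(self, v, weight=1.0):
        self.adj.setdefault(v, set())
        self.weight[v] = weight

    def add_edge(self, v1, v2):
        # Create missing endpoints with the default weight.
        for v in (v1, v2):
            if v not in self.adj:
                self.add_vertex(v)
        self.adj[v1].add(v2)
        if not self.directed:
            self.adj[v2].add(v1)   # undirected edges are symmetric

    def degree(self, v):
        """Number of outgoing edges (all incident edges if undirected)."""
        return len(self.adj[v])


# Directed: the "ui" project depends on the "core" project.
deps = Graph(directed=True)
deps.add_edge("ui", "core")

# Undirected: two developers communicate with each other.
team = Graph(directed=False)
team.add_edge("alice", "bob")
team.add_edge("alice", "carol")
```

In the undirected `team` graph, `alice` has degree 2 and is the most connected developer. In the directed `deps` graph, only `ui` has an outgoing edge.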
25
Simple Clustering Algorithm
26
Simple Clustering Algorithm (Cont.)
27
Structured Similarity Algorithm
28
Structured Similarity Algorithm (Cont.)
29
Semi-Distributed Architecture Algorithm
32
Second Part: Structured Differencing for Web-Based Version Control
33
Existing Differencing Algorithms
Comparison | Line Based | Structure Based
Example | LCS | DiffX, Xdiff and Xydiff
Comparison unit | Line | Structure
Implementation difficulty | Easy | Hard
Dealing with the logical nature of code (classes, methods, objects, etc.) | Ignores it | Deals with it (helpful for developers)
Part 2 Objectives
● Adding grammar-based difference calculation.
● Enhancing difference computation speed.
● Adding on-line support for syntax differencing.
● Application to the Java language.
35
Abstract Syntax Tree (AST)
● Important for parsers, to model source code as a structure instead of plain text/lines.
● Many parser generators exist for Java; ANTLR was chosen.
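To show why a structural model differs from plain lines, the sketch below uses Python's built-in `ast` module as a stand-in. This is an assumption made for illustration: the thesis itself uses an ANTLR-based parser for Java, which is not reproduced here. The helper name `structure` and the sample sources are hypothetical.

```python
import ast


def structure(source):
    """Return a textual dump of the AST, ignoring layout and comments."""
    return ast.dump(ast.parse(source))


a = "def area(w, h):\n    return w * h\n"
# Same code with different spacing, a blank line and a comment.
b = "def area(w,h):\n\n    return   w*h   # same code, different layout\n"
# A real change: the operator is different.
c = "def area(w, h):\n    return w + h\n"

same_structure = structure(a) == structure(b)
changed_structure = structure(a) != structure(c)
```

A line-based comparison would flag every line of `b` as changed, while the tree comparison treats `a` and `b` as identical and only reports a difference for `c`.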
36
XML Standard
● Well-known structured data representation format.
● Widely used for interoperability.
● Used in many Internet protocols and web services.
37
Convert AST to XML
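One possible shape of this conversion is sketched below, again using Python's `ast` module and `xml.etree.ElementTree` in place of the ANTLR and JDOM tools named in the thesis. The function name `ast_to_xml` and the element layout (one element per node, scalar fields as attributes) are assumptions, not the thesis format.

```python
import ast
import xml.etree.ElementTree as ET


def ast_to_xml(node):
    """Map one AST node to an XML element named after the node type."""
    elem = ET.Element(type(node).__name__)
    for field, value in ast.iter_fields(node):
        children = value if isinstance(value, list) else [value]
        for child in children:
            if isinstance(child, ast.AST):
                elem.append(ast_to_xml(child))   # nested node -> child element
            elif child is not None:
                # Scalar field -> attribute (for a list of scalars this
                # simplified sketch keeps only the last value).
                elem.set(field, str(child))
    return elem


root = ast_to_xml(ast.parse("x = 1"))
xml_text = ET.tostring(root, encoding="unicode")
```

Once the tree is in XML form, generic XML tooling can query it, for example `root.find("Assign/Name").get("id")` returns the assigned variable name.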
38
Differencing
● Changes can be:
– Add
– Delete
– Modify
– Move
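The four change types can be illustrated on a deliberately simplified model, where each code element is identified by name and described by its parent and its content. This is only a sketch under that assumption and is not the differencing algorithm of the thesis. The function `classify_changes` and the sample elements are hypothetical.

```python
def classify_changes(old, new):
    """Classify changes between two versions.

    `old` and `new` map an element name to a (parent, content) pair.
    """
    changes = []
    for name in sorted(old.keys() | new.keys()):
        if name not in old:
            changes.append(("ADD", name))
        elif name not in new:
            changes.append(("DELETE", name))
        else:
            old_parent, old_content = old[name]
            new_parent, new_content = new[name]
            if old_parent != new_parent:
                changes.append(("MOVE", name))      # same element, new parent
            if old_content != new_content:
                changes.append(("MODIFY", name))    # same element, new body
    return changes


old = {
    "draw": ("Shape", "render()"),
    "area": ("Shape", "w * h"),
    "log": ("Util", "print(msg)"),
    "tmp": ("Util", "pass"),
}
new = {
    "draw": ("Shape", "render()"),
    "area": ("Shape", "w * h / 2"),     # body changed
    "log": ("Shape", "print(msg)"),     # moved from Util to Shape
    "init": ("Util", "setup()"),        # new element; "tmp" was removed
}
changes = classify_changes(old, new)
```

Matching elements by name is the main simplification here: a renamed method would show up as one Delete plus one Add rather than a Modify.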
39
Syntax aware Differencing Algorithm
40
Syntax aware Differencing Algorithm (Cont.)
42
Simulation Results
● Two subsystems are simulated:
– Semi-structured version control: JFreeChart, JUNG, MyJTable, and Piccolo tools
– Syntax-Aware Diff for Web-Based Version Control Systems: AJAX, GWT, XML, ANTLR, JDOM, XMLUnit and Java servlets
● Note: the following results are based on a custom simulation, not real-life data, because no human team was available to run the tests.
43
Part 1 Results
48
Part 2 Results
56
Future work
57
Part 1 Future Work (for our proposed semi-distributed algorithm)
● Conducting a real-world case study.
● Integration with a coding environment (IDE).
● Considering aspects such as security and backup.
● Testing on many network platforms with heterogeneous devices.
58
Part 2 Future Work (Syntax-Aware Differencing)
● Merge the semi-distributed algorithm with the differencing algorithm.
● Reduce the long representation of short Java source code.
● Enhance the readability of the differencing results.
● Integrate with an existing IDE.
● Enhance the visualization of the graphical user interface (GUI) of the web tool.
● Enhance the model used for asynchronous web page design (Rich Online IDE).
● Develop as a web service.
● Port this algorithm to languages other than Java by replacing the modeling part to read the other language's grammar.
59
Word For History
60
Thanks