7/22/2019 Measurement of Software Similarity
MEASUREMENT OF SOFTWARE SIMILARITY
UNDER THE SUPERVISION OF
PROF. ARITRA PAN
In the Year of 2010
Group NO: 13
SAMRAT GUPTA (ROLL-071531012037)
SANJUKTA MITRA (ROLL-071531012009)
MOU MONDAL (ROLL-071531012064)
MOITRAYEE MONDAL (ROLL-071531012090)
SUSHOVAN POLLEY (ROLL-071531012065)
SYAMAPRASAD INSTITUTE OF TECHNOLOGY & MANAGEMENT
7, Raja Ram Mohon Ray Road, Kolkata: 41, West Bengal, India
Syamaprasad Institute of Technology and Management
SYAMAPRASAD INSTITUTE OF TECHNOLOGY & MANAGEMENT
7, Raja Ram Mohon Ray Road, Kolkata: 41, West Bengal, India
Certificate
The work presented in this report is the joint effort of Sanjukta Mitra, Samrat Gupta, Mou Mondal, Moitrayee Mondal and Sushovan Polley. Any work of others that was used during the execution of the project or is included in the report has been suitably acknowledged through the standard practice of citing references and stating appropriate acknowledgements.
We hereby forward the project entitled MEASUREMENT OF SOFTWARE SIMILARITY, presented by Samrat Gupta (Roll No: 071531012037, Reg. No: 071531012101037), Sanjukta Mitra (Roll No: 071531012009, Reg. No: 071531012201009), Mou Mondal (Roll No: 071531012064, Reg. No: 071531012201064), Moitrayee Mondal (Roll No: 071531012090, Reg. No: 071531012201090) and Sushovan Polley (Roll No: 071531012065, Reg. No: 071531012101065) of 2007-2008, 6th semester, Bachelor of Computer Application, under the guidance of the project supervisor, in partial fulfillment of the requirements for the degree of Bachelor of Computer Application of this college.
Prof. Aritra Pan
(Project Supervisor)
Associate Professor, Dept. of BCA, SITM
SYAMAPRASAD INSTITUTE OF TECHNOLOGY & MANAGEMENT
7, Raja Ram Mohon Ray Road, Kolkata: 41, West Bengal, India
Certificate Of Approval
The foregoing project report is hereby approved as a creditable study of Bachelor in Computer Application, carried out and presented in a manner satisfactory to warrant its acceptance as a prerequisite to the degree for which it has been submitted. It is understood that by this approval the undersigned do not necessarily endorse or approve any statement made, opinion expressed or conclusion drawn therein, but approve this project report only for the purpose for which it is submitted.
(External Examiners)
Prof. Aritra Pan (Project Supervisor), Associate Professor
Prof. Manikaustabh Goswami, Teacher-in-Charge, SITM
4. FLOW CHART
4.1. FLOW CHART OF THE CHARACTER MATCHING
4.2. PROGRAM OF THE CHARACTER MATCHING
4.3. FLOW CHART OF THE STRING MATCHING
4.4. PROGRAM OF THE STRING MATCHING
5. HARDWARE & SOFTWARE
5.1. NECESSITY OF HARDWARE AND SOFTWARE
6. ADVANTAGES
7. FUTURE SCOPE
8. PROBLEMS
9. REFERENCES
10. CONCLUSION
ACKNOWLEDGMENT
We would like to thank our project supervisor, Prof. Aritra Pan, for her moral support and guidance in completing our synopsis on time.
We express our gratitude to all our friends and classmates for their support and help in this project.
Last but not least, we wish to express our gratitude to God Almighty for his abundant blessings, without which this synopsis would not have been successful.
ABSTRACT
Program assignments are traditionally an area of serious concern in maintaining the integrity of the educational process. Systematic inspection of all solutions for possible plagiarism has generally required unrealistic amounts of time and effort. The Measure Of Software Similarity (MOSS) tool developed by Alex Aiken at UC Berkeley makes it possible to objectively and automatically check all solutions for evidence of plagiarism. We have used MOSS in several large sections of a C programming course. (MOSS can also handle a variety of other languages.) We feel that MOSS is a major innovation for faculty who teach programming and recommend that it be used routinely to screen for plagiarism.
1. INTRODUCTION
Probably every instructor of a programming course
has been concerned about possible plagiarism in the program
solutions turned in by students. Instances of cheating are found, but
traditionally only on an ad hoc basis. For example, the instructor
may notice that two programs have the same idiosyncrasy in their
I/O interface, or the same pattern of failures with certain test cases.
With suspicions raised, the programs may be examined further and
the plagiarism discovered. Obviously, this leaves much to chance.
The larger the class, and the more people involved in the
grading, the less the chance that a given instance of plagiarism will
be detected. For students who know about various instances of
cheating, which instances are detected and which are not may
seem (in fact, may be) random. A policy of comparing all pairs of
solutions against each other for evidence of plagiarism seems like
the correct approach. But a simple file diff would of course detect
only the most obvious attempts at cheating. The standard dumb
attempt at cheating on a program assignment is to obtain a copy of
a working program and then change statement spacing, variable
names, I/O prompts and comments. This has been enough to
require a careful manual comparison for detection, which simply
becomes infeasible for large classes with regular assignments.
Thus, programming classes have been in need of an automated
tool which allows reliable and objective detection of plagiarism.
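To illustrate why a plain file diff is fooled by cosmetic edits, the following is a minimal sketch in C, not the method used by MOSS. The helper name equal_ignoring_whitespace is hypothetical. It treats two pieces of text as equal when they differ only in spacing, which is one of the changes listed above; renamed variables or edited comments would still defeat it.

```c
#include <ctype.h>

/* Hypothetical helper: returns 1 if a and b are identical once all
   whitespace is skipped, 0 otherwise. */
int equal_ignoring_whitespace(const char *a, const char *b)
{
    for (;;) {
        /* Skip any run of whitespace in each string. */
        while (*a != '\0' && isspace((unsigned char)*a)) a++;
        while (*b != '\0' && isspace((unsigned char)*b)) b++;
        if (*a != *b) return 0;      /* first real difference found */
        if (*a == '\0') return 1;    /* both strings ended together */
        a++;
        b++;
    }
}
```

A byte-for-byte comparison reports "x=a+b;" and "x = a + b ;" as different, while this helper reports them as equal. Because it drops all whitespace, it would also treat "int x" and "intx" as equal, so it is only an illustration of the idea.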
1.1 What is Moss
Moss (for Measure Of Software Similarity) is an automatic system for determining the similarity of programs. To date, the main application of Moss has been in detecting plagiarism in programming classes. Since its development in 1994, Moss has been very effective in this role. The algorithm behind Moss is a significant improvement over other cheating detection algorithms (at least, over those known to us). As of now, MOSS can be used to detect similarities in C, C++, Java, Pascal, Ada, ML, Lisp and Scheme programs. MOSS is primarily used for detecting plagiarism in programming assignments in computer science and other engineering courses, though several text formats are supported as well. The latest MOSS script can be downloaded from the MOSS site. MOSS can be run on any UNIX or Linux system that has Perl, mail, etc. After downloading the MOSS script, copy it to the directory containing the student programs and run it in that directory. After execution, the script sends the data to the MOSS server at Berkeley. The MOSS server sends back a webpage address, which is displayed at the prompt. This webpage contains the results. Results are available on the MOSS server for 14 days. The script can also be run with one or more options which handle more complicated situations, such as comparing programs from different directories or excluding certain parts of a program from the comparison.
Our project is developed in C. At first we develop a C program based on string similarity that checks whether two strings, held in two separate arrays, are similar or not. When this is done successfully,
the same checking is done on two separate files, also using C program code.
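The first stage described above, checking two strings held in two arrays, could be sketched as follows. This is an illustrative sketch only, not the project's actual program. The function names and the percentage score are assumptions introduced here.

```c
#include <string.h>

/* Returns 1 if the two character arrays hold the same string, else 0. */
int strings_match(const char s1[], const char s2[])
{
    return strcmp(s1, s2) == 0;
}

/* Hypothetical similarity score: percentage (0..100) of positions at which
   the two strings hold the same character, relative to the longer string. */
int position_similarity_percent(const char s1[], const char s2[])
{
    size_t len1 = strlen(s1), len2 = strlen(s2);
    size_t longer = len1 > len2 ? len1 : len2;
    size_t shorter = len1 < len2 ? len1 : len2;
    size_t same = 0, i;

    if (longer == 0) return 100;   /* two empty strings are identical */
    for (i = 0; i < shorter; i++)
        if (s1[i] == s2[i]) same++;
    return (int)(same * 100 / longer);
}
```

For "hello" and "help" the first three positions agree and the longer string has five characters, so the score is 60. A position-based score like this is sensitive to insertions, which shift every later character.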
1.2 FEATURES
This Measure of Software Similarity program can detect similarity between programs, texts, and files.
This project can be used in any institute to prevent students from copying assignments from one another.
This program can also be applied to search for duplicated information when the same program is executing on different machines connected to a main server.
It is also applicable to online duplication.
This project can also be used to avoid plagiarism.
It can also be used to eliminate redundancy of data.
It also helps to reduce the cost of a particular project.
This project, Measurement of Software Similarity, helps to detect data redundancy in any software, program, text or file. One of the biggest disadvantages of data redundancy is that it increases the size of the database unnecessarily. Data redundancy may also cause the same result to be returned multiple times when searching the database, causing confusion in the results. It also wastes a lot of space, thus incurring extra cost.
Another problem is plagiarism, the act of taking credit for someone else's work. This project helps to eliminate this drawback.
1.3 Plagiarism
Plagiarism, as defined in the 1995 Random House Compact Unabridged
Dictionary, is the "use or close imitation of the language and thoughts of
another author and the representation of them as one's own original
work". Within academia, plagiarism by students, professors, or
researchers is considered academic dishonesty or academic fraud and
offenders are subject to academic censure, up to and including
expulsion. In journalism, plagiarism is considered a breach of
journalistic ethics, and reporters caught plagiarizing typically face
disciplinary measures ranging from suspension to termination of
employment. Some individuals caught plagiarizing in academic or
journalistic contexts claim that they plagiarized unintentionally, by failing
to include quotations or give the appropriate citation. While plagiarism
in scholarship and journalism has a centuries-old history, the
development of the Internet, where articles appear as electronic text,
has made the physical act of copying the work of others much easier.
Plagiarism is not the same as copyright infringement. While both
terms may apply to a particular act, they are different transgressions.
Copyright infringement is a violation of the rights of a copyright holder,
when material protected by copyright is used without consent. On the
other hand, plagiarism is concerned with the unearned increment to the
plagiarizing author's reputation that is achieved through false claims of
authorship.
1.3.1 Plagiarism prevention
Plagiarism cannot be eliminated completely, but some preventive measures may reduce it to a minimum. There are three main strategies to prevent plagiarism. First is the Trust Method, wherein the students are told that we trust them and that they are mature enough to know that the test is for their benefit and that cheating will deprive them of the chance to see how well they have mastered a particular concept. Thus, the Trust Method trusts the learners to obey the rules and is implemented by making the learners sign an honor code before appearing for the test. Second is the Fence Method, which aims at making cheating impossible. This is implemented by tightening security during tests, setting different questions for different students, etc. Third is the Threat Method, which threatens the learners with the punishments that they will have to face if plagiarism is detected. This is done by announcing the penalty before the assignment submissions or tests have started. Ideally, one or more of the above methods can be used as a preventive measure. The instructor has to decide which method or methods to adopt based on the purpose of the test. If the test is part of the final exam of a course or degree, then the Fence Method, the Threat Method, or both could be implemented. If the test is a practice test for the self-assessment of learners, the Trust Method would be best. These methods are to be implemented before the commencement of the test or assignment submission. Additionally, preventive measures can also be taken while conducting the test. If a test is running in parallel at a number of remote centers, authorised proctors can supervise the exam at the respective centers. These proctors can make sure that only authorised students are taking the test at the proper time, without any unauthorized
help. The tests at these centers can also be supervised by observing each center by video conferencing from a coordinating center.
1.3.2 Plagiarism Detection
In the previous section we saw the preventive measures for plagiarism. Now, after the test is conducted and the results are out, the tough job starts. The questions becoming increasingly important in this context are: can we trust the results that the machine has given us? That is, does it mean that a student has understood a concept just because his or her score says so? If not, how do we differentiate between genuine attempts and copies? In other words, how do we detect copies among the assignments submitted? Detecting plagiarism in a test for which n students appeared involves comparing each solution with the other n-1 solutions, and this is not a trivial task. Let us see some attempts to detect plagiarism in programming tests, which have evolved over time. Traditional attempts to detect plagiarism have been ad hoc, typically involving manual checking of programming assignments. This manual checking mostly happens only for suspected programs, such as two programs failing for the same test cases or two programs looking very similar in structure. Also, the detection is limited to programs that look alike or are verbatim copies. Manually checking all the programs in all possible combinations requires a fair amount of time and manpower, especially when the number of programs to be tested is large. Inspecting all the possible combinations for more complex attempts at plagiarism (beyond verbatim copying) is, in such a scenario, an even tougher job. The inconvenience and limitations of traditional attempts at detecting plagiarism led instructors to exploit more advanced methods. Instructors eventually started using available
tools (e.g. Unix utilities like diff, cmp, etc.) to automate the task of checking all possible combinations among a large set of programs. Use of these tools minimized the time and effort; however, plagiarism detection was still limited to verbatim copies.
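To make the cost of comparing every solution with every other one concrete: among n submissions there are n(n-1)/2 distinct pairs. The sketch below uses hypothetical function names. It computes that number directly and also by enumerating the pairs the way an all-pairs checker would loop over them.

```c
/* Number of distinct pairs among n submissions: n * (n - 1) / 2. */
unsigned long pair_count(unsigned long n)
{
    return n < 2 ? 0 : n * (n - 1) / 2;
}

/* Counts the pairs by enumerating them, as an all-pairs checker would. */
unsigned long enumerate_pairs(unsigned long n)
{
    unsigned long i, j, visited = 0;

    for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++)
            visited++;   /* submission i would be compared with submission j here */
    return visited;
}
```

For a class of 100 students this gives 4950 pairwise comparisons per assignment, which is why manual checking does not scale.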
1.4 Ways to handle technology-enhanced cheating
Focus on the process of writing: observe and coach the process. Require a thesis statement, an initial bibliography, an outline, notes, a first draft, etc.
Avoid "choose any topic" papers. Tie the topic to the goals of the course.
Use a few papers from "cheat sites" as examples. Provide a grade for these and use them as reference material. Students will be hesitant to use a service you know about.
Be clear and comprehensive regarding plagiarism policies. The more students know, the less likely they will be to attempt plagiarism.
Require students to use material from class lectures, presentations, discussions, etc. in their graded assignments. This makes finding "matching" papers more difficult.
Require students to conduct an original survey or interview as part of the assignment. The survey or transcripts of the interview are included as an appendix.
Require an annotated bibliography as part of the process of writing the assignment. These are difficult to plagiarize.
Require an abstract of the paper where appropriate. Writing an accurate synopsis of a plagiarized paper is difficult.
Require a description of the research process with the final draft.
Get to know your students. Require a writing sample during the first week of class. Have the students do this in their "best written style"
and make it personalized and customized to them individually. Keep this on record for comparison purposes.
Use Plagiarism.org or Plagiarism.com to check submitted work (links below).
Use MOSS (Measure of Software Similarity), which detects plagiarism in programming classes (link below).
Make assignments relatively difficult. This makes it more difficult to get casual, though ongoing, help during the semester.
Frequent assessments also make getting help logistically difficult.
Use mastery-type questions and case studies rather than "memorization" questions.
If using online quizzes, give different questions to different students, i.e. use a test bank. Add a short-answer question that will be graded by hand.
If using online tests or quizzes, limit the amount of time the test is available.
Use alternate means of assessment, portfolios and multiple measures of mastery.
Use proctored exams (only if absolutely necessary).
If you suspect plagiarism, look carefully at the paper and gently confront the student with your concerns. Frequently this is enough to uncover or deter plagiarism.
Require raw materials of the research process, for example copies of the cited works.
1.5 Teaching activities to prevent cheating
Quizzes: Create regular, frequent (weekly or daily) quizzes for students.
Discussion: Create discussions and use participation in discussions as an aid in measuring student progress.
Request feedback: Randomly e-mail all the students in the class and request a comment or two on some subject.
Variance analysis: Check the regular quiz scores to see if there is a sudden change. For example, a student flunks five quizzes and then hires someone to take the final online exam and gets an A.
Spot calls: If a teacher has any concerns about a particular individual, she or he can call the student and have a short discussion. It will quickly reveal whether the student knows the course material.
Online chat exams: The instructor can conduct an oral chat-room exam with each student to interactively test the student's knowledge of the course material.
1.6 Additional security techniques
First, many of the same problems regarding the authenticity of a student's work and plagiarism exist in the traditional classroom as well.
To get someone's help through an entire online program would take substantial effort. For most students it is just not possible to have consistent help through many tests at many different times. Besides, who would consent to putting in so much work for someone else and not get credit for it?
Use a log-in/password system (but of course, a student could just give the username and password to someone else).
Make exercises difficult enough so that a person who hasn't done the previous work in your course will not be able to complete the assignment.
Give many short exams that are embedded in class exercises so that it would be difficult for a student to have "help" there all the time.
Ask mastery-type questions so that a student must know the material himself/herself in order to answer the question (i.e. case studies vs. memorization questions).
Ask students to relate the subject matter to their own personal/professional/life experiences so their answers are personalized and difficult to replicate.
Require students to submit an outline and rough draft of term papers and essays before the final paper is due. This way, a professor can see the work in progress.
Give different questions to different students: construct a large set of questions from which an automated testing program can randomly select (i.e. a database of 50 questions with 10 randomly chosen).
Limit the times when the online test is available; ensure that the test is taken in a certain amount of time. Some automated testing programs allow this feature.
Provide online exam practice: sample questions, self-study questions with answers and feedback, and require a proctored, non-online examination for course credit (i.e. on campus, at a testing center, library, etc.).
Finally, remember that testing should never be the only means by which you assess the abilities of your students. If they are evaluated with various different methods, you have the best way of ensuring that there is real learning taking place. As with a traditional classroom, the best way to assess student and course progress is to know the student through the student's work and pay attention to student feedback.
The American Association for Higher Education has devised nine principles of good practice for assessing student learning. These can also be helpful when thinking about how to avoid plagiarism and cheating in online courses. The principles are:
1. The assessment of student learning begins with educational values. Assessment is not an end in itself but a vehicle for educational improvement. Its effective practice, then, begins with and enacts a vision of the kinds of learning we most value for students and strive to help them achieve. Educational values should drive not only what we choose to assess but also how we do so. Where questions about educational mission and values are skipped over, assessment threatens to be an exercise in measuring what's easy, rather than a process of improving what we really care about.
2. Assessment is most effective when it reflects an understanding of learning as multidimensional, integrated, and revealed in performance over time. Learning is a complex process. It entails not only what students know but what they can do with what they know; it involves not only knowledge and abilities but values, attitudes, and habits of mind that affect both academic success and performance beyond the classroom. Assessment should reflect these understandings by
employing a diverse array of methods, including those that call for actual performance, using them over time so as to reveal change, growth, and increasing degrees of integration. Such an approach aims for a more complete and accurate picture of learning, and therefore firmer bases for improving our students' educational experience.
3. Assessment works best when the programs it seeks to improve have clear, explicitly stated purposes. Assessment is a goal-oriented process. It entails comparing educational performance with educational purposes and expectations -- those derived from the institution's mission, from faculty intentions in program and course design, and from knowledge of students' own goals. Where program purposes lack specificity or agreement, assessment as a process pushes a campus toward clarity about where to aim and what standards to apply; assessment also prompts attention to where and how program goals will be taught and learned. Clear, shared, implementable goals are the cornerstone for assessment that is focused and useful.
4. Assessment requires attention to outcomes but also to the experiences that lead to those outcomes. Information about outcomes is of high importance; where students "end up" matters greatly. But to improve outcomes, we need to know about student experience along the way -- about the curricula, teaching, and kind of student effort that lead to particular outcomes. Assessment can help us understand which students learn best under what conditions; with such knowledge comes the capacity to improve the whole of their learning.
5. Assessment works best when it is ongoing, not episodic. Assessment is a process whose power is cumulative. Though isolated, "one-shot" assessment can be better than none, improvement is best fostered when assessment entails a linked series of activities undertaken over time. This may mean tracking the progress of individual students, or of cohorts of students; it may mean collecting the same examples of student performance or using the same instrument semester after semester. The point is to monitor progress toward intended goals in a spirit of continuous improvement. Along the way, the assessment process itself should be evaluated and refined in light of emerging insights.
6. Assessment fosters wider improvement when representatives from across the educational community are involved. Student learning is a campus-wide responsibility, and assessment is a way of enacting that responsibility. Thus, while assessment efforts may start small, the aim
over time is to involve people from across the educational community. Faculty members play an especially important role, but assessment's questions can't be fully addressed without participation by student-affairs educators, librarians, administrators, and students. Assessment may also involve individuals from beyond the campus (alumni/ae, trustees, employers) whose experience can enrich the sense of appropriate aims and standards for learning. Thus understood, assessment is not a task for small groups of experts but a collaborative activity; its aim is wider, better-informed attention to student learning by all parties with a stake in its improvement.
7. Assessment makes a difference when it begins with issues of use and illuminates questions that people really care about. Assessment recognizes the value of information in the process of improvement. But to be useful, information must be connected to issues or questions that people really care about. This implies assessment approaches that produce evidence that relevant parties will find credible, suggestive, and applicable to decisions that need to be made. It means thinking in advance about how the information will be used, and by whom. The point of assessment is not to gather data and return "results"; it is a process that starts with the questions of decision-makers, that involves them in the gathering and interpreting of data, and that informs and helps guide continuous improvement.
8. Assessment is most likely to lead to improvement when it is part of a larger set of conditions that promote change. Assessment alone changes little. Its greatest contribution comes on campuses where the quality of teaching and learning is visibly valued and worked at. On such campuses, the push to improve educational performance is a visible and primary goal of leadership; improving the quality of undergraduate education is central to the institution's planning, budgeting, and personnel decisions. On such campuses, information about learning outcomes is seen as an integral part of decision making, and avidly sought.
9. Through assessment, educators meet responsibilities to students and to the public. There is a compelling public stake in education. As educators, we have a responsibility to the public that supports or depends on us to provide information about the ways in which our students meet goals and expectations. But that responsibility goes beyond the reporting of such information; our deeper obligation -- to ourselves, our students, and society -- is to improve. Those to whom
educators are accountable have a corresponding obligation to support such attempts at improvement.
2. Platform
Procedural programming can sometimes be used as a synonym for imperative programming (specifying the steps the program must take to reach the desired state), but can also refer (as in this report) to a programming paradigm, derived from structured programming, based upon the concept of the procedure call. Procedures, also known as routines, subroutines, methods, or functions (not to be confused with mathematical functions, but similar to those used in functional programming), simply contain a series of computational steps to be carried out. Any given procedure might be called at any point during a program's execution, including by other procedures or itself. A procedural programming language provides a programmer a means to define precisely each step in the performance of a task. The programmer knows what is to be accomplished and provides through the language step-by-step instructions on how the task is to be done. Using a procedural language, the programmer specifies language statements to perform a sequence of algorithmic steps. Procedural programming is often a better choice than simple sequential or unstructured programming in many situations which involve moderate complexity or which require significant ease of maintainability.
Possible benefits:
The ability to re-use the same code at different places in the program without copying it.
An easier way to keep track of program flow than a collection of "GOTO" or "JUMP" statements (which can turn a large, complicated program into spaghetti code).
The ability to be strongly modular or structured.
Emphasis is on doing things (algorithms).
Employs a top-down approach in program design.
Large programs are divided into smaller programs known as functions.
3. What is an Algorithm
In mathematics, computer science, and related subjects, an algorithm is an effective method for solving a problem using a finite sequence of instructions. Algorithms are used for calculation, data processing, and many other fields. Each algorithm is a list of well-defined instructions for completing a task. Starting from an initial state, the instructions describe a computation that proceeds through a well-defined series of successive states, eventually terminating in a final ending state. The transition from one state to the next is not necessarily deterministic; some algorithms, known as randomized algorithms, incorporate randomness. If you sit down in front of a computer and try to write a program to solve a problem, you will be trying to do four out of five things at once.
These are:
1. ANALYSE THE PROBLEM
2. DESIGN A SOLUTION/PROGRAM
3. CODE/ENTER THE PROGRAM
4. TEST THE PROGRAM
5. EVALUATE THE SOLUTION
To begin with, we will look at three methods used in creating an algorithm. These are:
STEPPING
LOOPING
CHOOSING
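The three methods can be seen together in one short C function. This is only an illustrative sketch with a hypothetical function name: stepping is the plain sequence of statements, looping is the for statement, and choosing is the if statement.

```c
/* Sums the even numbers from 1 to limit. */
int sum_of_evens(int limit)
{
    int total = 0;                   /* stepping: statements run one after another */
    int i;

    for (i = 1; i <= limit; i++) {   /* looping: repeat for each value of i */
        if (i % 2 == 0) {            /* choosing: act only on even values */
            total += i;
        }
    }
    return total;
}
```

Calling sum_of_evens(10) adds 2, 4, 6, 8 and 10 and returns 30.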
3.1 ALGORITHM OF THE CHARACTER MATCHING
STEP 1: Begin
STEP 2: We take two file names in two pointer variables fn1 and fn2
STEP 3: f1 = fopen(fn1)
STEP 4: If f1 is not opened then
Print "Cannot open first file"
Return
Else
Print "File is open"
STEP 5: c=0
STEP 6: Repeat steps 7 to 15 as long as !feof(f1)
STEP 7: str1= fgetc(f1)
STEP 8: f=0
STEP 9: for i=0; i=0) then
Flag=1
i=i+1
STEP 15: if(flag==1) then
Print "match"
STEP 16: fclose(f1)
STEP 17: fclose(f2)
STEP 18: END
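Several of the middle steps above are not legible in this copy, so the following is only a hedged sketch of one way character matching between two files could be written in C, not the project's own program. It assumes a position-by-position comparison. The function name count_matching_chars and its total parameter are hypothetical.

```c
#include <stdio.h>

/* Compares two open streams character by character.
   Stores the number of compared positions in *total (if total is not NULL)
   and returns how many of those positions held the same character.
   Comparison stops as soon as either stream ends. */
long count_matching_chars(FILE *f1, FILE *f2, long *total)
{
    long matches = 0, compared = 0;
    int c1, c2;   /* int, not char, so that EOF can be told apart from data */

    while ((c1 = fgetc(f1)) != EOF && (c2 = fgetc(f2)) != EOF) {
        compared++;
        if (c1 == c2) matches++;
    }
    if (total != NULL) *total = compared;
    return matches;
}
```

The two streams would be opened with fopen(fn1, "r") and fopen(fn2, "r") as in STEP 3, and closed with fclose afterwards. Testing the return value of fgetc against EOF avoids processing one extra character, which can happen when a loop is controlled by !feof(f1) alone.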
3.2 ALGORITHM OF THE STRING MATCHING
Step 1: Begin
Step 2: We take two file names in two pointer variables fn1 and fn2
Step 3: f1 = fopen(fn1)
Step 4: If f1 is not opened then
Print "Cannot open first file"
Return
Else
Print "File is open"
Step 5: Repeat steps 6 to 23 as long as !feof(f1)
Step 6: i=0
Step 7: str1=NULL
Step 8: ch=fgetc(f1)
Step 9: Repeat steps 10 to 12 as long as ch != ' ' and ch != '\n' and ch != EOF
Step 10: str1[i]=ch
Step 11: ch=fgetc(f1)
Step 12: i=i+1
Step 13: f2 = fopen(fn2)
Step 14: If f2 is not opened then
Print "Cannot open second file"
Exit
Step 15: Repeat steps 16 to 23 as long as !feof(f2)
Step 16: i=0
Step 17: str2=NULL
Step 18: ch=fgetc(f2)
Step 19: Repeat steps 20 to 22 as long as ch != ' ' and ch != '\n' and ch != EOF
Step 20: str2[i]=ch
Step 21: ch=fgetc(f2)
Step 22: i=i+1
Step 23: If str1=str2 then
Print "match"
Step 24: fclose(f1)
Step 25: fclose(f2)
Step 26: END
4. FLOW-CHART
What is a Flow-Chart
A flowchart is a pictorial representation of an algorithm. It is the layout, in a visual, two-dimensional format, of the plan to be followed when the corresponding algorithm is converted into a program by writing it in a programming language. It acts like a roadmap for a programmer and guides him/her on how to go from the starting point to the final point while converting the algorithm into a computer program.
A flow chart is the pictorial representation of the separate steps of a process. Using a flow chart one can easily design, analyze, prepare documentation for, or manage a process running in a system.
Why we use a Flow-Chart
Normally, an algorithm is first represented in the form of a flowchart and the flowchart is then expressed in some programming language to prepare a computer program. The main advantage of this two-step approach to program writing is that while drawing a flowchart, a programmer is not concerned with the details of the elements of the programming language. Hence, he/she can fully concentrate on the logic of the procedure. Moreover, since a flowchart shows the flow of operations in pictorial form, any error in the logic of the procedure can be detected more easily than in the case of a program. Once the
flowchart is ready, the programmer can forget about the logic and can concentrate only on coding the operations in each box of the flowchart in terms of the statements of the programming language. This normally helps to produce an error-free program.
The symbols used in a Flow-Chart are shown below:
SYMBOL            NAME           DESCRIPTION
(oval)            Terminator     To indicate the start or stop of a Flow-Chart
(parallelogram)   Input/Output   To take any input or output in the Flow-Chart
(rectangle)       Process        To represent a running process in the Flow-Chart
(diamond)         Decision Box   To make a decision
(circle)          Connector      To connect one part of the flow chart to the other, or to continue the flow chart from one page to another.
4.1 Flow chart of Character Matching
[Flow chart of Character Matching. Boxes in order of flow: Start; Input fn1, fn2; c = 0; fopen (fn1); str1 = fgetc (f1); f = 0; i = 0; decision: match[i] = str1?; f = 1; i = i+1; decision on i; str2 = fgetc (f2); decision: (str1 = str2) and (str1 >= 0)?; flag = 1; i = i+1; decision: !feof (f2)?; decision: flag == 1?; printf ("Match = %c: appeared %d times", str1, i); fclose (f2); decision: !feof (f1)?; printf ("End of the program.."); fclose (f1); Stop.]

4.2 PROGRAM OF THE CHARACTER MATCHING
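#include <stdio.h>
#include <conio.h>

void main()
{
    FILE *f1, *f2;
    char fn1[100], fn2[100], match[256];   /* arrays, so that gets() and match[] have storage */
    char str1, str2;
    int i, flag, c, f;

    clrscr();
    printf("\n\t Enter 1st file name with extension: ");
    gets(fn1);
    if ((f1 = fopen(fn1, "r")) == NULL)
    {
        printf("Cannot open first file.\n");
        getch();
        return;
    }
    else
        printf("%s File is opened", fn1);
    fflush(stdin);
    printf("\n\tEnter 2nd file name with extension: ");
    gets(fn2);
    c = 0;
    while (!feof(f1))
    {
        str1 = fgetc(f1);
        f = 0;
        /* The printed listing is cut off after "for(i=0;i". The lines from here to
           the second fopen() check are an assumption based on the flow chart in 4.1. */
        for (i = 0; i < c; i++)
            if (match[i] == str1)
                f = 1;                      /* character was already compared */
        if (f == 0)
            match[c++] = str1;              /* assumed: remember the character */
        if ((f2 = fopen(fn2, "r")) == NULL)
        {
            printf("Cannot open second file.\n");
            getch();
            return;
        }
        if (f == 0)
        {
            flag = 0;
            i = 0;
            while (!feof(f2))               /* the stray ';' after the condition is removed */
            {
                str2 = fgetc(f2);
                if (str1 == str2 && str1 >= 0)
                {
                    flag = 1;
                    i++;
                }
            }
            if (flag == 1)
                printf("\n\n\tMatch = %c; appeared %d times", str1, i);
        }
        fclose(f2);                         /* close on every pass, not only after a match */
    }
    printf("\n\n\t\tEnd of the program.");
    fclose(f1);
    fflush(stdin);
    getch();
}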
4.3 Flow chart of the string Matching
[Flow chart of String Matching. Boxes in order of flow: Start; Input fn1, fn2; c = 0; fopen (fn1); i = 0: str1 = NULL: ch = fgetc (f1); str1[i] = ch; i = i+1; ch = fgetc (f1); decision: ch != NULL && ch != \n && ch != EOF?; fopen (f2); i = 0: str2 = NULL: ch = fgetc (f2); str2[i] = ch: ch = fgetc (f2): i++; decision: ch != NULL && ch != \n && ch != EOF?; decision: str1 = str2?; printf ("\n\n\tMatch = %s", str2); decision: !feof (f2)?; fclose (f2); decision: !feof (f1)?; printf ("End of the program.."); fclose (f1); Stop.]

4.4 Program of the String Matching
#include <stdio.h>
#include <conio.h>
#include <string.h>

void main()
{
    FILE *f1, *f2;
    char fn1[100], fn2[100], str1[100], str2[100];   /* arrays instead of uninitialized pointers */
    int ch;                                          /* int, so that EOF can be detected */
    int i;

    clrscr();
    printf("\n\tEnter 1st file name with extension : ");
    gets(fn1);
    if ((f1 = fopen(fn1, "r")) == NULL)
    {
        printf("Cannot open first file.\n");
        getch();
        return;
    }
    else
        printf("%s file is opened", fn1);
    fflush(stdin);
    printf("\n\tEnter 2nd file name with extension : ");
    gets(fn2);
    while (!feof(f1))
    {
        /* read the next word of the first file */
        i = 0;
        ch = fgetc(f1);
        while (ch != ' ' && ch != '\n' && ch != EOF)
        {
            if (i < 99)                 /* longer words are truncated */
                str1[i++] = ch;
            ch = fgetc(f1);
        }
        str1[i] = '\0';
        printf("\nfile 1 string : %s", str1);
        if ((f2 = fopen(fn2, "r")) == NULL)   /* the file is only read, so "r" is enough */
        {
            printf("Cannot open second file.\n");
            getch();
            return;
        }
        while (!feof(f2))
        {
            /* read the next word of the second file and compare */
            i = 0;
            ch = fgetc(f2);
            while (ch != ' ' && ch != '\n' && ch != EOF)
            {
                if (i < 99)
                    str2[i++] = ch;
                ch = fgetc(f2);
            }
            str2[i] = '\0';
            printf("\nFile 2 string : %s ", str2);
            if (strcmp(str1, str2) == 0)
                printf("\n\n\tMatch = %s", str2);
        }
        fclose(f2);
    }
    printf("\n\n\t\tEnd of the program....");
    fclose(f1);
    fflush(stdin);
    getch();
}
5.1 Necessary hardware
The project is designed to be compatible with a server-based machine running the Windows operating system (Windows XP, Windows Server 2000). Moreover, the project is being computerized, because computerized systems are worthwhile.
The hardware requirements for the project are as follows:
Motherboard - Intel Original
Processor - Core 2 Quad
Operating System - Windows Server 2000
RAM - DDR3 4 GB 800 MHz
HDD - 1 TB
5.2 Necessary software
Operating System - Windows XP
Software - Turbo C++ Compiler: full installation, minimum 5 MB
6. Advantages
MOSS (for a Measurement Of Software Similarity) is an automatic system for determining the similarity of programs.
The system allows for a variety of more complicated situations. For example, it allows for a base file. The base file might be a program outline or partial solution handed out by the instructor.
MOSS makes it easy to examine the corresponding portions of a program pair. Clicking on a program pair in the results summary brings up side-by-side frames containing the program sources.
MOSS just as easily uncovers more sophisticated attempts at cheating. Multiple distinct similar sections separated by sections with differences are still found and given color-coded highlighting.
Traditional attempts to detect plagiarism have been ad hoc, typically involving manual checking of programming assignments for plagiarism.
There was a strong need for a more sophisticated mechanism that would automate the task to a large extent and detect fairly complex attempts at copying.
7. FUTURE Scope
The future scope of the project is that it makes it possible to detect software products that are very similar to each other. This helps to increase the efficiency of the project and to prevent the duplication of software. Analogy-based software estimation rests on the assumption that similar software projects require similar effort. However, such estimation is complicated by incomplete and noisy data, uncertainty in measurement and similarity assessment, complex interaction between attributes, and attributes measured on ordinal and nominal scales.
8. Problems
Two projects that seem similar may in fact differ in a critical way. The uncertainty in assessing similarities and differences means that two different estimators could develop significantly different views and effort estimates.
The uncertainty stems from:
Data collection tool.
The type of information available.
Attribute measurement.
Skill of estimator.
Research Assistance: http://www.research-assistance.com/^ J. MacQueen, 1967^ Yi Lu, Shiyong Lu, Farshad Fotouhi, Youping Deng, and Susa
Brown, "FGKA: A Fast Genetic K-means Algorithm", in Proc. of the 19tACM Symposium on Applied Computing, pp. 162-163, Nicosia, CyprusMarch, 2004.
^ Yi Lu, Shiyong Lu, Farshad Fotouhi, Youping Deng, and SusaBrown, "Incremental Genetic K-means Algorithm and its Application i
Gene Expression Data Analysis", BMC Bioinformatics, 5(172), 2004.^ Bezdek, James C. (1981), Pattern Recognition with FuzzyObjective Function Algorithms, ISBN0306406713
^Google News personalization: scalable online collaborativfiltering
^ Basak S.C., Magnuson V.R., Niemi C.J., Regal R.R. "DetermingStructural Similarity of Chemicals Using Graph Theoretic Indices". Discr
Appl. Math., 19, 1988: 17-44.^ E. B. Fowlkes & C. L. Mallows (September 1983). "A Method
for Comparing Two Hierarchical Clusterings".Journal of the American
Statistical Association78(384): 553584. doi:10.2307/2288117.^ Alexander Kraskov, Harald Stgbauer, Ralph G. Andrzejak
and Peter Grassberger, "Hierarchical Clustering Based on MutuaInformation", (2003)ArXiv q-bio/0311039
^ David J. Marchette, Random Graphs for Statistical PatternRecognition, Wiley-Interscience, 2004.
^ Jiyeon Choo, Rachsuda Jiamthapthaksin, Chun-sheng ChenOner Ulvi Celepcikay, Christian Giusti, and Christoph F. Eick, "MOSAIC: A
Syamaprasad Institute Of Technology & Managemen
http://www.a1-termpaper.com/http://www.ezwrite.com/http://www.termpapers-on-file.com/http://www.research-assistance.com/http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-0%23cite_ref-0http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-1%23cite_ref-1http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-2%23cite_ref-2http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-Bezdek1981_3-0%23cite_ref-Bezdek1981_3-0http://en.wikipedia.org/wiki/International_Standard_Book_Numberhttp://en.wikipedia.org/wiki/Special:BookSources/0306406713http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-4%23cite_ref-4http://www2007.org/program/paper.php?id=570http://www2007.org/program/paper.php?id=570http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-5%23cite_ref-5http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-6%23cite_ref-6http://en.wikipedia.org/wiki/Journal_of_the_American_Statistical_Associationhttp://en.wikipedia.org/wiki/Journal_of_the_American_Statistical_Associationhttp://en.wikipedia.org/wiki/Digital_object_identifierhttp://dx.doi.org/10.2307%2F2288117http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-7%23cite_ref-7http://arxiv.org/abs/q-bio/0311039http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-8%23cite_ref-8http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-9%23cite_ref-9http://www.a1-termpaper.com/http://www.ezwrite.com/http://www.termpapers-on-file.com/http://www.research-assistance.com/http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-0%23cite_ref-0http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-1%23cite_ref-1http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-2%23cite_ref-2http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-Bezdek1981_3-0%23cite_ref-Bezdek1981_3-0http://en.wikipedia.org/wiki/International_Standard_Book_Numberhttp://en.wikipedia.org/wiki/Special:BookSources/0306406713http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-4%23cite_ref-4http://www2007.org/program/paper.php?id=570http://www2007.org/progra
m/paper.php?id=570http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-5%23cite_ref-5http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-6%23cite_ref-6http://en.wikipedia.org/wiki/Journal_of_the_American_Statistical_Associationhttp://en.wikipedia.org/wiki/Journal_of_the_American_Statistical_Associationhttp://en.wikipedia.org/wiki/Digital_object_identifierhttp://dx.doi.org/10.2307%2F2288117http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-7%23cite_ref-7http://arxiv.org/abs/q-bio/0311039http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-8%23cite_ref-8http://en.wikipedia.org/wiki/Cluster_analysis#cite_ref-9%23cite_ref-9 -
7/22/2019 Measurement of Software Similarity
41/46
4
proximity graph approach for agglomerative clustering," Proceedings 9thInternational Conference on Data Warehousing and KnowledgeDiscovery (DaWaK), Regensbug Germany, September 2007.
.Clatworthy, J., Buick, D., Hankins, M., Weinman, J., & Horne, R(2005). The use and reporting of cluster analysis in health psychology: Areview. British Journal of Health Psychology10: 329-358.
Cole, A. J. & Wishart, D. (1970). An improved algorithm for thJardine-Sibson method of generating overlapping clusters. The ComputeJournal 13(2):156-163.
Ester, M., Kriegel, H.P., Sander, J., and Xu, X. 1996. A densitybased algorithm for discovering clusters in large spatial databases witnoise. Proceedings of the 2nd International Conference on KnowledgDiscovery and Data Mining, Portland, Oregon, USA: AAAI Press, pp. 226231.
Heyer, L.J., Kruglyak, S. and Yooseph, S., Exploring Expressio
Data: Identification and Analysis of Coexpressed Genes, GenomeResearch 9:1106-1115.
S. Kotsiantis, P. Pintelas, Recent Advances in Clustering: A BrieSurvey, WSEAS Transactions on Information Science and ApplicationsVol 1, No 1 (73-81), 2004.
Huang, Z. (1998). Extensions to the K-means Algorithm foClustering Large Datasets with Categorical Values. Data Mining andKnowledge Discovery, 2, p. 283-304.
Wong, W., Liu, W. & Bennamoun, M. Tree-Traversing An
Algorithm for Term Clustering based on Featureless Similarities. In: DatMining and Knowledge Discovery, Volume 15, Issue 3, Pages 349381. doi: 10.1007/s10618-007-0073-y. A demo of this term clusteringalgorithm is available here
Jardine, N. & Sibson, R. (1968). The construction of hierarchiand non-hierarchic classifications. The Computer Journal 11:177.
The on-line textbook: Information Theory, Inference, anLearning Algorithms, by David J.C. MacKay includes chapters on k-meanclustering, soft k-means clustering, and derivations including the E-Malgorithm and the variational view of the E-M algorithm.
MacQueen, J. B. (1967). Some Methods for classification anAnalysis of Multivariate Observations, Proceedings of 5-th BerkeleSymposium on Mathematical Statistics and Probability, BerkeleyUniversity of California Press, 1:281-297
Ng, R.T. and Han, J. 1994. Efficient and effective clusterinmethods for spatial data mining. Proceedings of the 20th VLDConference, Santiago, Chile, pp. 144155.
Syamaprasad Institute Of Technology & Managemen
http://dx.doi.org/10.1007/s10618-007-0073-yhttp://explorer.csse.uwa.edu.au/research/algorithm_tta.plhttp://www.inference.phy.cam.ac.uk/mackay/itila/http://www.inference.phy.cam.ac.uk/mackay/itila/http://en.wikipedia.org/wiki/David_J.C._MacKayhttp://dx.doi.org/10.1007/s10618-007-0073-yhttp://explorer.csse.uwa.edu.au/research/algorithm_tta.plhttp://www.inference.phy.cam.ac.uk/mackay/itila/http://www.inference.phy.cam.ac.uk/mackay/itila/http://en.wikipedia.org/wiki/David_J.C._MacKay -
7/22/2019 Measurement of Software Similarity
42/46
42
Prinzie A., D. Van den Poel (2006), Incorporating sequentiainformation into traditional classification models by using aelement/position-sensitive SAM. Decision Support Systems 42 (2): 508526.
Rivera, C. G., Vakil, R. M. & Bader, J. S. NeMo: Network Moduleidentification in Cytoscape. BMC Bioinformatics 2010, 11(Supp
1):S61.doi: 10.1186/1471-2105-11-S1-S61. The plugin can bdownloaded in Cytoscape or here.
Romesburg, H. Clarles, Cluster Analysis for Researchers, 2004340 pp. ISBN 1-4116-0617-5, reprint of 1990 edition published by KriegePub. Co... A Japanese language translation is available from UchidRokakuho Publishing Co., Ltd., Tokyo, Japan.
Sheppard, A. G. (1996). The sequence of factor analysis ancluster analysis: Differences in segmentation and dimensionality throughthe use of raw and factor scores. Tourism Analysis, 1(Inaugural Volume)
49-57.Sergios Theodoridis, Konstantinos Koutroumbas (2009) "Patter
Recognition" , 4th Edition, Academic Press, ISBN: 978-1-59749-272-0.Zhang, T., Ramakrishnan, R., and Livny, M. 1996. BIRCH: A
efficient data clustering method for very large databases. Proceedings oACM SIGMOD Conference, Montreal, Canada, pp. 103114.
Nguyen Xuan Vinh, Epps, J. and Bailey, J., 'Information TheoretiMeasures for Clusterings Comparison: Is a Correction for ChancNecessary?', in Procs. the 26th International Conference on Machin
Learning (ICML'09).Jianbo Shi and Jitendra Malik, "Normalized Cuts and ImagSegmentation", IEEE Transactions on Pattern Analysis and MachinIntelligence, 22(8), 888-905, August 2000. Available onJitendra Malik'homepage
Marina Meila and Jianbo Shi, "Learning Segmentation witRandom Walk", Neural Information Processing Systems, NIPS, 2001Available fromJianbo Shi's homepage
see referenced articles atLuigidragone.comKernel MDL to Determine the Number of Clusters,MLDM, pp
203-217, 2007.Stan Salvador and Philip Chan, Determining the Number o
Clusters/Segments in Hierarchical Clustering/Segmentation AlgorithmsProc. 16th IEEE Intl. Conf. on Tools with AI, pp. 576584, 2004.
Can, F., Ozkarahan, E. A. (1990) "Concepts and effectiveness othe cover coefficient-based clustering methodology for text databases.ACM Transactions on Database Systems. 15 (4) 483-517
Aldenderfer, M.S., Blashfield, R.K, Cluster Analysis, (1984), Newbury Par(CA): Sage.
Syamaprasad Institute Of Technology & Managemen
http://econpapers.repec.org/paper/rugrugwps/05_2F292.htmhttp://econpapers.repec.org/paper/rugrugwps/05_2F292.htmhttp://econpapers.repec.org/paper/rugrugwps/05_2F292.htmhttp://econpapers.repec.org/paper/rugrugwps/05_2F292.htmhttp://www.biomedcentral.com/1471-2105/11/S1/S61/abstracthttp://baderlab.bme.jhu.edu/baderlab/index.php/NeMohttp://en.wikipedia.org/wiki/Special:BookSources/1411606175http://en.wikipedia.org/w/index.php?title=Krieger_Pub._Co.&action=edit&redlink=1http://en.wikipedia.org/w/index.php?title=Krieger_Pub._Co.&action=edit&redlink=1http://en.wikipedia.org/w/index.php?title=Uchida_Rokakuho_Publishing_Co.&action=edit&redlink=1http://en.wikipedia.org/w/index.php?title=Uchida_Rokakuho_Publishing_Co.&action=edit&redlink=1http://www.cs.berkeley.edu/~malik/malik-pubs-ptrs.htmlhttp://www.cs.berkeley.edu/~malik/malik-pubs-ptrs.htmlhttp://www.cis.upenn.edu/~jshi/jshi_publication.htmhttp://www.luigidragone.com/datamining/spectral-clustering.html#referenceshttp://www.tsi.enst.fr/~kyrgyzov/publications.htmlhttp://www.springerlink.com/content/j646uqx4p435j530/http://www.springerlink.com/content/j646uqx4p435j530/http://cs.fit.edu/~pkc/papers/ictai04salvador.pdfhttp://cs.fit.edu/~pkc/papers/ictai04salvador.pdfhttp://econpapers.repec.org/paper/rugrugwps/05_2F292.htmhttp://econpapers.repec.org/paper/rugrugwps/05_2F292.htmhttp://econpapers.repec.org/paper/rugrugwps/05_2F292.htmhttp://www.biomedcentral.com/1471-2105/11/S1/S61/abstracthttp://baderlab.bme.jhu.edu/baderlab/index.php/NeMohttp://en.wikipedia.org/wiki/Special:BookSources/1411606175http://en.wikipedia.org/w/index.php?title=Krieger_Pub._Co.&action=edit&redlink=1http://en.wikipedia.org/w/index.php?title=Krieger_Pub._Co.&action=edit&redlink=1http://en.wikipedia.org/w/index.php?title=Uchida_Rokakuho_Publishing_Co.&action=edit&redlink=1http://en.wikipedia.org/w/index.php?title=Uchida_Rokakuho_Publishing_Co.&action=edit&redlink=1http://www.cs.berkeley.edu/~malik/malik-pubs-ptrs.htmlhttp://www.cs.berkeley.edu/~malik/mal
ik-pubs-ptrs.htmlhttp://www.cis.upenn.edu/~jshi/jshi_publication.htmhttp://www.luigidragone.com/datamining/spectral-clustering.html#referenceshttp://www.tsi.enst.fr/~kyrgyzov/publications.htmlhttp://www.springerlink.com/content/j646uqx4p435j530/http://www.springerlink.com/content/j646uqx4p435j530/http://cs.fit.edu/~pkc/papers/ictai04salvador.pdfhttp://cs.fit.edu/~pkc/papers/ictai04salvador.pdf -
7/22/2019 Measurement of Software Similarity
43/46
43
[edit]External links
This article's use ofexternal links may not follow Wikipedia'spolicies orguidelines
Pleaseimprove this article by removing excessive and inappropriate external links or by converting linkintofootnote references. (May 2009)Citeseerx.ist.psu.edu, When Is "Nearest Neighbor" Meaningful?
P. Berkhin, Citeseer.ist.psu.edu Survey of Clustering Data Mining Techniques, Accru
Software, 2002.Jain, Murty and Flynn: Citeseer.ist.psu.edu Data Clustering: A Review, ACM Comp. Surv
1999.for another presentation of hierarchical, k-means and fuzzy c-means see the introduction t
clustering on home.dei.polimi.it. It also has an explanation on mixture ofGaussians.David Dowe, csse.monash.edu.au, Mixture Modelling page - other clustering and mixtur
model links.
Gauss.nmsu.edu, A tutorial on clustering.Inference.phy.cam.ac.uk, The on-line textbook: Information Theory, Inference, an
Learning Algorithms, byDavid J.C. MacKay includes chapters on k-means clustering, soft k-mean
clustering, and derivations including the E-M algorithm and the variational view of the E-M algorithm.People.revoledu.com, Numerical example of Hierarchical Clustering.
Cran.r-project.org, kernlab - R package for kernel based machine learning (include
spectral clustering implementation)Home.dei.polimi.it - Tutorial with introduction of Clustering Algorithms (k-means, fuzzy
c-means, hierarchical, mixture of gaussians) + some interactive demos (java applets)
Data Mining Software at the Open Directory Project
Machine Learning Software at the Open Directory ProjectHomepages.feis.herts.ac.ukJava, Competitive Learning Application, a suite o
Unsupervised Neural Networks for clustering. Written in Java. Complete with all source code.
Factominer.free.fr, FactoMineR (free exploratory multivariate data analysis software linketo R)
AI4r.rubyforge.org, Data clustering algorithms implementation in Ruby (AI4R)
PMML Representation - Standard way to represent clustering models.
1C. Alexander. Notes on the Synthesis of Form. Harvard U. Press, 1964. 2Edward B. Allen , Sampat
Gottipati , Rajiv Govindarajan, Measuring size, complexity, and coupling of hypergraph abstractions osoftware: An information-theory approach, Software Quality Control, v.15 n.2, p.179-212, Jun
2007 [doi>10.1007/s11219-006-9010-3] 3Tom Arbuckle, Visually Summarising Software ChangProceedings of the 2008 12th International Conference Information Visualisation, p.559-568, July 09-11
2008 [doi>10.1109/IV.2008.58] 4T. Arbuckle, A. Balaban, D. K. Peters, and M. Lawford. Softwar
documents: Comparison and measurement. In Proc. 18th Int. Conf. on Software Eng.&Knowledge Engpages 740--745, July 2007. 5C. H. Bennett, P. Gcs, M. Li, P. Vitnyi, and W. H. Zurek. Informatio
distance. IEEE Trans. Information Theory, 44(4):1407--1423, 1998. 6M. Cebrin, M. Alfonseca, and A
Ortega. Common pitfalls using the normalized compression distance: What to watch out for in compressor. Comms. Info. Sys., 5(4):367--384, 2005. 7Gregory J. Chaitin, On the Length of Program
Syamaprasad Institute Of Technology & Managemen
http://en.wikipedia.org/w/index.php?title=Cluster_analysis&action=edit§ion=24http://en.wikipedia.org/wiki/Wikipedia:External_linkshttp://en.wikipedia.org/wiki/Wikipedia:What_Wikipedia_is_not#Wikipedia_is_not_a_mirror_or_a_repository_of_links.2C_images.2C_or_media_fileshttp://en.wikipedia.org/wiki/Wikipedia:What_Wikipedia_is_not#Wikipedia_is_not_a_mirror_or_a_repository_of_links.2C_images.2C_or_media_fileshttp://en.wikipedia.org/wiki/Wikipedia:External_linkshttp://en.wikipedia.org/wiki/Wikipedia:External_linkshttp://en.wikipedia.org/w/index.php?title=Cluster_analysis&action=edithttp://en.wikipedia.org/wiki/Wikipedia:Citing_sourceshttp://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.31.1422http://citeseer.ist.psu.edu/berkhin02survey.htmlhttp://citeseer.ist.psu.edu/jain99data.htmlhttp://home.dei.polimi.it/matteucc/Clustering/tutorial_html/index.htmlhttp://en.wikipedia.org/wiki/Normal_distributionhttp://en.wikipedia.org/wiki/Normal_distributionhttp://www.csse.monash.edu.au/~dld/cluster.htmlhttp://gauss.nmsu.edu/~lludeman/video/ch6pr.htmlhttp://www.inference.phy.cam.ac.uk/mackay/itila/http://en.wikipedia.org/wiki/David_J.C._MacKayhttp://en.wikipedia.org/wiki/David_J.C._MacKayhttp://people.revoledu.com/kardi/tutorial/Clustering/index.htmlhttp://cran.r-project.org/web/packages/kernlab/index.htmlhttp://home.dei.polimi.it/matteucc/Clustering/tutorial_html/http://www.dmoz.org/Computers/Software/Databases/Data_Mining/Public_Domain_Software/http://en.wikipedia.org/wiki/Open_Directory_Projecthttp://www.dmoz.org/Artificial_Intelligence/Machine_Learning/Software/http://en.wikipedia.org/wiki/Open_Directory_Projecthttp://homepages.feis.herts.ac.uk/~nngroup/software.phphttp://factominer.free.fr/http://en.wikipedia.org/wiki/R_programming_languagehttp://ai4r.rubyforge.org/index.htmlhttp://www.dmg.org/v4-0/ClusteringModel.htmlhttp://portal.acm.org/citation.cfm?id=1232687&dl=GUIDE&coll=GUIDE&CFID=85807985&CFTOKEN=28673483http://portal.acm.org/citation.cfm?id=1232687&dl=GUIDE&coll=GUID
E&CFID=85807985&CFTOKEN=28673483http://portal.acm.org/citation.cfm?id=1232687&dl=GUIDE&coll=GUIDE&CFID=85807985&CFTOKEN=28673483http://portal.acm.org/citation.cfm?id=1232687&dl=GUIDE&coll=GUIDE&CFID=85807985&CFTOKEN=28673483http://dx.doi.org/10.1007/s11219-006-9010-3http://portal.acm.org/citation.cfm?id=1440202&dl=GUIDE&coll=GUIDE&CFID=85807985&CFTOKEN=28673483http://portal.acm.org/citation.cfm?id=1440202&dl=GUIDE&coll=GUIDE&CFID=85807985&CFTOKEN=28673483http://portal.acm.org/citation.cfm?id=1440202&dl=GUIDE&coll=GUIDE&CFID=85807985&CFTOKEN=28673483http://dx.doi.org/10.1109/IV.2008.58http://portal.acm.org/citation.cfm?id=321506&dl=GUIDE&coll=GUIDE&CFID=85807985&CFTOKEN=28673483http://en.wikipedia.org/w/index.php?title=Cluster_analysis&action=edit§ion=24http://en.wikipedia.org/wiki/Wikipedia:External_linkshttp://en.wikipedia.org/wiki/Wikipedia:What_Wikipedia_is_not#Wikipedia_is_not_a_mirror_or_a_repository_of_links.2C_images.2C_or_media_fileshttp://en.wikipedia.org/wiki/Wikipedia:External_linkshttp://en.wikipedia.org/w/index.php?title=Cluster_analysis&action=edithttp://en.wikipedia.org/wiki/Wikipedia:Citing_sourceshttp://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.31.1422http://citeseer.ist.psu.edu/berkhin02survey.htmlhttp://citeseer.ist.psu.edu/jain99data.htmlhttp://home.dei.polimi.it/matteucc/Clustering/tutorial_html/index.htmlhttp://en.wikipedia.org/wiki/Normal_distributionhttp://www.csse.monash.edu.au/~dld/cluster.htmlhttp://gauss.nmsu.edu/~lludeman/video/ch6pr.htmlhttp://www.inference.phy.cam.ac.uk/mackay/itila/http://en.wikipedia.org/wiki/David_J.C._MacKayhttp://people.revoledu.com/kardi/tutorial/Clustering/index.htmlhttp://cran.r-project.org/web/packages/kernlab/index.htmlhttp://home.dei.polimi.it/matteucc/Clustering/tutorial_html/http://www.dmoz.org/Computers/Software/Databases/Data_Mining/Public_Domain_Software/http://en.wikipedia.org/wiki/Open_Directory_Projecthttp://www.dmoz.org/Artificial_Intelligence/Machine_Learning/Software/http://en.wikipedia.org
/wiki/Open_Directory_Projecthttp://homepages.feis.herts.ac.uk/~nngroup/software.phphttp://factominer.free.fr/http://en.wikipedia.org/wiki/R_programming_languagehttp://ai4r.rubyforge.org/index.htmlhttp://www.dmg.org/v4-0/ClusteringModel.htmlhttp://portal.acm.org/citation.cfm?id=1232687&dl=GUIDE&coll=GUIDE&CFID=85807985&CFTOKEN=28673483http://portal.acm.org/citation.cfm?id=1232687&dl=GUIDE&coll=GUIDE&CFID=85807985&CFTOKEN=28673483http://portal.acm.org/citation.cfm?id=1232687&dl=GUIDE&coll=GUIDE&CFID=85807985&CFTOKEN=28673483http://dx.doi.org/10.1007/s11219-006-9010-3http://portal.acm.org/citation.cfm?id=1440202&dl=GUIDE&coll=GUIDE&CFID=85807985&CFTOKEN=28673483http://portal.acm.org/citation.cfm?id=1440202&dl=GUIDE&coll=GUIDE&CFID=85807985&CFTOKEN=28673483http://portal.acm.org/citation.cfm?id=1440202&dl=GUIDE&coll=GUIDE&CFID=85807985&CFTOKEN=28673483http://dx.doi.org/10.1109/IV.2008.58http://portal.acm.org/citation.cfm?id=321506&dl=GUIDE&coll=GUIDE&CFID=85807985&CFTOKEN=28673483 -
7/22/2019 Measurement of Software Similarity
44/46
44
for Computing Finite Binary Sequences: statistical considerations, Journal of the ACM (JACM), v.16 n.
p.145-159, Jan. 1969 [doi>10.1145/321495.321506] 8Robert Noyes Chanon, On a measure o
program structure, Carnegie Mellon University, Pittsburgh, PA, 1974
9. R. N. Chanon, On a measure of program structure, Programming Symposium, Proceedings Colloque sur la Programmation, p.9-16, April 09-11, 1974
10. S. R. Chidamber, C. F. Kemerer, A Metrics Suite for Object Oriented Design, IEEE Transactions on Software Engineering, v.20 n.6, p.476-493, June 1994 [doi>10.1109/32.295895]
11. R. Cilibrasi. The CompLearn Toolkit. [Online] http://complearn.sourceforge.net/, 2003.
12. R. Cilibrasi and P. Vitányi. Clustering by compression. IEEE Trans. Information Theory, 51(4):1523--1545, Apr. 2005.
13. David Clark, Sebastian Hunt, Pasquale Malacaria, Quantitative Information Flow, Relations and Polymorphic Types, Journal of Logic and Computation, v.15 n.2, p.181-199, Apr. 2005 [doi>10.1093/logcom/exi009]
14. Norman Fenton, When a software measure is not a measure, Software Engineering Journal, v.7 n.5, p.357-362, Sept. 1992
15. Maurice H. Halstead, Elements of Software Science (Operating and programming systems series), Elsevier Science Inc., New York, NY, 1977
16. Mark Harman, The Current State and Future of Search Based Software Engineering, 2007 Future of Software Engineering, p.342-357, May 23-25, 2007 [doi>10.1109/FOSE.2007.29]
17. Mark Harman, Search Based Software Engineering for Program Comprehension, Proceedings of the 15th IEEE International Conference on Program Comprehension, p.3-13, June 26-29, 2007 [doi>10.1109/ICPC.2007.35]
18. L. Hellerman, A Measure of Computational Work, IEEE Transactions on Computers, v.21 n.5, p.439-446, May 1972 [doi>10.1109/T-C.1972.223539]
19. M. Jackson. The Name and Nature of Software Engineering, pages 1--38. LNCS. 2008.
20. Dennis Kafura, A survey of software metrics, Proceedings of the 1985 ACM annual conference on The range of computing: mid-80's perspective, p.502-506, October 1985, Denver, Colorado, United States [doi>10.1145/320435.320583]
21. Huzefa Kagdi, Michael L. Collard, Jonathan I. Maletic, A survey and taxonomy of approaches for mining software repositories in the context of software evolution, Journal of Software Maintenance and Evolution: Research and Practice, v.19 n.2, p.77-131, March 2007 [doi>10.1002/smr.344]
22. T. M. Khoshgoftaar and E. B. Allen. Applications of information theory to software engineering measurement. Software Quality Journal, 3(2):79--103, June 1994.
23. A. N. Kolmogorov. Three approaches to the quantitative definition of information. Probl. Inform. Trans., 1(1):1--7, 1965.
24. G. Kroah-Hartman and K. Sievers. udev. [Online] http://www.kernel.org/, 2003.
25. M. Li, X. Chen, X. Li, B. Ma, and P. Vitányi. The similarity metric. IEEE Trans. Information Theory, 50(12):3250--3264, 2004.
26. Rudi Lutz, Evolving good hierarchical decompositions of complex systems, Journal of Systems Architecture: the EUROMICRO Journal, v.47 n.7, p.613-634, July 2001 [doi>10.1016/S1383-7621(01)00019-4]
27. Thomas J. McCabe, A complexity measure, Proceedings of the 2nd international conference on Software engineering, p.407, October 13-15, 1976, San Francisco, California, United States
28. Stephen McCamant, Michael D. Ernst, Quantitative information flow as network flow capacity, Proceedings of the 2008 ACM SIGPLAN conference on Programming language design and implementation, June 07-13, 2008, Tucson, AZ, USA [doi>10.1145/1375581.1375606]
29. Stacy Prowell, Jesse H. Poore, Foundations of Sequence-Based Software Specification, IEEE Transactions on Software Engineering, v.29 n.5, p.417-429, May 2003 [doi>10.1109/TSE.2003.1199071]
30. C. E. Shannon. A mathematical theory of communication. The Bell System Technical Journal, 27:379--423 and 623--656, 1948.
31. H. A. Simon and A. Ando. Aggregation of variables in dynamic systems. Econometrica, 29:111--138, 1961.
32. R. J. Solomonoff. A formal theory of inductive inference. Part I and Part II. Information and Control, 7(1 and 2):1--22 and 224--254, 1964.
33. M. H. van Emden. Hierarchical decomposition of complexity. Machine Intelligence, 5:361--380, 1969.
34. M. H. van Emden. An Analysis of Complexity. PhD thesis, Mathematisches Zentrum, Amsterdam, 1971.
35. Horst Zuse, Software Complexity: Measures and Methods, Walter de Gruyter & Co., Hawthorne, NJ, 1990
Conclusion
Our project, Measurement of Software Similarity, is developed in C. We first developed a program based on string similarity that checks whether two strings, held in two separate arrays, are similar or not. Once this worked successfully, we applied the same check to two separate files, again using C code.
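The two stages described above can be illustrated with a minimal sketch. The report does not define how similarity is scored, so the measure used here is an assumption: the fraction of positions at which both inputs hold the same character, relative to the longer input. The function names string_similarity and file_similarity are hypothetical and are not taken from the project code.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical measure (an assumption, not the report's definition):
   fraction of positions where both strings hold the same character,
   relative to the longer string. Two empty strings score 1.0. */
double string_similarity(const char *a, const char *b)
{
    size_t len_a = strlen(a), len_b = strlen(b);
    size_t longer = len_a > len_b ? len_a : len_b;
    size_t shorter = len_a < len_b ? len_a : len_b;
    size_t matches = 0;

    if (longer == 0)
        return 1.0;
    for (size_t i = 0; i < shorter; i++)
        if (a[i] == b[i])
            matches++;
    return (double)matches / (double)longer;
}

/* The same positional check applied to two files, read byte by byte.
   Returns -1.0 if either file cannot be opened. */
double file_similarity(const char *path_a, const char *path_b)
{
    FILE *fa = fopen(path_a, "rb");
    FILE *fb = fopen(path_b, "rb");
    unsigned long matches = 0, total = 0;
    int ca, cb;

    if (fa == NULL || fb == NULL) {
        if (fa != NULL) fclose(fa);
        if (fb != NULL) fclose(fb);
        return -1.0;
    }
    for (;;) {
        ca = fgetc(fa);
        cb = fgetc(fb);
        if (ca == EOF && cb == EOF)
            break;              /* both files exhausted */
        if (ca == cb)
            matches++;          /* same byte at this position */
        total++;                /* counts up to the longer file's length */
    }
    fclose(fa);
    fclose(fb);
    return total == 0 ? 1.0 : (double)matches / (double)total;
}
```

With this measure, string_similarity("abcd", "abxd") gives 0.75, since three of the four positions agree. A positional comparison is the simplest choice, but a single inserted character shifts everything after it, so an edit-distance or compression-based measure may suit real source files better.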