Multi-Parametric Models for Project Data Mining and Project Planning
7/30/2019 Multi-Parametric Models for Project Data Mining and Project Planning
PM World Today August 2010 (Vol XII, Issue VIII)
Project data, despite its unsystematic nature as a result of a random data collection process, is a
unique and important source of information based on which one can build truly effective
quantitative methods of project management.
A truly efficient scientific methodology of project data mining should become the heart of the new quantitative project management, an area that is currently in deep crisis.
Hushing up this crisis in quantitative project management might benefit some universities, and especially the companies directly involved in this business, but it does not benefit the industry as a whole.
Moreover, the consequences of this hushing up have been disastrous for the industry, where the losses associated with multiple, even massive, project failures are huge.
The worldwide incidence of this phenomenon has prompted a broad discussion of the problem, particularly of its financial aspects, but unfortunately these huge losses have not been associated with the crisis in quantitative project management.
Yet, along with other causes of massive project failures, the lack of a truly scientific methodology for planning and executing projects is one of the major reasons for this state of affairs.
The statistical methodology of project data mining, which is the ideological basis of modern quantitative project management, has in fact long proved a failure. Numerous statistical relations between project parameters, intended for the estimation and prediction of projects, have unacceptably low accuracy and are in fact unsuitable for the purposes of the industry.
All these relations are the result of direct processing of project data by methods of regression analysis. As stated in [1], such an approach to data processing has a number of serious shortcomings.
First, a single approximating curve cannot cover the entire range of project parameters. This means that, for more accurate estimation of projects and a more reliable description of the relationships between project parameters, families of curves must be used instead of one approximating curve.
Secondly, determining the mathematical form of the approximating curves directly from the data may not correctly reflect the essence of the functional relationships between specific project parameters. In other words, a bottom-up data processing methodology that ignores the internal logic of the data may not correctly reflect the essence of the phenomena under study.
PM World Today is a free monthly eJournal - Subscriptions available at http://www.pmworldtoday.net
This means that if the input parameters contain errors $\Delta W$, $\Delta P$ and $\Delta R$, and the corresponding formulas $f_1$, $f_2$ and $f_3$ are inadequate, errors $\Delta E$, $\Delta T$ and $\Delta N_{av}$ arise in the output parameters.
Mathematically, the errors in the estimates of the project parameters $E$, $T$ and $N_{av}$ can be interpreted as their differentials.
Correspondingly, for the differentials of project effort $E$, cycle time $T$ and average number of working people $N_{av}$ one has
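$$\Delta E = \frac{\partial E}{\partial W}\Delta W + \frac{\partial E}{\partial P}\Delta P + \frac{\partial E}{\partial R}\Delta R = \frac{\partial f_1(W,P,R)}{\partial W}\Delta W + \frac{\partial f_1(W,P,R)}{\partial P}\Delta P + \frac{\partial f_1(W,P,R)}{\partial R}\Delta R, \tag{4}$$
$$\Delta T = \frac{\partial T}{\partial W}\Delta W + \frac{\partial T}{\partial P}\Delta P + \frac{\partial T}{\partial R}\Delta R = \frac{\partial f_2(W,P,R)}{\partial W}\Delta W + \frac{\partial f_2(W,P,R)}{\partial P}\Delta P + \frac{\partial f_2(W,P,R)}{\partial R}\Delta R, \tag{5}$$
$$\Delta N_{av} = \frac{\partial N_{av}}{\partial W}\Delta W + \frac{\partial N_{av}}{\partial P}\Delta P + \frac{\partial N_{av}}{\partial R}\Delta R = \frac{\partial f_3(W,P,R)}{\partial W}\Delta W + \frac{\partial f_3(W,P,R)}{\partial P}\Delta P + \frac{\partial f_3(W,P,R)}{\partial R}\Delta R. \tag{6}$$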
As can be seen from these expressions, the accuracy of the input parameters and the accuracy of the estimating formulas affect the accuracy of the output parameters independently.
That is, regardless of the accuracy of the input parameters $W$, $P$ and $R$, increasing the accuracy of formulas (1), (2) and (3) may improve the accuracy of the estimates of project parameters.
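The first-order error propagation expressed by (4), (5) and (6) can be sketched numerically. Formulas (1), (2) and (3) are not reproduced in this part of the text, so the sketch below assumes, purely for illustration, the closed forms that follow from the state equation (9) and definition (8): E = W/P, T = sqrt(W*R/P) and N_av = sqrt(W/(P*R)). The function names and the sample values are hypothetical.

```python
import math


def estimates(W, P, R):
    """Illustrative stand-ins for f1, f2, f3 (an assumption, not the paper's formulas).

    Derived from the state equation P * T * N_av = W and R = T / N_av.
    """
    E = W / P
    T = math.sqrt(W * R / P)
    N_av = math.sqrt(W / (P * R))
    return E, T, N_av


def propagate_errors(W, P, R, dW, dP, dR, rel_step=1e-6):
    """Return first-order error estimates (dE, dT, dN_av) in the sense of (4)-(6).

    Partial derivatives are approximated by central differences.
    """
    inputs = [W, P, R]
    deltas = [dW, dP, dR]
    errors = [0.0, 0.0, 0.0]
    for i, (x, dx) in enumerate(zip(inputs, deltas)):
        h = abs(x) * rel_step
        upper = list(inputs)
        lower = list(inputs)
        upper[i] = x + h
        lower[i] = x - h
        f_upper = estimates(*upper)
        f_lower = estimates(*lower)
        for k in range(3):
            # partial derivative of output k with respect to input i, times the input error
            errors[k] += (f_upper[k] - f_lower[k]) / (2 * h) * dx
    return tuple(errors)


if __name__ == "__main__":
    # Hypothetical project: only the complexity W carries an error.
    dE, dT, dN = propagate_errors(W=400.0, P=1.0, R=4.0, dW=40.0, dP=0.0, dR=0.0)
    print(f"dE={dE:.3f}, dT={dT:.3f}, dN_av={dN:.3f}")
```

Under these assumed formulas a 10% error in W produces a 10% error in E but only about a 5% error in T and N_av, because the latter depend on the square root of W.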
Fig.2 presents all the increments $\Delta W$, $\Delta P$, $\Delta R$, $\Delta E$, $\Delta T$ and $\Delta N_{av}$ in the multi-dimensional flat project space with the aid of the TRANSCALE tool.
Remarkable curves in the project space generated by the condition of the constancy of Time/People ratio R
The use of families of curves of different nature in the project space is crucial for the analysis of project similarity.
Such a system of curves can be generated by conditions of constancy of project parameters [4], or of their various combinations. These curves can also be generated by different extremal conditions, such as the minimization of effort and risk [5].
Like other conditions of constancy of project parameters or their combinations, the condition of constancy of the priorities of project objectives can also generate a system of similarity curves in the project space.
Fig.2 Estimation errors of project parameters in multi-dimensional flat project space
Fig.3 presents three projects with the same complexity and Time/People ratio. As can be seen from the picture, the project points in the different fields lie along special curves, which are direct consequences of the condition of constancy of the Time/People ratio.
Fig.3 Implementation of the same project with three different development teams
$$W = \mathrm{Constant} \tag{7}$$
and
$$R = \frac{T}{N_{av}} = \mathrm{Constant} \tag{8}$$
Such remarkable curves can also be generated for projects with constant complexity $W = \mathrm{Constant}$ and different Time/People ratios by means of the transformation of projects in the project space [4].
As can be seen from Fig.3, each field or coordinate system of project parameters has its own
remarkable curve.
Let us consider these curves separately with the aid of the state equation of projects [6].
Equations of the remarkable curves for different project fields
The two conditions of constancy (7) and (8) represent a hyperbola in the [E,P] field (since $E = W/P$) and a straight line in the [$N_{av}$,T] field, respectively.
To obtain the equations of the remarkable curves in the [P,$N_{av}$] and [E,T] fields one can use the state equation of projects [6].
$$P \cdot T \cdot N_{av} = W \tag{9}$$
To find the equation of the curve in the [P,$N_{av}$] field that reflects the condition $R = T/N_{av} = \mathrm{Constant}$, we can represent (9) in the form
$$P \cdot T \cdot N_{av} \cdot \frac{N_{av}}{N_{av}} = P \cdot \frac{T}{N_{av}} \cdot N_{av}^2 = P R N_{av}^2 = W \tag{10}$$
From here one can obtain the desired relationship between team productivity and average team size in the form
$$P = \frac{W}{R N_{av}^2}. \tag{11}$$
This means that for any constant values of $W$ and $R$ the functional relationship between team productivity and average staffing has the form of a quadratic hyperbola, which is very important for project data mining and interpretation.
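A short numerical check of relationship (11) may make the quadratic hyperbola more tangible. The values of W and R below are hypothetical; the function name `productivity` is introduced only for this sketch. Each computed point is verified against the state equation (9).

```python
def productivity(W, R, N_av):
    """Team productivity from (11): P = W / (R * N_av**2)."""
    return W / (R * N_av ** 2)


W, R = 400.0, 4.0  # hypothetical complexity and Time/People ratio
for N_av in (5.0, 10.0, 20.0):
    P = productivity(W, R, N_av)
    T = R * N_av  # duration from definition (8)
    # The state equation (9) must hold for every point on the curve.
    assert abs(P * T * N_av - W) < 1e-9
    print(f"N_av={N_av:5.1f}  T={T:5.1f}  P={P:.2f}")
```

At constant W and R, doubling the average team size divides the productivity by four, which is the defining property of the quadratic hyperbola.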
Similarly, for the relationship between project effort and its duration for a constant Time/People ratio $R$ we have
$$E = \frac{W}{P} = T \cdot N_{av} = T \cdot N_{av} \cdot \frac{T}{T} = \frac{T^2}{R}. \tag{12}$$
This means that for constant values of the Time/People ratio the functional relationship between effort and project duration has a parabolic form.
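As a worked illustration of (12), take the hypothetical values $R = 10$ and $T = 20$ (in whatever units the project database uses). Then

$$N_{av} = \frac{T}{R} = 2, \qquad E = T \cdot N_{av} = 40 = \frac{T^2}{R} = \frac{400}{10}.$$

Doubling the duration to $T = 40$ at the same ratio gives $E = 1600/10 = 160$, that is, four times the effort.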
Project similarity zones
The exact determination of the Time/People ratio runs into various difficulties because of the many uncertainties in practice.
As a consequence, it is more appropriate to speak not about the exact value of this ratio but about the value lying within a certain range $\Delta R$, the minimum width of which is determined by these uncertainties.
Fig.5 Similarity zone of projects for a small interval $\Delta R$ of Time/People ratio $R$ (horizontal axis: $T$; vertical axis: $N_{av}$)
Experimental project points located within this interval $\Delta R$ are the basis for the refined regression between the parameters under study.
Choosing the interval $\Delta R$ always requires a trade-off between reliability and accuracy: widening the interval increases the probability that the expected value of the ratio lies within it, while narrowing the interval increases the accuracy of the regression.
Presentation of project similarity by the families of curves in different coordinate systems
The set of project data can be divided into groups according to the principle of relative constancy of the values of their Time/People ratio.
As a result, each group of projects will cover a certain range of values of this ratio. In different coordinate systems the set of Time/People ratios will generate different families of curves, which will cover the entire range of project parameters.
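The grouping step described above can be sketched as follows. The text does not prescribe how the zone boundaries are chosen, so equal-width zones of size `delta_r` are an assumption of this sketch, as are the function name and the sample project data.

```python
import math
from collections import defaultdict


def group_by_ratio(projects, delta_r):
    """Group projects into zones of width delta_r in Time/People ratio R = T / N_av.

    `projects` is an iterable of (T, N_av) pairs. The zone index is
    floor(R / delta_r), so zone k covers k*delta_r <= R < (k+1)*delta_r.
    """
    zones = defaultdict(list)
    for T, N_av in projects:
        R = T / N_av
        zones[math.floor(R / delta_r)].append((T, N_av))
    return dict(zones)


# Hypothetical project data: (duration T, average staffing N_av)
projects = [(10.0, 2.0), (12.0, 2.0), (30.0, 3.0), (44.0, 4.0), (40.0, 2.0)]
zones = group_by_ratio(projects, delta_r=4.0)
for index in sorted(zones):
    print(f"R in [{index * 4.0}, {(index + 1) * 4.0}): {zones[index]}")
```

Each resulting group can then be fitted separately, which is what turns a single regression curve into a family of curves.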
1. Family of project similarity curves in the coordinate system of [$N_{av}$,T]
In the general case of nonlinear scaling of projects, the Time/People ratio is a nonlinear function of project complexity.
As project complexity decreases, this nonlinear dependence gradually weakens and the Time/People ratio becomes constant. As complexity increases, the dependence can take different forms depending on market pressure. The study of this nonlinear dependence of the Time/People ratio on project scaling is very important for the study of giga-projects and is the subject of a special paper.
The specific case of a constant Time/People ratio is presented in Fig.6 in the form of a family of linear relationships between average project staffing $N_{av}$ and project duration $T$.
Fig.6 Zones of project similarity in the coordinate system [T, $N_{av}$] (horizontal axis: $T$; vertical axis: $N_{av}$)
2. Family of project similarity curves in the coordinate system of [P,$N_{av}$]
The functional relationship between team productivity $P$ and team size $N_{av}$ has long been a subject of debate and remains at the center of professionals' attention. Despite this, there is still no clarity as to what this functional relationship is.
Fig.7 Zones of project similarity in the coordinate system of [P,$N_{av}$] (horizontal axis: $N_{av}$; vertical axis: $P$)
The fact is that several such functional relationships may exist, depending on additional conditions. In particular, in [5] this functional relationship is derived from the conditional minimum of project effort. In [7] its derivation is based on considerations about communication between human beings and their interactions. In this paper, the same functional relationship is obtained from the condition of constancy of the Time/People ratio. The result is expression (11).
Fig.8 Zones of project similarity in the coordinate system [P,E] (horizontal axis: $E$; vertical axis: $P$)
With the aid of expression (11) we can divide the [Team productivity, Team size] field into zones of project similarity (Fig.7).
3. Family of project similarity curves in the field of [P,E]
Dividing the project database into groups by the condition of constancy of their complexity, one can represent the functional relationship between team productivity $P$ and project effort $E$ in the form of a family of hyperbolas (Fig.8).
4. Family of project similarity curves in the coordinate system of [E,T]
As we saw above, the condition of constancy of the Time/People ratio yields the functional relationship between project effort and its duration in the form of expression (12). Fig.9 presents this relationship for different constant values of the Time/People ratio $R$ in the form of a family of parabolic curves.
Fig.9 Zones of project similarity in the coordinate system [E, T] (horizontal axis: $T$; vertical axis: $E$)
Presentation of the zones of project similarity with the aid of the TRANSCALE tool
Thus, by dividing project databases into groups using various criteria of similarity, it is possible to improve the accuracy of project estimation dramatically.
This is achieved because each group of projects has its own regression equation, whose accuracy is much higher than when the entire set of project data is replaced by a single regression curve.
Graphical presentation of project similarity zones is very useful for selecting groups of similar projects when planning a new project. For multivariate comparisons and project similarity analysis it is convenient to use the TRANSCALE tool.
Fig.10 is a joint presentation of a project database in the multi-parametrical flat project space
with the zones of project similarity.
Accuracy analysis of multi-parametric models of projects
As an example, let us consider the functional relationship between total project effort and project duration. This relationship can be analyzed by taking a project database and breaking it into different numbers of groups. The project database is grouped by similarity of the Time/People ratio $R$.
Fig.10 Multi-parametrical presentation of projects with the zones of their similarity
Fig.11 contains the results of project grouping and approximation for one to four groups of projects. As can be seen from the picture, increasing the number of groups of projects increases the accuracy of the approximation.
In addition, the numerical values of the approximating power-law exponents approach the value 2, which is close to the theoretical value of this exponent.
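The effect of grouping on the fitted power law can be sketched with a small least-squares fit in log-log coordinates. The data below are synthetic and noise-free, generated from (12) for two hypothetical values of R; they are not the project database behind Fig.11, and the function name `fit_power_law` is introduced only for this sketch.

```python
import math


def fit_power_law(points):
    """Least-squares fit of E = a * T**b in log-log space.

    `points` is a list of (T, E) pairs. Returns (a, b, r2), where r2 is the
    coefficient of determination of the straight-line fit to (ln T, ln E).
    """
    xs = [math.log(T) for T, _ in points]
    ys = [math.log(E) for _, E in points]
    n = len(points)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    b = sxy / sxx
    a = math.exp(my - b * mx)
    r2 = (sxy * sxy) / (sxx * syy)
    return a, b, r2


# Synthetic data (an assumption for illustration): E = T**2 / R for R = 5 and R = 20.
group_low = [(T, T ** 2 / 5.0) for T in (10.0, 20.0, 30.0)]
group_high = [(T, T ** 2 / 20.0) for T in (15.0, 25.0, 40.0)]

_, b_all, r2_all = fit_power_law(group_low + group_high)
_, b_low, r2_low = fit_power_law(group_low)
_, b_high, r2_high = fit_power_law(group_high)
print(f"single curve: b={b_all:.3f}, r2={r2_all:.3f}")
print(f"group R=5:    b={b_low:.3f}, r2={r2_low:.3f}")
print(f"group R=20:   b={b_high:.3f}, r2={r2_high:.3f}")
```

On this synthetic set the single curve fitted to all points has an exponent well below 2 and a low coefficient of determination, while each group on its own recovers the exponent 2 with a near-perfect fit. Real project data contain scatter, so the improvement there is less clear-cut.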
It is also important to analyze the possibility of using the Time/People ratio as an input parameter for estimating project parameters. From this point of view one can analyze the fitting curves with higher values of $r^2$.
For example, for the middle curve in the case of three approximating curves, with $r^2 = 0.989$, the numerical values of the Time/People ratio in the corresponding group range from 7.786 to 12. This means that even rough estimates of the Time/People ratio can significantly improve the quality of project estimates.
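To get a feel for what a rough estimate of R costs, one can ask how far an effort estimate from (12) can drift when the true ratio lies anywhere in the reported range. The sketch below assumes that (12) holds exactly within the group, and it picks the geometric mean of the range as the rough estimate. Both are assumptions of this illustration, as is the duration value.

```python
import math

R_MIN, R_MAX = 7.786, 12.0  # range of R reported for the group with r^2 = 0.989


def effort(T, R):
    """Effort from (12): E = T**2 / R."""
    return T ** 2 / R


# Hypothetical rough estimate of R: the geometric mean of the range, which
# balances the worst-case over- and underestimate when measured as ratios.
r_rough = math.sqrt(R_MIN * R_MAX)

T = 30.0  # hypothetical project duration
ratio_low = effort(T, r_rough) / effort(T, R_MIN)   # true R at the lower bound
ratio_high = effort(T, r_rough) / effort(T, R_MAX)  # true R at the upper bound
print(f"rough R = {r_rough:.3f}")
print(f"estimated/true effort ranges from {ratio_low:.3f} to {ratio_high:.3f}")
```

Under these assumptions the estimated effort stays within roughly a quarter of the true value over the whole range, independent of the duration chosen.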
Fig.11 Four options of grouping of the same project data
Conclusions
1. In spite of their non-systematic character, project databases are important sources of information about the functional relationships between project parameters.
2. The statistical methodology of project data mining, which is the ideological basis of modern quantitative project management, has in fact long proved a failure.
3. Numerous statistical relationships between project parameters intended for project estimation have extremely low accuracy and therefore cannot be used for the everyday practical purposes of the industry.
4. To extract correct functional relationships between project parameters from project data it is necessary to develop special non-statistical methods.
5. Efficient scientific project data mining should be the core of quantitative project management, an area which is currently in deep crisis.
6. Given the large scatter of project parameters in the databases, acceptable accuracy of the relationships obtained from these data requires covering the data with families of approximating curves.
7. The identification of project objectives and their priorities is directly related to the accuracy of the estimates of project parameters.
8. Quantitative description of the priorities of project objectives is the key to improving the accuracy of project estimates.
9. Analysis of project data indicates that even rough estimates of the priorities of project objectives can significantly improve the quality of project estimates.
10. The new methodology of project data mining can serve as an ideological basis for the development of a new methodology and algorithms of project synthesis.
References
1. Pavel Barseghyan (2010) Project Nonlinear Scaling and Transformation Methodology and TRANSCALE Tool. PM World Today May 2010 (Vol XII, Issue V).
2. Pavel Barseghyan (2009) Problems of the Mathematical Theory of Human Work (Principles of mathematical modeling in project management). PM World Today August 2009 (Vol XI, Issue VIII).
3. Pavel Barseghyan (2010) Project Data Mining and Project Estimation Top-down Methodology with TRANSCALE Tool. PM World Today June 2010 (Vol XII, Issue VI).
4. Pavel Barseghyan (2010) Similarity of Projects: Methodology and Analysis with TRANSCALE Tool. PM World Today July 2010 (Vol XII, Issue VII).
5. Pavel Barseghyan (2009) Principles of Top-Down Quantitative Analysis of Projects. Part 2: Analytical Derivation of Functional Relationships between Project Parameters without Project Data. PM World Today June 2009 (Vol XI, Issue VI).
6. Pavel Barseghyan (2009) Principles of Top-Down Quantitative Analysis of Projects. Part 1: State Equation of Projects and Project Change Analysis. PM World Today May 2009 (Vol XI, Issue V).
7. Pavel Barseghyan (2009) Quantitative Analysis of Team Size and its Hierarchical Structure. PM World Today July 2009 (Vol XI, Issue VII).
About the Author
Pavel Barseghyan, PhD
Author
Dr. Pavel Barseghyan is a consultant in the field of quantitative project management, project data mining and organizational science. He is the founder of Systemic PM, LLC, a project management company. He has over 40 years of experience in academia, the electronics industry, the EDA industry, and project management research and tools development. During the period of 1999-2010 he was the Vice President of Research for Numetrics Management Systems. Prior to joining Numetrics, Dr. Barseghyan worked as an R&D manager at Infinite Technology Corp. in Texas. He was also a founder and the president of an EDA start-up company, DAN Technologies, Ltd., that focused on high-level chip design planning and RTL structural floor planning technologies. Before joining ITC, Dr. Barseghyan was head of the Electronic Design and CAD department at the State Engineering University of Armenia, focusing on development of the Theory of Massively Interconnected Systems and its applications to electronic design. During the period of 1975-1990, he was also a member of the University Educational Policy Commission for Electronic Design and CAD Direction in the Higher Education Ministry of the former USSR. Earlier in his career he was a senior researcher in Yerevan Research and Development Institute of Mathematical Machines (Armenia). He is an author of nine monographs and textbooks and more than 100 scientific articles in the area of quantitative project management, mathematical theory of human work, electronic design and EDA methodologies, and tools development. More than 10 Ph.D. degrees have been awarded under his supervision. Dr. Barseghyan holds an MS in Electrical Engineering (1967) and Ph.D. (1972) and Doctor of Technical Sciences (1990) in Computer Engineering from Yerevan Polytechnic Institute (Armenia). Pavel can be contacted at [email protected].