Multi-Parametric Models for Project Data Mining and Project Planning

Upload: pavel-barseghyan

Post on 04-Apr-2018


    PM World Today August 2010 (Vol XII, Issue VIII)

    Project data, despite its unsystematic nature as a result of a random data collection process, is a unique and important source of information on which one can build truly effective quantitative methods of project management.

    A truly efficient scientific methodology of project data mining should become the heart of the new quantitative project management, an area that is currently in deep crisis.

    Hushing up this crisis in quantitative project management may benefit some universities, and especially companies directly involved in this business, but it does not benefit the industry as a whole.

    Moreover, the consequences of this silence have been disastrous for the industry, where the losses associated with multiple, even massive, project failures are huge.

    The worldwide incidence of this phenomenon has led to broad discussion of the problem, particularly its financial aspects, but unfortunately these huge losses have not been associated with the crisis in quantitative project management.

    The point is that, along with other causes of massive project failures, the lack of a truly scientific methodology for planning and executing projects is one of the major reasons for this state of affairs.

    The statistical methodology of project data mining, which is the ideological basis of modern quantitative project management, has in fact long proved a failure. Numerous statistical relations between project parameters, intended for the estimation and prediction of projects, have unacceptably low accuracy and are in fact unsuitable for the purposes of the industry.

    All these relations are the result of direct processing of project data using methods of regression analysis. As stated in [1], such an approach to data processing has a number of serious shortcomings.

    First, a single approximating curve cannot cover the entire range of project parameters. This means that, for more accurate estimation of projects and a more reliable description of the relationships between their parameters, we need to use families of curves instead of one approximating curve.

    Secondly, determining the mathematical form of the approximating curves directly from the data may not correctly reflect the essence of the functional relationships between specific project parameters. In other words, a bottom-up data processing methodology that does not take the internal logic of the data into account may not correctly reflect the essence of the phenomena under study.

    PM World Todayis a free monthly eJournal - Subscriptions available at http://www.pmworldtoday.net Page 2


    This means that if the input parameters contain errors $\Delta W$, $\Delta P$ and $\Delta R$ and the corresponding formulas $f_1$, $f_2$ and $f_3$ are inadequate, errors $\Delta E$, $\Delta T$ and $\Delta N_{av}$ may arise in the output parameters.

    Errors in the estimates of the project parameters $E$, $T$ and $N_{av}$ can be interpreted mathematically as their differentials.

    Correspondingly, for the differentials of project effort $\Delta E$, cycle time $\Delta T$ and average number of working people $\Delta N_{av}$ one has:

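    $$\Delta E = \frac{\partial E}{\partial W}\Delta W + \frac{\partial E}{\partial P}\Delta P + \frac{\partial E}{\partial R}\Delta R = \frac{\partial f_1(W,P,R)}{\partial W}\Delta W + \frac{\partial f_1(W,P,R)}{\partial P}\Delta P + \frac{\partial f_1(W,P,R)}{\partial R}\Delta R, \quad (4)$$

    $$\Delta T = \frac{\partial T}{\partial W}\Delta W + \frac{\partial T}{\partial P}\Delta P + \frac{\partial T}{\partial R}\Delta R = \frac{\partial f_2(W,P,R)}{\partial W}\Delta W + \frac{\partial f_2(W,P,R)}{\partial P}\Delta P + \frac{\partial f_2(W,P,R)}{\partial R}\Delta R, \quad (5)$$

    $$\Delta N_{av} = \frac{\partial N_{av}}{\partial W}\Delta W + \frac{\partial N_{av}}{\partial P}\Delta P + \frac{\partial N_{av}}{\partial R}\Delta R = \frac{\partial f_3(W,P,R)}{\partial W}\Delta W + \frac{\partial f_3(W,P,R)}{\partial P}\Delta P + \frac{\partial f_3(W,P,R)}{\partial R}\Delta R. \quad (6)$$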

    As can be seen from these expressions, the accuracy of the input parameters and the accuracy of the estimating formulas affect the accuracy of the output parameters independently.

    That is, regardless of the input errors $\Delta W$, $\Delta P$ and $\Delta R$, increasing the accuracy of formulas (1), (2) and (3) may improve the accuracy of the estimates of project parameters.
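    The propagation in (4)-(6) can be tried numerically. The sketch below is only an illustration under stated assumptions: the explicit forms used for $f_1$, $f_2$ and $f_3$ are derived here from the state equation $P \cdot T \cdot N_{av} = W$ and the definition $R = T/N_{av}$, not taken from formulas (1)-(3); the names `outputs` and `propagate` and all numbers are hypothetical. Partial derivatives are approximated by central differences.

```python
import math


def outputs(W, P, R):
    """Return (E, T, N_av) for complexity W, productivity P, Time/People ratio R.

    Assumed forms: E = W / P, N_av = sqrt(W / (P * R)), T = R * N_av.
    """
    E = W / P
    N_av = math.sqrt(W / (P * R))
    T = R * N_av
    return E, T, N_av


def propagate(W, P, R, dW, dP, dR, h=1e-6):
    """First-order error estimates (dE, dT, dN_av) from input errors."""
    x = [W, P, R]
    dx = [dW, dP, dR]
    errors = [0.0, 0.0, 0.0]
    for i in range(3):
        step = h * x[i]
        hi, lo = list(x), list(x)
        hi[i] += step
        lo[i] -= step
        f_hi, f_lo = outputs(*hi), outputs(*lo)
        for k in range(3):
            # partial derivative of output k w.r.t. input i, times input error
            errors[k] += (f_hi[k] - f_lo[k]) / (2 * step) * dx[i]
    return tuple(errors)


# Hypothetical project: W = 1000, P = 10, R = 10, with small input errors.
dE, dT, dN_av = propagate(1000.0, 10.0, 10.0, dW=50.0, dP=1.0, dR=0.0)
print(dE, dT, dN_av)
```

    Each output error is a sum of terms, one per input, so the sensitivity of every output to every input error can be inspected separately by setting the other two input errors to zero.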

    Fig.2 presents all the increments $\Delta W$, $\Delta P$, $\Delta R$, $\Delta E$, $\Delta T$ and $\Delta N_{av}$ in the multi-dimensional flat project space with the aid of the TRANSCALE tool.

    Remarkable curves in the project space generated by the condition of the constancy of Time/People ratio R

    The use of families of curves of different nature in the project space is crucial for the analysis of project similarity.

    Such a system of curves can be generated by the conditions of constancy of project parameters [4], or by the conditions of constancy of their various combinations. These curves can also be generated by different extremum conditions, such as minimization of effort and risk [5].

    Like other conditions in the form of constancy of project parameters or their combinations, the condition of constancy of the priorities of project objectives can also generate a system of similarity curves in the project space.


    Fig.2 Estimation errors of project parameters in multi-dimensional flat project space

    Fig.3 presents three projects with the same complexity and Time/People ratio. As seen from the picture, the project points in the different fields lie along special curves that are direct consequences of the condition of constancy of the Time/People ratio.

    Fig.3 Implementation of the same project with three different development teams


    $$W = \text{Constant} \quad (7)$$

    and

    $$R = \frac{T}{N_{av}} = \text{Constant} \quad (8)$$

    Such remarkable curves can also be generated for projects with constant complexity $W = \text{Constant}$ and different Time/People ratios by means of transformation of projects in the project space [4].

    As can be seen from Fig.3, each field or coordinate system of project parameters has its own remarkable curve.

    Let us consider these curves separately with the aid of the state equation of projects [6].

    Equations of the remarkable curves for different project fields

    The two conditions of constancy (7) and (8) represent a hyperbola in the [$E$, $P$] field and a straight line in the [$N_{av}$, $T$] field, respectively.

    To obtain the equations of the remarkable curves in the [$P$, $N_{av}$] and [$E$, $T$] fields one can use the state equation of projects [6]:

    $$P \cdot T \cdot N_{av} = W \quad (9)$$

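    To find the equation of the curve in the [$P$, $N_{av}$] field that reflects the condition $R = T/N_{av} = \text{Constant}$, we can represent (9) in the form

    $$P \cdot T \cdot N_{av} = P \cdot T \cdot N_{av} \cdot \frac{N_{av}}{N_{av}} = P \cdot \frac{T}{N_{av}} \cdot N_{av}^2 = P R N_{av}^2 = W \quad (10)$$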

    From here one can obtain the desired relationship between team productivity and average team size in the form

    $$P = \frac{W}{R N_{av}^2}. \quad (11)$$
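    As a quick numerical illustration of (11), the sketch below evaluates team productivity for a few team sizes at fixed $W$ and $R$. The function name `productivity` and the sample values are hypothetical; units are left unspecified, as in the text.

```python
def productivity(W, R, N_av):
    """Team productivity from expression (11): P = W / (R * N_av**2)."""
    return W / (R * N_av ** 2)


W, R = 1000.0, 10.0  # hypothetical complexity and Time/People ratio
for N_av in (2, 4, 8):
    P = productivity(W, R, N_av)
    T = R * N_av  # duration implied by R = T / N_av
    # P * T * N_av reproduces W, i.e. the state equation (9) holds
    print(N_av, P, P * T * N_av)
```

    With $W$ and $R$ held constant, each doubling of the average team size divides the required productivity by four, which is the inverse-square behaviour of the curve.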


    This means that for any constant values of $W$ and $R$ the functional relationship between team productivity and average staffing has the form of a quadratic (inverse-square) hyperbola, which is very important for project data mining and interpretation.

    Similarly, for the relationship between project effort and project duration at a constant Time/People ratio $R$ we have

    $$E = \frac{W}{P} = T \cdot N_{av} = T \cdot N_{av} \cdot \frac{T}{T} = \frac{T^2}{R}. \quad (12)$$

    This means that for a constant value of the Time/People ratio the functional relationship between effort and project duration has a parabolic form.
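    The parabolic relationship (12) can be checked in a few lines. This is a minimal sketch with hypothetical values; the function name `effort` is not from the paper.

```python
def effort(T, R):
    """Project effort from expression (12): E = T**2 / R."""
    return T ** 2 / R


R = 10.0  # hypothetical constant Time/People ratio
for T in (10.0, 20.0, 40.0):
    N_av = T / R  # average staffing implied by R = T / N_av
    # effort(T, R) and T * N_av are two routes to the same value of E
    print(T, effort(T, R), T * N_av)
```

    Doubling the duration at constant $R$ quadruples the effort, because the average staffing $N_{av} = T/R$ doubles along with $T$.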

    Project similarity zones

    An exact determination of the Time/People ratio faces various difficulties because of the many uncertainties encountered in practice.

    As a consequence, it is more appropriate to speak not about the exact value of this ratio but about the value lying in a certain range $\Delta R$, the minimum width of which is determined by these uncertainties.

    Fig.5 Similarity zone of projects for a small interval $\Delta R$ of Time/People ratio $R$ (axes: $N_{av}$ and $T$)


    Experimental project points located within this interval $\Delta R$ are the basis for the refined regression between the parameters under study.

    Choosing the interval $\Delta R$ always requires a trade-off between reliability and accuracy: widening the interval increases the probability that the expected value of the ratio falls within it, while narrowing the interval increases the accuracy of the regression.
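    Selecting the project points that fall inside a similarity zone amounts to a simple filter on $R = T/N_{av}$. The sketch below assumes a zone centred on a chosen ratio with total width $\Delta R$; the `Project` class, the function `similarity_zone` and the sample records are hypothetical and only illustrate the trade-off described above.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    T: float     # project duration
    N_av: float  # average staffing

    @property
    def R(self):
        """Time/People ratio R = T / N_av."""
        return self.T / self.N_av


def similarity_zone(projects, R_center, delta_R):
    """Projects whose ratio lies in [R_center - delta_R/2, R_center + delta_R/2]."""
    lo, hi = R_center - delta_R / 2, R_center + delta_R / 2
    return [p for p in projects if lo <= p.R <= hi]


# Hypothetical project database.
database = [
    Project("A", 20.0, 2.0),   # R = 10
    Project("B", 30.0, 2.5),   # R = 12
    Project("C", 16.0, 2.0),   # R = 8
    Project("D", 12.0, 4.0),   # R = 3
    Project("E", 40.0, 2.0),   # R = 20
]

narrow = similarity_zone(database, R_center=10.0, delta_R=2.0)
wide = similarity_zone(database, R_center=10.0, delta_R=5.0)
print([p.name for p in narrow], [p.name for p in wide])
```

    The wider zone captures more projects for the regression, at the price of mixing projects whose ratios differ more from one another.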

    Presentation of project similarity by the families of curves in different coordinate systems

    The set of project data can be divided into groups according to the principle of relative constancy of their Time/People ratio.

    As a result, each group of projects will cover a certain range of values of this ratio. In different coordinate systems the set of Time/People ratios will generate different families of curves, which together cover the entire range of project parameters.
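    One possible way to form such groups is to sort the projects by their Time/People ratio and cut the sorted list into consecutive parts. The text does not prescribe a particular splitting rule, so the equal-count split below is an assumption, as are the function name `group_by_ratio` and the sample data.

```python
def group_by_ratio(projects, k):
    """Split (T, N_av) records into k consecutive groups ordered by R = T / N_av.

    Groups have near-equal size; this splitting rule is an assumption.
    """
    ranked = sorted(projects, key=lambda p: p[0] / p[1])
    n = len(ranked)
    bounds = [i * n // k for i in range(k + 1)]
    return [ranked[bounds[i]:bounds[i + 1]] for i in range(k)]


# Hypothetical (T, N_av) records with ratios 10, 12, 8, 3, 20 and 5.
records = [(20.0, 2.0), (30.0, 2.5), (16.0, 2.0),
           (12.0, 4.0), (40.0, 2.0), (15.0, 3.0)]

for group in group_by_ratio(records, 3):
    ratios = [T / N_av for T, N_av in group]
    print(min(ratios), max(ratios), group)
```

    Each printed group covers its own range of ratios, and the ranges do not overlap because the records were sorted before cutting.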

    1. Family of project similarity curves in the coordinate system of [$N_{av}$, $T$]

    In the general case of nonlinear scaling of projects, the Time/People ratio is a nonlinear function of project complexity.

    As project complexity decreases, this nonlinear dependence gradually weakens and the Time/People ratio becomes constant. With increasing complexity, depending on market pressure, the dependence can take different forms. The study of this nonlinear dependence of the Time/People ratio on project scaling is very important for the study of giga-projects and is the subject of a separate paper.

    The specific case of a constant Time/People ratio is presented in Fig.6 as a family of linear relationships between average project staffing $N_{av}$ and project duration $T$.

    Fig.6 Zones of project similarity in the coordinate system [$T$, $N_{av}$]

    2. Family of project similarity curves in the coordinate system of [$P$, $N_{av}$]

    The functional relationship between team productivity $P$ and team size $N_{av}$ has long been a subject of debate and continues to be at the center of attention of professionals. Despite this, there is still no clarity as to what this functional relationship is.

    Fig.7 Zones of project similarity in the coordinate system of [$P$, $N_{av}$]

    The fact is that, depending on additional conditions, several such functional relationships may exist. In particular, in [5] this functional relationship is derived from the conditional minimum of project effort. In [7] its derivation is based on considerations about communication between human beings and their interactions. In this paper, the same functional relation is obtained from the condition of constancy of the Time/People ratio. The result is expression (11).

    Fig.8 Zones of project similarity in the coordinate system [$P$, $E$]

    With the aid of expression (11) we can divide the [Team productivity, Team size] field into zones of project similarity (Fig.7).

    3. Family of project similarity curves in the field of [P,E]

    By dividing the project database into groups according to the condition of constancy of their complexity, one can represent the functional relationship between team productivity $P$ and project effort $E$ as a family of hyperbolas (Fig.8).

    4. Family of project similarity curves in the coordinate system of [E,T]

    As we saw above, the condition of constancy of the Time/People ratio yields the functional relationship between project effort and project duration in the form of expression (12). Fig.9 presents this relationship for different constant values of the Time/People ratio $R$ as a family of parabolic curves.

    Fig.9 Zones of project similarity in the coordinate system [$E$, $T$]


    Presentation of the zones of project similarity with the aid of the TRANSCALE tool

    Thus, by dividing project databases into groups using various criteria of similarity, it is possible to improve the accuracy of project estimation dramatically.

    This is achieved because each group of projects has its own regression equation, whose accuracy is much higher than when the entire set of project data is replaced by a single regression curve.
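    The effect of per-group regression can be illustrated on synthetic data. The sketch below fits a power law $E = a T^b$ by ordinary least squares in log-log coordinates, once per group and once for the pooled data. The data are generated exactly from $E = T^2/R$ for two hypothetical ratios, so they are noise-free and do not represent any real project database; the function name `fit_power_law` is also hypothetical.

```python
import math


def fit_power_law(points):
    """Least-squares fit of E = a * T**b in log-log space; returns (a, b, r2)."""
    xs = [math.log(T) for T, _ in points]
    ys = [math.log(E) for _, E in points]
    n = len(points)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx                  # fitted exponent
    a = math.exp(my - b * mx)      # fitted coefficient
    r2 = sxy ** 2 / (sxx * syy)    # coefficient of determination
    return a, b, r2


# Synthetic (T, E) points for two hypothetical Time/People ratios.
durations = {4.0: [10, 15, 20, 40], 16.0: [10, 20, 30, 40]}
groups = {R: [(T, T ** 2 / R) for T in Ts] for R, Ts in durations.items()}
pooled = [pt for pts in groups.values() for pt in pts]

for R, pts in groups.items():
    print("group R =", R, fit_power_law(pts))
print("pooled", fit_power_law(pooled))
```

    Within each group the fit recovers the exponent 2 and the coefficient $1/R$ up to rounding, while the single pooled curve has a visibly lower $r^2$ because it averages over projects with different ratios.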

    Graphical presentation of project similarity zones is very useful for selecting groups of similar projects when planning a new project. For multivariate comparisons and project similarity analysis it is convenient to use the TRANSCALE tool.

    Fig.10 is a joint presentation of a project database in the multi-parametric flat project space together with the zones of project similarity.

    Accuracy analysis of multi-parametric models of projects

    As an example, let us consider the functional relationship between total project effort and project duration. This relationship can be analyzed by taking a project database and breaking it into different numbers of groups. The project database is grouped by similarity of the Time/People ratio $R$.


    Fig.10 Multi-parametrical presentation of projects with the zones of their similarity

    Fig.11 contains the results of project grouping and approximation for one to four groups of projects. As can be seen from the picture, increasing the number of groups increases the accuracy of approximation.

    In addition, the numerical values of the approximating power-law exponents approach 2, the theoretical value of this exponent that follows from (12).

    It is also important to analyze the possibility of using the Time/People ratio as an input parameter for estimating the parameters of projects. From this point of view one can analyze the fitting curves with higher values of $r^2$.

    For example, if we analyze the middle curve in the case of three approximating curves, with $r^2 = 0.989$, the Time/People ratios in the corresponding group range from 7.786 to 12. This means that even rough estimates of the Time/People ratio can significantly improve the quality of project estimates.
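    To see what a rough estimate of the ratio means for an effort estimate, one can bound $E = T^2/R$ using the two ends of the interval quoted above. This is a back-of-the-envelope sketch: the duration $T = 30$ and the function name `effort_range` are hypothetical, and it assumes the group follows the theoretical form (12) exactly.

```python
def effort_range(T, R_low, R_high):
    """Bounds on E = T**2 / R when R is only known to lie in [R_low, R_high]."""
    return T ** 2 / R_high, T ** 2 / R_low


# Interval of Time/People ratios taken from the text; T is a hypothetical duration.
low, high = effort_range(T=30.0, R_low=7.786, R_high=12.0)
print(round(low, 1), round(high, 1))
```

    The bounds differ by the factor $R_{high}/R_{low}$, about 1.54 for this interval, independently of the duration chosen.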


    Fig.11 Four options of grouping of the same project data

    Conclusions

    1. In spite of their non-systematic character, project databases are important sources of information about the functional relationships between project parameters.

    2. The statistical methodology of project data mining, which is the ideological basis of modern quantitative project management, has in fact long proved a failure.

    3. Numerous statistical relationships between project parameters intended for project estimation have extremely low accuracy and therefore cannot be used for the everyday practical purposes of the industry.

    4. To extract correct functional relationships between project parameters from project data it is necessary to develop special non-statistical methods.

    5. Efficient scientific project data mining should be the core of quantitative project management, an area which is currently in deep crisis.

    6. Given the large scatter of project parameters in the databases, ensuring acceptable accuracy of the relationships obtained from these data requires covering the data with families of approximating curves.

    7. Identification of project objectives and their priorities is directly related to the accuracy of the estimates of project parameters.

    8. Quantitative description of the priorities of project objectives is the key to improving the accuracy of project estimates.

    9. Analysis of project data indicates that even rough estimates of the priorities of project objectives can significantly improve the quality of project estimates.

    10. A new methodology of project data mining can serve as an ideological basis for the development of a new methodology and algorithms for project synthesis.

    References

    1. Pavel Barseghyan (2010). Project Nonlinear Scaling and Transformation Methodology and TRANSCALE Tool. PM World Today, May 2010 (Vol XII, Issue V).

    2. Pavel Barseghyan (2009). Problems of the Mathematical Theory of Human Work (Principles of mathematical modeling in project management). PM World Today, August 2009 (Vol XI, Issue VIII).

    3. Pavel Barseghyan (2010). Project Data Mining and Project Estimation Top-down Methodology with TRANSCALE Tool. PM World Today, June 2010 (Vol XII, Issue VI).

    4. Pavel Barseghyan (2010). Similarity of Projects: Methodology and Analysis with TRANSCALE Tool. PM World Today, July 2010 (Vol XII, Issue VII).

    5. Pavel Barseghyan (2009). Principles of Top-Down Quantitative Analysis of Projects. Part 2: Analytical Derivation of Functional Relationships between Project Parameters without Project Data. PM World Today, June 2009 (Vol XI, Issue VI).

    6. Pavel Barseghyan (2009). Principles of Top-Down Quantitative Analysis of Projects. Part 1: State Equation of Projects and Project Change Analysis. PM World Today, May 2009 (Vol XI, Issue V).

    7. Pavel Barseghyan (2009). Quantitative Analysis of Team Size and its Hierarchical Structure. PM World Today, July 2009 (Vol XI, Issue VII).

    About the Author


    Pavel Barseghyan, PhD

    Dr. Pavel Barseghyan is a consultant in the field of quantitative project management, project data mining and organizational science. He is the founder of Systemic PM, LLC, a project management company. He has over 40 years of experience in academia, the electronics industry, the EDA industry and project management research and tools development. During the period 1999-2010 he was the Vice President of Research for Numetrics Management Systems. Prior to joining Numetrics, Dr. Barseghyan worked as an R&D manager at Infinite Technology Corp. in Texas. He was also a founder and the president of an EDA start-up company, DAN Technologies, Ltd., that focused on high-level chip design planning and RTL structural floor planning technologies. Before joining ITC, Dr. Barseghyan was head of the Electronic Design and CAD department at the State Engineering University of Armenia, focusing on development of the Theory of Massively Interconnected Systems and its applications to electronic design. During the period 1975-1990, he was also a member of the University Educational Policy Commission for Electronic Design and CAD Direction in the Higher Education Ministry of the former USSR. Earlier in his career he was a senior researcher in Yerevan Research and Development Institute of Mathematical Machines (Armenia). He is an author of nine monographs and textbooks and more than 100 scientific articles in the area of quantitative project management, mathematical theory of human work, electronic design and EDA methodologies, and tools development. More than 10 Ph.D. degrees have been awarded under his supervision. Dr. Barseghyan holds an MS in Electrical Engineering (1967) and Ph.D. (1972) and Doctor of Technical Sciences (1990) in Computer Engineering from Yerevan Polytechnic Institute (Armenia). Pavel can be contacted at [email protected].
