Mobile Open Source Economic Analysis
LiMo Foundation White Paper, August 2009
Executive Summary
It is possible to use quantitative techniques to examine a number of the proposed economic
benefits of open source software. The claimed benefits are a reduced cost of acquisition,
access to innovation and a reduced cost of ownership of software technology.
The quantitative techniques we use to conduct our analysis are based on measuring source
lines of code (SLOC) applied to publicly accessible open source project repositories. To aid
our analysis, we have developed a command line tool to mine information on open source
projects using the ohloh [1] web service.
Based on this analysis, there is a strong case for constructive engagement with
open source communities where the corresponding open source software
components are used within a collaboratively developed, open mobile software
platform such as the LiMo Platform™.
There is additionally a case for mobile software platform providers to consider
using certain strategic open source projects as the basis for development of new
functionality on their roadmap.
There is no proven case within this analysis for converting existing proprietary items already
within a mobile software platform to open source. To conduct a cost-benefit analysis of that
scenario would require examination of more factors than SLOC alone.
[1] http://www.ohloh.net
1. Introduction
The subject of open source is increasingly important in relation to mobile device platforms and in view of this,
it is vital to understand the underlying economic factors driving the use of open source software in a mobile
context. This paper seeks to move beyond opinion-based debate, by identifying the economic case for open
mobile platforms to acknowledge and embrace their use of open source software and to actively contribute
back changes to open source components modified or adapted within their platform.
This white paper attempts to quantify and corroborate the benefits of using open
source software in mobile platforms in relation to key components which
lie below the mobile commodity line. This line, for our purposes, lies approximately
around the UI framework level of a typical mobile software stack. Components
below the line are considered for this analysis to be commodity software. Above
the line lies the domain of differentiation. The approach we use involves applying economic cost-benefit analysis
techniques where applicable in addition to citing relevant authoritative peer-reviewed material. The following areas
of claimed benefit have been analysed in relation to open source mobile software components around or below
the commodity line:
• Reduced cost of software acquisition
• Access to software innovation
• Reduced cost of software ownership
The analysis of this last area involves trying to quantify the cost to a mobile platform provider of failing to
engage with upstream changes.
2. Adopting open source to reduce the cost of software acquisition
2.1 The COCOMO model
The claim that adopting existing open source technology reduces the cost of software acquisition can be
measured using the COnstructive COst MOdel [2] (COCOMO), developed in 1981 [3]
by Dr Barry Boehm [4], Emeritus Professor of Software Engineering at USC and
a leading software engineering academic. COCOMO has since evolved into an
industry standard [5] with respect to software cost metrics. The model computes the
cost of software development as a function of the total source lines of code (SLOC)
of the corresponding components.
COCOMO has been significantly refined since its inception to reflect the
intervening changes in software development methodology and techniques,
in particular to acknowledge more iterative approaches which better reflect modern development. The latest
version of the model, COCOMO II, contains a number of further adjusting factors and, according to the USC
Center for Systems and Software Engineering:
“This new, improved COCOMO is now ready to assist
professional software cost estimators for many years to come” [6].
The approach taken by the basic COCOMO model is twofold. First, a hierarchy of three different cost models (organic,
semi-detached and embedded) is introduced, designed to take into account the overhead of development
depending on the type of project being analysed. Secondly, COCOMO combines the cost model with a suitable
annualized engineer cost/productivity figure to yield the equivalent cost of development within a typical
software engineering context. These elements combine in the following regression functions:
$\text{Effort Applied} = a \cdot (\text{KLOC})^{b}$ [man-months [7]]
$\text{Development Time} = c \cdot (\text{Effort Applied})^{d}$ [months]
$\text{People required} = \text{Effort Applied} / \text{Development Time}$ [count]
[2] http://sunset.usc.edu/csse/research/COCOMOII/cocomo_main.html
[3] Barry Boehm. Software engineering economics. Englewood Cliffs, NJ: Prentice-Hall, 1981.
[4] http://sunset.usc.edu/Research_Group/barry.html
[5] See US Govt Dept of Defense SoftwareTech estimation site: https://www.thedacs.com/databases/url/key/4
[6] http://sunset.usc.edu/csse/research/COCOMOII/cocomo_main.html
[7] http://www.amazon.com/Mythical-Month-Essays-Software-Engineering/dp/0201835959
The coefficients in this function vary according to the project type thus:
Software project   a     b      c     d
Organic            2.4   1.05   2.5   0.38
Semi-detached      3.0   1.12   2.5   0.35
Embedded           3.6   1.20   2.5   0.32
(source: Software Cost Estimation With Cocomo II)
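The regression functions and coefficients above can be sketched in a few lines of Python. This is a minimal illustration rather than the tooling used for this paper; the function name `basic_cocomo` and the 100 KLOC input are assumptions chosen purely for demonstration.

```python
# Basic COCOMO coefficients (a, b, c, d) as listed in the table above.
COEFFICIENTS = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kloc, project_type="organic"):
    """Return (effort in man-months, development time in months, people)."""
    a, b, c, d = COEFFICIENTS[project_type]
    effort = a * kloc ** b          # Effort Applied = a * KLOC^b
    dev_time = c * effort ** d      # Development Time = c * Effort^d
    people = effort / dev_time      # People required = Effort / Time
    return effort, dev_time, people


if __name__ == "__main__":
    # Hypothetical 100 KLOC component, not a figure taken from this paper.
    for project_type in COEFFICIENTS:
        effort, dev_time, people = basic_cocomo(100, project_type)
        print(f"{project_type:14s} effort={effort:8.1f} man-months  "
              f"time={dev_time:5.1f} months  people={people:5.1f}")
```

Because the exponent b is greater than 1 in every project type, estimated effort grows faster than code size, and the embedded coefficients yield the highest effort for a given KLOC figure.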
More detailed information on the COCOMO coefficients is available elsewhere [8]. For our purposes, COCOMO data
can be viewed as a recognized and respectable starting point to begin an empirical examination of the potential
benefits that open source offers for mobile platform providers in terms of the cost of software acquisition, access
to innovation and cost of software ownership.
2.2 The application of COCOMO to open source software
The applicability of COCOMO models to open source software was introduced in an influential and well-regarded
economic analysis, “Why Open Source? Look At the Numbers!” written by D. Wheeler in 2002 [9] (and updated
regularly since), which remains a widely cited [10] paper in relation to the economics
of Linux. The Linux Foundation commissioned some research [11] in Oct 2008
updating Wheeler’s work. For the first calculation, they used the basic (i.e. “organic
project”) COCOMO model applied to Fedora 9. Their choice of annualized salary
figure was justified as follows:
“To calculate the costs for these distributions, a base salary was found for computer programmers
from the US Bureau of Labor Statistics. According to the BLS, the average salary for a US programmer
in July, 2008 was $75,662.08. This was the salary amount used in our SLOC Count run … the
programmer making the average US salary figure of $75,662.08 is actually costing the employer
$97,604.08 in compensation alone. This is just one piece of the total wrap pie.”
We used a loaded cost of $75,000 per engineer per
annum – the same figure used by the Linux Foundation when they updated Wheeler’s work.
[8] http://www.amazon.com/Software-Cost-Estimation-Cocomo-II/dp/0130266922
[9] http://www.dwheeler.com/oss_fs_why.html
[10] For example: http://abstract.cs.washington.edu/wiki/index.php/Open_Source_and_Search
[11] http://www.linuxfoundation.org/publications/estimatinglinux.php
Combining these factors and applying them to the Fedora 9 source base, the research calculated an equivalent
development cost of $10.78 billion for 204.5 million source lines of code (or SLOC) or in other words, $52/SLOC for
its development up to the current state. Table 1 shows the COCOMO figures taken from this paper and how they
were arrived at by using the coefficients for an organic project.
Total Physical Source Lines of Code (SLOC)                       204,500,946
Development Effort Estimate, Person-Years (Person-Months)        59389.53 (712674.36)
  (Basic COCOMO model, Person-Months = 2.4 * (KSLOC**1.05))
Schedule Estimate, Years (Months)                                24.64 (295.68)
  (Basic COCOMO model, Months = 2.5 * (person-months**0.38))
Total Estimated Cost to Develop                                  $10,784,484,309
  (average salary = $75,662.08/year, overhead = 2.40)
Table 1: SLOC and estimated production values for Fedora 9 (source: Linux Foundation)
For the Fedora 9 Linux kernel itself, the paper acknowledges that the “organic project” COCOMO model is not
appropriate since:
“the Linux kernel code is typically more complex than an “average” application—among other
things—it requires an analysis that goes beyond the basic COCOMO model. A user space application
like Mozilla, for instance, is much easier to code line by line since it’s abstracted at a much higher
level and has to handle far less tasks. A modern and enterprise-class operating system kernel is
asked to do a great number of extremely complex things, all at once.”
The paper goes on to indicate that an adjusted version of the organic project model is used, which takes the
exponent value from the semi-detached project model instead. The result of this is an upwards revision of the
equivalent cost of development of the 2.6.25 Linux kernel to $1.37 billion for 6.772 million SLOC, or $202/SLOC
for its development up to the current state. Table 2 shows the corresponding figures from the paper, which
detail the use of adjusted COCOMO coefficients:
Total Physical Source Lines of Code (SLOC)                       6,772,902
Development Effort Estimate, Person-Years (Person-Months)        7557.4 (90688.77)
  (effort model Person-Months = 4.64607 * (KSLOC**1.12))
Schedule Estimate, Years (Months)                                15.95 (191.34)
  (Basic COCOMO model, Months = 2.5 * (person-months**0.38))
Estimated Average Number of Developers (Effort/Schedule)         473.96
Total Estimated Cost to Develop                                  $1,372,340,206
  (average salary = $75,662.08/year, overhead = 2.40)
Table 2: SLOC and estimated production values for Linux 2.6.25 kernel (source: Linux Foundation)
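As a rough cross-check, the Table 2 figures can be recomputed from the formulas quoted in the table itself. The sketch below assumes that the total cost is person-years multiplied by the average salary and the overhead factor, which is consistent with the published totals; the function names are illustrative and not part of any Linux Foundation tooling. The Fedora 9 effort figure is taken as given from Table 1 rather than recomputed.

```python
SALARY = 75_662.08   # average annual salary quoted by the Linux Foundation
OVERHEAD = 2.40      # overhead multiplier stated in Tables 1 and 2


def estimated_cost(person_months, salary=SALARY, overhead=OVERHEAD):
    """Assumed cost model: person-years x annual salary x overhead."""
    return person_months / 12 * salary * overhead


def kernel_estimate(sloc):
    """Adjusted model from Table 2: effort = 4.64607 * KSLOC**1.12."""
    effort = 4.64607 * (sloc / 1000) ** 1.12   # person-months
    schedule = 2.5 * effort ** 0.38            # months
    developers = effort / schedule             # average head count
    return effort, schedule, developers


if __name__ == "__main__":
    effort, schedule, developers = kernel_estimate(6_772_902)
    print(f"kernel effort     {effort:,.0f} person-months")
    print(f"kernel schedule   {schedule:,.1f} months")
    print(f"kernel developers {developers:,.0f}")
    print(f"kernel cost       ${estimated_cost(effort):,.0f}")
    # Table 1 effort (person-months) used as published, not recomputed.
    print(f"Fedora 9 cost     ${estimated_cost(712_674.36):,.0f}")
```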
Another way of arriving at a cost per SLOC figure would be to consider a similar mobile platform development
initiative such as that of Symbian OS. In very rough terms using publicly available data, approximately 1000 [12] staff amortized
over some 13 years from the Psion EPOC days built what is now the modern Symbian OS. The result is of the order of
20 million lines of source according to the Symbian Foundation [13]. At an average loaded cost
of $100,000 per resource per year, this equates to a development cost of $1,300 million, which yields
an equivalent figure of $65/SLOC.
Interpolating between the COCOMO figures derived from the Linux Foundation and
our further estimates, but with a slight bias towards the lower one as we are focusing
on acquisition of primarily middleware/user-level code (albeit low-level/commodity)
rather than kernel software, we arrive at an initial cost/SLOC factor of around $50/SLOC for our calculations in
this paper. We believe that this figure can be reasonably applied to other mainstream open source projects
of relevance to a mobile context in order to conduct a first order estimate of the cost of acquisition of their
corresponding components. We do so once we have addressed, in the next section, the issue of how to generate accurate
information about component SLOC.
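The per-SLOC data points discussed in this section can be recomputed with simple division. The sketch below is illustrative only: the totals come from Tables 1 and 2 and from the rough Symbian OS estimate above, while the dictionary and function names are assumptions. The choice of $50/SLOC remains a judgement biased towards the lower data points and is not derived by the code.

```python
def cost_per_sloc(total_cost, sloc):
    """Equivalent development cost in USD per source line of code."""
    return total_cost / sloc


# (total cost in USD, SLOC) for each data point discussed in section 2.2.
DATA_POINTS = {
    "Fedora 9 (Linux Foundation)": (10_784_484_309, 204_500_946),
    "Linux 2.6.25 kernel (Linux Foundation)": (1_372_340_206, 6_772_902),
    # Rough estimate: ~1000 staff x 13 years x $100,000, ~20 million SLOC.
    "Symbian OS (rough estimate)": (1000 * 13 * 100_000, 20_000_000),
}

if __name__ == "__main__":
    for name, (total_cost, sloc) in DATA_POINTS.items():
        print(f"{name:40s} ${cost_per_sloc(total_cost, sloc):7.2f}/SLOC")
```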
2.3 ohloh.net open source code analytics web service
The www.ohloh.net service was launched in 2007 with the specific aim of providing accurate and detailed software
metrics on existing open source projects derived from data mining the corresponding open source code bases.
In particular, ohloh yields extensive information about the evolution of corresponding
SLOC over the duration of a project’s lifetime. It is possible to do this with open source
projects because this information is available in the corresponding version control system
logs. The ohloh service has compiled metadata on more than 300,000 major open
source projects including (among many others) GTK, GStreamer, WebKit and Android.
It uses a sophisticated source code parsing engine called ohcount [14] for processing the
corresponding source code available in a public repository; svn, cvs and git version control
systems are all supported. The ohloh data is available through a comprehensive, well-documented
and free to use web service API [15] once a visitor signs up for a developer key. Ohloh was recently acquired
by SourceForge [16].
All ohloh code metrics are accessible through a RESTful [17] web service API which returns data as XML. In order to
support our research for this paper, we developed and have open sourced a command line driven Python-based
tool [18] which is able to reap a variety of information about a particular open source project through the ohloh
web service API, parse that information and format and present it in an auto-generated Excel spreadsheet. The
results retrieved from this tool form the core of the analytical data in this paper.
[12] http://www.gillamorstephens.com/content/en/item_details_core.aspx?guid=AD2DB7B8-FE01-4DF8-A75C-492163FE94FD
[13] http://blog.symbian.org/2009/07/28/oscon-impressions/
[14] http://labs.ohloh.net/ohcount
[15] https://www.ohloh.net/api/getting_started
[16] https://www.ohloh.net/announcements/sourceforge_acquires_ohloh
[17] http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm
The remainder of this section examines the code analytics for four important open source projects which are
used within a GNOME based mobile Linux platform such as the LiMo Foundation Platform.
2.4 GTK analysis
GTK (the GIMP Toolkit) is the core application framework used in the LiMo Foundation Platform. It is a mature
project which forms the basis of the GNOME Linux Desktop UI and has had over 700 contributors working on it
over more than a decade. Using the pyohloh script, a graph illustrating the evolution of GTK over the past nine
years to the present day can be generated as shown in Figure 1.
Figure 1: Graph of GTK code history over time (source: www.ohloh.net)
Currently, GTK comprises some 600,000 SLOC. Using our $50/SLOC factor, this equates to an equivalent
engineering cost of $30 million to develop this technology from scratch.
Note the smooth gradient of this graph over the last decade. This is a clear characteristic
of community-grown source code. It evolves gradually in line with a spiral [19] or iterative
development approach. The dips and bumps noticeable on closer examination of the graph
are not analysed in this paper but they would typically reflect refactoring activity [20].
[18] http://sourceforge.net/projects/pyohloh/
[19] http://www.computer.org/portal/cms_docs_computer/computer/homepage/misc/Boehm/r5061.pdf
[20] The history of the Telepathy open source project is a good example of this.
2.5 WebKit analysis
WebKit is an open source web rendering engine used within the LiMo Foundation Platform. It is a mature project
with 128 contributors working on it over several years. The project was kick-started in mid-2005 by an injection
of code from Apple (who in turn had bootstrapped from the Konqueror KDE Desktop Browser) and has, since
then, evolved through various versions of the Mac OS X Safari browser and other projects such as Google’s
Chrome. Using the pyohloh script, we were able to generate a graph illustrating the evolution of WebKit’s code
base over its lifetime. This graph is displayed in Figure 2.
Figure 2: Graph of WebKit code history over time (source: www.ohloh.net)
Currently, WebKit comprises some 1.78 million SLOC. Using our $50/SLOC factor, this equates
to an equivalent engineering cost of $89 million to develop this technology from scratch.
2.6 GStreamer analysis
GStreamer is a media framework for delivering video and audio and is used in the LiMo Foundation Platform. It
is a mature project with over 420 contributors working on it since 2002. The code base has shipped in various
mobile embedded devices including those based on Nokia’s Maemo platform. Using the pyohloh script, we were
able to generate a graph illustrating the evolution of the GStreamer code base over the course of its existence.
This graph is displayed in Figure 3.
Figure 3: Graph of GStreamer code history over time (source: www.ohloh.net)
Currently, GStreamer comprises some 911,000 SLOC. Using our $50/SLOC factor, this equates
to an equivalent engineering cost of $45.5 million to develop this technology from scratch.
2.7 BlueZ analysis
BlueZ is the standard Linux Bluetooth stack which is used as the base Bluetooth stack in the LiMo Foundation
Platform. It is a mature project with 49 contributors working on it since 2002. The code base has shipped in
various mobile embedded devices including those based on Nokia’s Maemo platform. Using the pyohloh
script, we were able to generate a graph illustrating the evolution of the BlueZ code base over the course of its
existence. This graph is displayed in Figure 4.
Figure 4: Graph of BlueZ code history over time (source: www.ohloh.net)
Currently, BlueZ comprises some 105,000 SLOC. Using our $50/SLOC factor, this equates to an equivalent
engineering cost of $5.25 million to develop this technology from scratch.
2.8 Acquisition benefits for a mobile platform provider
As previously indicated, the four open source components analysed in this section (GTK, WebKit, GStreamer and
BlueZ) are used within the LiMo Platform. Using the figures calculated above, the combined cost of engineering
from scratch the functionality implemented by these four components alone comes close to $170 million. Note that this
figure does not include the cost of implementing dependencies.
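The combined figure can be reproduced by applying the $50/SLOC factor to the approximate SLOC counts quoted in sections 2.4 to 2.7. The following is a minimal sketch; the names `COMPONENT_SLOC`, `acquisition_cost` and `total_acquisition_cost` are illustrative and do not come from the pyohloh tool.

```python
COST_PER_SLOC = 50  # USD per SLOC, the indicative factor from section 2.2

# Approximate SLOC counts quoted in sections 2.4 to 2.7.
COMPONENT_SLOC = {
    "GTK": 600_000,
    "WebKit": 1_780_000,
    "GStreamer": 911_000,
    "BlueZ": 105_000,
}


def acquisition_cost(sloc, cost_per_sloc=COST_PER_SLOC):
    """Equivalent engineering cost in USD of writing `sloc` lines from scratch."""
    return sloc * cost_per_sloc


def total_acquisition_cost(components):
    """Sum of the equivalent engineering cost over all listed components."""
    return sum(acquisition_cost(sloc) for sloc in components.values())


if __name__ == "__main__":
    for name, sloc in COMPONENT_SLOC.items():
        print(f"{name:10s} {sloc:>9,} SLOC  ${acquisition_cost(sloc) / 1e6:6.2f}M")
    print(f"{'Total':10s} {'':>14}  ${total_acquisition_cost(COMPONENT_SLOC) / 1e6:6.2f}M")
```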
3. Adopting open source to enable access to software innovation
The total number of open source projects being undertaken globally at present is huge [21]. However, relatively
few from this vast sea of potential will be both: a) active beyond a single developer and b) of direct interest to
mobile device manufacturers today. Nonetheless, it is important to consider this backdrop
as a source of real innovation because what may appear to be an unimportant project today
may become of great significance in relation to future mobile technology in a relatively
short period of time. A good example is WebKit - it has become the de facto standard web
rendering engine on mobile devices within a few years of its inception. Rather than rejecting promising projects
for being incomplete, significant cost savings may be possible by starting from the corresponding source base
rather than beginning from scratch:
“The companies and individuals, who work on Linux-related projects, build this value profit by
sharing the development burden with their peers (and sometimes competitors.) Increasingly it’s
becoming clear that shouldering this research and development burden individually, as Microsoft
has done, is an expensive approach to building software.” [22]
There are numerous other examples that have evolved to become very important in a mobile context, from
individual components (e.g. BlueZ, OpenObex, D-Bus, Telepathy/Farsight) through to entire open source
platforms (e.g. Android, Maemo).
In this section we will examine the following three projects in greater detail:
• Clutter – open source, advanced UI framework being driven by Intel as a core part of their Moblin platform
• oFono – open source telephony framework being driven by Nokia and Intel
• GeoClue – open source location framework endorsed by GNOME Mobile
These projects have been chosen as purely indicative examples of innovative work that has the potential to be
included as standard components in future mobile Linux devices. All these selected projects address areas of
technology that are either below the mobile commodity line or are in the process of falling
below it. Our analysis will focus on the development momentum behind these projects and
the potential saving to be gained from using the corresponding source code as a starting
point for further development. It is also worth noting that engaging constructively with a
major field of innovation may result in far greater commercial return than the raw offset in engineering cost.
On the other hand, engineering cost is only one consideration in a decision of this nature; cost of technology
evaluation, selection and engineering learning curve are also factors which we do not take into account here.
[21] ohloh alone indexes more than 300,000 projects
[22] http://www.linuxfoundation.org/publications/estimatinglinux.php
3.1 Clutter analysis
Clutter is an open source library for creating fast, visually rich and animated user interfaces. It forms the basis
of the advanced UI framework in Intel’s Moblin mobile Linux platform. It is a mature project that was started at
the leading UK-based open source development house OpenedHand [23], which has since been acquired by Intel [24].
Various blog posts by ex-OpenedHand staff suggest that significant development is being done around Clutter
within Intel. Using the pyohloh script, we were able to generate a graph illustrating the evolution of Clutter’s
code base over the course of its existence. This graph is displayed in Figure 5.
Figure 5: Graph of Clutter code history over time (source: www.ohloh.net)
The gradient of this graph indicates a project with significant development velocity
(~35 kSLOC/year), which suggests that it has not been materially affected by the Intel acquisition. This
rate of development constitutes a substantial capital investment on the part of Intel and
Clutter is clearly a project to keep an eye on.
Currently, Clutter comprises some 86,600 SLOC. Using our $50/SLOC factor, this equates to an equivalent
engineering cost of $4.33 million to develop this technology from scratch.
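To put the development velocity in the same terms as the cost-to-date figure, the $50/SLOC factor can also be applied to the approximate annual growth. The sketch below is a first-order illustration only; the annual investment it prints is derived here from the ~35 kSLOC/year gradient and is not a figure stated elsewhere in this paper, and the function names are assumptions.

```python
COST_PER_SLOC = 50          # USD per SLOC, the factor from section 2.2
CLUTTER_SLOC = 86_600       # current code size quoted above
CLUTTER_VELOCITY = 35_000   # approximate SLOC/year read from the Figure 5 gradient


def invested_to_date(sloc, cost_per_sloc=COST_PER_SLOC):
    """Equivalent engineering cost in USD of the code base as it stands."""
    return sloc * cost_per_sloc


def annual_investment(sloc_per_year, cost_per_sloc=COST_PER_SLOC):
    """Equivalent engineering cost in USD of one year of growth (derived)."""
    return sloc_per_year * cost_per_sloc


if __name__ == "__main__":
    print(f"Invested to date:  ${invested_to_date(CLUTTER_SLOC) / 1e6:.2f}M")
    print(f"Implied per year:  ${annual_investment(CLUTTER_VELOCITY) / 1e6:.2f}M")
```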
[23] http://www.o-hand.com
[24] http://www.linuxtoday.com/developer/2008082802735NWHWSW
3.2 oFono analysis
The oFono open source project was recently unveiled [25] as a joint collaboration between Intel and Nokia and
has generated significant interest in the mobile industry. The project aims to build a world class open source
telephony stack for mobile Linux devices to be used in Intel’s Moblin platform as well as Nokia’s Maemo platform.
Using the pyohloh script, we were able to generate a graph illustrating the evolution of oFono’s code base over
the course of its existence. This graph is displayed in Figure 6.
Figure 6: Graph of oFono code history over time (source: www.ohloh.net)
The profile of the contribution curve indicates that this project was kick-started by a flurry
of coding and possibly a code contribution. Since its inception, activity has returned to a
more characteristic open source development gradient. One other noteworthy point is that
by examining the contributor data output by our script, we were able to confirm that a key
contributor is Marcel Holtmann, who is also a lead committer to BlueZ. Information relating to top committers is
highlighted in Table 3. Note that we have not refined our tool to examine commit sizes.
Contributor ID     Account Name          Contributor Name      Man months   Commits
1457041885407693   Denkenz               Denis Kenzior         3            176
1457041885371924   Marcel Holtmann       Marcel Holtmann       4            30
1457044032859329   ?                     Andrzej Zaborowski    2            20
1457041885368986   Rémi Denis-Courmont   Rémi Denis-Courmont   2            14
1457041885412800   Akiniemi              Aki Niemi             1            10
Table 3: oFono top contributors by commit (source: pyohloh)
Currently, the oFono project comprises 21,912 SLOC. Using our $50/SLOC factor, this corresponds to an
equivalent engineering cost of $1.1 million to develop this technology from scratch. Clearly, in spite of the
impressive commitment, oFono is at a very early stage at present judging by the evolution of the code base to
date and the small number of continuously active committers.
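The concentration of activity among a few committers can be illustrated from Table 3. The sketch below computes each contributor's share of the commits listed in that table; these shares are relative to the five listed contributors only, not to the whole project, and, as noted above, commit counts say nothing about commit sizes. The function name is illustrative.

```python
# (contributor name, man-months, commits) transcribed from Table 3.
TOP_CONTRIBUTORS = [
    ("Denis Kenzior", 3, 176),
    ("Marcel Holtmann", 4, 30),
    ("Andrzej Zaborowski", 2, 20),
    ("Rémi Denis-Courmont", 2, 14),
    ("Aki Niemi", 1, 10),
]


def commit_shares(contributors):
    """Map each name to its share of the commits among the listed contributors."""
    total = sum(commits for _, _, commits in contributors)
    return {name: commits / total for name, _, commits in contributors}


if __name__ == "__main__":
    for name, share in commit_shares(TOP_CONTRIBUTORS).items():
        print(f"{name:22s} {share:6.1%} of listed commits")
```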
[25] http://www.unwiredview.com/2009/05/12/oFono-nokia-intel-start-a-new-linux-project-against-android/
3.3 GeoClue analysis
The GeoClue open source project delivers a geographic information service via D-Bus to client side applications.
The backend information can potentially come from a number of geo-information sources (e.g. GPS or
geoIP address). The project has been used to build utilities such as the Clutter libchamplain [26] library and is a
technology earmarked for future inclusion in the GNOME Mobile stack. Using the pyohloh script, we were able to
generate a graph illustrating the evolution of GeoClue’s code base over the course of its existence. This graph is
displayed in Figure 7.
Figure 7: Graph of GeoClue code history over time (source: www.ohloh.net)
Note that from examination of other active open source projects, a plateau in terms of code activity is typically
an indication of a stalled development rather than a sign that the project is finished. It turns
out from looking at the contributor data that there is only one major developer, who does
not appear to be very active. This was a surprise given that GeoClue is a relatively high
profile GNOME project. Nonetheless, it is valuable to learn this information.
Currently, the GeoClue project comprises 12,338 SLOC. Using our $50/SLOC factor, this equates to an equivalent
cost of $0.62 million to develop this technology from scratch.
[26] http://projects.gnome.org/libchamplain/
4. Adopting open source to reduce cost of software ownership
Peer-reviewed literature exists [27] to support the claim that maintenance costs dominate software total cost of
ownership (TCO) but our aim is to support this claim by looking at commit data derived from actual open source
projects. In this section, we will continue the forensic analysis of the code base of two of the same open source
projects we examined in section 2 using the output of our pyohloh script to obtain further information about
the number of developers working on these projects, their commits over time, the proportion of changes that
constitute maintenance and the corresponding proportion that could be considered as original development.
[27] http://users.jyu.fi/~koskinen/smcosts.htm
4.1 GTK analysis
In relation to the GTK code history graph highlighted in Figure 1, an important milestone of note was the release
of GTK v2.12 [28] in Sept 2007. Since that release, as Table 4 illustrates, GTK development has continued. For a
platform that forked GTK 2.12 and chose not to update it with upstream changes, this further GTK development
can be considered to constitute unleveraged potential. We can quantify the delta to yield an upper bound on
the value of those subsequent upstream contributions. Note that there is no easy way to differentiate between
maintenance and new features within unleveraged potential; both form part of the ‘forking tax’ the platform
provider incurs by ignoring upstream.
Month Code Comments Blanks Commits Man Months Delta Man Months
01-09-2007 502697 96897 110101 12140 2752 22
01-10-2007 503262 96956 110247 12195 2771 19
01-11-2007 504825 97593 110449 12290 2799 28
01-12-2007 543111 103835 118349 12435 2844 45
01-01-2008 543764 104006 118521 12520 2868 24
01-02-2008 544540 104081 118681 12623 2889 21
01-03-2008 532912 101912 116092 12788 2924 35
01-04-2008 533430 101959 116160 12833 2941 17
01-05-2008 535693 102461 116699 12991 2976 35
01-06-2008 529833 102411 115387 13359 3021 45
01-07-2008 540055 103289 117049 13505 3059 38
01-08-2008 541936 103932 117410 13701 3098 39
01-09-2008 543399 104243 117692 13824 3131 33
01-10-2008 544106 104206 117846 13916 3156 25
01-11-2008 545548 104931 118125 13978 3173 17
01-12-2008 549378 106453 118903 14165 3194 21
01-01-2009 553943 107572 120014 14433 3219 25
01-02-2009 553931 107685 120011 14572 3245 26
01-03-2009 554353 107715 120101 14611 3262 17
01-04-2009 555396 107830 120303 14672 3286 24
01-05-2009 558435 108139 120894 14741 3313 27
01-06-2009 560721 108810 121376 14860 3341 28
01-07-2009 563135 109647 121930 14957 3361 20
Table 4: GTK month by month commit details since release 2.12 (source: pyohloh)
We can use the data in Table 4 to quantify this unleveraged potential in two ways. First we can look at the delta
code size and associate an engineering cost to it. Secondly we can look at the time spent in terms of delta man-
months.
[28] http://mail.gnome.org/archives/gtk-devel-list/2007-September/msg00052.html
In terms of delta code size, GTK’s code size has increased from 502,697 SLOC in Sept 2007 to 563,135 SLOC in July 2009,
a delta of 60,438 SLOC. Using our $50/SLOC factor, this equates to an engineering cost of $3.02 million to develop this technology
independently. This figure is likely to be on the low side because GTK was already substantially advanced in Sept
2007, so any work to enhance/modify it would be complex by nature.
In terms of delta man-months, it is worth noting that the figures in the delta man-months column remain
broadly constant, highlighting the ongoing maintenance burden. The overall man-months spent on the project between
GTK 2.12 and the present went from 2752 to 3361, that is 609 man-months. Using the earlier
figure of $75,000 per developer per year, this equates to $3.8 million unadjusted and $9.12
million using the 2.4 overhead factor applied in the Linux Foundation’s COCOMO calculations. Averaging the SLOC-based
figure ($3.02 million) and the overhead-adjusted man-month figure ($9.12 million) gives
us a conservative estimate of $6 million of unleveraged potential between GTK 2.12
and GTK candidate 2.18. This finally puts a figure on the price of forking GTK from 2.12 and not synchronising/
engaging with upstream development from that point. If a decision to synchronise is made later, there will be an
additional re-engineering cost to make this happen.
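The two ways of costing unleveraged potential can be sketched as a single function. This is an illustration under stated assumptions, not the pyohloh implementation: the function name is hypothetical, and the "average" is assumed to combine the SLOC-based figure with the overhead-adjusted man-month figure, since that pairing reproduces the roughly $6 million estimate for GTK. Small differences from the figures in the text arise from rounding of intermediate values.

```python
COST_PER_SLOC = 50      # USD per SLOC, the factor from section 2.2
ANNUAL_COST = 75_000    # USD per developer per year
OVERHEAD = 2.4          # overhead factor used in the Linux Foundation tables


def unleveraged_potential(sloc_start, sloc_end, mm_start, mm_end):
    """Cost the delta between two snapshots by code size and by man-months."""
    sloc_cost = (sloc_end - sloc_start) * COST_PER_SLOC
    mm_cost = (mm_end - mm_start) / 12 * ANNUAL_COST
    mm_cost_with_overhead = mm_cost * OVERHEAD
    return {
        "sloc_based": sloc_cost,
        "man_months_unadjusted": mm_cost,
        "man_months_with_overhead": mm_cost_with_overhead,
        # Assumed pairing: SLOC-based and overhead-adjusted figures.
        "average": (sloc_cost + mm_cost_with_overhead) / 2,
    }


if __name__ == "__main__":
    # First and last rows of Table 4 (Sept 2007 and July 2009).
    gtk = unleveraged_potential(502_697, 563_135, 2752, 3361)
    for label, value in gtk.items():
        print(f"{label:26s} ${value / 1e6:5.2f}M")
```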
If we were to move a year along the development curve, we reach GTK 2.14 released in Sept 2008 [29]. Using the
same approach as above, we go from 3131 to 3361 = 230 man-months of unleveraged potential since that
point to the current version of GTK at the time of writing. This corresponds to over a third of the full unleveraged
potential between Sept 2007 and now, or $2.3 million.
As a final data point, we can look at the corresponding developer activity graph in Figure 8 which gives us a relatively
constant maintenance load of around 25 resources per year committing to the GTK mainline over the last couple of
years (equating to approx $1.8 million/year). This graph was generated from the delta man-months column in Table 4.
Figure 8: Graph of GTK developer activity over last two years (source: pyohloh)
[29] http://mail.gnome.org/archives/gtk-devel-list/2008-September/msg00024.html
4.2 WebKit analysis
The WebKit code history graph in Figure 2 clearly shows the point in mid-2005 when Apple
announced the open sourcing of WebKit. Since then, the graph has shown the characteristic upward curve
of a healthy open source project whose code is continuously evolved, enhanced and extended.
WebKit is unusual among open source projects in that it doesn’t operate fixed releases at all but is available as
a continuously moving svn codeline. Nonetheless, suppose we take a fork at Nov 2007, when HTML5 Media
support was added [30]. As Table 5 illustrates, a manufacturer building on the Nov 2007 base
and not updating with subsequent changes potentially missed out on 1,786,845 - 1,016,544 =
770,301 delta SLOC worth of unleveraged potential. Using our $50/SLOC factor, this equates
to an equivalent cost of $38.5 million to develop this technology from scratch on top of
the WebKit v1.0 source base. As with our GTK analysis above, we can look at the delta man-months
too: Table 5 shows 7394 - 4109 = 3285 man months of presumed beneficial
evolution of the source base. Using the earlier figure of $75000 per developer per year, this equates to $20.5
million unadjusted and $49.3 million using the COCOMO 2.4 overhead factor. Averaging the SLOC-based
result ($38.5 million) and the overhead-adjusted man-month result ($49.3 million) gives us a conservative
estimate of $44 million of unleveraged potential between WebKit at Nov 2007
and the current WebKit head revision.
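The arithmetic in this paragraph can be reproduced with a short Python sketch. The inputs are the Nov 2007 and July 2009 rows of Table 5 and the cost factors quoted in the text ($50/SLOC, $75000 per developer per year, overhead factor 2.4). The function and field names are illustrative assumptions and are not part of pyohloh.

```python
# Cost factors quoted in the text.
COST_PER_SLOC_USD = 50
COST_PER_DEVELOPER_YEAR_USD = 75_000
COCOMO_OVERHEAD_FACTOR = 2.4


def unleveraged_potential(sloc_then, sloc_now, mm_then, mm_now):
    """Estimate the value of upstream work missed by a fork (hypothetical helper)."""
    delta_sloc = sloc_now - sloc_then
    delta_man_months = mm_now - mm_then

    # SLOC-based estimate: cost to rewrite the added code from scratch.
    sloc_cost = delta_sloc * COST_PER_SLOC_USD
    # Effort-based estimate: man months converted to developer-years.
    effort_cost = delta_man_months / 12 * COST_PER_DEVELOPER_YEAR_USD
    effort_cost_adjusted = effort_cost * COCOMO_OVERHEAD_FACTOR

    return {
        "delta_sloc": delta_sloc,
        "delta_man_months": delta_man_months,
        "sloc_cost": sloc_cost,
        "effort_cost": effort_cost,
        "effort_cost_adjusted": effort_cost_adjusted,
        # Midpoint of the SLOC-based and overhead-adjusted estimates.
        "blended_estimate": (sloc_cost + effort_cost_adjusted) / 2,
    }


# Table 5: 01-11-2007 and 01-07-2009 rows (Code and Man Months columns).
webkit = unleveraged_potential(1016544, 1786845, 4109, 7394)
for name, value in webkit.items():
    print(f"{name}: {value:,.0f}")
```

With the Table 5 inputs the blended estimate comes to roughly $43.9 million, which the text rounds to $44 million.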
[30] http://webkit.org/blog/140/html5-media-support/
Unleveraged potential cost of $44M for WebKit within a
2 year period
Month       Code     Comments  Blanks  Commits  Man Months  Delta Man Months
01-11-2007  1016544  322867    254297  28684    4109        134
01-12-2007  1044098  328069    259268  29508    4252        143
01-01-2008  1071166  340421    264312  30372    4390        138
01-02-2008  1457449  386764    302370  31175    4530        140
01-03-2008  1483810  399789    306929  32074    4667        137
01-04-2008  1506161  404098    311594  33040    4806        139
01-05-2008  1527013  407883    316609  33924    4949        143
01-06-2008  1534969  408878    317762  34698    5091        142
01-07-2008  1551059  412792    320853  35410    5235        144
01-08-2008  1566371  414854    324016  36124    5367        132
01-09-2008  1595954  420546    329897  37325    5528        161
01-10-2008  1607483  422141    332694  38364    5696        168
01-11-2008  1623679  426169    336570  39241    5857        161
01-12-2008  1642825  428862    339465  40070    6007        150
01-01-2009  1678058  440574    348052  41194    6200        193
01-02-2009  1691983  444456    351355  42064    6380        180
01-03-2009  1709246  448433    354963  43086    6573        193
01-04-2009  1731266  452352    359429  44230    6784        211
01-05-2009  1747419  455081    362749  45309    6988        204
01-06-2009  1816948  460252    371543  46488    7216        228
01-07-2009  1786845  469754    378470  47106    7394        178
Table 5: WebKit month by month commit details Nov 2007-July 2009 (source: pyohloh)
The unleveraged potential figures for WebKit are so large that an OEM wanting to include WebKit in its product
needs a maintenance strategy in place up front. The corresponding developer activity graph in Figure 9 makes
this visible, showing an amortized maintenance
load of approximately 200 engineers per year (equating to approximately $15 million/year). This graph
was generated from the delta man-months column in Table 5. Note that the curve is on an upward gradient,
demonstrating that WebKit is gaining developer traction.
Figure 9: Graph of WebKit developer activity over last two years (source: pyohloh)
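One way to relate the Table 5 data to the maintenance load read off Figure 9 is sketched below in Python. It assumes that one man-month delivered per calendar month corresponds to one full-time engineer, and it reuses the $75000 per developer per year figure. Both the helper name and that reading of the data are assumptions of this sketch, not a description of how the graph was produced.

```python
COST_PER_DEVELOPER_YEAR_USD = 75_000

# Delta Man Months column of Table 5, Nov 2007 to July 2009.
DELTA_MAN_MONTHS = [
    134, 143, 138, 140, 137, 139, 143, 142, 144, 132, 161,
    168, 161, 150, 193, 180, 193, 211, 204, 228, 178,
]


def maintenance_load(deltas):
    """Return (full-time engineer equivalents, annual cost in USD).

    Hypothetical helper: the mean monthly delta is treated as the
    number of engineers working full time on the code base.
    """
    engineers = sum(deltas) / len(deltas)
    return engineers, engineers * COST_PER_DEVELOPER_YEAR_USD


whole_period = maintenance_load(DELTA_MAN_MONTHS)
last_six_months = maintenance_load(DELTA_MAN_MONTHS[-6:])

print(f"whole period:    {whole_period[0]:.0f} engineers, ${whole_period[1]:,.0f}/year")
print(f"last six months: {last_six_months[0]:.0f} engineers, ${last_six_months[1]:,.0f}/year")
```

Averaged over the whole table the load is about 163 engineers (about $12.2 million/year). Over the last six months it is about 199 engineers (about $14.9 million/year), which is in line with the approximate figures quoted above and with the upward gradient of the curve.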
4.3 Maintenance of open source software and community engagement
Using the figures we have uncovered, it is possible to make some quantitatively-backed statements regarding
open source software cost of ownership and the related economic benefits of engaging with corresponding
open source projects and communities. We can now support the following assertions:
• Healthy open source projects have a characteristic progressive cost profile in relation to maintenance – in a
  sense, they’re never finished but continue evolving ‘upstream’.
• The cost of forking and losing connection with upstream development is twofold: i) the
  corresponding cost of presumed beneficial unleveraged potential, ii) the further cost of
  having to re-engineer modified forked code in the future to accommodate the inevitable
  eventual re-sync with upstream. We quantified the former to show that the figures run into
  millions of dollars for important components such as GTK, WebKit, GStreamer and BlueZ.
Emergency “deforking” also
incurs cost!
• Accommodating upstream development within the context of an open mobile platform is a key way to reduce
  the cost of unleveraged potential.
• It is important that mobile industry platform providers engage with the open source communities as early
  as possible so that platform maintenance strategy is fully aligned with the upstream development agenda of
  these communities, which is far more cost efficient than managing the entire maintenance burden in-house.
In practical terms, a strategy of engagement is bilateral. It involves actively working patches back into
community source and trying to influence the direction of the project.
Nevertheless, we have to acknowledge the reluctance of some major mobile industry players to
depend on an unpredictable and intangible community for a key deliverable when a mission-critical commercial
release is at stake.
We also need to understand that the benefits of community engagement are not immediately visible or linear
– engagement for purposes of strategic alignment with product development is likely to achieve measurable
benefits only over the medium to longer term. It is more about investing in the relationship
to gain future value. In any case, it is not possible to divest entirely of the need for
engineering resource – some engineers will always be required to integrate, test and modify,
but engagement does offer a mechanism for maximal gain from the community through
maintenance and innovation beyond just initial acquisition. This gain is quantifiable and
bounded in upper terms by the unleveraged potential figures. Nokia seem to understand this and have made
some endorsements to this effect:
“The one who invests most has the biggest influence. If a company has a large group of developers,
it will create more and better proposals and those proposals will take the day.” [31]
Nokia’s open source site, http://opensource.nokia.com, is evidence of this in operational practice. There are some
24 showcase open source projects being sponsored and linked from that site including:
• S60 WebKit [32]: Port of WebKit to Nokia S60 platform
• PyS60 [33]: Python interpreter for Nokia’s S60 platform
• Mobile Web Server [34]: Nokia’s port of Apache web server to S60 platform
[31] http://www.mobilemonday.net/news/nokia-finds-value-in-open-source-community
[32] http://opensource.nokia.com/projects/S60browser/index.html
[33] http://opensource.nokia.com/projects/pythonfors60/index.html
[34] http://opensource.nokia.com/projects/mobile-web-server/index.html
TCO control requires upstream
engagement
4.4 Open source maintenance in a commercial mobile context
It is possible to use the ohloh web service to compare the maintenance of commercially driven open source
developments such as oFono, Android and WebKit to more community-sponsored projects such as GTK,
GStreamer and BlueZ. One characteristic difference between them is that commercially developed open source
is often seeded by the injection of large quantities of code into an open repository. This can be clearly seen by
examining Android’s code history shown in Figure 10. Note that the corresponding git repository for this code
history is git://android.git.kernel.org/platform/bionic.git which consists of some 3 million SLOC mainly injected
around two points during the past year.
Figure 10: Graph of Android code history over time (source: www.ohloh.net)
Maintenance beyond the point of injection typically continues to be undertaken mainly by the commercial
entity itself. This was clear from looking at the WebKit contributor details – nearly all the top 25 contributors have
Apple email addresses. It is also interesting to look at the list of Android contributors. The top 8 are highlighted
in Table 8 below. The top committer by far is “The Android Open Source Project” which is a vehicle for an internal
Google engineering team. By doing internet searches, we were able to determine that the remaining individuals
involved also appear to be Google employees and quite probably the key gatekeepers for Google-driven
commits to the project. [35]
[35] For example, Jean-Baptiste Queru can be confirmed as a Google employee here: http://www.linkedin.com/in/jbqueru
Raphael Moll likewise here: http://www.linkedin.com/pub/raphaël-moll/0/2b9/2ab
Xavier Ducrohet likewise here: http://www.linkedin.com/pub/xavier-ducrohet/0/265/4b7
and so on…
Contributor ID  Account Name  Contributor Name                 Man months  Commits
41650445438243  ?             The Android Open Source Project  5           323
41650445447227  ?             Jean-Baptiste Queru              7           56
41650445476011  ?             Raphael Moll                     2           36
41650445473806  ?             Xavier Ducrohet                  2           30
41650445473002  ?             Android Code Review              0           28
41650445476007  ?             Dianne Hackborn                  3           28
41650445476012  ?             Eric Fischer                     2           22
41650445476006  ?             Jorg Pleumann                    2           21
Table 8: Top 8 Android contributors (source: pyohloh)
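To put a number on "top committer by far", the Commits column of Table 8 can be turned into shares with a few lines of Python. This is a minimal sketch: the shares are relative to the eight listed contributors only, not to the whole project, and the helper name is an assumption.

```python
# Commits column of Table 8 (top 8 Android contributors).
COMMITS = {
    "The Android Open Source Project": 323,
    "Jean-Baptiste Queru": 56,
    "Raphael Moll": 36,
    "Xavier Ducrohet": 30,
    "Android Code Review": 28,
    "Dianne Hackborn": 28,
    "Eric Fischer": 22,
    "Jorg Pleumann": 21,
}


def commit_shares(commits):
    """Return each contributor's fraction of the listed commits (hypothetical helper)."""
    total = sum(commits.values())
    return {name: count / total for name, count in commits.items()}


shares = commit_shares(COMMITS)
top_name, top_share = max(shares.items(), key=lambda item: item[1])
print(f"{top_name}: {top_share:.1%} of the top-8 commits")
```

Among these eight contributors, the top committer accounts for roughly 59% of the commits.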
The data suggests what many in the open source world already know from experience, namely that it is not
easy to ‘dump’ commercially developed software as open source and expect to build a
community around it quickly – the process is likely to take a very long time and requires
significant efforts to align with the interests of external developers who often have different
motivations for getting involved. This should not be taken to mean that all such attempts are
doomed, merely that they face formidable challenges, as various commentators [36] have identified.
What the data we have examined suggests is that if one wishes to engage the community to assist in
maintenance, it is likely to be more effective in those cases where the corresponding components were
community-created in the first instance as with GTK, GStreamer and BlueZ and even then, only if roadmap
alignment can be achieved.
[36] http://mobileopportunity.blogspot.com/2009/06/symbian-evolving-toward-open.html
‘Dumping’ is inefficient.
5. Conclusions
• Where an open mobile platform is already using key open source projects of critical importance, there is
  direct economic value in constructive engagement with the corresponding open source communities. The
  rationale is that through such engagement the platform provider can reduce the cost of acquisition of future
  innovation as well as reduce the cost of maintenance of that software. The latter requires the platform provider
  to work collaboratively with the community to align upstream developments. Good candidates for projects
  that fall into this category are GTK, WebKit, GStreamer and BlueZ. Note that this does not mean a full reliance
  on the community, which may be untenable in the context of commercial predictability requirements, but a
  more blended approach. The practical details of how mobile platform providers and device manufacturers can
  effectively engage with existing open source communities and seek to minimize the cost of ownership of open
  source software will be the subject of a future LiMo Foundation White Paper.
• Where a technology lies below the commodity line and is already in a mobile platform in the form of a
  proprietary commercial implementation, open sourcing it is unlikely in the short term to build a significant
  community around that code outside of the organizations that built the software in the first place.
  Consequently, though it may be viewed as beneficial in terms of industry leadership and reputational value, it
  is not necessarily economically beneficial in the short term to open source the technology. The motivation to
  do so will be driven by non-economic factors such as a desire to see the technology more widely adopted or
  used. In the event that a proprietary technology is open sourced, it is essential that the platform provider has a
  practical community-building strategy to follow through on the act.
• Where a technology is falling below the commodity line and is not already present to some degree in a
  mobile platform, the platform provider should look to adopt relevant open source projects to reduce the cost
  of software acquisition and offer opportunities for further scale economies through strategic alignment with
  other open source based industry initiatives. Good candidates for projects of this type currently include the
  Clutter Advanced UI framework and Telepathy IM Communications framework.
• Where a technology lies above the commodity line, open source equivalents are of less strategic value to a
  platform provider. This is the area of competitive differentiation for OEMs and operators where their value
  propositions reside and where open source software tends in general not to offer a compelling technical
  alternative.