International Journal of Arts and Commerce Vol. 3 No. 8 October, 2014
A study of patent numbers forecasting by linear regression on cloud
storage technology
Liu, Kuotsan
Associate Professor
Graduate Institute of Patent
National Taiwan University of Science and Technology
Chen, Yingching
Graduate student of master’s degree
Graduate Institute of Patent
National Taiwan University of Science and Technology
Abstract
This paper presents a method of forecasting patent numbers by linear regression. A popular software
technology with a short lifecycle, link sharing on cloud storage, was selected to demonstrate the method and
its results on the main patentee diagram and the technology-function matrix. The results show that a linear
model based on the number of inventors has a high coefficient of determination. For a company's research and
development proposal, the number of patents to file can be readily determined from the forecast main patentee
diagram and the forecast technology-function matrix.
Keywords: patent map, patent analysis, patent forecasting, cloud storage.
1. Introduction
Patents are powerful tools for stopping competitors from entering the claimed scope, based on their
exclusive rights. It is normal in modern industry for a company to own a large number of patents. Famous
companies such as International Business Machines Corporation (IBM) and Microsoft Corporation (Microsoft)
each own more than 100 thousand patents. Accumulating a sufficient number of patents and occupying a high
rank among the main patentees in a specific technical field is important for gaining a large market share.
It is necessary to carry out patent analysis before a research and development (R&D) project to ensure that
the project is not blocked by competitors' patents. However, this work is difficult and complex because
databases hold millions of patents. According to statistics of the World Intellectual Property Organization,
there were 2.35 million patent publications worldwide in 2012, a 9.2% growth over 2011. Patent analysis itself
has become a professional and research field because of its complexity and difficulty.
In patent analysis, patent maps are useful tools for visualizing the distribution of patents, monitoring the
trend of technological change, inferring the strategy of patent portfolios, and comparing competitors through
statistical charts or diagrams. Patent maps give a macroscopic view of patents and help a company determine
the direction of R&D. For example, a main patentee diagram shows the main competitors and their patent
numbers, and a technology-function matrix shows the patent density over technical problems and solutions. All
patent maps show the state of patents on the date they are drawn, but an R&D objective is set for a couple of
years later. A forecast patent map is therefore more helpful for determining the budget and the intended
number of patent applications for an R&D proposal.
This paper focuses on forecasting patent numbers, and a linear regression model was employed for this
purpose. A popular software technology, link sharing on cloud storage, was selected to demonstrate the method
and its results on the main patentee diagram and the technology-function matrix.
Cloud storage has come into the spotlight in the IT industry. The huge amount of information keeps
challenging the load of computer systems, so companies put their data in cloud storage instead of keeping it
on hand, and file sharing links are necessary for them. The IT industry is special in its short lifecycle: a
new company may accumulate enough patents and become a main patentee in a short time. Patent numbers
forecasting is therefore even more important and useful in this technology.
2. Methodology and data
2.1 Patent pool by search queries
The technical topic employed for this study is link sharing in cloud storage. The following search queries
were run on the US patent publication database (search date: Dec. 26, 2013):
S1= (shar* adj3 (link* or URL or URI or hyperlink*)).DSC. 12,154 hits
S2=S1 not(vehicle* or GPS or sensor*).DSC. 8,975 hits
S3=S2 and (707* or 709*).UCM. 1,996 hits
S4=S3 not (adverti*).DSC. 1,423 hits
Here USPC (United States Patent Classification) class 707 is data processing: database, data mining, and file
management or data structures, and USPC class 709 is multicomputer data transferring. Both are the most
important classes in cloud storage technology.
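The narrowing from S1 to S4 can be illustrated with a small sketch. It is not the database's own query engine: it assumes that `.DSC.` searches the description text, that `.UCM.` searches the classification codes, and that `adj3` means "followed within three words". The records, the field names and the helper functions are hypothetical.

```python
import re

def tokens(text):
    """Lower-case word tokens of a description."""
    return re.findall(r"[a-z0-9]+", text.lower())

def term_matches(word, pattern):
    """A trailing '*' is a wildcard (prefix match); otherwise match the whole word."""
    if pattern.endswith("*"):
        return word.startswith(pattern[:-1])
    return word == pattern

def any_term(words, patterns):
    return any(term_matches(w, p) for w in words for p in patterns)

def shar_adj3_link(words):
    """Assumed reading of: shar* adj3 (link* or URL or URI or hyperlink*)."""
    targets = ("link*", "url", "uri", "hyperlink*")
    for i, word in enumerate(words):
        if term_matches(word, "shar*") and any_term(words[i + 1:i + 4], targets):
            return True
    return False

def funnel(records):
    """Apply the four queries in order and return the four result sets."""
    s1 = [r for r in records if shar_adj3_link(tokens(r["description"]))]
    s2 = [r for r in s1
          if not any_term(tokens(r["description"]), ("vehicle*", "gps", "sensor*"))]
    s3 = [r for r in s2 if any(c.startswith(("707", "709")) for c in r["uspc"])]
    s4 = [r for r in s3 if not any_term(tokens(r["description"]), ("adverti*",))]
    return s1, s2, s3, s4

# Hypothetical records, for illustration only.
records = [
    {"id": "A", "description": "Users share a link to a file in cloud storage.", "uspc": ["709/217"]},
    {"id": "B", "description": "A vehicle shares a URL with a GPS unit.", "uspc": ["709/203"]},
    {"id": "C", "description": "Sharing hyperlinks inside an advertisement.", "uspc": ["707/10"]},
    {"id": "D", "description": "A shared link is stored.", "uspc": ["345/1"]},
    {"id": "E", "description": "A database index is rebuilt.", "uspc": ["707/3"]},
]
s1, s2, s3, s4 = funnel(records)
print([len(s) for s in (s1, s2, s3, s4)])
```

Each step only removes records from the previous result, which mirrors the decreasing hit counts listed for S1 to S4.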
2.2 An overview of main patentees and time evolution of patent publications
Table 1 lists the top 20 patentees in the technical scope of search query S4. Microsoft Corporation and IBM
occupy the top two places in this software technology and own more than 50% of total patents. Google, Yahoo,
and Facebook are world-famous companies, but they lag far behind the top two. The rightmost column shows the
percentage of each patentee's patents published after 2006 relative to its total. It shows that 11 patentees
entered this technical field after 2006, so we further limited the pool to publications after 2006, which is
the starting year for forming the technology-function matrices in this paper.
Table 1 Patent numbers of top 20 patentees
Rank  Patentee                                      Total  After 2006  New rank  Percentage
1     Microsoft Corporation                         121    82          1         67.8%
2     International Business Machines Corporation   119    51          2         42.9%
3     Cisco Technology, Inc.                        25     15          5         60.0%
4     Google Inc.                                   17     16          4         94.1%
5     PatentVC Ltd.                                 16     16          4         100.0%
6     Yahoo! Inc.                                   14     12          8         85.7%
7     Salesforce.com, Inc.                          13     13          6         100.0%
9     Nortel Networks Limited                       12     12          8         100.0%
9     Fujitsu Limited                               12     7           16        58.3%
11    ACCENTURE GLOBAL SERVICES LIMITED             10     7           16        70.0%
11    AOL Inc.                                      10     10          9         100.0%
13    Sprint Communications Company L.P.            9      9           10        100.0%
13    Juniper Networks, Inc.                        9      8           12        88.9%
14    NetApp, Inc.                                  8      8           12        100.0%
17    NEC CORPORATION                               7      5           19        71.4%
17    Nokia Corporation                             7      7           16        100.0%
17    SWsoft Holdings, Ltd.                         7      7           16        100.0%
19    FACEBOOK, INC.                                5      5           19        100.0%
19    Actifio                                       5      5           19        100.0%
20    DROPBOX, INC.                                 3      3           20        100.0%
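The percentage column and the count of new entrants can be recomputed from the Total and After 2006 columns. The sketch below copies those two columns from Table 1. Treating a patentee whose patents were all published after 2006 as a "new entrant" is an assumption about how the count of 11 was obtained, and the last line measures the top-two share against the 20 listed patentees only, which is one possible reading of "more than 50%".

```python
# (patentee, total publications, publications after 2006), copied from Table 1.
table1 = [
    ("Microsoft Corporation", 121, 82),
    ("International Business Machines Corporation", 119, 51),
    ("Cisco Technology, Inc.", 25, 15),
    ("Google Inc.", 17, 16),
    ("PatentVC Ltd.", 16, 16),
    ("Yahoo! Inc.", 14, 12),
    ("Salesforce.com, Inc.", 13, 13),
    ("Nortel Networks Limited", 12, 12),
    ("Fujitsu Limited", 12, 7),
    ("ACCENTURE GLOBAL SERVICES LIMITED", 10, 7),
    ("AOL Inc.", 10, 10),
    ("Sprint Communications Company L.P.", 9, 9),
    ("Juniper Networks, Inc.", 9, 8),
    ("NetApp, Inc.", 8, 8),
    ("NEC CORPORATION", 7, 5),
    ("Nokia Corporation", 7, 7),
    ("SWsoft Holdings, Ltd.", 7, 7),
    ("FACEBOOK, INC.", 5, 5),
    ("Actifio", 5, 5),
    ("DROPBOX, INC.", 3, 3),
]

# Percentage of each patentee's publications that appeared after 2006.
percentages = {name: round(100 * after / total, 1) for name, total, after in table1}

# Assumption: a patentee with 100% of its publications after 2006 is a new entrant.
new_entrants = [name for name, total, after in table1 if after == total]

# Share of the top two patentees among the 20 listed patentees.
listed_total = sum(total for _, total, _ in table1)
top_two_share = 100 * (table1[0][1] + table1[1][1]) / listed_total

print(percentages["Microsoft Corporation"], len(new_entrants), round(top_two_share, 1))
```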
Fig.1 shows the time evolution of patent publications from 2006 to 2013. Light bubbles are the patent numbers
for data processing, and dark bubbles are those for multicomputer data transferring. Both increase nearly
linearly, and multicomputer data transferring has the higher positive slope.
[Figure 1: bubble chart titled "The numbers of patent publications from 2006 to 2012"; x-axis: year, 2005 to 2014; y-axis: number of publications, 0 to 140]
Fig.1 Time evolution of patent publications
2.3 Linear regression model
A linear model based on the number of inventors was used to forecast patent numbers in this paper:
$$y_i = \alpha \cdot x_i + \beta \qquad (1)$$
where $y_i$ is the number of patent publications in year i, $x_i$ is the number of inventors in year i, and
α and β are the two coefficients of the linear model.
3. Technology-function matrices for two segments on 2013
Fig.2 is the technology-function matrix at 2013 for data processing. We employed USPC subclasses as the
technologies, or technical solutions, on the x-axis, including: database and file access, database design,
file or database maintenance, collaborative document database and workflow, data integrity, and file
management.
Fig.3 is the technology-function matrix at 2013 for multicomputer data transferring. Its USPC subclasses
include: distributed data processing, computer conferencing, multicomputer data transferring via shared
memory, remote data accessing, network computer configuring, computer network managing,
computer-to-computer (c-to-c) session/connection establishing, c-to-c protocol implementing, and c-to-c data
routing.
Fig.2 A technology-function matrix for data processing on 2013
We randomly took a 10% sample and read it manually to identify the problems of technological development.
The problems on the y-axis of both matrices are improved efficiency, providing a flexible system, simplify
operations, enhance security, tracking or monitor, enhanced system consistency and reliability. A search
query for each problem could be organized at the same time. For example, the query for improved efficiency is
(optim* or efficien* or effective* or (improve* near3(performance* or congest*)) or accelera*).DSC.
We got 653 patent publications under this query, and then got the numbers of publications for the USPC
subclasses, or nodes, of the matrix. One publication may fall into more than one subclass.
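The node counting described above can be sketched as follows. This is a minimal illustration, not the authors' tooling: the record layout and the three sample publications are hypothetical, and it assumes that each publication has already been tagged with the problems whose queries it matched and with the USPC subclasses it belongs to.

```python
from collections import Counter

def build_matrix(publications):
    """Count publications on each (problem, technology) node.

    A publication classified in several subclasses is counted once on each
    of its nodes, so node counts can sum to more than the number of publications.
    """
    matrix = Counter()
    for pub in publications:
        for problem in set(pub["problems"]):
            for tech in set(pub["technologies"]):
                matrix[(problem, tech)] += 1
    return matrix

# Hypothetical records: 'problems' come from the problem queries,
# 'technologies' from the USPC subclasses of each publication.
publications = [
    {"id": "P1", "problems": ["improved efficiency"],
     "technologies": ["database and file access", "file management"]},
    {"id": "P2", "problems": ["improved efficiency", "enhance security"],
     "technologies": ["database and file access"]},
    {"id": "P3", "problems": ["enhance security"],
     "technologies": ["data integrity"]},
]

matrix = build_matrix(publications)
for (problem, tech), count in sorted(matrix.items()):
    print(f"{problem} x {tech}: {count}")
```

Here three publications produce five node counts, because P1 falls into two subclasses and P2 matches two problems.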
Fig.3 A technology-function matrix for multicomputer data transferring on 2013
To check the reliability of the matrix, we again randomly took a 10% sample for each query and read the
samples manually to check whether their technical problems were consistent with the query goal. The correct
percentages for the six problems are 75.4%, 78.4%, 70.6%, 57.8%, 70.7%, and 66.0%. The overall quality falls
between 62.18% and 77.46% at the 95% confidence level. An average quality of about 70% may not be excellent,
but a 90% saving in labor cost is an important merit.
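The paper does not state how the interval was computed. As a check, a Student-t confidence interval on the mean of the six percentages gives nearly the same bounds. The sketch below assumes that method and hard-codes the two-sided 95% critical value for 5 degrees of freedom.

```python
import math
import statistics

# Correct percentages for the six problem queries, taken from the text.
accuracies = [75.4, 78.4, 70.6, 57.8, 70.7, 66.0]

mean = statistics.mean(accuracies)
# Standard error of the mean, using the sample standard deviation (n - 1).
std_err = statistics.stdev(accuracies) / math.sqrt(len(accuracies))

# Two-sided 95% critical value of Student's t with n - 1 = 5 degrees of freedom.
T_CRIT_95_DF5 = 2.571

lower = mean - T_CRIT_95_DF5 * std_err
upper = mean + T_CRIT_95_DF5 * std_err
print(f"mean={mean:.2f}%, 95% CI=({lower:.2f}%, {upper:.2f}%)")
```

The bounds agree with the reported 62.18% to 77.46% to within rounding, which suggests, but does not prove, that a t-interval was used.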
Fig.2 shows that in data processing, ‘database and file access’ has become the popular technology for every
technical problem, while ‘collaborative document database and workflow’ has not yet developed. Fig.3 shows
that in multicomputer data transferring, ‘computer network managing’ is the popular technology. If the patent
numbers in ‘database and file access’ or ‘computer network managing’ are not detailed enough, we can employ
lower-level USPC subclasses to expand the x-axis and quickly get the publication numbers on each node by this
method.
The technology-function matrix clearly shows the patent numbers on all problem-solution nodes. Once a
company has determined its R&D topics from the matrix, the next question is how many patent applications it
should file to become a main patentee. Answering this question requires patent numbers forecasting.
4. Patent numbers forecasting by linear regressions
4.1 Linear regressions of three patentees
A patent numbers forecast for every patentee can be obtained by using the linear regression model of formula
(1). Three main patentees, Microsoft, IBM, and Yahoo, were selected to demonstrate the results and check
their reliability.
Fig.4 is the linear regression diagram for Microsoft. We put the numbers of inventors and patent
publications from 1999 to 2013 into formula (1) to get α=0.2898, β=-0.3939. In this linear model, the
coefficient of determination is R²=0.9342. R² indicates how well the data points fit a statistical model,
with 0≤R²≤1: the higher the value of R², the stronger the explanatory power of the linear model. The R² value
here shows high reliability.
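A fit of this kind can be sketched with ordinary least squares. The paper does not list the yearly inventor and publication counts or name the fitting tool, so the data below are hypothetical and the function `fit_line` is only an illustration of how α, β and R² are obtained for formula (1).

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = alpha * x + beta; returns (alpha, beta, r2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    beta = mean_y - alpha * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((y - (alpha * x + beta)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot
    return alpha, beta, r2

# Hypothetical yearly counts (NOT the paper's data): inventors and publications.
inventors = [3, 8, 12, 20, 27, 35]
publications = [1, 2, 3, 6, 7, 10]

alpha, beta, r2 = fit_line(inventors, publications)
print(f"alpha={alpha:.4f}, beta={beta:.4f}, R^2={r2:.4f}")
```

With an intercept in the model, R² from this formula always lies between 0 and 1, which matches the range stated above.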
Fig.4 Linear regression diagram on patent numbers of Microsoft
Fig.5 is the linear regression diagram for IBM. The coefficients in formula (1) are determined by the same
method, giving α=0.3612, β=-0.3478 for IBM. In this linear model, the coefficient of determination is
R²=0.9133, which also shows high reliability.
Fig.5 Linear regression diagram on patent numbers of IBM
Fig.6 is the linear regression diagram for Yahoo. The coefficients in formula (1) are α=0.3512, β=0.1045.
In this linear model, the coefficient of determination is R²=0.9451, which again indicates high reliability.
Fig. 6 Linear regression diagram on patent numbers of Yahoo
4.2 Patent numbers forecasting in the next three years
The patent numbers forecast for thirteen patentees in the next three years can be easily calculated once the
coefficients have been determined. We took the average number of inventors in the latest five years as $x_i$
to get the number of patents $y_i$. Fig.7 shows the patent publication forecast for 2014-2016 and the total
patents in 2016.
In Fig.7, there would be 8.82, 9.05, and 8.04 publications in the next three years by Microsoft, and 6.80,
6.65, and 7.16 publications by IBM. It shows that the top two patentees are difficult to surpass. However,
the others are not difficult to catch up with: the third patentee will accumulate approximately 20 patents,
and the fourth patentee fewer than 20. If a company intends to become a top ten patentee in 2016, it should
file at least 9 applications in the next three years. If it wishes to become a top five patentee, it will
need 18 applications. The R&D budget can be determined from the number of patent applications. A patentee can
use the forecast diagram to plan for a higher rank.
[Figure 7: horizontal bar chart titled "Estimated Applicants distribution of U.S. publications in 2016"; one bar per patentee; series: 2006~2013 publications, estimated 2014, estimated 2015, estimated 2016]
Fig.7 Patent publication forecasting in 2014-2016
4.3 Patent numbers forecasting on interested technology and function
A forecast technology-function matrix can be formed by using the linear model of formula (1). Fig.8 is a
forecast matrix; three nodes of interest were selected to illustrate the results.
Fig.8 A forecast technology-function matrix
In Fig.8, the first node is ‘database and file access’ for solving ‘improved efficiency’ problems. On this
node, there will be 19, 20, and 19 patents in the next three years, reaching 212 in 2016. The second node is
‘computer conferencing’ on ‘tracking or monitor’; this node has a higher rate of increase and will reach 126
in 2016. The third node is ‘computer network managing’ on ‘enhance security’; this node has a lower rate of
increase, with 66 patents in 2016.
The other nodes can be forecast with the same model. The forecast matrices, year by year, visualize not only
patent densities but also rates of patent increase. A quickly growing bubble indicates a hot topic of
technology and function.
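The forecasting step used in Sections 4.2 and 4.3 can be sketched as two small functions: formula (1) evaluated at the average of the most recent inventor counts, and a running total that adds the yearly forecasts to the publications already counted. The function names and the five inventor counts are hypothetical. The coefficients, the 2006-2013 publication counts (82 and 51) and the yearly forecasts are the values quoted in the paper.

```python
def forecast(alpha, beta, recent_inventors):
    """Formula (1) with x taken as the mean of the most recent inventor counts."""
    x_bar = sum(recent_inventors) / len(recent_inventors)
    return alpha * x_bar + beta

def cumulative_total(base_count, yearly_forecasts):
    """Publications to date plus the forecast publications of the coming years."""
    return base_count + sum(yearly_forecasts)

# Microsoft coefficients from Section 4.1; the five inventor counts are hypothetical.
example = forecast(0.2898, -0.3939, [28, 30, 33, 34, 35])

# 2006-2013 publication counts and the 2014-2016 forecasts quoted in Section 4.2.
microsoft_2016 = cumulative_total(82, [8.82, 9.05, 8.04])
ibm_2016 = cumulative_total(51, [6.80, 6.65, 7.16])

print(round(example, 2), round(microsoft_2016, 2), round(ibm_2016, 2))
```

The same two functions apply to a node of the technology-function matrix if the node's own coefficients and counts are used in place of a patentee's.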
5. Conclusions
Patent numbers forecasting is important for research and development. A simple but highly reliable linear
regression method was illustrated in this paper. It is very helpful for determining how many patent
applications should be filed in a couple of years, and further for determining the R&D budget.
A linear regression model based on the numbers of inventors of the main patentees predicts the new ranking in
the coming years.
The linear regression model also applies to the nodes of a technology-function matrix to get a forecast
matrix for the future. Some nodes with low density this year may grow to high density in the coming years.
The forecast matrix can lower the risk of R&D compared with relying only on a present matrix.
The IT industry is characterized by its short lifecycle. Both famous and unknown companies appear in the
main patentee diagram. Even an unknown company does not need a long time to become a main patentee. Patent
numbers forecasting is therefore more powerful and necessary in this technical field.
Acknowledgement
This study was conducted under the “Cloud computing systems and software development project (3/3)” of
the Institute for Information Industry, which is subsidized by the Ministry of Economic Affairs of the
Republic of China.