World Patent Information 26 (2004) 91–96
www.elsevier.com/locate/worpatin
China traditional Chinese Medicine (TCM) Patent Database
Yanhuai Liu *, Yanling Sun
Patent Data Research and Development Center, The Intellectual Property Publishing House, State Intellectual Property Office of the PR China, No. 6,
Xitucheng Road, Haidian District, Beijing 100088, China
Abstract
The deep indexed China Traditional Chinese Medicine (TCM) Patent Database was established by the State Intellectual Property
Office (SIPO) of PR China, mainly to meet the needs of patent examination. The database
has been in use in the patent examination department of SIPO since April 2002.
The Chinese version of the database covers TCM related patent applications published in China from 1985 to the present. It contains
over 19,000 bibliographic records and over 40,000 TCM formulas. In order to present this database to WIPO, an English demo
version was created and opened to the world through the WIPO gate.
There are 29 search fields in the database that fall into four categories: bibliographic information, subject index terms, uses/
effects, TCM formulas. Rewritten titles and abstracts provide users with more searchable information. The system was built with
multiple search features: quick search, advanced search, TCM formula search, and search history tracing function. Moreover, two
special features created in the system are very useful for improving searching efficiency: cross-file search based on TCM dictionary
and TCM similarity search. The cross-file search enables users to locate a specific TCM in the TCM dictionary file and then cross-file
search that name in the patent bibliographic file for relevant records. TCM similarity search enables users to do one-stop searching
easily for complex search queries.
© 2004 Published by Elsevier Ltd.
Keywords: Chinese patent search system; Traditional Chinese Medicine Patent; TCM patent search; TCM Patent Database; Deep indexing; Herbal
medicine; Natural medicine
1. Introduction
Traditional Chinese medicines (TCM) have a history of more than
5000 years and have played an important role in treating human
diseases. In the modern world, developing a new chemical medicine
is very expensive: the average cost of developing one drug can
range from 80 million to several billion dollars. Moreover, drug
resistance is a growing problem. In this situation, people have
begun to pay close attention to inexpensive and efficient
traditional natural or herbal medicines. The fast growth of patent
applications related to natural medicines shows this trend clearly.
By 2001, the number of traditional medicine patent publications
had exceeded 50,000 worldwide. Since the China Patent Law came
into force in April 1985, the number of China TCM patent
publications has increased rapidly year by year. By the end of
2002, the number was approximately 20,000.
*Corresponding author. Tel.: +86-10-8203-4349; fax: +86-10-8203-4354.
E-mail addresses: [email protected] (Y. Liu), sunyanling@sipo.gov.cn (Y. Sun).
0172-2190/$ - see front matter © 2004 Published by Elsevier Ltd.
doi:10.1016/S0172-2190(03)00110-8
At the same time, it has become increasingly difficult for patent
examiners to search traditional medicine related patent documents
efficiently using existing databases. This is because the IPC has
no proper classifications for different traditional medicines, and
because each TCM component goes by many different names in patent
documents, e.g. the standard name, scientific name, Latin name,
and other synonyms. In order to solve this problem and provide an
efficient way for patent examiners and other users to search TCM
related patent documents, the State Intellectual Property Office
of PR China decided to create a deep indexed China TCM Patent
Database with greatly enhanced search functions. The database was
finished in March 2002 and has been used by patent examiners in
the Chinese Patent Office for one year.
With its user friendly interface and greatly enhanced search
functions, it achieves high recall and precision, and the patent
examiners in SIPO have been satisfied with the China TCM Patent
Database in their patent examination work. In order to present
the database at a WIPO Conference in 2002 (see footnote 1), an
English demo version of the database was built and has been
opened to the world through the WIPO Web gate.
So far, the Chinese version of the database covers TCM related
patent applications published in China from 1985 to the present.
It contains over 19,000 bibliographic records and over 40,000 TCM
formulas that were published from April 1985 to December 2002.
The English demo version of the database contains 1761
bibliographic records and 4177 TCM formulas. It covers TCM
related Chinese patent applications published from 1993 to 1994.
A sample database record from the English version is shown in
Fig. 1. The English and Chinese demo versions of the database
have been opened free of charge on the Web site:
http://pharm.cnipr.com (see footnote 2).
1 The third session of the Intergovernmental Committee on
Intellectual Property and Genetic Resources, Traditional Knowledge
and Folklore, held in Geneva, June 2002.
2 To use the Demo Database on this site, the user name is patentN,
where N is a number from 1 to 99. The password is also patentN,
with the same number as in the user name. For example, if the user
name is patent1, the password is also patent1. The authors welcome
feedback about the database, at their email addresses
([email protected] and sunyanling@sipo.gov.cn) or at [email protected].
2. Features of the China TCM Patent Database
There are 29 search fields in the database, which fall into the
following four categories: bibliographic information, subject
index terms, uses/effects, and TCM formulas. These fields were
designed for different searching needs, and include the following.
2.1. Bibliographic information
Field name                                        Field code
Title                                             TI
Abstract                                          AB
Application number                                AP
Publication number                                PN
Application date                                  AD
Publication date                                  PD
Applicant name                                    PA
Applicant address                                 ADDR
Applicant country/province code                   PAC
Inventor name                                     INR
Priority number                                   PRN
Main international patent classification          IC1
Secondary international patent classification     IC2
International patent classification (searches     IC
  IC1 and IC2 together, in ‘OR’ relationship)
2.2. Subject index terms
Field name                                        Field code
Biological process                                BIO
Chemical process                                  CHE
Analytical process                                ANA
Extraction process                                EXT
Physical process                                  PHY
Formulation process                               GAL
TCM formula composition                           MIX
New therapeutic use                               NUS
Index terms (searches ANA, EXT, BIO, CHE, PHY,    IT
  GAL, MIX, and NUS together, in ‘OR’
  relationship)
2.3. Uses/effects
Field name                                        Field code
Therapeutic effect                                THEF
Side effect                                       TOXI
Diagnostic effect                                 DIAG
Interactive effect                                DINT
Similar effect                                    ANEF
Effect (searches THEF, TOXI, DIAG, DINT, and      EFF
  ANEF together, in ‘OR’ relationship)
2.4. TCM formulas
The database provides a special interface and entry point for
searching TCM patented formulas (see Section 3.3).
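The composite field codes described in this section (IC, IT, and EFF) each search several underlying fields in an ‘OR’ relationship. The following is a minimal Python sketch of that expansion, not the database's actual implementation: the record layout (a dict mapping field codes to text) and the function names `expand_field` and `matches` are assumptions made for illustration, while the field codes themselves are taken from the field tables in this section.

```python
# Composite field codes and the fields they cover, as listed in this section.
COMPOSITE_FIELDS = {
    "IC": ["IC1", "IC2"],
    "IT": ["ANA", "EXT", "BIO", "CHE", "PHY", "GAL", "MIX", "NUS"],
    "EFF": ["THEF", "TOXI", "DIAG", "DINT", "ANEF"],
}


def expand_field(code):
    """Return the underlying field codes searched for `code`."""
    code = code.upper()
    return COMPOSITE_FIELDS.get(code, [code])


def matches(record, term, code):
    """True if `term` occurs in any field covered by `code` ('OR' relationship).

    `record` is a hypothetical dict mapping field codes to text.
    """
    term = term.lower()
    return any(term in record.get(f, "").lower() for f in expand_field(code))
```

With this layout, a term found only in the side effect field (TOXI) is matched by a search on EFF but not by a search restricted to THEF.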
3. Search approaches
There are several approaches for searching and using the database: quick search, advanced search, TCM
formula search, and search history.
Fig. 1. Sample record of the China TCM Patent Database.
3.1. Quick search
The ‘‘quick search’’ facility provides a simple search interface
with a text search over the entire contents of the database.
Users, especially non-professional users, can use this interface
to conduct searches easily and efficiently. For instance, to
search for patents related to medicines for treating hypertension
and containing radix ginseng, one can input hypertension and radix
ginseng into the query box (with comma(s) to separate
keywords/phrases) (see Fig. 2). There are two logic selections for
combining keywords/phrases: AND/OR. Users can make the logic
selection by clicking the buttons below the query box. The default
is AND.
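The quick search behaviour described above (comma-separated keywords/phrases combined with AND by default, or with OR) can be sketched as follows. This is an illustrative Python sketch only: the function name `quick_search`, the record layout (dicts of field code to text), and the use of simple case-insensitive substring matching are assumptions, not details given for the real system.

```python
def quick_search(records, query, logic="AND"):
    """Return records whose combined text matches the comma-separated terms.

    `records` is a hypothetical list of dicts mapping field codes to text.
    """
    logic = logic.upper()
    if logic not in ("AND", "OR"):
        raise ValueError("logic must be 'AND' or 'OR'")
    # Commas separate keywords/phrases; surrounding whitespace is ignored.
    terms = [t.strip().lower() for t in query.split(",") if t.strip()]
    if not terms:
        return []
    combine = all if logic == "AND" else any
    hits = []
    for record in records:
        # Quick search looks at the entire contents of the record.
        text = " ".join(record.values()).lower()
        if combine(term in text for term in terms):
            hits.append(record)
    return hits
```

For the example in the text, `quick_search(records, "hypertension, radix ginseng")` would keep only records mentioning both phrases, while passing `logic="OR"` would keep records mentioning either.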
Time range selection can be used to search patents in
a specific time period. There are three types of date:
application date, publication date, and database update
date. The default date type is publication date, and the default
time range covers the whole period from the beginning to the present.
At the bottom of the page, there are 45 frequently
used TCM names, which can be reset by users. Clicking
on the TCM names will make them appear in the query
box automatically.
The search history can be recalled and used later to
refine the searches.
3.2. Advanced search
The ‘‘advanced search’’ enables users to make nested Boolean
searches and field searches. Users may use keywords/phrases,
Boolean operators, and parentheses to express complex search
queries (see Fig. 3).
For instance, to search for medicines containing Aloe
for treating cancer, the query syntax can be as follows:
(cancer/thef OR A61P035/00/IC) AND aloe
The functions of logic selection, time range selection,
and frequently used TCM name list are the same as in
the quick search.
The search history can be recalled and used later to
refine the searches.
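To make the advanced query syntax concrete, the following Python sketch evaluates a nested Boolean query of the form shown above against a single record. It is a sketch under stated assumptions rather than the system's real parser: the record layout, the function names, the rule that AND binds more tightly than OR, the treatment of the text after the last slash as a field code, and the restriction to single-word terms are all assumptions. The field codes come from Section 2.

```python
import re

# Composite codes and the fields they cover, as listed in Section 2.
COMPOSITE = {
    "IC": ("IC1", "IC2"),
    "IT": ("ANA", "EXT", "BIO", "CHE", "PHY", "GAL", "MIX", "NUS"),
    "EFF": ("THEF", "TOXI", "DIAG", "DINT", "ANEF"),
}
SIMPLE = {"TI", "AB", "AP", "PN", "AD", "PD", "PA", "ADDR", "PAC", "INR", "PRN"}
FIELD_CODES = SIMPLE | set(COMPOSITE) | {f for fs in COMPOSITE.values() for f in fs}


def term_matches(record, token):
    """Match 'text/FIELD' against one field, or a bare keyword against all fields."""
    text, sep, field = token.rpartition("/")
    if sep and field.upper() in FIELD_CODES:
        fields = COMPOSITE.get(field.upper(), (field.upper(),))
        values = [record.get(f, "") for f in fields]
    else:
        text, values = token, list(record.values())
    return any(text.lower() in v.lower() for v in values)


def evaluate(query, record):
    """Evaluate a nested Boolean query (AND, OR, parentheses) against one record."""
    tokens = re.findall(r"\(|\)|[^\s()]+", query)
    pos = 0

    def parse_or():
        nonlocal pos
        result = parse_and()
        while pos < len(tokens) and tokens[pos].upper() == "OR":
            pos += 1
            rhs = parse_and()
            result = result or rhs
        return result

    def parse_and():
        nonlocal pos
        result = parse_atom()
        while pos < len(tokens) and tokens[pos].upper() == "AND":
            pos += 1
            rhs = parse_atom()
            result = result and rhs
        return result

    def parse_atom():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of query")
        token = tokens[pos]
        pos += 1
        if token == "(":
            result = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        return term_matches(record, token)

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return result
```

Splitting on the last slash matters for classification terms: in `A61P035/00/IC` the IPC symbol itself contains a slash, so only the final segment is read as the field code.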
3.3. TCM formula search
The ‘‘TCM formula search’’ facility includes TCM
formula logic search and TCM formula similarity
search.
3.3.1. TCM formula logic search
TCM formula logic search enables users to search TCM formulas
with nested Boolean search queries. For example, ‘Radix Angelicae
Dahuricae and Fructus Evodiae and Radix Aristolochiae and (Cortex
Moutan or Lignum Santali Albi)’.
Fig. 2. Quick search.
Fig. 3. Advanced search.
TCM formula logic search also features a limit
function which specifies the number range of compo-
nents that are contained in the target formulas.
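The formula logic search with a limit on the number of components can be illustrated with the short Python sketch below. The data layout (a dict from a formula identifier to its list of components), the function names, and the choice to express the Boolean query as a Python predicate are assumptions for illustration; the example query is the one quoted above.

```python
def formula_logic_search(formulas, predicate, min_components=1, max_components=None):
    """Return ids of formulas that satisfy `predicate` within the component limit.

    `formulas` is a hypothetical dict: formula id -> list of component names.
    `predicate` receives the set of lower-cased component names.
    """
    hits = []
    for formula_id, components in formulas.items():
        comps = {c.lower() for c in components}
        # Limit function: restrict the number of components in target formulas.
        if len(comps) < min_components:
            continue
        if max_components is not None and len(comps) > max_components:
            continue
        if predicate(comps):
            hits.append(formula_id)
    return hits


def example_query(c):
    """The nested Boolean example from the text, written as a predicate."""
    return (
        "radix angelicae dahuricae" in c
        and "fructus evodiae" in c
        and "radix aristolochiae" in c
        and ("cortex moutan" in c or "lignum santali albi" in c)
    )
```

Calling `formula_logic_search(formulas, example_query, max_components=4)` would then return only those matching formulas that contain at most four components.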
The search history can be recalled and used later to
refine the searches.
Users may choose to display the search results by
TCM formula or by application number.
3.3.2. TCM formula similarity search
One problem patent examiners often face is how to search, for
example, for all formulas containing any 7 out of 10 components.
With Boolean search, one needs to conduct 120 queries for this
particular search, one for each way of choosing 7 of the 10
components. Obviously, this is not feasible for patent examiners.
Similarity search has been developed to fulfill the request with a
single flexible query. As a result, the TCM formula similarity
search facility is particularly popular with patent examiners at
SIPO (see Fig. 4).
Fig. 4. TCM formula similarity search.
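The figure of 120 Boolean queries is the number of ways to choose 7 components out of 10, C(10, 7) = 10!/(7! 3!) = 120. The Python sketch below checks that count and shows one simple way a single query could replace those combinations, by counting shared components. The source does not describe the actual similarity algorithm, so the overlap-counting approach, the function name `similarity_search`, and the data layout are assumptions.

```python
from math import comb


def similarity_search(formulas, query_components, min_shared):
    """Return (formula id, shared count) for formulas sharing enough components.

    `formulas` is a hypothetical dict: formula id -> list of component names.
    Results are ordered by descending overlap, then by formula id.
    """
    query = {c.lower() for c in query_components}
    hits = []
    for formula_id, components in formulas.items():
        shared = query & {c.lower() for c in components}
        if len(shared) >= min_shared:
            hits.append((formula_id, len(shared)))
    return sorted(hits, key=lambda h: (-h[1], h[0]))


# Number of separate 7-component AND queries needed with plain Boolean search.
print(comb(10, 7))  # prints 120
```

A single call such as `similarity_search(formulas, ten_components, 7)` covers every 7-of-10 combination at once, because it only asks how many of the ten query components each formula contains.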
Additionally, the search can be limited with certain
words/phrases which could be defined as curative effect,
international class, Pharmsearch classification, or all
fields. This function makes it possible for patent exam-
iners to do both the novelty search and creativity
(inventive step) search at the same time.
Other functions such as a limit for the number of medicines included in the target TCM formulas, display
method, and frequently used TCM name list are also
available.
Search history cannot be saved for TCM formula
similarity searches.
3.4. Search history
Search histories can be saved automatically with the
search history function except for TCM formula simi-
larity search. Users can further define their searches with
this function by using search set numbers, Boolean
operators, parentheses, and other words/phrases to
construct new queries.
For example, a new query of ‘‘#1 and #2 and hypertension/thef’’ would search for all patents that
fulfill both query #1 and query #2, and also have a
keyword ‘‘hypertension’’ in the field of thef (therapeutic
effect).
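Combining saved search sets, as in the example above, amounts to set operations on the stored results. The following Python sketch illustrates the idea; the class `SearchHistory`, the helper `combine_and`, and the representation of results as sets of patent identifiers are assumptions made for illustration, not a description of the system's internals.

```python
class SearchHistory:
    """Keeps numbered result sets so later queries can refer to them as #1, #2, ..."""

    def __init__(self):
        self._sets = []

    def save(self, hits):
        """Store a result set and return its set number (starting at 1)."""
        self._sets.append(frozenset(hits))
        return len(self._sets)

    def get(self, number):
        """Return the result set saved under `number`."""
        if not 1 <= number <= len(self._sets):
            raise KeyError("no search set #%d" % number)
        return self._sets[number - 1]


def combine_and(*result_sets):
    """Intersect result sets, which is what 'and' between set numbers means."""
    result = set(result_sets[0])
    for other in result_sets[1:]:
        result &= set(other)
    return result
```

Under these assumptions, ‘‘#1 and #2 and hypertension/thef’’ corresponds to `combine_and(history.get(1), history.get(2), thef_hits)`, where `thef_hits` is the result of a fresh search for hypertension in the therapeutic effect field.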
4. TCM dictionary
The TCM dictionary is an auxiliary tool for TCM patent search and
TCM formula search. In this file, a unique record is created for
every TCM, which includes the TCM’s Latin drug name, English drug
name, Latin plant/animal/mineral name, Chinese standard name,
Chinese synonyms, and Chinese pinyin name (see Fig. 5). Users can
find a specific TCM through any of these access points, and then
use the standard Latin drug name (for the English version) or the
Chinese standard name/Chinese synonyms (for the Chinese version)
to search the bibliographic patent file or the TCM formula file.
The file crossover search is carried out by clicking the file
crossover search button after selecting the TCM names.
For example, the phrase ‘‘ground beetle’’ retrieves only one
patent record from the demo database using quick search, advanced
search, or formula search. However, a total of 51 patent records
and 153 formula records can be retrieved by using file crossover
search to search for ‘‘Eupolyphaga seu steleophaga’’, which is the
standard Latin drug name for ‘‘ground beetle’’ according to the
dictionary.
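The file crossover search can be pictured as a two-step lookup: first resolve whatever name the user knows to the standard Latin drug name in the dictionary, then search the formula (or bibliographic) file with that standard name. The Python sketch below illustrates this; the dictionary entry uses only the two names given in the example above, and the record layout, the key names, and the functions `build_index` and `crossover_search` are assumptions for illustration.

```python
# One illustrative dictionary record; a real record also holds Chinese names,
# pinyin, and the Latin plant/animal/mineral name, which are omitted here.
TCM_DICTIONARY = [
    {
        "latin_drug_name": "Eupolyphaga seu steleophaga",
        "names": ["ground beetle"],
    },
]


def build_index(dictionary):
    """Map every known name (lower-cased) to the standard Latin drug name."""
    index = {}
    for entry in dictionary:
        standard = entry["latin_drug_name"]
        index[standard.lower()] = standard
        for name in entry["names"]:
            index[name.lower()] = standard
    return index


def crossover_search(name, index, formulas):
    """Resolve `name` via the dictionary, then search the formula file.

    `formulas` is a hypothetical dict: formula id -> list of component names,
    indexed under standard Latin drug names.
    """
    standard = index.get(name.lower())
    if standard is None:
        return []
    return [
        formula_id
        for formula_id, components in formulas.items()
        if standard.lower() in {c.lower() for c in components}
    ]
```

This explains the gap in the example: records are indexed under the standard Latin drug name, so a direct text search for the English name finds little, while a search routed through the dictionary finds all records containing that medicine.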
5. Conclusion and ongoing project
After one year of search practice in the patent examination
department in SIPO, our examiners are well satisfied with the
database’s value and the good results obtained. In addition, the
database has been well received externally, for instance by
attendees at the conference referred to in Section 1; for example,
the experts from WIPO and the USPTO commented favourably, based on
the English Language Demo Database.
Fig. 5. TCM dictionary.
The deep indexed China TCM Patent Database offers high search
efficiency, high search quality, and powerful search functions. It
is one of the world’s advanced TCM patent databases. We are now
translating the whole database from Chinese into English; the
translation work for the English version will take several months.
In addition to the China TCM Patent Database, a more powerful
database search system covering worldwide traditional medicine
patents is being planned in China. More traditional medicine
patents, published in Japan, Korea, India, America and other
countries, will be deep indexed and merged into one unified
database system. A unified searching system will also be built and
connected to all TCM related databases distributed in China. It is
expected that this TCM related database platform will provide a
new user friendly searching interface and a powerful search system
for both patent examiners and the worldwide public in the near
future.
Acknowledgements
This article has been developed from a presentation
given by the corresponding author, Yanhuai Liu, at the
annual conference of the Patent and Trademark
Depository Library Program (PTDLP), organized by
the USPTO in Crystal City, Arlington, Virginia, USA, in March 2003.
Yanhuai Liu is a researcher and Director at the Patent Data R&D Center, Intellectual Property Publishing House, SIPO of PR China. She started research work in patent documentation indexing in 1985 and led the indexing work for the Chinese Patent Abstract Database during 1987–1993. She is the organizer of a research project for deep indexing Chinese chemical patent information and is in charge of establishing the deep indexed China TCM Patent Database and China Chemical Medicine Patent Database. As Editor-in-Chief, she published the book ‘‘A Guide for Patent Database Searching in China and Other Countries’’.
Yanling Sun received degrees in Biology and Information Management from Peking University in China and is a researcher and deputy director at the Patent Data R&D Center, Intellectual Property Publishing House, State Intellectual Property Office of PR China. She is a specialist in patent data processing and database building. She is leading the building of the China TCM Patent Database, and has presented the database at meetings organized by WIPO. As Editor-in-Chief, she published the book ‘‘Search Patent Information on Internet’’.