II-SDV 2014: Design and development of a novel patent alerting service (Bayer HealthCare, Germany)
TRANSCRIPT
Design and development of a novel Patent Alerting System
2014-04-14 • Dr. Wolfgang Thielemann
Slide No. 2 • 2014-04-14 Wolfgang Thielemann • Dr. Ortrud Steinführ • 23-JUN-2011
Agenda/Content
1. Introduction
2. Workflow
3. Using the platform to Search, Browse and Filter
4. Email push service
5. Summary
1. Introduction: What does this novel service do and why did we need it?
Patent Alerting System: What does it do?
Patent alerting options
Patent alerting options existing at the start of the project:
• End users browse the results of commercial alerting services like “Current Patents Gazette”
• End users set up alerts in databases (e.g. SciFinder) themselves
• Information professionals create alerts in various added-value databases and/or patent full-text databases
Challenges for a novel, proprietary alerting service:
• The number of newly published healthcare & chemistry patents is huge (2000-5000/week)
• The range of topics to be tracked, and the corresponding terminology, is huge too
A powerful, precise and focused alerting system is needed
What did we expect from our new patent alerting system?
A. Using advanced text mining workflows, new patents have to be categorized into clearly defined, project-specific folders. Criteria for categorization should include:
• Drug action / molecular target (e.g. PDE 5 inhibitors)
• Specific medical condition (e.g. pulmonary hypertension)
• Specific technology (e.g. positron emission tomography)
• Compound class (e.g. antibodies)
B. The platform has to provide chemical structures that are representative of the novel chemical space covered in medicinal chemistry patents
C. The platform must be easy to use and end-user searchable
D. Minimal costs for creation, maintenance, uploads, license fees and hardware
2. Workflow
Overall workflow of novel patent alerting service
Sources:
• Patent full-text source: Orbit (broad healthcare & chemistry related search for all alerts)
• Chemical structure DBs: CAS / Registry, WPIX / DCR, SureChem (patent numbers of substructure hits for selected alerts)
Steps:
1. Download of patent full-text
2. Filtering and categorization
3. Adding key content
Enrichment with chemical structures
Patent alerting platform:
Tabular patent sheets enriched & categorized with:
• Project name
• Indication
• Molecular targets
• Chemical structures
• Technologies
• …
Access:
• Email push-service; RSS feeds
• Searching or Browsing
• APIs to other proprietary platforms
The engine within the novel service
1. Download of patent full-text
2. Filtering and categorization
3. Adding key content
The engine within the novel service
It’s not a commercial black box!
We have full control over:
• The process
• The vocabulary
• The rules
… and can adjust it to the needs of our organization!
Details of proprietary categorization, filtering, indexing steps
Input: 2000-5000 newly published healthcare and chemistry related patent applications per week (first published member of a patent family + first US or EP application)
Filtering & Categorization: in-depth text mining analysis of the patent full-text for identification of relevant patents
Result: typically 0-5 patents per alert per week
Adding key content: in-depth text mining analysis of the patent full-text to extract, standardize and add key terms (targets, indications, technologies etc.)
Added key content
Keywords relating to the following topics are extracted, standardized and added:
• Alert relevant keywords (e.g. drug action)
• Indication
• Molecular target (official NCBI Gene name + Gene ID)
• Formulation (Route of Admin + Dosage forms)
• Species
• Technologies (e.g. prodrug, freeze drying, pegylation etc.; can be augmented to the needs of the organization)
• Molecule type (small molecules, biologicals, natural products)
• Patent type (compound, formulation, method general, diagnosis, preparation method, combination etc.)
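The "extract, standardize and add" step can be illustrated with a minimal sketch. It is not the actual text mining engine described in the slides: the vocabulary, the function names `build_patterns` and `extract_terms`, and the simple plural handling are all assumptions made for illustration.

```python
import re

# Hypothetical controlled vocabulary: preferred term -> spelling variants.
# The real vocabulary and rules of the service are not given in the slides.
VOCABULARY = {
    "PDE 5 inhibitor": ["PDE5 inhibitor", "PDE-5 inhibitor", "phosphodiesterase 5 inhibitor"],
    "pulmonary hypertension": ["pulmonary hypertension"],
}

def build_patterns(vocabulary):
    """Compile one case-insensitive pattern per preferred term."""
    patterns = {}
    for preferred, synonyms in vocabulary.items():
        # Longest variants first so that longer spellings win over shorter ones.
        variants = sorted({preferred, *synonyms}, key=len, reverse=True)
        alternation = "|".join(re.escape(v) for v in variants)
        # Optional trailing "s" is a crude plural rule (assumption).
        patterns[preferred] = re.compile(rf"\b(?:{alternation})s?\b", re.IGNORECASE)
    return patterns

def extract_terms(text, patterns):
    """Return {preferred term: number of mentions} for terms found in text."""
    counts = {}
    for preferred, pattern in patterns.items():
        hits = pattern.findall(text)
        if hits:
            counts[preferred] = len(hits)
    return counts
```

Every spelling variant found in the full text is reported under its preferred term, which is what makes the added keywords usable as uniform filter values later on.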
Details of enrichment with chemical structures
SureChem* identifies chemical compounds:
• from names (incl. IUPAC, brand names, generic names, trivial names) within 24 h after publication of a patent (WO, US, EP)
• from drawn structures (only high quality structures without variables) within 2-3 days after publication of a patent (WO, US, EP)
We add these structures to the new patents within the patent alerting workflow
* will soon become SureChEMBL
Timelines: Providing alerts as fast as possible
Timeline figure (Tuesday to Monday):
• New WO, US and EP publications, plus documents from other patent offices
• Generation of chemical structures: Name2str and conversion of chemical drawings
• Download, categorization, text mining, upload
• Alert
Selection criteria for novel patents
Documents & keywords
Inclusion: the keywords are mentioned:
• 1x in core fields (e.g. title)
• Multiple times in other fields (e.g. description)
Exclusion: the keywords are mentioned:
• a few times in a non-core field
Chemical structures
Inclusion:
• Novel compounds (incl. intermediates) with a global frequency of <= 10 in all patents* which also pass our chemical purging filter
Exclusion:
• Common reagents, catalysts, or drugs which are often mentioned in “washing lists”
* General WO, US, EP patent database
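The selection criteria above can be sketched in a few lines. This is an illustration under stated assumptions, not the proprietary rule set: the slide names only the title as an example of a core field, it does not say how many mentions count as "multiple" (the default of 3 is a guess), and it does not say whether the two inclusion conditions are combined with OR (assumed here). The function names and the `passes_purging_filter` callback are hypothetical.

```python
from __future__ import annotations

# Assumption: only "title" is named on the slide as a core field.
CORE_FIELDS = {"title"}

def count_mentions(text: str, keyword: str) -> int:
    """Case-insensitive count of keyword occurrences in text."""
    return text.lower().count(keyword.lower())

def include_document(fields: dict[str, str], keywords: list[str],
                     min_other_mentions: int = 3) -> bool:
    """Include if a keyword occurs at least once in a core field,
    or at least min_other_mentions times across the other fields."""
    core = sum(count_mentions(text, kw)
               for name, text in fields.items() if name in CORE_FIELDS
               for kw in keywords)
    other = sum(count_mentions(text, kw)
                for name, text in fields.items() if name not in CORE_FIELDS
                for kw in keywords)
    return core >= 1 or other >= min_other_mentions

def include_structures(compounds, global_frequency, passes_purging_filter,
                       max_frequency: int = 10):
    """Keep compounds that are rare across all patents (frequency <= 10,
    as on the slide) and that pass a caller-supplied purging filter."""
    return [c for c in compounds
            if global_frequency.get(c, 0) <= max_frequency
            and passes_purging_filter(c)]
```

A patent that mentions a keyword only once or twice in its description is dropped, while common reagents fall out of the structure list because they occur in far more than ten patents.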
3. Using the platform to Search, Browse and Filter
Alerting System Main Navigation: Subscriptions & Search
Shows alerts you have subscribed to
All other available alerts
Google-like search (keywords or chemical structures) with faceted filter options
Administration of alerts (only for information professionals)
Alerting System Main Navigation: Entry page “My Subscriptions”
Information about scope of the alert
Stop subscribing to the alert
Show list of all documents collected for this alert so far
Alerting System Main Navigation: Other available alerts
Information about scope of the alert
Subscribe to the alert
Show list of all documents collected for this alert so far
Document List view
All documents related to Endometriosis alert:
Links in Document List View
Link to corresponding enhanced alerting system record with:
• Key content
• Chemical structures
Original full-text
Link to corresponding patent record from Thomson World Patent Index
Original PDF Patent DB
Document view
added key content
Document view + chemical structures
Browse records + chemical structures
Alerting System Main Navigation: Search
Search result list with filter options
Search for: “endometriosis”
Faceted Filter Options
Text mining generated, standardized added value terms allow easy filtering:
Logic “OR” within one topic and “AND” between topics
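The filter logic (OR within one topic, AND between topics) can be written down as a short sketch. The record layout, the facet names and the functions `matches` and `filter_records` are assumptions for illustration; the slides do not describe how the platform implements its filters.

```python
from __future__ import annotations

def matches(facets: dict[str, set[str]], selection: dict[str, set[str]]) -> bool:
    """True if the record satisfies every topic with a non-empty selection.

    Within one topic a single shared term is enough (OR);
    all topics with selected terms must be satisfied (AND).
    """
    for topic, selected_terms in selection.items():
        if not selected_terms:
            continue  # nothing ticked for this topic, so it does not restrict
        if not (facets.get(topic, set()) & selected_terms):
            return False
    return True

def filter_records(records: list[dict], selection: dict[str, set[str]]) -> list[dict]:
    """Keep the records whose facets match the selection."""
    return [r for r in records if matches(r["facets"], selection)]
```

Ticking "endometriosis" and "pulmonary hypertension" under the indication topic widens the result list, while additionally ticking "small molecules" under the molecule type topic narrows it again.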
4. Email push service
5. Summary
Advantages of proprietary Patent Alerting System
• Grouping of patents into project-specific folders (full flexibility in defining project-relevant parameters like indication, target, technology etc.). Faster evaluation by project team members due to high relevance of hits
• Use of already existing, proprietary terminology for search and analysis of patents
• System maintained by patent information professionals who are experts in patent-specific sources, challenges and pitfalls as well as in scientific text mining analysis
• System open to link to and exchange information with other internal platforms via automated protocols / APIs
• Fast provision of representative chemical structures allows quick evaluation of novel chemical space as well as easy download & processing of real chemical structures
Thank you!
Acknowledgements:
Selected images were licensed from: 123RF©