CAS REGISTRY℠: Maintaining the gold standard for chemical substance information

Roger Schenck, CAS Marketing, Spring 2011 ACS Meeting


DESCRIPTION

Presented at the 241st ACS National Meeting & Exposition, March 27-31, 2011

TRANSCRIPT

Page 1: CAS REGISTRY: Maintaining the gold standard for chemical substance information

CAS REGISTRY℠: Maintaining the gold standard for chemical substance information

Roger Schenck, CAS Marketing, Spring 2011 ACS Meeting

Page 2

April 21, 2011

CAS is a division of the American Chemical Society. Copyright 2011 American Chemical Society. All rights reserved.

Agenda
• Social networking and the CAS databases
• How has the CAS substance collection grown over the years?
• What are the sources of these substances?
• How is CAS responding to the challenge of the accelerating discovery of substances?
• How does CAS maintain the REGISTRY “gold standard” of substance information?

Page 3

Social media is transforming the communication landscape

“scifinder and I are bff” – Meta Cullen (@Klav13r, 36 followers)

Page 4

SciFinder users rely on social media to communicate with CAS

“I have 2 suggestions: 1. Have a flag to indicate that the article is not fully indexed yet; 2. Have an option to display the references by relevance….” – Bruce Slutsky, SCIFINDERTALK, December 4, 2009

“…I only now noticed that the arXiv preprints selected for inclusion in CAPLUS receive full indexing including registry number assignments. Talk about value-added. Way to go, CAS!” – A. Ben Wagner, Sciences Librarian

“Looked up a steroid structure on Scifinder for my NMR class today. It's interesting I could tell what info I'd get based on publication year.” – @cisforcartilage (30 followers)

Page 5

Recent survey* explored where SciFinder® users network

*Source: SciFinder Mobile Device and Social Collaboration Survey – 2/10

Page 6

Social networking sites have both personal and professional uses for SciFinder customers

Page 7

Customers want more convenient access to other scientists and CAS

Page 8

The CAS databases are built by scientists around the world

Page 9

CAS covers chemical reactions from dissertations
• Dissertations are a unique source of synthetic information
• In 2010 CAS added nearly 5,000 dissertations to CAplus
• CAS continues to investigate persistent sources for dissertations

Page 10

Growth in published chemistry literature has stayed strong in the last decade

[Figure: Publications from 2003-2010 in CAplus. x-axis: Year (2003 to 2010); y-axis: Publications in CAplus (0 to 1,600,000)]

Page 11

CAS analyzes global chemical information, including publications from Asia
• 10,000 serial journal titles and 61 patent authorities worldwide
• 2,100 Asian serial journal titles
• All major Asian patent authorities, including offices in
  – Hong Kong (HK)
  – India (IN)
  – Japan (JP)
  – Philippines (PH)
  – People’s Republic of China (CN)
  – Singapore (SG)
  – South Korea (KR)
  – Taiwan (TW)
• Country-specific databases for Asian nations (on STN®)
  – KOREAPAT
  – JAPIO
  – RUSSIAPAT

Page 12

Chinese, Japanese, and Korean language publications account for 43% of new CAplus database records in 2010

[Figure: New Publications in Chinese, Japanese, and Korean Languages, 2000-2010. x-axis: Year (2000, 2007, 2008, 2009, 2010); y-axis: Percentages of Publications (0% to 50%); bar values, in year order: 22%, 31%, 33%, 33%, 43%]

Page 13

Patenting of new chemical research has accelerated, especially patenting of Chinese chemical research

[Figure: Chemistry Patent Publications, 1999-present. x-axis: Year (1999 to 2010); y-axis: Documents in CAS Databases (0 to 120,000); series: China, Japan, USA, WIPO]

Page 14

CAS continues to uncover new small molecules in significant numbers

[Figure: Small Molecules in CAS REGISTRY, 2003-2010. x-axis: Year (2003 to 2010); y-axis: Small Molecules (Millions) (0 to 60)]

Page 15

What were the sources of these molecules in 2010?

[Figure: Sources for CAS REGISTRY, 2010. x-axis: Sources (Journals and Patents, Chemical Libraries, Chemical Catalogs, Other Sources); y-axis: Number of Small Molecules (0 to 4,000,000)]

Page 16

Increasingly, new chemical discoveries are being disclosed through patent activities

[Figure: Percentage of New Compounds added to CAS REGISTRY from Patents. x-axis: Year (1976, 2010); y-axis: Percentage of total (0% to 50%); bar values, in year order: 14%, 46%]

*CA Database annual average is 23% patents

Page 17

CHEMCATS continues to grow and remains a source of new small molecules

[Figure: Number of Catalog Products and Unique Substances in the CHEMCATS Database. x-axis: date (8/1/2007 to 12/1/2010, in two-month steps); y-axis: Number of Products/Substances (0 to 44,000,000); labeled series: Number of Unique Substances]

Page 18

Chemical substances from web-based sources provide a moderate addition to the small molecules in REGISTRY

1.6M substances have been captured from Internet substance collections

[Figure: Larger web-based sources for REGISTRY. x-axis: Internet Substance Collections (ZINC, ChemSpider, ChemDB, Broad Inst, Ambinter, NIST Mass Spec, NCI 3D); y-axis: Number of Substances (0 to 400,000)]

Page 19

The CAS databases reveal some small molecule trends
• New substances still come mainly from journals and patents, but more and more new substances are coming from the patent literature
• Unique substances are found in chemical catalogs and chemical libraries
• Internet sources provide some otherwise undisclosed substance information
• The Pacific Rim, especially China, is increasingly productive
• Chemists are very inventive: more new chemical entities, not fewer, are being disclosed every year

Page 20

What criteria must a substance meet to be included in the CAS REGISTRY?

A substance must be
• Identified by CAS as coming from a reputable source, including but not limited to patents, journals, chemical catalogs, and web-based substance collections
• Described in largely unambiguous terms
• Characterized by physical methods
• Described in a patent document example or claim
• Consistent with the laws of atomic covalent organization

Page 21

For complex chemistry, CAS chemists classify substance information and verify graphical processes and structures

1. Review reaction and structure
2. Create registration record

Page 22

CAS chemists interpret when compounds are described in terms other than singular structures or names

Page 23

Since 1997, patents have provided more new small molecules than have journals

CAS analysis of a typical PCT application
• 917 indexed compounds from Examples and Claims
• 576 new compounds added to CAS REGISTRY
• 613 single-step reactions
• 5,394 multi-step reactions
• 1,029 reaction participants
• 2,119 substituent definitions for Markush structures added to MARPAT®

Page 24

CAS specialists in many fields of chemistry interpret author terminology to register compounds

Author identified this compound only as D4GlcUA-GlcNAc-(GlcUA-GlcNAc)5-PA

Page 25

Patents regularly describe substances in ambiguous ways: In WO 2007089907, this “desired product” is fully characterized

CAS RN 203796-03-6

Page 26

Relatively new substance classes can be registered

Metal-organic frameworks show great potential for capture of H2 or CO2 or in other gas separation processes

Page 27

CAS REGISTRY substances are enhanced with spectra, numeric properties, tags, and published sources

Spectra
• More than 88M calculated NMR spectra (1H, 13C), with 17M added in 2010
• More than 700,000 experimental spectra (MS, NMR, IR, Raman), with another 190,000 newly acquired MS added in 2010

Numeric
• More than 4.3M experimental property values (melting point, boiling point, optical rotatory power, etc.)
• 10.4M data tags linked to indexed documents
• 3.0B calculated metrics (bio-concentration, Log P, Lipinski, etc.)

Page 28

Chemical libraries are the second-largest source of new molecules

What are chemical libraries?
• Often a collection of drug-like small molecules to be used as leads in high-throughput screening or industrial manufacture
• Each substance has associated information stored in some kind of database, such as the
  – Chemical structure
  – Purity
  – Quantity
  – Physicochemical characteristics

Page 29

Chemical catalogs with products “in stock” are a growing source of new molecular descriptions

Page 30

CHEMCATS submissions are carefully checked

3-Chlorobenzoic acid, pregn-4-ene-3,20-diylidenedihydrazide

CAS Registry Number 347841-26-3
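One small, mechanical part of checking a Registry Number can be sketched in code. The last digit of a CAS Registry Number is a check digit: the preceding digits are weighted 1, 2, 3, ... from right to left, summed, and the sum is taken modulo 10. The sketch below illustrates only that arithmetic rule. It is not a description of how CAS reviews CHEMCATS submissions, and the function names `cas_check_digit` and `is_valid_cas_rn` are hypothetical names chosen for this example.

```python
import re


def cas_check_digit(body_digits: str) -> int:
    """Compute the check digit for the digits that precede it in a CAS RN.

    Digits are weighted 1, 2, 3, ... starting from the rightmost digit;
    the check digit is the weighted sum modulo 10.
    """
    total = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(body_digits), start=1)
    )
    return total % 10


def is_valid_cas_rn(cas_rn: str) -> bool:
    """Return True if the string has CAS RN layout and a matching check digit.

    This only tests the format (2-7 digits, 2 digits, 1 check digit) and the
    checksum. It does not tell you whether the number is actually assigned.
    """
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas_rn)
    if match is None:
        return False
    first, second, check = match.groups()
    return cas_check_digit(first + second) == int(check)


if __name__ == "__main__":
    # Both Registry Numbers quoted in this presentation.
    for rn in ("347841-26-3", "203796-03-6"):
        print(rn, is_valid_cas_rn(rn))
```

A passing checksum only catches transcription errors such as a mistyped or transposed digit; confirming that a name, structure, and Registry Number actually belong together still requires the kind of expert review the slides describe.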

Page 31

Other sources of new small molecules are national chemical regulatory inventories

Page 32

CAS scientists (biologists, chemists, and information scientists) are substance experts with advanced degrees

Collectively they know 50 different languages

They monitor the entire range of scientific literature that contains chemical information

Page 33

CAS maintains the REGISTRY gold standard of quality substance information on a daily basis

A recent example

Page 34

CAS maintains the REGISTRY gold standard of quality substance information on a daily basis

Page 35

CAS maintains the REGISTRY gold standard of quality substance information on a daily basis

Substance WR319535 is the 1R, 4S enantiomer as drawn.

Page 36

CAS maintains the REGISTRY gold standard of quality substance information on a daily basis

Substance WR319581 is the 1S, 4R enantiomer of WR319535.

Page 37

Bed bugs have returned

Delores Stewart displays bed bugs found in her home in Columbus, Ohio.

Page 38

Bed bugs are making their way north in Ohio

http://www.bedbugdatabase.com/

Page 39

Companies are actively patenting bed bug insecticides

Page 40

Lots of recent activity

Page 41

Landscape of the research fronts shows what areas are being commercialized

Page 42

Summary
• CAS REGISTRY draws on a wide variety of sources, much more than journals and patents
• REGISTRY records are rich with supplemental data like spectra, numeric properties, tags, and published sources
• CAS scientists add value to substance records by applying subject matter expertise and CAS rules
• CAS scientists also ensure that quality keeps up with quantity by correcting records, where necessary
