2005 07 19 ivt integration techniques

Integration Techniques for ELNs Simon Coles Co-founder & CTO

Upload: simon-coles

Post on 20-Aug-2015


Integration Techniques for ELNs

Simon Coles, Co-founder & CTO

http://www.amphora-research.com/

Integration Techniques for ELNs

• My background
• Why do we need to integrate ELNs?
• What kinds of integration do we need to do?
• What prerequisites are there?
• Some examples of technologies and techniques
• Summary

• You can download copies of this presentation from our web site


My background

• MEng in Information Systems Engineering
• First “ELN” was a consulting project for Kodak
  • Started in 1996
  • Completely electronic, fully integrated
  • Thousands of users, worldwide
• This grew into Amphora
• Merged with PatentPad in 2003
  • Paper or electronic records according to legal preference
  • Scientists still get an “Electronic” system
• Partner with a wide variety of “ELN” vendors
• Member of CENSA, working on long term records, serving on Steering Team


Experience

• Primarily in ELNs for discovery
  • Where patents are a major concern
  • I am sure some of this is relevant to regulated areas, but that’s not my focus
• Work a lot with other “ELN” vendors
  • Seldom do you buy one system
  • Which means we end up seeing a lot of integration!
• In a variety of industries, all sizes of deployment
  • Pharma
  • Biotech
  • Chemicals
• Customers around the world, offices in the US & the UK


What’s an ELN?

• The term “ELN” is now used to describe a wide variety of systems
  • Science specific
    • Reaction planning tools, Cheminformatics databases, structure drawing tools
    • Analysis packages, LIMS
    • Workflow tools
  • General
    • Knowledge/Document Management
    • Scientific data management
    • Laptop/Tablet computers


Observations

• The term “ELN”
  • Is so ambiguous it can mean almost anything (especially to a marketing person)
  • Doesn’t help us much from a systems architecture perspective
• A company is unlikely to have just one system that could be called an “ELN”
• Those ELNs will need to integrate with your existing & future systems
• Your needs will change with time, so you need to be able to protect your investment
  • In data
  • In tools
  • In processes


Deconstructing “ELN”

[Diagram: “Broad” aspects (Security, Collaboration, Patent Protection, etc.) spanning niche systems A, B, C, D]

• At first sight, making an ELN project a success can look very complex
• ELN functionality can be split into two dimensions
  • Some aspects are common to everyone
  • Other requirements are specific to a particular group of scientists
• Splitting out the functionality into these dimensions really helps to keep you sane


Benefits

• The corporate functions (Legal, Records, etc.) can buy/provide a system that provides a service to the niche-specific systems
  • Meet corporate requirements for records etc.
  • Provide cross-discipline collaboration
• The individual niches can buy/find systems to support their specific needs
  • Leverage existing investments
  • Justified according to the benefits they bring
  • Removes any need to balance competing requirements
  • Reduce the need
• Systems can be acquired/purchased in a phased approach tailored to the needs & requirements of the business
• Life is a lot less stressful


Different levels of abstraction

[Diagram: “Broad” aspects spanning niche systems A, B, C, D; levels of abstraction shown as Projects, Experiments, Reports, Raw Data]

The “Experiment” is generally the boundary between Broad vs Deep systems


Types of integration

[Diagram: “Broad” aspects spanning niche systems A, B, C, D]

The Broad/Deep boundary is often exposed as network-level services, which are relatively standardized

Integrations between different niche systems are generally custom


What prerequisites are there?

• From your ELN product(s)
  • Open Interfaces
  • Open Data
• Plumbing
  • Various technologies, some simple, some more complex
  • Expertise - often in-house, sometimes consultants
• Good news - the Open Source movement is really helpful
  • Tools & techniques
  • Drive for openness
• Remember: you need to ask your vendor for all of the “Open” stuff before you sign the order


Open Interfaces

• What’s an “Interface”?
  • Where one system “prods” another to do something
  • Or get some information out
  • Or put some information in
  • Generally some data is passed back & forth
• What’s “open”?
  • Something you can use without undue burden or barrier
  • This covers both commercial and technical aspects
  • Concerns are very similar to those involved with Open Data


Open Data

• This is currently a bit of a blind spot for purchasers of IT systems

• Unfortunately, Open Data is absolutely critical
  • For long term records
  • For your ability to build up an integrated system
  • To protect your IP (partly from a patent perspective, but mainly from a re-use aspect)
  • To maintain a balanced relationship with your vendors

• This absolutely needs to be part of the ELN purchasing process


“Good” (open) file formats

• Publicly documented
• Legally unencumbered
  • No patents, copyright concerns etc.
  • Any patents or copyright must be in the public domain
• Ideally, self documenting (XML is a good start)
• Degrade gracefully
  • If you can’t read the data, at least you can see a picture
• Based on more open, primitive formats where possible
• At least two implementations of readers, one of which is Open Source
• Widely used (W3C or IETF standards are good signs)


Data formats for the long term

• Good
  • For text: Plain ASCII, Unicode, HTML, possibly RTF
  • For graphics: PNG, SVG
  • For structured data: XML
  • To preserve appearance: PDF
• Worry about
  • Storing files in databases
    • The database file format is probably undocumented
    • Store objects on the file system and use the database to point to them
  • Anything that is proprietary - there’s no excuse for it, and it dramatically increases your risk
  • Binary files generally
  • Mixing content in files (e.g. embedding XML in PDF)
  • Proprietary digital signatures
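The advice to store objects on the file system and let the database point to them could look roughly like the sketch below. It is only an illustration: the `records` table, its columns, and the function names are assumptions made up for this example, and SQLite stands in for whatever database you actually run.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def open_index(db_path=":memory:"):
    """Open the pointer database (hypothetical schema: name, path, checksum)."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS records "
        "(name TEXT PRIMARY KEY, path TEXT, sha256 TEXT)"
    )
    return db


def store_record(db, store_dir, name, content):
    """Write the content as an ordinary file; keep only a pointer in the database."""
    digest = hashlib.sha256(content).hexdigest()
    path = Path(store_dir) / f"{digest}_{name}"
    path.write_bytes(content)  # the record stays readable without the database
    db.execute(
        "INSERT INTO records (name, path, sha256) VALUES (?, ?, ?)",
        (name, str(path), digest),
    )
    db.commit()
    return path


def load_record(db, name):
    """Follow the pointer, read the file, and check it has not changed."""
    row = db.execute(
        "SELECT path, sha256 FROM records WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    path, digest = row
    content = Path(path).read_bytes()
    if hashlib.sha256(content).hexdigest() != digest:
        raise ValueError(f"checksum mismatch for {name}")
    return content


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as store_dir:
        db = open_index()
        store_record(db, store_dir, "experiment-001.xml", b"<experiment id='001'/>")
        print(load_record(db, "experiment-001.xml"))
```

If the database is ever lost or retired, the files are still there in their original open format, which is the point of the recommendation.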


IP concerns & data formats

• Companies have always used Proprietary Data Formats as a competitive weapon

• Companies are waking up to the use of IP tools (licenses, patents, copyrights) to reinforce their control over data formats

• Just because a format is published doesn’t mean it is open
  • The Microsoft Office XML formats are a particularly bad example
    • Right now it looks positively radioactive
    • They’re being very careful what they say which indicates to me they’re planning something
    • http://www.groklaw.net/article.php?story=20050330133833843
    • (see section: 4. Dissecting Microsoft’s “Patent License”)


Standards

• There are so many to choose from!
• Two key ways of generating “Standards”
  • De Facto - dominant supplier/format
  • De Jure - committee based
• Who gets to “bless” a standard?
• What makes a “good standard”?
  • De Jure process has difficulty keeping up with the real world
  • De Facto process has risk of lock-in
• Pragmatic approach
  • Expect your suppliers to use open file formats
  • If there is an acceptable standard, use it
  • Make sure you are using the right kind of format for each purpose


Technologies and techniques

• There are a wide variety of tools you can use to integrate IT systems
  • Tight vs Loose coupling
  • Synchronous vs Asynchronous
  • Text vs Binary
  • Proprietary vs Open
  • Simple vs Complex
• As a rule
  • Loose is cheaper than Tight coupling
  • Asynchronous is easier to manage than Synchronous
  • Text is easier to work with, and more flexible than Binary
  • Open interfaces are always better than Proprietary
  • Simple approaches are better than Complex ones


Considerations when picking tools

• Use stable interfaces
  • Get a commitment from the vendor about what they’ll keep stable across version upgrades
• Use public, documented interfaces
• Sample code is really really useful
• Pick language-neutral interfaces where possible
• Platform-neutrality
  • Don’t worry (too much) about locking yourself into Windows on the client
  • But if you lock yourself to Windows on the server, it is going to hurt


Glue Languages

• There are a number of really useful “Glue” languages around
  • Python (and Jython, and other relatives)
  • Perl (although I have some concerns about maintainability)
  • Groovy, Beanshell, etc.
• All of them
  • Play well with XML, http, SOAP etc.
  • Play well with OLE
  • Are cross platform
• My personal preference is Python
  • You can learn it in a matter of hours
  • You can read other people’s code
  • It does everything I need it to do
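As a rough illustration of what “glue” means in practice, the sketch below reads an XML export from one system and turns it into CSV that another tool could import, using only the Python standard library. The XML layout (`experiments`, `experiment`, `title`, and the `id` and `scientist` attributes) is invented for this example and does not come from any particular ELN product.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical export from one system; element and attribute names are made up.
SAMPLE_XML = """<experiments>
  <experiment id="E-001" scientist="asmith"><title>Suzuki coupling</title></experiment>
  <experiment id="E-002" scientist="bjones"><title>HPLC method check</title></experiment>
</experiments>"""


def experiments_to_csv(xml_text):
    """Convert the XML export into CSV text for a downstream tool."""
    root = ET.fromstring(xml_text)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "scientist", "title"])
    for exp in root.findall("experiment"):
        writer.writerow(
            [exp.get("id"), exp.get("scientist"), exp.findtext("title", default="")]
        )
    return out.getvalue()


if __name__ == "__main__":
    print(experiments_to_csv(SAMPLE_XML))
```

Both ends of this conversion are plain text, so either side can be inspected or replaced without touching the other.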


Cool stuff

• SOAP/Web Services
  • Valuable in many areas
  • But don’t treat it as a religion
  • There are lighter alternatives which bring most of the benefits for much less effort
  • The whole WS-* effort seems to have got out of control
• REST (XML over http) - a lighter alternative to SOAP
• File swapping (generally, in XML)
• HTTP GET/POST
  • Wonderfully easy to debug!
  • Very flexible
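File swapping is probably the simplest loosely coupled, asynchronous integration in the list above. A minimal sketch, assuming a shared drop directory and an invented `<message>` layout: the producer writes an XML file under a temporary name and renames it into place, and the consumer picks up whatever complete files it finds whenever it next runs.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def drop_message(drop_dir, name, fields):
    """Producer: write the XML under a temporary name, then rename it into place."""
    root = ET.Element("message")  # hypothetical message layout
    for key, value in fields.items():
        ET.SubElement(root, key).text = str(value)
    tmp_path = Path(drop_dir) / (name + ".tmp")
    final_path = Path(drop_dir) / (name + ".xml")
    ET.ElementTree(root).write(str(tmp_path), encoding="utf-8", xml_declaration=True)
    os.replace(tmp_path, final_path)  # the consumer never sees a half-written .xml
    return final_path


def collect_messages(drop_dir):
    """Consumer: parse every complete .xml file in the directory, then remove it."""
    messages = []
    for path in sorted(Path(drop_dir).glob("*.xml")):
        root = ET.parse(str(path)).getroot()
        messages.append({child.tag: child.text for child in root})
        path.unlink()
    return messages


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as drop_dir:
        drop_message(drop_dir, "rec1", {"experiment": "E-001", "status": "signed"})
        print(collect_messages(drop_dir))
```

Neither side needs the other to be running at the same moment, and a stuck message is just a text file you can open in an editor.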


Nice things to see

• Integration points exposed as stable URLs
  • For example, in our PatentSafe product, we have committed to stable URL formats to
    • Submit a record via http (content & metadata)
    • Get a record for display to the user
  • These can be used by other systems
  • And also embedded in Word documents...
• Lack of wheel re-invention
  • e.g. LDAP is The One True place for user information
  • e.g. RSS/Atom is The One True alerting mechanism
• Example code
  • In multiple languages
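To make the stable-URL idea concrete, here is a sketch of what a client of such an interface might look like. The host name, the `/submit` and `/records/` paths, and the metadata fields are all hypothetical; they are not PatentSafe's actual URL formats, which this transcript does not give. The code only builds the request and does not send anything.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "http://records.example.com"  # hypothetical server, not a real endpoint


def build_submit_request(content, metadata):
    """Build a POST with metadata in the query string and the content as the body."""
    url = f"{BASE_URL}/submit?{urlencode(metadata)}"
    return Request(
        url,
        data=content,
        method="POST",
        headers={"Content-Type": "application/pdf"},
    )


def record_url(record_id):
    """Build the stable URL used to fetch a record for display."""
    return f"{BASE_URL}/records/{quote(record_id, safe='')}"


if __name__ == "__main__":
    req = build_submit_request(
        b"%PDF-1.4 ...", {"author": "asmith", "experiment": "E-001"}
    )
    print(req.get_method(), req.full_url)
    print(record_url("E-001"))
    # urllib.request.urlopen(req) would send the request; it is left out here.
```

Because the record URL is just a string, it can be pasted into another system, or into a document, and keep working for as long as the vendor honours the format.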


Here be dragons

• OLE - sometimes it is unavoidable (e.g. UI stuff), but avoid it when you can
  • Tight coupling
  • Buggy
  • Proprietary
  • Reduces your platform options
  • File format issues are awful
  • Version-to-version compatibility is “interesting”
• Direct database access
  • Tight coupling
  • Difficult to guarantee system integrity
  • If you wrote both systems you might want to do this


Open Source

• Definitely one to watch
• Not the “Free” lunch you might think, but a pragmatic business tool
• Examples
  • Linux
  • Postgres
  • JBoss, Tomcat etc.
  • Ghostscript
• Open Source is part of everyone’s infrastructure
• Make sure you can run your systems on a variety of platforms


Why?

• Good for records
  • Gives you top-to-bottom control
• Good for TCO
  • We’re finding the Open Source infrastructure easier to set up and more reliable than proprietary alternatives
• Enables a better solution
  • Transparent systems mean you can do things the original designers didn't think of
  • This is especially important for ELNs


Other stuff to watch

• XML generally (what did we ever do without it)
• Jabber (as computer messaging and IM framework)
• Portals & Portlets
  • Especially JSR168, WSRP
  • Remember you may well want to portalize any useful application
• AJAX
  • Google is my hero
  • You can build usable, functional Web Applications
  • If you haven’t seen GMail I can send you an “invite”
• VMWare - virtualize your world
  • Wow
  • Great for server consolidation, great for testing, great for development
• Wikis
  • Beginning to turn into a lightweight application environment


Trends to watch

• File format nasties
• Closed/Private interfaces
  • Unlikely to be stable
• DMCA and other copyright legislation


Summary

• You’ll be assembling an “ELN System” from a series of components
  • Some you have, some you’ll build, some you’ll buy
• Get the open stuff before you sign the deal
  • Open, documented, stable interfaces
  • Open file formats

• Use open, loosely coupled approaches where possible

• If you can, keep the capability to own the integration issues in-house


Contact information

• Web site: http://www.amphora-research.com
• EMail: [email protected]
• Phone (US): (513) 697 4764
• Phone (UK): +44 (0)845 2300160 x2001
• AIM: [email protected]
• Skype: sjcoles
