What’s wrong with research papers - and (how) can we fix it?

Anita de Waard
Disruptive Technologies Director, Elsevier

http://elsatglabs.com/labs/anita




The Big Problem:

1) There are too many papers
2) We have too little time to read them

To address this problem, we make:
• databases
• text mining tools
• nanopublications
• data publications
• wiki publications
• ontologies; ontology integration tools
• workflow/data integration systems
• executable components
• ... and write emails/grants/papers/blogs about this ...
• ... and we end up with:

1) Even more papers!!
2) Even less time to read them!!

What problems are we solving?

• We’re mostly improving the format of the research article.
• This talk: aspects of the format that are being improved (and some examples of work to improve them):
A. Issues with the paper format
B. Issues pertaining to habits of writing
C. Issues inherent to language
D. Issues in trying to create connected content
• Do any of these address the Big Problem?
• What shall we do about it?

A. Issue: the paper format

A1: Paper is two-dimensional
A2: Paper is linear
A3: Paper is not interactive

A1: Issue: paper is two-dimensional

• Some experiments: allow representations of interactive figures (Wolfram Alpha), Utopia, Chem-3d
• Lack of experimentation with formats: the internet is multi-dimensional, so why do we still need page limits?


A2: Issue: paper is linear

• Read from front to back (research suggests readers skim quickly to the core parts, but linearity helps us do that)
• References are at the end, so your reading is not interrupted
• Headers are sequential - and not directly accessible

A2: (Old) Experiment: ABCDE

• LaTeX Stylesheet:
–Annotation
–Background
–Contribution
–Discussion
–Entities (references, projects, terms in ontologies, etc) in RDF
–Core sentences create structured abstract
• E.g. in proceedings: collect all core Contribution components
• I still have the stylesheets, if anyone’s interested :-)!

A3: Paper is not interactive

• Experiment: Executable papers:
–Run code within a paper
–Experiments: R, SPSS, Vistrails
–Rerender code within a paper, change algorithm/see effect; run different dataset
–How do you archive software? Satyanarayanan at CMU: Olive, ‘Internet ecosystem of curated virtual machine image collections’

http://www.vistrails.org/index.php/User:Tohline/CPM/Levels2and3%20

B. Issue: habits of writing

B1: Cite a paper - not a claim
B2: No precision in describing entities
B3: We write post-mortems (stories :-)!)

B1: Citations create facts:

- Voorhoeve, 2006: “These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct inhibition of the expression of the tumor suppressor LATS2.”

- Kloosterman and Plasterk, 2006: “In a genetic screen, miR-372 and miR-373 were found to allow proliferation of primary human cells that express oncogenic RAS and active p53, possibly by inhibiting the tumor suppressor LATS2 (Voorhoeve et al., 2006).”
http://www.sciencedirect.com/science/article/pii/S1534580706004023%22%20%5Cl%20%22bib101

- Yabuta et al., 2007: “[On the other hand,] two miRNAs, miRNA-372 and -373, function as potential novel oncogenes in testicular germ cell tumors by inhibition of LATS2 expression, which suggests that Lats2 is an important tumor suppressor (Voorhoeve et al., 2006).”

- Okada et al., 2011: “Two oncogenic miRNAs, miR-372 and miR-373, directly inhibit the expression of Lats2, thereby allowing tumorigenic growth in the presence of p53 (Voorhoeve et al., 2006).”
http://jcs.biologists.org/content/124/1/57.full%22%20%5Cl%20%22ref-40




B1: TAC2012: Add author’s text to citation

Voorhoeve, P. M.; le Sage, C. et al. (2006). A Genetic Screen Implicates miRNA-372 and miRNA-373 As Oncogenes in Testicular Germ Cell Tumors, Cell 124 (6) pp. 1169-1181

Citing goal: “To perform genetic screens for novel functions of miRNAs,”
− in order to identify miRNAs functionally associated with carcinogenesis
− to identify miRNAs that when overexpressed could substitute for p53 loss and allow continued proliferation in the context of Ras activation

Citing method: “We subsequently created a human miRNA expression library (miR-Lib) by cloning almost all annotated human miRNAs into our vector (Rfam release 6) (Figure S3).”
− Voorhoeve et al. (116) employed a novel strategy by combining an miRNA vector library and corresponding bar code array
− using a retroviral expression library of miRNAs,
− Using a novel retroviral miRNA expression library, Agami and co-workers performed a cell-based screen

Citing result: “we identified miR-372-373, each permitting proliferation and tumorigenesis of primary human cells that harbor both oncogenic RAS and active wildtype p53.”
− miR-372 and miR-373 were consequently found to permit proliferation and tumorigenesis of these primary cells carrying both oncogenic RAS and wild-type p53,
− Voorhoeve et al. (2006) identified miR-372 and miR-373
− miR-372 and miR-373 were found to allow proliferation of primary human cells that express oncogenic RAS and active p53,
− miR-372 has been recently described as potential oncogene that collaborate with oncogenic RAS in cellular transformation

http://genesdev.cshlp.org/content/25/6/534.full%22%20%5Cl%20%22ref-205

B2: Issue: entities in papers are not exact

• Midfrontal cortex tissue samples from neurologically unimpaired subjects (n = 9) and from subjects with AD (n = 11) were obtained from the Rapid Autopsy Program
• Immunoblot analysis and antibodies
• The following antibodies were used for immunoblotting: -actin mAb (1:10,000 dilution, Sigma-Aldrich); -tubulin mAb (1:10,000, Abcam); T46 mAb (specific to tau 404–441, 1:1000, Invitrogen); Tau-5 mAb (human tau 218–225, 1:1000, BD Biosciences) (Porzig et al., 2007); AT8 mAb (phospho-tau Ser199, Ser202, and Thr205, 1:500, Innogenetics); PHF-1 mAb (phospho-tau Ser396 and Ser404, 1:250, gift from P. Davies); 12E8 mAb (phospho-tau Ser262 and Ser356, 1:1000, gift from P. Seubert); NMDA receptors 2A, 2B and 2D goat pAbs (C terminus, 1:1000, Santa Cruz Biotechnology)…
• 95 antibodies were identified in 8 articles
• 52 did not contain enough information to determine the antibody used

Maryann Martone, Jan 2012: 2012 ACM SIGHIT International Health Informatics Symposium (IHI 2012)
http://sites.google.com/site/web2011ihi/participants/panels





B3: Issue: methods are written post-mortem

• Yolanda Gil at ISI modeled Bourne et al. paper in Wings
• Anecdotal evidence: Phil Bourne couldn’t remember most of this, even after digging through emails!

B3: So why not write the data first and wrap the paper around it??

1. Research: Each item in the system has metadata (including provenance) and relations to other data items added to it.

2. Workflow: All data items created in the lab are added to a (lab-owned) workflow system.

3. Authoring: A paper is written in an authoring tool which can pull data with provenance from the workflow tool in the appropriate representation into the document.
[Figure text: “Rats were subjected to two grueling tests (click on fig 2 to see underlying data). These results suggest that the neurological pain pro-”]

4. Editing and review: Once the co-authors agree, the paper is ‘exposed’ to the editors, who in turn expose it to reviewers. Reports are stored in the authoring/editing system, the paper gets updated, until it is validated.

5. Publishing and distribution: When a paper is published, a collection of validated information is exposed to the world. It remains connected to its related data item, and its heritage can be traced.

6. User applications: distributed applications run on this ‘exposed data’ universe.

[Figure labels: data items each tagged ‘metadata’; a Review / Edit / Revise cycle; ‘Some other publisher’]
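A minimal sketch of steps 1 to 3 and the traceable heritage of step 5, assuming nothing about any real workflow or authoring tool: the names `DataItem`, `Workflow`, `heritage` and `cite`, the field names, and the sample items are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataItem:
    """One item created in the lab (hypothetical field names)."""
    item_id: str
    content: str
    provenance: dict = field(default_factory=dict)    # e.g. who created it
    derived_from: list = field(default_factory=list)  # ids of related data items


class Workflow:
    """Hypothetical lab-owned store that keeps every item with its metadata."""

    def __init__(self):
        self.items = {}

    def add(self, item):
        self.items[item.item_id] = item

    def heritage(self, item_id):
        """Trace an item back through the items it was derived from."""
        chain, seen, pending = [], set(), [item_id]
        while pending:
            current_id = pending.pop()
            if current_id in seen:  # guard against circular relations
                continue
            seen.add(current_id)
            chain.append(current_id)
            pending.extend(self.items[current_id].derived_from)
        return chain

    def cite(self, item_id):
        """Pull an item into a paper with its provenance still attached."""
        item = self.items[item_id]
        return {"text": item.content, "source": item.item_id,
                "provenance": item.provenance}


lab = Workflow()
lab.add(DataItem("raw-1", "raw measurements", {"created_by": "lab member"}))
lab.add(DataItem("fig-2", "figure built from raw-1",
                 {"created_by": "lab member"}, derived_from=["raw-1"]))

print(lab.heritage("fig-2"))
print(lab.cite("fig-2"))
```

The point of the design is that the paper never holds a copy of the data without its identifier, so ‘click on fig 2 to see underlying data’ becomes a lookup rather than an archaeology project.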




C. Issue: language

C1: Language is coherent
C2: Language is narrative
C3: Language is abstract

C1: Language is coherent: Adding drug-drug interactions to DIKB

• Drug-Interaction Knowledge Base: Clinically-oriented, evidence-based knowledge base designed to support adding data to product inserts
• Contains quantitative and qualitative assertions about drug mechanisms and pharmacokinetic drug-drug interactions (DDI) for over 60 drugs
• HCLS Sig: Currently working on expanding the DIKB with more content and making a “mash-up” view of package inserts adding up-to-date information

View project: http://dbmi-icode-01.dbmi.pitt.edu/dikb-evidence/front-page.html
SPARQL endpoint: http://dbmi-icode-01.dbmi.pitt.edu:2020/directory/Drugs

C1: Coherent language is hard to parse

• Self-reference:
“R-CT and its metabolites, studied using the same procedures, had properties very similar to those of the corresponding S-enantiomers.”

• Reference to external data sources:
“Average relative in vivo abundances equivalent to the relative activity factors, were estimated using methods described in detail previously (Crespi, 1995; Venkatakrishnan et al., 1998 a,c, 1999, 2000, 2001; von Moltke et al., 1999 a,b; Störmer et al., 2000).”

• Ways of describing meant for human eyes:
“Based on established index reactions, S-CT and S-DCT were negligible inhibitors (IC50 > 100 µM) of CYP1A2, -2C9, -2C19, -2E1, and -3A, and weakly inhibited CYP2D6 (IC50 = 70–80 µM)”

• Many statements wrapped into one:
“S-CT was transformed to S-DCT by CYP2C19 (Km = 69 µM), CYP2D6 (Km = 29 µM), and CYP3A4 (Km = 588 µM).”
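To make ‘many statements wrapped into one’ concrete, here is a small sketch that unpacks the last sentence into one assertion per enzyme. The regular expressions and the function name `split_assertions` are assumptions written for this single sentence; they are not how DIKB extracts its assertions, and they would need real work to survive other phrasings.

```python
import re

sentence = ("S-CT was transformed to S-DCT by CYP2C19 (Km = 69 µM), "
            "CYP2D6 (Km = 29 µM), and CYP3A4 (Km = 588 µM).")

# Hypothetical pattern: an enzyme name followed by its Km value in µM.
ENZYME_KM = re.compile(r"(CYP\w+) \(Km = (\d+) µM\)")


def split_assertions(text):
    """Return one (substrate, product, enzyme, Km in µM) tuple per enzyme."""
    head = re.match(r"(\S+) was transformed to (\S+) by", text)
    if head is None:
        return []
    substrate, product = head.groups()
    return [(substrate, product, enzyme, int(km))
            for enzyme, km in ENZYME_KM.findall(text)]


assertions = split_assertions(sentence)
for assertion in assertions:
    print(assertion)
```

One sentence yields three separate, machine-checkable statements; the ‘-2C9, -2C19’ shorthand in the previous example would already defeat this pattern, which is exactly the problem.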

C2: Issue: Language is narrative

• ‘The truth can only be told in stories’
• Complex knowledge such as scientific theories, findings, conclusions have a narrative/rhetorical structure
• Typical pattern: claim/hypothesis, discussion of experimental findings, recap of claim, rebuttals, recap of claim
• Roughly the same claim appears 4 or 5 times in a paper

C2: Experiment: ‘Claimed Knowledge Updates’

C3: Issue: Language is abstract

“These results are consistent with those obtained by RPA and demonstrate that AhR ligands suppress IL-6 mRNA levels by approximately 40–60%.”

“Data presented in Figure 5A extend previous studies performed with monocytes by demonstrating that LPS induces NF-κB-DNA binding in bone marrow stromal cells.”

“An added incentive for these studies was provided by the observation that the IL-6 gene promoter contains an NF-κB binding site which plays a major role in regulating LPS-induced IL-6 transcription [55-57].”

• Purple = deictic/anaphoric markers, pointing to current text
• Blue = metalanguage/epistemic evaluation
• Green = experimental method
• Red = conceptual claim
• Orange = claim referred to in other work

C3: Formal Language: Biological Expression Language

In a screen for miRNAs that cooperate with oncogenes in cellular transformation, we identified miR-372 and miR-373, each permitting proliferation and tumorigenesis of primary human cells that harbor both oncogenic RAS and active wild-type p53.

Increased abundance of miR-372 increases cell proliferation
r(MIR:miR-372) -> bp(GO:"Cell Proliferation")
Increased abundance of miR-372 increases tumorigenesis
r(MIR:miR-372) -> bp(GO:Tumorgenesis)

We provide evidence that these miRNAs are potential novel oncogenes participating in the development of human testicular germ cell tumors by numbing the p53 pathway, thus allowing tumorigenic growth in the presence of wild-type p53.

Increased abundance of miR-372 decreases activity of TP53
r(MIR:miR-372) -| tscript(p(HUGO:Trp53))
Context: cancer
Activity of TP53 decreases cell growth
SET Disease = "Cancer"
tscript(p(HUGO:Trp53)) -| bp(GO:"Cell Growth")

C3: Experiment: add epistemic evaluation/knowledge attribution to BEL

For a Proposition P, an epistemically marked clause E is an Evaluation of P, E_{V,B,S}(P), with:

- V = Value: 3 = Assumed true, 2 = Probable, 1 = Possible, 0 = Unknown (-1 = possibly untrue, -2 = probably untrue, -3 = assumed untrue)
- B = Basis: Reasoning, Data
- S = Source:
A = speaker is author A, explicit
IA = speaker author, A, implicit
N = other author N, explicit
NN = other author NN, implicit
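One possible encoding of such an evaluation, as a sketch only: the class names, the `describe` method and the sample annotation are hypothetical, and the value, basis and source assigned to the example proposition are an illustrative guess rather than an annotation from the experiment.

```python
from dataclasses import dataclass
from enum import Enum


class Basis(Enum):
    REASONING = "Reasoning"
    DATA = "Data"


class Source(Enum):
    A = "speaker is author, explicit"
    IA = "speaker is author, implicit"
    N = "other author, explicit"
    NN = "other author, implicit"


# The seven-point Value scale listed above.
VALUE_LABELS = {3: "assumed true", 2: "probable", 1: "possible", 0: "unknown",
                -1: "possibly untrue", -2: "probably untrue", -3: "assumed untrue"}


@dataclass(frozen=True)
class Evaluation:
    """E_{V,B,S}(P): a proposition P with its value, basis and source."""
    proposition: str
    value: int
    basis: Basis
    source: Source

    def __post_init__(self):
        if self.value not in VALUE_LABELS:
            raise ValueError("value must be an integer from -3 to 3")

    def describe(self):
        return (f"{VALUE_LABELS[self.value]} "
                f"({self.basis.value}, {self.source.name}): {self.proposition}")


# Illustrative guess at how a hedged claim ('possibly through direct
# inhibition of ... LATS2') might be marked up.
example = Evaluation("miR-372 inhibits expression of LATS2",
                     value=1, basis=Basis.REASONING, source=Source.IA)
print(example.describe())
```

Keeping the evaluation separate from the proposition means a later citation that drops the ‘possibly’ shows up as a change in V while P stays the same.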

D. Collections of papers

D1: Can’t search papers easily
D2: Can’t connect papers well
D3: Can’t combine knowledge from different papers

D1: Searching collections of papers

• It is relatively easy to find a paper you are looking for: Google Scholar, Google, ..., Scopus ... (in that order?)
• But it is very hard to find out whether something was done about a certain topic (e.g. ‘citances’)
• And it’s impossible to know if nothing was done on a topic
• Why aren’t more people working on this?
• What happened to the semantic desktop??

D2: How do we connect papers?

• Papers exist within a con-text: preceding knowledge, succeeding knowledge, knowledge in your head or on your computer
• How can we annotate these relations, maintain connections, explore ones that others have made?

D2: Experiment: Annotation in SWAN using DOMEO

[Figure: annotation graph with named graphs G1, G5, G6 and properties rdf:type, dct:title, pav:contributedBy, swanrel:referencesAsSupportiveEvidence]

D3: Tracing the heritage of a statement

• On paper, you can’t see whether a claim or a recommendation is valid
• E.g. required to check for clinical recommendations:
–Is this statistically valid?
–Was it shown for my patient?
–Are there other things I need to know (side effects, funding, etc)

D3: Experiment: Linking Clinical Guidelines to Evidence

A. Philips’ Electronic Patient Records
B. Elsevier-published Clinical Guideline
C. Elsevier (or other publisher’s) Research Report or Data

Step 1: Patient data + diagnosis link to Guideline recommendation
Step 2: Guideline recommendation links to research report/data

D3: The reality of linking evidence:

Recommendation in Guideline: 5.1. Laboratory tests should include a CBC count with differential leukocyte count and platelet count;
Level: A-III
Evidence (in the text): No evidence in text
Ref: No reference

Recommendation in Guideline: 5.2. measurement of serum levels of creatinine and blood urea nitrogen;
Level: A-III
Evidence (in the text): CBC counts and determination of the levels of serum creatinine and urea nitrogen are needed to plan supportive care and to monitor for the possible occurrence of drug toxicity.
Ref: No reference

Recommendation in Guideline: 5.3. and measurement of electrolytes, hepatic transaminase enzymes, and total bilirubin (A-III).
Level: A-III
Evidence (in the text): No evidence in text
Ref: No reference

Recommendation in Guideline: Not mentioned: GET ENOUGH BLOOD, IN TWO SEPARATE BOTTLES
Evidence (in the text): The total volume of blood cultured is a crucial determinant of detecting a bloodstream infection [47]. (A “set” consists of 1 venipuncture or catheter access draw of 20 mL of blood divided into 1 aerobic and 1 anaerobic blood culture bottle.)
Ref: [47]
Recommendation in Reference: Our data, together with an analysis of previous studies, show that the yield of blood cultures in adults increases approximately 3% per millilitre of blood cultured.

Recommendation in Guideline: Not mentioned: REPEAT TESTS
Evidence (in the text): These tests should be done at least every 3 days during the course of intensive antibiotic therapy. At least weekly monitoring of serum transaminase levels is advisable for patients with complicated courses or suspected hepatocellular injury or
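A sketch of the check this slide performs by hand, using the three numbered recommendations from the table. The `Recommendation` class, its field names and the two helper functions are hypothetical; the actual experiment linked patient records, guidelines and research reports, which this does not attempt.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Recommendation:
    """One guideline recommendation and what backs it (hypothetical fields)."""
    rec_id: str
    level: str
    evidence_in_text: Optional[str]  # None when the guideline gives no evidence
    reference: Optional[str]         # None when no reference is cited


recommendations = [
    Recommendation("5.1", "A-III", None, None),
    Recommendation("5.2", "A-III",
                   "CBC counts and determination of the levels of serum "
                   "creatinine and urea nitrogen are needed to plan supportive "
                   "care and to monitor for the possible occurrence of drug "
                   "toxicity.",
                   None),
    Recommendation("5.3", "A-III", None, None),
]


def unsupported(recs):
    """Ids of recommendations with neither evidence in the text nor a reference."""
    return [r.rec_id for r in recs
            if r.evidence_in_text is None and r.reference is None]


def unreferenced(recs):
    """Ids of recommendations that cite no reference at all."""
    return [r.rec_id for r in recs if r.reference is None]


print(unsupported(recommendations))
print(unreferenced(recommendations))
```

None of the three numbered recommendations can be followed to a research report, so Step 2 of the linking has nothing to attach to.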

In summary:

Type | Problems | Experiments | Issues
A. Paper format:
A1 | Two-dimensional | Utopia, Wolfram CDF | Standards, tools
A2 | Linear | ABCDE | Adoption?
A3 | Not interactive | Executable papers | Adoption
B. Writing habits:
B1 | Reference to papers | TAC: Citance summaries | Need to start at author
B2 | Inexact entity references | NIF antibodies | Need mandate!
B3 | Methods post-mortem | Data-centric publishing | Change research recording!
C. Language:
C1 | Coherent | DIKB | Hard to parse!
C2 | Narrative | CKUs | Fractal nature of paper
C3 | Abstract | BEL | Formalize knowledge level
D. Collections of papers:
D1 | Can’t find | Scientific search engines? | Is anyone working on this?
D2 | Can’t compare | DOMEO/SWAN | Manual, doesn’t scale
D3 | Can’t combine | Evidence-based guidelines | Inconsistencies!

Have we solved the Big Problem?

1) Too many papers?
• Do not make publication numbers a factor in evaluation
• Do not make conference attendance contingent on publication
• Write fewer papers! Limit yourself to writing only what is significant and profound (and entertaining!)

2) Too little time to read?
• Collectively: change expectation of work in a day
• Make grant process less of a waste of time and talent
• Reduce burden of administration on (senior) scientists: reinstate departmental administrators!
• Teach administration as a class: Lethbridge journal incubator
• Make time to read some new (or old!) interesting work!

So how do we tackle all this?
• DERI-Elsevier collaboration - define research projects?
• Perhaps under aegis of Force11?
• Dagstuhl Workshop in August of 2011: 35 invited attendees from different parts of science, industry, funding agencies, data centers
• Goal: map main obstacles preventing new models of science publishing and develop ways to overcome them
• Just received funding from Sloan foundation to:
–Start online community
–Hold next workshop
–Collaboratively work on next steps
• Any thoughts?

Acknowledgements/collaborations:
1. Executable papers: Juliana Freire, NYU & Matthias Troyer, ETH Zurich (Vistrails); Micah Altman, Harvard SQSS (R), Gloriana St. Claire & Mahadev Satyanarayanan, CMU (Olive) (pending IMLS grant)
2. Citance summaries: Lucy Vanderwende, Microsoft Research; Hoa Trang, NIST; Eduard Hovy, ISI/USC
3. NIF antibodies: Maryann Martone, NIF/UCSD
4. Data-centric publishing: Phil Bourne, UCSD, Yolanda Gil, ISI/USC (funded in part by Elsevier Labs)
5. DIKB: Rich Boyce, U Pittsburgh, Jodi Schneider, DERI, Maria Liakata, EBI (looking for funding opportunities!)
6. CKUs: Agnes Sandor, Xerox Research Europe
7. BEL/knowledge attribution: Dexter Pratt, Selventa; Henk Pander Maat, University Utrecht (funded in part by NWO)
8. DOMEO/SWAN: Paolo Ciccarese & Tim Clark, Harvard/MGH (funded in part by Elsevier Labs)
9. Evidence-based guidelines: Paul Groth, Rinke Hoekstra, Frank van Harmelen, VU; Richard Vdovjak, Philips Research (funded by STW)
10. Force11: Phil Bourne, UCSD; Eduard Hovy, ISI/USC; Tim Clark, Harvard/MGH; Cameron Neylon, PLoS; Ivan Herman, W3C (funded in part by Sloan Foundation)

Anything here we can work on?

Type | Problems | Experiments | Issues
A. Paper format:
A1 | Two-dimensional | Utopia, Wolfram CDF | Standards, tools
A2 | Linear | ABCDE | Adoption?
A3 | Not interactive | Executable papers | Adoption
B. Writing habits:
B1 | Reference to papers | TAC: Citance summaries | Need to start at author
B2 | Inexact entity references | NIF antibodies | Need mandate!
B3 | Methods post-mortem | Data-centric publishing | Change research recording!
C. Language:
C1 | Coherent | DIKB | Hard to parse!
C2 | Narrative | CKUs | Fractal nature of paper
C3 | Abstract | BEL | Formalize knowledge level
D. Collections of papers:
D1 | Can’t find | Scientific search engines? | Is anyone working on this?
D2 | Can’t compare | DOMEO/SWAN | Manual, doesn’t scale
D3 | Can’t combine | Evidence-based guidelines | Inconsistencies!
Writing less and reading more: | | Force11, perhaps? | Social/political/personal!

What about writing completely differently?

Internet of things: (Bleecker, [1])
Interact with ‘objects that blog’ or ‘Blogjects’, that:
–track where they are and where they’ve been;
–have histories of their encounters and experiences
–have agency - an assertive voice on the social web [1]

Research Objects: (Bechhofer et al, [2])
Create semantically rich aggregations of resources, that can possess some scientific intent or support some research objective

Networked Knowledge: (Neylon, [3])
If we care about taking advantage of the web and internet for research then we must tackle the building of scholarly communication networks. These networks will have two critical characteristics: scale and a lack of friction. [3]

[1] Bleecker, J. ‘A Manifesto for Networked Objects - Cohabiting with Pigeons, Arphids and Aibos in the Internet of Things’, http://nearfuturelaboratory.com/2006/02/26/a-manifesto-for-networked-objects/
[2] Bechhofer, S., De Roure, D., Gamble, M., Goble, C. and Buchan, I. (2010) Research Objects: Towards Exchange and Reuse of Digital Knowledge. In: The Future of the Web for Collaborative Science (FWCS 2010), April 2010, Raleigh, NC, USA. http://precedings.nature.com/documents/4626/version/1
[3] Neylon, C. ‘Network Enabled Research: Maximise scale and connectivity, minimise friction’, http://cameronneylon.net/blog/network-enabled-research/










Networked science in action:

• Galaxy Zoo: citizen science: classify galaxies in the comfort of your own home – like Hanny!
• Tim Gowers, Polymath: “This is to normal research as driving is to pushing a car”
• Mathoverflow: virtual network of mathematicians working collectively to answer big/small, clear/fuzzy questions
• Jean-Claude Bradley: ‘short-form chemistry’: tweet/blog about an experiment, Storify into a narrative
• Read Cameron Neylon’s blog on networked science! http://cameronneylon.net/

Anything here we can work on?

Type | Problems | Experiments | Issues
A. Paper format:
A1 | Two-dimensional | Utopia, Wolfram CDF | Standards, tools
A2 | Linear | ABCDE | Adoption?
A3 | Not interactive | Executable papers | Adoption
B. Writing habits:
B1 | Reference to papers | TAC: Citance summaries | Need to start at author
B2 | Inexact entity references | NIF antibodies | Need mandate!
B3 | Methods post-mortem | Data-centric publishing | Change research recording!
C. Language:
C1 | Coherent | DIKB | Hard to parse!
C2 | Narrative | CKUs | Fractal nature of paper
C3 | Abstract | BEL | Formalize knowledge level
D. Collections of papers:
D1 | Can’t find | Scientific search engines? | Is anyone working on this?
D2 | Can’t compare | DOMEO/SWAN | Manual, doesn’t scale
D3 | Can’t combine | Evidence-based guidelines | Inconsistencies!
Networked science: | | Mathoverflow, Bradley | But is it science?
Writing less and reading more: | | Force11, perhaps? | Social/political/personal!
