
The Effect on Citation Inequality of Differences in Citation Practices across Scientific Fields

Juan A. Crespo1, Yunrong Li2, Javier Ruiz-Castillo2

2Universidad Carlos III de Madrid, Spain

October 10, 2013

Juan A. Crespo, Yunrong Li, Javier Ruiz-Castillo (UC3M) · Differences in Citation Impacts · October 10, 2013 · 1 / 19


Outline

1 Motivation

2 Model

3 Empirical with Raw Data

4 Normalization of Raw Data


Motivation

How to assess the scientific influence of a research paper?

Citation impact: the number of citations received by the paper within a certain period of time after its publication.

A scientific field: a collection of papers published in a set of closely related professional journals.

Empirical regularity: citation distributions are highly skewed (Albarrán, 2011); citation inequality is very large within each field as well as in the all-fields case.


Skewness of Citation Distribution


Motivation

The large citation inequality may arise because different papers have different scientific influence, or because papers belong to different fields.

The field dependence of citation impacts:

Size of the field: average number of papers per author in a given period of time.

Average number of references per paper.

The speed at which the citation process evolves.

Aim: to introduce a simple model to measure how much the field dependence of citation impacts contributes to citation inequality.


Statistics of Field Citation Distribution


Roemer's Income Inequality Model

Roemer's (1998) model:

The income of individuals depends on effort and circumstances (e.g. parents' wealth, education). Partition the population by "type":

$\text{Income}_{ti} = \phi(\text{type}_t, \text{effort}_{ti})$.

The effort distribution within a type is a characteristic of the type.

A1: Within a type, individuals at the same quantile of the effort distribution exert the same "degree" of effort.

A2: Within a type, income is monotonic in effort, so quantiles of the effort distribution correspond to quantiles of the income distribution.

Holding constant the degree of effort (equivalently, the income quantile), income inequality across types is due to circumstances: "inequality of opportunity".


Analogy of Our Model with Roemer's

Individuals ⟹ Articles

Income ⟹ Citation impact

An income distribution ⟹ A citation distribution

Circumstances/Types ⟹ Fields

Effort ⟹ Scientific influence


Assumptions

$\text{Citation}_{fi} = \phi(\text{field}_f, \text{scientific influence}_{fi})$, $f = 1, \dots, F$; $i = 1, \dots, N$.

A1: Within a field, articles at the same quantile of the scientific influence distribution reflect the same "degree" of scientific influence.

A2: $\text{Citation}_{fi}$ is monotonic in $\text{scientific influence}_{fi}$, so quantiles of the scientific influence distribution correspond to quantiles of the citation distribution.


Double Partition

Partition the all-fields citation distribution by fields f and quantiles π.

Field 1: 1st, 2nd, ..., Πth quantiles

...

Field F: 1st, 2nd, ..., Πth quantiles


Double Partition

"sort": citations are put in ascending order within each field.

"_pctile": create 1000 quantiles within each field.

"merge": merge all fields together.

The all-fields citation distribution is then a matrix of cells (f, π).
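The three steps above can be sketched in Python. This is a rough analogue, not the authors' code: the function name `double_partition`, the toy data, and the rank-based way of assigning articles to quantiles are all assumptions made for illustration, and tie handling may differ from the commands named on the slide.

```python
from collections import defaultdict

def double_partition(citations_by_field, n_quantiles=1000):
    """Return {(field, quantile): [citation counts]}, quantile in 1..n_quantiles.

    Hypothetical helper: articles are assigned to quantiles by rank, so each
    field is split into n_quantiles groups of near-equal size.
    """
    cells = defaultdict(list)
    for field, counts in citations_by_field.items():
        ordered = sorted(counts)              # ascending order within the field
        n = len(ordered)
        for rank, c in enumerate(ordered):    # rank is 0-based
            q = rank * n_quantiles // n + 1   # quantile index from the rank
            cells[(field, q)].append(c)
    return dict(cells)                        # all fields in one table of cells

# Toy data (invented): two fields, three quantiles instead of 1000.
toy = {"math": [0, 1, 1, 3, 9, 2], "biology": [4, 0, 12, 7, 30, 5]}
cells = double_partition(toy, n_quantiles=3)
print(cells[("math", 3)], cells[("biology", 1)])
```

With fewer articles than quantiles in a field, some cells simply stay empty in this sketch.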


Decomposable Inequality Index

Generalized Entropy (GE) family of inequality indices: the Theil index.

$I_1(C) = \frac{1}{N} \sum_l \frac{c_l}{\mu} \log \frac{c_l}{\mu}$, the citation inequality in the all-fields case, where $\mu$ is the mean citation of all articles.

For articles with 0 citations, we follow the convention $0 \cdot \log 0 = 0$.

$I_1(C)$ is decomposable: $I_1(C) = W + S + IDCP$.
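As a small illustration of the Theil index and the zero-citation convention, here is a minimal Python sketch. The function name `theil` and the sample vectors are assumptions for illustration; returning 0 for an all-zero vector is also a choice made here, not something the slides state.

```python
import math

def theil(values):
    """Theil index: (1/N) * sum over articles of (c/mu) * log(c/mu)."""
    n = len(values)
    mu = sum(values) / n if n else 0.0
    if mu == 0:
        return 0.0                     # assumed: no citations, no inequality
    total = 0.0
    for c in values:
        if c > 0:                      # convention: 0 * log 0 = 0
            r = c / mu
            total += r * math.log(r)
    return total / n

equal = theil([5, 5, 5, 5])            # every article has the same count
extreme = theil([0, 0, 0, 4])          # one article holds all citations
print(equal, extreme)
```

In the second call the index equals log 4, the maximum for four articles, which is what one expects when a single article receives every citation.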


Decomposable Inequality Index

$W = \sum_\pi \sum_f \nu^{\pi,f} I_1(c_f^\pi)$, the Within-Group term.

Weight: $\nu^{\pi,f}$ is the share of total citations received by articles in cell $(f, \pi)$. For large $\Pi$, $W$ is small.

$S = I_1(\mu^1, \dots, \mu^\pi, \dots, \mu^\Pi)$, the Between-Group term.

Each paper is given the mean citation of articles in its own quantile. Citation inequality is due to articles belonging to different quantiles, i.e. the skewness of the all-fields citation distribution.

$IDCP = \sum_\pi \nu^\pi I_1(\mu_1^\pi, \dots, \mu_f^\pi, \dots, \mu_F^\pi)$.

Each paper is given the mean citation of articles in its own cell. Within any quantile $\pi$, citation inequality is due to the differences in citation practices across fields. Weight: $\nu^\pi$ is the share of total citations received by articles in quantile $\pi$.
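The three terms can be checked numerically on a toy example. The Python sketch below is an illustration under stated assumptions, not the authors' implementation: the names `theil` and `decompose`, the `cells` layout (a dict keyed by field and quantile) and the data are invented, every cell is assumed non-empty, and total citations are assumed positive.

```python
import math
from collections import defaultdict

def theil(values):
    """Theil index with the convention 0 * log 0 = 0."""
    n = len(values)
    mu = sum(values) / n if n else 0.0
    if mu == 0:
        return 0.0
    return sum((c / mu) * math.log(c / mu) for c in values if c > 0) / n

def decompose(cells):
    """cells: {(field, quantile): [citation counts]} -> (W, S, IDCP)."""
    total = sum(sum(v) for v in cells.values())
    by_quantile = defaultdict(list)
    for (f, q), v in cells.items():
        by_quantile[q].append(v)

    # W: Theil index inside each cell, weighted by the cell's citation share.
    W = sum(sum(v) / total * theil(v) for v in cells.values())

    quantile_means = []   # each article replaced by its quantile mean (for S)
    IDCP = 0.0
    for cell_lists in by_quantile.values():
        q_values = [c for v in cell_lists for c in v]
        q_sum = sum(q_values)
        quantile_means += [q_sum / len(q_values)] * len(q_values)
        # Within the quantile, each article is replaced by its cell mean.
        cell_means = []
        for v in cell_lists:
            cell_means += [sum(v) / len(v)] * len(v)
        IDCP += q_sum / total * theil(cell_means)

    S = theil(quantile_means)
    return W, S, IDCP

# Toy data (invented): two fields, three quantiles.
cells = {("math", 1): [0, 1], ("math", 2): [1, 2], ("math", 3): [3, 9],
         ("biology", 1): [0, 4], ("biology", 2): [5, 7], ("biology", 3): [12, 30]}
W, S, IDCP = decompose(cells)
overall = theil([c for v in cells.values() for c in v])
print(W + S + IDCP, overall)
```

The two printed numbers should agree up to floating-point error, which is the decomposition $I_1(C) = W + S + IDCP$ applied to the toy data.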


Data

Only research papers are included. Each paper is assigned to exactly one field.

About 4.4 million articles published in 1998-2003.

A common five-year citation window for every publication year, about 35 million citations.

22 broad fields: 20 for natural sciences and 2 for social sciences, as distinguished by Thomson Scientific.


Results with Raw Data


Exchange Rates

Define the "exchange rate" $e_f(\pi)$ for articles in cell $(f, \pi)$:

$e_f(\pi) = \frac{\mu_f^\pi}{\mu^\pi}$: how many citations for an article at quantile $\pi$ of field $f$ are equivalent, on average, to one citation in the all-fields case (the reference situation).

If $e_f(\pi)$ varies dramatically across $\pi$ within a field, no common factor for all quantiles can be estimated.

Empirically, $e_f(\pi)$ remains sufficiently constant over the quantile interval [706th, 998th], which covers 60-70% of citations in each field.
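A short Python sketch of the exchange-rate calculation follows. It assumes that the all-fields mean at a quantile is the mean over all articles pooled across fields at that quantile; the function name `exchange_rates`, the `cells` layout and the toy numbers are hypothetical.

```python
from collections import defaultdict

def exchange_rates(cells):
    """cells: {(field, quantile): [counts]} -> {(field, quantile): rate}.

    rate = mean citation in the cell / mean citation of all articles in the
    same quantile, pooled across fields (an assumed reading of the slide).
    """
    q_sum = defaultdict(float)
    q_n = defaultdict(int)
    for (f, q), v in cells.items():
        q_sum[q] += sum(v)
        q_n[q] += len(v)
    rates = {}
    for (f, q), v in cells.items():
        mu_q = q_sum[q] / q_n[q]
        if mu_q > 0:                   # rate is undefined when the quantile mean is 0
            rates[(f, q)] = (sum(v) / len(v)) / mu_q
    return rates

# Toy data (invented): one quantile, two fields.
cells = {("math", 3): [3, 9], ("biology", 3): [12, 30]}
rates = exchange_rates(cells)
print(rates)
```

Here the cell means are 6 and 21 against a pooled mean of 13.5, so one all-fields citation at this quantile corresponds to roughly 0.44 citations in the toy "math" field and 1.56 in the toy "biology" field.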


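Normalization Factor

Define an average-based exchange rate (ER) over [706th, 998th]:

$e_f = \frac{1}{\pi_M - \pi_m} \sum_\pi e_f(\pi)$, where the sum runs over the quantiles in $[\pi_m, \pi_M]$.

Normalize raw citations: $c^*_{fi} = \frac{c_{fi}}{e_f}$.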
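The normalization step can be sketched as follows. The function names, the `rates` layout and the numbers are hypothetical, and the sketch takes a plain arithmetic mean of the exchange rates found in the interval, which is an assumption about how the averaging factor on the slide is meant to be applied.

```python
def average_exchange_rate(rates, field, q_low, q_high):
    """Mean of a field's exchange rates over quantiles q_low..q_high (inclusive)."""
    vals = [e for (f, q), e in rates.items()
            if f == field and q_low <= q <= q_high]
    return sum(vals) / len(vals)

def normalize(citations, e_f):
    """Divide each raw citation count by the field's average exchange rate."""
    return [c / e_f for c in citations]

# Toy exchange rates (invented): three quantiles for one field.
rates = {("math", 1): 0.9, ("math", 2): 0.25, ("math", 3): 0.75}
e_math = average_exchange_rate(rates, "math", 2, 3)   # quantile 1 is excluded
normalized = normalize([1, 2, 4], e_math)
print(e_math, normalized)
```

A field with an exchange rate below 1 has its raw counts scaled up, so that its articles become comparable with those of fields where citations are more plentiful.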


Results after Normalization


Thank you!
