dr. Steven Claeyssens | @sclaeyssens - Liber 2016 …

Text and Data Mining: Explaining the Relevance dr. Steven Claeyssens | @sclaeyssens


TRANSCRIPT

Page 1:

Text and Data Mining: Explaining the Relevance

dr. Steven Claeyssens | @sclaeyssens

Page 2:
Page 3:

Text and data

= result of

more than 200 years of collecting

over 30 years of digitisation

almost 10 years of collecting born-digital

= machine readable, mostly textual

= structured or semi-structured

= legally as open as possible

Page 4:
Page 5:

www.delpher.nl www.dbnl.org

Page 6:

http://ngramviewer.kbresearch.nl
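An n-gram viewer of this kind charts how often a word or phrase occurs per year. The counting step behind such a chart can be sketched as follows. This is a minimal illustration, not the viewer's actual implementation: the toy corpus, the function names and the whitespace tokenisation are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical toy corpus: (year, text) pairs standing in for digitised articles.
corpus = [
    (1900, "the bicycle is new"),
    (1900, "the train runs"),
    (1950, "the bicycle and the car"),
]

def ngrams(tokens, n):
    """Return all contiguous n-token sequences as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def relative_frequency_by_year(corpus, phrase):
    """For each year, the share of all n-grams in that year equal to `phrase`."""
    target = " ".join(phrase.lower().split())
    n = len(target.split())
    hits = Counter()
    totals = Counter()
    for year, text in corpus:
        grams = ngrams(text.lower().split(), n)
        totals[year] += len(grams)
        hits[year] += grams.count(target)
    # Skip years that contain no n-grams of this length to avoid division by zero.
    return {year: hits[year] / totals[year] for year in sorted(totals) if totals[year]}

freq = relative_frequency_by_year(corpus, "bicycle")
for year, share in freq.items():
    print(year, round(share, 3))
```

Relative rather than absolute counts are used because the number of digitised pages differs widely between years.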

Page 7:

https://pimhuijnen.com/2015/12/04/from-keyword-searching-to-concept-mining/

Page 8:

Text and Data Mining

= using text corpora in bulk as complex (‘biggish’, structured and semi-structured) data

= using computational techniques (IR, NLP, ML, NER, vector space models, …) to derive information

by computer scientists and (digital) humanities scholars

e.g. historians: track actors (networks), concepts (semantic fields) and ideas over space and time

=> identifying patterns and needles (longue durée and microhistory)

= new ways to help us understand culture, society, humanity
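One of the techniques named above, the vector space model, can be illustrated with a short sketch: each text becomes a vector of word counts, and the cosine of the angle between two vectors serves as a similarity score. The sample sentences and function names are hypothetical, and real projects would typically add proper tokenisation and term weighting.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Represent a text as a sparse vector of lower-cased word counts."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical sample texts, not taken from any real collection.
docs = {
    "a": "the queen opened the new railway",
    "b": "the new railway was opened by the queen",
    "c": "prices of grain fell sharply",
}
vectors = {name: bag_of_words(text) for name, text in docs.items()}

sim_ab = cosine_similarity(vectors["a"], vectors["b"])  # texts sharing most words
sim_ac = cosine_similarity(vectors["a"], vectors["c"])  # texts sharing no words
print(round(sim_ab, 3), round(sim_ac, 3))
```

Applied in bulk, scores like these let a researcher group texts that discuss the same actors or concepts without reading each one first.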

Page 9:

Any questions?

www.kb.nl/dataservices www.kb.nl/dh

[email protected] @sclaeyssens