voynich ontcijfering

the Voynich Manuscript Kevin Knight Information Sciences Institute University of Southern California Sources for this talk: Mary D’Imperio, The Voynich Manuscript, An Elegant Enigma (1978) Kennedy & Churchill, The Voynich Manuscript (2006) Prescott Currier, Some Important New Statistical Findings (1976) Rene Zandbergen, Currier A and B: Two Different Languages? (1997) Rene Zandbergen, http://www.voynich.nu/ http://www.voynich.ms/forum/ experiments at USC/ISI MIT / September 2009

Page 1: Voynich ontcijfering

the Voynich Manuscript

Kevin Knight

Information Sciences Institute

University of Southern California

Sources for this talk:

Mary D’Imperio, The Voynich Manuscript, An Elegant Enigma (1978)

Kennedy & Churchill, The Voynich Manuscript (2006)

Prescott Currier, Some Important New Statistical Findings (1976)

Rene Zandbergen, Currier A and B: Two Different Languages? (1997)

Rene Zandbergen, http://www.voynich.nu/

http://www.voynich.ms/forum/

experiments at USC/ISI

MIT / September 2009

Page 2: Voynich ontcijfering

Some People Involved with the Voynich Manuscript

Wilfrid Michael Voynich, book dealer

Ethel Boole, daughter of George Boole

Roger Bacon, “first scientist”

William Newbold, polymath, PhD UPenn

Rudolf II, Holy Roman Emperor

Athanasius Kircher, German Jesuit super-scholar

William Friedman, WWII cryptanalyst

Hans P. Kraus, book dealer

Page 3: Voynich ontcijfering

Outline

• Voynich Manuscript
– VMS, for short
– What is it?
– Where did it come from?
– What does it mean?

Page 4: Voynich ontcijfering

What is it?

• Medieval illustrated manuscript
• Approx. 235 pages on vellum material
• Color drawings of plants, nymphs, stars, etc.
• Approx. 38,000 words written in an unknown script
• Undeciphered!!! Meaning is unknown
• Currently owned by Yale University

Page 5: Voynich ontcijfering

38,000 words of text

Page 6: Voynich ontcijfering

Apparent Sections of VMS

Section “Name”          # of word tokens
Herbal                  11,938
Astrological             2,594
Biological               6,915
Cosmological               679
Pharmacological          5,111
Pure Text (“Stars”)     10,682
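As a quick arithmetic check, the per-section counts can be summed and compared with the "approx. 38,000 words" figure quoted earlier. This is only a sketch; the dictionary name `section_tokens` is illustrative and not part of the talk.

```python
# Word-token counts per apparent section, copied from the table above.
section_tokens = {
    "Herbal": 11938,
    "Astrological": 2594,
    "Biological": 6915,
    "Cosmological": 679,
    "Pharmacological": 5111,
    "Pure Text (Stars)": 10682,
}

total = sum(section_tokens.values())
print(total)  # 37919, i.e. roughly 38,000

# Share of the text in each section, largest first.
for name, count in sorted(section_tokens.items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} {count:6d}  {count / total:.1%}")
```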

Page 7: Voynich ontcijfering

The Pictures: Herbal

Many pictures look like grafting.

Sunflower? Would date VMS as post-1492.

Page 8: Voynich ontcijfering

The Pictures: Astrological

Page 9: Voynich ontcijfering

The Pictures: Astrological

What is this?

Datable clothing?

Page 10: Voynich ontcijfering

The Pictures: Biological

Small nudes in baths

Interconnecting tubes of liquids

Page 11: Voynich ontcijfering

The Pictures: Pharmacological

medicine jar?

Page 12: Voynich ontcijfering

History of Voynich Manuscript

1864  Ethel Boole born in England
1865  WV born in Lithuania
1885  WV imprisoned, Polish nationalist
1890  WV & EB meet, marry in 1902
1898  WV publishes first book list
1912  WV acquires VMS in “ancient castle”
1914  WV moves to USA, opens bookshop
1919  WV sends photostatic copies of VMS
1919  Copying reveals de Tepenecz signature
1919  WV writes to Bohemian State Archives
1921  WV presents VMS + Marci letter mentioning Bacon, $160k price
1921  Newbold & WV announce decipherment
1930  WV dies. VMS placed in vault, $100k
1931  VMS appraised at $19,400
1960  Ethel dies, VMS to secretary Ann Nill. “Castle” revealed as Villa Mondragone
1961  NY dealer Hans Kraus buys for $24,500
1969  Kraus donates VMS to Yale
1972  Brumbaugh finds WV letters in BSA
200x  Zandbergen finds 1639 Baresch letter in newly online Kircher archive

William Newbold, polymath, PhD UPenn

Wilfrid Michael Voynich, book dealer

Page 13: Voynich ontcijfering

One-Page Letter Tucked Into VMS

Reverend and Distinguished Sir; Father in Christ:

This book bequeathed to me by an intimate friend, I destined for you, my very dear Athanasius [Kircher], as soon as it came into my possession, for I was convinced that it could be read by no one except yourself. The former owner of this book once asked your opinion by letter … Accept now this token … Dr Raphael, tutor in the Bohemian language to Ferdinand III, then King of Bohemia, told me the said book had belonged to the Emperor Rudolf and that he presented the bearer who brought him the book 600 ducats. He believed the author was Roger Bacon, the Englishman. On this point I suspend judgment …

At the command of your reverence,

Joannes Marcus Marci of Cronland
Prague, 19 August, 1665(6?)

Kircher, super-scholar, recipient of this letter

???, owned VMS before Marci

Emperor Rudolf, paid 600 ducats for VMS

Roger Bacon (1214-94), “first scientist”

“I’m Not Francis Bacon”

Page 14: Voynich ontcijfering

History of Voynich Manuscript

1576-1612  Rudolf II purchases VMS
1608-1622  J. de Tepenecz signs VMS in Bohemian court
1630s      George Baresch owns VMS; GB sends letter to Kircher
1639       GB writes Kircher again
16xx       Marci inherits VMS from GB
1665       Marci sends VMS to Kircher with letter
1665-80    Kircher owns VMS
1680       Kircher dies

??

Page 18: Voynich ontcijfering

History of Voynich Manuscript

??

“Barschius” owns VMS between J. de Tepenecz and Marci

Page 20: Voynich ontcijfering

Newbold Decipherment

• Marci letter → Bacon → Cabala → “letter doubling” cipher
• Create 22² = 484 Latin letter pairs AA…XX
– these letter pairs are the cipher alphabet
• Assign each plaintext Latin letter to a set of cipher-alphabet letter pairs (B → AQ, RT, …)
• This gives the encipherer some freedom, while the recipient can still decipher by using the table
• Cleverly encipher plaintext in such a way as to construct a “cover” message that looks like Latin, to fool readers
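The pair-table idea described in these bullets can be sketched in a few lines. This is a minimal illustration, not Newbold's actual table: the 22-letter alphabet below, the random assignment of pairs to letters, and the function names are all assumptions made for the example.

```python
import itertools
import random

# Assumption: a 22-letter Latin alphabet ending in X. The slides only say "22".
ALPHABET = "ABCDEFGHIKLMNOPQRSTUVX"


def build_table(seed=0):
    """Deal all 22 * 22 = 484 letter pairs out to the 22 plaintext letters."""
    rng = random.Random(seed)
    pairs = ["".join(p) for p in itertools.product(ALPHABET, repeat=2)]
    rng.shuffle(pairs)
    table = {letter: [] for letter in ALPHABET}
    for i, pair in enumerate(pairs):
        table[ALPHABET[i % len(ALPHABET)]].append(pair)
    return table


def encipher(plaintext, table, rng):
    """The encipherer is free to pick any pair assigned to each letter."""
    return [rng.choice(table[ch]) for ch in plaintext]


def decipher(pairs, table):
    """The recipient inverts the table; every pair belongs to one letter."""
    reverse = {p: letter for letter, options in table.items() for p in options}
    return "".join(reverse[p] for p in pairs)


table = build_table()
cipher_pairs = encipher("ROGERBACON", table, random.Random(1))
recovered = decipher(cipher_pairs, table)
print(cipher_pairs, "->", recovered)
```

Because each plaintext letter owns 22 pairs here, the same message has many valid encipherments, which is the "freedom" the slide refers to.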

Page 21: Voynich ontcijfering

Newbold System

• Example: a n n … → DO MI NU … → DOMINU …
• Too hard to assemble good “cover” text!
• So, make cipher letter-pairs overlap: a n n … → AD DB BR … → ADBR …
• Also difficult, possibly too easy to decipher
• So, employ anagramming: a n n … → OM DO MI … → DO OM MI … → DOMI …
• Now can construct a plausible-looking “cover” text in Latin for our secret message (also in Latin)
– an ingenious system, to be sure!!
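The overlap step in the DOMI example can be made concrete with a short sketch. The function names are hypothetical; the code only shows how overlapping pairs collapse into cover text and how a reader would expand the cover text again before anagramming.

```python
import math


def overlapping_pairs(text):
    """Expand cover text into overlapping letter pairs: DOMI -> DO, OM, MI."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def collapse(pairs):
    """Inverse of overlapping_pairs; valid only if neighbours really overlap."""
    for left, right in zip(pairs, pairs[1:]):
        if left[1] != right[0]:
            raise ValueError(f"{left} and {right} do not overlap")
    return pairs[0][0] + "".join(p[1] for p in pairs)


pairs = overlapping_pairs("DOMI")
print(pairs)            # ['DO', 'OM', 'MI']
print(collapse(pairs))  # DOMI

# With anagramming allowed, a reader may reorder the pairs freely, so the
# number of candidate orderings grows factorially with the number of pairs.
print(math.factorial(len(pairs)))  # 6
```

The factorial growth is why anagramming gives the decipherer so much latitude: the pair order OM DO MI from the slide is just one of the six orderings of three pairs.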

Page 22: Voynich ontcijfering

Newbold Decipherment

Hmm, by the method, both plaintext and ciphertext should be in Latin letters…

But the VMS doesn’t have Latin letters…

Page 23: Voynich ontcijfering

William Newbold, polymath, PhD UPenn

… 4OPCC89 …  (apparent ciphertext)

“artist’s rendition”

Page 24: Voynich ontcijfering

William Newbold, polymath, PhD UPenn

… 4OPCC89 …  (apparent ciphertext)

DOMI…  (real ciphertext)

“artist’s rendition”

Page 26: Voynich ontcijfering

Let’s Decipher with Newbold!

apparent ciphertext: PCC89 …
→ real ciphertext: DOMI…
→ doubling: DO OM MI …
→ non-deterministic anagramming: OM DO MI …
→ lookup in 22² table: a n n …
→ non-deterministic mapping from 11 Latin letters to full 22: o n n …

Of course the 22² table isn’t given, so we have to build it up through cryptanalysis. Wow, this is a lot of work!

Page 27: Voynich ontcijfering

Newbold Decipherment

1300 real ciphertext “letters” in first 3 lines

Decipherment of those first lines: “I, Roger Bacon, have written this…” (in Latin)

Anagramming sets of 55 letters is sometimes required.

Slow but steady progress… Andromeda galaxy, ovaries & ova … so Bacon must have had a microscope & telescope, hundreds of years before they were invented!

Page 28: Voynich ontcijfering

The Text

• Approx. 38,000 words, unknown script
• Writing style similar to 15th century Florentine “humanist” hand
• Between 23 and 40 distinct characters
• No corrections, likely to have been copied
• Writing was done after illustrations

Page 29: Voynich ontcijfering

Transcription

BSC8AE OPCC9 4OE FCC89 4OFCC9 4OP9 SCBS9 4OBSC9 EFAM OPAE29

2ZC9 4OFC89 4OFAM Z89 4OFCC9 SC89 4OFCC9 4OFCC9 ESC89 EOP9

8ZC9 4OPCCC9 8ARSC89 4OFC9 4OP9

last paragraph, f103r

Page 30: Voynich ontcijfering

Another medieval manuscript, just for calibration…

Page 31: Voynich ontcijfering

Introduction to Astrology and Its Use in Weather Prediction, Medicine, and Agriculture, in English. Manuscript on Paper. 1490.

Page 32: Voynich ontcijfering

Alphabet: Currier/D’Imperio Transcription

C S Z
P F B V
Q X W Y
J A E R O I D
6 7 8 9 4 2
G H 1
T U 0
N M 3
K L 5

Page 33: Voynich ontcijfering

Alphabet: Currier/D’Imperio Transcription

J A E R O I D
G H 1 (maybe this is really IR IIR IIIR)
T U 0
C S Z
P F B V
Q X W Y
6 7 8 9 4 2

There are several transcription schemes to choose from.

Page 34: Voynich ontcijfering

Alphabet: Currier/D’Imperio Transcription

C S Z

Variations of Z, or separate characters?

S S S S S S

Page 35: Voynich ontcijfering

Alphabet: Currier/D’Imperio Transcription

C S Z
P F B V
Q X W Y

Are these ligatures? Is Q just a fancy way of writing SP?

If you didn’t know English, how would you know if [glyph] was the same as [glyph]?

Suppose [glyph] never occurred. Would that be evidence?
Suppose [glyph] did occur, with the same contexts as [glyph] (e.g., *shing)?
Suppose [glyph] did occur, but never in the same context as [glyph]?

Another common motif:

fi f i

f if if i

fifi

SOORSOE9S9

Page 36: Voynich ontcijfering

Letter Frequencies

count  letter    count  letter    count  letter
25468  O          2886  2           148  U
20227  C          1752  N            96  6
17655  9          1413  B            74  Y
14281  A          1046  J            52  K
12973  8           950  Q            31  G
11008  S           908  X            17  L
10471  E           591  T            14  H
10026  F           524  *             2  1
 6716  R           431  V             1  5
 5994  P           316  I             1  0
 5423  4           217  W
 4501  Z           157  D
 4076  M           156  3

Total: 63k character tokens
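Counts like these are straightforward to reproduce from any transcription file. The sketch below runs on just the first line of the f103r sample shown earlier in the talk, so its numbers are far smaller than the table's; the helper name `letter_counts` is illustrative.

```python
from collections import Counter

# First line of the sample transcription (last paragraph, f103r).
sample = "BSC8AE OPCC9 4OE FCC89 4OFCC9 4OP9 SCBS9 4OBSC9 EFAM OPAE29"


def letter_counts(text):
    """Count character tokens, ignoring the spaces between words."""
    return Counter(ch for ch in text if not ch.isspace())


counts = letter_counts(sample)
for letter, n in counts.most_common(5):
    print(n, letter)
```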

Page 37: Voynich ontcijfering

Most Frequent Words

count  word       count  word     count  word
863    8AM        212    OFAM     140    OPCC9
537    OE         211    8AN      138    OFAE
501    SC89       191    4OFAE    130    ZO
469    AM         186    ZOE      129    OFAR
426    ZC89       177    OFCC9    119    ESC89
396    SOE        174    SCC9     118    OFC89
363    OR         172    SCOE     etc
350    AR         155    S9
344    SC9        155    OPC89
318    8AR        154    OPAM
308    4OFCC9     152    4OFAR
305    4OFCC89    151    9
283    ZC9        151    4OE
279    4OFAN      150    S89
272    4OFC89     147    4OF9
270    89         144    ZCC9
262    4OFAM      144    OFAN
260    AE         144    2AM
253    8AE        143    OPAE
243    2          141    OPAR
219    SOR        140    SX9

Totals: 8116 word types, 38k word tokens

Page 38: Voynich ontcijfering

Word Length Distributions

Length   Voynich   English
1        0.02      0.03
2        0.10      0.15
3        0.22      0.16
4        0.23      0.15
5        0.21      0.11
6        0.12      0.09
7        0.05      0.11
8        0.01      0.08
9        0.003     0.05
10       0.001     0.03
11       0.0001    0.01
12       0.00007   0.006
13       0.00002   0.002
35       0.00002

Counts on word types
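A length distribution over word types can be computed as below. This is a sketch on a toy word list drawn from the most frequent words; the function name and the `by_type` switch are assumptions for the example, not the code used for the table.

```python
from collections import Counter


def length_distribution(words, by_type=True):
    """Return {length: fraction}. by_type=True counts each distinct word once."""
    items = set(words) if by_type else list(words)
    lengths = Counter(len(w) for w in items)
    total = sum(lengths.values())
    return {n: lengths[n] / total for n in sorted(lengths)}


sample_words = "8AM OE SC89 AM ZC89 SOE OR AR SC9 8AR 8AM OE".split()
dist = length_distribution(sample_words)
for length, fraction in dist.items():
    print(length, round(fraction, 2))
```

Counting by type rather than by token keeps very frequent short words from dominating the distribution.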

Page 39: Voynich ontcijfering

Features of the Text

• 115 (out of 8116) word types appear doubled at least once
… 4OFCC89 4OFCC89 …
• 8 words appear tripled
… 4OFC89 4OFC89 4OFC89 …
… SOE SOE SOE …
… ZCOE ZCOE ZCOE …
… OFAM OFAM OFAM …
… OE OE OE …
… 9PAM 9PAM 9PAM …
… 8AM 8AM 8AM …
… 4OFCC89 4OFCC89 4OFCC89 …

However, very few repeated word bigrams and word trigrams!

No word trigram appears more than 5 times.
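Doubled and tripled words are runs of the same token repeated back to back, which `itertools.groupby` finds directly. A minimal sketch on a made-up token stream (the stream and the function name are illustrative only):

```python
from itertools import groupby


def repeated_runs(tokens, min_run=2):
    """Return (word, run_length) for each word repeated back to back."""
    runs = ((word, len(list(group))) for word, group in groupby(tokens))
    return [(word, n) for word, n in runs if n >= min_run]


tokens = "SOE SOE SOE 8AM OE OE 4OFCC89".split()
print(repeated_runs(tokens))             # [('SOE', 3), ('OE', 2)]
print(repeated_runs(tokens, min_run=3))  # [('SOE', 3)]
```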

Page 40: Voynich ontcijfering

Some Theories About the Text

• Cryptogram
• Phonetic writing system
• Philosophical language
• Outsider art
• Glossolalia
• Hoax

Page 41: Voynich ontcijfering

Cryptogram

• Newbold (1921)
• Manly (1931) critique of Newbold
• Feely (1945), abbreviated Latin
• Strong (1945), polyalphabetic cipher, no details
– might fall into hands of enemies of USA!
• Brumbaugh (1972), numerological box
• Several attempts in the 1990s

Page 42: Voynich ontcijfering

William Friedman

• Most famous American cryptographer of World War II
– broke key ciphers, including Japanese “Purple” code, led proto-NSA
• VMS Study Group (1944-46)
– developed transcription alphabet
– group disbanded after the war
• 2nd VMS Study Group (1962)
– at RCA
• Included his VMS theory in paper on another topic
– paper shortened due to space constraints
– VMS theory included in a footnote, as an anagram, to establish “invention date”

Theory: VMS written in a synthetic “philosophical” language

Page 43: Voynich ontcijfering

“Writing in Tongues”

• Glossolalia (speaking in tongues)
– Christian New Testament, Pentecost
– People spoke tongues foreign to themselves
• Writing in Tongues?
– Medium Helene Smith, investigated by Theodore Flournoy (1896)
– Under a trance, Smith was able to converse with Martians
– She learned their language and could speak and write it
– Looked like a genuine language
– Grammar closer to French than you might expect

suggested in Kennedy & Churchill, 2005

Smith’s Martian

Page 44: Voynich ontcijfering

Hoax

• Previous hoaxes:
– Hitler diaries
– Vinland map
• Voynich Manuscript:
– How?
– Why?
– Who?

Page 45: Voynich ontcijfering

How?

• Gordon Rugg (Scientific American, 2004)
– Proposed Cardan grille
– Elizabethan espionage tool
– If applied with randomness injected, claimed to generate VMS-like text

Page 46: Voynich ontcijfering

Why?

KPMG Forensic’s 2006 Survey of Fraud in Australia and New Zealand

Most Popular Motives for Fraud:
– greed/lifestyle (54%)
– gambling (22%)
– personal financial pressure (5%)
– other (5%)
– not specified (3.5%)
– opportunity (0.4%)
– substance abuse (0.4%)

Page 47: Voynich ontcijfering

Who?

• member of Society of Friends of Russian Freedom
• said to have faked passports
• needed $ (who doesn’t?)
• tricky: said to have traded newer, “better” books for monks’ old dirty ones
• spoke 18 languages
• Marci letter very convenient: faked to add a Roger Bacon connection?
– BUT: Baresch letter later found in Kircher archive also mentions Bacon
– BUT: what if Voynich had seen that letter?
• de Tepenecz signature suspiciously found during overexposure
– BUT: same signature in other docs
– BUT: what if Voynich knew that?

suggested in Kennedy & Churchill, 2005

Page 48: Voynich ontcijfering

Experiments

• Can computers help us make sense of VMS?
• Is VMS a kind of letter substitution cipher?
– Originally in Latin?
– English?
– Ukrainian?
– Ukrainian written without vowels?
• Are there patterns of any sort?

Page 49: Voynich ontcijfering

Substitution Cipher

ingcmpnqsnwf cv fpn owoktvcv

hu ihgzsnwfv rqcffnw cw owgcnwf

kowazoanv ...

Page 50: Voynich ontcijfering

Substitution Cipher

e e e e
ingcmpnqsnwf cv fpn owoktvcv

e e e
hu ihgzsnwfv rqcffnw cw owgcnwf

e
kowazoanv ...

Page 51: Voynich ontcijfering

Substitution Cipher

e e e the
ingcmpnqsnwf cv fpn owoktvcv

e e e
hu ihgzsnwfv rqcffnw cw owgcnwf

e
kowazoanv ...

Page 52: Voynich ontcijfering

Substitution Cipher

e he e the
ingcmpnqsnwf cv fpn owoktvcv

e e e t
hu ihgzsnwfv rqcffnw cw owgcnwf

e
kowazoanv ...

Page 53: Voynich ontcijfering

Substitution Cipher

e he e of the
ingcmpnqsnwf cv fpn owoktvcv

e e e t
hu ihgzsnwfv rqcffnw cw owgcnwf

e
kowazoanv ...

Page 54: Voynich ontcijfering

Substitution Cipher

e he e of the fof
ingcmpnqsnwf cv fpn owoktvcv

e f o e o oe t
hu ihgzsnwfv rqcffnw cw owgcnwf

ef
kowazoanv ...

Page 55: Voynich ontcijfering

Substitution Cipher

e he e of the
ingcmpnqsnwf cv fpn owoktvcv

e e e t
hu ihgzsnwfv rqcffnw cw owgcnwf

e
kowazoanv ...

Page 56: Voynich ontcijfering

Substitution Cipher

e he e is the sis
ingcmpnqsnwf cv fpn owoktvcv

e s i e i ie t
hu ihgzsnwfv rqcffnw cw owgcnwf

es
kowazoanv ...

Page 57: Voynich ontcijfering

Substitution Cipher

e he e is the sis
ingcmpnqsnwf cv fpn owoktvcv

e s i e i ie t
hu ihgzsnwfv rqcffnw cw owgcnwf

es
kowazoanv ...

Cryptodict

abacdefb ACADEMIC
abacdefb DEDICATE
abacdefb MEMBRANE
abacdefc ELECTRIC
abacdefc TUTELAGE
abacdefd ANARCHIC
abacdefd EVERYDAY
abacdefe ANALYSES
abacdefe ANALYSIS
abacdeff EYEGLASS

Page 58: Voynich ontcijfering

Substitution Cipher

decipherment is the analysis
ingcmpnqsnwf cv fpn owoktvcv

of documents written in ancient
hu ihgzsnwfv rqcffnw cw owgcnwf

languages ...
kowazoanv ...

Cryptodict

abacdefb ACADEMIC
abacdefb DEDICATE
abacdefb MEMBRANE
abacdefc ELECTRIC
abacdefc TUTELAGE
abacdefd ANARCHIC
abacdefd EVERYDAY
abacdefe ANALYSES
abacdefe ANALYSIS
abacdeff EYEGLASS
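The Cryptodict keys are letter-repetition patterns: the first distinct letter becomes `a`, the second `b`, and so on. A short sketch shows how a cipher word such as `owoktvcv` is matched against candidate plaintext words; the tiny `lexicon` list is an assumption standing in for a full dictionary.

```python
def word_pattern(word):
    """Map each distinct letter to a, b, c, ... in order of first appearance."""
    mapping = {}
    pattern = []
    for ch in word.lower():
        if ch not in mapping:
            mapping[ch] = chr(ord("a") + len(mapping))
        pattern.append(mapping[ch])
    return "".join(pattern)


# A handful of the dictionary entries listed above.
lexicon = ["ACADEMIC", "ELECTRIC", "ANARCHIC", "ANALYSES", "ANALYSIS", "EYEGLASS"]

cipher_word = "owoktvcv"
candidates = [w for w in lexicon if word_pattern(w) == word_pattern(cipher_word)]
print(word_pattern(cipher_word))  # abacdefe
print(candidates)                 # ['ANALYSES', 'ANALYSIS']
```

The pattern narrows the cipher word to two candidates, and the guess cv = "is" then favours ANALYSIS because it ends in "sis".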

Page 59: Voynich ontcijfering

Generative Models

Spanish letter trigram model
Train on Spanish web text. Parameters fixed.

q u o _ v a d e _ b r e r t e _ …

a → {all Voynich letters}
b → {all Voynich letters}
c → {all Voynich letters}
…
z → {all Voynich letters}
_ → _

Probabilistic model that substitutes VMS letters for Latin letters. Initially uniform.

V A S 9 2 _ 9 F A E _ A R _ A P A M _ …

EM Algorithm:

\[ \arg\max_{\theta} P(\mathrm{VMS}) = \arg\max_{\theta} \sum_{\mathrm{latin}} P(\mathrm{latin}) \, P(\mathrm{VMS} \mid \mathrm{latin}) \]

EM method demonstrated on many decipherment tasks in [Knight et al 2006].

Easy experiments in Carmel finite-state package:
% carmel --train-cascade corpus latin.wfsa subst.wfst

Returns trained devices & Viterbi decipherment.

Page 60: Voynich ontcijfering

Substitution Cipher

Input: cevzren cnegr qry vatravbfb uvqnytb qba dhvwbgr qr yn znapun …
Best decipherment assuming plaintext is Spanish: primera parte del ingenioso hidalgo don quijote de la mancha …

Input: VAS92 9FAE AR APAM ZOE ZOR9 QOR92 9 FOR ZOE89 …
Best decipherment assuming plaintext is Spanish: decos acho es imen des dena denal y des denta …
If plaintext is assumed to be Latin: quiss squm is onum pomquss hates s qum hatis …
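The Spanish test input is consistent with a simple ROT13 shift of the plaintext, which can be checked with Python's standard `rot_13` codec. The slides do not say how the test cipher was produced, so treat ROT13 as an observation about this sample rather than a statement about the experimental setup.

```python
import codecs

plaintext = "primera parte del ingenioso hidalgo don quijote de la mancha"

# rot_13 is a text-to-text codec in the standard library.
ciphertext = codecs.encode(plaintext, "rot_13")
print(ciphertext)
# cevzren cnegr qry vatravbfb uvqnytb qba dhvwbgr qr yn znapun

# ROT13 is its own inverse, so applying it again recovers the plaintext.
print(codecs.decode(ciphertext, "rot_13") == plaintext)  # True
```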

Page 61: Voynich ontcijfering

Hypothesize Other Source Languages

• Pre-collect language models for 80 languages
• Decipher against each
• See which decoding run yields highest probability

Page 62: Voynich ontcijfering

United Nations Declaration of Human Rights

No one shall be arbitrarily deprived of his property
Niemand se eiendom sal arbitrêr afgeneem word nie
Asnjeri nuk duhet të privohet arbitrarisht nga pasuria e tij
ال يجوز تجريد أحد من ملكه تعسفا
Janiw khitisa utaps oraqeps inaki aparkaspati
Arrazoirik gabe ez zaio inori bere jabegoa kenduko
Den ebet ne vo tennet e berc'hentiezh digantañ diouzh c'hoant
Hикой не трябва да бъде произволно лишен от своята собственост
Ningú no serà privat arbitràriament de la seva propietat
任 何 人 的 财 产 不 得 任 意 剥 夺。
Di a so prupiità ùn ni pò essa privu nimu di modu tirannicu
Nitko ne smije samovoljno biti lišen svoje imovine
Nikdo nesmí být svévolně zbaven svého majetku
Ingen må vilkårligt berøves sin ejendom
Niemand mag willekeurig van zijn eigendom worden beroofd
Nul ne peut être arbitrairement privé de sa propriété
Nimmen mei samar fan syn eigendom berôve wurde
Ninguín será privado arbitrariamente da súa propiedade
Niemand darf willkürlich seines Eigentums beraubt werden
Κανείς δεν μπορεί να στερηθεί αυθαίρετα την ιδιοκτησία του
Avavégui ndojepe'a va'erâi oimeháicha reinte imbáe teéva
Ba wanda za a kwace wa dukiyarsa ba tare da cikakken dalili ba
Senkit sem lehet tulajdonától önkényesen megfosztani
Engan má eftir geðþótta svipta eign sinni
Tak seorang pun boleh dirampas hartanya dengan semena-mena
Necuno essera private arbitrarimente de su proprietate
Ní féidir a mhaoin a bhaint go forlámhach de dhuine ar bith
Al neniu estu arbitre forprenita lia proprieto
Kelleltki ei tohi tema vara meelevaldselt ära võtta
Eingin skal hissini vera fyri ongartøku
Me kua ni dua e kovei vua na nona iyau
Keltään älköön mielivaltaisesti riistettäkö hänen omaisuuttaan

300+ words in many of world’s languages, UTF-8 encoding

Page 63: Voynich ontcijfering

Unknown Source Language

Input: cevzren cnegr qry vatravbfb uvqnytb qba dhvwbgr qr yn znapun …
Best guess of plaintext language: Spanish
Best decipherment: primera parte del ingenioso hidalgo don quijote de la mancha …

Input: VAS92 9FAE AR APAM ZOE ZOR9 QOR92 9 FOR ZOE89 …
Best guess of plaintext language: Romanian
Best decipherment: nonsense

Page 64: Voynich ontcijfering

Consonantal Writing

Input: ceze ceg qy ataf uqyt qa dwg q y zapu…
Best guess of plaintext language: Spanish
Best decipherment: prmr prt dl ngns hdlg dn qjt d l mnch…

Input: VAS92 9FAE AR APAM ZOE ZOR9 QOR92 9 FOR ZOE89 …
Best decipherment: more nonsense
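The consonantal test case can be reproduced by deleting vowels from the Spanish plaintext and then applying the same letter shift as before. This is a sketch: the vowel set `aeiou` and the use of ROT13 are assumptions chosen because they match the sample shown, not documented details of the experiment.

```python
import codecs


def strip_vowels(text, vowels="aeiou"):
    """Drop vowels from every word; words made only of vowels disappear."""
    stripped = ("".join(c for c in word if c not in vowels) for word in text.split())
    return " ".join(w for w in stripped if w)


plaintext = "primera parte del ingenioso hidalgo don quijote de la mancha"
consonants = strip_vowels(plaintext)
print(consonants)                           # prmr prt dl ngns hdlg dn qjt d l mnch
print(codecs.encode(consonants, "rot_13"))  # ceze ceg qy ataf uqyt qa dwg q y zapu
```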

Page 65: Voynich ontcijfering

Generative Models

• Okay, that didn’t work…

• Let’s devise looser generative models, to mine for patterns.

Page 66: Voynich ontcijfering

Generative Models

Trigram model over {a, b, _}

a a _ b a b _ a b a a _ …

a → {all Voynich letters}
b → {all Voynich letters}
_ → _

Initially uniform

V A S 9 2 _ 9 F A E _ A R _ A P A M _ …

What parameter settings result in highest P(corpus)? EM algorithm.

Page 67: Voynich ontcijfering

Generative Models

Trigram model over {a, b, _}

a a _ b a b _ a b a a _ …

a → {all English letters}
b → {all English letters}
_ → _

Initially uniform

i n _ t h e _ t o w n _ w h e r e _ i _ was …

What parameter settings result in highest P(corpus)? EM algorithm.

Page 68: Voynich ontcijfering

Generative Models

Trigram model over {a, b, _}

a a _ b a b _ a b a a _ …

a
b
_ → _

Initially uniform

i n _ t h e _ t o w n _ w h e r e _ i _ was …

What parameter settings result in highest P(corpus)? EM algorithm.

Sample tagging with learned model:

a b _ b b a _ b a b b _
i n _ t h e _ t o w n _

b b a b a _ a _ …
w h e r e _ i _ …

Page 69: Voynich ontcijfering

Generative Models

Trigram model over {a, b, _}

a a _ b a b _ a b a a _ …

a → {all Voynich letters}
b → {all Voynich letters}
_ → _

Initially uniform

V A S 9 2 _ 9 F A E _ A R _ A P A M _ …

What parameter settings result in highest P(corpus)? EM algorithm.

Sample tagging with learned model: ??

? ? ? ? ? _ ? ? ? ? _ ? ? _
V A S 9 2 _ 9 F A E _ A R _

? ? ? ? _ ? ? ? _ ? ? ? ? _ …
A P A M _ Z O E _ Z O R 9 _ …

Page 70: Voynich ontcijfering

Generative Models

Trigram model over {a, b, _}

a a _ b a b _ a b a a _ …

a
b
_ → _

Initially uniform

V A S 9 2 _ 9 F A E _ A R _ A P A M _ …

What parameter settings result in highest P(corpus)? EM algorithm.

Sample tagging with learned model:

b b b b a _ a b b a _ b a _
V A S 9 2 _ 9 F A E _ A R _

b b b a _ b b a _ b b b a _ …
A P A M _ Z O E _ Z O R 9 _ …
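The two-tag experiment can be approximated with a small hidden Markov model trained by EM (Baum-Welch). The sketch below is a simplification under stated assumptions: it uses a first-order (bigram) tag model rather than the trigram model on the slide, drops the word-boundary symbol, and starts from slightly perturbed rather than exactly uniform parameters. It is not the Carmel setup used in the talk, and the function name `train_hmm` is illustrative.

```python
import math
import random


def train_hmm(obs, n_states=2, n_iter=20, seed=0):
    """Baum-Welch (EM) for a discrete first-order HMM over the symbols in obs."""
    rng = random.Random(seed)
    symbols = sorted(set(obs))
    idx = [symbols.index(o) for o in obs]
    K, V, T = n_states, len(symbols), len(idx)

    def near_uniform(n):
        # Exactly uniform parameters are a fixed point of EM, so perturb slightly.
        w = [1.0 + 0.1 * rng.random() for _ in range(n)]
        return [x / sum(w) for x in w]

    start = near_uniform(K)
    trans = [near_uniform(K) for _ in range(K)]
    emit = [near_uniform(V) for _ in range(K)]
    history = []  # log P(corpus) at the start of each iteration

    for _ in range(n_iter):
        # Forward pass, rescaled at every step to avoid underflow.
        alpha, scale = [], []
        for t in range(T):
            if t == 0:
                row = [start[k] * emit[k][idx[0]] for k in range(K)]
            else:
                row = [sum(alpha[t - 1][j] * trans[j][k] for j in range(K))
                       * emit[k][idx[t]] for k in range(K)]
            c = sum(row)
            alpha.append([x / c for x in row])
            scale.append(c)
        history.append(sum(math.log(c) for c in scale))

        # Backward pass with the same scale factors.
        beta = [[1.0] * K for _ in range(T)]
        for t in range(T - 2, -1, -1):
            beta[t] = [sum(trans[k][j] * emit[j][idx[t + 1]] * beta[t + 1][j]
                           for j in range(K)) / scale[t + 1] for k in range(K)]

        # E-step: expected state occupancies and transitions.
        gamma = [[alpha[t][k] * beta[t][k] for k in range(K)] for t in range(T)]
        xi = [[0.0] * K for _ in range(K)]
        for t in range(T - 1):
            for j in range(K):
                for k in range(K):
                    xi[j][k] += (alpha[t][j] * trans[j][k] * emit[k][idx[t + 1]]
                                 * beta[t + 1][k] / scale[t + 1])

        # M-step: renormalise the expected counts.
        start = gamma[0][:]
        trans = [[xi[j][k] / sum(xi[j]) for k in range(K)] for j in range(K)]
        counts = [[0.0] * V for _ in range(K)]
        for t in range(T):
            for k in range(K):
                counts[k][idx[t]] += gamma[t][k]
        emit = [[c / sum(row) for c in row] for row in counts]

    return symbols, start, trans, emit, history


text = "in the town where i was".replace(" ", "")
symbols, start, trans, emit, history = train_hmm(text)
print(f"log-likelihood: {history[0]:.2f} -> {history[-1]:.2f}")
for k in range(len(emit)):
    ranked = sorted(zip(emit[k], symbols), reverse=True)[:3]
    print("state", k, "prefers", [s for _, s in ranked])
```

On a sample this short the learned classes are not meaningful; the point is only the mechanics of raising P(corpus) with EM. A realistic run would use the full transcription and compare several random starts.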

Page 71: Voynich ontcijfering

Generative Models

[Figure: two bar charts, one for English and one for Voynich, showing P(a) for each letter on a scale from 0 to 1. Labels: P(letter | tag), P(tag | letter).]

English letters, in x-axis order: B D J K M N P Q V W X L R C F G T H S Y U E O A I

Voynich letters, in x-axis order: 0 1 4 S W Y X Q C A F P B I O 8 V * 2 H E G T R K U 6 J D 9 3 N M 5 L

Page 72: Voynich ontcijfering

Generative Models

Bigram model over {a, b}

a a b a b a b a a …

a → {all Voynich words!}
b → {all Voynich words!}

VAS92 9FAE AR APAM ZOE ZOR9 QRC2 9 ...

What parameter settings result in highest P(corpus)? EM algorithm.

Page 73: Voynich ontcijfering

Generative Models

Bigram model over {a, b}

a a b a b a b a a …

a
b

VAS92 9FAE AR APAM ZOE ZOR9 QRC2 9 ...

Do words with similar contexts have similar spellings?!

That would be very interesting.

Page 74: Voynich ontcijfering

Generative Models

Bigram model over {a, b}

a a b a b a b a a …

a
b

VAS92 9FAE AR APAM ZOE ZOR9 QRC2 9 ...

Sample tagging with learned model:

a a a a a a
VAS92 9FAE AR APAM ZOE ZOR9

a a a a a …
QRC2 9 FOR ZOE89 2OR9 …

WAIT, WHAT?

Do words with similar contexts have similar spellings?!

That would be very interesting.

Page 75: Voynich ontcijfering

Generative Models

[Figure: two bar charts over pages 1 to 221 (y-axis ticks 0 to 600): Voynich words tagged as “a”, and Voynich words tagged as “b”.]

Page 77: Voynich ontcijfering

Generative Models

[Figure: two bar charts over pages 1 to 221 (y-axis ticks 0 to 600): Voynich words tagged as “a”, and Voynich words tagged as “b”. Section labels along the page axis: Herbal, Astro, Bio, Pharma, Stars.]

Page 78: Voynich ontcijfering

Generative Models

[Figure: two bar charts over pages 1 to 221 (y-axis ticks 0 to 600): Voynich words tagged as “a”, and Voynich words tagged as “b”. Section labels along the page axis: Herbal, Astro, Bio, Pharma, Stars.]

Known since Capt. Currier’s analysis (1976): Two “languages” (in the formal sense). Several handwriting styles, supposedly similar breakdown.

Page 79: Voynich ontcijfering

Captain Currier’s “Two Languages”

Help! I’m tired.

Pages w/ Herbal drawings

Page 80: Voynich ontcijfering

Zandbergen Dot Plot

For every pair of pages, how similar are they to each other?

Rene Zandbergen (1997)

[Figure: dot plot with pages on one axis and the same pages on the other. Section labels: Herbal, Astro, Bio, Pharma, Stars.]
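A page-by-page similarity matrix of this kind can be sketched with word-frequency vectors. The slides do not state which similarity measure Zandbergen used, so the cosine similarity below is an assumption, and the three toy "pages" are invented for illustration.

```python
import math
from collections import Counter


def cosine_similarity(words_a, words_b):
    """Cosine of the angle between two word-count vectors (0 = no shared words)."""
    ca, cb = Counter(words_a), Counter(words_b)
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0


# Toy pages; a real run would use one token list per manuscript page.
pages = {
    "p1": "8AM OE SC89 8AM".split(),
    "p2": "8AM OE ZC89".split(),
    "p3": "4OFCC9 4OFAM 4OFCC9".split(),
}

matrix = {(a, b): cosine_similarity(pages[a], pages[b]) for a in pages for b in pages}
for a in pages:
    print(a, " ".join(f"{matrix[(a, b)]:.2f}" for b in pages))
```

Plotting each cell of `matrix` as a dot whose darkness tracks the similarity gives the kind of picture described here, with blocks appearing where runs of pages share vocabulary.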

Page 81: Voynich ontcijfering

Focus Further Experiments on Voynich-B (Bio & Stars)

• Consistent vocabulary
• Still plenty of words
• Let’s try models that divide words into classes
• 10 classes

Page 82: Voynich ontcijfering

10 Classes of words: English

etc etc etc

etc etc etc

Page 83: Voynich ontcijfering

10-class tagging of Voynich-B

[Figure: the ten word classes a, b, c, d, e, f, g, h, i, j.]

Page 84: Voynich ontcijfering

Class-Tag Sequences

• Tagging of first VMS page:
– f g d h f g i d b j c c b e e a h f g e e a b e e a h f g d b j j c c b e a h f g j j j c c c h f g b j j c c b j j c b j c c b e a h f g b j c b j c c b j c b i d i d c b j c c c c c c c c c b e a i d b j c c b j c c b j c c b j c c c c c h f g d b j j j j c c h f g b j j c b e a b i d i d h f g d i d i d i d h f g d b j j j c b j c c c c b j c c c b e a h f h f h f g b j c b e e e a h f g b j e a i d i d b j c b j c b j c h f g b j j c c c c c c b j j c b j c b e a h f g d i d i d b j c b j j j j c b j j c c c b j c b j c b j c c c c b j c b j c c c c c c i d b j c c c c b j c c c b j c c c c c c b j c h f g e a h f g i d i d b j j c b j c b j c b j c b e a b j c c c c c b j c c c c c c c c c i d b j c c c c b j c b j c c b i d i d i d b j j c b j c c c i d i d i d h f g b j c c c c c c c c c c c c c c c c b e a h f g h f g e a i
• 14-grams found in 10-class tagging:
– 25  c c c c c c c c c c c c c c
– 9   i d i d i d i d b e a h f g
– 7   i d i d i d i d i d i d i d
– 7   i d i d h f g e e a h f g e
– 7   e a h f g e a h f g e a i d
– 6   j c c c c c c c c c c c c c
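Frequent tag n-grams can be found by sliding a window over the tag sequence and counting. The sketch below uses only the opening of the tag sequence and trigrams instead of 14-grams to keep the example small; the function name is illustrative.

```python
from collections import Counter


def ngram_counts(tags, n):
    """Count every contiguous run of n tags."""
    return Counter(tuple(tags[i:i + n]) for i in range(len(tags) - n + 1))


# Opening of the first-page tag sequence.
tags = ("f g d h f g i d b j c c b e e a h f g e e a b e e a "
        "h f g d b j j c c b e a h f g").split()

trigrams = ngram_counts(tags, 3)
for gram, count in trigrams.most_common(3):
    print(count, " ".join(gram))
```

Calling `ngram_counts(tags, 14)` on the full tagging would reproduce a list like the one above.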

Page 85: Voynich ontcijfering

10 Classes of words: Voynich-B

Tags per page.

[Figure: ten small bar charts, one per class a through j, over pages 1 to 46; y-axis ticks 0 to 100 (0 to 200 for class c).]

Page 86: Voynich ontcijfering

10 Classes of words: Voynich-B

Tags per page.

“Bio” words vs. “Stars” words

[Figure: ten small bar charts, one per class a through j, over pages 1 to 46; y-axis ticks 0 to 100 (0 to 200 for class c).]

Page 87: Voynich ontcijfering

Conclusion

• Voynich Manuscript
– What it is: pretty clear
– Where it came from: less clear
– What it means: totally unclear
• Lots of room for empirical, unsupervised computer techniques
– Character analysis (e.g., ligatures)
– Determining relations between words and pictures
– Identification of “topics”
– More cipher types

Page 88: Voynich ontcijfering

thank you