

Graphs for similar words in the Voynich Manuscript



The existence of similarly spelled word types is a remarkable feature of the Voynich Manuscript (VMS). Another word type from the word pool can usually be generated by replacing one glyph with another, or by adding or deleting a glyph. In fact, for 6948 out of 8026 word types (86.5 %) at least one similar type exists (using the transcription by Takeshi Takahashi). Surprisingly, for frequently used word types all conceivable spelling variants exist. Moreover, the types can be ordered by their similarities to build a graph of similar words for the whole VMS. The main network of similar word types connects 6837 out of 8026 types (85 %) with each other.
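The similarity rule described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used to produce the numbers above: it assumes that every character of the transcription counts as one glyph, and the names `is_similar` and `share_with_neighbour` as well as the small word list are made up for the example.

```python
def is_similar(a: str, b: str) -> bool:
    """True if a and b differ by exactly one replaced, added or deleted character."""
    if a == b:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (replacement).
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    # Lengths differ by one: deleting one character from the longer word
    # must yield the shorter word (addition or deletion).
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))


def share_with_neighbour(types):
    """Count the word types that have at least one similar type in the list."""
    types = sorted(set(types))
    linked = [w for w in types if any(is_similar(w, v) for v in types)]
    return len(linked), len(types)


# Toy sample (assumption): a handful of word types, not the full word pool.
sample = ["daiin", "aiin", "ain", "air", "chol", "chor",
          "chedy", "shedy", "okeokeokeody"]
linked, total = share_with_neighbour(sample)
print(f"{linked} of {total} types have at least one similar type")
```

The pairwise comparison is quadratic in the number of types, which is still manageable for a pool of about 8000 types.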

For the following graphs, two different words are treated as similar if they differ in only one glyph (edit distance 1). The size of a node is determined by the number of times the word occurs in the VMS. An interesting feature visible in the graphs is that they contain pairs of frequently used words linked together, such as "daiin" and "aiin", "chol" and "chor", or "chedy" and "shedy". Notably, the graphs for single pages also frequently contain such pairs, for example "daiin"/"aiin" or "chol"/"chor".
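A graph of this kind could be built roughly as follows. The sketch is an assumption about one possible implementation, not the tool used for the graphs shown here: it uses a plain Levenshtein distance on transcription characters, stores the node size as the token count, and finds the connected components with a simple depth-first search. All function names and the token list are hypothetical.

```python
from collections import Counter, defaultdict


def edit_distance(a, b):
    """Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,               # delete x
                           cur[j - 1] + 1,            # insert y
                           prev[j - 1] + (x != y)))   # replace x by y
        prev = cur
    return prev[-1]


def build_graph(tokens):
    """Return node sizes (occurrence counts) and edges between similar types."""
    sizes = Counter(tokens)
    types = sorted(sizes)
    edges = defaultdict(set)
    for i, a in enumerate(types):
        for b in types[i + 1:]:
            if edit_distance(a, b) == 1:
                edges[a].add(b)
                edges[b].add(a)
    return sizes, edges


def components(types, edges):
    """Connected components of the graph, largest first."""
    seen, result = set(), []
    for start in types:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            comp.add(node)
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        result.append(comp)
    return sorted(result, key=len, reverse=True)


# Toy token list (assumption), standing in for the words of one page.
tokens = ("daiin daiin daiin aiin aiin chol chol chor "
          "chedy shedy shedy ain").split()
sizes, edges = build_graph(tokens)
main = components(sorted(sizes), edges)[0]
print("main network:", sorted(main))
print("node size of daiin:", sizes["daiin"])
```

The largest component corresponds to what the text calls the main network; the remaining components are smaller groups or isolated types.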

[Graph: all word types, complete manuscript]


Typical for the graph are word types ending with "iin", "ol" and "dy", combined with the prefixes "d", "ch" and "qo". The following table combines these typical prefixes and suffixes and thereby describes the main nodes within the network graph for the complete manuscript:

prefix   aiin      ol      dy
none     aiin      ol      dy
d-       daiin     dol     ddy
ch-      chaiin    chol    chedy
qo-      qokaiin   qokol   qokedy


For frequently used word types, multiple similarly spelled word types exist. Sometimes the only difference between two word types is an additional quill stroke. This is the case, for instance, for "aiin" and "ain". In other cases, similarly shaped glyphs replace each other. One example of such a case is "ain" and "air". With these rules, it is possible to build a "grid" for the word types in the VMS. For a grid containing all word types that occur at least four times, covering 60 % of the VMS text, see vms_network.txt.
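The selection of word types for such a grid comes down to a frequency threshold and a coverage figure. The following sketch shows one way to compute both; it is an assumption for illustration, and the function name `coverage` and the token list are invented rather than taken from vms_network.txt.

```python
from collections import Counter


def coverage(tokens, min_count):
    """Keep types occurring at least min_count times; return them and their text share."""
    counts = Counter(tokens)
    kept = {w: n for w, n in counts.items() if n >= min_count}
    # Share of all tokens that belong to one of the kept types.
    return kept, sum(kept.values()) / len(tokens)


# Toy token list (assumption), not the real transcription.
tokens = (["daiin"] * 5 + ["chol"] * 4 + ["aiin"] * 4
          + ["ain"] * 2 + ["okeeolkcheey"])
kept, share = coverage(tokens, min_count=4)
print(sorted(kept), f"cover {share:.0%} of the text")
```

Applied to a full transcription with `min_count=4`, the second return value would be the share of the text covered by the grid.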

In the whole manuscript only 229 word types (2.85 %) differ by more than two glyphs from all other word types occurring in the VMS. Two typical words of this kind are "okeokeokeody" and "okeeolkcheey". All 229 types occur only once, and each can be split into two or more words that also occur in the VMS. For instance, "okeokeokeody" can be split into the three words "okeo keo keody", and "okeeolkcheey" into the two words "okeeol kcheey".
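Whether an isolated word can be split into known words can be checked with a small dynamic program. The sketch below is a hypothetical illustration: the function name `split_into_known` is made up, and the vocabulary contains only the parts named in the examples above, whereas a real check would use all word types of the transcription.

```python
def split_into_known(word, vocabulary):
    """Return one split of word into two or more vocabulary words, or None."""
    # best[i] is a list of parts that exactly covers word[:i], or None.
    best = [None] * (len(word) + 1)
    best[0] = []
    for end in range(1, len(word) + 1):
        for start in range(end):
            part = word[start:end]
            # The whole word itself does not count as a split.
            if best[start] is not None and part in vocabulary and part != word:
                best[end] = best[start] + [part]
                break
    parts = best[len(word)]
    return parts if parts and len(parts) >= 2 else None


# Toy vocabulary (assumption): only the parts mentioned in the text.
vocab = {"okeo", "keo", "keody", "okeeol", "kcheey"}
print(split_into_known("okeokeokeody", vocab))
print(split_into_known("okeeolkcheey", vocab))
```

The function returns only the first split it finds; a word with several possible splits would need a variant that collects all of them.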