Authors: Doron Witztum, Eliyahu Rips and Yoav Rosenberg
Source: Statistical Science 1994, Vol. 9, No. 3, 429-438 (abridged)
Abstract. It has been noted that when the Book of Genesis is written as two-dimensional arrays, equidistant letter sequences spelling words with related meanings often appear in close proximity. Quantitative tools for measuring this phenomenon are developed. Randomization analysis shows that the effect is significant at the level of 0.00002.
Key words and phrases: Genesis, equidistant letter sequences, cylindrical representations, statistical analysis.
1. Introduction
The phenomenon discussed in this paper was first discovered several decades ago by Rabbi Weissmandel. (WEISSMANDEL, H. M. D. (1958) Torath Hemed. Yeshivath Mt. Kisco, Mt. Kisco.) He found some interesting patterns in the Hebrew Pentateuch (the Five Books of Moses), consisting of words or phrases expressed in the form of equidistant letter sequences (ELS's)--that is, by selecting sequences of equally spaced letters in the text.
As impressive as these seemed, there was no rigorous way of determining if these occurrences were not merely due to the enormous quantity of combinations of words and expressions that can be constructed by searching out arithmetic progressions in the text. The purpose of the research reported here is to study the phenomenon systematically. The goal is to clarify whether the phenomenon in question is a real one, that is, whether it can or cannot be explained purely on the basis of fortuitous combinations.
The approach we have taken in this research can be illustrated by the following example. Suppose we have a text written in a foreign language that we do not understand. We are asked whether the text is meaningful (in that foreign language) or meaningless. Of course, it is very difficult to decide between these possibilities, since we do not understand the language. Suppose now that we are equipped with a very partial dictionary, which enables us to recognize a small portion of the words in the text: "hammer" here and "chair" there, and maybe even "umbrella" elsewhere. Can we now decide between the two possibilities?
Not yet. But suppose now that, aided with the partial dictionary, we can recognize in the text a pair of conceptually related words, like "hammer" and "anvil." We check if there is a tendency of their appearances in the text to be in "close proximity." If the text is meaningless, we do not expect to see such a tendency, since there is no reason for it to occur. Next, we widen our check; we may identify some other pairs of conceptually related words: like "chair" and "table," or "rain" and "umbrella." Thus we have a sample of such pairs, and we check the tendency of each pair to appear in close proximity in the text. If the text is meaningless, there is no reason to expect such a tendency. However, a strong tendency of such pairs to appear in close proximity indicates that the text might be meaningful.
Note that even in an absolutely meaningful text we do not expect that, deterministically, every such pair will show such a tendency. Note also that we have not yet decoded the foreign language of the text: we do not recognize its syntax and we cannot read the text.
This is our approach in the research described in the paper. To test whether the ELS's in a given text may contain "hidden information," we write the text in the form of two-dimensional arrays, and define the distance between ELS's according to the ordinary two-dimensional Euclidean metric. Then we check whether ELS's representing conceptually related words tend to appear in "close proximity."
Suppose we are given a text, such as Genesis (G). Define an equidistant letter sequence (ELS) as a sequence of letters in the text whose positions, not counting spaces, form an arithmetic progression; that is, the letters are found at the positions
n, n + d, n + 2d, ..., n + (k - 1)d.
We call d the skip, n the start and k the length of the ELS. These three parameters uniquely identify the ELS, which is denoted (n, d, k).
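The definition above can be sketched in a few lines of Python. This is only an illustration, not the authors' software: the function names els_letters and find_els, the max_skip bound, and the choice of 0-based positions are assumptions made for the example (the text does not say whether positions start at 0 or 1).

```python
def els_letters(text, n, d, k):
    """Return the letters at positions n, n+d, ..., n+(k-1)d, or None if any
    position falls outside the text. Positions are 0-based (an assumption)
    and spaces are not counted, as in the definition."""
    letters = [c for c in text if not c.isspace()]
    positions = [n + i * d for i in range(k)]
    if any(p < 0 or p >= len(letters) for p in positions):
        return None
    return "".join(letters[p] for p in positions)


def find_els(text, word, max_skip):
    """List every ELS (n, d, k) with 1 <= |d| <= max_skip that spells word.
    max_skip is a hypothetical search bound, not a parameter from the paper."""
    letters = [c for c in text if not c.isspace()]
    k, size = len(word), len(letters)
    hits = []
    skips = list(range(1, max_skip + 1)) + list(range(-max_skip, 0))
    for d in skips:
        for n in range(size):
            last = n + (k - 1) * d
            if 0 <= last < size and all(
                letters[n + i * d] == word[i] for i in range(k)
            ):
                hits.append((n, d, k))
    return hits


# Toy text: without spaces it reads "taxbarcot".
print(els_letters("tax bar cot", 0, 3, 3))   # tbc
print(find_els("tax bar cot", "tbc", 4))     # [(0, 3, 3)]
print(find_els("tax bar cot", "cbt", 4))     # [(6, -3, 3)]
```

Negative skips are included so that words read backwards through the text are also found, which is consistent with the later reference to |d|.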
Let us write the text as a two-dimensional array--that is, on a single large page--with rows of equal length, except perhaps for the last row. Usually, then, an ELS appears as a set of points on a straight line. The exceptional cases are those where the ELS "crosses" one of the vertical edges of the array and reappears on the opposite edge. To include these cases in our framework, we may think of the two vertical edges of the array as pasted together, with the end of the first line pasted to the beginning of the second, the end of the second to the beginning of the third and so on. We thus get a cylinder on which the text spirals down in one long line.
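As a rough illustration of this layout, the sketch below maps letter positions to (row, column) coordinates for an assumed row length h. The names cylinder_coords, els_points and flat_distance are hypothetical, positions are taken as 0-based and non-negative, and flat_distance is only the ordinary Euclidean distance on the flat page; it ignores the pasted edges and is not the full proximity measure developed in the paper.

```python
import math


def cylinder_coords(p, h):
    """Map 0-based, non-negative letter position p to (row, column) when
    the text is written in rows of length h."""
    return divmod(p, h)


def els_points(n, d, k, h):
    """Coordinates of the k letters of the ELS (n, d, k) in rows of length h."""
    return [cylinder_coords(n + i * d, h) for i in range(k)]


def flat_distance(p, q, h):
    """Euclidean distance between positions p and q on the flat page
    (a simplification: wrapping around the cylinder is ignored)."""
    return math.dist(cylinder_coords(p, h), cylinder_coords(q, h))


# With h equal to the skip, the ELS is a vertical line.
print(els_points(3, 10, 4, 10))   # [(0, 3), (1, 3), (2, 3), (3, 3)]

# This ELS crosses the right edge: on the flat page the third point jumps
# to (3, 0), but with the edges pasted together (1, 9) -> (3, 0) is the
# same one-row, one-column step as (0, 8) -> (1, 9).
print(els_points(8, 11, 3, 10))   # [(0, 8), (1, 9), (3, 0)]

print(flat_distance(0, 34, 10))   # 5.0
```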
It has been noted that when Genesis is written in this way, ELS's spelling out words with related meanings often appear in close proximity. In Figure 1 we see the example of 'patish' (hammer) and 'sadan' (anvil); in Figure 2, 'Zidkiyahu' (Zedekia) and 'Matanya' (Matanya), which was the original name of King Zedekia (II Kings 24:17). In Figure 3 we see yet another example: the Chanuka and 'chashmonaee' (Hasmonean), recalling that the Hasmoneans were the priestly family that led the revolt against the Syrians whose successful conclusion the Chanuka feast celebrates.
Indeed, ELS's for short words, like those for 'patish' (hammer) and 'sadan' (anvil), may be expected on general probability grounds to appear close to each other quite often, in any text. In Genesis, though, the phenomenon persists when one confines attention to the more "noteworthy" ELS's, that is, those in which the skip |d| is minimal over the whole text or over large parts of it. Thus for 'patish' (hammer), there is no ELS with a smaller skip than that of Figure 1 in all of Genesis; for 'sadan' (anvil), there is none in a section of text comprising 71% of G; the other four words are minimal over the whole text of G. On the face of it, it is not clear whether or not this can be attributed to chance. Here we develop a method for testing the significance of the phenomenon according to accepted statistical principles. After making certain choices of words to compare and ways to measure proximity, we perform a randomization test and obtain a very small p-value, that is, we find the results highly statistically significant.
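The notion of an ELS whose skip |d| is minimal over the whole text can be illustrated as follows. This is a minimal sketch under stated assumptions, not the procedure used in the paper: the function name min_skip_els and the max_skip bound are hypothetical, positions are 0-based, and minimality over only a part of the text (as in the 71% case above) is not handled.

```python
def min_skip_els(text, word, max_skip):
    """Return all ELS's (n, d, k) spelling word whose |d| is the smallest
    found in the whole text, trying |d| = 1, 2, ..., max_skip in turn.
    Returns an empty list if no ELS is found within the bound."""
    letters = [c for c in text if not c.isspace()]
    k, size = len(word), len(letters)
    for skip in range(1, max_skip + 1):       # smallest |d| first
        hits = []
        for d in (skip, -skip):               # forward and backward reading
            for n in range(size):
                last = n + (k - 1) * d
                if 0 <= last < size and all(
                    letters[n + i * d] == word[i] for i in range(k)
                ):
                    hits.append((n, d, k))
        if hits:
            return hits                       # no smaller |d| exists
    return []


# In "taxbarcot" the only ELS for "tbc" has skip 3.
print(min_skip_els("tax bar cot", "tbc", 5))      # [(0, 3, 3)]

# Appending "tbc" creates a skip-1 occurrence, which then becomes minimal.
print(min_skip_els("tax bar cot tbc", "tbc", 5))  # [(9, 1, 3)]
```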