A.2. The Corrected Distance

In the previous section we defined a measure W(w,w') of proximity between two words w and w', an inverse measure of the distance between them. We are, however, interested less in the absolute distance between two words than in whether this distance is larger or smaller than "expected." In this section, we define a "relative distance" c(w,w'), which is small when w is "unusually close" to w', and is 1, or almost 1, when w is "unusually far" from w'.

The idea is to use perturbations of the arithmetic progressions that define the notion of an ELS. Specifically, start by fixing a triple (x,y,z) of integers in the range {-2,-1,0,1,2}; there are 125 such triples. Next, rather than looking for ordinary ELS's (n,d,k), look for "(x,y,z)-perturbed ELS's" (n,d,k)(x,y,z), obtained by taking the positions

n, n + d, ..., n + (k-4)d, n + (k-3)d + x, n + (k-2)d + x + y, n + (k-1)d + x + y + z

instead of the positions n, n + d, n + 2d, ..., n + (k-1)d. Note that in a word of length k, k-2 intervals could be perturbed. However, we preferred to perturb only the three last ones, for technical programming reasons.
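The construction above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the program used for the computations: the names perturbed_positions and ALL_TRIPLES are hypothetical, and the sketch assumes k >= 4 so that the three perturbed positions are preceded by at least one unperturbed one.

```python
from itertools import product

# The 125 triples (x, y, z) with each entry in {-2, -1, 0, 1, 2}.
ALL_TRIPLES = list(product(range(-2, 3), repeat=3))


def perturbed_positions(n, d, k, triple=(0, 0, 0)):
    """Positions of the (x,y,z)-perturbed ELS (n,d,k).

    Only the last three positions are shifted, by x, x + y and
    x + y + z respectively; the first k - 3 positions are those of
    the ordinary ELS. Assumption of this sketch: k >= 4.
    """
    x, y, z = triple
    if k < 4:
        raise ValueError("this sketch assumes k >= 4")
    shifts = [0] * (k - 3) + [x, x + y, x + y + z]
    return [n + i * d + s for i, s in enumerate(shifts)]


# The triple (0, 0, 0) reproduces the ordinary ELS n, n + d, ..., n + (k-1)d.
ordinary = perturbed_positions(10, 3, 5)
# With (x, y, z) = (1, -2, 2) the last three intervals become d + 1, d - 2, d + 2.
perturbed = perturbed_positions(10, 3, 5, (1, -2, 2))
```

Equivalently, the last three intervals of the sequence are d + x, d + y and d + z instead of d, which is why each of x, y, z perturbs exactly one interval.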

The distance between two (x,y,z)-perturbed ELS's (n,d,k)(x,y,z) and (n',d',k')(x,y,z) is defined as the distance between the ordinary (unperturbed) ELS's (n,d,k) and (n',d',k').

We may now calculate the "(x,y,z)-proximity" of two words w and w' in a manner exactly analogous to that used for calculating the "ordinary" proximity W(w,w'). This yields 125 numbers W(x,y,z)(w,w'), of which W(w,w') = W(0,0,0)(w,w') is one. We are interested in only some of these 125 numbers; namely, those corresponding to triples (x,y,z) for which there actually exist some (x,y,z)-perturbed ELS's in Genesis for w, and some for w' [the other W(x,y,z)(w,w') vanish]. Denote by M(w,w') the set of all such triples, and by m(w,w') the number of its elements.

Suppose (0,0,0) is in M(w,w'), that is, both w and w' actually appear as ordinary ELS's (i.e., with x = y = z = 0) in the text. Denote by v(w,w') the number of triples (x,y,z) in M(w,w') for which W(x,y,z)(w,w') >= W(w,w'). If m(w,w') >= 10 (again, 10 is an arbitrarily selected "moderate" number),

c(w,w') := v(w,w') / m(w,w').

If (0,0,0) is not in M(w,w'), or if m(w,w') < 10 (in which case we consider the accuracy of the method as insufficient), we do not define c(w,w').
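The definition of c(w,w') can be sketched in Python as follows. This is a minimal illustration, not the original implementation. It assumes the perturbed proximities have already been computed and are passed in as a dictionary whose keys are exactly the triples in M(w,w'); the function name corrected_distance, the parameter min_triples, and the numbers in the example are hypothetical.

```python
def corrected_distance(proximity, min_triples=10):
    """Return c(w,w') = v(w,w') / m(w,w'), or None when it is undefined.

    proximity: dict mapping each triple (x, y, z) in M(w,w') to the
    perturbed proximity W(x,y,z)(w,w'). Triples outside M(w,w') are
    assumed to be absent from the dict.
    """
    base = (0, 0, 0)
    m = len(proximity)  # m(w,w'): number of triples in M(w,w')
    if base not in proximity or m < min_triples:
        return None  # c(w,w') is not defined in these cases
    w0 = proximity[base]  # the ordinary proximity W(w,w')
    # v(w,w'): triples whose proximity is at least W(w,w');
    # the triple (0, 0, 0) itself is always counted.
    v = sum(1 for value in proximity.values() if value >= w0)
    return v / m


# Made-up proximities for ten triples, for illustration only.
example = {(0, 0, 0): 5.0, (0, 0, 1): 9.0}
for i, x in enumerate((-2, -1, 1, 2), start=1):
    example[(x, 0, 0)] = float(i)      # 1.0, 2.0, 3.0, 4.0
    example[(0, x, 0)] = float(i + 4)  # 5.0, 6.0, 7.0, 8.0
c = corrected_distance(example)  # v = 6, m = 10, so c = 0.6
```

Because (0,0,0) always counts toward v(w,w'), the smallest value c(w,w') can take is 1/m(w,w'), reached when the unperturbed proximity exceeds every perturbed one.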

In words, the corrected distance c(w,w') is simply the rank order of the proximity W(w,w') among all the "perturbed proximities" W(x,y,z)(w,w'); we normalize it so that the maximum distance is 1. A large corrected distance means that ELS's representing w are far away from those representing w', on a scale determined by how far the perturbed ELS's for w are from those for w'.
