On the precision of Ptolemy's geographic coordinates in his Geographike Hyphegesis

In his Geographike Hyphegesis (∼150 AD) Claudius Ptolemy catalogued the positions of over 6300 ancient locations in the form of geographic longitude and latitude. A determination of the frequencies of the fractions of the degrees indicates that the coordinates are given with different precision/resolution and with larger regional differences. The author's assumptions concerning the origin of the detected form of the frequency distribution were examined with statistical hypothesis tests and partly confirmed. Differences between the frequencies of the Ω- and Ξ-recension were investigated in terms of possible transcription errors in the manuscripts. A calculation method is given with which the proportions of coordinates of different resolutions can be estimated. It is applied both globally and regionally. Such basic information about the precision of the Ptolemaic coordinates is important for detecting systematic and gross errors in the Ptolemaic coordinates and thereby for the identification of the Ptolemaic locations with their present-day places.


Introduction
Ancient geography and cartography have been a very active area of research, especially in the last decades; see for example Stevens (1998a, 1998b), Hübner (2000), Cosgrove (2001), Stückelberger and Mittenhuber (2009). One of the most important scientific works of antiquity is Claudius Ptolemy's (ca. 100-178 AD) "Geographike Hyphegesis" (GH), in short "Geography", which is the object of the current study. With his "Geography" Ptolemy summarized the geographic knowledge of his time; apart from a theoretical part it includes a catalogue of over 6300 positions of ancient locations. Ptolemy introduced a uniform, global coordinate system, in which the positions are given by means of geographic longitude Λ and latitude Φ. The original "Geography" no longer exists, but there are over 50 Greek manuscripts, the oldest of which dates from the end of the 13th century.
With today's information technology (processing power, algorithms and software), computer-based mathematical analyses of historical geographic data are possible (see for example Beineke (2001), Niederöst (2005) and Kleineberg et al. (2010)). Such analyses can lead to a better understanding of the knowledge and the scientific methods of the past as well as of historical developments in general. Among the innumerable works on the "Geography", computational investigations of the Ptolemaic coordinates are rare.
Correspondence to: C. Marx (christian.marx@tu-berlin.de)
The following investigation is a first statistical analysis of the precision of the Ptolemaic geographic coordinates in terms of their resolution. This precision needs to be distinguished from the accuracy in terms of the actual errors (random, systematic, gross); the resolution and the size of the random errors are, however, associated. The current study was inspired by the extensive research of Graßhoff (1990) and his forerunners, e.g. Vogt (1925), into the fractions of degree of the catalogued star coordinates in Ptolemy's Almagest.
The catalogued coordinates in the "Geography", Books 2-7, are given in degrees and fractions of degree, with the biggest denominator of the fractions being 12. This may lead to the assumption that the Ptolemaic coordinates were generally given with a precision/resolution of 1/12°.
The frequencies of the fractions of degree will be determined globally and regionally. From the findings, assumptions about the origin of the distribution of the fractions will be derived and tested with statistical hypothesis tests. Furthermore, a calculation method will be given and applied whereby the proportions of coordinates with different precision can be estimated.
Information about the precision of the Ptolemaic coordinates is of use for modelling their stochastic properties within a distortion analysis. A distortion analysis, in turn, is important for the identification of the Ptolemaic locations with their present-day places.
To ease the notation and readability, the fractions of degree will be expressed in arc minutes (0′, 5′ = 1/12°, 10′ = 1/6°, ..., 55′ = 11/12°).

The frequency of the fractions of degree
When regarding all the Ptolemaic coordinates, the degree values, as a rough indication of the position, are not distributed randomly, because they depend on the topographic conditions and the settlement structure. For the fractions of the degree values alone this dependence is insignificant, so that they are distributed randomly. Furthermore, every coordinate, and thereby its fraction of degree, is a random variable, because it was deduced from data affected by random errors.
For this reason the minute value can be seen as the result of a random experiment "generating a minute value M". M can take on the 12 possible values $m_1$ = 0′, $m_2$ = 5′, $m_3$ = 10′, ..., $m_{12}$ = 55′. Regarding the minute values of n coordinates, the probability P that the frequency $H_j$ of minute value $m_j$ is x is given by the density function of the binomial distribution with the probability $p_j$ of minute value $m_j$:

\[
P(H_j = x) = \binom{n}{x}\, p_j^{\,x}\, (1 - p_j)^{\,n - x}. \tag{1}
\]
Based on the assumption that all coordinates have a resolution of 1/12°, each of the 12 possible minute values should occur with the same probability $p_j$ = 1/12 with j = 1...12, and the minute values therefore follow a uniform distribution. The frequencies $H_j$ of the different minute values $m_j$ should then be approximately the same, owing to the large number of locations in the "Geography".
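To make "approximately the same" concrete, the following minimal Python sketch evaluates the mean and standard deviation of the binomial distribution (1) under the uniform assumption. The function name `uniform_expectation` is hypothetical, and using n = 6333 for a single coordinate list is an assumption made only for illustration.

```python
import math


def uniform_expectation(n: int, k: int = 12) -> tuple[float, float]:
    """Expected count and standard deviation of one minute value
    when n coordinates fall uniformly into k minute classes."""
    p = 1.0 / k
    mean = n * p                          # binomial mean n*p
    std = math.sqrt(n * p * (1.0 - p))    # binomial standard deviation
    return mean, std


# Assumption for illustration: one coordinate per catalogued position.
mean, std = uniform_expectation(6333)
print(f"expected count per minute value: {mean:.1f} +/- {std:.1f}")
```

Under these assumptions, observed frequencies that differ from the expected count by many standard deviations would be hard to reconcile with a general resolution of 1/12°.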
In the following, the longitudes and latitudes will be examined separately. The reason is that the accuracies of a location's longitude and latitude will be correlated only rarely, if at all, and if there is a correlation, its kind is unknown in detail and will differ from location to location. This follows from the independent and different measurement of longitudes (mainly terrestrial) and latitudes (terrestrial; gnomon; length of the longest day) in antiquity, and from Ptolemy's determination of coordinates from travel reports and itineraries, which provided only distances and no detailed directions, so that the accuracy of a determined position probably differs in longitude and latitude.
The frequencies of the minute values $m_j$ were determined for all coordinates of the location catalogue (GH II-VII). Owing to the differences between the manuscripts, no unique value has been handed down for every coordinate. The manuscripts are presumably based on two recensions, the so-called Ω- and Ξ-recension (Stückelberger and Graßhoff, 2006, p. 27). The Ξ-recension is handed down by only one manuscript (Codex Vaticanus Graecus 191, in short X), which ends in Book V, Chapter 13. The coordinates of the Ω- and Ξ-recension (represented by X) differ in part. Thanks to the work of Stückelberger and Graßhoff (2006), the coordinates of both recensions are available (data on attached CD). Of these, 6333 Ptolemaic positions were used for the following investigations. On the one hand, a data set of coordinates of the Ω-recension was built from the given positions (in short "Ω"); on the other hand, a data set in which the Ω-coordinates are replaced by those of the Ξ-recension wherever the two recensions differ (in short "Ω/Ξ").
Figure 1 shows the results for the frequencies of the minute values of the longitudes and latitudes in the form of histograms. As can be seen, the differences between Ω and Ω/Ξ have a negligible influence on the form of the distributions here. The largest difference occurs in the latitudes at 0′ and amounts to 46. Obviously the distributions in longitude and latitude differ significantly from a uniform distribution. This contradicts a general resolution of 1/12°. The distributions are approximately symmetric (0′ excluded) and have a specific form with four peaks at 0′, 20′, 30′ and 40′. In order of decreasing frequency, the minute values are: 0′, 30′, 20′/40′, 15′/45′, 10′/50′, 5′/25′/35′/55′. The last group is only rarely present in the longitudes.
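The tallying step behind such a histogram can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the sample values are invented, and representing each coordinate as an exact `Fraction` of a degree is an assumption of this sketch, not the format of the published data.

```python
from collections import Counter
from fractions import Fraction

MINUTE_VALUES = tuple(range(0, 60, 5))  # 0', 5', ..., 55'


def minute_value(coord: Fraction) -> int:
    """Fractional part of a coordinate (in degrees), expressed in arc minutes."""
    minutes = (abs(coord) % 1) * 60
    if minutes.denominator != 1 or minutes % 5 != 0:
        raise ValueError(f"{coord} is not a multiple of 1/12 degree")
    return int(minutes)


def minute_histogram(coords) -> dict[int, int]:
    """Frequency of every minute value, including values that do not occur."""
    counts = Counter(minute_value(c) for c in coords)
    return {m: counts.get(m, 0) for m in MINUTE_VALUES}


# Hypothetical sample coordinates in degrees (not taken from the catalogue).
sample = [Fraction(36), Fraction(73, 2), Fraction(112, 3),
          Fraction(451, 12), Fraction(40) + Fraction(2, 3)]
hist = minute_histogram(sample)
print(hist)
```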

On the origin of the frequencies of the fractions of degree
A possible reason for the deviations of the distribution of the minute values from a uniform distribution is that, in addition to coordinates of the resolution 1/12°, coordinates with coarser resolutions also exist. Reasons for different resolutions can be:
- When Ptolemy determined coordinates from given geographic data, he was aware of the uncertainty of his results and rounded the coordinate values meaningfully. This concerns especially data from travel reports and itineraries.
- Ptolemy was given coordinates with different accuracies. The accuracy of ancient measurement devices and methods differed, so that measurements resulted in coordinates with different accuracies.
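Coordinates with 5′, 25′, 35′ and 55′ can only have a resolution of 1/12°, whereas coordinates with other minute values can also be given with lower precision, for example 20′- and 40′-coordinates with the resolution 1/6°. In the following, $H_{m_j}$ denotes the frequency of the coordinates with minute value $m_j$ and $H_{m_j}^{a_i}$ the frequency of the coordinates with resolution $a_i$ and minute value $m_j$. Within this notation $a_i$ is expressed by a fraction, in contrast to $m_j$ (1/12, 1/6, 1/4, 1/3, 1/2, 1/1). If there are coordinates with different resolutions, each observed frequency is the sum of the frequencies of all resolutions at which the respective minute value can occur:

\[
\begin{aligned}
H_{m_j} &= H_{m_j}^{1/12}, & m_j &\in \{05, 25, 35, 55\},\\
H_{m_j} &= H_{m_j}^{1/12} + H_{m_j}^{1/6}, & m_j &\in \{10, 50\},\\
H_{m_j} &= H_{m_j}^{1/12} + H_{m_j}^{1/4}, & m_j &\in \{15, 45\},\\
H_{m_j} &= H_{m_j}^{1/12} + H_{m_j}^{1/6} + H_{m_j}^{1/3}, & m_j &\in \{20, 40\},\\
H_{30} &= H_{30}^{1/12} + H_{30}^{1/6} + H_{30}^{1/4} + H_{30}^{1/2},\\
H_{00} &= H_{00}^{1/12} + H_{00}^{1/6} + H_{00}^{1/4} + H_{00}^{1/3} + H_{00}^{1/2} + H_{00}^{1/1}.
\end{aligned}
\tag{2}
\]

If, for example, the frequency of coordinates with a resolution of 1/12° is approximately the same among the 5′- and 10′-coordinates ($H_{05}^{1/12} \approx H_{10}^{1/12}$), the frequency of the 10′-coordinates ($H_{10}$) will be substantially larger than the frequency of the 5′-coordinates ($H_{05}$) owing to the additional coordinates with resolution 1/6° among the 10′-coordinates. Therefore the frequencies of some minute values increase and the distribution shows the detected peaks.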
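Which resolutions can contribute to which minute value follows from simple divisibility, and can be enumerated with a short Python sketch. The names `RESOLUTIONS`, `admissible_minutes` and `compatible_resolutions` are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction

# The six resolutions a_i considered in the text, in degrees.
RESOLUTIONS = [Fraction(1, 12), Fraction(1, 6), Fraction(1, 4),
               Fraction(1, 3), Fraction(1, 2), Fraction(1, 1)]


def admissible_minutes(resolution: Fraction) -> list[int]:
    """Minute values that can occur at a given resolution."""
    step = int(resolution * 60)          # resolution in arc minutes
    return list(range(0, 60, step))


def compatible_resolutions(minute: int) -> list[Fraction]:
    """Resolutions at which a given minute value can occur."""
    return [a for a in RESOLUTIONS if minute in admissible_minutes(a)]


for m in range(0, 60, 5):
    print(f"{m:2d}'", [str(a) for a in compatible_resolutions(m)])
```

The printed table reproduces the grouping used in the text: 5′, 25′, 35′ and 55′ are compatible only with 1/12°, while 0′ is compatible with all six resolutions.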
The observed symmetric property of the distribution results from approximate equalities of the frequencies of particular minute values. An explanation for the approximate equalities is that, among coordinates with a specific resolution, the possible minute values occur with approximately equal frequencies. Regarding the random character of the minute values, this appears plausible. Thus, for example, $H_{15}^{1/12} \approx H_{45}^{1/12}$ and $H_{15}^{1/4} \approx H_{45}^{1/4}$, so that for the frequencies of the 15′- and 45′-coordinates $H_{15} \approx H_{45}$ holds.
Hence the following is supposed for the subsequent examinations:
Assumption 1 The coordinates in the location catalogue of the "Geography" have different resolutions 1/12°, 1/6°, 1/4°, 1/3°, 1/2° and 1°.
Assumption 2 Regarding only the coordinates with one specific resolution, the possible fractions of degree have the same probability.
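Therefore, for the expected values of the $H_{m_j}^{a_i}$ the following holds:

\[
E\big(H_{m_j}^{a_i}\big) = E\big(H_{m_k}^{a_i}\big) \quad \text{for all minute values } m_j, m_k \text{ possible at resolution } a_i. \tag{3}
\]

With the relation (Sachs, 1992, p. 95)

\[
E(X + Y) = E(X) + E(Y) \tag{4}
\]

it follows from the Equation system (2) that

\[
E(H_{05}) = E(H_{25}) = E(H_{35}) = E(H_{55}), \quad E(H_{10}) = E(H_{50}), \quad E(H_{15}) = E(H_{45}), \quad E(H_{20}) = E(H_{40}). \tag{5}
\]

The equality of the expected values of specific frequencies $H_{m_j}$ leads to the observed symmetry of the distribution in Fig. 1.

Verification of the assumptions on the distribution of the fractions of degree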

Simulation
Assumptions 1 and 2 about the origin of the observed form of the distribution of the minute values were first tested by means of simulation, i.e. a random generation of minute values. For every resolution $a_i$ a specific number $n_i$ of coordinates was given, and their minute values M were generated randomly using a pseudo-random number generator. In doing so, every minute value $m_j$ possible at a specific resolution $a_i$ was assigned the same probability p. Figure 2 shows the resulting randomly generated frequencies $H_{m_j}$ for the example of the longitudes. The histogram also indicates the random portions $H_{m_j}^{a_i}$ of coordinates with different resolutions. Comparing the result of the simulation with the observed frequencies in Fig. 1, a high similarity in the form of the distribution can be stated. That argues for Assumptions 1 and 2. However, since the simulated distribution is more symmetric than the observed one, it seems that the uniform distribution assumed in Assumption 2 is only approximately present in the Ptolemaic coordinates.
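A simulation of this kind can be sketched as follows. This is a minimal illustration, not the implementation used for Fig. 2: the function name `simulate`, the seed and in particular the numbers $n_i$ per resolution are hypothetical values chosen only to produce a histogram of similar shape.

```python
import random
from collections import Counter

# Step width in arc minutes for each resolution a_i (in degrees).
STEP_MINUTES = {"1/12": 5, "1/6": 10, "1/4": 15, "1/3": 20, "1/2": 30, "1": 60}


def simulate(counts_per_resolution: dict[str, int], seed: int = 0) -> dict[int, int]:
    """Draw minute values uniformly from the values admissible at each
    resolution (Assumption 2) and return the combined histogram."""
    rng = random.Random(seed)
    total = Counter()
    for name, n in counts_per_resolution.items():
        admissible = list(range(0, 60, STEP_MINUTES[name]))
        total.update(rng.choice(admissible) for _ in range(n))
    return {m: total.get(m, 0) for m in range(0, 60, 5)}


# Hypothetical numbers n_i of coordinates per resolution (not estimates from the text).
n_i = {"1/12": 1200, "1/6": 1200, "1/4": 900, "1/3": 900, "1/2": 1200, "1": 900}
hist = simulate(n_i)
for minute, count in hist.items():
    print(f"{minute:2d}' {count}")
```

With a mixture like this, the simulated histogram shows peaks at 0′, 20′, 30′ and 40′ and approximate symmetry about 30′, which is the qualitative behaviour described above.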

Statistical hypothesis tests
Assumptions 1 and 2, or Hypothesis (3) respectively, cannot be tested by statistical tests, because there are no observations $H_{m_j}^{a_i}$ for the frequencies of coordinates with the same resolution. However, it was shown that Hypothesis (3) leads to a symmetric frequency distribution of the minute values with

\[
E(H_{05}) = E(H_{25}) = E(H_{35}) = E(H_{55}), \tag{6a}
\]
\[
E(H_{15}) = E(H_{45}), \quad E(H_{10}) = E(H_{50}), \quad E(H_{20}) = E(H_{40}). \tag{6b-d}
\]

That can be tested with the observed frequencies $H_{m_j}$, so that an indirect test of Hypothesis (3) is possible.
The test of Hypothesis (6) is made with the χ² test if there are more than 2 frequencies to compare and with the binomial test if there are 2 frequencies to compare. In the four single tests for Hypotheses (6a-d), the null hypothesis $H_0$ is that the k = 4 (5′, 25′, 35′, 55′) or k = 2 (otherwise) minute values $m_j$ occur with a probability of p = 1/k. The alternative hypothesis $H_A$ is p ≠ 1/k.
The χ² test is made for Hypothesis (6a). The test statistic is in general (Sachs, 1992, pp. 420-421)

\[
\chi^2 = \sum_{i=1}^{k} \frac{(n_i - n\,p_i)^2}{n\,p_i}
\]

with the class number k, the observed frequencies $n_i$, the hypothetical probabilities $p_i$ and $n = \sum_{i=1}^{k} n_i$. Here k = 4 (four classes 5′, 25′, 35′, 55′), $n_i = H_{m_j}$ and $p_i$ = 1/4 (testing for a uniform distribution). Under $H_0$, χ² follows a χ² distribution with f = k − 1 − u degrees of freedom, where u is the number of unknown parameters. Here u = 0.
The χ² test is replaced by the binomial test if only two classes exist (Büning and Trenkler, 1978, pp. 103-106). In the three tests for Hypotheses (6b-d), the two possible classes are either M = $m_j$ or M = $m_k$ (k ≠ j), each with p = 0.5. The probability that a specific minute value $m_j$ occurs x times is given by the density function of the binomial distribution (1). The test statistic of the binomial test is the observation $H_{m_j}$ itself.
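Both tests can be sketched with the Python standard library alone. The following is an illustration under stated assumptions rather than the procedure actually used: the function names are hypothetical, the frequencies are invented, the χ² decision uses the tabulated 0.99 quantile for 3 degrees of freedom (11.345), and the binomial p-value is computed as an exact two-sided value by doubling the smaller tail, which is valid for p = 0.5 because the distribution is then symmetric.

```python
from math import comb

# 0.99 quantile of the chi-square distribution with f = 3 degrees of freedom.
CHI2_CRIT_DF3_99 = 11.345


def chi_square_uniform(observed: list[int]) -> float:
    """Chi-square statistic for H0: all classes are equally probable."""
    n = sum(observed)
    expected = n / len(observed)          # n * p_i with p_i = 1/k
    return sum((o - expected) ** 2 / expected for o in observed)


def binomial_two_sided_p(x: int, n: int) -> float:
    """Exact two-sided p-value for H0: p = 0.5, with x successes in n trials."""
    k = min(x, n - x)
    tail = sum(comb(n, i) for i in range(k + 1))   # P(X <= k) * 2**n
    return min(1.0, 2 * tail / 2 ** n)


# Hypothetical frequencies H_05, H_25, H_35, H_55 (not the catalogue values).
h_four = [60, 70, 95, 65]
stat = chi_square_uniform(h_four)
print("chi-square:", stat, "reject H0:", stat > CHI2_CRIT_DF3_99)

# Hypothetical pair H_15 = 300 and H_45 = 380.
p_value = binomial_two_sided_p(300, 300 + 380)
print("binomial p-value:", p_value, "reject H0:", p_value < 0.01)
```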

Applying the hypothesis tests
Since four single tests for 5′/25′/35′/55′, 15′/45′, 10′/50′ and 20′/40′ have to be made, the test of Hypothesis (6) is a kind of multiple test. In comparison to a single test, the probability therefore increases that at least one null hypothesis is rejected by mistake (Sachs, 1992, pp. 183-184). That is why a significance level of α = 0.01 (1 − α = 0.99) was chosen for the single tests.
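The effect of multiple testing can be quantified with a short Python sketch. It assumes that the four single tests are independent, which is a simplification made here for illustration; the function name `familywise_error` is hypothetical.

```python
def familywise_error(alpha: float, tests: int) -> float:
    """Probability of at least one false rejection among `tests`
    independent single tests, each carried out at level `alpha`."""
    return 1.0 - (1.0 - alpha) ** tests


for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: overall error probability "
          f"{familywise_error(alpha, 4):.4f}")
```

Under this assumption, four tests at α = 0.05 would give an overall error probability of roughly 0.19, whereas α = 0.01 keeps it at about 0.04.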
As a result, the general validity of Hypothesis (6) must be rejected; some differences between the $H_{m_j}$ are significant:
1. Ω and Ω/Ξ: in the longitudes between $H_{15}$ and $H_{45}$,
2. Ω/Ξ: in the longitudes between $H_{05}$, $H_{25}$, $H_{35}$ and $H_{55}$,
3. Ω: in the latitudes between $H_{20}$ and $H_{40}$,
4. Ω and Ω/Ξ: in the latitudes between $H_{05}$, $H_{25}$, $H_{35}$ and $H_{55}$.
Since fewer than half of the tested differences are significant, Assumptions 1 and 2 need not be rejected entirely, also because the results of the simulation argued for them. Coordinates with different resolutions can still be assumed (Assumption 1), but Assumption 2 has to be adapted. It is possible that the minute values of coordinates with the same resolution are only approximately uniformly distributed, or that there were systematic influences on the frequencies of the minute values.
Hist. Geo Space Sci., 2, 29-37, 2011 www.hist-geo-space-sci.net/2/29/2011/
A possible reason for the significant differences of the $H_{m_j}$ specified above is:
1) The value 40′ = 2/3° was preferred when coordinate values were rounded. That holds especially for preferring 40′ to 35′ and 45′, but is also imaginable for the value 50′.
Thus $H_{40}$ is increased systematically compared to $H_{20}$, so that 3. occurs; $H_{45}$ is decreased systematically compared to $H_{15}$, so that 1. occurs; and $H_{35}$ is decreased compared to $H_{05}$, $H_{25}$ and $H_{55}$, so that 2. and 4. occur.
In the Greek number representation the value 2/3° was expressed by the special sign γo. It was thereby the only fraction of degree > 1/2° that was not expressed by an addition of unit fractions (e.g. 35′ = 1/2° + 1/12°). Furthermore, with the value 2/3° the full degree was divided into only three parts, i.e. it was less finely divided than with other fractions of degree > 1/2°, so that the value was probably also mathematically more attractive. Moreover, Vogt (1925) argues for a preference of specific fractions of degree on the basis of a large frequency of half and full degrees in the latitudes of the star catalogue of the Almagest. Half and full degrees are thus supposed to have been more attractive when reading the measurement instrument.
If the value 2/3° was preferred, the question arises why a fraction < 1/2° was not preferred too. An explanation is that each of these values, with the exception of 5/12° = 25′, could be expressed with a single unit fraction, so that there was no specifically preferable value among them. Large differences between $H_{40}$ and $H_{20}$ could also be explained by a frequent mix-up of 20′ = 1/3° (γ) and 40′ = 2/3° (γo) (according to Stückelberger and Graßhoff (2006, p. 40) a typical mistake in the manuscripts), but then a frequent erroneous transcription from γ to γo must have occurred, which surely is not as likely as the opposite case.
After the frequencies of the fractions of degree were investigated globally, the question arises whether the observed form of the distribution also occurs regionally. To investigate this, the location catalogue was divided into 10 regions in which the frequencies of the minute values were determined (see also Fig. 7):
1. GH II.2, 3, 7-13: Ireland, Britain, Gaul, Germania, Rhaetia, Noricum (550 locations), ...
... from the global distribution. This can be explained by regionally different amounts of coordinates with different precision. Furthermore, larger relative differences appear between frequencies which should coincide according to Hypothesis (6); this is, however, to be expected anyway owing to the smaller samples. The globally observed excess of 40′ does not occur in every region.
Hypothesis (6) was tested in the single regions again by means of the described hypothesis tests; the significance level was again α = 0.01 (1 − α = 0.99). In the following, dedicated symbols stand for statistically significant differences, and ">" and "<" for insignificant but noticeable differences. The index Ω stands for the data set Ω, the index Ξ for the data set Ω/Ξ. If no index is used, the result holds for Ω and Ω/Ξ. The results in the several regions are (Ξ-recension only in Regions 1-7, since X ends in GH V.13):
1. no significant differences
2. Λ: Ω (not explainable with 1)) ...
In Regions 1, 4 and 5 there are no significant differences concerning Hypothesis (6), neither in Ω nor in Ω/Ξ. (It is notable that these regions match the area of Europe except for the southern and south-eastern regions of GH III.1-4 and GH III.13-17.) Where significant differences exist, they generally differ from region to region. Only a few significant differences exist; in no region do all single test results contradict Hypothesis (6). The frequencies $H_{m_j}$ which are responsible for significant differences are mostly obvious.

About the differences between the Ω- and Ξ-recension
In the histograms of the several regions, large differences occur in part between the Ω- and Ξ-recension. A possible explanation for a large difference is that a large number of identical transcription errors occurs in one of the two recensions. For example, in the longitudes of Region 2 (GH II.4-6) the differences $\Delta H_{20}$ = 17 and $\Delta H_{50}$ = −17 could be explained by a frequent error in Ω:
2a) 50′ → 20′: erroneous transcription from 50′ = 1/2° + 1/3° to 20′ = 1/3°.
But then there should exist about 17 locations with 20′ in Ω and 50′ in Ξ. That is not the case.
For further types of differences $\Delta H_{m_j}$ such explanations are conceivable and have to be investigated. In addition to Error 2a) the following transcription errors were considered:
2b) 50′ → 30′: erroneous transcription from 50′ = 1/2° + 1/3° to 30′ = 1/2° ...
For each of the Regions 1-7 it was determined for how many locations the minute values differ between the Ω- and Ξ-recension according to the possible Errors 2a) to 2f); in the case of Error 2f) the degree value also had to be considered. As a result, pairs of minute values according to Errors 2a) to 2f) do occur in Ω and Ξ, but in general their number is much lower than the observed differences $\Delta H_{m_j}$ between Ω and Ξ. Hence, the differences in the frequencies of the minute values between Ω and Ξ mostly cannot be explained by the Errors 2a) to 2f). Regarding differences $\Delta H_{m_j}$ ≥ 4, only differences concerning the minute values M = 10′ and M = 30′ can be explained by a transcription error. The respective regions are: ...
Deviations from the uniform distribution, which appear as peculiar observations $H_{m_j}$ in the histogram, can be weighted down in the parameter estimation. The estimation also gives the uncertainties (standard deviations) of the estimated portions of the resolutions. Large differences between frequencies $H_{m_j}$ which should be equal according to Hypothesis (6) will result in larger uncertainties.
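In the following, the frequency of coordinates with a resolution $a_i$ is named $H^{a_i}$. The estimation of the $H^{a_i}$ is carried out in two steps. First, the $H_m^{a_i}$, i.e. the frequencies per minute value at resolution $a_i$, are estimated from the observed frequencies $H_{m_j}$. Then the $H^{a_i}$ are obtained by simply summing them up over the minute values possible at $a_i$.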
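For the estimation of the $H_m^{a_i}$, a common parameter $\hat{H}_m^{a_i}$ is introduced into the Equation system (2) for these unknowns:

\[
\begin{aligned}
H_{m_j} &= \hat{H}_m^{1/12}, & m_j &\in \{05, 25, 35, 55\},\\
H_{m_j} &= \hat{H}_m^{1/12} + \hat{H}_m^{1/6}, & m_j &\in \{10, 50\},\\
H_{m_j} &= \hat{H}_m^{1/12} + \hat{H}_m^{1/4}, & m_j &\in \{15, 45\},\\
H_{m_j} &= \hat{H}_m^{1/12} + \hat{H}_m^{1/6} + \hat{H}_m^{1/3}, & m_j &\in \{20, 40\},
\end{aligned}
\tag{9a-j}
\]
\[
H_{30} = \hat{H}_m^{1/12} + \hat{H}_m^{1/6} + \hat{H}_m^{1/4} + \hat{H}_m^{1/2}, \tag{9k}
\]
\[
H_{00} = \hat{H}_m^{1/12} + \hat{H}_m^{1/6} + \hat{H}_m^{1/4} + \hat{H}_m^{1/3} + \hat{H}_m^{1/2} + \hat{H}_m^{1/1}. \tag{9l}
\]

For the determination of the 6 unknown $\hat{H}_m^{a_i}$, 12 observations $H_{m_j}$ are given. $\hat{H}_m^{1/12}$, $\hat{H}_m^{1/6}$, $\hat{H}_m^{1/4}$ and $\hat{H}_m^{1/3}$ are overdetermined by the Equation system (9a-j). Since the observed values $H_{m_j}$ are random variables, the Equation system (9a-j) is not satisfied exactly, so that the equations have to be expanded by random errors or residuals $v_{m_j}$:

\[
H_{m_j} + v_{m_j} = \sum_{a_i \text{ possible at } m_j} \hat{H}_m^{a_i}.
\]

The estimation of the $\hat{H}_m^{a_i}$ is then made by means of an adjustment (least squares method). Equations (9k) and (9l) do not play a role in the overdetermination of the unknowns (they are the only two equations with the two unknowns $\hat{H}_m^{1/2}$ and $\hat{H}_m^{1/1}$). The adjustment also provides the variances and covariances of the $\hat{H}_m^{a_i}$. From these, the variances of the $H^{a_i}$ are obtained by means of variance propagation, i.e. information about the uncertainty of the results.
Initially the whole location catalogue of the "Geography" was examined. The detected deviations from Hypothesis (6) were taken into account; the obvious outliers $H_{45}$ in the longitudes as well as $H_{35}$ and $H_{40}$ in the latitudes were weighted down with the weight 0.5 (instead of 1.0).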
Figure 6 shows the estimated proportions of the resolutions. The estimated standard deviations are also plotted in the diagram. Their values are small enough for interpreting ... and a small portion of the resolution 1°. In latitude the resolution 1/12° slightly predominates, followed by 1/4°, 1/6°, 1/3° and 1/2°. According to the estimation, latitudes with a resolution of 1° do not occur. Thus, in latitude there are more coordinates with finer resolutions than in longitude. This is reasonable owing to the departure of the longitudes (a degree of longitude becomes shorter with increasing latitude) and, at this point, does not mean that the latitudes are really more accurate.
The frequencies of the coordinates with different resolutions were also estimated for the Regions 1-10. The visually detected outliers $H_{m_j}$ were again taken into account by using lower weights p = 0.5 in the adjustment. In order to summarize the results here, an average resolution ā was calculated for every region. Since the resolution ā in longitude is not comparable for regions of different latitudes, the resolutions ā were converted into kilometres, using the average latitude of the region as an approximation. Table 1 gives the calculated ā; Fig. 7 shows them additionally in the form of axes. Only the Ω-recension is considered here; for the Ξ-recension the results are similar. The longitudes are more imprecise than the latitudes in every region, although the difference is insignificant in Europe. This may have to do with a scaling error in the Ptolemaic coordinates, which differs in longitude and latitude and also affects the fractions of degree. A larger scaling error in the longitudes can be explained by Ptolemy's overestimation of the dimension of the Ecumene.
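The conversion of an average resolution into kilometres can be sketched as follows. Two points are assumptions of this illustration and are not taken from the text: the average resolution is computed here as a frequency-weighted mean of the resolutions, and the Earth is treated as a sphere with a mean radius of 6371 km. The function names and the sample frequencies are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius, an assumption of this sketch


def mean_resolution_deg(totals: dict[float, float]) -> float:
    """Frequency-weighted mean resolution in degrees (assumed definition).
    `totals` maps a resolution in degrees to its estimated frequency."""
    n = sum(totals.values())
    return sum(a * h for a, h in totals.items()) / n


def degrees_to_km(res_deg: float, mean_lat_deg: float = 0.0,
                  is_longitude: bool = False) -> float:
    """Arc length of `res_deg` on the sphere; for longitudes the arc is
    shortened by the cosine of the region's mean latitude."""
    km = math.radians(res_deg) * EARTH_RADIUS_KM
    if is_longitude:
        km *= math.cos(math.radians(mean_lat_deg))
    return km


# Hypothetical estimated frequencies per resolution for one region.
totals = {1 / 12: 120.0, 1 / 6: 90.0, 1 / 4: 60.0, 1 / 3: 45.0, 1 / 2: 40.0, 1.0: 30.0}
a_bar = mean_resolution_deg(totals)
print(f"latitude:  {degrees_to_km(a_bar):.1f} km")
print(f"longitude: {degrees_to_km(a_bar, mean_lat_deg=40.0, is_longitude=True):.1f} km")
```

The cosine factor illustrates why equal resolutions in degrees correspond to shorter distances in longitude than in latitude away from the equator.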
The most precise coordinates are to be found in Greece, Macedonia and Italy, followed by the Iberian Peninsula. Less precise are the coordinates in Southeastern and Eastern Europe, followed by the Near East, Central and Western Europe and Africa. In Asia, east of the Caspian Sea, the coordinates are the least precise. This ordering obviously reflects the sphere of control and area of influence of the Roman Empire, for which we can assume that more accurate measurements and more geographic information were available.

Conclusions
A method was given and applied whereby new information about the precision of the Ptolemaic coordinates can be obtained by investigating the frequencies of the fractions of degree. The method can also be applied to other regions of Ptolemaic locations which are of interest. Information about the precision of the Ptolemaic coordinates is important for an analysis of the distortions of the Ptolemaic coordinates, i.e. for providing a stochastic adjustment model to detect systematic and gross errors. An analysis of the distortions is helpful for the identification of unknown Ptolemaic locations. In doing so, the distortions are described by a mathematical function, so that the Ptolemaic positions can be transformed into the modern geographic coordinate system. An identification of the Ptolemaic locations in conjunction with an analysis of distortions is currently being performed by an interdisciplinary project group at the Technische Universität Berlin. The work on Europe (Books II and III) has been completed; the results on Book II, Chapters 9 and 11-13 are published in Kleineberg et al. (2010).
x° y′ → xz° 00′: erroneous transcription of a coordinate value X Z to XZ, whereby the fraction Z becomes the degree value Z, with the possible values ς = 6, δ = 4 and γ = 3 for Z and X a multiple of 10.