@TITLE = On the Accuracy and Stability of Clocks Synchronized by the Network Time Protocol in the Internet System<$FReprinted from: Mills, D.L. On the accuracy and stability of clocks synchronized by the Network Time Protocol in the Internet system. ACM Computer Communication Review 20, 1 (January 1990), 65-75.> <$FSponsored by: Defense Advanced Research Projects Agency contract number N00140-87-C-8901 and by National Science Foundation grant number NCR-89-13623.> @AUTHOR = David L. Mills Electrical Engineering Department University of Delaware @AUTHOR = Abstract @ABSTRACT = This paper describes a series of experiments involving over 100,000 hosts of the Internet system located in the U.S., Europe and the Pacific. The experiments are designed to evaluate the availability, accuracy and reliability of international standard time distribution using the Internet and the Network Time Protocol (NTP), which has been designated an Internet Standard protocol. NTP is designed specifically for use in a large, diverse internet system operating at speeds from mundane to lightwave. In NTP a distributed subnet of time servers operating in a self-organizing, hierarchical, master-slave configuration exchange precision timestamps in order to synchronize host clocks to each other and national time standards via wire or radio. @ABSTRACT = The experiments are designed to locate Internet hosts and gateways that provide time by one of three time distribution protocols and evaluate the accuracy of their indications. For those hosts that support NTP, the experiments determine the distribution of errors and other statistics over paths spanning major portions of the globe. Finally, the experiments evaluate the accuracy and reliability of precision timekeeping using NTP and typical Internet paths involving ARPANET, NSFNET and regional networks. The experiments demonstrate that timekeeping throughout most portions of the Internet can be maintained to an accuracy of a few tens of milliseconds and a stability of a few milliseconds per day, even in cases of failure or disruption of clocks, time servers or networks. Keywords: network clock synchronization, standard-time distribution, performance evaluation, internet protocol. @HEAD LEVEL 1 = Introduction How do hosts and gateways in a large, dispersed networking community know what time it is? How accurate are their clocks? In a 1988 survey involving 5,722 hosts and gateways of the Internet system [14], 1158 provided their local time via the network. Sixty percent of the replies had errors greater than one minute, while ten percent had errors greater than 13 minutes. A few had errors of as much as two years. Most host clocks are set by eyeball-and-wristwatch to within a minute or two and rarely checked after that. Many of these are maintained by some sort of battery-backed clock/calendar device using a room-temperature quartz oscillator that may drift seconds per day and can go for weeks between manual corrections. For many applications, especially those designed to operate in a distributed internet environment, much greater accuracy, stability and reliability are required. The Network Time Protocol (NTP) is designed to distribute standard time using the hosts and gateways of the Internet system. The Internet consists of over 100,000 hosts on over 800 packet-switching networks interconnected by a comparable number of gateways. 
While the Internet backbone networks and gateways are engineered and managed for good service, operating speeds and service reliabilities vary considerably throughout the regional and campus networks of the system. This places severe demands on NTP, which must deliver accurate, stable and reliable standard time throughout the system, in spite of component failures, service disruptions and possibly mis-engineered implementations. NTP and its forebears were developed and tested on PDP11 computers and the Fuzzball operating system, which was designed specifically for timekeeping precisions of a millisecond or better [15]. An implementation of NTP as a Unix 4.3bsd system daemon was built by Michael Petry and Louis Mamakos at the University of Maryland. A special-purpose hardware/software implementation of NTP was built by Dennis Ferguson at the University of Toronto. At least 16 NTP primary time servers are presently synchronized by radio or satellite to national time standards in the U.S., Canada and the U.K. About half of these are connected directly to backbone networks and are intended for ubiquitous access, while the remainder are connected to regional and campus networks and intended for local distribution. It is estimated that there are well over 2000 secondary servers in North America, Europe and the Pacific synchronized by NTP directly or indirectly to these primary servers. This paper describes several comprehensive experiments designed to evaluate the availability, accuracy, stability and reliability of standard time distribution using NTP and the hosts and gateways of the Internet. The first is designed to locate hosts that support at least one of three time protocols specified for use in the Internet, including NTP. Since Internet hosts are not centrally administered and network time is not a required service in the TCP/IP protocol suite, experimental determination is the only practical way to estimate the penetration of time service in the Internet. The remaining experiments use only NTP and are designed to assess the nominals and extremes of various errors that occur in regular system operation, including those due to the network paths between the servers and the radio propagation paths to the source of synchronization, as well as the intrinsic stabilities of the various radio clocks and local clocks in the system. This paper does not describe in detail the architecture or protocols of NTP, nor does it present the rationale for the particular choice of synchronization method and statistical processing algorithms. Further information on the background, model and algorithms can be found in [18], while details of the latest NTP protocol specification can be found in [16]. This paper itself is an edited and expanded version of [17]. @HEAD LEVEL 2 = Standard Time and Frequency Dissemination In order that precision time and frequency can be coordinated throughout the world, national administrations operate primary time and frequency standards and maintain Coordinated Universal Time (UTC) by observing various radio broadcasts and through occasional use of portable atomic clocks. A primary frequency standard is an oscillator that can maintain extremely precise frequency relative to a physical phenomenon, such as a transition in the orbital states of an electron. 
Presently available atomic oscillators are based on the transitions of the hydrogen, cesium and rubidium atoms and are capable of maintaining fractional frequency stability to <$E10 sup {-13}> and time to 100 ns when operated in multiple ensembles at various national standards laboratories. The U.S. National Institute of Standards and Technology (NIST - formerly National Bureau of Standards) operates radio broadcast services for the dissemination of standard time [21]. These include short-wave transmissions from stations WWV at Fort Collins, CO, and WWVH at Kauai, HI, long-wave transmissions from WWVB, also at Fort Collins, and satellite transmissions from the Geostationary Operational Environmental Satellite (GOES). These transmissions and those of some other countries, including Canada and the U.K., include a timecode modulation which can be decoded by special-purpose radio receivers and interfaced to an NTP time server. Using high-frequency transmissions, reliable frequency comparisons can be made to the order of <$E10 sup {-7}>, but time accuracies are limited to the order of a millisecond [5]. Using long-wave transmissions and appropriate receiving and averaging techniques and corrections for diurnal and seasonal propagation effects, frequency comparisons to within <$E10 sup {-11}> are possible and time accuracies of from a few to 50 microseconds can be obtained. Using GOES the accuracy depends on an accurate ephemeris and correction factors, but is generally of the same order as WWVB. Other systems intended primarily for navigation, including LORAN-C [8], Global Positioning System (GPS) [4], OMEGA [25], and various very-low-frequency communication stations in principle can be used for very precise time and frequency transfer on a global scale; however, these systems do not provide timecodes including time-of-day or day-of-year information. @HEAD LEVEL 2 = The Network Time Protocol An accurate, reliable time distribution protocol must provide the following: @INDENT HEAD = 1. @INDENT = The primary time reference source(s) must be synchronized to national standards by wire, radio or portable clock. The system of time servers and clients must deliver continuous local time based on UTC, even when leap seconds are inserted in the UTC timescale. @INDENT HEAD = 2. @INDENT = The time servers must provide accurate, stable and precise time, even with relatively large statistical delays on the transmission paths. This requires careful design of the data smoothing and deglitching algorithms, as well as an extremely stable local clock oscillator and synchronization mechanism. @INDENT HEAD = 3. @INDENT = The synchronization subnet must be reliable and survivable, even under unstable conditions and where connectivity may be lost for periods extending to days. This requires redundant time servers and diverse transmission paths, as well as a dynamically reconfigurable subnet architecture. @INDENT HEAD = 4. @INDENT = The synchronization protocol must operate continuously and provide update information at rates sufficient to compensate for the expected wander of the room-temperature quartz oscillators commonly used in ordinary computer systems. It must operate efficiently with large numbers of time servers and clients in continuous-polled and procedure-call modes and in multicast and point-to-point configurations. @INDENT HEAD = 5. @INDENT = The system must operate with a spectrum of systems ranging from personal workstations to supercomputers, but make minimal demands on the operating system and supporting services. 
Time server software and especially client software must be easily installed and configured. In addition to the above, and in common with other promiscuously distributed services, the system must include generic protection against accidental or willful intrusion and provide a comprehensive interface for network management. In NTP address filtering is used for access control, while encrypted checksums are used for authentication [16]. Network management presently uses a proprietary protocol with provisions to migrate to standard protocols where available. In NTP one or more primary time servers synchronize directly to external reference sources such as radio clocks. Secondary time servers synchronize to the primary servers and others in a configured subnet of NTP servers. Subnet servers calculate local clock offsets and delays between them using timestamps with 200 picosecond resolution exchanged at intervals up to about 17 minutes. As explained in [16], the protocol uses a distributed Bellman-Ford algorithm [3] to construct minimum-weight spanning trees within the subnet based on hierarchical level (stratum) and total synchronization path delay to the primary servers. A typical NTP synchronization subnet is shown in Figure 1a<$&fig12>, in which the nodes represent subnet servers, with normal stratum numbers shown, and the heavy lines represent the active synchronization paths. The light lines represent backup synchronization paths where timing information is exchanged, but not necessarily used to synchronize the local clock. Figure 1b shows the same subnet, but with the line marked x out of service. The subnet has reconfigured itself automatically to use backup paths, with the result that one of the servers has dropped from stratum 2 to stratum 3. Besides NTP, there are several protocols designed to distribute time in local-area networks, including the DAYTIME protocol [22], TIME Protocol [23], ICMP Timestamp message [7] and IP Timestamp option [24]. The DCN routing protocol incorporates time synchronization directly into the routing protocol using algorithms similar to NTP [11]. The Unix 4.3bsd time daemon timed uses a single master-time daemon to measure offsets of a number of slave hosts and send periodic corrections to them [9]. However, these protocols do not include engineered algorithms to compensate for the effects of statistical delay variations encountered in wide-area networks and are unsuitable for precision time distribution throughout the Internet. @HEAD LEVEL 2 = Determining Time and Frequency In this paper to synchronize frequency means to adjust the clocks in the network to run at the same frequency; to synchronize time means to set the clocks so that all agree at a particular epoch with respect to UTC, as provided by national standards; and to synchronize clocks means to synchronize them in both frequency and time. A clock synchronization subnet operates by measuring clock offsets between the various servers in the subnet and so is vulnerable to statistical delay variations on the various transmission paths between them. In the Internet the paths involved can have wide variations in delay and reliability, while the routing algorithms can select landline or satellite paths, public network or dedicated links or even suspend service without prior notice. 
In statistically noisy internets accurate time synchronization requires carefully engineered filtering and selection algorithms and the use of redundant resources and diverse transmission paths, while stable frequency synchronization requires finely tuned local clock tracking loops and multiple offset comparisons over relatively long periods of time. For instance, while only a few comparisons are usually adequate to resolve local time for an Internet host to within a few tens of milliseconds, dozens of measurements over many hours are required to achieve a frequency stability of a few tens of milliseconds per day and hundreds of measurements over many days to achieve the ultimate accuracy of a millisecond per day. Figure 2<$&fig11> shows the overall organization of the NTP time server model. Timestamps exchanged with possibly many other servers are used to determine individual roundtrip delays and clock offsets relative to each server as follows. Number the times of sending and receiving NTP messages as shown below and let i be an even integer. <$&fig[^]>Then <$Et sub {i-3} ,~t sub {i-2} ,~t sub {i-1} ,~t sub i> are the values of the four most recent timestamps as shown. The roundtrip delay <$Ed sub i> and clock offset <$Ec sub i> of the receiving server relative to the sending server are: @CENTER = <$Ed sub i~=~(t sub i~-~t sub {i - 3} )~-~(t sub {i - 1}~-~t sub {i - 2} )> , <$Ec sub i~=~{(t sub {i - 2}~-~t sub {i-3})~+~(t sub {i-1}~-~t sub i ) } over 2> . This method amounts to a continuously sampled, returnable-time system, which is used in some digital telephone networks [19]. Among the advantages are that the transmitted time and received order of the messages are unimportant and that reliable delivery is not required. Obviously, the accuracies achievable depend upon the statistical properties of the outbound and inbound data paths. Further analysis and experimental results bearing on this issue can be found in [6], [12] and [13]. As shown in Figure 2, the computed offsets are first filtered to reduce incidental noise and then evaluated to select the most accurate and reliable subset among all available servers. The filtered offsets from this subset are first combined using a weighted average and then processed by a phase-locked loop (PLL). In the PLL the phase detector (PD) produces a correction term, which is processed by the loop filter to control the local clock, which functions as a voltage-controlled oscillator (VCO). Further discussion on these components is given in subsequent sections. @HEAD LEVEL 1 = Discovering Internet Timetellers An experiment designed to discover Internet time server hosts and evaluate the quality of their indications was conducted over a nine-day interval in August 1989. This experiment is an update of previous experiments conducted in 1985 [13] and early 1988 [14]. It involved sending time-request messages in each of three time distribution protocols: ICMP Timestamp, TIME and NTP, to every Internet address that could reasonably be associated with a working host. Previously, lists of such addresses were derived from the Internet host table maintained by the Network Information Center (NIC), which contained 6382 distinct host and gateway addresses as of August 1989. With the proliferation of the Internet domain-name system used to resolve host addresses from host names [20], the NIC host table has become increasingly inadequate as a discovery vehicle for working host addresses. 
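As an aside, the delay and offset computation defined in the previous section is easily made concrete. The following C sketch is illustrative only: the timestamps appear as double-precision seconds, whereas an actual NTP implementation operates on 64-bit fixed-point timestamps, and the sample values are hypothetical.

/* Illustrative computation of the NTP roundtrip delay and clock offset
 * from the four most recent timestamps, as defined in the text:
 *
 *   t1 = t(i-3)  client sends request (client clock)
 *   t2 = t(i-2)  server receives request (server clock)
 *   t3 = t(i-1)  server sends reply (server clock)
 *   t4 = t(i)    client receives reply (client clock)
 */
#include <stdio.h>

static void ntp_sample(double t1, double t2, double t3, double t4,
                       double *delay, double *offset)
{
    *delay = (t4 - t1) - (t3 - t2);           /* roundtrip delay d(i) */
    *offset = ((t2 - t1) + (t3 - t4)) / 2.0;  /* clock offset c(i) */
}

int main(void)
{
    /* Hypothetical exchange: server clock 5 ms fast, 30 ms transit
     * each way, 1 ms server turnaround. */
    double d, c;
    ntp_sample(0.000, 0.035, 0.036, 0.061, &d, &c);
    printf("delay = %.3f s, offset = %.3f s\n", d, c);  /* 0.060, 0.005 */
    return 0;
}

Note that only differences of timestamps enter the computation, which is why neither the transmission order of the messages nor reliable delivery matters.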
In a comprehensive survey of the domain-name system, Mark Lottor of SRI International recently compiled a revised host table of 137,484 entries. Each entry includes two lists, one containing the Internet addresses of a single host or gateway and the other containing its associated domain names. For the experiment this 9.4-megabyte table was sorted by address and extraneous information deleted, such as entries containing missing or invalid addresses, to produce a control file of 112,370 entries. The experiment itself was conducted with the aid of the control file and a specially constructed experiment program written for the Fuzzball operating system [15]. The data were collected using experiment hosts located at the University of Delaware and connected to the University of Delaware campus network and SURA regional network. The experiment program reads each entry from the control file in turn and sends time-request messages to the first Internet address found. If no reply is received after one second, the program tries again. If no reply is received after an additional second, the program abandons the attempt and moves to the next entry in the control file. The program accumulates error messages and sample data for up to eight samples in each of the three time protocols. It abandons a host upon receipt of an ICMP error message [7] and abandons further hosts on the same network upon receipt of an ICMP net-unreachable message. Using this procedure, attempts were made to read the clock for 107,799 distinct host addresses. In the experiment the clock offsets were measured for each of the three time protocols relative to the local clock used on the experiment host, which is synchronized via radio to NBS standards to within a few milliseconds. The maximum, minimum and mean offsets for up to eight replies for each protocol were computed and written to a statistics file, which contains valid responses, ICMP error messages of various kinds, timeout messages and other error indications. In the tabulation shown in Table 1<$&tab1> the timeout column shows the number of occasions when no reply was received, while the error column shows the error messages received, including ICMP time-exceeded, ICMP host-unreachable and ICMP port-unreachable messages. The unknown column tabulates occurrences of a specially marked ICMP Timestamp reply that indicates the host supports the protocol, but does not have a synchronized time-of-day clock. In summary, of the 107,799 host addresses surveyed, 94,260 resulted in some kind of entry in the statistics file. Of these, 20,758 hosts (22%) were successful in returning an apparently valid indication. Note that there may be more than one attempt to read a host clock and that some clocks were read using more than one protocol. The valid entries were then processed to delete all except the first entry received for each address and protocol. In addition, if a host replied to an NTP request, all other entries for that host were deleted, while, if a host did not reply to an NTP request, but did for a TIME request, all other entries for that host were deleted. This results in a list of 8455 hosts which provided an apparently valid time indication, including 3694 for ICMP Timestamp, 7666 for TIME and 789 for NTP. In order to discover as many NTP hosts as possible, the NTP synchronization subnet operating in the Internet was explored starting from the known primary servers using special monitoring programs designed for this purpose. 
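The probe schedule used in this survey (send a request, wait one second, retry once, then abandon the host) can be sketched as follows for the TIME protocol over UDP. This is a minimal illustration, not the actual Fuzzball experiment program: the function name probe_time is hypothetical, the target address is a documentation address, and error handling is abbreviated.

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/time.h>
#include <sys/select.h>
#include <sys/socket.h>

/* Query one host with the RFC-868 TIME protocol (UDP port 37), which
 * returns a 32-bit count of seconds since 1900.  Returns 0 on success,
 * -1 if the host is abandoned after two one-second timeouts. */
static int probe_time(const char *addr, uint32_t *secs1900)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in sin;
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(37);
    inet_pton(AF_INET, addr, &sin.sin_addr);

    for (int attempt = 0; attempt < 2; attempt++) {
        /* An empty datagram elicits the reply. */
        sendto(fd, "", 0, 0, (struct sockaddr *)&sin, sizeof(sin));
        fd_set rd;
        FD_ZERO(&rd);
        FD_SET(fd, &rd);
        struct timeval tv = { 1, 0 };          /* one-second timeout */
        if (select(fd + 1, &rd, NULL, NULL, &tv) > 0) {
            uint32_t net;
            if (recv(fd, &net, sizeof(net), 0) == (ssize_t)sizeof(net)) {
                *secs1900 = ntohl(net);
                close(fd);
                return 0;
            }
        }
    }
    close(fd);
    return -1;    /* abandon this host and move to the next entry */
}

int main(void)
{
    uint32_t s;
    if (probe_time("192.0.2.1", &s) == 0)      /* documentation address */
        printf("seconds since 1900: %u\n", s);
    else
        printf("no reply\n");
    return 0;
}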
This subnet search, together with the hosts discovered using the domain-name system and additional information gathered by other means, resulted in a total of about 990 NTP hosts. These hosts were then surveyed again, while keeping track of ancillary information to determine whether they were synchronized and operating correctly. This resulted in a list of 946 hosts apparently synchronized to the NTP subnet and operating correctly. The methodology used here can miss a sizeable number of NTP hosts, such as personal computers, hosts not listed in the NIC or domain-name database and implementations that do not respond to the monitoring programs. In fact, extrapolating from data assembled from personal communications, the grand search described here discovered much less than half of the NTP-speaking hosts. @HEAD LEVEL 2 = Evaluation of Timekeeping Accuracy by Protocol In evaluating the quality of standard time distribution it is important to understand the effects of errors on the applications using the service. For many applications the maximum error under all conditions is more important than the mean error under controlled conditions. In these applications conventional statistics such as mean and variance are inappropriate. A useful statistic has been found to be the error distribution plotted on log-log axes and showing the probability P(x > a) that a sample x from the population exceeds the value a on the x axis. Figure 3<$&fig2> shows the error distributions for each of the three time protocols included in the survey. The top line in Figure 3 is for ICMP Timestamp, the next down is for TIME and the bottom is for NTP. The graphs shown in Figure 3 suggest several conclusions. First, the time accuracy of the various hosts varies dramatically over at least nine decades from milliseconds to over 11 days. To be sure, not many hosts showed very large errors and there is cause to believe these hosts either were never synchronized or were operating improperly. In the case of NTP, for example, which is designed expressly for time synchronization, eight hosts showed errors above ten seconds, a value considered barely credible for a host correctly synchronized by NTP in the Internet. It is very likely that some or all of these hosts, representing about one percent of the total NTP population, were using an old NTP implementation with known bugs. On the other hand, one percent of the ICMP Timestamp hosts show errors greater than a day, while one percent of TIME hosts show errors greater than a few hours. Clearly, at least on some machines running the latter two protocols, time is not considered a cherished service. At the other end of the scale, Figure 3 suggests that at least 30 percent of the hosts in all three protocols make some attempt to maintain accurate time to about 30 ms with NTP, a minute with TIME and a couple of minutes with ICMP Timestamp. Between this regime and the one-percent regime the accuracies deteriorate; however, in general, NTP hosts maintain time about a thousand times more accurately than either of the other two protocols. @HEAD LEVEL 1 = NTP Performance Analysis and Measurement The above experiments were designed to assess the performance of all time servers that could be found in the Internet, regardless of protocol, system management discipline or protocol conformance. The remaining experiments described in this paper involve only the NTP protocol and the algorithms used in NTP implementations to synchronize the local clock. 
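For reference, the exceedance statistic used above is straightforward to compute: for each threshold a, count the fraction of error samples exceeding a and plot the result on log-log axes. A minimal sketch follows, with hypothetical sample values standing in for the survey data.

#include <stdio.h>

/* Empirical probability P(x > a) over a set of n error samples. */
static double exceed_prob(const double *x, int n, double a)
{
    int count = 0;
    for (int i = 0; i < n; i++)
        if (x[i] > a)
            count++;
    return (double)count / n;
}

int main(void)
{
    /* Hypothetical absolute offset errors in seconds (not survey data). */
    double err[] = { 0.003, 0.011, 0.020, 0.055, 0.300, 2.5, 40.0, 90000.0 };
    int n = sizeof(err) / sizeof(err[0]);

    /* Thresholds from 1 ms upward, one decade apart, for log-log plotting. */
    for (double a = 1e-3; a <= 1e6; a *= 10.0)
        printf("P(x > %g s) = %.3f\n", a, exceed_prob(err, n, a));
    return 0;
}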
@HEAD LEVEL 2 = Accuracy and Stability of NTP Primary Time Servers In this experiment a number of NTP primary time servers were surveyed for overall accuracy and stability. Primary servers are synchronized by radio or satellite to national standards and located at or near points of entry to national and international backbone networks. Since they are monitored and maintained on a regular basis, their performance can be taken as representative of a managed system. The experiment operated over a two-week period in August 1989 using paths between six primary servers on the east coast, west coast and midwest. All measurements were made from an experiment host located at the University of Delaware. Most of the paths involve links operating at 1.5 Mbps or higher, although there are over a dozen links on some paths and some lower-speed links are in use. Samples of roundtrip delay and clock offset were collected at intervals from one to 17 minutes on all six paths and the data recorded in files for later analysis. Table 2<$&tab2> shows the results of the survey, which involved about 33,000 samples. For each server the name, synchronization source, number of gateway/router hops and number of samples are shown. The offset and delay columns show the sample medians for these quantities in milliseconds. Note that the number of samples collected depends on whether the server is selected for clock synchronization, as determined by the NTP clock-selection algorithm described in [16]. As in previous surveys of this type, statistics based on the sample median yield more reliable results than those based on the sample mean. However, statistics based on the trimmed mean (also called Fault-Tolerant Average [10]) with 25 percent of the samples removed are within a millisecond of the values shown in Table 2. The residual offset errors apparent in Table 2 can be traced to subtle asymmetries in path routing and network/gateway configurations. If these can be calibrated, perhaps using a portable atomic clock, reliable time transfer over the Internet should be possible within a millisecond or two if measurements are made over periods consistent with the two-week experiment. Assuming successive offset measurements can be made with confidence to this order, frequency transfer over the Internet could in principle be determined to the order of <$E10 sup {-9}> in two weeks. In order to test this conjecture an experiment was designed to determine the stability of the apparent timescale constructed from the first-order offset differences produced in an experiment similar to that which produced Table 2. This is similar to the approach described in [1] to analyze the intrinsic characteristics of a precision oscillator. In the month-long experiment, measured offsets were filtered by the algorithm described in the next section. The resulting samples were averaged over intervals ranging from about a minute to about ten days. The difference in offsets at the beginning and end of the interval divided by the duration of the interval represents the frequency during that interval. The standard deviation <$Esigma ( tau )> calculated from the sample population for each given interval <$Etau> is shown in Figure 4<$&fig10>. Among the primary servers listed in Table 2, the lower curve represents the "best" one (UMD) and the upper curve the "worst" one (ISI). 
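The <$Esigma ( tau )> statistic just described can be sketched as follows, assuming a record of filtered offsets sampled at a uniform interval: each span of length <$Etau> yields a frequency estimate (offset difference divided by duration), and <$Esigma ( tau )> is the standard deviation of those estimates over the record. The function and variable names, and the synthetic data in the driver, are illustrative.

#include <math.h>
#include <stdio.h>

/* offs[i] is the filtered clock offset (s) at time i*step (s);
 * k = tau/step gives the averaging interval in samples. */
static double sigma_tau(const double *offs, int n, double step, int k)
{
    int m = n - k;                 /* number of frequency estimates */
    double sum = 0.0, sum2 = 0.0;
    for (int i = 0; i < m; i++) {
        /* Frequency over this interval, dimensionless (s/s); a value
         * of 1e-8 is roughly .01 ppm, about a millisecond per day. */
        double f = (offs[i + k] - offs[i]) / (k * step);
        sum += f;
        sum2 += f * f;
    }
    double mean = sum / m;
    return sqrt(sum2 / m - mean * mean);
}

int main(void)
{
    /* Synthetic record: 2e-8 s/s drift plus a small offset ripple,
     * sampled once per minute. */
    double offs[1000];
    for (int i = 0; i < 1000; i++)
        offs[i] = 2e-8 * (i * 60.0) + 1e-4 * sin(i / 10.0);
    printf("sigma(tau = 1 h) = %.2e\n", sigma_tau(offs, 1000, 60.0, 60));
    return 0;
}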
These results show that, even for the best server and using carefully filtered data averaged over periods in the order of days, reliable stabilities approaching .01 parts per million (ppm) - about a millisecond per day - are difficult to achieve without further processing. Techniques which can approach this goal will be presented later in this paper. @HEAD LEVEL 2 = Effects Due to Filtering Algorithm In order to more completely assess the accuracy and reliability with which clocks can be synchronized using NTP and the Internet, the paths listed in Table 2 were carefully measured in several surveys conducted over a period of 18 months. Each survey used up to six time servers and lasted up to two weeks. A typical survey involves the path between experiment hosts at the University of Delaware and USC Information Sciences Institute, located near Los Angeles, over a complex path of up to twelve network hops involving NSFNET, ARPANET and several other regional and campus nets. This path was purposely selected as among the statistically noisiest in order to determine how well clocks can be synchronized under adverse conditions. A number of algorithms for deglitching and filtering time-offset data are summarized in [12] and [18]. Experiments during the development of NTP Version 2 have produced an algorithm which provides high accuracy together with a low computational burden. The key to the new algorithm becomes evident through an examination of scatter diagrams plotting clock offset versus roundtrip delay. Without making any assumptions about the distributions of queueing and transmission delays in either direction along the path between two servers, but assuming the intrinsic frequency errors of the two clocks are relatively small, let <$Ed sub 0> and <$Ec sub 0> represent the delay and offset when no other traffic is present on the path; these represent the best estimates of the true values. The problem is to accurately estimate <$Ed sub 0> and <$Ec sub 0> from a sample population of <$Ed sub i> and <$Ec sub i> collected under typical conditions and varying levels of network load. Figure 5<$&fig3> shows a typical scatter diagram for the path under study, in which the points <$E( d sub i ,~c sub i )> are concentrated near the apex of a wedge defined by lines extending from the apex with slopes of ±0.5, corresponding to the locus of points as the delay in one direction increases while the delay in the other direction does not. From these data it is obvious that good estimators for <$E( d sub 0 ,~c sub 0 )> are points near the apex and that the best offset samples occur at the lower delays. Therefore, an appropriate technique is simply to select from the n most recent samples the sample with the lowest delay and use its associated offset as the estimate. This is the basis of the clock filter shown in Figure 2 and the NTP Version 2 algorithm described in detail in [16]. Figure 6<$&fig4> shows the raw time-offset series for the path under study over a six-day interval, in which occasional errors up to several seconds are apparent. Figure 7<$&fig5> shows the time-offset series produced by the filtering algorithm, in which the large errors have been dramatically reduced. Finally, the overall performance of the path is apparent from the error distributions shown in Figure 8<$&fig6>. The upper line shows the distribution for the raw data, while the lower line shows the filtered data. The significant facts apparent from the latter line are that the median error over all samples was only a few milliseconds, while the maximum error was no more than 50 ms. 
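A minimal sketch of this minimum-delay filter appears below. The structure and names are illustrative rather than taken from the NTP specification, which maintains a shift register on the order of eight samples per peer [16]; the register is assumed to be filled by the caller.

/* Minimum filter described above: from the n most recent (delay,
 * offset) samples, select the sample with the lowest roundtrip delay
 * and use its offset as the estimate, since samples near the apex of
 * the wedge scattergram are the most reliable. */
#include <stdio.h>

typedef struct {
    double delay;   /* roundtrip delay d(i), seconds */
    double offset;  /* clock offset c(i), seconds */
} sample_t;

static double clock_filter(const sample_t *reg, int n)
{
    int best = 0;
    for (int i = 1; i < n; i++)
        if (reg[i].delay < reg[best].delay)
            best = i;
    return reg[best].offset;
}

int main(void)
{
    /* Hypothetical register contents. */
    sample_t reg[4] = {
        { 0.120, 0.030 }, { 0.061, 0.004 }, { 0.290, 0.095 }, { 0.064, 0.006 }
    };
    printf("offset estimate: %.3f s\n", clock_filter(reg, 4));  /* 0.004 */
    return 0;
}

The design choice is deliberate: low-delay samples are the least likely to have been queued behind other traffic in either direction, so their offsets cluster near the true value <$Ec sub 0>.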
@HEAD LEVEL 2 = Effects due to Other Processing Algorithms Precision timekeeping requires an exceptionally stable local oscillator reference in order to deliver accurate time when the synchronization path to a primary server has failed. Furthermore, the oscillator and control loop must maintain accurate time and stable frequency over wide variations in synchronization path delays. For instance, in order to maintain time to within a millisecond per day without outside reference, the local oscillator frequency must maintain stability to within .01 ppm or better. Stabilities of this order usually require a relatively expensive oven-compensated quartz oscillator, which is not a common component in everyday computer systems. The NTP local clock model uses an adaptive-parameter, type-II, phase-locked loop (PLL), which continuously corrects local oscillator phase and frequency variations relative to updates received from the network or radio clock. The (open-loop) transfer function is @CENTER = <$EF(s)~=~{ omega sub c sup 2 } over { s sup 2 tau sup 2 }~(1~+~{ s tau } over { omega sub z})> , where <$Eomega sub c> is the gain (crossover frequency), <$Eomega sub z> the corner frequency of the lead network (necessary for PLL stability), and <$Etau> is a parameter used for bandwidth control. Bandwidth control is necessary to match the PLL dynamics to varying levels of timing noise due to the intrinsic stability of the local oscillator and the prevailing path delays in the network. On one hand, the loop must track uncompensated board-mounted crystals found in common computing equipment, where the frequency tolerance may be only .01 percent and can vary several ppm as the result of normal room temperature changes. On the other hand, after the frequency errors have been tracked for several days, and assuming the local oscillator can be stabilized accordingly, the loop must maintain stabilities to the order of .01 ppm. The NTP PLL is designed to adapt automatically to these regimes by measuring the sample variance and adjusting <$Etau> over a 16-fold range. In order to assess how closely the NTP PLL meets these objectives, the experiment described in Section 3.1 above was repeated, but with the local clock of the experiment host derived from a precision quartz oscillator. The offsets measured between each of the six primary servers and the experiment host were collected and processed by a simulator that duplicates the NTP processing algorithms. However, in addition to the algorithms described in [16], which select a subset of quality clocks and from them a single clock as the synchronization source, an experimental clock-combining method involving a weighted average of offsets from all selected clocks was used. In principle, such methods can reduce the effect of systematic offsets shown in Table 2 [2]. However, these methods can also significantly increase the sample variance presented to the PLL and thus reduce the local-clock stability below acceptable levels. Thus, the experiment represents a worst-case scenario. Figure 9<$&fig7> shows the frequency error distribution produced by the simulator using offset samples collected from all six primary servers over a four-week period. The results show that the maximum frequency error over the entire period from all causes is less than .02 ppm, or a couple of milliseconds per day. 
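To illustrate the loop dynamics just described, the following sketch simulates a discrete-time, type-II PLL disciplining a clock whose oscillator runs several ppm fast. Each measured offset corrects the phase directly and also feeds an integrator that learns the frequency error, so the residual offset is driven toward zero even though the drift persists. The gains, update interval and drift value are illustrative, not the adaptive NTP specification values, and the offset measurement is assumed perfect.

#include <stdio.h>

typedef struct {
    double freq;    /* accumulated frequency correction, s/s */
    double kp;      /* proportional (phase) gain */
    double ki;      /* integral (frequency) gain, 1/s^2 */
} pll_t;

/* Given the measured offset at this update, return the total clock
 * adjustment to apply over the next interval of dt seconds. */
static double pll_update(pll_t *p, double offset, double dt)
{
    p->freq += p->ki * offset * dt;        /* integrate: frequency term */
    return p->kp * offset + p->freq * dt;  /* phase term + frequency ramp */
}

int main(void)
{
    pll_t pll = { 0.0, 0.1, 1e-5 };
    double clock_err = 0.010;     /* start 10 ms off */
    double osc_drift = 5e-6;      /* oscillator runs 5 ppm fast */
    double dt = 64.0;             /* update interval, seconds */

    for (int i = 0; i < 200; i++) {
        clock_err += osc_drift * dt;              /* oscillator wander */
        clock_err -= pll_update(&pll, clock_err, dt);
        if (i % 40 == 0)                          /* freq approaches 5e-6 */
            printf("t=%6.0fs err=%+.6fs freq=%+.2e\n",
                   i * dt, clock_err, pll.freq);
    }
    return 0;
}

In the actual NTP local-clock model the effective loop gains vary with <$Etau>, which is the bandwidth adaptation described above; the fixed gains here correspond to a single operating point.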
During the four-week measurement period there were several instances where other servers failed and where severe congestion on some network paths caused weighting factors to change in dramatic ways and <$Etau> to be adjusted accordingly. Figure 9 may thus represent the bottom line on system performance at the present level of NTP technology refinement. @HEAD LEVEL 1 = Accuracy and Stability of Radio Synchronization In order to assess the overall system synchronization accuracy relative to UTC, it is necessary to consider the inherent accuracy, stability and precision of the radio propagation paths and radio clocks themselves. All of the radio clocks used in the surveys have a design precision within one millisecond and are potentially accurate to within a millisecond or two relative to the propagation medium. However, the absolute accuracy depends on knowledge of the radio propagation path to the source of standard time and frequency. In addition, the radio clocks themselves can be a source of random and systematic errors. @HEAD LEVEL 2 = Estimation of Propagation Delays An evaluation of the timekeeping accuracy of the NTP primary servers relative to national standards in principle requires calibration by a portable atomic clock; however, in the absence of a portable clock, the propagation delay can be estimated for the great-circle path between the known geographic coordinates of the transmitter and receiver. Even so, this can result in errors as large as two milliseconds when compared to the actual oblique ray path. Additional errors can be introduced by unpredictable latencies in the radio clocks, operating system, hardware and in the protocol software (e.g., encryption delays) for NTP itself. It is possible to estimate the timekeeping accuracy by means of a detailed analysis of the radio propagation path itself. In the case of the WWVB and MSF services on 60 kHz, the variations in path delay are relatively well understood and limited to the order of 50 microseconds [5]. In the case of the GOES service the accuracy is limited by the ability to accurately estimate the distance along the line-of-sight path to the satellite and the ability to maintain accurate stationkeeping in geosynchronous orbit. In principle, the estimation errors for either of these services are small compared to the accuracy usually expected of Internet timestamps generated with NTP. However, in the case of the WWV/H and CHU services, which operate on HF frequencies from 2.5 through 20 MHz, radio propagation is determined by the upper ionospheric layers, which vary in height throughout the day and night, and by the geometric ray path determined by the maximum usable frequency (MUF) and other factors, which also vary throughout the day, season and phase of the 11-year sunspot cycle. In an effort to determine how these effects limit the accuracy of the NTP primary servers using WWV/H and CHU services, existing computer programs were used to determine the MUF and propagation geometry for typical ionospheric conditions forecast for January 1990 on the 2476-km path between Newark, DE, and Fort Collins, CO, at two-hour intervals. The results, shown in Table 3<$&tab3>, assume a smoothed sunspot number (SSN) of 194 and include the time interval (UTC hour), MUF (MHz) and delay (ms) for frequencies from 2.5 through 20 MHz. In case no propagation path is likely, the delay entry is left blank. 
The delay itself is followed by a code indicating whether the path is entirely in sunlight (j), in darkness (n) or mixed (x) and the number of hops. A symbol (m) indicates two or more geometric paths are likely with similar amplitudes, which may result in multipath fading and unstable indications. From Table 3 it can be seen that the delay decreases as the controlling ionospheric layer (F2) falls during the night (to about 250 km) and rises during the day (to about 350 km). The delay also changes when the number of hops and thus the oblique ray geometry changes. The maximum delay variation for this particular path is from 8.6 to 9.7 ms, a variation of 1.1 ms. While this variation represents a typical scenario, other scenarios have been found where the variations exceed two milliseconds. These results demonstrate that the ultimate accuracy of HF-radio derived NTP time may depend on the ability to accurately estimate the propagation path variations or to confine observations to the same time each day. @HEAD LEVEL 2 = Accuracy and Stability of Radio Clocks The final experiment reported in this paper involves an assessment of the accuracy and stability of a commercial WWV/H radio clock under typical propagation conditions. In order to separate these effects from those due to the measurement host, the local clock was derived from a precision oven-compensated quartz oscillator with rated stability of <$E5 times 10 sup {-9}> per day and aging rate of <$E1 times 10 sup {-9}> per day. The oscillator was set to within about <$E1 times 10 sup {-8}> relative to the 20-MHz WWV transmission under good propagation conditions near midday at the midpoint of the propagation path. The offsets of the radio clock relative to the local clock were filtered and processed by the NTP algorithms (open loop) and then recorded at 30-second intervals for a period of about two weeks. The results of the experiment are shown in Figure 10<$&fig8> and Figure 11<$&fig9>. Figure 10 shows the estimated frequency error by intervals for the entire period and reveals a frequency stability generally within .05 ppm, except for occasional periods where apparent phase hits cause the indications to surge. The times of these surges are near times when the path MUF between the transmitter and receiver is changing rapidly (see Table 3) and the receiver must change operating frequency to match. An explanation for the surges is evident in Figure 11, which shows the measured offsets during an interval including a typical surge. The figure shows a negative phase excursion of about 10 ms near the time the MUF would ordinarily fall in the evening and a similar positive excursion near the time the MUF would ordinarily rise in the morning. Since the phase excursions are far beyond those expected due to ionospheric effects alone, the most likely explanation is that the increased noise in received WWV/H signals near the time of MUF-related frequency changes destabilizes the signal processing algorithms, resulting in incorrect signal tracking. This particular problem has not been observed with WWVB or GOES radio clocks. @HEAD LEVEL 1 = Conclusions Over the years it has become something of a challenge to discover and implement architectures, algorithms and protocols which deliver precision time in a statistically rambunctious Internet. In perspective, for the ultimate accuracy in frequency and time transfer, navigation systems such as LORAN-C, OMEGA and GPS, augmented by portable atomic clocks, are the preferred method. 
On the other hand, it is of some interest to identify the limitations and estimate the magnitude of timekeeping errors using NTP and typical Internet hosts and network paths. This paper has identified some of what are believed to be the major limitations in accuracy and measured their effects in large-scale experiments involving major portions of the Internet. The results demonstrated in this paper suggest several improvements that can be made in subsequent versions of the protocol and hardware/software implementations, such as improved radio clock designs, improved timebase hardware, at least at the primary servers, improved frequency-estimation algorithms and more diligent monitoring of the synchronization subnet. When a sufficient number of these improvements mature, NTP Version 3 may appear. @HEAD LEVEL 1 = References @INDENT HEAD = 1. @INDENT = Allan, D.W., J.H. Shoaf and D. Halford. Statistics of time and frequency data analysis. In: Blair, B.E. (Ed.). Time and Frequency Theory and Fundamentals. National Bureau of Standards Monograph 140, U.S. Department of Commerce, 1974, 151-204. @INDENT HEAD = 2. @INDENT = Allan, D.W., J.E. Gray and H.E. Machlan. The National Bureau of Standards atomic time scale: generation, stability, accuracy and accessibility. In: Blair, B.E. (Ed.). Time and Frequency Theory and Fundamentals. National Bureau of Standards Monograph 140, U.S. Department of Commerce, 1974, 205-231. @INDENT HEAD = 3. @INDENT = Bertsekas, D., and R. Gallager. Data Networks. Prentice-Hall, Englewood Cliffs, NJ, 1987. @INDENT HEAD = 4. @INDENT = Beser, J., and B.W. Parkinson. The application of NAVSTAR differential GPS in the civilian community. Navigation 29, 2 (Summer 1982). @INDENT HEAD = 5. @INDENT = Blair, B.E. Time and frequency dissemination: an overview of principles and techniques. In: Blair, B.E. (Ed.). Time and Frequency Theory and Fundamentals. National Bureau of Standards Monograph 140, U.S. Department of Commerce, 1974, 233-313. @INDENT HEAD = 6. @INDENT = Cole, R., and C. Foxcroft. An experiment in clock synchronisation. The Computer Journal 31, 6 (1988), 496-502. @INDENT HEAD = 7. @INDENT = Defense Advanced Research Projects Agency. Internet Control Message Protocol. DARPA Network Working Group Report RFC-792, USC Information Sciences Institute, September 1981. @INDENT HEAD = 8. @INDENT = Frank, R.L. History of LORAN-C. Navigation 29, 1 (Spring 1982). @INDENT HEAD = 9. @INDENT = Gusella, R., and S. Zatti. The Berkeley UNIX 4.3BSD time synchronization protocol: protocol specification. Technical Report UCB/CSD 85/250, University of California, Berkeley, June 1985. @INDENT HEAD = 10. @INDENT = Kopetz, H., and W. Ochsenreiter. Clock synchronization in distributed real-time systems. IEEE Trans. Computers C-36, 8 (August 1987), 933-939. @INDENT HEAD = 11. @INDENT = Mills, D.L. DCN local-network protocols. DARPA Network Working Group Report RFC-891, M/A-COM Linkabit, December 1983. @INDENT HEAD = 12. @INDENT = Mills, D.L. Algorithms for synchronizing network clocks. DARPA Network Working Group Report RFC-956, M/A-COM Linkabit, September 1985. @INDENT HEAD = 13. @INDENT = Mills, D.L. Experiments in network clock synchronization. DARPA Network Working Group Report RFC-957, M/A-COM Linkabit, September 1985. @INDENT HEAD = 14. @INDENT = Mills, D.L. Network Time Protocol (version 1) specification and implementation. DARPA Network Working Group Report RFC-1059, University of Delaware, July 1988. @INDENT HEAD = 15. @INDENT = Mills, D.L. The fuzzball. Proc. 
ACM SIGCOMM 88 Symposium (Palo Alto, CA, August 1988), 115-122. @INDENT HEAD = 16. @INDENT = Mills, D.L. Network Time Protocol (version 2) specification and implementation. DARPA Network Working Group Report RFC-1119, University of Delaware, September 1989. @INDENT HEAD = 17. @INDENT = Mills, D.L. Measured performance of the Network Time Protocol in the Internet system. DARPA Network Working Group Report RFC-1128, University of Delaware, October 1989. @INDENT HEAD = 18. @INDENT = Mills, D.L. Internet time synchronization: the Network Time Protocol. DARPA Network Working Group Report RFC-1129, University of Delaware, October 1989. @INDENT HEAD = 19. @INDENT = Mitra, D. Network synchronization: analysis of a hybrid of master-slave and mutual synchronization. IEEE Trans. Communications COM-28, 8 (August 1980), 1245-1259. @INDENT HEAD = 20. @INDENT = Mockapetris, P. Domain names - concepts and facilities. DARPA Network Working Group Report RFC-1034, USC Information Sciences Institute, November 1987. @INDENT HEAD = 21. @INDENT = Time and Frequency Dissemination Services. NBS Special Publication 432, U.S. Department of Commerce, 1979. @INDENT HEAD = 22. @INDENT = Postel, J. Daytime protocol. DARPA Network Working Group Report RFC-867, USC Information Sciences Institute, May 1983. @INDENT HEAD = 23. @INDENT = Postel, J. Time protocol. DARPA Network Working Group Report RFC-868, USC Information Sciences Institute, May 1983. @INDENT HEAD = 24. @INDENT = Su, Z. A specification of the Internet protocol (IP) timestamp option. DARPA Network Working Group Report RFC-781, SRI International, May 1981. @INDENT HEAD = 25. @INDENT = Vass, E.R. OMEGA navigation system: present status and plans 1977-1980. Navigation 25, 1 (Spring 1978).