Oversleeping: errors in delays

The tick size becomes important just about every time you ask the kernel to do something related to pausing or delaying your process.

This includes calls to the following functions:

select()
alarm()
nanosleep()
nanospin()
delay()
the whole family of timer_*() functions

Normally, you use these functions assuming they'll do exactly what you say: “Sleep for 8 seconds!”, “Sleep for 1 minute!”, and so on. Unfortunately, you get into problems when you say “Sleep for 1 millisecond, ten thousand times!”

Delaying for a second: inaccurate code

Does this code work assuming a 1 ms tick?

void OneSecondPause() {

  /* Wait 1000 milliseconds. */
  for ( i=0; i < 1000; i++ ) delay(1);
}

Unfortunately, no, this won't return after one second on IBM PC hardware. It'll likely wait for three seconds. In fact, when you call any function based on the nanosleep() or select() functions, with an argument of n milliseconds, it actually takes anywhere from n to infinity milliseconds. But more than likely, this example will take three seconds.

So why exactly does this function take three seconds?

Timer quantization error

What you're seeing is called timer quantization error. One aspect of this error is actually something that's so well understood and accepted that it's even documented in a standard: the POSIX Realtime Extension (1003.1b-1993/1003.1i-1995). This document says that it's all right to delay too much, but it isn't all right to delay too little—the premature firing of a timer is undesirable.

Since the calling of delay() is asynchronous with the running of the clock interrupt, the kernel has to add one clock tick to a relative delay to ensure the correct amount of time (consider what would happen if it didn't, and a one-tick delay was requested just before the clock interrupt went off).

Figure 1. A single 1 ms sleep with error.

That normally adds half a millisecond each time, but in the example given, you end up synchronized with the clock interrupt, so the full millisecond gets tacked on each time.

Figure 2. Twelve 1 ms sleeps with each one's error.

The small error on each sleep accumulates:

Figure 3. Twelve 1 ms sleeps with the accumulated error.

OK, that should make the loop last 2 seconds—where's the extra second coming from?

The tick and the hardware timer

The problem is that when you request a 1 ms tick rate, the kernel may not be able to actually give it to you because of the frequency of the input clock to the timer hardware. In such cases, it chooses the closest number that's faster than what you requested. In terms of IBM PC hardware, requesting a 1 ms tick rate actually gets you 999,847 nanoseconds between each tick. With the requested delay, that gives us the following:

1,000,000 ns + 999,847 ns = 1,999,847 ns of actual delay
1,999,847 ns / 999,847 ns = 2.000153 ticks before the timer expires

Since the kernel expires timers only at a clock interrupt, the timer expires after ceil(2.000153) ticks, so each delay(1) call actually waits:

999,847 ns * 3 = 2,999,541 ns

Multiply that by 1000 for the loop count, and you get a total loop time of 2.999541 seconds.

Delaying for a second: better code

So this code should work?

void OneSecondPause() {

    /* Wait 1000 milliseconds. */
   for ( i=0; i < 100; i++ ) delay(10);
}

It will certainly get you closer to the time you expect, with an accumulated error of only 1/10 of a second.