10.2 Rounding errors

Although very useful, the enumerate function has a problem: when the increment \(\Delta_t\) is not an integer, its successive addition will accumulate rounding errors, which may cause a bizarre behavior. To exemplify this situation, let us consider the number of elements of two enumerations of the interval \([0,1]\), the first with an increment of \(\frac{1}{10}\) and the second with an increment of \(\frac{1}{100}\):

> length(enumerate(0, 1, 1//10))

11

> length(enumerate(0, 1, 1//100))

101

Since the enumerate function produces an array that includes both the first and the last element of the interval, we conclude that the number of generated elements is correct. However, if we replace \(\frac{1}{10}\) and \(\frac{1}{100}\) by the equivalent \(0.1\) and \(0.01\) we obtain something strange:

> length(enumerate(0, 1, 0.1))

11

> length(enumerate(0, 1, 0.01))

100

Obviously, there is a problem there: apparently, the second enumeration ended earlier than expected, failing to produce the final element. Worse still, this situation seems to have an irregular behavior, as we can see in the following interactions:

> length(enumerate(0, 1, 0.001))

1000

> length(enumerate(0, 1, 0.0001))

10001

The problem arises from the fact that \(0.1\) and \(0.01\) and are not accurately representable in binary notation. In fact, \(0.1\) is represented by a number that is slightly smaller than \(\frac{1}{10}\), while \(0.01\) is represented by a number slightly bigger than \(\frac{1}{100}\). As a consequence, a sufficiently large sum of these numbers eventually accumulates an error that makes the final calculated value to be greater than the limit, and thus be discarded. The history of computer science is filled with catastrophic examples caused by rounding errors, from mistakes in anti-missile defense systems to the wrong calculation of stock market indexes.

To avoid this problem, it would be natural to think that we should limit ourselves to the use of increments in the form of fractions but, unfortunately, that is not always possible. For example, when we use trigonometric functions (such as sines and cosines), it is usual to generate values in a range with an increment involving the irrational number \(\pi\), not representable as a fraction. Thus, to deal with real numbers properly, we should use another approach based on producing the number of actually desired values, avoiding successive sums. For that, instead of using the increment as a parameter, we will use the number of desired values instead.

In its actual form, from the limits \(t_0\) and \(t_1\) and the increment \(\Delta_t\), the enumerate function generates each of the points \(t_i\) by calculating:

\[t_i=t_0+\underbrace{\Delta_t+\Delta_t+\cdots{}+\Delta_t}_{\text{$i$ times}} \equiv\] \[t_i=t_0+\sum_{j=0}^{i}\Delta_t\]

The problem of the previous formula is, as we have seen, the potentially large error accumulation that occurs when we add many times the small error that exists in the value of \(\Delta_t\). To minimize this error accumulation, we must find an alternative formula that avoids the term \(\Delta_t\). A possibility is to consider, not an increment of \(\Delta_t\), but an \(n\) number of \(\Delta_t\) increments. Obviously, this \(n\) number relates to \(\Delta_t\) through the equation

\[n=\frac{t_1-t_0}{\Delta_t}\]

Consequently, we also have

\[\Delta_t=\frac{t_1-t_0}{n}\]

Replacing \(\Delta_t\) in the equation that calculates the points \(t_i\), we obtain

\[t_i=t_0+\sum_{j=0}^{i}\Delta_t \equiv\] \[t_i=t_0+\sum_{j=0}^{i}\frac{t_1-t_0}{n} \equiv\] \[t_i=t_0+\frac{t_1-t_0}{n}\sum_{j=0}^{i}1 \equiv\] \[t_i=t_0+\frac{t_1-t_0}{n}i\]

The fundamental aspect of the last equation is that it no longer corresponds to a sum of \(i\) terms where rounding error accumulation can occur, but to a function \(f(i)=t_0+\frac{t_1-t_0}{n}i\), where \(i\) causes no error since it is an integer that evolves from \(0\) to \(n\). From this number it is now possible to calculate the value of \(t_i\) that, although it may have the inevitable rounding errors caused by not using fraction numbers, it will not have an accumulation of these errors.

Based on this last formula, it is now easy to write a new function enumerate_n that directly computes the value of \(t_i\) from the number \(n\) of desired increments in the interval \([t_0, t_1]\):

enumerate_n(t0, t1, n) =

  map(i -> t0+(i*(t1-t0))/n,

      enumerate(0, n, 1))

The enumerate_n function is actually predefined in Khepri, with the name division. The division function, like the enumerate_n function, receives as parameters the interval limits and the number of increments:

> division(0, 1, 4)

5-element Vector{Float64}:

 0.0

 0.25

 0.5

 0.75

 1.0

> division(0, pi, 4)

5-element Vector{Float64}:

 0.0

 0.7853981633974483

 1.5707963267948966

 2.356194490192345

 3.141592653589793

Unlike the enumerate_n function, the division function also has an optional parameter (by omission, the value true) that indicates whether the last element of the interval is included. This option is intended to facilitate the use of periodic functions. To better understand this parameter, let us consider that we want to place four objects at the four cardinal points. For this, we can use polar coordinates to divide the circle into four parts, with the angles \(0\), \(\frac{\pi}{2}\), \(\pi\), and \(\frac{3\pi}{2}\). However, if we use the expression division(0, 2*pi, 4) we will not only get those values but also the upper limit of the interval \(2\pi\), which would imply placing two overlapping objects, one for the angle \(0\) and another for the angle \(2\pi\). Naturally, we can solve the problem by using division(0, 3*pi/2, 3), but that seems much less natural than writing division(0, 2*pi, 4, false). That is, we want to divide \(2\pi\) in four pieces but we do not want the last value (because it will be equal to the first).