Seminar | Mathematics and Computer Science

Floating-Point Arithmetic through Dekker’s System

LANS Seminar

Abstract: In this seminar, we will examine a unique floating-point number system that does not require uniqueness in its number representation. This system appeared in Dekker’s seminal 1971 paper, “A floating-point technique for extending the available precision,” which is often credited as the foundation of popular floating-point emulation and error-free transformation techniques. We will pay special attention to how convenient this system is for theoretical analysis and how it allows general statements to be made succinctly.

We will build things from the ground up, paper-and-pencil style. We will start with how floating-point numbers are represented, derive intuitive and unintuitive properties of floating-point arithmetic, demonstrate how these properties can be leveraged to design highly accurate algorithms and, if time permits, connect this knowledge to computer system validation.
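
As a taste of the kind of property involved, here is a minimal Python sketch of Fast2Sum, an error-free transformation that goes back to Dekker’s paper. The abstract does not say which algorithms the talk will cover, so the choice of example, the name `fast_two_sum`, and the input values are assumptions made for illustration. The sketch assumes binary floating point with round-to-nearest (as with Python floats on common hardware), |a| >= |b|, and no overflow.

```python
from fractions import Fraction


def fast_two_sum(a, b):
    """Return (s, e) where s is the rounded sum a + b and e is its rounding error.

    Illustrative sketch: assumes |a| >= |b|, binary round-to-nearest
    arithmetic, and no overflow. Under those assumptions s + e equals
    a + b exactly.
    """
    s = a + b          # rounded sum
    z = s - a          # the part of b that made it into s
    e = b - z          # the part of b that was rounded away
    return s, e


a, b = 1.0, 2.0 ** -60          # b is too small to change a when added
s, e = fast_two_sum(a, b)

# Check exactness with rational arithmetic, which has no rounding.
assert Fraction(s) + Fraction(e) == Fraction(a) + Fraction(b)
print(s, e)
```

The rounded sum `s` alone loses the small term, while the pair `(s, e)` retains it exactly; the check uses `fractions.Fraction` so that the comparison itself is free of rounding.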

No particular background is expected from the audience, though properties of real and integer numbers may be used without explanation (sweet bites may be provided in their stead).