MadChuckle: Why Floating-Point Arithmetic is Just Wrong?!

2010-04-28

Why Floating-Point Arithmetic is Just Wrong?!

Mused by madchuckle

Labels: Programming Languages, Python, Software Development

...or why 0.1 + 0.1 + 0.1 - 0.3 is NOT equal to zero!

I know that floating point operations in computer languages are inherently wrong! In fact every seasoned developer knows this, often with some painful memories of bug-hunting sessions in the past. But newcomers might be surprised by this behavior.

So, let me first explain what is wrong with floating point numbers. Let's use Python as our guinea pig:



>>> f1 = 0.1
>>> f2 = f1+f1+f1
>>> f3 = 0.3
>>> f4 = f3 - f2
>>> f4
-5.5511151231257827e-017

Oops! Not what one expects I guess. The reason for the above result can be traced back to:



>>> f = 0.1
>>> f
0.10000000000000001

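So, the 'float' structure cannot hold the value of 0.1 exactly. What it actually holds is 0.1000000000000000055511151231257827021181583404541015625, and the Python string representation used here shows only the first 17 significant digits of that. (Python 3.1 and later print the shortest string that round-trips, so they show 0.1 here, but the stored value is the same.)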
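
One way to peek at the value a float really stores is to convert it to a Decimal, since that conversion is exact. This is only a small sketch; it assumes Python 3 (Decimal accepts a float directly from 2.7 / 3.2 onward), and the variable names are just mine for illustration.

```python
from decimal import Decimal

f = 0.1

# Converting a float to Decimal is exact, so this shows every stored digit.
exact = Decimal(f)
print(exact)

# The same value written as a ratio of two integers.
# The denominator is a power of two (2**55), which is why 0.1 cannot fit exactly.
numerator, denominator = f.as_integer_ratio()
print(numerator, denominator)
```

Because the denominator can only be a power of two, any decimal fraction whose denominator has a factor of 5 ends up as the nearest binary neighbour rather than the number itself.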

This is not a bug, but a 'feature' of the float structure. To represent real numbers with a limited amount of memory, it uses a special bit structure. The details are not that important here, but if you want to know them you can check out this excellent explanation and this Wikipedia article.

IEEE754 Double Floating Point Format

What counts is that floating point precision is not perfect and tiny inaccuracies can build up over a long sequence of operations. A consequence is that one should not compare two calculated float numbers for exact equality with '=='; check that their difference is smaller than some small tolerance, or stick to '>' and '<' comparisons, to avoid ugly bugs.
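
To make the comparison problem concrete, here is a minimal sketch of an exact check failing and a tolerance-based check passing. It assumes Python 3.5 or newer for math.isclose; the EPSILON value and the variable names are arbitrary choices of mine, not a recommendation.

```python
import math

total = 0.1 + 0.1 + 0.1

# Exact equality fails because of the tiny representation error.
exactly_equal = (total == 0.3)
print(exactly_equal)

# Comparing with a relative tolerance succeeds.
close_enough = math.isclose(total, 0.3, rel_tol=1e-9)
print(close_enough)

# A hand-rolled absolute-tolerance check; EPSILON is an arbitrary pick.
EPSILON = 1e-12
within_epsilon = abs(total - 0.3) < EPSILON
print(within_epsilon)

# Errors build up: adding 0.1 ten times does not land exactly on 1.0.
acc = 0.0
for _ in range(10):
    acc += 0.1
print(acc == 1.0, acc)
```

The right tolerance depends on the size of the numbers involved, so a value that works here may be too strict or too loose elsewhere.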

Another important thing is that as numbers get bigger in value, the precision left for the fractional part gets smaller, because a float has only a finite number of bits (53 for the significand of a double) to share between the integer and fractional parts.
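
A quick sketch of this effect, assuming the usual IEEE 754 double behind Python's float. From 2**53 upward, neighbouring doubles are 2 apart, so small additions vanish.

```python
big = float(2 ** 53)  # 9007199254740992.0

# Adding 1 falls between two representable values and is rounded away.
lost_one = (big + 1.0 == big)
print(lost_one)

# A fractional part simply disappears at this magnitude.
lost_half = (big + 0.5 == big)
print(lost_half)

# Near 1.0 the same small addition is kept without trouble.
kept_half = (1.0 + 0.5 != 1.0)
print(kept_half)
```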

So, that's a quick background on the subject. The reason I am thinking about this now is that my viewshed analysis code is fully working, and I am trying to assess whether I should have used alternative, more precise methods instead of floating point arithmetic. As a Python newbie, this led me to some research on the subject in the Python world. I'm not surprised that Python has a 'decimal' structure for arbitrary precision arithmetic.

Python's Decimal class provides a solution for precision needs where using 'float' is not acceptable. So, what are the reasons to use and not to use 'float'? We can quickly work out a list for that:

When to use 'FLOAT'

When very high precision levels are not needed
When perfect equality between calculated values is not checked for
When the numbers are not huge in value
When speed is more important than accuracy

When NOT to use 'FLOAT' (and use 'DECIMAL')

When a decimal literal must be stored as exactly the same number
When very high precision levels are needed
When very big numbers should be represented with all their digits
When the precision level should be adjustable or fixed to a certain number of digits
When scientifically perfect rounding should be applied using significant digits in operations
When doing financial calculations involving money values
When accuracy is more important than speed
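
Without going into the details, a few points of this list can be illustrated with a tiny sketch. It uses only the standard decimal module; the variable names and the sample values are made up for illustration.

```python
from decimal import Decimal, localcontext, ROUND_HALF_UP

# Build Decimals from strings so the literal is stored exactly as written.
d = Decimal("0.1")
total = d + d + d
difference = total - Decimal("0.3")
print(difference)                 # exactly zero this time
print(total == Decimal("0.3"))

# The working precision (in significant digits) is adjustable.
with localcontext() as ctx:
    ctx.prec = 50
    seventh = Decimal(1) / Decimal(7)
print(seventh)

# Money-style rounding to two places with an explicit rounding rule.
price = Decimal("2.675")
rounded = price.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
print(rounded)
```

Note that the exactness depends on creating the Decimal from a string; building it from the float 0.1 would carry the float's error along.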

I will not go into the details of using Python's decimal module in this post. But the following are some starting points about multiprecision arithmetic for the curious:

Python's Official Decimal Documentation
MPMath: Highly advanced Python library for multiprecision floating-point arithmetic
Python's New Fraction Module: For directly operating on rational numbers
The General Multiprecision PYthon project (GMPY): Wraps C/C++ libraries for Python starting with the GNU Multiple Precision (GMP) library
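
As a small taste of the rational-number route, here is a sketch using the standard fractions module on the same 0.1 example. The variable names are mine.

```python
from fractions import Fraction

tenth = Fraction(1, 10)
total = tenth + tenth + tenth
print(total)                     # 3/10
print(total - Fraction(3, 10))   # 0

# A Fraction built from a string is exact.
from_string = Fraction("0.1")
print(from_string == tenth)

# Built from a float, it captures the float's binary value instead.
from_float = Fraction(0.1)
print(from_float == tenth)
```

Rational arithmetic stays exact for addition, subtraction, multiplication and division, at the price of numerators and denominators that can grow large.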

So, should I use Decimal in Python? Based on the above criteria the answer is 'no, not really', because the algorithm is already running very slowly. Still, I plan to implement the decimal calculations as an alternative option to see whether the differences are big enough to matter. I'll let you know about the results in terms of both speed and accuracy.
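
For anyone who wants to try a comparison like this themselves, a rough sketch with the standard timeit module might look like the following. It is not my viewshed code; sum_floats and sum_decimals are hypothetical stand-ins, and the timings will differ from machine to machine.

```python
import timeit
from decimal import Decimal

def sum_floats(n=1000):
    """Add 0.1 to a float accumulator n times."""
    total = 0.0
    for _ in range(n):
        total += 0.1
    return total

def sum_decimals(n=1000):
    """Add Decimal('0.1') to a Decimal accumulator n times."""
    total = Decimal("0")
    step = Decimal("0.1")
    for _ in range(n):
        total += step
    return total

# Time 200 runs of each version; smaller is faster.
float_seconds = timeit.timeit(sum_floats, number=200)
decimal_seconds = timeit.timeit(sum_decimals, number=200)
print("float:  ", float_seconds)
print("decimal:", decimal_seconds)

# Compare the accuracy of the two results as well.
print(sum_floats(), sum_decimals())
```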

MadChuckle

summary: 'perfection is a b*tch!'
mood: let down
