Should Floating Point Really Be The Default?

In an issue on Red's GitHub, Gregg gives this example:

red>> n: 0  while [n <= 0.3][print mold/all n  n: n + 0.1]
0
0.1
0.2
; Huh?

red>> n: 0  while [n <= 3][print mold/all n  n: n + 1]
0
1
2
3
; OK. I'm not crazy

red>> n: 0  while [n <= 0.4][print mold/all n  n: n + 0.1]
0
0.1
0.2
0.30000000000000004
0.4
; What... Oh... How do I work around that?
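For comparison, the same loop in Python--which also uses IEEE doubles--stumbles at exactly the same step:

```python
# Accumulating 0.1 with binary floats: the error appears at the third addition
n = 0.0
steps = []
while n <= 0.4:
    steps.append(repr(n))
    n += 0.1
print(steps)
# ['0.0', '0.1', '0.2', '0.30000000000000004', '0.4']
```

Same doubles, same surprise: 0.1 has no exact binary representation, so the error is baked into the literal before the loop even starts.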

Sure, you can point everyone to "What Every Computer Scientist Should Know About Floating Point Arithmetic" and say it's not your problem. I mean geez, it's only a mere 50 or 60 pages long.

But as Gregg says:

"Yes, we can explain it. Yes, not everyone will hit this. But [...] we don't just look like we're passing problems on and throwing our hands in the air, saying it's not our problem; it's part of our mission, and being empathetic to pain points is something people can hold up as examples of why to use Red, rather than why not to use Red."

This makes me wonder about the choice of floating point as a default. Yes, it's in the hardware. Yes, you should be able to use floats somehow. But they may not fit the language's goals, considering examples like the one above. So I thought I'd ask the AIs about it.


Note: The MONEY! datatype in R3-Alpha is not an IEEE floating-point type. It's something Ladislav wrote:

Deci significands are 87-bit long, unsigned, unnormalized, stored in little endian order. (Maximal deci significand is 1e26 - 1, i.e. 26 nines)

Sign is one-bit, 1 means nonpositive, 0 means nonnegative.

Exponent is 8-bit, unbiased.
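As a quick sanity check on those numbers: 26 nines needs exactly 87 bits, so the quoted limits are self-consistent.

```python
max_significand = 10**26 - 1          # 26 nines, the stated maximum
print(max_significand.bit_length())   # 87 -- exactly fills an 87-bit field
print(2**87 > max_significand)        # True
```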

ChatGPT


Languages that avoid using floating-point representation for decimal numbers typically do so to provide better precision for certain kinds of calculations, especially in financial and business applications where exact decimal representation is critical.

Prominent Examples of Languages Avoiding Floating-Point as Default:

  1. COBOL:

    • What Went Right: COBOL uses fixed-point decimal arithmetic by default, which is crucial for financial calculations where exact precision is necessary. This choice made COBOL a staple in business, finance, and government systems, where avoiding floating-point errors is essential.

    • What Went Wrong: COBOL’s strict adherence to fixed-point arithmetic limited its applicability outside of these specific domains. It also meant that performance in non-financial contexts was slower compared to languages using floating-point arithmetic.

  2. Rexx:

    • What Went Right: Rexx, designed for ease of use in scripting and text processing, also opts for fixed-point arithmetic by default. This choice made it easy to write scripts without worrying about floating-point imprecision.

    • What Went Wrong: Rexx’s simplicity and fixed-point arithmetic made it less suitable for scientific and engineering applications where floating-point is necessary. As a result, its usage was limited to simpler scripting tasks.

  3. Ada:

    • What Went Right: Ada offers both fixed-point and floating-point types, giving developers the choice based on the application's needs. For safety-critical and financial applications, fixed-point types are used to ensure precision.

    • What Went Wrong: The flexibility came at the cost of complexity, as developers had to understand and choose the correct type for each use case. This complexity made Ada less popular than more straightforward languages like C or Python.

Consequences and Lessons Learned:

  • Precision vs. Performance: Languages that prioritize precision over performance by avoiding floating-point as the default are often limited to domains where that precision is paramount. They may lack the versatility and speed of languages that embrace floating-point arithmetic.

  • Adoption and Popularity: Fixed-point arithmetic can make a language less appealing for general-purpose programming, particularly in fields like scientific computing, graphics, and games where floating-point performance is critical.

  • Niche Success: Languages that avoid floating-point arithmetic find success in niche areas where their precision is a strong advantage, but they struggle to gain traction outside those niches.

Modern Developments:

Some modern languages, like Python, attempt to bridge the gap by offering multiple numeric types (e.g., float and decimal) and making it easy to choose the right one for the job. However, this still requires developers to be aware of the underlying issues and make conscious decisions about numeric types.

In summary, while avoiding floating-point arithmetic can prevent certain types of errors, it also limits the language's applicability and performance in domains where floating-point is the appropriate tool. The success or failure of these languages often hinges on how well they serve their intended niche and how easily developers can adapt to or work around their limitations.

Claude


Languages with Alternative Decimal Representations

  1. Python

    • While Python uses floating-point numbers by default, it also provides a Decimal type in its standard library.

    • The Decimal type offers precise decimal representation and arithmetic.

    • Fate: Widely successful language, with the Decimal type being particularly useful in financial applications.

  2. Julia

    • Julia uses floating-point by default but provides arbitrary-precision arithmetic via the BigFloat type.

    • It also offers a Rational type for exact representation of ratios.

    • Fate: Growing popularity in scientific computing and numerical analysis.

  3. Scala

    • Scala, running on the JVM, inherits Java's BigDecimal class for arbitrary-precision decimal arithmetic.

    • Fate: Successful in both industry and academia, particularly for big data processing.

  4. Haskell

    • Haskell provides a Rational type for exact representation of ratios.

    • It also has libraries for arbitrary-precision decimals.

    • Fate: Influential in programming language research and functional programming communities.

  5. Erlang

    • Erlang uses arbitrary-precision integers by default and has a rational module for exact rational arithmetic.

    • Fate: Successful in telecommunications and distributed systems.
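The Rational types mentioned for Julia and Haskell have a direct analogue in Python's fractions.Fraction, which makes the loop from the top of this post exact (a sketch, using Python as a stand-in):

```python
from fractions import Fraction

n = Fraction(0)
step = Fraction(1, 10)   # exactly one tenth, no binary rounding
seen = []
while n <= Fraction(4, 10):
    seen.append(str(n))
    n += step
print(seen)
# ['0', '1/10', '1/5', '3/10', '2/5'] -- exact, no drift past the bound
```

The tradeoff is the one Claude lists below: numerators and denominators can grow without bound, so you pay in speed and memory for the exactness.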

What Went Right

  1. Precision in Financial Applications: Languages offering exact decimal representations are preferred in financial and accounting applications where precision is crucial.

  2. Predictability: Alternative representations often provide more predictable behavior, avoiding surprises like the one demonstrated in the Rebol example.

  3. Ease of Understanding: For beginners, exact representations can be easier to understand and reason about.

  4. Flexibility: Many languages that offer alternative representations do so alongside floating-point, giving developers the choice based on their specific needs.

What Went Wrong (or Challenges Faced)

  1. Performance: Exact representations and arbitrary-precision arithmetic are generally slower than hardware-supported floating-point operations.

  2. Memory Usage: Exact representations often require more memory, which can be a concern for large-scale applications or systems with limited resources.

  3. Compatibility: Deviating from the IEEE 754 standard (used for floating-point) can create interoperability issues with other systems and libraries.

  4. Learning Curve: Developers accustomed to floating-point may need to adjust their thinking and practices when working with alternative representations.

Conclusion

While floating-point remains the default in many languages due to its performance and hardware support, there's a growing recognition of its limitations. Languages that offer alternative representations, either as the default or as easily accessible options, provide developers with powerful tools to handle decimal arithmetic more precisely when needed. The success of these languages often comes from striking a balance between performance and precision, allowing developers to choose the right tool for their specific use case.

It Seems Like libmpdec May Be A Better Answer

This is what Python uses: the C library implementing the IEEE decimal arithmetic standard (which defines Decimal128).

https://www.bytereef.org/mpdecimal/index.html

Pure C, actively maintained, BSD-2, and thoroughly/meticulously documented implementation of an IEEE specification. It's about 80k of compiled binary footprint, apparently.

The Decimal128 standard is--as the name suggests--128 bits. We have room for that as-is on 64-bit builds, since cells are 256 bits on 64-bit platforms. But cells on 32-bit platforms are exactly 128 bits, so we'd have to chop the representation down a little (offering perhaps fewer significant figures).
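Python's decimal module (backed by libmpdec in CPython) makes the precision configurable per context, so the tradeoff is easy to sketch: Decimal128 carries 34 significant digits, while a chopped-down format in the spirit of Decimal64 would carry 16. (The specific Context settings here are just illustrative.)

```python
from decimal import Decimal, Context

d128 = Context(prec=34)   # Decimal128's 34 significant digits
d64 = Context(prec=16)    # Decimal64's 16, a plausible 32-bit fallback

print(d128.divide(Decimal(1), Decimal(3)))  # 0.3333333333333333333333333333333333
print(d64.divide(Decimal(1), Decimal(3)))   # 0.3333333333333333
```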

What's The Case For It? (ChatGPT sez...)

Exactness of Representation

Decimal fractions like 0.1 and 0.3 are represented exactly.

This eliminates one of the most frustrating sources of confusion in casual programming:

>> 0.1 + 0.2 = 0.3  ; in Rebol today with IEEE float
== false

With libmpdec, you'd get true, because decimal math is exact where humans expect it to be.
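That claim checks out against Python's libmpdec-backed Decimal:

```python
from decimal import Decimal

print(0.1 + 0.2 == 0.3)                                   # False with binary floats
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))  # True with decimal
```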

User-Friendly Semantics

Trailing zeros are preserved if you want to track precision intent (0.500 ≠ 0.5).

This allows support for operations or formats that infer significance:

>> digits 0.500  ; hypothetical intent-preserving Rebol dialect
== 3
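Python's Decimal already preserves trailing zeros, and a digits-style word could be sketched on top of the stored coefficient (the `digits` helper below is hypothetical and naive--it just counts coefficient digits):

```python
from decimal import Decimal

print(Decimal("0.500"))   # 0.500 -- the trailing zeros survive

def digits(d):
    # Hypothetical: count significant digits via the stored coefficient
    return len(d.as_tuple().digits)

print(digits(Decimal("0.500")))  # 3
print(digits(Decimal("0.5")))    # 1
```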

Improved Comparisons (If You Want)

You could offer a standard = that compares only by value (0.5 == 0.50)...

...and a ≈ or equal-sigfigs? that compares both value and digits of significance.
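libmpdec-style arithmetic gives you both comparisons already: ordinary equality is by value, while compare_total (the IEEE total-ordering operation) also distinguishes representations. In Python terms:

```python
from decimal import Decimal

a, b = Decimal("0.5"), Decimal("0.50")
print(a == b)              # True: equal by value
print(a.compare_total(b))  # nonzero: the representations differ
```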

Avoids Cross-Platform Gotchas

Decimal libraries like libmpdec behave identically on all platforms.

Native double may differ slightly across compilers and architectures due to rounding behavior, subnormals, fused multiply-add, etc.

Better Fit for Intentional Programming

Rebol’s design is about precision of intent, not raw machine access.

Supporting a decimal library that preserves intent (like 0.50 meaning “two significant digits”) fits Rebol’s philosophy more than binary floats.

No Accidental Scientific Notation Drift

Binary float math can lead to e.g.:

>> 1e308 * 10
== inf

With libmpdec, you have guardrails and can trap overflows or round gracefully.
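Python's Decimal demonstrates those guardrails: overflow is a trapped signal by default rather than a silent inf. (The Emax/Emin values below are just chosen to mimic a double's range.)

```python
from decimal import Decimal, Context, Overflow

ctx = Context(Emax=308, Emin=-308)  # illustrative, double-like exponent range
try:
    ctx.multiply(Decimal("1e308"), Decimal(10))
except Overflow:
    print("overflow trapped")  # raised instead of silently producing inf
```

Untrapping the signal instead rounds to the context's notion of infinity, so the behavior is a policy choice rather than a hardware accident.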

Better Interop with Human-Facing Data

When working with money, measurements, logs, etc., people expect base-10 behavior.

Rebol’s goal is to be a glue language. libmpdec speaks the "people number" language.

What Are The Gotchas?

More Complex Internal Representation

You will have to either:

  • expose a decimal! type with more structure, or

  • flatten it to always round to a canonical form.

This affects equality, printing, and serialization.

Storage Size and Interop

A decimal! type may be larger than a float! (especially if you want arbitrary precision).

Native C interop will be less clean — you can't just pass it as a double.

You'll Need to Define Canonicalization Behavior

Do you collapse 0.500 and 0.5 for equality?

Do you preserve input formatting for round-trip consistency?
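One possible canonicalization policy, sketched with Python's Decimal: normalize() strips trailing zeros, which settles equality but gives up round-tripping the source text.

```python
from decimal import Decimal

# Collapse 0.500 to a canonical 0.5 -- equality is simple, but the
# original "0.500" formatting is lost for round-tripping
print(Decimal("0.500").normalize())  # 0.5
print(Decimal("0.500").normalize() == Decimal("0.5"))  # True
```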

Missing Hardware Acceleration

If you want to interface with e.g. GPU or SIMD code later, decimal types won't fit.

...You Had Me At (0.1 + 0.2 = 0.3) :heart:

This is really the issue. I don't feel like we should be exposing people to the concerns of floating point.

Maybe some extension types for doing specialized math (VECTOR! maybe) would use floats, but I don't think just the basic decimal numbers should.

Anyway... I just got to leaning a little more in this direction when trying to address issues in equality. If it resolves things there in a real, measurable way, I think it may be worth doing.
