Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Decimal data type controls are not printing correctly - the data is getting cut off and rounded. #1089

Open
ivkina opened this issue Oct 19, 2022 · 16 comments
Milestone

Comments

@ivkina
Copy link

ivkina commented Oct 19, 2022

Create a simple data control of Decimal type in a report. Set control value to "1111111111111112.12". Print/preview report in Eclipse/View Report/Web Browser. See simple report design and report output attached.
Note that the data is getting cut and rounded. If I remove 1 digit and make precision = 17 instead of 18, then it prints correctly. Is there as limit on BigDecimal precision data type in BIRT?
decimal_not_printing.zip
decimal_not_printing_viewer
The issue happens in old BIRT 4.9 (Java 8) and new BIRT 4.9 (Java 11).

If you run/print the same number in Java 8 or 11 it prints correctly:
BigDecimal d = new BigDecimal("1111111111111112.12");
System.out.println(d);

@hypn0chka
Copy link

Write me. I added an email to my profile.

@claesrosell
Copy link
Contributor

After debugging your example I draw this conclusion:
Javascript numeric literals is translated to double:s in Rhino (the javascript engine that Birt use). That double is later converted to a BigDecimal by Birt. It is the String to double convertion that looses the precision in this example.

@ivkina
Copy link
Author

ivkina commented Oct 19, 2022

Thanks for looking into that. In real life example we ran across this issue with total (aggregate) controls. And the totals (both table bindings and controls) are also defined as Decimal data type. Probably, the internal scripting is also doing double conversion in between and then BIRT classes convert the end result into BigDecimal.
Is there any way to fix that or is that part of rhino scripting which is beyond the scope?

@claesrosell
Copy link
Contributor

After "reading up" on numbers in javascript they seem to be double precision floats, which means that the Rhino implementation is correct.

@hvbtup
Copy link
Contributor

hvbtup commented Oct 20, 2022

If you are calculating invoices, it is a mistake to use float or double data types.
Instead, all calculations should be done with 10-based numbers in this case.

That's the reason why most programming languages (including ancient Cobol) also support data types like java.lang.BigDecimal which are able to store decimal values as such and to perform calculation. A conversion from a binary floating point number to a 10-based number and vice versa incurs a conversion error in many cases.

For example, take 31/10 = 3.1 (exactly).
According to https://babbage.cs.qc.cuny.edu/ieee-754.old/decimal.html, when converting this to an IEEE 754 32bit float,
this is 0x40466666, and converting back to a decimal representation with 8 significant digits, is is approx. 3.0999999.

BTW:
We have a VAT of 19% in Germany, and I have to create invoices with BIRT - so I know how tricky it is.
Actually, I try to avoid BIRT's aggregations then and instead I perform all the calculations witihn SQL with group functions or analytic functions.
Also note that e.g. Oracle's 10-based NUMBER data type uses a different rounding mode in comparsion to Oracle's BINARY_FLOAT / BINARY_DOUBLE, IIRC.

I debugged a simple report and can see that - unfortunately - BIRT converts BigDecimal to double:

excerpt from TotalSum.java:

	@Override
	public void onRow(Object[] args) throws DataException {
		assert (args.length > 0);
		if (args[0] != null) // ignore nulls in calculations
		{
			sum = calculator.add(sum, calculator.getTypedObject(args[0]));
		}
	}

excerpt from NumberCalculator.java:

@Override
public Object getTypedObject(Object obj) throws DataException {
	try {
		return DataTypeUtil.toDouble(obj);
	} catch (BirtException e) {
		throw DataException.wrap(e);
	}
}

In DataTypeUtil.java, the toDouble method calls an overloaded method of the same name.

public static Double toDouble(Object source, ULocale locale) throws CoreException {
    ...
	if (source instanceof Double) {
		return (Double) source;
	} else if (source instanceof Number) {
		if (!isConvertableToDouble((Number) source)) {
			throw new CoreException(ResourceConstants.CONVERT_FAILS, new Object[] { source.toString(), "Double" });
		}
		double doubleValue = ((Number) source).doubleValue();
		return new Double(doubleValue);
	} else if (source instanceof Boolean) {

which calls

private static boolean isConvertableToDouble(Number n) {
	assert n != null;

	double doubleValue = n.doubleValue();
	return !Double.isInfinite(doubleValue);

}

To summarize (pun intended :-), BIRT will convert a BigDecimal resp. BIRT decimal value to double before summing it up.

Probably later this double value will be converted to BigDecimal again. I'm not sure about this: I tried to debug this but failed - even for a very simple report it is hard to debug this if you don't know where to look.

Thus in some cases you will face a small difference if you compare BIRT's output to what you get out if you very carefully compute everything manually in the decimal system.
This difference may look like a rounding error, but it can also be a conversion error.

I suggest - if your data is from an SQL database - that you perform the calculations on the SQL side.

@ivkina
Copy link
Author

ivkina commented Oct 20, 2022

After "reading up" on numbers in javascript they seem to be double precision floats, which means that the Rhino implementation is correct.

Well, I understand the floating math and get it that in a lot of cases it is no good for financial calculations. But that is why you have Decimal (BigDecimal) data type. My understanding was that if you use Decimal control in BIRT it should display the number as it is w/o any internal conversion to double and then back to BigDecimal. That seems to be not the case and as the result you end with the wrong number. Each time decimal values are converted to double before any calculation and formatting the number may lose the precision aspect of the Decimal data type (which is really what Decimal type is designed for).

We occasionally run across such issues in BIRT reports though we are using only Decimal data type for DataSource/DataSet columns, aggregates/totals and report controls.
The suggestion to calculate totals/aggregates in SQL instead of doing the same via report engine seems to be quite an overreach...What for would need a report engine then???
Finally, found this discussion from old days about BIRT fully supporting BigDecimal calculations - https://bugs.eclipse.org/bugs/show_bug.cgi?id=155138.
Thanks

@hvbtup
Copy link
Contributor

hvbtup commented Oct 20, 2022

Finally, found this discussion...

I wish the final conclusion of that discussion would be true...

I agree this is a bug which is still not fixed.

Probably, the internal scripting is also doing double conversion in between and then BIRT classes convert the end result into BigDecimal.

To be precise: AFAIK it is not scripting done in Rhino which is causing this error. It's caused by the Java implementation in the source files I mentioned (and other source files for other aggregate functions).

Is there any way to fix that or is that part of rhino scripting which is beyond the scope?

As this is caused on the Java side, it should be possible to fix it (in the abovementioned source files).

But I doubt that anyone will fix it (unless you fix it yourself by providing a PR), because it would be quite an effort:

I think it would be necessary to support all the calculations in two different ways:
a) for number objects with a binary based representation (using Java's double or Double internally).
b) for number objects with a decimal based representation (using Java's BigDecimal internally).
... and to take care of the possible case that these types of numbers are mixed in the calculations (e.g. by coercing to BigDecimal).

Furthermore, if you actually do any financial computations in Javascript (not BIRT aggregates), you're probably doomed:
As Claes said and according to https://www.w3schools.com/js/js_numbers.asp, numbers in Javascript are always IEEE 754 double.

A workaround which avoids most of these problems is to use CT (instead of EUR or USD) as your basis (see http://adripofjavascript.com/blog/drips/avoiding-problems-with-decimal-math-in-javascript.html).
For division (e.g. German VAT computation x * 19 / 100), there is still a possible error, because AFAIK Javascript uses round-half-even mode for numbers, which is absolutely OK for scientific computations to avoid a statistical bias, but for rounding in invoices, the norm EN 16931 for example prescribes mercantile rounding (round up 0.5 to 1 for positive numbers, round down -0.5 to -1 for negative numbers).

@hvbtup
Copy link
Contributor

hvbtup commented Oct 20, 2022

As a side note, the Classic Models example database built into BIRT is a classic example for wrong database model design:
It uses SQL FLOAT data types for BUYPRICE (probably because the embedded database engine does not support decimal based numbers). Database columns for prices should use a decimal based number type in the DB as well and the DB should use mercantile rounding mode for such numbers.

@wimjongman wimjongman added this to the 4.13 milestone Jan 5, 2023
@wimjongman wimjongman modified the milestones: 4.13, 4.14 Mar 3, 2023
@claesrosell
Copy link
Contributor

I looked at this again, and the original problem happens because the script is using a javascript number literal which cannot "translate" to a corresponding double value. If the one would need an exact value the script needs to create a BigDecimal instead. Javascript does not have a corresponding type so the way to do that is:
new java.math.BigDecimal("1111111111111112.12"); instead of the 1111111111111112.12 literal. Just setting the Birt datatype to "Decimal" is not enough since it will be instantiated with the floating point value from the javascript expression.

With that said: There are bugs or at least lack of function in BIRTs calculations. These are highlighted by @hvbtup in one of the comments. The aggregator functions of BIRT does not use BigDecimal where appropriate. The aggregator functions (AggrFunction) that use decimals all has "DataType.DOUBLE_TYPE" set as datatype. This will select a the "NumberCalculator" and that calculator always converts numbers to doubles in calculations.

There are currently no way for the report designer to define what precision that should be used. It is possible to define the datatype for a "computed column" in a dataset, but that information is not taken into account (nor available) when the ICalculator, that will perform the calculations, is selected. The datatype only affects the resulting datatype. This means that all calculations will be done with doubles and then converted to the selected datatype.

I see three different ways forward:

a) Change the aggregator functions that needs higher then double precision to "Decimal", so a BigDecimalCalculator is selected for calculations during aggregation.
b) Take the result datatype (selected be the report designer) into account when selecting calculator. Basically, if the return type is Decimal and the AggrFunction is "DataType.DOUBLE_TYPE", select BigDecimalCalculator, otherwise do as before.
c) For the aggregator functions that may need higher then double precision, take the result datatype (selected be the report designer) into account when selecting calculator. Basically, if the return type is Decimal and the AggrFunction is "DataType.DOUBLE_TYPE", select BigDecimalCalculator, otherwise do as before.

To solve the x * 19 / 100 problem, if done in Javascript one would need to write an expression like this:
(new java.math.BigDecimal("101010")).multiply(new java.math.BigDecimal("0.19"), java.math.MathContext.DECIMAL128)

@hvbtup if I were to try to "close" https://bugs.eclipse.org/bugs/show_bug.cgi?id=155138 again could you provide me with a test report that reproduces all the known BigDecimal problems? I am aware of the aggregation functions and the Javascript literals, but are there others?

@hvbtup
Copy link
Contributor

hvbtup commented Nov 9, 2023

I won't be able to create such a test report in November.

The topic of proper calculation is actually quite complex.

I'll try to collect some brief notes here about what can go wrong:

  • First of all, implicit conversion is the cause of so many errors in all kinds of software. I really wish all programming languages would forbid them.
  • A conversion from a 10-based number representation to or from a 2- (or 16-) based number representation will almost always result in precision loss due to conversion errors. E.g. you cannot a value like 0.1 (1/10) exactly as a float or double.
  • For most computations, there are several different ways to correctly calculate them if we assume we can store all numbers exactly. But in reality, we cannot store numbers exactly, and thus in reality the resulting errors depend very much on the way we perform the calculation due to rounding errors. That's what the mathematical discipline "numerics" is all about - to perform calculations that the unvoidable rounding errors have minimal impact on the end result.
  • It makes a difference when we round. As a simple example, say we have to exact input values 1/2 and 1/2. Our end result should be an integer. If we can compute with the exact values and round afterwards, we get 1/2 + 1/2 = 1.0 = 1. But if we round the individual values, then 0.5 will be rounded up to 1, so the result is 1+1 = 2.

As a result of what I said above, your wording "higher precision than double" is not correct.
Usually nobody needs higher double precision in a calculation.

One would use float (or double) variables if the values are usually not exact anyway, e.g. they are measures (in the sense that someone has actually measured something, e.g. the alcohol contrentration of a liquor).
In such a case, one usually should store all measured values as doubles (unless they describe counting something) and perform all calculations with the double datatype and round-half-even rounding mode, and store the result as double. Only in the last step, when presenting the result to a user, the result has to be converted from double (2-based) into a decimal representation. In this last step, conversion error is inevitable. But all the calculations are then good from a numerical point of view, if we take as given that we can store numbers only with limited memory size (in bytes).

But when the calculations are about offers/invoices/taxes/money, then we must use decimal datatypes and (usually) round-half-up from the start to the end. That means all intermediate results must be stored in decimal form, too.

This "commercial" kind of calculation avoids all conversion errors, but it still incurs rounding errors.

So it's not "higher precision than double", but instead a slightliy different way of performing the calculation to avoid conversion errors.

@wimjongman wimjongman modified the milestones: 4.14, 4.15 Nov 21, 2023
@SteveSchafer-Innovent
Copy link
Contributor

I'm going to see what I can do with this. At least the part about getting it to use BigDecimalCalculator in aggregates. The OP's example involving constant values is going to be a different solution.

@SteveSchafer-Innovent
Copy link
Contributor

It is possible to change the calculator in the onRow method, once it's known that the incoming data is a BigDecimal, however the return datatype of SUM would still be float and the user would need to know to change the datatype of the aggregate column binding from float to decimal. I wonder if it would be better to have a parallel set of aggregators specifically for decimal return values, such as DECIMAL_SUM. Maybe as an optional plugin.

Additionally, there are some things the user would need to know to do for commercial/financial reporting, such as using new BigDecimal() for constant values and always using BirtMath for computations. These could be documented in the plugin GitHub page.

I think BIRT should support financial reporting, even if it requires users to do things a bit differently. You can't necessarily rely on a SQL backend to do it for you.

@hvbtup
Copy link
Contributor

hvbtup commented Feb 15, 2024

E.g. for SUM:
We have got the information which datatype the colum is (from the dataset metadata resp. the binding metadata).

Shouldn't it be possible to let the internal calculations use something like T <extends Number> for the calculations?

And perhaps we should give an opportunity to explicitly specify a rounding mode for the calculations. If not specified, it should be inferred from the data type, e.g. float -> ROUND_HALF_EVEN, decimal -> ROUND_HALF_UP.

+1 for BIRT should support financial reporting.

Anyway, we kind of hijacked this ticket. Maybe it makes sense to change the title.

@SteveSchafer-Innovent
Copy link
Contributor

@hvbtup In the column binding, there isn't an explicit declaration of the data type of an argument. There's just an expression. And javascript being what it is, you can't really know the data type of the expression until you execute it. That's why it's possible to change the calculator in onRow but not before that. And at that point it's too late to change the return type of the binding (unless the user does it manually). Even if you had a type-agnostic calculator you'd still have the same problem with the return type. That's what led me to the idea of a set of decimal aggregate functions.

It actually looks to be trivial to implement a decimal sum. Just extend TotalSum and override getName and getDataType. I'm not sure if it can be implemented as an external plugin though because of "Discouraged Access" warnings, but it could certainly be a built-in aggregator.

As for rounding, I confess to not being very familiar with it so I'll do some research to see if what you are suggesting is possible.

@SteveSchafer-Innovent
Copy link
Contributor

I've added DECIMALSUM and DECIMALAVG to our birt-functions-lib plugin project: https://github.com/innoventsolutions/birt-functions-lib/tree/birt-4.14
and they work nicely. The site URL is https://github.com/innoventsolutions/innoventsolutions.github.io/raw/4.14.1/4.14.
I'll be building out the rest of the aggregates that suffer from rounding errors due to float calculations over time and will note in this ticket as I progress.

@SteveSchafer-Innovent
Copy link
Contributor

I've added 19 aggregations that use decimal calculations to the Innovent Solutions birt.functions.lib plugin. The site URL is the same one we've been using for several years: https://innoventsolutions.github.io/2.5.2. The aggregations are in a separate category so they are optional. The wiki URL is https://innoventsolutions.github.io. (I'll be adding wiki content soon). The repo URL is https://github.com/innoventsolutions/birt-aggregations-lib.

I believe that using these aggregations and using BirtMath for calculations and new BigDecimal for constant values should allow on to produce financial reports devoid of floating point errors. If you get a chance to use them, let me know what you think. Please post issues to the repo if you have any problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants