If a benchmark takes a very long time to run, it's harder to iterate on
changes to see their effect. Even reduced to 100, this pow_bench takes around 8
seconds on my machine, and still shows meaningful optimization effects.
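For reference, a sketch of the kind of nested-loop pow benchmark in question (the exact bench differs; crate paths, names, and the bound are assumptions for illustration):

```rust
// Hypothetical shape of a pow benchmark like the one discussed; requires a
// nightly toolchain for #[bench]. Names and the bound are illustrative only.
#![feature(test)]
extern crate test;

use num_bigint::BigUint;
use test::Bencher;

// Reduced upper bound: keeps a full bench run in the seconds range while the
// workload is still large enough to show optimization effects.
const UPPER: usize = 100;

#[bench]
fn pow_bench(b: &mut Bencher) {
    b.iter(|| {
        for i in 2..=UPPER {
            for j in 2..=UPPER {
                let base = BigUint::from(i);
                test::black_box(num_traits::pow(base, j));
            }
        }
    });
}
```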
bigint: fix float conversions
The default implementations of to_f32, to_f64, from_f32 and from_f64 are
limited to numbers fitting in an i64, but BigInt can of course be much bigger.
The default to_f32 also has double rounding (it converts via f64 first).
from_f32 and from_f64 have undefined behaviour if the float is too big to
fit in an i64.
This fixes these issues and keeps the rounding consistent with other
float-to-int conversions.
Also add ToBigUint and ToBigInt implementations for f32 and f64.
Currently, to_f32 and to_f64 return None if the BigInt is too big. It might make more sense to return Some(INFINITY), but I've decided to match f64::to_f32() for now.
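A small usage sketch of the behaviour described above (crate paths assume the split num_bigint/num_traits crates; the overflow case follows the None choice mentioned here):

```rust
use num_bigint::{BigInt, ToBigInt};
use num_traits::ToPrimitive;

fn main() {
    // Finite floats are truncated toward zero, matching other float-to-int
    // conversions; NaN and the infinities have no integer value, so None.
    assert_eq!(12.9_f64.to_bigint(), Some(BigInt::from(12)));
    assert_eq!(f64::NAN.to_bigint(), None);

    // A value far outside f64's range: per the note above, to_f64 currently
    // returns None here rather than Some(f64::INFINITY).
    let huge: BigInt = BigInt::from(1) << 4000usize;
    assert_eq!(huge.to_f64(), None);
}
```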
The hidden "mod test" layout of the first example has been broken for a
while, but it wasn't noticed because rustdoc wasn't passing any features
at all. That was fixed in rust-lang/rust#30372, and now we need to get
our ducks in a row too.
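For context, the general shape of such a feature-gated doctest, where the `#`-prefixed lines are hidden in the rendered docs but still compiled by rustdoc (layout and feature name are illustrative, not the crate's exact example):

```rust
# #[cfg(feature = "bigint")]
# mod test {
# pub fn run() {
use num::bigint::BigUint;

let n = BigUint::from(42u32);
assert_eq!(n.to_string(), "42");
# }
# }
# fn main() {
#     #[cfg(feature = "bigint")]
#     test::run();
# }
```

If the features assumed by the hidden lines aren't passed to rustdoc, the example silently stops compiling the visible code, which is exactly how the breakage went unnoticed.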
bigint: simplify Hash
There cannot be any leading zeros in a BigUint, so we can simply derive Hash, which hashes the underlying Vec directly.
Add Hash to Sign so we can derive it for BigInt as well.
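A simplified sketch of what the derived layout looks like and why it is sound (field layout and digit type here are illustrative, not copied from the crate):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Simplified stand-ins for the crate's types. Because the digit Vec is kept
// normalized (no leading zero digits), equal values always hold identical
// Vecs, so the derived Hash stays consistent with Eq.
#[derive(Hash, PartialEq, Eq)]
struct BigUint {
    data: Vec<u32>, // little-endian digits, most significant digit nonzero
}

#[derive(Hash, PartialEq, Eq, Clone, Copy)]
enum Sign {
    Minus,
    NoSign,
    Plus,
}

#[derive(Hash, PartialEq, Eq)]
struct BigInt {
    sign: Sign,
    data: BigUint,
}

fn main() {
    let a = BigInt { sign: Sign::Plus, data: BigUint { data: vec![42] } };
    let b = BigInt { sign: Sign::Plus, data: BigUint { data: vec![42] } };
    let hash = |v: &BigInt| {
        let mut h = DefaultHasher::new();
        v.hash(&mut h);
        h.finish()
    };
    assert_eq!(hash(&a), hash(&b)); // equal values hash equally
}
```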
[RFC] Some performance improvements
I added add and subtract methods that take an operand by value, instead of by reference, so they can work in place instead of always allocating new BigInts for the result.
I'd like to hear if anyone knows a better way of doing this, though - with the various wrappers/forwardings, doing this for all the operations is going to be a mess and add a ton of boilerplate. I haven't even tried to do a complete job of it yet.
I'm starting to think the sanest thing to do would be to only have in-place implementations of most of the operations, and have the wrappers clone() if necessary. That way, the only extra work we're doing is a memcpy(), which is a hell of a lot faster than a heap allocation.
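A minimal sketch of the by-value / in-place idea, using an illustrative stand-in type rather than the crate's actual BigUint or forwarding macros:

```rust
use std::ops::Add;

// Illustrative stand-in for BigUint: little-endian base-2^32 digits.
struct Nat {
    data: Vec<u32>,
}

// The in-place core: lhs += rhs, digit by digit with carry.
fn add_assign_digits(lhs: &mut Vec<u32>, rhs: &[u32]) {
    if lhs.len() < rhs.len() {
        lhs.resize(rhs.len(), 0);
    }
    let mut carry = 0u64;
    for (i, d) in lhs.iter_mut().enumerate() {
        let sum = *d as u64 + *rhs.get(i).unwrap_or(&0) as u64 + carry;
        *d = sum as u32;
        carry = sum >> 32;
    }
    if carry != 0 {
        lhs.push(carry as u32);
    }
}

// By-value operand: reuse `self`'s buffer for the result, no new allocation.
impl<'a> Add<&'a Nat> for Nat {
    type Output = Nat;
    fn add(mut self, other: &'a Nat) -> Nat {
        add_assign_digits(&mut self.data, &other.data);
        self
    }
}

// Ref-ref wrapper: one clone (a memcpy of the digits), then the in-place path.
impl<'a, 'b> Add<&'b Nat> for &'a Nat {
    type Output = Nat;
    fn add(self, other: &'b Nat) -> Nat {
        Nat { data: self.data.clone() } + other
    }
}

fn main() {
    let a = Nat { data: vec![u32::MAX, 1] };
    let b = Nat { data: vec![1] };
    let c = a + &b; // consumes `a` and reuses its allocation
    assert_eq!(c.data, vec![0, 2]);
}
```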
The main idea here is to do as much as possible with slices, instead of
allocating new BigUints (= heap allocations).
Current performance:
multiply_0: 10,507 ns/iter (+/- 987)
multiply_1: 2,788,734 ns/iter (+/- 100,079)
multiply_2: 69,923,515 ns/iter (+/- 4,550,902)
After this patch, we get:
multiply_0: 364 ns/iter (+/- 62)
multiply_1: 34,085 ns/iter (+/- 1,179)
multiply_2: 3,753,883 ns/iter (+/- 46,876)
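A minimal sketch of the slice-based style described above: a multiply-accumulate over borrowed digit slices writing into a preallocated buffer, so no intermediate BigUints are allocated (helper name and layout are assumptions, not the crate's internals):

```rust
/// acc += a * b, where all values are little-endian base-2^32 digit slices
/// and `acc` is already long enough to hold the product plus carries.
fn mac3(acc: &mut [u32], a: &[u32], b: &[u32]) {
    for (i, &ai) in a.iter().enumerate() {
        let mut carry = 0u64;
        for (j, &bj) in b.iter().enumerate() {
            let t = acc[i + j] as u64 + ai as u64 * bj as u64 + carry;
            acc[i + j] = t as u32;
            carry = t >> 32;
        }
        // Propagate the final carry into the higher digits.
        let mut k = i + b.len();
        while carry != 0 {
            let t = acc[k] as u64 + carry;
            acc[k] = t as u32;
            carry = t >> 32;
            k += 1;
        }
    }
}

fn main() {
    // (2^32 + 1) * (2^32 + 1) = 2^64 + 2^33 + 1  ->  digits [1, 2, 1, 0]
    let a = [1u32, 1];
    let b = [1u32, 1];
    let mut acc = vec![0u32; a.len() + b.len()];
    mac3(&mut acc, &a, &b);
    assert_eq!(acc, vec![1, 2, 1, 0]);
}
```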
Since the Num constraint only requires val-val ops, we ended up cloning
everything in Complex ops when they were forwarded to ref-ref. By using
val-val, we can reduce how often clones are needed.
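A minimal sketch of the difference, with an illustrative stand-in type rather than num's actual Complex impls: the core impl is val-val, which `T: Num` supports without cloning, and the ref-ref form forwards to it, cloning only when the caller actually passed references.

```rust
use num_traits::Num;
use std::ops::Add;

#[derive(Clone, Debug, PartialEq)]
struct Cplx<T> {
    re: T,
    im: T,
}

// Core impl is val-val: `T: Num` only guarantees by-value ops on T,
// so this path needs no clones at all.
impl<T: Num> Add for Cplx<T> {
    type Output = Cplx<T>;
    fn add(self, other: Cplx<T>) -> Cplx<T> {
        Cplx {
            re: self.re + other.re,
            im: self.im + other.im,
        }
    }
}

// Ref-ref forwards to the val-val core, so clones happen only when the
// caller really did pass references.
impl<'a, 'b, T: Clone + Num> Add<&'b Cplx<T>> for &'a Cplx<T> {
    type Output = Cplx<T>;
    fn add(self, other: &'b Cplx<T>) -> Cplx<T> {
        self.clone() + other.clone()
    }
}

fn main() {
    let a = Cplx { re: 1.0_f64, im: 2.0 };
    let b = Cplx { re: 3.0, im: 4.0 };
    assert_eq!(&a + &b, a + b); // the by-value call clones nothing
}
```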