Add traits `CheckedShl` and `CheckedShr` that correspond to the standard
library's `checked_shl` and `checked_shr` functions. Implement the traits
on all primitive integer types by default, akin to what the standard
library does.
The stdlib is somewhat inconsistent when it comes to the type of the
shift amount. The `checked_*` functions have a `u32` shift amount, but
the `std::ops::{Shl,Shr}` traits are generic over the shift amount. Also
the stdlib implements these traits for all primitive integer types as
right-hand sides. Our implementation mimics this behaviour.
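The stdlib inconsistency described above can be seen with `std` alone. This is a minimal sketch; the helper names `shl_checked` and `shl_with_mixed_rhs` are made up for illustration and are not part of this change.

```rust
/// Shifts `x` left by `n` bits. The inherent `checked_shl` only accepts a
/// `u32` amount and returns `None` when `n` is at least the bit width (8).
fn shl_checked(x: u8, n: u32) -> Option<u8> {
    x.checked_shl(n)
}

/// The `<<` operator goes through `std::ops::Shl`, which the stdlib
/// implements for every primitive integer type as the right-hand side.
fn shl_with_mixed_rhs(x: u8) -> (u8, u8, u8) {
    (x << 3u8, x << 3i64, x << 3usize)
}
```

Note that `checked_shl` only checks the shift amount: bits shifted out of the value are dropped silently.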
350: Avoid large intermediate product in LCM r=cuviper a=mhogrefe
Changed the implementation of BigUint LCM from
`((self * other) / self.gcd(other))`
to
`self / self.gcd(other) * other`
The division is exact in both cases, so the result is the same, but the new code avoids the potentially-large intermediate product, speeding things up and using less memory.
I also removed the unnecessary parentheses, because I think it's clear what order everything will be executed in. But if others think differently I can add them back.
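To illustrate why the order matters, here is a sketch on `u64` rather than `BigUint`, where the large intermediate product shows up as an overflow instead of extra allocation. The function names and the zero guard are assumptions made for this example, not the code from the PR.

```rust
/// Greatest common divisor by the Euclidean algorithm.
fn gcd(mut a: u64, mut b: u64) -> u64 {
    while b != 0 {
        let r = a % b;
        a = b;
        b = r;
    }
    a
}

/// Old order: multiply first, then divide.
/// Returns `None` if the intermediate product overflows `u64`.
fn lcm_product_first(a: u64, b: u64) -> Option<u64> {
    if a == 0 || b == 0 {
        return Some(0); // guard added for this sketch: avoids dividing by gcd(0, 0) = 0
    }
    a.checked_mul(b).map(|p| p / gcd(a, b))
}

/// New order: divide first (the division is exact), then multiply.
/// Returns `None` only if the LCM itself does not fit in `u64`.
fn lcm_divide_first(a: u64, b: u64) -> Option<u64> {
    if a == 0 || b == 0 {
        return Some(0);
    }
    (a / gcd(a, b)).checked_mul(b)
}
```

With `a = 1 << 40` and `b = 3 << 30`, the product-first version overflows while the divide-first version returns `3 << 40`.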
351: Remove num-macros r=cuviper a=cuviper
The first commit gives a final deprecation bump to `num-macros`, and
the second removes it from the repo altogether.
- Now uses Toom-3 multiplication for large inputs.
- `BigInt`/`BigUint` parsing now accepts `_` separating digits.
- `BigInt`/`BigUint::assign_from_slice` reinitializes the value, keeping
the same internal buffer.
- `BigUint` now implements many `*Assign` ops.
- `BigUint::modpow(exp, mod)` performs efficient modular exponentiation.
`Complex` now implements `Num`, `Rem`, and `RemAssign`.
(Complex remainders don't have a clear mathematical basis, but we choose
to round the quotient toward zero to a Gaussian integer.)
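The rounding rule for complex remainders can be sketched as follows. This is an illustration on plain `(f64, f64)` pairs, not the `Complex` implementation itself; the type alias `C` and the helper names are assumptions made for the example.

```rust
/// A complex number as a `(re, im)` pair (stand-in for `Complex<f64>`).
type C = (f64, f64);

fn mul(a: C, b: C) -> C {
    (a.0 * b.0 - a.1 * b.1, a.0 * b.1 + a.1 * b.0)
}

fn div(a: C, b: C) -> C {
    let norm = b.0 * b.0 + b.1 * b.1;
    ((a.0 * b.0 + a.1 * b.1) / norm, (a.1 * b.0 - a.0 * b.1) / norm)
}

/// Remainder of `a` modulo `m`: truncate both parts of the exact quotient
/// toward zero to get a Gaussian integer `q0`, then return `a - m * q0`.
fn rem(a: C, m: C) -> C {
    let q = div(a, m);
    let q0 = (q.0.trunc(), q.1.trunc());
    let p = mul(m, q0);
    (a.0 - p.0, a.1 - p.1)
}
```

For example, `(7 + 5i) % 2` has the exact quotient `3.5 + 2.5i`, which truncates to `3 + 2i`, leaving a remainder of `1 + i`.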
342: rational: check for NaN when approximating floats r=cuviper a=cuviper
We had a test for NaN already, but thanks to undefined casts (#119) it
was only passing by luck -- on armv7hl it failed:
https://bugzilla.redhat.com/show_bug.cgi?id=1511187
Now we check for NaN explicitly.
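A sketch of what an explicit check looks like, under simplifying assumptions: `to_ratio` is a hypothetical helper that scales by a fixed denominator and does not reduce the fraction. It is not the continued-fraction approximation used by the crate, only an illustration of rejecting NaN before any float-to-integer cast.

```rust
/// Hypothetical helper: approximates `x` as `(numer, denom)` with a fixed
/// denominator. Returns `None` for NaN, infinities, and out-of-range values.
fn to_ratio(x: f64, denom: i64) -> Option<(i64, i64)> {
    // Explicit NaN check: do not rely on what the cast below would do.
    if x.is_nan() {
        return None;
    }
    let scaled = (x * denom as f64).round();
    // `i64::MAX as f64` rounds up to 2^63, so use `>=` on the upper bound.
    if scaled < i64::MIN as f64 || scaled >= i64::MAX as f64 {
        return None;
    }
    Some((scaled as i64, denom))
}
```

The range check alone would not catch NaN, because every comparison against NaN is false; that is why the check has to be explicit.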
339: Implement modpow() for BigUint backed by Montgomery Multiplication r=cuviper a=str4d
Based on this Gist: https://gist.github.com/yshui/027eecdf95248ea69606
Also adds support to `BigUint::from_str_radix()` for using `_` as a visual separator.
Closes #136
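For readers unfamiliar with the two features, here is a sketch on `u64`. It uses plain square-and-multiply rather than Montgomery multiplication, so it shows what `modpow` computes, not how this PR computes it. Both function names and the separator handling (simply dropping every `_`) are assumptions made for the example.

```rust
/// Computes `base^exp mod modulus` by binary square-and-multiply.
/// Intermediate products are widened to `u128` so they cannot overflow.
fn modpow(base: u64, mut exp: u64, modulus: u64) -> u64 {
    assert!(modulus != 0, "modulus must be non-zero");
    if modulus == 1 {
        return 0;
    }
    let m = u128::from(modulus);
    let mut result: u128 = 1;
    let mut b = u128::from(base) % m;
    while exp > 0 {
        if exp & 1 == 1 {
            result = result * b % m;
        }
        b = b * b % m;
        exp >>= 1;
    }
    result as u64
}

/// Parses digits in the given radix, ignoring `_` separators.
/// Assumption: every `_` is dropped, wherever it appears.
fn parse_with_separators(s: &str, radix: u32) -> Option<u64> {
    let digits: String = s.chars().filter(|&c| c != '_').collect();
    u64::from_str_radix(&digits, radix).ok()
}
```

The result never exceeds the modulus, so the working values stay small no matter how large the exponent is; that is the property that makes modular exponentiation efficient compared to computing the full power first.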
340: Fix documentation formatting with commonmark enabled r=cuviper a=mbrubeck
This makes formatting correct with the new pulldown-cmark Markdown parser (rust-lang/rust#44229).
328: Optimizing BigUint and BigInt multiplication with the Toom-3 algorithm r=cuviper a=kompass
Hi!
I finally implemented the Toom-3 algorithm! I first tried to minimize memory allocations by allocating the `Vec<BigDigit>` myself, as was done for Toom-2, but Toom-3 needs more complex calculations involving negative numbers. So I gave up on that approach and used `BigInt` directly, and it's already faster! I also chose a better threshold for the Toom-2 algorithm.
Before any modification:
```
running 4 tests
test multiply_0 ... bench: 257 ns/iter (+/- 25)
test multiply_1 ... bench: 30,240 ns/iter (+/- 1,651)
test multiply_2 ... bench: 2,752,360 ns/iter (+/- 52,102)
test multiply_3 ... bench: 11,618,575 ns/iter (+/- 266,286)
```
With a better Toom-2 threshold (16 instead of 4):
```
running 4 tests
test multiply_0 ... bench: 130 ns/iter (+/- 8)
test multiply_1 ... bench: 19,772 ns/iter (+/- 1,083)
test multiply_2 ... bench: 1,340,644 ns/iter (+/- 17,987)
test multiply_3 ... bench: 7,302,854 ns/iter (+/- 82,060)
```
With the Toom-3 algorithm (with a threshold of 300):
```
running 4 tests
test multiply_0 ... bench: 123 ns/iter (+/- 3)
test multiply_1 ... bench: 19,689 ns/iter (+/- 837)
test multiply_2 ... bench: 1,189,589 ns/iter (+/- 29,101)
test multiply_3 ... bench: 3,014,225 ns/iter (+/- 61,222)
```
I think this could be optimized, but it's a first step!
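For context on the Toom family, here is a sketch of a single level of Toom-2 (Karatsuba) on machine words. It is deliberately the simpler Toom-2, not the Toom-3 code from this PR, and the function name is an assumption. It shows the core idea both share: split the operands and trade one multiplication for a few additions and subtractions.

```rust
/// Multiplies two `u64` values into a `u128` with one level of Toom-2
/// (Karatsuba): three half-width multiplications instead of four.
fn karatsuba_u64(x: u64, y: u64) -> u128 {
    const HALF: u32 = 32;
    const MASK: u64 = (1 << HALF) - 1;
    // Split each operand into a high and a low 32-bit half.
    let (x1, x0) = (u128::from(x >> HALF), u128::from(x & MASK));
    let (y1, y0) = (u128::from(y >> HALF), u128::from(y & MASK));
    let z2 = x1 * y1; // high * high
    let z0 = x0 * y0; // low * low
    // The middle term x1*y0 + x0*y1, recovered from a single product.
    let z1 = (x1 + x0) * (y1 + y0) - z2 - z0;
    (z2 << (2 * HALF)) + (z1 << HALF) + z0
}
```

On big integers the same split is applied recursively, and a threshold such as the ones quoted above decides when operands are large enough for the extra bookkeeping to pay off.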
By starting with `split_at_mut`, the hot multiplication loop runs with
no bounds checking at all! The remaining carry loop has a slightly
simpler check for when the remaining iterator runs dry.
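A sketch of the loop shape described here, assuming little-endian `u32` digits with `u64` intermediates. The name `mac_digit` and the digit width are assumptions for this example, and the crate's actual code may differ.

```rust
/// Computes `acc += b * c` over little-endian base-2^32 digits.
/// Panics if `acc` is shorter than `b` or the carry runs past its end.
fn mac_digit(acc: &mut [u32], b: &[u32], c: u32) {
    if c == 0 {
        return;
    }
    let mut carry: u64 = 0;
    // Split once up front: `a_lo` has exactly `b.len()` digits, so the hot
    // loop can walk both slices with iterators instead of indexing.
    let (a_lo, a_hi) = acc.split_at_mut(b.len());
    for (a, &bi) in a_lo.iter_mut().zip(b) {
        // Cannot overflow: (2^32-1) + (2^32-1)^2 + (2^32-1) = 2^64 - 1.
        let t = u64::from(*a) + u64::from(bi) * u64::from(c) + carry;
        *a = t as u32; // keep the low 32 bits
        carry = t >> 32;
    }
    // Propagate the leftover carry into the high digits until it is gone.
    let mut rest = a_hi.iter_mut();
    while carry != 0 {
        let a = rest.next().expect("carry overflow during multiplication");
        let t = u64::from(*a) + carry;
        *a = t as u32;
        carry = t >> 32;
    }
}
```

The single length check happens in `split_at_mut`; after that, the only remaining check is whether the high-digit iterator has run out while a carry is still pending.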