Description
Problem: slow because of repetitive heap-allocation
The 5 iterators below generate Vec<...> items.
multi_cartesian_product(self)
combinations(self, k: usize)
combinations_with_replacement(self, k: usize)
permutations(self, k: usize)
powerset(self)
These iterators can be slow because of the repeated allocations, even when the user does not need to keep each generated vector.
In most cases, I think, the user uses each vector to produce something else and then discards it. Because these iterators generate up to very many items (a product of lengths, a factorial-related count, or a power of two), they create many temporary vectors, which can slow down code execution.
Proposed solution: map slices
An alternative iterator could own one single (internal/private) vector and only give a reference to it to a function implementing FnMut(&[...]) -> R. Those iterators would then generate R items instead of vectors.
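The core idea can be sketched in a few lines. This is a minimal, illustrative sketch (the function name `pairs_map` and its shape are mine, not the proposed API): one buffer is allocated once, the callback only borrows it as a slice, and the buffer is reused for every item instead of being reallocated.

```rust
// Sketch of the idea: a single internal buffer, reused across items,
// passed by reference to a FnMut(&[T]) -> R callback.
fn pairs_map<T: Clone, R, F: FnMut(&[T]) -> R>(items: &[T], mut func: F) -> Vec<R> {
    let mut buf: Vec<T> = Vec::with_capacity(2); // allocated once
    let mut out = Vec::new();
    for i in 0..items.len() {
        for j in (i + 1)..items.len() {
            buf.clear(); // drop the contents, keep the allocation
            buf.push(items[i].clone());
            buf.push(items[j].clone());
            out.push(func(&buf)); // the caller only borrows the buffer
        }
    }
    out
}

fn main() {
    // Sums of all 2-element combinations of [1, 2, 3].
    let sums = pairs_map(&[1, 2, 3], |pair: &[i32]| pair.iter().sum::<i32>());
    assert_eq!(sums, vec![3, 4, 5]);
    println!("{:?}", sums);
}
```

With `combinations(2)` each of those pairs would be a fresh `Vec`; here only `out` and the one buffer are ever allocated.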
// current methods unchanged, new methods:
multi_cartesian_product_map<R, F>(self, func: F)
combinations_map<R, F>(self, k: usize, func: F)
combinations_with_replacement_map<R, F>(self, k: usize, func: F)
permutations_map<R, F>(self, k: usize, func: F)
powerset_map<R, F>(self, func: F)
// all with F: FnMut(&[...]) -> R
Usage examples
it.combinations(k) and it.combinations_map(k, ToOwned::to_owned) would produce the same elements, but the first should be used in that case!
it.combinations(k).map(closure) and it.combinations_map(k, closure) could produce the same elements if the closure does not require an owned vector. From what I tested, the _map variant is then roughly 4 to 5 times faster at going through all combinations, unless the iterator has very few elements (in which case .map should be used).
In it.permutations(5).filter_map(|perm: Vec<_>| some_option), if some_option does not need perm to be an owned vector, it could become it.permutations_map(5, |perm: &[_]| some_option).flatten(), with no heap-allocation per permutation.
Implementation (commits here)
New module vec_items (behind use_alloc feature) with:
- `pub trait VecItems<T> { type Output; fn new_item<I: Iterator<Item = T>>(&mut self, iter: I) -> Self::Output; }`
- `pub struct CollectToVec;` implementing `VecItems` with `Vec<T>` output and just `iter.collect()` (so a new vector each time, as is currently done).
- `pub struct MapSlice<F, T> { func: F, vec: Vec<T> }` implementing `VecItems` with `R` output where `F: FnMut(&[T]) -> R`; the internal vector is cleared but not dropped after use (and its length will rarely, if ever, change).
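A self-contained sketch of that module, following the description above (the actual commits may differ in details):

```rust
// The proposed trait: builds one item of the iterator from the current indices.
pub trait VecItems<T> {
    type Output;
    fn new_item<I: Iterator<Item = T>>(&mut self, iter: I) -> Self::Output;
}

// Current behaviour: collect into a fresh Vec every time.
pub struct CollectToVec;

impl<T> VecItems<T> for CollectToVec {
    type Output = Vec<T>;
    fn new_item<I: Iterator<Item = T>>(&mut self, iter: I) -> Vec<T> {
        iter.collect()
    }
}

// Proposed behaviour: refill one internal vector and hand out a slice.
pub struct MapSlice<F, T> {
    func: F,
    vec: Vec<T>,
}

impl<T, R, F: FnMut(&[T]) -> R> VecItems<T> for MapSlice<F, T> {
    type Output = R;
    fn new_item<I: Iterator<Item = T>>(&mut self, iter: I) -> R {
        self.vec.clear(); // cleared but not dropped: the allocation is kept
        self.vec.extend(iter);
        (self.func)(&self.vec)
    }
}

fn main() {
    assert_eq!(CollectToVec.new_item(1..4), vec![1, 2, 3]);

    let mut map = MapSlice { func: |s: &[i32]| s.iter().sum::<i32>(), vec: Vec::new() };
    assert_eq!(map.new_item(1..4), 6);
    assert_eq!(map.new_item(4..6), 9); // same buffer reused, no new allocation
    println!("ok");
}
```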
E.g. for combinations_map:
- In the `Itertools` trait: `#[cfg(feature = "use_alloc")] fn combinations_map<R, F: FnMut(&[Self::Item]) -> R>(self, k: usize, f: F) -> CombinationsMap<Self, F> where ...`
- In `src/combinations.rs`:
  - Convert `Combinations<I>` to `CombinationsBase<I, F>` with the new field `manager: F`
  - New (fused) iterator constraint `F: VecItems<I::Item>` and item type `F::Output`
  - New `pub type Combinations<I> = CombinationsBase<I, CollectToVec>;`
  - New `pub type CombinationsMap<I, F> = CombinationsBase<I, MapSlice<F, <I as Iterator>::Item>>;`
  - New `fn combinations_map<I, F>(iter: I, k: usize, f: F) -> CombinationsMap<I, F>`
- A test comparing the output with `combinations`.
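To illustrate the `CombinationsBase` refactor end to end, here is a simplified, self-contained sketch (the trait is repeated so the example compiles on its own). It is not itertools' real implementation: the index-advancing logic is a naive stand-in, and it takes a pre-collected `Vec` instead of a lazy source, but it shows how one generic struct plus two type aliases covers both the current and the proposed behaviour.

```rust
pub trait VecItems<T> {
    type Output;
    fn new_item<I: Iterator<Item = T>>(&mut self, iter: I) -> Self::Output;
}

pub struct CollectToVec;
impl<T> VecItems<T> for CollectToVec {
    type Output = Vec<T>;
    fn new_item<I: Iterator<Item = T>>(&mut self, iter: I) -> Vec<T> { iter.collect() }
}

pub struct MapSlice<F, T> { func: F, vec: Vec<T> }
impl<T, R, F: FnMut(&[T]) -> R> VecItems<T> for MapSlice<F, T> {
    type Output = R;
    fn new_item<I: Iterator<Item = T>>(&mut self, iter: I) -> R {
        self.vec.clear();
        self.vec.extend(iter);
        (self.func)(&self.vec)
    }
}

// One base struct generic over the item manager, as in the proposal.
pub struct CombinationsBase<T, M> {
    pool: Vec<T>,
    indices: Vec<usize>,
    first: bool,
    manager: M, // the new field
}

pub type Combinations<T> = CombinationsBase<T, CollectToVec>;
pub type CombinationsMap<T, F> = CombinationsBase<T, MapSlice<F, T>>;

impl<T: Clone, M: VecItems<T>> Iterator for CombinationsBase<T, M> {
    type Item = M::Output;
    fn next(&mut self) -> Option<M::Output> {
        let k = self.indices.len();
        if self.first {
            if k > self.pool.len() { return None; }
            self.first = false;
        } else {
            // Advance to the next combination, rightmost index first.
            let mut i = k.checked_sub(1)?;
            loop {
                if self.indices[i] + (k - i) < self.pool.len() {
                    self.indices[i] += 1;
                    for j in i + 1..k {
                        self.indices[j] = self.indices[j - 1] + 1;
                    }
                    break;
                }
                if i == 0 { return None; }
                i -= 1;
            }
        }
        let pool = &self.pool;
        let it = self.indices.iter().map(|&i| pool[i].clone());
        Some(self.manager.new_item(it))
    }
}

fn combinations_map<T, R, F: FnMut(&[T]) -> R>(pool: Vec<T>, k: usize, f: F) -> CombinationsMap<T, F> {
    CombinationsBase {
        pool,
        indices: (0..k).collect(),
        first: true,
        manager: MapSlice { func: f, vec: Vec::with_capacity(k) },
    }
}

fn main() {
    let sums: Vec<i32> = combinations_map(vec![1, 2, 3, 4], 2, |c: &[i32]| c.iter().sum()).collect();
    assert_eq!(sums, vec![3, 4, 5, 5, 6, 7]);
    println!("{:?}", sums);
}
```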
Very similar for the 4 others. powerset[_map] is a bit simpler because it just uses combinations[_map] internally; and multi_cartesian_product turned out simpler than anticipated.
Closing thoughts
Avoid cloning in the MapSlice case? I think(?) it might be possible to hold only references: MapSlice { func: F, vec: Vec<&T> }. Then F: FnMut(&[&T]) -> R and no cloning. But I struggle with the lifetimes (or, more probably, some hard barrier). And is it worth it? Probably not if the items are just integers, but otherwise it might be.
The documentation I have not written yet would suggest using the non-_map variants when the user needs owned vectors rather than slices (or when very few elements are expected).
PS: I got the idea while replacing (0..nb).permutations(nb) (after profiling with flamegraph and heaptrack) with a _map version I wrote of permutohedron::Heap, to avoid any internal heap-allocation while keeping a nice iterator structure. And for 8! == 40320 permutations, that was also 5 times faster.
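For what it's worth, the borrowed version compiles fine in a function-style API, because the `Vec<&T>` never outlives the input slice; the hard part (which this sketch deliberately sidesteps) is storing such borrows inside a lazy iterator struct next to the data they borrow from. A minimal sketch, with an illustrative name (`pairs_map_ref`) of my own:

```rust
// No T: Clone bound: the reusable buffer holds references, not clones.
fn pairs_map_ref<T, R, F: FnMut(&[&T]) -> R>(items: &[T], mut func: F) -> Vec<R> {
    let mut buf: Vec<&T> = Vec::with_capacity(2); // references only, no cloning
    let mut out = Vec::new();
    for i in 0..items.len() {
        for j in (i + 1)..items.len() {
            buf.clear();
            buf.push(&items[i]);
            buf.push(&items[j]);
            out.push(func(&buf));
        }
    }
    out
}

fn main() {
    // Works even though nothing is cloned; String here just shows it is not
    // limited to Copy types like integers.
    let words = [String::from("a"), String::from("b"), String::from("c")];
    let joined = pairs_map_ref(&words, |p: &[&String]| format!("{}{}", p[0], p[1]));
    assert_eq!(joined, vec!["ab", "ac", "bc"]);
    println!("{:?}", joined);
}
```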