Benchmarks¶
Performance comparisons: rusket vs mlxtend, scaling to 1 Billion rows, and FPMiner throughput.
rusket includes two mining algorithms (FP-Growth and Eclat), both implemented in Rust. These benchmarks compare rusket against mlxtend (pure Python) on synthetic and real-world datasets.
Benchmark environment: Apple Silicon MacBook Air (M-series, arm64, 8 GB RAM). All timings are single-run wall-clock measurements. Your results may vary depending on hardware, thermal throttling, and background load.
Scale Benchmarks: 1M → 1B Rows 🚀¶
Interactive Chart¶
Three Input Paths¶
rusket supports three ways to ingest data at scale:
from_transactions→ sparse DataFrame — returns a pandas DataFrame, easy API- Direct CSR → Rust — pass
scipy.sparse.csr_matrixdirectly tofpgrowth(), skips pandas entirely FPMinerStreaming — memory-safe accumulator for 100M+ rows that don't fit in RAM
In-Memory Scale (fpgrowth)¶
| Scale | from_transactions → fpgrowth |
Direct CSR → fpgrowth | Speedup |
|---|---|---|---|
| 1M rows (200k txns × 10k items) | 5.0s | 0.1s | 50× |
| 10M rows (2M txns × 50k items) | 24.4s | 1.2s | 20× |
| 50M rows (10M txns × 100k items) | 63.1s | 4.0s | 15× |
| 100M rows (20M txns × 200k items) | 134.2s | 10.1s | 13× |
| 200M rows (40M txns × 200k items) | 246.8s | 17.6s | 14× |
Direct CSR is the power-user path
At 100M rows, direct CSR mining takes 1.2 seconds — the bottleneck is entirely the CSR build (24.5s). Compare to the pandas sparse path where mining alone takes 9.0s due to sparse.to_coo().tocsr() overhead.
Out-of-Core Scale (FPMiner Streaming)¶
For real-world retail datasets scaling to 1 Billion rows, FPMiner uses a memory-safe chunks approach (per-chunk sort + k-way merge).
Benchmark: 2,603 retail items, avg 4.4 items/basket, min_support = 0.1%
| Scale | add_chunk() | mine() | Total Time | Itemsets Found |
|---|---|---|---|---|
| 50M rows | 4.8s | 5.6s | 10.4s | 1,260 |
| 100M rows | 10.6s | 13.9s | 24.6s | 1,254 |
| 200M rows | 22.7s | 33.2s | 55.9s | 1,261 |
| 300M rows | 30.0s | 55.4s | 85.4s | 1,259 |
vs mlxtend¶
| Dataset | rusket (prep + mine) |
mlxtend (prep + mine) |
Speedup |
|---|---|---|---|
| 50k rows (10k txns, 100 items) | 0.0 s | 0.1 s | ~5× |
| 500k rows (50k txns, 500 items) | 0.2 s | 1.8 s | ~9× |
| 2M rows (500k txns, 2k items) | 0.2 s | 16.0 s | 80× |
The "Auto" Routine Algorithm¶
rusket.mine(method="auto") dynamically selects the algorithm that performs best based on the dataset density (Borgelt 2003 heuristic).
- FP-Growth: Excellent for dense data.
- Eclat: Excellent for sparse data (like retail baskets). Eclat directly uses hardware SIMD array-intersections (
popcnt) on the TID-lists, resulting in massive speedups (often 5× to 15× faster on sparse arrays).
Real-World Datasets¶
| Dataset | Transactions | Items | rusket |
mlxtend |
Speedup |
|---|---|---|---|---|---|
| andi_data.txt | 8,416 | 119 | 9.7 s (22.8M itemsets) | TIMEOUT 💥 | ∞ |
| andi_data2.txt | 540,455 | 2,603 | 7.9 s | 16.2 s | 2× |
Dense data
On andi_data.txt (~23 items/basket), mlxtend can't finish in 60s. rusket mines 22.8M itemsets in under 10s.
The Power-User Pipeline¶
import numpy as np
from scipy import sparse as sp
from rusket import FPGrowth
csr = sp.csr_matrix(
(np.ones(len(txn_ids), dtype=np.int8), (txn_ids, item_ids)),
shape=(n_transactions, n_items),
)
freq = FPGrowth(csr).mine(min_support=0.001, max_len=3, column_names=item_names)
At 100M rows, the mining step takes 1.2 seconds (not a typo).
🏆 Conquering the 1 Billion Row Challenge¶
Bottleneck 1: Memory Exhaustion during Ingestion¶
The Solution: The FPMiner class provides an out-of-core streaming API. It accepts chunks of (txn_id, item_id) pairs, performs a fast \(O(k \log k)\) sort in Rust, buffers them, and uses a k-way merge to stream directly into the final compressed CSR memory block.
Bottleneck 2: Algorithmic Memory Thrashing¶
The Solution: rusket employs a zero-allocation intersect_count_into() kernel. It pre-allocates a thread-local scratch buffer, intersected in-place, with an early-exit heuristic that aborts the memory scan the moment it proves the remaining bits cannot satisfy min_support.
Bottleneck 3: Sequential Seriality¶
The Solution: rusket merges tree construction into the parallel worker loop. Conditional trees are collected and mined concurrently inside the rayon thread pool.
Running the benchmarks¶
uv run maturin develop --release
uv run python benchmarks/bench_scale.py # Scale benchmark + Plotly chart
uv run python benchmarks/bench_realworld.py
uv run python benchmarks/bench_vs_mlxtend.py
uv run pytest tests/test_benchmark.py -v -s
Why is rusket faster?¶
| Technique | Description |
|---|---|
| Zero-copy CSR | indptr/indices passed to Rust as pointer hand-offs |
| Arena FP-Tree | Flat children arena, incremental is_path() tracking |
| Rayon | Parallel conditional mining across CPU cores |
| Eclat popcount | Vec<u64> bitsets + hardware popcnt for support |
| No Python loops | FP-Tree, mining, and metrics all in Rust |
pd.factorize |
O(n) integer encoding, faster than pd.Categorical at scale |
Recommender Benchmarks vs LibRecommender¶
Measured with
pytest-benchmark(5 rounds, warmed up, GC disabled). MovieLens 100k dataset (943 users, 1,682 items, 100k ratings). Onlymodel.fit()is timed — no startup or data loading overhead.
| Benchmark | rusket | LibRecommender | Speedup |
|---|---|---|---|
| ALS (Cholesky) (64 factors, 15 epochs) | 427 ms | 1,324 ms | 3.1× |
| ALS (eALS) (64 factors, 15 epochs) | 360 ms | N/A | — |
| BPR (64 factors, 10 epochs) | 33 ms | 681 ms | 20.4× |
| ItemKNN (k=100) | 55 ms | 287 ms | 5.2× |
| SVD (64 factors, 20 epochs) | 55 ms | ❌ TF-only (broken) | — |
| EASE | 71 ms | N/A | — |
Benchmark vs PySpark¶
PySpark is the industry standard for large-scale recommendation workloads. While rusket runs strictly single-machine (utilizing Rust Rayon threading), it relies on native memory and avoids the JVM serialization overhead, resulting in massive speedups over PySpark local mode.
Benchmark environment: Apple Silicon M-Series,
local[*]Spark master, 1 million ALS interactions (20k Users × 10k Items), 50k FPGrowth transactions.
| Algorithm | rusket |
pyspark.ml |
Speedup |
|---|---|---|---|
| FPGrowth (50k txns) | 0.027 s | 3.89 s | 139.8× |
| ALS (1M interactions) | 6.5 s | 71.1 s | 10.9× |