Benchmarks¶
Performance comparisons and speedup measurements.
Overview¶
EGGROLL achieves roughly a 100x speedup over standard evolution strategies (ES) on large models while reaching similar or better final performance.
Speedup Measurements¶
Memory Reduction¶
For a model whose parameters are dense weight matrices, per-generation perturbation memory compares as follows:
| Model Size | Standard ES Memory | EGGROLL Memory (r=1) | Reduction |
|---|---|---|---|
| Small (1M params) | ~400 MB | ~4 MB | 100x |
| Medium (10M params) | ~4 GB | ~40 MB | 100x |
| Large (100M params) | ~40 GB | ~400 MB | 100x |
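The reduction comes from storing two thin rank-r factors per perturbation instead of a full noise matrix. A minimal sketch of the arithmetic (layer sizes here are illustrative; the exact 100x figures in the table depend on layer shapes and rank):

```python
def perturbation_bytes(m, n, rank=None, dtype_bytes=4):
    """Bytes needed to store one perturbation of an m x n weight matrix.

    rank=None -> full-rank noise (standard ES): m * n entries.
    rank=r    -> low-rank factors A (m x r) and B (n x r): (m + n) * r entries.
    """
    if rank is None:
        return m * n * dtype_bytes
    return (m + n) * rank * dtype_bytes

# A square 1000 x 1000 layer: full noise vs. rank-1 factors.
full = perturbation_bytes(1000, 1000)          # 4_000_000 bytes (~4 MB)
low = perturbation_bytes(1000, 1000, rank=1)   # 8_000 bytes (~8 KB)
print(full // low)  # 500x reduction for this layer
```

The reduction grows with layer size: it is min(m, n) / (2r) for a roughly square layer, so larger models benefit more.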
Computation Speedup¶
| Model Size | Standard ES Time | EGGROLL Time (r=1) | Speedup |
|---|---|---|---|
| Small (1M params) | 1.0s | 0.01s | 100x |
| Medium (10M params) | 10.0s | 0.1s | 100x |
| Large (100M params) | 100.0s | 1.0s | 100x |
Times are per generation with `population_size=256`.
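The time savings come from never materializing the perturbed matrix: for a rank-1 perturbation `W + u v^T`, a forward pass `(W + u v^T) x` can be computed as `W x + u (v^T x)`, costing O(m + n) extra work per population member instead of O(m * n). A NumPy sketch of the identity (shapes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 256, 512
W = rng.standard_normal((m, n))   # base weight matrix
u = rng.standard_normal((m, 1))   # rank-1 factor
v = rng.standard_normal((n, 1))   # rank-1 factor
x = rng.standard_normal(n)        # input activation

# Naive: materialize the perturbed matrix, O(m*n) extra memory and time.
y_naive = (W + u @ v.T) @ x

# Low-rank trick: a scalar dot product plus a scaled vector, no m x n noise.
y_fast = W @ x + u[:, 0] * (v[:, 0] @ x)

assert np.allclose(y_naive, y_fast)
```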
MNIST Classification¶
Setup¶
- Model: Simple CNN (Conv2d → Conv2d → Linear → Linear)
- Dataset: MNIST
- Population size: 256
- Generations: 100
Results¶
| Method | Accuracy | Time per Generation | Total Time |
|---|---|---|---|
| EGGROLL | 85-90% | 0.5s | 50s |
| VanillaESTrainer | 80-85% | 50s | 5000s |
| SGD | 95%+ | 0.1s | 10s |
Notes¶
- EGGROLL is 100x faster than VanillaESTrainer
- EGGROLL achieves competitive accuracy
- SGD is faster but requires gradients
- EGGROLL is gradient-free (useful for non-differentiable objectives)
Population Size Scaling¶
Standard ES¶
| Population Size | Time per Generation | Memory |
|---|---|---|
| 50 | 10s | 2 GB |
| 100 | 20s | 4 GB |
| 256 | 50s | 10 GB |
| 512 | 100s | 20 GB |
Memory and time scale linearly with population size.
EGGROLL¶
| Population Size | Time per Generation | Memory |
|---|---|---|
| 50 | 0.1s | 20 MB |
| 100 | 0.2s | 40 MB |
| 256 | 0.5s | 100 MB |
| 512 | 1.0s | 200 MB |
| 1024 | 2.0s | 400 MB |
EGGROLL makes large populations feasible!
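Both methods scale linearly in population size, but with very different constants: total noise storage is `pop * m * n` entries for full-rank ES versus `pop * (m + n) * r` for low-rank perturbations. A small sketch of that total for a single layer (sizes are illustrative, float32 assumed):

```python
def population_noise_mb(pop, m, n, rank=None, dtype_bytes=4):
    """Total perturbation memory (MB) for a whole population over one m x n layer."""
    per_member = (m * n) if rank is None else (m + n) * rank
    return pop * per_member * dtype_bytes / 1e6

# Illustrative 1000 x 1000 layer at population 512:
print(population_noise_mb(512, 1000, 1000))          # 2048.0 MB full-rank
print(population_noise_mb(512, 1000, 1000, rank=1))  # 4.096 MB rank-1
```

At rank 1, even a population of thousands stays within a few megabytes per layer, which is why large populations become practical.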
Rank Comparison¶
Effect of Rank¶
| Rank | Memory | Time | Accuracy |
|---|---|---|---|
| 1 | 1x | 1x | Baseline |
| 2 | 2x | 2x | +0.5% |
| 4 | 4x | 4x | +1.0% |
`rank=1` is often sufficient; higher ranks buy small accuracy gains at proportionally higher memory and compute cost.
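A rank-r perturbation is built from two thin Gaussian factors. One common convention (a sketch; EGGROLL's exact normalization may differ) scales by 1/sqrt(r) so the entrywise variance stays roughly constant across ranks, making different ranks comparable at the same step size:

```python
import numpy as np

def low_rank_noise(m, n, rank, sigma=0.1, rng=None):
    """Sample a rank-`rank` perturbation E = sigma * A @ B.T / sqrt(rank).

    With standard-normal factors, entries of E have variance ~sigma**2
    regardless of rank (a common convention; the library's exact scaling
    may differ).
    """
    rng = rng or np.random.default_rng()
    A = rng.standard_normal((m, rank))
    B = rng.standard_normal((n, rank))
    return sigma * (A @ B.T) / np.sqrt(rank)

E = low_rank_noise(64, 128, rank=2)
print(E.shape)                      # (64, 128)
print(np.linalg.matrix_rank(E))    # 2
```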
Convergence Comparison¶
Parameter Matching Task¶
Training a model to match target parameters:
| Method | Generations to Converge | Final Fitness |
|---|---|---|
| EGGROLL (r=1) | 50 | -0.05 |
| EGGROLL (r=2) | 45 | -0.04 |
| VanillaESTrainer | 50 | -0.05 |
| SGD | 20 | -0.01 |
EGGROLL matches VanillaESTrainer's convergence behavior with a ~100x speedup per generation.
Real-World Applications¶
Reinforcement Learning¶
- Task: CartPole-v1
- Method: EGGROLL policy optimization
- Result: Comparable to PPO with gradient-free optimization
Hyperparameter Optimization¶
- Task: Tune learning rate, batch size, architecture
- Method: EGGROLL black-box optimization
- Result: Finds good hyperparameters without gradients
Non-Differentiable Objectives¶
- Task: Discrete/combinatorial optimization
- Method: EGGROLL with custom fitness
- Result: Works where gradient methods fail
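To make the gradient-free claim concrete, here is a minimal generic ES loop (not EGGROLL's API, which is not shown in this section) optimizing a non-differentiable fitness, a count of sign matches against a target, where backpropagation is impossible:

```python
import numpy as np

def fitness(theta, target):
    """Non-differentiable objective: number of sign matches with a target."""
    return float(np.sum(np.sign(theta) == np.sign(target)))

rng = np.random.default_rng(0)
dim, pop, sigma, lr = 32, 64, 0.1, 0.05
target = rng.standard_normal(dim)
theta = np.zeros(dim)

for _ in range(200):
    eps = rng.standard_normal((pop, dim))  # one perturbation per member
    scores = np.array([fitness(theta + sigma * e, target) for e in eps])
    std = scores.std()
    if std > 0:  # skip the update once all members score identically
        centered = (scores - scores.mean()) / std
        theta += lr / (pop * sigma) * eps.T @ centered

print(fitness(theta, target))  # improves from 0 toward the maximum of 32
```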
Tips for Benchmarking¶
- **Use large populations**: EGGROLL makes this efficient
- **Compare fairly**: same population size, same hyperparameters
- **Measure wall-clock time**: not just theoretical complexity
- **Consider memory**: EGGROLL uses far less than standard ES
- **Test on your models**: results vary by architecture
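Wall-clock time per generation can be measured with a small harness like the sketch below; `run_generation` is a hypothetical stand-in for one step of whichever trainer you benchmark:

```python
import time

def time_per_generation(run_generation, warmup=2, iters=10):
    """Average wall-clock seconds per call of `run_generation`."""
    for _ in range(warmup):  # warm up caches / JIT before timing
        run_generation()
    start = time.perf_counter()
    for _ in range(iters):
        run_generation()
    return (time.perf_counter() - start) / iters

# Usage with a dummy generation step:
print(f"{time_per_generation(lambda: sum(range(10_000))):.6f}s")
```

Averaging over several iterations after a warmup avoids counting one-time setup costs against either method.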
References¶
- See `examples/mnist_comparison.py` for comparison code
- See `examples/run_all_comparisons.py` for multi-architecture benchmarks
- Check the EGGROLL Blog for the original benchmarks
Next Steps¶
- Run benchmarks on your models
- See Examples for benchmark code
- Check User Guide for optimization tips