About MLX Benchmark
MLX Benchmark helps you understand how large language models perform on your hardware. Select a model and quantization, set the context length and iteration count, and run repeatable benchmarks that capture load time, tokenization time, time to first token, throughput, memory usage, and jitter. View detailed metrics grids, visualize comparisons in charts, and manage downloaded model repos directly in the app. Built for developers and AI enthusiasts who want clear performance insights without leaving their device.
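For context, here is a minimal sketch of how the headline metrics can be derived from per-token timestamps. The `TokenTimings` type and field names are illustrative assumptions, not the app's actual API:

```swift
import Foundation

/// Illustrative container for one benchmark iteration.
/// (Hypothetical type; not the app's actual data model.)
struct TokenTimings {
    let promptSubmittedAt: Date     // when generation was requested
    let tokenArrivalTimes: [Date]   // timestamp of each generated token
}

/// Derive time-to-first-token, throughput, and jitter from one run.
func summarize(_ run: TokenTimings) -> (ttft: TimeInterval, tokensPerSecond: Double, jitter: TimeInterval)? {
    guard let first = run.tokenArrivalTimes.first,
          let last = run.tokenArrivalTimes.last,
          run.tokenArrivalTimes.count > 1 else { return nil }

    // Time to first token: request submission -> first token.
    let ttft = first.timeIntervalSince(run.promptSubmittedAt)

    // Throughput: tokens after the first, over the first->last window
    // (so the TTFT stall does not skew the steady-state rate).
    let window = last.timeIntervalSince(first)
    guard window > 0 else { return nil }
    let tokensPerSecond = Double(run.tokenArrivalTimes.count - 1) / window

    // Jitter: standard deviation of the inter-token gaps.
    let gaps = zip(run.tokenArrivalTimes.dropFirst(), run.tokenArrivalTimes)
        .map { $0.timeIntervalSince($1) }
    let mean = gaps.reduce(0, +) / Double(gaps.count)
    let variance = gaps.reduce(0) { $0 + ($1 - mean) * ($1 - mean) } / Double(gaps.count)

    return (ttft, tokensPerSecond, variance.squareRoot())
}
```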
Features:
- Model selection with quantization details
- Configurable prompt, context length, and iterations
- Metrics for load, tokenization, first-token latency, throughput, jitter, and memory
- Charts to compare runs and devices
- Downloaded model management (list, delete, refresh)
- Optional Supabase upload for aggregating benchmark results
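The optional Supabase upload can be pictured as a plain POST to Supabase's standard PostgREST endpoint. This is a minimal sketch: the `BenchmarkResult` fields, the `benchmark_results` table name, and the `projectURL`/`anonKey` parameters are all placeholder assumptions, not the app's actual schema:

```swift
import Foundation

/// Hypothetical result payload; field names are assumptions, not the app's schema.
struct BenchmarkResult: Codable {
    let model: String
    let quantization: String
    let tokensPerSecond: Double
    let ttftMs: Double
}

/// Insert one result row via the Supabase REST API (PostgREST).
func upload(_ result: BenchmarkResult,
            projectURL: URL,          // e.g. https://<project>.supabase.co (placeholder)
            anonKey: String) async throws {
    var request = URLRequest(url: projectURL.appendingPathComponent("rest/v1/benchmark_results"))
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    // Supabase's REST API expects the key in both headers.
    request.setValue(anonKey, forHTTPHeaderField: "apikey")
    request.setValue("Bearer \(anonKey)", forHTTPHeaderField: "Authorization")
    request.httpBody = try JSONEncoder().encode(result)

    let (_, response) = try await URLSession.shared.data(for: request)
    guard let http = response as? HTTPURLResponse, (200..<300).contains(http.statusCode) else {
        throw URLError(.badServerResponse)
    }
}
```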