A stronger, fairer way to secure cryptocurrency - and a benchmark tool to measure it
RandomX is a proof-of-work algorithm - a mathematical puzzle that computers solve to validate and secure cryptocurrency transactions. Think of it like a combination lock that takes real effort to open, ensuring nobody can cheat the system.
What makes RandomX special is that it's designed to run best on everyday CPUs - the same processors in regular desktops and laptops - rather than expensive specialized hardware. This keeps mining accessible and decentralized.
Each "hash" is one attempt at solving the puzzle. More hashes = more attempts per second.
RandomX v2 makes the puzzle 50% harder per hash by increasing the number of computational steps from ~4.2 million to ~6.3 million operations. But "harder" here is a good thing - it means each solution proves more real work was done.
| Property | v1 (Original) | v2 (New) |
|---|---|---|
| Operations per hash | ~4.2 million | ~6.3 million |
| Program size | 256 instructions | 384 instructions |
| Work increase | Baseline | +50% more work per hash |
| Best hardware | CPUs | CPUs (even more so) |
- More operations per hash means the puzzle is harder to shortcut with specialized chips (ASICs/FPGAs).
- Keeping everyday CPUs competitive means regular people can participate - not just large mining farms.
- Modern CPUs can do more useful work per watt of energy with the v2 algorithm.
- The benchmark suite rigorously tests performance, stability, and power consumption before adoption.
This project includes an automated benchmarking script that runs both v1 and v2 side by side on your hardware and compares the results. Here's what it measures:
- **Hashrate** - how many hashes per second each version achieves, the raw throughput of the algorithm.
- **Power draw** - watts consumed during the test, measured directly from CPU power sensors (RAPL).
- **Energy efficiency** - RandomX instructions per joule of energy, the most important metric for real-world viability.
- **Stability** - tracks crashes and errors over hundreds of runs to ensure the algorithm is rock-solid.
- **Effective work** - combines hashrate with operations-per-hash to show real computational output per second.
- **Auto-tuning** - automatically detects your CPU type and applies optimal settings for AMD Ryzen, EPYC, and Intel processors.
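As a sketch of how the derived metrics combine, assuming hypothetical measured values (the function name and numbers here are illustrative, not from the actual script; the real tool reads hashrate from the miner and watts from RAPL sensors):

```python
OPS_PER_HASH_V2 = 6_815_744  # total operations per v2 hash (VM + AES), from the text

def efficiency_metrics(hashrate_hs: float, power_w: float) -> dict:
    """Combine raw measurements into the benchmark's derived metrics."""
    ops_per_sec = hashrate_hs * OPS_PER_HASH_V2  # effective work per second
    return {
        "hashrate_hs": hashrate_hs,
        "power_w": power_w,
        "ops_per_sec": ops_per_sec,
        "ops_per_joule": ops_per_sec / power_w,  # the key viability metric
    }

# Illustrative run: 10 kH/s at 120 W
m = efficiency_metrics(hashrate_hs=10_000, power_w=120)
```

The point of the ops-per-joule metric is that it stays comparable across algorithm versions: a version with lower hashrate can still win if each hash carries more operations.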
The ideal outcome of adopting RandomX v2: mining that stays efficient, stable, and dominated by everyday CPUs rather than specialized hardware.
An analogy: imagine a spelling bee where contestants prove their skill by spelling words correctly. The benchmark tool is the stopwatch and scorecard - it times each contestant, tracks their accuracy, and figures out who is the most efficient speller.
To understand why v2 exists, you need to understand a fundamental tension inside your CPU: computation is fast, but fetching data from memory is slow.
RandomX's main loop works by repeatedly executing a small random program and then reading a chunk of data (64 bytes) from a large Dataset stored in RAM. Each program iteration is deliberately tuned to take about 50–55 nanoseconds - roughly the same time it takes for DRAM to deliver data. The idea is that the CPU computes while the next piece of data is being fetched in the background (prefetched), so neither the CPU nor the memory bus sits idle.
The problem? While CPU cores got faster over the years, RAM latency stayed basically the same. CPUs have advanced roughly 1.5x since RandomX was designed in 2019, but DRAM still takes ~50–55 ns to respond. Modern processors like AMD Zen 5 finish the v1 program well before the data arrives - and then sit idle, burning power while waiting.
The memory fetch takes ~50 ns. The CPU finishes its work in less time, then stalls waiting for data.
This idle window is the core inefficiency of v1. The CPU has finished all its work but the next piece of dataset hasn't arrived yet. Those wasted cycles still consume power - your CPU is burning electricity while doing zero useful work.
RandomX v2 makes two key changes to eliminate the idle window:
1. **Longer programs.** CPUs have gotten roughly 1.5x faster since RandomX was designed in 2019, but RAM latency stayed at ~50-55 ns. So v2 increases the program to 384 instructions (1.5x), matching the CPU-to-memory speed ratio again and filling the gap with real computation.
2. **AES register mixing.** In v1, the F and E registers were mixed with a simple XOR - nearly instant. In v2, they're mixed with 16 AES encryption rounds. This uses the CPU's dedicated AES hardware during what was previously dead time, and also improves data entropy before scratchpad writes.
Same memory fetch time, but now the CPU stays busy until the data arrives. No wasted cycles, no wasted energy.
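The idle window can be modeled with back-of-the-envelope arithmetic. The ~50 ns DRAM latency and the 1.5x CPU speedup are from the text; treating a 2019-era CPU as taking exactly 50 ns per v1 program iteration is an illustrative assumption:

```python
MEM_LATENCY_NS = 50.0  # DRAM fetch time per iteration (~50-55 ns per the text)

def idle_ns(compute_ns: float) -> float:
    """Time the CPU stalls per iteration waiting for the prefetched data."""
    return max(0.0, MEM_LATENCY_NS - compute_ns)

# Illustrative: a 2019-era CPU ran the 256-instruction v1 program in ~50 ns,
# so compute and memory were balanced. A modern core is ~1.5x faster:
v1_modern = idle_ns(50.0 / 1.5)        # ~16.7 ns of stall per iteration
# v2's 384-instruction program is 1.5x longer, restoring the balance:
v2_modern = idle_ns(50.0 / 1.5 * 1.5)  # 0 ns - compute covers the fetch again
```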
The second major optimization is about how far ahead the CPU looks when pre-loading data from RAM.
v1 (1-iteration lookahead): while running iteration N, the CPU requests the data for iteration N+1. On faster modern CPUs, this isn't enough lead time - the data may not arrive before it's needed, causing a stall.

v2 (2-iteration lookahead): while running iteration N, the CPU requests the data for iteration N+2. This gives DRAM twice as long to deliver, virtually guaranteeing the data is ready when needed - even on the fastest modern processors.
By requesting data two iterations in advance, the memory system has twice as long to deliver - eliminating stalls on fast modern CPUs.
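The rule above reduces to a simple timing check: data requested k iterations early has k iteration-times to arrive. A sketch (the 35 ns iteration time is an illustrative assumption for a fast modern CPU; the ~50 ns latency is from the text):

```python
def prefetch_hides_latency(lookahead: int, iter_ns: float, mem_latency_ns: float) -> bool:
    """True if data requested `lookahead` iterations early arrives before it's needed."""
    return lookahead * iter_ns >= mem_latency_ns

# Illustrative: a modern CPU finishing each iteration in 35 ns vs ~50 ns DRAM latency.
v1_ok = prefetch_hides_latency(lookahead=1, iter_ns=35.0, mem_latency_ns=50.0)  # stall
v2_ok = prefetch_hides_latency(lookahead=2, iter_ns=35.0, mem_latency_ns=50.0)  # no stall
```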
A third, more subtle change targets floating-point performance. The CFROUND
instruction controls the CPU's rounding mode for floating-point math. In v1, it executed
frequently - but changing the rounding mode is expensive on modern CPUs
because it forces the processor to flush its instruction pipeline.
In v2, CFROUND becomes conditional, executing only 1/16th of the
time. This dramatically reduces pipeline flushes. On AMD Zen 1 CPUs, this single change
yielded an 8.4% hashrate increase, with 5–10% gains expected across
AMD architectures.
v1 - CFROUND fires often (every ~16 instructions):
Each red bar = pipeline flush. CPU throws away in-progress work and restarts.
v2 - CFROUND conditional (1/16 chance):
Rarely interrupted. The CPU's pipeline stays full almost all the time.
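Using the figures above - CFROUND appearing roughly every 16 instructions, executing unconditionally in v1 but with only a 1/16 chance in v2 - the expected flush counts per program work out as follows (an illustrative expectation, not a cycle-accurate model):

```python
def expected_flushes(program_len: int, cfround_spacing: int, exec_prob: float) -> float:
    """Expected pipeline flushes per program: CFROUND occurrences x execution probability."""
    return (program_len / cfround_spacing) * exec_prob

v1 = expected_flushes(program_len=256, cfround_spacing=16, exec_prob=1.0)       # ~16 flushes
v2 = expected_flushes(program_len=384, cfround_spacing=16, exec_prob=1.0 / 16)  # ~1.5 flushes
```

Even though v2 programs are 1.5x longer, the conditional execution cuts expected flushes by roughly an order of magnitude.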
The total operations per hash tell the story: v1 performs 4,456,448 total ops (4.2M VM + 262K AES), while v2 performs 6,815,744 total ops (6.3M VM + 524K AES) - a 52.9% increase in work per hash.
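The arithmetic checks out exactly, using the per-component counts stated above:

```python
# Per-hash operation counts from the text (VM ops + AES ops).
v1_total = 4_194_304 + 262_144  # = 4,456,448 (~4.2M VM + 262K AES)
v2_total = 6_291_456 + 524_288  # = 6,815,744 (~6.3M VM + 524K AES)
increase_pct = (v2_total / v1_total - 1) * 100  # ~52.9% more work per hash
```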
Raw hashrate (hashes/second) drops on many CPUs - by as much as ~24% on older cores - but the work per joule, the metric that actually matters, improves dramatically:
| Processor | Arch | Hashrate Change | Work/Joule Improvement |
|---|---|---|---|
| Ryzen AI 9 365 | Zen 5 | +9.2% | +67.0% |
| Ryzen AI 9 HX 370 | Zen 5 | +8.0% | +65.1% |
| Ryzen 7 1700X | Zen 1 | +0.8% | +54.1% |
| Core i9-12900K | Alder Lake | -3.9% | +47.0% |
| Ryzen 9 7945HX | Zen 4 | -5.1% | +45.2% |
| Ryzen 9 3950X | Zen 2 | -7.9% | +40.9% |
| Ryzen 5 8600G | Zen 4 | -8.5% | +39.9% |
| Ryzen 9 5950X | Zen 3 | -12.5% | +38.2% |
| Ryzen 7 3700X | Zen 2 | -14.7% | +30.5% |
| Core i7-8650U | Kaby Lake-R | -22.7% | +18.2% |
| Core i7-6820HQ | Skylake | -24.4% | +15.6% |
Every tested CPU shows a significant efficiency gain - newer architectures benefit the most.
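Even where raw hashrate falls, effective work per second (hashrate x ops-per-hash) rises, because each v2 hash carries ~52.9% more operations. A quick check against two rows of the table (work per joule additionally depends on each chip's measured power draw, which isn't listed here):

```python
WORK_RATIO = 6_815_744 / 4_456_448  # v2 performs ~1.529x the operations per hash

def effective_work_change(hashrate_change_pct: float) -> float:
    """Percent change in operations/second implied by a hashrate change."""
    return ((1 + hashrate_change_pct / 100) * WORK_RATIO - 1) * 100

zen5 = effective_work_change(+9.2)   # Ryzen AI 9 365: ~+67% more ops/sec
zen3 = effective_work_change(-12.5)  # Ryzen 9 5950X: ~+34% despite the hashrate drop
```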
All of these changes share a common theme: making it harder to build a shortcut.
The net effect: a CPU designed for everyday computing (browsing, compiling, gaming) is also the ideal machine for RandomX v2. No specialized hardware can do it significantly better.
An ASIC must defeat every layer to gain an advantage.
Each layer pushes the design closer to a general-purpose CPU.