Dakarand in the Linux kernel

In 2014 I decided to improve the Linux kernel's random number generator
a bit.  Recently we had all learned the importance of good crypto and
that without a decent RNG all crypto is guaranteed to be bad, so I was
motivated.  It is also clear that the best place for a good RNG is
inside the kernel, as all other code can read random numbers from the
kernel.

So how do we guarantee good random numbers?  Most answers turn out to be
bad.  Most options require some kind of hardware that isn't necessarily
available.  RDRAND, keyboards, disk drives, network adapters, etc. are
all optional and depending on them for a generic RNG seemed silly.  The
non-optional hardware is little more than a CPU, memory and an interrupt
controller.  Pretty limited indeed.

Queue DakaRand.  The idea behind DakaRand is to have two unsynchronized
high-precision clocks and use the drift between those clocks as entropy
source.  DakaRand uses the system clock and a busy loop on the CPU
flipping a bit.  We probably don't want a busy loop, though.

What we can use is the register set.  On interrupts, the register set
gets saved before executing interrupt code.  And some of those registers
are changing at a high rate, e.g. the Program Counter changes more or
less every cycle.  So without detailed knowledge about the register set,
we know that some bits are changing at a high rate and therefore a good
entropy source.  Hashing all registers together will collect that
entropy and thereby seed the RNG.

Ted Ts'o didn't quite like my patch and changed it to only use a single
register per interrupt.  A counter is used to pick a different register
on every interrupt, so we eventually cycle through the register set.
Advantage of Ted's change is to reduce CPU overhead.  Disadvantage is
that it may take a bit longer until enough entropy is collected.  See
commit ee3e00e9e710.

---

To do a bit of theoretical handwaving and satisfy the ivory tower, the
two clocks need to be unsynchronized and high-precision.  Unsynchronized
ensures that the two clocks will randomly drift wrt. each other.  High
precision means the drift translates into flipped bits.

Clocks don't need to be accurate, monotonic or any of the other things
one usually wants from good clocks.  If a clock is off by a second or
even a millenium, we can still extract entropy.  Monotonically counting
up is one way of flipping bits, erratically counting up or down or
making large jumps works just as well.

That means that CPU registers are actually a clock for our purposes.
While you probably can't tell the time of day from it, you have close to
nanosecond precision on most systems.

The interrupts shouldn't be synchronized - or mostly unsynchronized.
F.e.  It is ok if interrupts can only get triggered on even cycles.  In
that case the low bit of the cycle is predictable, but the second lowest
bit remains random.  Since it is hard to know whether such
synchronization is present on any of the 20+ major architecturs the
Linux kernel supports, it's best to just hash everything and assume
there will be entropy somewhere - and use a sufficiently good hash
function to extract that entropy.

---

What remains to be done - mostly because I've been lazy/busy and didn't
send a patch yet - is to handle the two hard cases.  On bootup there is
no entropy yet and the system has to collect some 100 bits of entropy as
quickly as possible.  So Ted's optimization is useful for machines with
100k interrupts per second in the steady state.  But it hurts for the
first few seconds after bootup.

Obvious solution would be a bootup mode.  The first 256 (or so)
interrupts will use all registers instead of just one.  Pessimistically
we can assume each interrupt to collect 1 bit of entropy in bootup mode.
With HZ=100, we should have a fully loaded RNG after 2.6s with timer
interrupts alone.  I believe this method will collect more than 1 bit of
entropy on any hardware and invite people to try proving me wrong.

Final hard problem are virtual machines with snapshots.  Hundreds of VMs
can resume from the same snapshot, with identical RNG state, and now we
essentially have the same problem as on bootup.  But there is no
indication that a bootup mode would be useful now.

Best I can come up with is to collect good entropy (full register set)
at a low rate, maybe once every second.  That should still be cheap
enough for systems with 100k interrupts per second, while quickly
remixing the entropy pool after resume.  It would still be a bad idea to
generate ssh keys right after a resume, but there is only so much you
can do when faced with a malicious administrator.

Joern