Links

Some background reading.

Laws of Computing

Moore's Law - (Wikipedia)

A key point not normally mentioned is that "clock reach", which depends on wire resistance, does not scale well. So although the number of transistors keeps going up, design methodology needs to change, i.e. designs do not scale automatically. Below 45nm, design methodology needs to handle "high sigma" silicon, where manufacturing variability is high and hence timing is inconsistent.

Devices of a few nm can be built, but not reliably. Future nano-scale chips will need to be built with fault-tolerance and redundancy (currently only seen in FPGAs and memories).

Amdahl's Law - (Wikipedia)

All systems have a bottleneck; if you fix one bottleneck, another one will appear (see The Goal).
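
As a reminder of the arithmetic behind Amdahl's Law, the standard formula is speedup = 1 / ((1 - p) + p / s), where p is the fraction of run time that benefits from an improvement and s is the speedup of that fraction. The following is a minimal sketch in Python; the function name and the example figures are illustrative assumptions, not taken from the linked article.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of run time is made s times faster."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if s <= 0.0:
        raise ValueError("s must be positive")
    # The untouched fraction (1 - p) still takes its full time.
    return 1.0 / ((1.0 - p) + p / s)


if __name__ == "__main__":
    # Illustrative numbers only: 90% of the work speeds up, 10% does not.
    for s in (2, 10, 100, float("inf")):
        print(f"s = {s}: overall speedup = {amdahl_speedup(0.9, s):.2f}")
```

However large s becomes, the result stays below 1 / (1 - p): the part that was not improved becomes the next bottleneck.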

Legacy Effects

Von Neumann syndrome - (Wikipedia)

It is useful to remember that current computing architectures have a long history. A particular driving force is that memory chips are manufactured by different companies, and on different processes, from CPUs. The economics of shrinking silicon and a shift to IP-based design flows may change that.

Computer Architectures

Harvard Architecture - (Wikipedia)

Processor-In-Memory - (Wikipedia)

PIM has not been successful in the past because of issues with processing and yield (see Von Neumann syndrome above) and the lack of good programming tools. ParC attempts to address the latter problem.

Reconfigurable Computing - (Wikipedia)

A premise behind believing in the future success of PIM and Reconfigurable Computing (using FPGAs) is that computer performance is tied to the physical distance that data is moved during processing. This ties into Amdahl's law: if you look at a computing operation as fetch-calculate-propagate, then the fetch and propagate times are bounded below by the speed of light and can dominate the calculate phase. For example, in a desktop PC there is usually at least 5cm between the CPU and the memory. Light covers about 30cm per ns in a vacuum, and signals on a circuit board travel at roughly half that speed, so a request and its reply need at least 0.3ns, and more typically around 0.7ns, for wire travel alone. That is already comparable to a CPU cycle time of well under 1ns, and real memory access latencies are many times larger. The fetch/propagate times therefore dominate, and a faster processor doesn't help much if you need more performance.
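
The speed-of-light bound in the paragraph above can be checked with a short calculation. This is a rough sketch, not a model of any particular machine: the function names are hypothetical, and the 4 GHz clock and the velocity factor of 0.5 (a common rule of thumb for circuit-board traces) are assumptions chosen for illustration.

```python
C_CM_PER_NS = 29.9792458  # speed of light in vacuum, in cm per nanosecond


def propagation_delay_ns(distance_cm: float, velocity_factor: float = 1.0) -> float:
    """One-way signal delay over distance_cm.

    velocity_factor is the signal speed as a fraction of c (1.0 = vacuum).
    """
    if distance_cm < 0:
        raise ValueError("distance_cm must be non-negative")
    if not 0.0 < velocity_factor <= 1.0:
        raise ValueError("velocity_factor must be in (0, 1]")
    return distance_cm / (C_CM_PER_NS * velocity_factor)


def round_trip_cycles(distance_cm: float, clock_ghz: float,
                      velocity_factor: float = 1.0) -> float:
    """CPU cycles that elapse while a request and its reply are in flight."""
    cycle_ns = 1.0 / clock_ghz  # e.g. 4 GHz gives a 0.25 ns cycle
    return 2.0 * propagation_delay_ns(distance_cm, velocity_factor) / cycle_ns


if __name__ == "__main__":
    distance_cm = 5.0  # CPU-to-memory distance quoted in the text
    clock_ghz = 4.0    # assumed clock rate
    for vf in (1.0, 0.5):  # vacuum, then an assumed board-trace speed
        one_way = propagation_delay_ns(distance_cm, vf)
        cycles = round_trip_cycles(distance_cm, clock_ghz, vf)
        print(f"velocity factor {vf}: {one_way:.3f} ns one way, "
              f"{cycles:.2f} cycles round trip")
```

This counts wire flight time only. Memory access time, bus protocol overhead and buffering all add to it, which is why shortening the distance matters more than raising the clock rate.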

Ideally, PIM/Reconfigurable computing uses long data links only where latency is not an issue. The human brain is probably PIM-like in its implementation; see artificial neural network.

New advances in 3D-IC are likely to produce systems where processors are stacked with memory in a PIM-like manner.