Speculative Execution

With speculative execution, the operations are evaluated, but the final results are not stored in the program registers or data memory until the processor can be certain that these instructions should actually have been executed.

Eliminate write/read Dependency

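A write/read dependency occurs when the outcome of a memory read depends on a recent memory write. The read cannot complete until the written value is available, so keeping the value in a local variable instead of reading it back from memory removes the dependency.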
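As a rough illustration of such a dependency, the sketch below computes a prefix sum in two ways. The function names `prefix_sum_mem` and `prefix_sum_reg` are hypothetical and chosen only for this example. The first version reads back `p[i-1]`, which the previous iteration has just written. The second keeps the running total in a local variable, so no iteration has to read a freshly written memory location. An optimizing compiler may already perform this transformation on its own, so any speedup should be measured rather than assumed.

```c
#include <stddef.h>

/* Hypothetical example: each iteration reads p[i-1], the value that the
   previous iteration just wrote to memory (a write/read dependency). */
void prefix_sum_mem(const long *a, long *p, size_t n)
{
    if (n == 0)
        return;
    p[0] = a[0];
    for (size_t i = 1; i < n; i++)
        p[i] = p[i - 1] + a[i];
}

/* Same result, but the running total lives in a local variable, so the
   loop only writes to p and never reads back what it just stored. */
void prefix_sum_reg(const long *a, long *p, size_t n)
{
    if (n == 0)
        return;
    long running = a[0];
    p[0] = running;
    for (size_t i = 1; i < n; i++) {
        running += a[i];
        p[i] = running;
    }
}
```

Both functions produce the same output array; only the way the previous total reaches the next iteration differs.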

Loop Unrolling

int i;
for (i = 0; i < n-3; i += 4) {  // note the n-3 bound for starting i + 0..3
  sum1 += data[i+0];
  sum2 += data[i+1];
  sum3 += data[i+2];
  sum4 += data[i+3];
}
sum = sum1 + sum2 + sum3 + sum4;
for (; i < n; i++)  // if n%4 != 0, a rolled-up loop handles the final 1..3 elements
  sum += data[i];

In general, we have found that unrolling a loop and accumulating multiple values in parallel is a more reliable way to achieve improved program performance.
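To make the idea concrete, here is a minimal, self-contained sketch that places a plain single-accumulator loop next to a 4-way unrolled loop with four independent accumulators. The function names `sum_simple` and `sum_unrolled4` and the element type `long` are assumptions made for this example. With one accumulator, every addition depends on the previous one; with four, the additions form four independent chains that the processor can overlap. For integers the result is identical; for floating-point data, the changed order of additions can change rounding.

```c
#include <stddef.h>

/* Baseline: one accumulator, so each addition waits on the previous one. */
long sum_simple(const long *data, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += data[i];
    return sum;
}

/* 4-way unrolling with four independent accumulators. */
long sum_unrolled4(const long *data, size_t n)
{
    long sum1 = 0, sum2 = 0, sum3 = 0, sum4 = 0;
    size_t i = 0;

    /* "i + 3 < n" is used instead of "i < n - 3" because n is unsigned
       here and n - 3 would wrap around for n < 3. */
    for (; i + 3 < n; i += 4) {
        sum1 += data[i + 0];
        sum2 += data[i + 1];
        sum3 += data[i + 2];
        sum4 += data[i + 3];
    }

    long sum = sum1 + sum2 + sum3 + sum4;

    /* Rolled-up loop for the remaining 0..3 elements. */
    for (; i < n; i++)
        sum += data[i];

    return sum;
}
```

Whether the unrolled version is actually faster depends on the compiler, optimization flags, and hardware, so it is worth timing both on the target machine.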


In computer science, instruction pipelining is a technique for implementing instruction-level parallelism within a single processor. Pipelining attempts to keep every part of the processor busy with some instruction by dividing incoming instructions into a series of sequential steps (the eponymous “pipeline”) performed by different processor units, with different parts of instructions processed in parallel.

A key feature of pipelining is that it increases the throughput of the system (i.e., the number of customers, or here instructions, served per unit time), but it may also slightly increase the latency (i.e., the time required to service an individual customer).
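The throughput/latency trade-off can be sketched with a small calculation. Everything here is an illustrative assumption: the helper `pipeline_timing`, the `struct timing` type, and the idea that the combinational logic splits evenly across stages, with each stage paying a fixed register delay. The numbers used in the note after the code are hypothetical and do not describe any real processor.

```c
/* Hypothetical timing model: total logic delay is split evenly across
   the stages, and every stage adds one pipeline-register delay. */
struct timing {
    double cycle_ps;        /* clock period in picoseconds */
    double latency_ps;      /* time for one instruction to pass through */
    double throughput_gips; /* billions of instructions per second */
};

struct timing pipeline_timing(double logic_ps, double reg_ps, int stages)
{
    struct timing t;
    t.cycle_ps = logic_ps / stages + reg_ps;
    t.latency_ps = t.cycle_ps * stages;
    /* One instruction completes per cycle: 1000 / cycle_ps gives GIPS. */
    t.throughput_gips = 1000.0 / t.cycle_ps;
    return t;
}
```

With an assumed 300 ps of logic and a 20 ps register delay, `pipeline_timing(300.0, 20.0, 1)` gives a 320 ps cycle, 320 ps latency, and 3.125 GIPS, while `pipeline_timing(300.0, 20.0, 3)` gives a 120 ps cycle, 360 ps latency, and roughly 8.33 GIPS. Throughput rises by more than a factor of two, while latency grows slightly because of the extra register delays.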