Speculative Execution
With speculative execution, the operations are evaluated, but the final results are not stored in the program registers or data memory until the processor can be certain that these instructions should actually have been executed.
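Eliminate Write/Read Dependency
A write/read dependency occurs when the outcome of a memory read depends on a recent memory write. The read cannot complete until the written value is available, so keeping an intermediate value in a local variable instead of in memory removes the dependency.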
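A minimal sketch of the difference, in C. The function names sum_via_memory and sum_via_local are hypothetical, and the sketch assumes dest does not point into the array a. In the first version every iteration reads *dest right after the previous iteration wrote it; in the second the running total lives in a local variable and memory is written only once. Whether a given compiler already performs this transformation depends on what it can prove about aliasing.

```c
#include <stddef.h>

/* Running total kept in memory: each iteration loads *dest, which the
   previous iteration has just stored, so the load depends on the store. */
void sum_via_memory(const long *a, size_t n, long *dest)
{
    *dest = 0;
    for (size_t i = 0; i < n; i++)
        *dest = *dest + a[i];
}

/* Same result (assuming dest does not alias a), but the running total
   is a local variable, so the loop has no store-to-load dependency. */
void sum_via_local(const long *a, size_t n, long *dest)
{
    long acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += a[i];
    *dest = acc;   /* single write at the end */
}
```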
Loop Unrolling
In general, we have found that unrolling a loop and accumulating multiple values in parallel is a more reliable way to achieve improved program performance.
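As an illustration, here is a sketch in C of a sum loop unrolled by two with two independent accumulators. The name sum_unrolled2 is hypothetical. Because acc0 and acc1 do not depend on each other, the two additions in each iteration can proceed in parallel on a processor with enough functional units. For integers the result matches the simple loop; for floating-point data the changed order of additions can alter rounding.

```c
#include <stddef.h>

/* Sum an array, processing two elements per iteration into two
   independent accumulators. */
long sum_unrolled2(const long *a, size_t n)
{
    long acc0 = 0, acc1 = 0;
    size_t i = 0;

    /* Main loop: elements i and i+1 go to separate accumulators. */
    for (; i + 1 < n; i += 2) {
        acc0 += a[i];
        acc1 += a[i + 1];
    }

    /* Handle the leftover element when n is odd. */
    for (; i < n; i++)
        acc0 += a[i];

    return acc0 + acc1;
}
```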
Pipelining
In computer science, instruction pipelining is a technique for implementing instruction-level parallelism within a single processor. Pipelining attempts to keep every part of the processor busy with some instruction by dividing incoming instructions into a series of sequential steps (the eponymous “pipeline”) performed by different processor units, so that different parts of several instructions are processed in parallel.
A key feature of pipelining is that it increases the throughput of the system (i.e., the number of instructions completed per unit time), but it may also slightly increase the latency (i.e., the time a single instruction needs to pass through all stages).
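The trade-off can be made concrete with a small timing model, sketched here in C. Everything in it is an assumption for illustration: the names pipe_timing and compute_timing, the idea that the combinational logic splits evenly across stages, and a fixed delay for the register at the end of each stage.

```c
#include <assert.h>

/* Hypothetical timing model; all times are in picoseconds. */
struct pipe_timing {
    long cycle_ps;    /* clock period: one stage of logic plus one register */
    long latency_ps;  /* time for one instruction to traverse all stages */
};

/* Assumes logic_ps divides evenly among the stages and that every
   stage ends in a register with delay reg_ps. */
struct pipe_timing compute_timing(long logic_ps, long reg_ps, long stages)
{
    assert(stages > 0);
    struct pipe_timing t;
    t.cycle_ps = logic_ps / stages + reg_ps;
    t.latency_ps = t.cycle_ps * stages;
    return t;
}
```

With 300 ps of logic and a 20 ps register delay (made-up figures), one stage gives a 320 ps cycle and 320 ps latency. Three stages give a 120 ps cycle, so one instruction can complete every 120 ps instead of every 320 ps, while the latency rises to 360 ps because each instruction now passes through three registers.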