A common problem on modern multiprocessor architectures with per-core caches is called false sharing. It occurs when threads running on different processors modify separate variables that happen to reside on the same cache line.
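Because caches are kept coherent at cache-line granularity, each write invalidates the other processors' copies of that line, even though they only use different data within it, so the line gets transferred back and forth again and again. Effectively, the threads make each other wait by inducing cache misses, although they never touch the same variable. See also: How and when to align to cache line size?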
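
To make this concrete, here is a minimal C++ sketch of the usual remedy: give each independently written variable its own cache line. The names `kCacheLine`, `PackedCounters`, `PaddedCounters` and `run` are made up for illustration, and the 64-byte line size is an assumption (common on x86-64, not universal). C++17 also defines `std::hardware_destructive_interference_size` in `<new>` for this purpose, but compiler support varies.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>

// Assumed cache line size; check your target hardware.
constexpr std::size_t kCacheLine = 64;

// Both counters will most likely share one cache line, so two threads
// writing a and b separately can still slow each other down.
struct PackedCounters {
    std::atomic<long> a{0};
    std::atomic<long> b{0};
};

// alignas places each counter at the start of its own kCacheLine-sized slot.
struct PaddedCounters {
    alignas(kCacheLine) std::atomic<long> a{0};
    alignas(kCacheLine) std::atomic<long> b{0};
};

static_assert(sizeof(PaddedCounters) >= 2 * kCacheLine,
              "each counter should occupy its own cache line");

// One thread increments c.a, the other increments c.b; they never touch
// the same variable. Returns the sum of both counters after joining.
template <typename Counters>
long run(Counters& c, long iterations) {
    std::thread t1([&c, iterations] {
        for (long i = 0; i < iterations; ++i)
            c.a.fetch_add(1, std::memory_order_relaxed);
    });
    std::thread t2([&c, iterations] {
        for (long i = 0; i < iterations; ++i)
            c.b.fetch_add(1, std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
    return c.a.load() + c.b.load();
}
```

Both layouts produce the same result. The difference is only in timing: with a large iteration count, `run` on a `PackedCounters` is typically slower than on a `PaddedCounters`, but how much depends on the hardware, so measure it on your own machine.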