Java high concurrency: volatile implementation principle

  • 2020-06-15 09:06:22
  • OfStack


Abstract: Both synchronized and volatile play important roles in multithreaded concurrent programming. volatile is a lightweight synchronized that guarantees the visibility of shared variables in multiprocessor development. Visibility means that when one thread modifies a shared variable, other threads can read the modified value. In some cases it has lower overhead than synchronized.

1. Definition

The Java programming language allows threads to access shared variables. To ensure that shared variables are updated accurately and consistently, a thread would ordinarily have to acquire an exclusive lock before using them. The Java language also provides volatile, which is more convenient than locking in some cases: if a field is declared volatile, the Java thread memory model ensures that all threads see a consistent value for that variable.
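The visibility guarantee described above can be demonstrated with a minimal sketch (class and field names here are illustrative, not from the article): a reader thread spins on a flag until a write by the main thread becomes visible. Without volatile, the reader might cache the stale value and spin forever; with volatile, the write is guaranteed to be seen.

```java
public class VolatileVisibility {
    // volatile guarantees that a write by one thread becomes
    // visible to all other threads that subsequently read the field
    private static volatile boolean ready = false;

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (!ready) {
                // spin until the volatile write becomes visible
            }
            System.out.println("reader saw ready = true");
        });
        reader.start();
        Thread.sleep(100);   // give the reader time to start spinning
        ready = true;        // volatile write: visible to the reader thread
        reader.join();       // terminates because the reader observes the write
    }
}
```

Note that volatile only guarantees visibility here, not atomicity; compound operations such as `count++` still need synchronized or an atomic class.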

2. Implementation principle of volatile

So how does volatile guarantee visibility? On an x86 processor, we can use a tool to dump the assembly instructions generated by the JIT compiler and see what the CPU does when writing to a volatile variable.

Java code: instance = new Singleton(); // instance is a volatile variable

Assembly code: 0x01a3de1d: movb $0x0,0x1104800(%esi); 0x01a3de24: lock addl $0x0,(%esp);
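The `instance = new Singleton()` line above is the classic double-checked locking idiom, which is where the volatile write shown in the assembly typically comes from. A sketch of that pattern (the class body is assumed, not given in the article): volatile prevents the JIT and CPU from reordering the object's construction with the assignment to `instance`, so another thread can never observe a half-constructed object.

```java
public class Singleton {
    // volatile forbids reordering the constructor's writes with the
    // assignment below, and makes the assignment visible to all threads
    private static volatile Singleton instance;

    private Singleton() { }

    public static Singleton getInstance() {
        if (instance == null) {                  // first check, without locking
            synchronized (Singleton.class) {
                if (instance == null) {          // second check, under the lock
                    instance = new Singleton();  // the volatile write from the listing above
                }
            }
        }
        return instance;
    }
}
```

Without volatile on `instance`, this idiom is broken under the Java memory model: a thread could see a non-null reference to an object whose fields are not yet initialized.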

The IA-32 Architecture Software Developer's Manual shows that an instruction with the lock prefix causes two things on multiprocessor systems:

Data in the current processor's cache line is written back to system memory.

This write-back operation invalidates the data cached at that memory address in other CPUs.

To improve processing speed, the processor does not communicate with memory directly. Instead, it first reads data from system memory into its internal cache (L1, L2, or others) before operating on it, and after the operation it is not known when the result will be written back to memory. If a write is performed on a variable declared volatile, the JVM sends the processor a Lock-prefixed instruction, which writes the cache line containing the variable's data back to system memory. But even after this write-back, the values cached by other processors may still be stale, and computations on them would go wrong. So, to keep every processor's cache consistent on a multiprocessor system, a cache coherence protocol is implemented: each processor checks whether its cached values are stale by snooping the data propagated on the bus. When a processor finds that the memory address corresponding to one of its cache lines has been modified, it sets that cache line to the invalid state; when the processor next wants to operate on that data, it is forced to re-read it from system memory into its cache.

The Lock-prefixed instruction causes the processor's cache to be written back to memory. It asserts the processor's LOCK# signal for the duration of the instruction. In a multiprocessor environment, the LOCK# signal ensures that the processor has exclusive use of any shared memory while the signal is asserted (because it locks the bus, other CPUs cannot access the bus, which means they cannot access system memory). In recent processors, however, the LOCK# signal generally does not lock the bus but locks the cache, because locking the bus is expensive. The effect of a lock operation on the processor cache is detailed in Section 8.1.4 of the manual: for Intel486 and Pentium processors, the LOCK# signal is always asserted on the bus during a lock operation; but on P6 and more recent processors, if the accessed memory region is already cached inside the processor, the LOCK# signal is not asserted. Instead, the processor locks the cache line holding that memory region and writes it back to memory, relying on the cache coherency mechanism, known as "cache locking," to ensure the modification is atomic. This mechanism prevents a memory region cached by two or more processors from being modified simultaneously.

One processor's cache being written back to memory invalidates the caches of the other processors. IA-32 and Intel 64 processors use the MESI (Modified, Exclusive, Shared, Invalid) cache coherence protocol to maintain consistency between their internal caches and the caches of other processors. On multi-core processor systems, IA-32 and Intel 64 processors can snoop other processors' accesses to system memory and to their internal caches. They use this snooping technique to keep their internal caches, system memory, and the data cached by other processors coherent on the bus. For example, on Pentium and P6 family processors, if one processor snoops that another processor intends to write to a memory address it currently holds in the Shared state, the snooping processor invalidates its cache line and forces a cache line fill the next time it accesses the same memory address.

Thank you for reading. I hope this helps, and thank you for your support of this site!

