Memory controller transactions
To help with IPC, the memory controller should support transactions. Since all CPUs must use the same memory controller in order to use the same RAM, this is analogous to a DBS supporting transactions that multiple clients may simultaneously connect to. By supporting it in hardware at the point where memory updating is necessarily done one-at-atime anyway, you could eliminate the convolutions and dilemmas of trying to eliminate race conditions in software.
Actually, I don't know how transactions work very much. But the memory controller could implement a lock so that the first CPU to request a transaction can have exclusive use until the transaction is over. I hear that transactions normally use a queue, but that doesn't seem necessary in this case. In the same amount of time it would take a cpu to queue a transaction, the memory controller could already have committed the last one to memory. So it's really just about a lock.
Perhaps while one core/cpu is completing a "transaction", another can be modifying memory in a different place, that's defined to not be in the area of that transaction. For example, the opcode for a "transaction" could include a number of bytes of contiguous memory to be considered as part of that transaction.