
Linux does not work on systems whose caches are not coherent between CPUs (some incoherence with I/O devices is allowed). No cache-shootdown IPI of that kind is required[1].

ARM and x86 can both delay stores behind later loads. The "timeliness" with which a store becomes visible is almost never specified exactly by any ISA or implementation (maybe by some real-time CPUs, but Linux does not depend on that). Even so, a memory operation will never read "stale" data beyond what a fairly strict memory consistency specification allows.

ARM does have somewhat weaker ordering than x86, but this is all about how operations within a single CPU/thread behave. An ARM CPU can perform two loads out of order, and perform two stores out of order (with respect to what other CPUs can observe). Use barriers to order those and you have ~the same semantics as x86, and those barriers don't need to "reach out" on the interconnect or to the caches of other CPUs. Once the data leaves your store queue, on both x86 and ARM, all other CPUs will see the result, because for it to be accepted into a coherent cache, all copies in other caches need to be invalidated first.
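To make the "use barriers to order those" point concrete, here is a minimal message-passing sketch in portable C11 with pthreads. It is an illustration, not kernel code: the names `payload`, `ready`, `producer`, `consumer`, and `run_message_passing` are made up for this example, and it uses C11 release/acquire atomics rather than the kernel's own `smp_store_release()` / `smp_load_acquire()` helpers. The release store keeps the payload write ahead of the flag write, and the acquire load keeps the payload read behind the flag read. Nothing in it flushes or pokes another CPU's cache; coherence takes care of visibility once the stores drain.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical names, chosen only for this sketch. */
static int payload;       /* plain data, written before the flag */
static atomic_int ready;  /* flag that publishes the payload */

static void *producer(void *arg)
{
    (void)arg;
    payload = 42;
    /* Release: the payload store may not become visible after the flag store. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static void *consumer(void *arg)
{
    int *out = arg;
    /* Acquire: the payload load may not be satisfied before the flag load. */
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        ;  /* spin until the flag is observed */
    *out = payload;
    return NULL;
}

/* Runs one producer and one consumer; returns the payload value the
 * consumer observed, or -1 if a thread could not be created. */
int run_message_passing(void)
{
    pthread_t p, c;
    int seen = 0;

    payload = 0;
    atomic_store(&ready, 0);

    if (pthread_create(&c, NULL, consumer, &seen) != 0)
        return -1;
    if (pthread_create(&p, NULL, producer, NULL) != 0) {
        /* Unblock the consumer so it can be joined. */
        atomic_store(&ready, 1);
        pthread_join(c, NULL);
        return -1;
    }

    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return seen;
}
```

With the release/acquire pair in place, the consumer can only ever see 42 once it has seen the flag. On x86 these orderings typically cost nothing beyond plain loads and stores; on ARM the compiler typically emits ordered load/store instructions or barriers, which act locally on the issuing CPU.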

[1] Linux does work on some CPUs where the instruction caches and/or the instruction fetch/execution pipeline are not coherent with data operations, so on some CPUs code modification may need to send IPIs to other CPUs to flush their instruction caches or pipelines. The above is purely about data memory operations and data caches.


