← Back to context

Comment by saagarjha

5 months ago

Apple has been working on this for years. It's not like they started thinking about memory tagging when Daniel decided to turn it on in GrapheneOS.

GrapheneOS made our own integration of MTE for hardened_malloc and has done significant work on it. It wasn't simply something we turned on. ARM designed and built the feature which was made available in Cortex cores. Google's Tensor uses standard Cortex cores so unlike Qualcomm they didn't need to make their own implementation. Google integrated it into Android and did some work to make it available on Pixels along with fixing many bugs it uncovered, although definitely not all of them. We had to fix many of the issues. Apple had to make their own hardware implementation because they have their own cores, which Qualcomm finally got done too.

Pixels are not the only Android devices with MTE anymore and haven't been for a while. We've tried it on a Samsung tablet which we would have liked to be able to support if Samsung allowed it and did a better job with updates.

GrapheneOS is not a 1 person project and not a hobby project. I wasn't the one to implement MTE for hardened_malloc and have not done most of the work on it. The work was primarily done by Dmitry Muhomor who is the lead developer of GrapheneOS and does much more development work on the OS than I do. That has been the case for years. GrapheneOS is not my personal project.

We've done a large amount of work on it including getting bugs fixed in Linux, AOSP and many third party apps. Our users are doing very broad testing of Android apps with MTE and reporting issues to developers. There's a specific crash reporting system we integrated for it to help users provide usable information to app developers. The hard part is getting apps to deal with their memory corruption bugs and eventually Google is going to need to push for that by enabling heap MTE by default at a new target API level. Ideally stack allocation MTE would also be used but it has a much higher cost than heap MTE which Apple and Google are unlikely to want to introduce for production use.

Android apps were historically largely written in Java which means they have far fewer memory corruption bugs than desktop software and MTE is far easier to deploy than it otherwise would be. Still, there are a lot of native libraries and certain kinds of apps such as AAA games with far more native code have much bigger issues with MTE.

  • None of this is wrong but none of this really has any impact on what Apple decided to do. In fact Apple very specifically chose not to go in this direction as they describe in their blog post.

    • The side channel fixes and new MTE instruction features are not specific to Apple. Apple's blog post has some significant misleading claims and omissions. It's marketing material, not a true technical post without massive bias. It's aimed at putting down the existing deployments of MTE, hyping up what they've done and even downplaying the factually widespread exploits of Apple devices which are proven to be happening. If they're not aware of how widespread the exploits of their devices are including by low level law enforcement with widely available tools, that's quite strange.

      10 replies →

  • Is MTE on GrapheneOS restricted to some (newest?) Pixel models? Or does it work with all models that are currently supported by GrapheneOS itself?

    • It's available since October 2023 when it launched on the Pixel 8. We integrated it into hardened_malloc that month and deployed it in production. We've been working on further research and improvements based on MTE since then.

      GrapheneOS always uses it for the kernel, all of the base OS processes including apps with a couple exceptions, user installed apps opting into it and user installed apps solely written in Java/Kotlin which are very common on Android. For other user installed apps, there's a toggle for users to opt-in and most apps work with it already. For apps not known to work with it, there's a user-facing system for MTE crash reports and users can make an exception. Users can't disable it for base OS apps or apps which should work due to opting in or being pure Java/Kotlin.

      Apple uses it for the kernel and parts of the base OS. They require opt-in by app developers and discourage doing it.

      GrapheneOS is working on improvements to the kernel integration, Chromium PartitionAlloc integration and other aspects of it. We'll enable enforcement of tags for untagged memory once that's available, but we're also expanding the tagging. As an example, fully enabling stack allocation tagging has a more than acceptable performance cost for GrapheneOS but not Apple or Google. That's something we've been actively testing and will be deploying.

I didn't mean to imply Apple (and Google) hadn't been spearheading multi-year efforts to ship this in collaboration with Arm, I regret a little that it came across that way. Just that it would be nice to see production use of it acknowledged even just as a passing comment.

As an outsider I am quite ignorant to what security developments these companies are considering and when the trade-offs are perhaps too compromising for them to make it to production. So I can't appreciate the scale of what Apple had to do to reach this stage, whereas with GrapheneOS I know they favour privacy/security on balance. I use that as a weak signal to gauge how committed Apple/Google/Microsoft are to realising those kinds of goals too.

  • ARM largely built and shipped it on their own. Cortex cores were the first real world implementation. Pushing ARM to care about it as a security feature instead of only a bug finding feature is something Apple and Google are probably responsible for doing. Pixels are not the only Android devices making MTE but were the first to take advantage of the CPU support by actually setting it up and making it available for use. There are other Android devices doing that now too.

    Qualcomm has to make their own implementation which has significantly delayed widespread availability. Exynos and MediaTek have it though.

  • Personally I had no idea anyone had shipped this. I knew that MTE existed, though I don’t think I knew about EMTE.

    Nice to hear it’s already in use in some forms.

    And of course it seems pretty obvious that if this is in the new iPhones it’s going to be in the M5 or M6 chips.

    • ARM shipped it as a standard feature of Cortex cores significantly after it was added as an ISA extension. MediaTek and Exynos provide it and Snapdragon is approaching shipping an implementation.

      Google set it up for usage on Pixels, and then later Samsung and others did too. Pixel 8 was the first device where it was actually usable and production quality. GrapheneOS began using it in production nearly immediately after it launched on the Pixel 8.

So Apple did research and Daniel just “turned it on”?! I am not talking about Hardware part even then you're biased and dismissive of other's effort.

  • Shipping MIE (or even MTE) is a many-year effort that requires several parties. I appreciate that Daniel and the GrapheneOS team have been working on making sure the allocator is MTE aware, as well as (I assume) updating Android code to work under MTE. However, to actually ship this, you need someone to design the feature itself, then threat model it, release hardware for it, plumb it through the build system and make sure the OS is aware of it, and then there's a bunch of ongoing work that needs to be done so that it can be released. Much of this work was done by Google and Arm, not Daniel, involving dozens if not hundreds of engineers.

    Daniel's position on MTE for a while has been that Google is dragging their feet in turning it on, but he fails to understand that there is more to it than just flipping a switch that he does in his OS. To actually productionize it requires a huge amount of effort that Apple put in here and Daniel, as talented as he is, really can't do. We know this because Google was not able to do it even though they wanted to. (For the avoidance of doubt: Google does want to turn on MTE, they're not just dawdling "just because". The current MTE implementation is not good enough for them.)

  • It certainly isn't something you can just turn on. I don't know how hardened_malloc works, but one problem is that C malloc() doesn't know the type of memory it's allocating, which is naturally an issue when you need to… allocate typed memory.

    You can fix this insofar as you control the compiler and calls to malloc(), which you don't, because third party code may have wrappers around it.

    • MTE is not about typed memory. It's for detecting invalid memory accesses outside of an object or outside of the lifetime of the object in general. hardened_malloc is the main place GrapheneOS implements MTE for userspace. In the kernel, it's implemented in various allocators and in Chromium in PartitionAlloc. The kernel and PartitionAlloc allocators have typed allocator designed unlike malloc. It's still possible to do partitioning for malloc via size classes and call locations.

      1 reply →