As has become tradition for Ars at Google I/O, we recently sat down with some of the people who make Android to learn about the OS directly from the source. For 2019, the talk was all about Android Q and this year’s big engineering effort, Project Mainline. Mainline’s goal is to enable Google (and sometimes OEMs!) to directly update core parts of the OS without pushing out a whole system update.
If that sounds technical and challenging, well, it is.
This year, running the Ars Android Interview Gauntlet, we have veteran interviewee Dave Burke, VP of engineering for Android. As the head of Team Android, Burke is an encyclopedia of Android knowledge and always manages to come up with insightful answers to my grab bag of esoteric questions. And returning for the second year in a row is Iliyan Malchev, principal engineer at Android, the lead of Project Treble, and all-around Linux integration guru.
But to help up the ante for this latest deep dive, Ars was also joined by Anwar Ghuloum, Android’s senior director of engineering and the lead of Project Mainline. Ghuloum’s insight was especially welcome given this year’s I/O headliner: as “The Next Great Android Update Project,” Mainline was easily the biggest news to come out of the conference.
So, buckle up for a long Android Q(&A) if you will—but first, some background on Mainline.
Project Mainline: A “fundamental shift” in Android OS development
For years, we’ve seen Google continually work to chop Android up into more easily updatable pieces. Early on in Android’s life, the Google apps and core system apps were offloaded to the Android app store, allowing Google to pump out new user-facing features whenever it wanted. Google Play Services then took many developer APIs and offloaded them to the Android app store, allowing Google to pump out developer-facing API updates whenever it wanted. More recently, Android 8.0 brought us Project Treble, which separated the OS from the hardware support, allowing for easier update development.
With Android Q, the big new modularization effort is “Project Mainline.” Along the same lines as Google’s early-days move to put apps in the Play Store, Mainline modularizes several core system components and moves those to the Play Store. Mainline goes deeper into the system than the surface-level apps, though—these are big chunks of system functionality like the media framework and ART, the Android RunTime.
Traditionally, the Play Store has distributed apps only in the form of APK files, but many of the components being modularized in Project Mainline wouldn’t work if packaged up as an APK. Since the APK system was built for system and user-level apps, there are limitations on things like permissions and how early they can start in the boot process. For modularizing these core components, Google came up with something more powerful than an APK: the “APEX” file type. APEX files can have essentially root-level permissions, and they get to start up very early in the boot process, allowing Google (or your OEM) to update many more components. APK files are packages for system- and user-level apps, and APEX files are packages for core system components. This table shows the first batch of them in Android Q:
In the future, we’ll probably see Project Mainline modules grow to encompass more and more of the Android system. For this first Android Q release, though, Google chose to focus on three themes: “Consistency,” “Security,” and “Privacy.” Before our I/O interview, Google provided us with the above table of the Project Mainline components in Android Q, detailing which components are being modularized and what the recommendations are for OEMs. And that brought us to the first question.
What follows is a transcript, with some of the interview lightly edited for clarity. For a fuller perspective, we’ve also included some topical background comments in italics.
Ars: So I have this Project Mainline table, which details which components are recommended or not. How did you go about picking what is and is not mandatory?
Anwar Ghuloum, the head of Project Mainline: Ideally, we’d want everything to be mandatory. The way we worked on these modules was to talk to all our device manufacturers and say, “Hey, we’re doing this, work with us on it.” They upstreamed a bunch of code. They had a bunch of feature requests for things that they were beginning the process of working on, and, for those modules where we could actually meet all those requirements, we made those mandatory. For the modules where there are still gaps, we made them optional for this release, and for the next release they’ll be mandatory. So that gives us time to get to parity, because we don’t want to regress their device experience by pushing these modules. We want to make sure their stuff gets in.
Dave Burke, VP of engineering: I think part of this work is upstreaming with our partners. When I say partners, I’m talking about device makers. They add changes into the devices they build, and we want to get them all upstreamed to the mainline code branch, so we have consistency. It just takes time.
Ghuloum: Yeah, I mean, we’ve done a ton of upstreaming. It’s amazing. For some of these packages, we upstreamed more in the last year than we’ve upstreamed in the previous 10 years.
Burke: Yeah, it’s important.
Ghuloum: What we explained to our teams is that the premise of using a Mainline module is that you will get to release once a month. That you are actively working with the partners, co-developing, planning your roadmap, and stuff like that. People in the team have been pretty compelled by that, and excited about it.
Ars: Oh, is that the plan—a once-a-month release for Mainline modules?
Ghuloum: Well, that’s our train cadence, and that’s driven by our security update schedule, because some of the components are security sensitive. The media component in particular comprises primarily codecs and extractors. One of the reasons that’s a module is that we looked at vulnerabilities over the last year, and nearly 40 percent of patched vulnerabilities in our security updates came from those modules. So, we’re like, “Hey, what if we could just push these out to the entire ecosystem, instead of putting the burden on the OEM to take these, test them, and push them out themselves?”
Burke: The other thing is, we often hear from developers on what we could do to make their lives easier on Android. One of the things that comes up often is fragmentation of slightly different behaviors in different parts of the OS, even within the same manufacturer—the media framework is one they bring up. And so more consistency there is good for the developers, too. It reduces errors and the work they have to do, and it increases the quality of apps, which is good for users.
Ghuloum: I was calling this “bug consistency” yesterday.
Burke: (laughing) Bug consistency! Yeah, that’s true.
Ghuloum: There’s this module called “ANGLE”; it’s basically OpenGL implemented on Vulkan. Right now it’s mandatory for OEMs, but developers can opt in to whether they use it or not. The idea is to lean into the kind of support for Vulkan that’s coming on all these devices. Having a consistent GL implementation (not necessarily a bug-free one, because we never ship bug-free software, nobody ever does), but the thing for game devs that they struggle with is, they’re used to bugs in drivers, but all these different bugs and different drivers are super painful. We can make that much more consistent.
Burke: The other way to think about this is: it’s generally good hygiene. You look at the GPS rollover that happened on April 6, for example. It grounded some airplanes because they couldn’t cope with the clock rolling over. There’s always something that’s going to happen in software, and you want to have the ability to get this updated, especially, like, really low-level stuff, like with Conscrypt, which is our secure library, SSL library, and TLS. That’s updatable, as well. And that’s another area where, when certificates expire, or your cert provider suddenly goes out of business, you can fix that.
Ghuloum: Or, the BoringSSL bugs.
Burke: Or, the BoringSSL bugs, yeah exactly. They’re kind of unsexy but fundamental components in the system.
Iliyan Malchev, Project Treble lead: Years ago, there was a bug in Bionic that was introduced by one of our partners, who had the sine tables wrong. So, trig functions were randomly failing in a range of the curve, in a way that could break games. So, stuff like this is incredibly hard to catch before you ship.
Ghuloum: And the developers had to live with it for the next few years—unless it ever gets patched. I’ve seen that with some of our own first party apps, [they] have to work around bugs throughout the ecosystem. It’s just spaghetti code.
Ars: So, was there a test update that went out to Beta Q users?
Ghuloum: Yes—actually, as of Beta 2, we started pushing updates. There are threads on Reddit about this.
Ars: Right, OK.
Ghuloum: Their devices were rebooting and, yes, we were pushing updates, testing updates, and we were rebooting people’s devices. We’re only doing this in the Beta, actually. During production, when Q ships, all reboots will just be organic reboots from the user. We looked at the numbers, and it looks like over a couple of weeks, that gets us to a reasonable saturation level of people taking the update. Plus, we have monthly security updates, so we’re going to be rebooting at least once a month, anyway. So, you’ll just take it. We don’t want to put UX in the user’s face. If there’s an update waiting, we just wait for them to reboot.
Ars: OK. Do you think that’s what the final version is going to look like—kind of a quiet background thing that won’t be very visible?