Hi everyone.
As promised, here is our quarterly update for Q1 and a look at what’s coming in the next quarter. I’ll start with a topic deep dive, then go into what we’ve done and what we’ll do next. I’m eager to hear feedback on this format.
The Startup Problem
Since we launched we’ve been tackling many issues, the main topic being long startup times. So why was Native Access such a pain to start up, and why did it take so long to improve that?
The core problem lies with the tasks that we were executing. As mentioned in our previous Devtalks, the new Dark-themed Native Access (NA2) and the NTK Daemon are two separate entities working together, which lets us iterate faster by tackling problems in each entity independently. But when it came to startup times, the two were very dependent on each other to go through the following steps:
- Install NA2
- Have NA2 install the NTK Daemon
- Connect NA2 to the user’s Native ID (log in, if the user isn’t already logged in)
- Retrieve all the product details from our Content Management System (CMS)
- Retrieve the logged-in account’s licenses and subscriptions (or known products)
- Activate the user’s products
- Check the status of the user’s products, whether they’re installed, broken, etc.
- Do some filtering to display what’s relevant for the user (i.e. filter out bundle licenses)
All these steps are still necessary today, and initially we executed them in sequential order. This excludes additional tasks, such as setting up the helper tool for system permissions to install products, or asking to install Rosetta 2, just to name a few. Even while we were building the steps above, we were already seeing long startup times. Though we knew we’d want to keep tackling this, we couldn’t hold off on launching the other improvements in NA2 any longer.
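To illustrate why running every step in sequence hurts, here is a minimal sketch comparing sequential and concurrent execution. It is not NA2’s actual code: the step names, the durations, and the assumption that these three steps are independent of each other are all hypothetical.

```python
import asyncio
import time

async def step(name: str, seconds: float) -> str:
    """Stand-in for one startup step; it sleeps instead of doing real work."""
    await asyncio.sleep(seconds)
    return name

# Hypothetical step names, assumed (for illustration only) to be independent.
STEPS = ("product_details", "licenses", "helper_tool_setup")

async def startup_sequential() -> list:
    # Each step waits for the previous one, so the durations add up.
    done = []
    for name in STEPS:
        done.append(await step(name, 0.05))
    return done

async def startup_concurrent() -> list:
    # Independent steps run together; total time is roughly the slowest step.
    return list(await asyncio.gather(*(step(name, 0.05) for name in STEPS)))

def timed(coro_fn):
    """Run a coroutine function and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start

seq_result, seq_time = timed(startup_sequential)
con_result, con_time = timed(startup_concurrent)
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

Steps that truly depend on each other (you can’t fetch licenses before login, for example) would still have to run in order.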
After we launched NA2 with version 2.1.0, we were well aware of the long startup times and of issues within each phase of the startup process. We definitely underestimated the user impact though: our teams were experiencing a maximum of 30 seconds, while users were regularly experiencing up to a few minutes. To give an idea of where we started near launch, we had the following time ranges:
- 0-10s startup times (30%)
- 10-30s startup times (54%)
- 30-60s startup times (11%)
- 60s+ startup times (5%)
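For anyone curious how a breakdown like the one above can be computed, here is a small sketch. The sample durations are made up, and treating a startup of exactly 10 seconds as part of the 0-10s bucket is an assumption; it says nothing about how our metrics pipeline actually works.

```python
from collections import Counter

# Upper bounds (in seconds) of the buckets used in this post.
# The last bucket, "60s+", is open-ended.
BUCKETS = [(10, "0-10s"), (30, "10-30s"), (60, "30-60s")]
LABELS = [label for _, label in BUCKETS] + ["60s+"]

def bucket_for(seconds: float) -> str:
    """Return the label of the bucket a startup duration falls into."""
    for upper, label in BUCKETS:
        if seconds <= upper:  # assumption: boundaries belong to the lower bucket
            return label
    return "60s+"

def distribution(durations) -> dict:
    """Return the rounded percentage of startups per bucket."""
    total = len(durations)
    if total == 0:
        return {label: 0 for label in LABELS}
    counts = Counter(bucket_for(d) for d in durations)
    return {label: round(100 * counts[label] / total) for label in LABELS}

# Made-up sample durations, purely for illustration.
sample = [4.2, 8.9, 12.0, 25.5, 29.9, 31.0, 45.0, 75.0, 9.0, 15.0]
print(distribution(sample))
```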
September 27, 2022
In Q3 2022, we stabilized many parts of NA2. With release 2.6.0 we adjusted the activation system, skipping already activated products. Although a minor improvement, we still expected a slight shift in our metrics, but at the time there was one factor we didn’t consider: overall traffic. As shown, between 20 September and 29 September, we went from:
- 0-10s (21%)
- 10-30s (57%)
- 30-60s (15%)
- 60s+ (7%)
to:
- 0-10s (18%)
- 10-30s (59%)
- 30-60s (14%)
- 60s+ (9%)
The worse startup times were due to an increase in overall traffic, from 14.8k to 21.2k daily active users. But in the meantime, the NA2 team was already busy completely overhauling the way the interface interacts with state changes.
November 21, 2022
Internally, we were excited about the things coming in 3.0.0: changing the entire architecture to respond to backend events reactively as opposed to proactively, thereby not blocking the app in all of the steps mentioned above (mainly post login) and being much more performant. This took us all summer to complete, as we found many opportunities to stabilize and improve other parts of the app along the way.
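To make "reacting to backend events" a bit more concrete, here is a minimal publish/subscribe sketch. Everything in it is hypothetical (the `EventBus` and `ProductListView` classes, the event name, the product names); it only shows the general idea of a view that updates when events arrive instead of blocking until every step has finished.

```python
from collections import defaultdict

class EventBus:
    """Minimal publish/subscribe hub (hypothetical, not NA2's actual code)."""

    def __init__(self) -> None:
        self._handlers = defaultdict(list)

    def subscribe(self, event: str, handler) -> None:
        self._handlers[event].append(handler)

    def publish(self, event: str, payload: dict) -> None:
        for handler in self._handlers[event]:
            handler(payload)

class ProductListView:
    """Stand-in for the UI: it shows the state it has and updates on events."""

    def __init__(self, bus: EventBus) -> None:
        self.products = {}
        bus.subscribe("product_status_changed", self.on_status_changed)

    def on_status_changed(self, payload: dict) -> None:
        self.products[payload["product"]] = payload["status"]

bus = EventBus()
view = ProductListView(bus)

# The backend pushes updates whenever they become available, in any order.
bus.publish("product_status_changed", {"product": "Synth A", "status": "installed"})
bus.publish("product_status_changed", {"product": "Library B", "status": "not installed"})
bus.publish("product_status_changed", {"product": "Synth A", "status": "update available"})
print(view.products)
```

The view never waits on the backend here; it is usable right away and simply reflects the latest event it has seen for each product.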
While we definitely realized the performance upgrades, something was still holding back the expected start time improvements. On this day, we experienced a large influx of users following the annual cyber sales period, and our backend servers were overloaded. The work to bring them back online and stabilize them coincidentally led to improvements in the startup times. Calls were taking less time and our app was reacting to them way more efficiently, leading to the following changes overnight:
- 0-10s (54%, up from 13.6%)
- 10-30s (39%, down from 69%)
- 30-60s (3%, down from 13%)
- 60s+ (2%, down from 3%)
These trends continued. We felt like we were out of the danger zone, but still had work to do to achieve 75% of startups taking 0-10s.
February 20, 2023
In Q4, we collaborated across departments to overhaul the product activation flow and further improve startup time. The problem was that on every single startup we were checking whether each of the user’s products needed activating, which takes a lot of time. We had tackled some of this earlier on NA2’s side by skipping some products that were already activated, but a larger impact needed support from more parts of the company, as well as activating only products with updated statuses instead of checking every product. While activating all products makes sense for the first-time startup, on every startup after that we only care about account or status updates, such as subscriptions running out, new products, or transferred products. But even those we could handle when they happen rather than check on startup, and we could also move activation to the moment a product is installed. All these points culminated in release 3.2.0, which improved startup times significantly:
- 0-10s (80%)
- 10-30s (16%)
- 30-60s (2%)
- 60s+ (2%)
This was beyond our goal of 75% of startups taking 10 seconds or less! We recognize that Daemon installation times can be further improved, but we still expect further improvements as more people use version 3.2.0. As of releasing this post, we’re sitting at 85% of all startup times in the 0-10s time window.
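The activation change described above boils down to a filter: activate everything on the first startup, and afterwards only touch products whose status changed. Here is a rough sketch of that idea; the `Product` fields and the `products_to_activate` function are hypothetical and much simpler than the real activation flow.

```python
from dataclasses import dataclass

@dataclass
class Product:
    name: str
    activated: bool
    # Hypothetical flag: e.g. a subscription ran out or the product was transferred.
    status_changed: bool = False

def products_to_activate(products, first_startup: bool) -> list:
    """Return the names of the products that need an activation check."""
    if first_startup:
        # On the very first startup, every product is checked.
        return [p.name for p in products]
    # Later startups only look at new products or ones with a status update.
    return [p.name for p in products if not p.activated or p.status_changed]

library = [
    Product("Synth A", activated=True),
    Product("Library B", activated=True, status_changed=True),
    Product("Effect C", activated=False),
]
print(products_to_activate(library, first_startup=True))
print(products_to_activate(library, first_startup=False))
```

With a large library where most products are already activated and unchanged, the second call has far less work to do than the first.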
So what now?
We’re extremely excited about the startup time improvements. We have a few more steps that we’re getting to here and there, like reducing the number of times users start up Native Access and need to install dependencies again, and we’re also aware that some users still have bad startup experiences. But we’re beginning to focus on other topics.
We can still make large improvements to startup times through a new initiative: Daemon Stabilization. We recognize that the Daemon is crashing more often than we’d like, affecting processes such as startup and installations. To get a better understanding of why these crashes occur, we’ll work on improving our diagnostic systems, which will help us improve download success rates, app session performance, and the startup experience.
Another thing we’ve decided to bump up in priority is stabilizing the download queue. Our current problem is that NA2 and the NTK Daemon are sharing bits of the functionality, which leads to inconsistencies in our diagnostics tracking, downloads disappearing when closing an NA2 session, as well as creating difficulties in testing and quality assurance. We will move this functionality completely into the Daemon, while also addressing bugs and making minor improvements, such as retrying failed deployments in cases where the network connection was lost. So far, we’ve found several major improvements we can make, and are excited to ship this in Q2.
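As an illustration of what retrying after a lost connection can look like, here is a generic retry-with-backoff sketch. It is not the Daemon’s implementation: the `NetworkLost` exception, the `download_with_retry` helper, and the attempt count and delays are all assumptions made for the example.

```python
import time

class NetworkLost(Exception):
    """Hypothetical error raised when the connection drops mid-download."""

def download_with_retry(download, max_attempts: int = 3, base_delay: float = 0.01,
                        sleep=time.sleep):
    """Call download() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return download()
        except NetworkLost:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            # Wait a little longer after each failure (exponential backoff).
            sleep(base_delay * 2 ** (attempt - 1))

# Simulated download that loses the connection twice before succeeding.
attempts = {"count": 0}

def flaky_download() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise NetworkLost("connection dropped")
    return "download complete"

result = download_with_retry(flaky_download)
print(result, "after", attempts["count"], "attempts")
```

Only the network error is retried here; any other kind of failure would be raised immediately so that it isn’t hidden behind retries.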
But even more exciting is the fact that we finalized UX discussions about uninstall, which means work can get started. For Q2, we intend to give you the ability to uninstall products, starting with content products and progressing through all other types of products.
And lastly, I’m loving the response to our transparency efforts! To make sure you’re in the loop for all the exciting things happening in Native Access, here’s a preview of the current initiatives we’re working on:
Note: The order of topics and priorities is not fully locked in and is subject to change. Everything in progress is in active development. No accurate time frame can be given at this stage.
And of course, a reminder that our team reads all incoming feedback, so please do drop a feature request here if you have one. For any support you might need, please visit our support channel and drop a ticket there. The support team has been helping us tremendously by keeping an eye on emerging concerns, frustrations, and negative trends, and we couldn’t do all that we’re doing without them.
Conclusions
Native Access 2 is fulfilling its promise to be more transparent. This quarter we’ve given you a glimpse into our startup issues, and next quarter we’ll talk more about how we’ve been tackling our shortcomings with deployment success rate.
A few important things: with 3.3.0, we deprecated Rosetta 2 and officially dropped support for macOS 10.14. You should still be able to use Native Access, but if you run into issues we’ll ask you to update your operating system first. Additionally, the messaging around adding a serial is now more complete, so you’ll get clearer feedback if something goes wrong, along with the ability to take action. Please keep an eye on our release notes for bug fixes that might solve some of the concerns you are facing, as we’re hard at work trying to make things better.
Eager to see your feedback and discussions down below. Really enjoyed it last time!