In recent years, a lot of new features have appeared on instagram.com: storytelling tools, filters, creative tools, notifications, direct messages, and more. However, as the project grew, all of this had one sad side effect: instagram.com performance began to decline. Over the past year, the Instagram development team has made continuous efforts to fix this. As a result, the total loading time of the Instagram feed page decreased by almost 50%.
Today we are publishing a translation of the first article in a series about how instagram.com was made faster.
About optimizing the performance of web projects
Performance improvement over the past year (Instagram feed, Display Done metric, ms.)
One of the most important approaches to improving the performance of web applications is to correctly prioritize the loading and processing of resources and to reduce browser idle time during page load. In our case, many of these optimizations proved more effective than reducing code size. As a rule, we had no complaints about the size of the code; it was compact enough. It began to bother us only after many small improvements had been made to the project (we also plan to write about code size optimization). In addition, prioritization improvements had less impact on the development process: they required fewer code changes and less refactoring. So we initially concentrated our efforts on this area, starting with preloading resources.
Preloading images, JavaScript code and query data, and where to be careful
The general principle of our optimizations was to tell the browser as early as possible which resources are needed to load the page. As the project's developers, we often knew in advance exactly what would be needed. The browser, however, could not know this until part of the page had been loaded and processed. The resources in question were mostly those loaded dynamically by JavaScript (for example, other scripts, images, and data fetched by XHR requests). The browser cannot discover these dependent resources until it parses and executes some JavaScript code.
Instead of waiting for the browser to find these resources on its own, we could give it a hint so that it could start downloading them immediately. We did this with preload resource hints, that is, <link> elements with rel="preload". It looks something like this:
<link rel="preload" href="my-js-file.js" as="script" type="text/javascript" />
We use similar hints for two types of resources on the critical page loading path: dynamically loaded JavaScript code and data fetched by GraphQL XHR requests. Dynamically loaded scripts are scripts that are loaded using constructs of the form
import('...')
for specific client routes. We maintain a mapping between server entry points and client route scripts. So when the server receives a request for a page, we know which client route scripts will need to be downloaded, and we can add the corresponding hints while generating the page's HTML.
For example, for the FeedPage entry point we know that the client router will eventually request FeedPageContainer.js. So we can add the following to the page code:
<link rel="preload" href="/static/FeedPageContainer.js" as="script" type="text/javascript" />
Similarly, if we know that a GraphQL query will be executed for a particular page's entry point, we can preload that query's response to speed it up. This is especially important because such GraphQL queries sometimes take a long time, and the page cannot render until the results are returned. For that reason, we want the server to start producing the response as early as possible.
<link rel="preload" href="/graphql/query?id=12345" as="fetch" type="application/json" />
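The mapping from entry points to preload hints described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `ROUTE_RESOURCES` table, the `preloadHintsFor` and `escapeAttr` helpers, and the shape of the data are hypothetical and not taken from the Instagram codebase; only the `FeedPage` entry point, the bundle path and the query URL come from the examples in this article.

```typescript
// Hypothetical description of what one server entry point is known to need.
interface RouteResources {
  scripts: string[];          // client route bundles loaded via import('...')
  graphqlQueryIds: string[];  // GraphQL queries the route will execute
}

// Hypothetical mapping of server entry points to client route resources.
const ROUTE_RESOURCES: Record<string, RouteResources> = {
  FeedPage: {
    scripts: ["/static/FeedPageContainer.js"],
    graphqlQueryIds: ["12345"],
  },
};

// Escape characters that could break out of a double-quoted HTML attribute.
function escapeAttr(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
}

// Returns the <link rel="preload"> tags to emit for a given entry point.
// Unknown entry points simply get no hints.
function preloadHintsFor(entryPoint: string): string[] {
  const resources = ROUTE_RESOURCES[entryPoint];
  if (!resources) {
    return [];
  }
  const scriptHints = resources.scripts.map(
    (href) =>
      `<link rel="preload" href="${escapeAttr(href)}" as="script" type="text/javascript" />`
  );
  const queryHints = resources.graphqlQueryIds.map(
    (id) =>
      `<link rel="preload" href="/graphql/query?id=${encodeURIComponent(id)}" as="fetch" type="application/json" />`
  );
  return scriptHints.concat(queryHints);
}
```

Calling `preloadHintsFor("FeedPage")` yields the two tags shown in the examples above, ready to be written into the page's head while the HTML is generated.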
The difference in page loading behavior is especially noticeable on slow connections. When simulating a fast 3G connection without preloading (the first waterfall graph below), we can see that loading FeedPageContainer.js and executing its associated GraphQL query start only after Consumer.js has loaded. With preloading, loading FeedPageContainer.js and executing the GraphQL query can begin as soon as the page HTML is available. This also reduces the time needed to load any secondary scripts that are loaded lazily. Here, FeedSidebarContainer.js and ActivityFeedBox.js (which depend on FeedPageContainer.js) begin to load almost immediately after Consumer.js is processed.
Preload not used
Preload used
Benefits of prioritizing preload
Besides starting resource loads sooner, preload has another advantage: it raises the network priority of asynchronously loaded scripts. This matters when such scripts are on the critical page loading path, since by default they load with low priority. As a result, XHR requests and images in the area of the page visible to the user get a higher priority than materials outside the viewing area, and higher than these scripts. This can lead to situations where the critical scripts needed to render the page are either blocked or forced to share bandwidth with other resources. If you're interested, here's a detailed account of Chrome's resource priorities. Thoughtful use of the preload mechanism (we'll talk more about this below) gives the developer a certain level of control over how the browser prioritizes the initial page load. This is especially useful when the developer knows which resources are important for displaying the page correctly.
Preload Prioritization Issues
The problem with preloading is precisely that it gives the developer additional leverage over resource loading priority, and with it more responsibility for getting prioritization right. For example, when testing the site in regions with very slow mobile and WiFi networks and a high percentage of packet loss, we noticed that requests made for <link rel="preload" as="script"> tags got a higher priority than requests made for the <script /> tags of the JavaScript bundles on the critical page rendering path. This increased the overall page load time.
The source of this problem was how we placed the preload tags on our pages. Namely, we added preload hints only for the bundles of the current page that we were going to load asynchronously with the client router.
<link rel="preload" href="SomeConsumerRoute.js" as="script" />
<link rel="preload" href="..." as="script" />
...
<script src="Common.js" type="text/javascript"></script>
<script src="Consumer.js" type="text/javascript"></script>
For example, on the logout page we loaded SomeConsumerRoute.js before Common.js and Consumer.js, and since preloaded resources are downloaded with a higher priority but are not parsed, this blocked the parsing of Common.js and Consumer.js. The Chrome Data Saver development team found a similar preload problem and described their solution. In their case, it was decided to always place the preload tags for asynchronous resources after the <script /> tags of the resources that use them. We decided instead to preload all the scripts and to place the preload tags in the order in which the scripts will be needed. This lets us start preloading all of the page's script resources as early as possible, including synchronously loaded scripts whose tags cannot be added to the HTML until specific server data is placed on the page. It also lets us control the order in which scripts load.
Here is the markup that preloads all JavaScript bundles.
<link rel="preload" href="Common.js" as="script" />
<link rel="preload" href="Consumer.js" as="script" />
<link rel="preload" href="SomeConsumerRoute.js" as="script" />
...
<script src="Common.js" type="text/javascript"></script>
<script src="Consumer.js" type="text/javascript"></script>
<script src="SomeConsumerRoute.js" type="text/javascript" async></script>
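One way to keep the preload tags and the script tags in the same order is to generate both from a single ordered list of bundles. The following is a minimal sketch of that idea, not the actual Instagram implementation: the `Bundle` type and the `renderScriptTags` function are hypothetical names, and bundle paths are assumed to be trusted, server-controlled strings.

```typescript
// Hypothetical description of one JavaScript bundle on the page.
interface Bundle {
  src: string;     // assumed to be a trusted, server-controlled path
  async?: boolean; // true for route bundles that the client router loads lazily
}

// Emits preload hints for ALL bundles first, in the order they will be
// needed, followed by the script tags in that same order.
function renderScriptTags(bundles: Bundle[]): string[] {
  const preloads = bundles.map(
    (b) => `<link rel="preload" href="${b.src}" as="script" />`
  );
  const scripts = bundles.map(
    (b) =>
      `<script src="${b.src}" type="text/javascript"${b.async ? " async" : ""}></script>`
  );
  return preloads.concat(scripts);
}

// The bundle names are the ones used in the markup above.
const tags = renderScriptTags([
  { src: "Common.js" },
  { src: "Consumer.js" },
  { src: "SomeConsumerRoute.js", async: true },
]);
```

In a real page the preload tags would go near the top of the document and the script tags further down, with other markup in between; the sketch only shows that both lists come from the same ordered source.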
Image preload
One of the main parts of instagram.com is the feed, a page of images and videos that supports endless scrolling. We fill this page as follows: first we load the initial set of posts, and then, as the user scrolls, we load additional sets. However, we do not want the user to wait for new posts to load every time they reach the bottom of the feed. So, to keep the page convenient to use, we load new sets of posts before the user reaches the end of the feed.
In practice this is not an easy task, for several reasons:
- We do not want the loading of materials that the user cannot yet see to take network and processor resources away from the materials they are currently viewing.
- We do not want to transmit unnecessary data over the network by preloading too many posts that the user might never see. On the other hand, if we do not preload enough, the user will often "run into" the end of the feed.
- The instagram.com project is designed to work on a variety of devices and screen sizes. So we display images in the feed using the srcset attribute of the <img> element. This attribute lets the browser decide, given the screen size, which image resolution to use. This means that it is not easy for us to determine in advance which image resolution to preload, and there is a risk of preloading images that the browser will not use.
The approach we used to solve this problem was to create a prioritized task abstraction that is responsible for queuing asynchronous tasks (in this case, tasks that preload the next set of posts for the feed). Such a task is initially queued with idle priority (using requestIdleCallback). This means that it will not start while the browser is busy with other important work. However, if the user scrolls close enough to the end of the currently loaded set of posts, the priority of the preload task changes to high. This is done by canceling the pending idle callback, after which the preload starts immediately.
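A prioritized task of this kind could be sketched roughly as follows. This is a simplified illustration, not the abstraction Instagram actually uses: the `PrioritizedTask` class, the `IdleScheduler` interface and the `browserIdleScheduler` helper are hypothetical names. Only `requestIdleCallback` and `cancelIdleCallback` are real browser APIs, and they are looked up on `globalThis` so that the sketch also type-checks outside a browser.

```typescript
// Minimal scheduler interface, so the task can be driven by the browser's
// idle callbacks or by a stand-in outside the browser.
interface IdleScheduler {
  request(callback: () => void): number;
  cancel(handle: number): void;
}

// Scheduler backed by the real requestIdleCallback / cancelIdleCallback
// browser APIs (assumes they are available in the current environment).
function browserIdleScheduler(): IdleScheduler {
  const g = globalThis as any;
  return {
    request: (callback) => g.requestIdleCallback(() => callback()),
    cancel: (handle) => g.cancelIdleCallback(handle),
  };
}

// A task that is queued with idle priority and can later be upgraded to
// high priority. The work runs at most once.
class PrioritizedTask {
  private readonly work: () => void;
  private readonly scheduler: IdleScheduler;
  private handle: number | null = null;
  private finished = false;

  constructor(work: () => void, scheduler: IdleScheduler) {
    this.work = work;
    this.scheduler = scheduler;
    // Idle priority: the work waits until the browser has nothing
    // more important to do.
    this.handle = scheduler.request(() => this.execute());
  }

  // High priority: cancel the pending idle callback and run right away.
  prioritize(): void {
    if (this.finished) {
      return;
    }
    if (this.handle !== null) {
      this.scheduler.cancel(this.handle);
    }
    this.execute();
  }

  private execute(): void {
    if (this.finished) {
      return;
    }
    this.finished = true;
    this.handle = null;
    this.work();
  }
}

// Usage in a browser (fetchNextBatch is a hypothetical function):
//   const task = new PrioritizedTask(fetchNextBatch, browserIdleScheduler());
//   // later, when the user scrolls close to the end of the loaded posts:
//   task.prioritize();
```

Keeping the scheduler behind a small interface is a design choice made for this sketch; it keeps the priority upgrade logic independent of the browser and easy to exercise in isolation.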
At the beginning and in the middle of the feed, the data preload task has idle priority; near the end of the feed, its priority is high
Once the JSON data for the next batch of posts has been downloaded, we queue a sequential background task that preloads the images from this batch. Images are preloaded one after another, in the order in which the posts appear in the feed, rather than in parallel. This lets us prioritize loading data and displaying images for the posts closest to the part of the page the user is looking at. To download images of the correct size, we use a hidden media component whose parameters match those of the current feed. Inside this component is an <img> element that uses the srcset attribute, the same one used to display real posts in the feed. This leaves the decision about which images to preload to the browser, so it applies the same logic when displaying images as it did when preloading them. It also means that we can use a similar media component to preload images for other areas of the site, such as user profile pages.
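The sequential image preloading described above might look something like the sketch below. It is an approximation under several assumptions: the `FeedImage` type and the `preloadInFeedOrder` and `loadWithImgElement` names are hypothetical, and a detached image element created with the browser's `Image` constructor stands in for the hidden media component, which this article does not show.

```typescript
// Hypothetical description of one feed image. Both fields mirror the
// attributes of the visible <img> element in the feed.
interface FeedImage {
  srcset: string;
  sizes: string;
}

// Loads one image and calls `done` once it has finished or failed.
type LoadImage = (image: FeedImage, done: () => void) => void;

// Browser loader: a detached <img> stands in for the hidden media component.
// It receives the same srcset/sizes as the real feed image, so the browser
// itself chooses which candidate to download.
const loadWithImgElement: LoadImage = (image, done) => {
  const img = new (globalThis as any).Image();
  img.onload = done;
  img.onerror = done; // a failed image should not stall the queue
  img.sizes = image.sizes; // set sizes first so source selection can use it
  img.srcset = image.srcset;
};

// Preloads images one at a time, in feed order, rather than in parallel.
function preloadInFeedOrder(
  images: FeedImage[],
  load: LoadImage,
  onFinished?: () => void
): void {
  let index = 0;
  const next = (): void => {
    if (index >= images.length) {
      if (onFinished) {
        onFinished();
      }
      return;
    }
    const image = images[index];
    index += 1;
    // The next image is requested only after this one has completed.
    load(image, next);
  };
  next();
}

// Usage in a browser (batchImages is a hypothetical array of FeedImage):
//   preloadInFeedOrder(batchImages, loadWithImgElement);
```

Because each request starts only after the previous one completes, images nearest to the user's current position in the feed are fetched first and do not compete with later ones for bandwidth.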
Together, the improvements described above reduced photo load time by 25%. Here we mean the time between the moment a post's markup is added to the DOM and the moment its image is loaded and displayed. They also reduced by 56% the time users spent waiting for new posts to load after reaching the end of the feed.
Dear readers! Do you use data preloading mechanisms to optimize your web projects?