Making instagram.com faster. Part 1

In recent years, instagram.com has gained many new features: Stories, filters, creative tools, notifications, and direct messages. As the project grew, however, this had an unfortunate side effect: the performance of instagram.com began to decline. Over the past year, the Instagram development team has worked continuously to fix this. As a result, the total loading time of the Instagram feed page has decreased by almost 50%.







Today we are publishing a translation of the first article in a series about how instagram.com was made faster.



About optimizing the performance of web projects









Performance improvement over the past year (Instagram feed, Display Done metric, ms.)



One of the most important approaches to improving web application performance is to correctly prioritize the loading and processing of resources and to reduce browser idle time during page load. In our case, many of these optimizations proved more effective than reducing code size. As a rule, we had no complaints about code size; it was compact enough. It only began to concern us after many small improvements had been made to the project (we also plan to cover code size optimization later in the series). These optimizations also had less impact on the product development process, requiring fewer code changes and less refactoring. As a result, we initially concentrated our efforts on this area, starting with resource preloading.



Preloading images, JavaScript, and query data, and where to be careful



The general principle behind our optimizations was to tell the browser as early as possible which resources are needed to load the page. As the developers, we often knew in advance exactly what would be needed, but the browser could not know until a certain portion of the page had been loaded and processed. These resources were mostly ones loaded dynamically by JavaScript (for example, other scripts, images, and XHR requests), because the browser cannot discover such dependent resources until it has parsed and executed some JavaScript.



Instead of waiting for the browser to discover these resources on its own, we could give it a hint so that it could start downloading them immediately. We did this using HTML preload tags, that is, link elements with rel="preload". It looks something like this:



 <link rel="preload" href="my-js-file.js" as="script" type="text/javascript" />
      
      





We use such hints for two types of resources on the critical page load path: dynamically loaded JavaScript and the GraphQL XHR requests that fetch data. Dynamically loaded scripts are scripts loaded via import('...') for specific client-side routes. We maintain a mapping between server-side entry points and client-side route scripts, so when the server receives a page request, we know which client-side route scripts will be needed and can add the corresponding hints while generating the page HTML.



For example, for the FeedPage entry point we know that the client router will eventually request FeedPageContainer.js. As a result, we can add the following to the page:



 <link rel="preload" href="/static/FeedPageContainer.js" as="script" type="text/javascript" />
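The mapping from server entry points to client route scripts could be sketched roughly as follows. This is only an illustration: the `ROUTE_SCRIPTS` table and the `preloadTagsFor` and `escapeAttr` helpers are hypothetical names, and the article does not show how Instagram's server actually generates these hints.

```javascript
// Hypothetical mapping from server entry points to the client route
// scripts that the router will later request via import('...').
const ROUTE_SCRIPTS = {
  FeedPage: ["/static/FeedPageContainer.js"],
};

// Escape characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build the preload hints to emit into the page HTML for an entry point.
// Unknown entry points simply produce no hints.
function preloadTagsFor(entryPoint, routeScripts = ROUTE_SCRIPTS) {
  const known = Object.prototype.hasOwnProperty.call(routeScripts, entryPoint);
  const scripts = known ? routeScripts[entryPoint] : [];
  return scripts
    .map(
      (href) =>
        `<link rel="preload" href="${escapeAttr(href)}" as="script" type="text/javascript" />`
    )
    .join("\n");
}

console.log(preloadTagsFor("FeedPage"));
```

Keeping the table on the server means the hint can be written into the HTML before any client-side code has run, which is the point of the technique.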
      
      





Similarly, if we know that a particular page entry point will issue a GraphQL query, we can preload that query too. This is especially important because such GraphQL queries can take a long time to execute, and the page cannot render until the query results are returned. For this reason, we want the server to start generating the response as early as possible.



 <link rel="preload" href="/graphql/query?id=12345" as="fetch" type="application/json" />
      
      





The change in page loading behavior is especially noticeable on slow connections. With a simulated fast 3G connection, the first waterfall chart below (no preloading) shows that loading FeedPageContainer.js and executing its associated GraphQL query start only after Consumer.js has finished loading. With preloading, however, both the FeedPageContainer.js download and the GraphQL query can begin as soon as the page HTML is available. This also reduces the time needed to load any secondary scripts that are lazy-loaded. Here, FeedSidebarContainer.js and ActivityFeedBox.js (which depend on FeedPageContainer.js) begin loading almost immediately after Consumer.js has been processed.









Preload not used









Preload used



Benefits of prioritizing preload



Besides starting resource downloads sooner, preload tags have another advantage: they raise the network priority of asynchronous script loading. This matters for asynchronously loaded scripts on the critical page load path, since by default they load with low priority. As a result, XHR requests and images in the visible area of the page get a higher priority than content outside the viewport. This can lead to situations where the critical scripts needed to render the page are either blocked or forced to share bandwidth with other resources. If you're interested, there is a detailed account of Chrome's resource priorities. Thoughtful use of preloading (more on this below) gives the developer a degree of control over how the browser prioritizes the initial page load. This is especially useful when the developer knows which resources are important for displaying the page correctly.



Preload prioritization issues



The problem with preloading is precisely that it gives the developer extra leverage over resource loading priority, which means the developer bears more responsibility for getting the priorities right. For example, when testing the site in regions with very slow mobile and WiFi networks and a high rate of packet loss, we noticed that requests triggered by <link rel="preload" as="script"> tags were being prioritized over the requests for the <script /> tags of the JavaScript bundles on the critical page rendering path. This increased the overall page load time.



The problem stemmed from how we laid out the preload tags on our pages: we added preload hints only for the bundles of the current page that were going to be loaded asynchronously by the client router.

 <!-- preload hints for the asynchronously loaded route bundles -->
 <link rel="preload" href="SomeConsumerRoute.js" as="script" />
 <link rel="preload" href="..." as="script" />
 ...
 <!-- script tags for the critical bundles -->
 <script src="Common.js" type="text/javascript"></script>
 <script src="Consumer.js" type="text/javascript"></script>
      
      





For example, on the logged-out page we were preloading SomeConsumerRoute.js before Common.js and Consumer.js. Since preloaded resources are downloaded with a higher priority but are not parsed, this delayed the start of parsing for Common.js and Consumer.js. The Chrome Data Saver team found a similar problem with preloading and described their solution. In their case, they decided to always place the preload tags for asynchronous resources after the <script /> tags of the resources that use them. We decided instead to preload all scripts and to place the hints in the order in which the scripts will be needed. This lets us start preloading all of the page's script resources as early as possible, including scripts whose synchronous tags cannot be added to the HTML until certain server data has been placed on the page. It also lets us control the order in which scripts load.



Here is the markup that preloads all JavaScript bundles.

 <!-- preload hints for the critical bundles -->
 <link rel="preload" href="Common.js" as="script" />
 <link rel="preload" href="Consumer.js" as="script" />
 <!-- preload hints for the asynchronously loaded route bundles -->
 <link rel="preload" href="SomeConsumerRoute.js" as="script" />
 ...
 <!-- script tags for the bundles -->
 <script src="Common.js" type="text/javascript"></script>
 <script src="Consumer.js" type="text/javascript"></script>
 <script src="SomeConsumerRoute.js" type="text/javascript" async></script>
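A small helper can make this ordering rule explicit: preload hints for every bundle come first, in the order the bundles will be needed, and the script tags follow. The sketch below is an assumption about how such markup might be generated; `buildScriptMarkup` is a hypothetical name, and bundle names are treated as trusted build outputs.

```javascript
// Hypothetical helper: emit preload hints for every bundle in the order
// the bundles will be needed, followed by the script tags themselves.
// Bundle names are assumed to be trusted build outputs, so no HTML
// escaping is done here.
function buildScriptMarkup(criticalBundles, asyncRouteBundles) {
  const preload = (src) => `<link rel="preload" href="${src}" as="script" />`;
  const script = (src, isAsync) =>
    `<script src="${src}" type="text/javascript"${isAsync ? " async" : ""}></script>`;

  return [
    // Hints first: critical bundles before route bundles.
    ...criticalBundles.map(preload),
    ...asyncRouteBundles.map(preload),
    // Then the tags that actually execute the scripts.
    ...criticalBundles.map((src) => script(src, false)),
    ...asyncRouteBundles.map((src) => script(src, true)),
  ];
}

console.log(
  buildScriptMarkup(["Common.js", "Consumer.js"], ["SomeConsumerRoute.js"]).join("\n")
);
```

Generating both groups from the same two lists keeps the preload order and the execution order from drifting apart.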
      
      





Image preload



One of the main surfaces of instagram.com is the feed, a page of images and videos that supports infinite scrolling. We populate it by loading an initial batch of posts and then loading additional batches as the user scrolls. However, we do not want the user to wait for new content every time they reach the bottom of the feed, so to keep the page pleasant to use, we load new batches before the user reaches the end.



In practice, for several reasons, this is not an easy task.





To solve this problem, we built a prioritized task abstraction that handles the queuing of asynchronous work (in this case, preloading the next batch of feed posts). Such a task is initially queued with idle priority (using requestIdleCallback), which means it will not start while the browser is busy with other important work. However, if the user scrolls close enough to the end of the currently loaded posts, the priority of the preload task is raised to high. This is done by canceling the pending idle callback, after which preloading starts immediately.
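The idle-then-high behavior can be sketched as a small class. This is not Instagram's implementation: `PrioritizedTask` and `upgradeToHigh` are hypothetical names, and the scheduler is passed in as a parameter (defaulting to the browser globals `requestIdleCallback` and `cancelIdleCallback`) so the sketch can also be exercised outside a browser.

```javascript
// A task that starts at "idle" priority and can be upgraded to "high".
// The scheduler object is injectable; by default the browser's global
// requestIdleCallback / cancelIdleCallback functions are used.
class PrioritizedTask {
  constructor(run, scheduler = globalThis) {
    this.run = run;
    this.scheduler = scheduler;
    this.priority = "idle";
    this.done = false;
    // Wait until the browser reports idle time before doing the work.
    this.handle = scheduler.requestIdleCallback(() => this.execute());
  }

  // Intended to be called when the user scrolls close to the end of the
  // posts that are already loaded.
  upgradeToHigh() {
    if (this.done) return;
    this.priority = "high";
    // Cancel the pending idle callback and run the work immediately.
    this.scheduler.cancelIdleCallback(this.handle);
    this.execute();
  }

  execute() {
    if (this.done) return; // never run the work twice
    this.done = true;
    this.run();
  }
}
```

In a browser, a scroll or intersection handler would call `upgradeToHigh()` on the pending task once the user nears the end of the loaded posts; how that proximity is measured is left open here.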









At the beginning and in the middle of the feed, the data preload task has idle priority; near the end of the feed, its priority is high



Once the JSON data for the next batch of posts has been downloaded, we queue a background task that preloads the images from that batch. Images are preloaded one after another, in the order in which the posts appear in the feed, rather than in parallel. This lets us prioritize downloading and displaying the images for the posts closest to the user's viewport. To make sure images of the correct size are downloaded, we use a hidden media component whose dimensions match those of the current feed. Inside this component is an <img> element with a srcset attribute, the same one used to display real posts in the feed. This means we can leave the decision about which image to preload to the browser, and the browser will apply the same logic when displaying an image as it did when preloading it. It also means that we can use the same media component to preload images for other areas of the site, such as user profile pages.
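The "one after another, in feed order" part of this scheme can be illustrated with a short sketch. It is a simplification under stated assumptions: `preloadSequentially` and `browserLoadImage` are hypothetical names, and the default loader uses a plain `Image` object with a single URL instead of the hidden media component with srcset that the article describes.

```javascript
// Default loader: assumes a browser environment with the Image constructor.
// It reports completion on both success and failure so that one broken
// image does not stall the rest of the batch.
function browserLoadImage(url, done) {
  const img = new Image();
  img.onload = img.onerror = () => done();
  img.src = url;
}

// Preload images one at a time, in feed order, rather than in parallel.
// `loadImage(url, done)` is injectable so the sketch can be tested
// without a browser.
function preloadSequentially(urls, loadImage = browserLoadImage, onFinished = () => {}) {
  let index = 0;
  const next = () => {
    if (index >= urls.length) {
      onFinished();
      return;
    }
    const url = urls[index++];
    // Request the next image only after this one has settled, so the
    // posts nearest the viewport are fetched first.
    loadImage(url, next);
  };
  next();
}
```

Loading sequentially trades some total throughput for ordering: the image the user will see next is never competing for bandwidth with one much further down the batch.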



Together, these improvements reduced the time required to load photos by 25%. This is the time between the moment a post's markup is added to the DOM and the moment its image is loaded and displayed. They also reduced by 56% the time users spent waiting for new content after reaching the end of the feed.



Dear readers! Do you use data preloading mechanisms to optimize your web projects?







