The author of this article writes that StackPath is happy with the level of confidence in code quality that its testing system provides. Here he shares the testing principles the company developed and describes the tools it uses.
Testing Principles
Before talking about specific tools, it is worth asking what makes a good test. Before starting work on our customer portal, we formulated and wrote down the principles we wanted our tests to follow. Doing this first is exactly what later helped us choose our tools.
Here are the four principles in question.
▍ Principle No. 1. Tests should be treated as an optimization problem
An effective testing strategy solves an optimization problem: maximize a certain value (here, the level of confidence that the application works correctly) while minimizing certain costs (here, the time required to write, maintain, and run the tests). When writing tests, we often ask questions related to this principle:
- What is the likelihood that this test will find an error?
- Does this test improve our testing system? Is the cost of writing and maintaining it worth the benefit it brings?
- Could the same level of confidence in the entity under test be achieved with another test that is easier to write, maintain, and run?
▍ Principle No. 2. Overuse of mocks should be avoided
One of my favorite explanations of the term "mock" was given in this talk from the Assert.js 2018 conference. The speaker explores the question more deeply than I will here. In the talk, creating mocks is compared to "punching holes in reality," and I find that a very vivid way of thinking about them. Our tests do contain mocks, but we always weigh the reduction in a test's "cost" that a mock provides, by making the test easier to write and run, against the reduction in the test's value caused by punching yet another hole in reality.
Previously, our programmers relied heavily on unit tests in which all child dependencies were mocked out using Enzyme's shallow rendering API. Components rendered this way were then checked against Jest snapshots. All such tests followed the same pattern:
```tsx
it('renders <Component />', () => {
  const wrapper = shallow(<Component />); // shallow render: child components are stubbed out
  expect(wrapper).toMatchSnapshot();
});
```
Tests like these punch holes in reality all over the place. This approach makes it very easy to reach 100% code coverage. Such tests require very little thought to write, but without checking the numerous integration points they are not worth much: the whole suite can pass while giving little confidence that the application actually works. Worse, every mock has a hidden "price" that has to be paid after the test is written.
▍ Principle No. 3. Tests should make refactoring easier, not harder
Tests like the one shown above make refactoring harder. If I notice duplicated code in many places in the project and extract it into a separate component, every snapshot test for the components that now use the new component will fail. Shallow-rendered output is simply no longer the same: where there used to be repeated markup, there is now a reference to a new component.
More complex refactoring, in which some components are added to the project and others removed, causes even more churn, since new tests have to be added and obsolete tests deleted. Regenerating snapshots is easy, but what is the value of such tests? Even if they could catch a bug, it would be easy to overlook it in a long series of snapshot changes and simply accept the new snapshots without spending much time on them.
As a result, such tests do little to help refactoring. Ideally, no test should fail when I perform a refactoring that leaves what the user sees and interacts with unchanged. Conversely, if I change what the user interacts with, at least one test should fail. If tests follow these two rules, they are an excellent tool for guaranteeing that what users encounter does not change during refactoring.
▍ Principle No. 4. Tests should reproduce how real users work with the application
I would like tests to fail only when something the user interacts with has changed. That means tests should work with the application the same way users do. For example, a test should genuinely interact with form elements and type text into input fields, just as a user would. Tests should not reach into components to call their lifecycle methods directly, write to component state, or do anything else that relies on the subtleties of a component's implementation. Since, ultimately, I want to test the part of the system that the user touches, tests should reproduce the actions of real users as closely as possible.
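To make this principle concrete, here is a minimal sketch of such a test using the tools discussed below; the `LoginForm` component and its `onSubmit` prop are hypothetical, invented purely for illustration:

```tsx
import React from 'react';
import { render, screen, fireEvent } from '@testing-library/react';
import { LoginForm } from './LoginForm'; // hypothetical component

it('submits the email the user typed', () => {
  const onSubmit = jest.fn();
  render(<LoginForm onSubmit={onSubmit} />);

  // Find elements the way a user finds them: by visible label and role.
  fireEvent.change(screen.getByLabelText(/email/i), {
    target: { value: 'user@example.com' },
  });
  fireEvent.click(screen.getByRole('button', { name: /log in/i }));

  // Assert on the observable outcome, not on component internals.
  expect(onSubmit).toHaveBeenCalledWith('user@example.com');
});
```

Note that the test never touches component state or lifecycle methods: it only does what a user could do and checks what a user could observe.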
Testing tools
Now that we have defined the goals we want to achieve, let's talk about the tools we chose to reach them.
▍TypeScript
Our code base uses TypeScript. Our backend services are written in Go and communicate with each other over gRPC, which lets us generate typed gRPC clients for use in the GraphQL server. The GraphQL server's resolvers are typed using types generated by graphql-code-generator. And finally, our query, mutation, and subscription components and hooks are fully typed. Full type coverage of the code base eliminates a whole class of errors caused by data having a different shape than the programmer expects. Generating types from the schema and protobuf files guarantees that the whole system, across the entire stack, stays consistent.
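As an illustration of how generated types tie the stack together, here is a minimal hypothetical sketch of a typed resolver; `QueryResolvers` stands in for what graphql-code-generator would emit from a schema, and `siteClient` is an assumed typed gRPC client:

```ts
// Hypothetical sketch: the import path and `siteClient` are assumptions.
import { QueryResolvers } from './generated/graphql';

export const Query: QueryResolvers = {
  // The compiler verifies that `site` exists in the schema, that `args`
  // has the declared shape, and that the result matches the schema's Site type.
  site: (_parent, args, context) => {
    return context.siteClient.getSite({ id: args.id });
  },
};
```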
▍Jest (unit testing)
As a unit testing framework we use Jest together with @testing-library/react. In tests created with these tools we exercise functions or components in isolation from the rest of the system. We usually unit test the functions and components that are used most often in the application, or those with many code paths, since such paths are hard to cover with integration or end-to-end (E2E) tests.
For us, unit tests are a means of testing the small pieces. Integration and end-to-end tests do an excellent job of checking the system at a larger scale and give a good read on the overall health of the application. But sometimes you need to be sure the small details work, and writing integration tests for every possible use of a piece of code is too expensive.
For example, we need to check that keyboard navigation works in the component that implements a drop-down list, but we would not want to exercise every variant of that behavior while testing the whole application. So we test the navigation thoroughly in isolation, and when testing pages that use the component we only check the higher-level interactions.
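A minimal sketch of what such an isolated test might look like; the `Dropdown` component and the way it exposes its highlight through `aria-selected` are assumptions made for illustration:

```tsx
import React from 'react';
import '@testing-library/jest-dom'; // adds matchers such as toHaveAttribute
import { render, screen, fireEvent } from '@testing-library/react';
import { Dropdown } from './Dropdown'; // hypothetical component

it('moves the highlight with the arrow keys', () => {
  render(<Dropdown options={['Stacks', 'Sites', 'Workloads']} />);

  // Drive the component the way a user would: with key presses.
  const input = screen.getByRole('combobox');
  fireEvent.keyDown(input, { key: 'ArrowDown' });
  fireEvent.keyDown(input, { key: 'ArrowDown' });

  // Assumes the component marks the highlighted option with aria-selected.
  expect(screen.getByText('Sites')).toHaveAttribute('aria-selected', 'true');
});
```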
▍Cypress (integration tests)
Integration tests written with Cypress are the core of our testing system. When we started building the StackPath portal, they were the first tests we wrote, because they provide a lot of value at very little cost. Cypress renders our entire application in a browser and runs test scripts against it. The whole frontend works exactly as it does for real users, except that the network layer is mocked out: every network request that would normally reach the GraphQL server returns mock data to the application instead.
Mocking the application's network layer has many strengths:
- Tests are faster. Even with an extremely fast backend, the time spent waiting for responses over a whole test suite adds up. With mocks answering the requests, responses come back instantly.
- Tests become more reliable. One of the difficulties of full end-to-end testing is that you have to account for variable network conditions and server-side data that can change. With network access simulated by mocks, that variability disappears.
- It is easy to reproduce situations that require exact conditions. For example, in a real system it is hard to make specific requests fail consistently. If you need to check how the application responds to failed requests, mocks make reproducing such failures trivial.
Although replacing the entire backend with mocks sounds daunting, all the mock data is typed using the same generated TypeScript types that the application uses. So the data is guaranteed, at least structurally, to be equivalent to what the real backend would return. In most of our tests we happily accept the trade-offs of using mocks instead of real server calls.
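For illustration, here is a rough sketch of such a test. It uses `cy.intercept` from newer Cypress versions; the route, operation name, and fixture contents are assumptions, but the fixture itself is checked against the generated types:

```ts
import { SitesQuery } from '../generated/graphql'; // assumed generated types

const sitesFixture: SitesQuery = {
  sites: [{ id: 'site-1', domain: 'example.com' }],
};

it('renders the list of sites', () => {
  // Answer the Sites GraphQL query with typed mock data.
  cy.intercept('POST', '/graphql', (req) => {
    if (req.body.operationName === 'Sites') {
      req.reply({ body: { data: sitesFixture } });
    }
  });

  cy.visit('/sites');
  cy.contains('example.com'); // the UI renders the mocked data
});
```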
In addition, Cypress is a pleasure for programmers to work with. Tests run in the Cypress Test Runner: test descriptions are displayed on the left, and the application under test runs in the main `iframe` element. After a run you can step through the individual stages of a test and see how the application behaved at each moment. Since the test runner itself works in the browser, you can debug tests with the browser's developer tools.
When writing frontend tests, a lot of time often goes into matching what the test is doing against the state of the DOM at a given point. Cypress greatly simplifies this, since the developer can see everything that happens to the application under test. Here is a video clip that demonstrates this.
These tests illustrate our testing principles well. The ratio of their value to their "price" suits us. The tests closely reproduce the actions of a real user interacting with the application, and only the project's network layer is mocked.
▍Cypress (end-to-end testing)
Our E2E tests are also written with Cypress, but in them we use no mocks at all, neither for the network layer nor for anything else. During these tests the application talks to a real GraphQL server backed by real instances of our backend services.
End-to-end testing is extremely valuable to us, because its results are what tell us whether the whole system works as expected. No mocks are used, so the application runs exactly as it does for real clients. However, end-to-end tests are "more expensive" than the others: they are slower, harder to write, more prone to flaky failures, and they require more work to keep the system in a known state before the tests run.
Tests usually need to start with the system in a known state; after a test completes, the system moves to another known state. With integration tests this is easy to achieve, because API calls are mocked, so every run happens under conditions the programmer fully controls. With E2E tests it is harder, since the server's data store holds information that can change during a run, and the developer needs some way of guaranteeing a known starting state.
At the start of an end-to-end test run, we execute a script that, through direct API calls, creates a fresh account with stacks, sites, workloads, monitors, and the like. Each test session uses a new instance of such an account, while everything else stays the same from run to run. When the script is done, it writes out a file with the data needed to run the tests (usually instance identifiers and domains). The script thus brings the system into a known state before the tests run.
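A simplified sketch of what such a seeding script might look like; the endpoints, payloads, and file location are invented for illustration, and a Node 18+ runtime with a global `fetch` is assumed:

```ts
import { writeFileSync } from 'fs';

// Minimal helper for authenticated-free illustration; real seeding would
// also handle auth. API_URL is an assumed environment variable.
async function api<T>(path: string, body: object): Promise<T> {
  const res = await fetch(`${process.env.API_URL}${path}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Seeding failed: ${path} -> ${res.status}`);
  return res.json() as Promise<T>;
}

async function seed(): Promise<void> {
  // Create a fresh account and the resources the tests expect.
  const account = await api<{ id: string }>('/accounts', {
    name: `e2e-${Date.now()}`,
  });
  const site = await api<{ id: string; domain: string }>('/sites', {
    accountId: account.id,
    domain: 'e2e-test.example.com',
  });

  // Write identifiers to a file that the test suite reads at startup.
  writeFileSync('cypress/fixtures/seed.json', JSON.stringify({ account, site }, null, 2));
}

seed().catch((err) => {
  console.error(err);
  process.exit(1);
});
```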
Since end-to-end testing is "more expensive" than the other kinds, we write fewer end-to-end tests than integration tests. We aim to cover the critical application features: registering and logging in users, creating and configuring a site or workload, and so on. Thanks to extensive integration tests we already know that the frontend is broadly functional; the end-to-end tests only need to make sure that nothing the other tests cannot detect goes wrong when the frontend is connected to the backend.
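For example, a mock-free check of the login flow might look roughly like this; the selectors and environment variables are assumptions, and every request here reaches the real backend:

```ts
it('logs in and lands on the dashboard', () => {
  cy.visit('/login');
  cy.get('input[name=email]').type(Cypress.env('E2E_EMAIL'));
  cy.get('input[name=password]').type(Cypress.env('E2E_PASSWORD'), { log: false });
  cy.contains('button', /log in/i).click();

  // Assert on what the user would see after a real round trip.
  cy.url().should('include', '/dashboard');
});
```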
Cons of our comprehensive testing strategy
Although we are very pleased with the tests and the stability of the application, there are also disadvantages to using a comprehensive testing strategy like ours.
To begin with, applying such a strategy means every team member has to be familiar with several testing tools rather than just one. Everyone needs to know Jest, @testing-library/react, and Cypress. And knowing the tools is not enough: developers also need to decide which one to use in which situation. When testing a new feature, should we write an end-to-end test, or is an integration test enough? And do we need a unit test on top of the end-to-end or integration test to verify the small implementation details of the feature?
Undoubtedly this puts an extra mental load on our programmers that they would not experience with a single tool. Usually we start with integration tests, and then, if we see that the feature in question is particularly important or depends heavily on the server side of the project, we add a corresponding end-to-end test. Or we start with unit tests, when we believe that an integration test could not reasonably cover all the subtleties of a mechanism's implementation.
Of course, we still run into situations where it is not clear where to start. But because we constantly make such decisions, patterns for common situations have begun to emerge. For example, we usually test form validation with unit tests, because a validation test has to cover many different scenarios. Everyone on the team knows this, so nobody wastes time planning a testing strategy when they need to test form validation.
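A sketch of what such a scenario-heavy validation test might look like; `validateEmail` is a hypothetical validator:

```ts
import { validateEmail } from './validation'; // hypothetical validator

// Table-driven Jest test: each scenario is one row, which keeps
// covering many validation cases cheap.
it.each([
  ['user@example.com', true],
  ['no-at-sign.example.com', false],
  ['user@', false],
  ['', false],
])('validateEmail(%s) returns %s', (input, expected) => {
  expect(validateEmail(input)).toBe(expected);
});
```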
Another drawback of our approach is that collecting code coverage data becomes more complicated. It is possible, but much harder than when a single tool tests the whole project. Although chasing pretty coverage numbers can degrade the quality of a test suite, coverage data is valuable for finding "holes" in it. The problem with using several testing tools is that, to understand which code is untested, you have to merge coverage reports produced by different systems. It can be done, but it is definitely harder than reading a report produced by a single tool.
Summary
Working with many testing tools has posed some hard problems for us. But each tool serves its own purpose, and in the end we believe we were right to include all of them in our testing system. Integration tests are the best place to start when building a test suite for a new application or when adding tests to an existing project. It also pays to add end-to-end tests covering the most important features as early as possible.
Once a test suite has end-to-end and integration tests, the developer should have a certain level of confidence in the health of the application whenever it changes. If, as the project evolves, bugs appear that the tests do not catch, it is worth considering which tests could have caught them, and whether their appearance points to flaws in the project's whole testing system.
Of course, we did not arrive at our current testing system right away, and we expect it to keep evolving as the project grows. But for now we really like our approach to testing.
Dear readers! What strategies do you follow in frontend testing? What frontend testing tools do you use?