Hello, Habr! My name is Artyom Dobrovinsky and I am an Android developer at FINCH.
One morning, wrapped in cigar smoke, I was reading through the sources of an ORM for Android. Spotting a package called `benchmarks`, I immediately looked inside and was surprised to find that all the measurements were done with `Log.d(System.nanoTime())`. It was not the first time I had seen this; to be honest, I have even seen benchmarks built on `System.currentTimeMillis()`. The dawning realization that something had to change made me put down my glass of whiskey and sit down at the keyboard.
Why this article was written
The state of affairs around measuring code performance on Android is sad.
However much profilers get talked about, in 2019 some people remain convinced that the JVM executes exactly what the developer wrote, in exactly the order it was written. Nothing could be further from the truth.
In reality, the poor virtual machine is fending off a billion careless keyboard-pushers who write their code without ever wondering how the processor will cope with it all. This battle has been going on for years, and the VM has a million tricky optimizations up its sleeve that, if ignored, will turn any measurement of a program's performance into a waste of time.
In other words, developers sometimes see no need to measure the performance of their code, and even more often do not know how. The difficulty is that a meaningful performance evaluation requires conditions that are as identical and as controlled as possible for every case; only then do the numbers tell you anything. Such conditions are created by tools that were not knocked together on someone's knee.
If you need arguments for using third-party frameworks to measure performance, you can always read Alexei Shipilev and marvel at the depth of the problem. It is all in the linked article: why warmup is needed before benchmarking, why `System.currentTimeMillis()` cannot be trusted at all for measuring elapsed time, plus jokes thrown in for free. Excellent reading.
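To make the pitfall concrete, here is the kind of "benchmark" this article argues against. A deliberately naive sketch; `parseAll()` is a made-up stand-in workload, not code from any of the libraries discussed here:

```kotlin
import kotlin.system.measureNanoTime

// A made-up workload standing in for "the code under test".
fun parseAll(): List<Int> = (0 until 1_000).map { it.toString().toInt() }

fun main() {
    // No warmup, a single run, and the result is discarded: the JIT may start
    // compiling this code mid-measurement, and it is free to eliminate the
    // whole computation as dead. The printed number says nothing about
    // steady-state performance.
    val elapsed = measureNanoTime { parseAll() }
    println("parseAll took $elapsed ns")
}
```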
Why am I qualified to talk about this?
The thing is, I am a well-rounded developer: not only do I know the Android SDK like my own pet project, I also spent a month writing backend code.
When I brought my first microservice to review and there were no benchmarks in the README, the reviewer looked at me in bewilderment. I took note and never repeated that mistake. Mostly because he quit a week later.
Let's go.
What are we measuring
For this database-benchmarking exercise on Android, I decided to measure initialization speed and write/read speed for the following ORMs: Paper, Hawk, Realm, and Room.
Yes, I am measuring NoSQL storage and a relational database in the same test. Next question?
What we measure with
It would seem that when the JVM is involved, the choice is obvious: there is the celebrated, polished, and impeccably documented JMH. But no: it cannot run instrumentation tests for Android.
Next comes Google's Caliper, with the same result.
There is a fork of Caliper called Spanner, which has been deprecated for years and points users toward Androidx Benchmark.
So let's settle on the latter, if only because we have no choice.
Like everything that was added to Jetpack without being rethought during the migration from the Support Library, Androidx Benchmark looks and behaves as if it had been written in a week and a half as a test assignment that nobody would ever touch again. On top of that, the library is slightly beside the point here: it is aimed more at measuring UI tests. But for want of anything better, it will do. At the very least it saves us from the obvious mistakes and takes care of warmup.
To make the results less ridiculous, I will run every test 10 times and compute the mean.
The test device is a Xiaomi A1: not the weakest phone on the market, running "clean" (stock) Android.
Connecting a library to a project
There are excellent instructions for connecting Androidx Benchmark to a project. I strongly advise you not to be lazy: set up a separate module for the measurements.
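For orientation, here is roughly what the benchmark module's build script looks like. A minimal sketch in the Gradle Kotlin DSL; the artifact versions are assumptions from the time of writing, so check the official instructions for current ones:

```kotlin
// benchmark/build.gradle.kts
plugins {
    id("com.android.library")
    kotlin("android")
}

android {
    defaultConfig {
        // The benchmark runner tries to stabilize the device during runs
        // (sustained-performance mode, keeping the screen on).
        testInstrumentationRunner = "androidx.benchmark.junit4.AndroidBenchmarkRunner"
    }
}

dependencies {
    androidTestImplementation("androidx.benchmark:benchmark-junit4:1.0.0")
    androidTestImplementation("androidx.test.ext:junit:1.1.1")
}
```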
How the experiment runs
All our benchmarks run in the same sequence (the test-class scaffolding they share is sketched right after this list):
- First, we initialize the database in the body of the test.
- Then, inside a `benchmarkRule.scope.runWithTimingDisabled` block, we generate the data we will feed to the database. Code placed in this closure is excluded from the measurement.
- In the same closure we clear the database, making sure it is empty before every write.
- Next comes the write and read logic itself. Be sure to assign the result of the read to a variable so that the JVM does not throw the read away as unused code.
- We measure the performance of database initialization in a separate function.
- We feel like people of science.
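The shared scaffolding looks roughly like this. A minimal sketch assuming the JUnit4 artifact of Androidx Benchmark; the class name is made up:

```kotlin
import androidx.benchmark.junit4.BenchmarkRule
import androidx.test.ext.junit.runners.AndroidJUnit4
import org.junit.Rule
import org.junit.runner.RunWith

@RunWith(AndroidJUnit4::class)
class PaperBenchmark {
    // The rule drives warmup and repetition; measureRepeated hangs off it.
    @get:Rule
    val benchmarkRule = BenchmarkRule()

    // The measurement functions, like the Paper example below, go here.
}
```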
The code can be found here. If you are too lazy to click through, the measurement function for Paper looks like this:
```kotlin
@Test
fun paperdbInsertReadTest() = benchmarkRule.measureRepeated {
    // Preparation (excluded from the measurement).
    benchmarkRule.scope.runWithTimingDisabled {
        Paper.book().destroy()
        if (Paper.book().allKeys.isNotEmpty()) throw RuntimeException()
    }
    // Write.
    repository.store(persons, { list -> Paper.book().write(KEY_CONTACTS, list) })
    // Read; the result is assigned so the JVM cannot discard the read.
    val persons = repository.read { Paper.book().read<List<Person>>(KEY_CONTACTS, emptyList()) }
}
```
The benchmarks for the remaining ORMs look similar.
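For illustration, the Room counterpart might look like the sketch below; `db`, `dao`, and their `insertAll`/`getAll` methods are assumed to be set up in the test class (the exact definitions live in the repository):

```kotlin
@Test
fun roomInsertReadTest() = benchmarkRule.measureRepeated {
    benchmarkRule.scope.runWithTimingDisabled {
        db.clearAllTables() // start every iteration from an empty database
    }
    // Write.
    dao.insertAll(persons)
    // Read; assigned so the JVM cannot eliminate it as unused.
    val readBack = dao.getAll()
}
```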
Results
Initialization
test name | mean | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
HawkInitTest | 49_512 | 49_282 | 50_021 | 49_119 | 50_145 | 49_970 | 50_047 | 46_649 | 50_230 | 49_863 | 49_794 |
PaperdbInitTest | 224 | 223 | 223 | 223 | 233 | 223 | 223 | 223 | 223 | 223 | 223 |
RealmInitTest | 218 | 217 | 217 | 217 | 217 | 217 | 217 | 217 | 227 | 217 | 217 |
RoomInitTest | 61_695.5 | 63_450 | 59_714 | 58_527 | 59_175 | 63_544 | 62_980 | 63_252 | 59_670 | 63_868 | 62_775 |
The winner is Realm, with Paper in second place. What Room spends its time on you can more or less imagine; why Hawk takes almost as long is a complete mystery.
Writing and reading
test name | mean | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
HawkInsertReadTest | 278_736_469.2 | 278_098_654 | 283_956_846 | 276_748_308 | 282_447_384 | 272_609_500 | 284_699_653 | 271_869_770 | 278_719_693 | 278_836_115 | 279_378_769 |
PaperdbInsertReadTest | 173_519_957.3 | 172_953_347 | 174_702_000 | 169_740_846 | 174_401_192 | 173_930_037 | 174_179_616 | 173_937_460 | 173_739_115 | 176_215_038 | 171_400_922 |
RealmInsertReadTest | 111_644_042.3 | 108_501_578 | 110_616_078 | 102_056_461 | 112_946_577 | 111_701_231 | 114_922_962 | 106_198_000 | 118_742_498 | 120_888_230 | 109_866_808 |
RoomInsertReadTest | 1_863_499_483.3 | 1_872_503_614 | 1_837_078_614 | 1_872_482_538 | 1_827_338_460 | 1_869_147_999 | 1_857_126_229 | 1_842_427_537 | 1_870_630_652 | 1_878_862_538 | 1_907_396_652 |
Realm wins again, but something about these results smells like a failure.
A fourfold difference between the two "slowest" databases and a sixteenfold difference between the "fastest" and the "slowest" is very suspicious, even granted that the difference is stable from run to run.
Conclusion
Measuring the performance of your code is worth it, if only out of curiosity. Even when we are talking about cases the industry has barely road-tested (such as measuring instrumentation tests on Android).
There is every reason to bring in third-party frameworks for the job (rather than writing your own, with timings and cheerleaders).
The state of today's codebases is that everyone tries to follow clean architecture, and for most projects the business-logic module is a plain Java module. Hooking up a JMH module next to it and checking the code for bottlenecks is a day's work. The benefit lasts for years.
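To show how little ceremony that takes, here is a minimal JMH benchmark sketch; `ContactsBenchmark` and its workload are hypothetical, not code from the projects above. Note that JMH generates subclasses, so in Kotlin the benchmark class must be open (or processed with the allopen plugin):

```kotlin
import java.util.concurrent.TimeUnit
import org.openjdk.jmh.annotations.Benchmark
import org.openjdk.jmh.annotations.BenchmarkMode
import org.openjdk.jmh.annotations.Mode
import org.openjdk.jmh.annotations.OutputTimeUnit
import org.openjdk.jmh.annotations.Scope
import org.openjdk.jmh.annotations.State

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
open class ContactsBenchmark {
    // Hypothetical input; in a real project this would be your domain data.
    private val contacts = List(1_000) { "Person #$it" }

    // JMH handles warmup, forking, and dead-code protection on its own;
    // returning the result keeps the computation alive.
    @Benchmark
    fun formatContacts(): List<String> = contacts.map { it.uppercase() }
}
```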
Happy coding!
P.S. If an attentive reader knows of a framework for benchmarking instrumentation tests on Android that is not mentioned in this article, please share it in the comments.
P.P.S. The test repository is open to pull requests.