Caching lets you build more scalable applications by storing the results of some queries in fast in-memory storage. However, improperly implemented caching can seriously degrade the user experience of your application. This article covers the basic concepts of caching, along with rules and taboos that I have learned on several past projects.
Does your project run fast and have no performance issues?
Forget about caching. Seriously :)
It will only complicate reading from the database without bringing any real benefit.
That said, Mohamed Said, at the beginning of this article, does some calculations and shows that in certain cases shaving milliseconds off the application can save a ton of money on your AWS bill. So if the projected savings on your project are more than $1.86, then maybe caching is a good idea.
When an application wants to get some data from the database, for example a Post entity by its id, it generates a unique cache key for this case ('post_' . $id is quite suitable) and tries to find a value by this key in a fast key-value storage (memcache, redis, or another). If the value is there, the application uses it. If not, it fetches the data from the database and stores it in the cache under this key for future use.
Keeping this value in the cache forever is not always a good idea, since the Post entity can be updated while the application keeps receiving the old, cached value.
Therefore, caching functions usually ask for how long the value should be stored.
After this time expires, memcache or redis will "forget" it and the application will fetch a fresh value from the database.
Example:
public function getPost($id): Post
{
    $key = 'post_' . $id;
    $post = \Cache::get($key);

    if ($post === null) {
        $post = Post::findOrFail($id);
        \Cache::put($key, $post, 900);
    }

    return $post;
}
Here I put the Post entity in the cache for 15 minutes (since version 5.8, Laravel treats this parameter as seconds; before that it was minutes). The Cache facade also has a convenient remember method for exactly this case. This code does the same thing as the previous one:
public function getPost($id): Post
{
    return \Cache::remember('post_' . $id, 900, function () use ($id) {
        return Post::findOrFail($id);
    });
}
The Laravel documentation has a Cache chapter that explains how to set up the necessary drivers for your application and describes the main functionality.
All standard Laravel drivers store data as strings. When we ask to store an Eloquent model instance in the cache, the driver uses the serialize function to turn the object into a string. The unserialize function restores the state of the object when we get it back from the cache.
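Roughly speaking, the driver does something like this under the hood (a simplified sketch, not the actual driver code):

$post = Post::findOrFail($id);

$stored = serialize($post);        // the string that ends up in redis/memcache
$restored = unserialize($stored);  // a Post instance rebuilt on the way back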
Almost any data can be cached: numbers, strings, arrays, objects (as long as they can be correctly serialized; see the descriptions of these functions at the links above).
Eloquent entities and collections are easy to cache and are the most popular values in a Laravel application's cache. However, other types are also used quite widely. The Cache::increment method is popular for implementing various counters, and atomic locks are quite useful when developers are fighting race conditions.
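For illustration, a counter and an atomic lock might look like this (the key and lock names here are my own examples, not from any particular project):

// A view counter stored directly in the cache.
\Cache::increment('post_views_' . $post->id);

// An atomic lock (supported by the redis and memcached drivers, among others):
// only one process at a time gets to run the callback.
\Cache::lock('rebuild_top_posts', 10)->get(function () {
    // heavy work protected from concurrent execution
});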
The first candidates for caching are queries that are executed very often but whose execution plan is not the simplest. The best example is the top 5 articles on the main page, or the latest news. Caching such values can greatly improve the performance of the main page.
Usually, fetching entities by id with Model::find($id) is very fast, but if the table is heavily loaded with numerous update, insert and delete queries, reducing the number of select queries will give the database a welcome respite. Entities with hasMany relationships that are loaded every time are also good candidates for caching. When I worked on a project with 10+ million visitors a day, we cached almost every select query.
Key expiration after a specified time helps to refresh the data in the cache, but it does not happen right away. A user can change the data and yet keep seeing the old version of it in the application for a while. A typical dialogue on one of my past projects:
User: I've edited the post, but the site still shows the old text!
Support: Just wait about 15 minutes (until the cache expires) and the changes will appear...
This behavior is very inconvenient for users, and the obvious solution quickly comes to mind: delete the old data from the cache as soon as we update it. This process is called invalidation. For simple keys like "post_%id%", invalidation is not very difficult.
Eloquent events can help here, and if your application generates its own events like PostPublished or UserBanned, it can be even simpler. Here is an example with Eloquent events. First, you need to create the event classes; for convenience, I will use an abstract base class for them:
abstract class PostEvent
{
    /** @var Post */
    private $post;

    public function __construct(Post $post)
    {
        $this->post = $post;
    }

    public function getPost(): Post
    {
        return $this->post;
    }
}

final class PostSaved extends PostEvent {}
final class PostDeleted extends PostEvent {}
Of course, according to PSR-4, each class must live in its own file. Next, set up the Post Eloquent class (see the documentation):
class Post extends Model
{
    protected $dispatchesEvents = [
        'saved' => PostSaved::class,
        'deleted' => PostDeleted::class,
    ];
}
Create a listener for these events and register it in the EventServiceProvider:
class EventServiceProvider extends ServiceProvider
{
    protected $listen = [
        PostSaved::class => [
            ClearPostCache::class,
        ],
        PostDeleted::class => [
            ClearPostCache::class,
        ],
    ];
}

class ClearPostCache
{
    public function handle(PostEvent $event)
    {
        \Cache::forget('post_' . $event->getPost()->id);
    }
}
This code removes the cached value after every update or deletion of a Post entity. Invalidating entity lists, such as the top-5 articles or breaking news, is a bit more complicated. I have seen three strategies:
Just do not touch these values. Usually this does not cause any problems. It's okay if a new article shows up in the list of the latest news a bit later (unless, of course, this is a big news portal). But for some projects it is really important to have fresh data in these lists.
Each time you update a publication, you can try to find it in the cached lists and, if it is there, delete that cached value:
public function getTopPosts()
{
    return \Cache::remember('top_posts', 900, function () {
        return Post::query()/* conditions for the top-5 */->get();
    });
}

class CheckAndClearTopPostsCache
{
    public function handle(PostEvent $event)
    {
        $updatedPost = $event->getPost();
        $posts = \Cache::get('top_posts', []);

        foreach ($posts as $post) {
            if ($updatedPost->id == $post->id) {
                \Cache::forget('top_posts');
                return;
            }
        }
    }
}
It looks ugly, but it works.
If the order of the items in the list is unimportant, you can store only the ids of the entries in the cache. Having received the ids, you can build a list of keys of the form 'post_' . $id and fetch all the values with the Cache::many method, which gets many values from the cache in one request (this is also called a multi get); a sketch of this approach is below.
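A minimal sketch of this strategy, assuming the key names 'top_post_ids' and 'post_' . $id used earlier (my own naming, not a prescribed convention):

public function getTopPosts(): array
{
    // Cache only the ids of the top posts.
    $ids = \Cache::remember('top_post_ids', 900, function () {
        return Post::query()/* conditions for the top-5 */->pluck('id')->all();
    });

    $keys = array_map(function ($id) {
        return 'post_' . $id;
    }, $ids);

    // One multi-get request instead of a separate request per post.
    $cached = \Cache::many($keys);

    $posts = [];
    foreach ($ids as $id) {
        $post = $cached['post_' . $id];

        // Fall back to the database for posts missing from the cache.
        if ($post === null) {
            $post = Post::findOrFail($id);
            \Cache::put('post_' . $id, $post, 900);
        }

        $posts[] = $post;
    }

    return $posts;
}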
Cache invalidation is not called one of the two hard things in programming for nothing, and in some cases it is genuinely difficult.
Caching entities with relationships requires extra attention.
$post = Post::findOrFail($id);

foreach ($post->comments as $comment) {
    // ...
}
This code performs two SELECT queries: one to get the entity by id and one to get the comments by post_id. Let's add caching:
public function getPost($id): Post
{
    return \Cache::remember('post_' . $id, 900, function () use ($id) {
        return Post::findOrFail($id);
    });
}

$post = getPost($id);

foreach ($post->comments as $comment) {
    // ...
}
The first query is now cached, but the second is not: when the cache driver writes the Post to the cache, the comments are not yet loaded. If we want to cache them too, we must load them explicitly:
public function getPost($id): Post
{
    return \Cache::remember('post_' . $id, 900, function () use ($id) {
        $post = Post::findOrFail($id);
        $post->load('comments');

        return $post;
    });
}
Both queries are now cached, but we have to invalidate the 'post_' . $id value every time a comment is added. That is not very efficient, so it is better to cache the comments separately:
public function getPostComments(Post $post)
{
    return \Cache::remember('post_comments_' . $post->id, 900, function () use ($post) {
        return $post->comments;
    });
}

$post = getPost($id);
$comments = getPostComments($post);

foreach ($comments as $comment) {
    // ...
}
Sometimes an entity and its relationship are tightly coupled and are always used together (an order with its details, a publication with its translation into the desired language). In that case, storing them in one cache entry is perfectly fine.
If invalidation is implemented in the project, cache keys are generated in at least two places: for the Cache::get / Cache::remember calls and for the Cache::forget calls. I have run into situations where the key was changed in one place but not in the other, and invalidation silently broke. The usual advice for such cases is constants, but cache keys are generated dynamically, so I use dedicated classes that generate the keys:
final class CacheKeys
{
    public static function postById($id): string
    {
        return 'post_' . $id;
    }

    public static function postComments($postId): string
    {
        return 'post_comments_' . $postId;
    }
}

\Cache::remember(CacheKeys::postById($id), 900, function () use ($id) {
    return Post::findOrFail($id);
});

// ....

\Cache::forget(CacheKeys::postById($id));
Key lifetimes can also be extracted into constants for the sake of readability. These 900 or 15 * 60 values increase the cognitive load when reading the code.
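For example, lifetimes could live in a small class of constants next to the key generators (the class and constant names below are my own, just for illustration):

final class CacheTtl
{
    public const POST = 900;           // 15 minutes
    public const TOP_POSTS = 60 * 60;  // 1 hour
}

\Cache::remember(CacheKeys::postById($id), CacheTtl::POST, function () use ($id) {
    return Post::findOrFail($id);
});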
When implementing write operations, such as changing the title or text of a publication, it is tempting to reuse the getPost method written earlier:
$post = getPost($id);
$post->title = $newTitle;
$post->save();
Please do not do this. The value in the cache may be outdated even if invalidation is implemented correctly: one small race condition and the publication will lose the changes made by another user. Optimistic locks will at least keep changes from being lost, but the number of failed requests can grow significantly.
The best solution is to use completely different entity selection logic for read and write operations (hello, CQRS). In write operations, you always need to select the latest value from the database. And do not forget about locks (optimistic or pessimistic) for important data.
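A minimal sketch of such a write operation (updatePostTitle is a hypothetical method of mine): it bypasses the cache, reads the fresh row under a pessimistic lock, and relies on the PostSaved event from earlier to invalidate the cached copy.

public function updatePostTitle($id, string $newTitle): void
{
    \DB::transaction(function () use ($id, $newTitle) {
        // lockForUpdate() issues SELECT ... FOR UPDATE inside the transaction.
        $post = Post::query()->lockForUpdate()->findOrFail($id);

        $post->title = $newTitle;
        $post->save(); // fires the 'saved' event, which clears 'post_' . $id
    });
}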
I think this is enough for an introductory article. Caching is a complex and broad topic, full of traps for developers, but the performance gains sometimes outweigh all the difficulties.