Python 3.x: how to speed up a decorator, no registration or SMS required

It all started with this article. Then a comment appeared on it. As a result, I dug into the reference material, buried myself in the debugger, and managed to optimize the code from the first part of this story. I invite you to walk through the main points with me.



First, I want to thank Mogost. Thanks to his comment, I rethought my approach to Python. I had previously heard that there are plenty of memory-wasteful folks among Pythonistas, but it turned out that I had somehow imperceptibly joined that party myself.



So, let's begin. First, let's think about where the bottlenecks actually were.



The ever-present if checks:

```python
if isinstance(self.custom_handlers, property):
if self.custom_handlers and e.__class__ in self.custom_handlers:
if e.__class__ not in self.exclude:
```

and this is not the limit. So I removed some of the ifs and moved other work into __init__, i.e. into a place where it runs only once. In particular, the check for property needs to happen just once: the decorator is applied to the method a single time, and the class property will not change afterwards, so there is no point in re-checking it on every call.
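The principle can be sketched in isolation. Below is a minimal, illustrative decorator factory of my own (the names are not from the article's code): all one-time normalization happens when the decorator is built, so the per-call path carries no checks.

```python
# Illustrative sketch: do one-time setup at decoration time, not per call.
def handled(custom_handlers=None):
    handlers = dict(custom_handlers or {})  # normalized exactly once

    def decorator(func):
        def wrapper(*args, **kwargs):
            # hot path: no isinstance/property checks remain here
            try:
                return func(*args, **kwargs)
            except Exception as e:
                if e.__class__ in handlers:
                    return handlers[e.__class__](e)
                raise

        return wrapper

    return decorator

@handled({ZeroDivisionError: lambda e: 'caught'})
def div(a, b):
    return a // b

print(div(1, 0))  # -> caught
print(div(4, 2))  # -> 2
```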



The "if ... in" checks deserve a separate mention. The profiler showed that each such in incurs its own call, so I decided to merge all the handlers into a single dict. That eliminated those ifs entirely; instead we simply use:

```python
self.handlers.get(e.__class__, self.handlers[Exception])(e)
```

So self.handlers is a dict whose entry under the Exception key is a function that raises any exception we chose not to handle; that entry serves as the default for the lookup.
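A minimal, self-contained sketch of this dict-based dispatch (the names here are illustrative, not from the article's code):

```python
# One dict lookup replaces a chain of "if ... in" checks.
def raise_exception(e):
    raise e

handlers = {
    ZeroDivisionError: lambda e: 'handled: division by zero',
    Exception: raise_exception,  # default: re-raise anything unhandled
}

def safe_call(func, *args):
    try:
        return func(*args)
    except Exception as e:
        return handlers.get(e.__class__, handlers[Exception])(e)

print(safe_call(lambda a, b: a // b, 1, 0))  # -> handled: division by zero
```

An exception type without its own entry falls through to the Exception default and is re-raised unchanged.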



Of course, wrapper deserves special attention. It is the function that runs on every call to the decorated function, so it pays to avoid unnecessary checks there and to hoist as much work as possible into __init__ or __call__. Here is what wrapper looked like before:

```python
def wrapper(self, *args, **kwargs):
    if self.custom_handlers:
        if isinstance(self.custom_handlers, property):
            self.custom_handlers = self.custom_handlers.__get__(self, self.__class__)

    if asyncio.iscoroutinefunction(self.func):
        return self._coroutine_exception_handler(*args, **kwargs)
    else:
        return self._sync_exception_handler(*args, **kwargs)
```

The number of checks goes through the roof, and every one of them runs on each call of the decorated function. So wrapper turned into this:

```python
def __call__(self, func):
    self.func = func

    if iscoroutinefunction(self.func):
        def wrapper(*args, **kwargs):
            return self._coroutine_exception_handler(*args, **kwargs)
    else:
        def wrapper(*args, **kwargs):
            return self._sync_exception_handler(*args, **kwargs)

    return wrapper
```

Recall that __call__ runs only once. Inside __call__, depending on whether the function is a coroutine function, we return the matching wrapper. I also want to note that asyncio.iscoroutinefunction makes one extra call compared to inspect.iscoroutinefunction, so I switched to the latter. Here are the cProfile benchmarks for asyncio and inspect:



```
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 coroutines.py:160(iscoroutinefunction)
        1    0.000    0.000    0.000    0.000 inspect.py:158(isfunction)
        1    0.000    0.000    0.000    0.000 inspect.py:179(iscoroutinefunction)
        1    0.000    0.000    0.000    0.000 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
```

```
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 inspect.py:158(isfunction)
        1    0.000    0.000    0.000    0.000 inspect.py:179(iscoroutinefunction)
        1    0.000    0.000    0.000    0.000 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.isinstance}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
```

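For what it's worth, the two functions agree on a plain async def function; the asyncio version historically adds a check for generator-based coroutines marked with the legacy @asyncio.coroutine decorator, which accounts for the extra call seen in the first trace. A quick sanity check:

```python
import asyncio
import inspect

async def coro():
    pass

def plain():
    pass

# both give the same answer for an ordinary "async def" function
assert inspect.iscoroutinefunction(coro)
assert not inspect.iscoroutinefunction(plain)
assert asyncio.iscoroutinefunction(coro) == inspect.iscoroutinefunction(coro)
print('checks passed')
```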
Full code:

```python
from inspect import iscoroutinefunction
from asyncio import QueueEmpty, QueueFull
from concurrent.futures import TimeoutError


class ProcessException(object):
    __slots__ = ('func', 'handlers')

    def __init__(self, custom_handlers=None):
        self.func = None

        if isinstance(custom_handlers, property):
            custom_handlers = custom_handlers.__get__(self, self.__class__)

        def raise_exception(e: Exception):
            raise e

        exclude = {
            QueueEmpty: lambda e: None,
            QueueFull: lambda e: None,
            TimeoutError: lambda e: None
        }

        self.handlers = {
            **exclude,
            **(custom_handlers or {}),
            Exception: raise_exception
        }

    def __call__(self, func):
        self.func = func

        if iscoroutinefunction(self.func):
            def wrapper(*args, **kwargs):
                return self._coroutine_exception_handler(*args, **kwargs)
        else:
            def wrapper(*args, **kwargs):
                return self._sync_exception_handler(*args, **kwargs)

        return wrapper

    async def _coroutine_exception_handler(self, *args, **kwargs):
        try:
            return await self.func(*args, **kwargs)
        except Exception as e:
            return self.handlers.get(e.__class__, self.handlers[Exception])(e)

    def _sync_exception_handler(self, *args, **kwargs):
        try:
            return self.func(*args, **kwargs)
        except Exception as e:
            return self.handlers.get(e.__class__, self.handlers[Exception])(e)
```

The example would probably be incomplete without timeit. So, taking the example from the comment mentioned above:

```python
class MathWithTry(object):

    def divide(self, a, b):
        try:
            return a // b
        except ZeroDivisionError:
            return '   ,   '
```

and the example from the text of the previous article (NOTE: here we pass e into the lambda; this was absent in the previous article and appeared only with these changes):

```python
class Math(object):

    @property
    def exception_handlers(self):
        return {
            ZeroDivisionError: lambda e: '   ,   '
        }

    @ProcessException(exception_handlers)
    def divide(self, a, b):
        return a // b
```

Here are the results:

```
>>> timeit.timeit('math_with_try.divide(1, 0)', number=100000, setup='from __main__ import math_with_try')
0.05079065300014918
>>> timeit.timeit('math_with_decorator.divide(1, 0)', number=100000, setup='from __main__ import math_with_decorator')
0.16211646200099494
```

In conclusion, I want to say that optimization is, in my view, a rather tricky process, and it is important not to get carried away and not to optimize something at the expense of readability. Otherwise, debugging the optimized code will be extremely difficult.



Thank you for your comments. I look forward to comments on this article too :)



PS Thanks to comments from Habr users, it was possible to speed things up even more. Here is the result:

```python
from inspect import iscoroutinefunction
from asyncio import QueueEmpty, QueueFull
from concurrent.futures import TimeoutError


class ProcessException(object):
    __slots__ = ('handlers',)

    def __init__(self, custom_handlers=None):
        if isinstance(custom_handlers, property):
            custom_handlers = custom_handlers.__get__(self, self.__class__)

        raise_exception = ProcessException.raise_exception

        exclude = {
            QueueEmpty: lambda e: None,
            QueueFull: lambda e: None,
            TimeoutError: lambda e: None
        }

        self.handlers = {
            **exclude,
            **(custom_handlers or {}),
            Exception: raise_exception
        }

    def __call__(self, func):
        handlers = self.handlers

        if iscoroutinefunction(func):
            async def wrapper(*args, **kwargs):
                try:
                    return await func(*args, **kwargs)
                except Exception as e:
                    return handlers.get(e.__class__, handlers[Exception])(e)
        else:
            def wrapper(*args, **kwargs):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    return handlers.get(e.__class__, handlers[Exception])(e)

        return wrapper

    @staticmethod
    def raise_exception(e: Exception):
        raise e
```

```
>>> timeit.timeit('divide(1, 0)', number=100000, setup='from __main__ import divide')
0.13714907199755544
```

This shaved off about 0.03 s on average. Thanks to Kostiantyn and Yngvie.
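To make the final shape easy to try end to end, here is a condensed, function-based sketch of the same approach (my own simplification, not the article's class): the coroutine check happens once, at decoration time, and both paths share the same dict lookup.

```python
from inspect import iscoroutinefunction
import asyncio

def _raise(e):
    raise e

def process_exception(custom_handlers=None):
    # default handler first, so custom entries can override it
    handlers = {Exception: _raise, **(custom_handlers or {})}

    def decorator(func):
        if iscoroutinefunction(func):  # decided once, not per call
            async def wrapper(*args, **kwargs):
                try:
                    return await func(*args, **kwargs)
                except Exception as e:
                    return handlers.get(e.__class__, handlers[Exception])(e)
        else:
            def wrapper(*args, **kwargs):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    return handlers.get(e.__class__, handlers[Exception])(e)
        return wrapper

    return decorator

@process_exception({ZeroDivisionError: lambda e: 'sync: no dividing by zero'})
def divide(a, b):
    return a // b

@process_exception({ZeroDivisionError: lambda e: 'async: no dividing by zero'})
async def adivide(a, b):
    return a // b

print(divide(1, 0))                # -> sync: no dividing by zero
print(asyncio.run(adivide(1, 0)))  # -> async: no dividing by zero
```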



PS Updated! Based on comments from onegreyonewhite and resetme, I optimized the code even further: I replaced self.func with a plain func closure variable and put self.handlers into a local variable. Execution became even faster, which is especially noticeable with a larger number of repetitions of the division by zero. Quoting timeit:

```
>>> timeit.timeit('t.divide_with_decorator(1, 0)', number=1000000, setup='from __main__ import t')
1.1116105649998644
```

Before this optimization, a run with the same number value took about 1.24 s on average.
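The effect of hoisting self.handlers into a local variable can be seen with a micro-benchmark of my own (the names are illustrative; absolute numbers vary by machine, so none are promised here):

```python
import timeit

class Holder:
    def __init__(self):
        self.handlers = {ZeroDivisionError: lambda e: None}

h = Holder()

def attr_lookup_each_time():
    for _ in range(1000):
        h.handlers.get(ZeroDivisionError)  # dot lookup on every iteration

def local_variable():
    handlers = h.handlers  # one dot lookup, then plain local access
    for _ in range(1000):
        handlers.get(ZeroDivisionError)

# on most machines the local-variable version is measurably faster
print(timeit.timeit(attr_lookup_each_time, number=2000))
print(timeit.timeit(local_variable, number=2000))
```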



PS I squeezed out even more by turning the raise_exception function from __init__ into a @staticmethod and accessing it through a local variable, removing the attribute (dot) lookup. The average execution time became:

```
>>> timeit.timeit('t.divide_with_decorator(1, 0)', number=1000000, setup='from __main__ import t')
1.0691639049982768
```

That is for the method. Plain functions are called even faster (on average):

```
>>> timeit.timeit('div(1, 0)', number=1000000, setup='from __main__ import div')
1.0463485610016505
```
