Buggy Python code: the 10 most common mistakes developers make

About Python

Python is an interpreted, object-oriented, high-level programming language with dynamic semantics. Its built-in high-level data structures, combined with dynamic typing and dynamic binding, make it very attractive for Rapid Application Development (RAD), as well as for use as a scripting or glue language to connect existing components or services. Python supports modules and packages, thereby encouraging program modularity and code reuse.

About this article

Python's simple, easy-to-learn syntax can be misleading for developers (especially those who are just starting to learn the language), so it is easy to lose sight of some important subtleties and to underestimate the variety of solutions Python makes possible.

With this in mind, this article presents a “top 10” list of subtle, hard-to-find mistakes that even advanced Python developers can make.

Mistake # 1: misusing expressions as default values for function arguments

Python allows you to indicate that a function can have optional arguments by setting a default value for them. This, of course, is a very convenient feature of the language, but can lead to unpleasant consequences if the type of this value is mutable. For example, consider the following function definition:

    >>> def foo(bar=[]):        # bar is optional and defaults to [] if not specified
    ...     bar.append("baz")   # but this line could be problematic, as we'll see...
    ...     return bar

A common mistake is to think that the optional argument is set to the specified default expression each time the function is called without a value for it. In the code above, for example, one might expect that calling foo() repeatedly (that is, without specifying a bar argument) would always return a list containing only "baz", on the assumption that each time foo() is called without bar, bar is set to [] (that is, a new empty list).

But let's see what will actually happen:

    >>> foo()
    ["baz"]
    >>> foo()
    ["baz", "baz"]
    >>> foo()
    ["baz", "baz", "baz"]

Huh? Why does the function keep appending "baz" to the existing list each time foo() is called, instead of creating a new list every time?

The answer requires a deeper understanding of what goes on in Python “under the hood”. Namely, the default value for a function argument is evaluated only once, at the time the function is defined. Thus, the bar argument is initialized to its default (i.e., an empty list) only when foo() is first defined, and subsequent calls to foo() (i.e., without specifying the bar argument) continue to use the same list that was created for bar when the function was defined.

For reference, a common “workaround” for this error is the following definition:

    >>> def foo(bar=None):
    ...     if bar is None:    # or if not bar:
    ...         bar = []
    ...     bar.append("baz")
    ...     return bar
    ...
    >>> foo()
    ["baz"]
    >>> foo()
    ["baz"]
    >>> foo()
    ["baz"]
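The shared default can also be observed directly. The following is a minimal sketch for Python 3, using a hypothetical function name append_baz; it relies only on the standard `__defaults__` attribute of function objects.

```python
def append_baz(bar=[]):
    """Deliberately uses a mutable default to show the pitfall."""
    bar.append("baz")
    return bar

first = append_baz()
second = append_baz()

# Both calls returned the very same list object, not two fresh lists.
shared = first is second

# The default lives on the function object itself and has been mutated.
stored_default = append_baz.__defaults__[0]
print(shared, stored_default is first, stored_default)
```

Because the default is stored once on the function object, every call that omits bar mutates that one list.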

Mistake # 2: misuse of class variables

Consider the following example:

    >>> class A(object):
    ...     x = 1
    ...
    >>> class B(A):
    ...     pass
    ...
    >>> class C(A):
    ...     pass
    ...
    >>> print A.x, B.x, C.x
    1 1 1

Everything seems to be in order.

    >>> B.x = 2
    >>> print A.x, B.x, C.x
    1 2 1

Yeah, everything was as expected.

    >>> A.x = 3
    >>> print A.x, B.x, C.x
    3 2 3

What the hell?! We only changed A.x. Why did C.x change too?

In Python, class variables are internally handled as dictionaries and follow what is often referred to as the Method Resolution Order (MRO). So in the above code, since the attribute x is not found in class C, it is looked up in its base classes (only A in the above example, although Python supports multiple inheritance). In other words, C does not have its own x attribute independent of A. Thus, references to C.x are in fact references to A.x. This causes problems unless it is handled properly, so when learning Python, pay special attention to class attributes and how they work.
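The lookup described above can be inspected with the built-in vars() function and the `__mro__` attribute. This is a small Python 3 sketch that repeats the article's classes; the variable names own_x and lookup_order are just illustrative.

```python
class A:
    x = 1

class B(A):
    pass

class C(A):
    pass

B.x = 2   # creates a new attribute in B's own namespace
A.x = 3   # rebinds the attribute that C still inherits

# Which classes actually store an "x" in their own namespace?
own_x = {cls.__name__: "x" in vars(cls) for cls in (A, B, C)}

# The order in which Python searches for attributes of C.
lookup_order = [cls.__name__ for cls in C.__mro__]

print(own_x, lookup_order, (A.x, B.x, C.x))
```

C never gets an x of its own, so C.x keeps following A.x, while B.x is shielded by the attribute created on B.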

Mistake # 3: incorrect parameters for the except block

Suppose you have the following piece of code:

    >>> try:
    ...     l = ["a", "b"]
    ...     int(l[2])
    ... except ValueError, IndexError:  # To catch both exceptions, right?
    ...     pass
    ...
    Traceback (most recent call last):
      File "<stdin>", line 3, in <module>
    IndexError: list index out of range

The problem here is that the except statement does not take a list of exceptions specified in this manner. Rather, in Python 2.x, the syntax “except Exception, e” is used to bind the exception to the optional second parameter (in this case, e) to make it available for further inspection. As a result, in the above code the IndexError exception is not caught by the except statement; instead, the caught exception ends up being bound to a name called IndexError.

The correct way to catch multiple exceptions using the except statement is to specify the first parameter as a tuple containing all the exceptions that you want to catch. Also, for maximum compatibility, use the as keyword, as this syntax is supported in both Python 2 and Python 3:

    >>> try:
    ...     l = ["a", "b"]
    ...     int(l[2])
    ... except (ValueError, IndexError) as e:
    ...     pass
    ...
    >>>
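To see that the tuple form really catches either exception, here is a small Python 3 sketch. The helper third_as_int is hypothetical and exists only for illustration.

```python
def third_as_int(items):
    """Return int(items[2]), or the name of the exception that was caught."""
    try:
        return int(items[2])
    except (ValueError, IndexError) as exc:
        return type(exc).__name__

print(third_as_int(["a", "b"]))         # too short: IndexError is caught
print(third_as_int(["a", "b", "c"]))    # not a number: ValueError is caught
print(third_as_int(["1", "2", "3"]))    # no exception at all
```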

Mistake # 4: misunderstanding Python scope rules

Python scope resolution is based on the so-called LEGB rule, which stands for Local (names assigned in any way within a function (def or lambda) and not declared global in that function), Enclosing (names in the local scope of any statically enclosing functions (def or lambda), from inner to outer), Global (names assigned at the top level of a module file, or declared global in a def within the file), and Built-in (names preassigned in the built-in names module: open, range, SyntaxError, ...). Seems straightforward enough, right? Well, actually, there are some subtleties to the way this works in Python, which brings us to the common, more advanced Python programming problem below. Consider the following example:

    >>> x = 10
    >>> def foo():
    ...     x += 1
    ...     print x
    ...
    >>> foo()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 2, in foo
    UnboundLocalError: local variable 'x' referenced before assignment

What is the problem?

The above error occurs because when you assign to a variable in a scope, Python automatically considers that variable local to that scope, and it shadows any variable of the same name in any outer scope.

Thus, many are surprised to get an UnboundLocalError in previously working code when it is modified by adding an assignment statement somewhere in the body of a function.

This feature is especially confusing to developers when using lists. Consider the following example:

    >>> lst = [1, 2, 3]
    >>> def foo1():
    ...     lst.append(5)   # This works ok...
    ...
    >>> foo1()
    >>> lst
    [1, 2, 3, 5]
    >>> lst = [1, 2, 3]
    >>> def foo2():
    ...     lst += [5]      # ... but this bombs!
    ...
    >>> foo2()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "<stdin>", line 2, in foo2
    UnboundLocalError: local variable 'lst' referenced before assignment

Huh? Why does foo2 fail while foo1 works fine?

The answer is the same as in the previous example, but it is admittedly more subtle. foo1 does not assign to lst, whereas foo2 does. Bearing in mind that lst += [5] is essentially shorthand for lst = lst + [5] as far as name binding is concerned, we see that we are trying to assign a value to lst (so Python presumes it to be in the local scope). However, the value we want to assign is based on lst itself (again, now presumed to be local), which has not yet been defined. Hence the error.
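One possible way out, when rebinding a module-level name really is the intent, is the global statement mentioned in the LEGB description. The sketch below is for Python 3 and uses hypothetical function names; it is not the only fix (passing the value in and returning the result is often cleaner).

```python
x = 10
lst = [1, 2, 3]

def bump_broken():
    x += 1          # the assignment makes x local, so reading it fails
    return x

def bump_global():
    global x        # explicitly refer to the module-level name
    x += 1
    return x

def extend_global():
    global lst      # the same applies to augmented assignment on a list
    lst += [5]
    return lst

try:
    bump_broken()
    outcome = "no error"
except UnboundLocalError:
    outcome = "UnboundLocalError"

print(outcome, bump_global(), extend_global())
```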

Mistake # 5: changing a list during iteration over it

The problem in the following piece of code should be fairly obvious:

    >>> odd = lambda x : bool(x % 2)
    >>> numbers = [n for n in range(10)]
    >>> for i in range(len(numbers)):
    ...     if odd(numbers[i]):
    ...         del numbers[i]  # BAD: Deleting item from a list while iterating over it
    ...
    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
    IndexError: list index out of range

Deleting an item from a list or array while iterating over it is a problem well known to any experienced software developer. But while the example above may be fairly obvious, even experienced developers can step on this rake in much more complex code.

Fortunately, Python incorporates a number of elegant programming paradigms which, when used properly, can result in significantly simpler and more streamlined code. A pleasant side benefit is that simpler code is much less likely to be bitten by the accidental deletion of a list item while iterating over it. One such paradigm is list comprehensions. Moreover, list comprehensions are particularly useful for avoiding this specific problem, as shown by this alternative implementation of the above code, which works perfectly:

    >>> odd = lambda x : bool(x % 2)
    >>> numbers = [n for n in range(10)]
    >>> numbers[:] = [n for n in numbers if not odd(n)]  # ahh, the beauty of it all
    >>> numbers
    [0, 2, 4, 6, 8]
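The example above assigns to the slice numbers[:] rather than to the name numbers. The article does not say why, but one plausible reason is that slice assignment changes the existing list in place, so other references to it see the result. A minimal Python 3 sketch, with illustrative variable names:

```python
def is_odd(n):
    return n % 2 == 1

numbers = list(range(10))
alias = numbers                     # a second reference to the same list

# Slice assignment replaces the contents of the existing list object.
numbers[:] = [n for n in numbers if not is_odd(n)]
in_place_seen = alias == [0, 2, 4, 6, 8]

others = list(range(10))
other_alias = others

# Plain assignment only rebinds the name; the old list is left untouched.
others = [n for n in others if not is_odd(n)]
rebinding_seen = other_alias == others

print(in_place_seen, rebinding_seen)
```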

Mistake # 6: misunderstanding how Python binds variables in closures

Consider the following example:

    >>> def create_multipliers():
    ...     return [lambda x : i * x for i in range(5)]
    ...
    >>> for multiplier in create_multipliers():
    ...     print multiplier(2)
    ...

You can expect the following output:

    0
    2
    4
    6
    8

But actually you get this:

    8
    8
    8
    8
    8

Surprise!

This happens because of Python's late binding behavior, which means that the values of variables used in closures are looked up at the time the inner function is called. So in the above code, whenever any of the returned functions is called, the value of i is looked up in the surrounding scope at the time of the call (and by then the loop has completed, so i has already been assigned its final value of 4).

The solution to this common Python problem would be:

    >>> def create_multipliers():
    ...     return [lambda x, i=i : i * x for i in range(5)]
    ...
    >>> for multiplier in create_multipliers():
    ...     print multiplier(2)
    ...
    0
    2
    4
    6
    8

Voilà! We take advantage of default arguments here to generate anonymous functions that achieve the desired behavior. Some would call this solution elegant. Some would call it subtle. Some hate it. But if you're a Python developer, it's important to understand in any case.
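For readers who dislike the default-argument trick, two alternatives that are not part of the original article are sketched below for Python 3: functools.partial and a small factory function. Both capture the current value of i at creation time; the function names are hypothetical.

```python
from functools import partial
from operator import mul

def late_bound():
    # Every lambda looks up i when called, after the loop has finished.
    return [lambda x: i * x for i in range(5)]

def with_partial():
    # partial() stores the current value of i as the first argument of mul.
    return [partial(mul, i) for i in range(5)]

def with_factory():
    # Each call to make() creates a new scope holding its own i.
    def make(i):
        return lambda x: i * x
    return [make(i) for i in range(5)]

print([m(2) for m in late_bound()])    # [8, 8, 8, 8, 8]
print([m(2) for m in with_partial()])  # [0, 2, 4, 6, 8]
print([m(2) for m in with_factory()])  # [0, 2, 4, 6, 8]
```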

Mistake # 7: creating cyclic module dependencies

Suppose you have two files, a.py and b.py, each of which imports the other, as follows:

In a.py:

    import b

    def f():
        return b.x

    print f()

In b.py:

    import a

    x = 1

    def g():
        print a.f()

First, try importing a.py:

    >>> import a
    1

It worked just fine. Perhaps that surprises you. After all, the modules import each other circularly, and that presumably should be a problem, right?

The answer is that simply having cyclic import of modules is not in itself a problem in Python. If the module has already been imported, Python is smart enough not to try to re-import it. However, depending on the point at which each module tries to access functions or variables defined in another, you may actually run into problems.

So, returning to our example, when we imported a.py, it had no problem importing b.py, since b.py does not require anything from a.py to be defined at the time it is imported. The only reference in b.py to a is the call to a.f(). But that call is inside g(), and nothing in a.py or b.py invokes g(). So everything works fine.

But what happens if we try to import b.py (without having previously imported a.py, that is)?

    >>> import b
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "b.py", line 1, in <module>
        import a
      File "a.py", line 6, in <module>
        print f()
      File "a.py", line 4, in f
        return b.x
    AttributeError: 'module' object has no attribute 'x'

Uh-oh. That's not good! The problem here is that, in the process of importing b.py, it attempts to import a.py, which in turn calls f(), which attempts to access b.x. But b.x has not yet been defined. Hence the AttributeError exception.

At least one solution to this problem is quite trivial. Simply modify b.py to import a.py inside g():

    x = 1

    def g():
        import a  # This will be evaluated only when g() is called
        print a.f()

Now when we import it, everything is fine:

    >>> import b
    >>> b.g()
    1    # Printed a first time since module 'a' calls 'print f()' at the end
    1    # Printed a second time, this one is our call to 'g'
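The order dependence can be reproduced in a single script. The sketch below is for Python 3 and is only an approximation of the article's setup: it writes two hypothetical modules, circ_demo_a and circ_demo_b, to a temporary directory, and stores the result of f() in a variable instead of printing it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module names, written to a temporary directory.
workdir = tempfile.mkdtemp()
Path(workdir, "circ_demo_a.py").write_text(
    "import circ_demo_b\n"
    "\n"
    "def f():\n"
    "    return circ_demo_b.x\n"
    "\n"
    "value = f()   # runs at import time, like 'print f()' in a.py\n"
)
Path(workdir, "circ_demo_b.py").write_text(
    "import circ_demo_a\n"
    "\n"
    "x = 1\n"
)
sys.path.insert(0, workdir)
importlib.invalidate_caches()

def fresh_import(name):
    """Import `name` after forgetting both demo modules."""
    for mod in ("circ_demo_a", "circ_demo_b"):
        sys.modules.pop(mod, None)
    return importlib.import_module(name)

# Importing b first: a runs f() before b has defined x.
try:
    fresh_import("circ_demo_b")
    b_first = "ok"
except AttributeError:
    b_first = "AttributeError"

# Importing a first: b finishes (x is defined) before a calls f().
a_first = fresh_import("circ_demo_a").value

print(b_first, a_first)
```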

Mistake # 8: name clashes with Python standard library modules

One of the beauties of Python is the wealth of library modules that it comes with out of the box. But as a result, if you are not consciously avoiding it, it is not that difficult to run into a name clash between one of your modules and a module with the same name in the standard library that ships with Python (for example, you might have a module named email.py in your code, which would conflict with the standard library module of the same name).

This can lead to serious problems. For example, a module may try to import the standard library version of a module, but because your project contains a module with the same name, yours is mistakenly imported instead.

Therefore, care should be taken not to use the same names as in the modules of the Python standard library. It is much easier to change the name of the module in your project than to submit a request to change the name of the module in the standard library and get approval for it.
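When a clash is suspected, one way to check which file an import would actually load is importlib.util.find_spec from the standard library. This is a rough Python 3 sketch; the helper name module_origin is hypothetical.

```python
import importlib.util

def module_origin(name):
    """Return the file a top-level `import name` would load, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin

# If this prints a path inside your project rather than the standard
# library, a local email.py is shadowing the standard module.
print(module_origin("email"))
print(module_origin("no_such_module_hopefully"))
```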

Mistake # 9: failing to take into account the differences between Python 2 and Python 3

Consider the following foo.py file:

    import sys

    def bar(i):
        if i == 1:
            raise KeyError(1)
        if i == 2:
            raise ValueError(2)

    def bad():
        e = None
        try:
            bar(int(sys.argv[1]))
        except KeyError as e:
            print('key error')
        except ValueError as e:
            print('value error')
        print(e)

    bad()

On Python 2, it will work fine:

    $ python foo.py 1
    key error
    1
    $ python foo.py 2
    value error
    2

But now let's see how it will work in Python 3:

    $ python3 foo.py 1
    key error
    Traceback (most recent call last):
      File "foo.py", line 19, in <module>
        bad()
      File "foo.py", line 17, in bad
        print(e)
    UnboundLocalError: local variable 'e' referenced before assignment

What just happened here? The “problem” is that in Python 3, the exception object is not accessible beyond the scope of the except block. (The reason is that otherwise it would keep a reference cycle with the stack frame alive in memory until the garbage collector runs and removes the references.)

One way to avoid this issue is to keep a reference to the exception object outside the scope of the except block so that it remains accessible. Here is a version of the previous example that uses this technique, yielding code that works in both Python 2 and Python 3:

    import sys

    def bar(i):
        if i == 1:
            raise KeyError(1)
        if i == 2:
            raise ValueError(2)

    def good():
        exception = None
        try:
            bar(int(sys.argv[1]))
        except KeyError as e:
            exception = e
            print('key error')
        except ValueError as e:
            exception = e
            print('value error')
        print(exception)

    good()

Run it in Python 3:

    $ python3 foo.py 1
    key error
    1
    $ python3 foo.py 2
    value error
    2

Hooray!
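The Python 3 behavior can be checked without command-line arguments. In this minimal sketch (Python 3 only, hypothetical function names), the name bound by "as" is gone once the except block ends, while a reference saved under another name survives.

```python
def name_after_except():
    try:
        raise KeyError(1)
    except KeyError as e:
        pass
    # Python 3 implicitly deletes `e` when the except block ends.
    return "e" in locals()

def saved_exception():
    saved = None
    try:
        raise KeyError(1)
    except KeyError as e:
        saved = e          # keep our own reference under a different name
    return saved

print(name_after_except(), repr(saved_exception()))
```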

Mistake # 10: improper use of the __del__ method

Let's say you have a mod.py file like this:

    import foo

    class Bar(object):
        ...
        def __del__(self):
            foo.cleanup(self.myhandle)

And you try to do this from another file, another_mod.py:

    import mod
    mybar = mod.Bar()

And you get an ugly AttributeError exception.

Why? Because when the interpreter shuts down, the module's global variables are all set to None. (This describes Python 2; since Python 3.4, module globals are in most cases no longer reset this way, although globals that __del__ needs may still be gone by the time it runs during shutdown.) As a result, in the above example, by the time __del__ is invoked, the name foo has already been set to None.

The solution to this somewhat advanced problem is to use atexit.register() instead. That way, when your program finishes executing (that is, when it exits normally), your registered handlers run before the interpreter shuts down.

With this in mind, the fix for the above mod.py code might look something like this:

    import foo
    import atexit

    def cleanup(handle):
        foo.cleanup(handle)

    class Bar(object):
        def __init__(self):
            ...
            atexit.register(cleanup, self.myhandle)

This implementation provides a clean and reliable way to call any needed cleanup functionality upon normal program termination. Obviously, it is up to foo.cleanup to decide what to do with the object bound to the name self.myhandle, but you get the idea.
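Since the foo module in the example is not shown, the following Python 3 sketch substitutes a hypothetical cleanup function that just prints its argument, and runs the program in a child interpreter so that the exit-time behavior can be observed from the outside.

```python
import subprocess
import sys

# A tiny child program; `cleanup` and the handle value are hypothetical
# stand-ins for foo.cleanup and self.myhandle in the article.
CHILD = '''
import atexit

def cleanup(handle):
    print("cleanup", handle)

class Bar(object):
    def __init__(self, handle):
        self.myhandle = handle
        atexit.register(cleanup, self.myhandle)

Bar("handle-1")
print("working")
'''

result = subprocess.run(
    [sys.executable, "-c", CHILD], capture_output=True, text=True, check=True
)
lines = result.stdout.splitlines()
print(lines)  # ['working', 'cleanup handle-1']
```

The handler runs after the last statement of the child program, at normal interpreter exit.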

Conclusion

Python is a powerful and flexible language with many mechanisms and paradigms that can greatly improve productivity. However, as with any software tool or language, a limited understanding or appreciation of its capabilities can lead to unforeseen problems during development.

Familiarity with the Python nuances covered in this article will help you make better use of the language while avoiding some of its more common mistakes.
