Go internals: capturing loop variables in closures







Today I decided to translate for you a short article about the internals of closures in Go. You will also learn how Go tries to decide automatically whether to use a pointer (reference) or a value in different cases. Understanding these things will help you avoid mistakes. Besides, I think these internals are just damn interesting!







I would also like to invite you to Golang Conf 2019, which will be held on October 7 in Moscow. I am a member of the conference program committee, and my colleagues and I have selected many equally hardcore and very, very interesting talks. Just what I love!







Under the cut, I give the floor to the author.









There is a page on the Go wiki titled Common Mistakes. Curiously, it contains only one example: misusing loop variables with goroutines:







    for _, val := range values {
        go func() {
            fmt.Println(val)
        }()
    }





This code will print the last value of the values slice len(values) times. The fix is very simple:







    // assume the type of each value is string
    for _, val := range values {
        go func(val string) {
            fmt.Println(val)
        }(val)
    }





This example is enough to understand the problem and never make this mistake again. But if you are interested in the implementation details, this article will give you a deep understanding of both the problem and the solution.
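The fix can be checked without relying on timing. Below is a minimal sketch, not code from the original article: the collect helper, the mutex-protected slice and the use of sync.WaitGroup are assumptions made for illustration. Each goroutine receives the value as an argument and therefore works on its own copy.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// collect is a hypothetical helper: it starts one goroutine per value and
// passes the value as an argument, so each goroutine gets its own copy.
func collect(values []string) []string {
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex
		out []string
	)
	for _, val := range values {
		wg.Add(1)
		// v is a per-goroutine copy of val.
		go func(v string) {
			defer wg.Done()
			mu.Lock()
			out = append(out, v)
			mu.Unlock()
		}(val)
	}
	wg.Wait()
	// Goroutines run in no particular order, so sort before returning.
	sort.Strings(out)
	return out
}

func main() {
	fmt.Println(collect([]string{"a", "b", "c"})) // [a b c]
}
```

Every input value appears exactly once in the result, whatever order the goroutines run in.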







Basic things: passing by value and passing by reference



In Go, there is a difference between passing objects by value and by reference [1]. Let's start with example 1 [2]:







    package main

    import (
        "fmt"
        "time"
    )

    func foobyval(n int) {
        fmt.Println(n)
    }

    func main() {
        for i := 0; i < 5; i++ {
            go foobyval(i)
        }
        time.Sleep(100 * time.Millisecond)
    }





Most likely no one doubts that this prints the values from 0 to 4, probably in some random order.







Now let's look at example 2:







    package main

    import (
        "fmt"
        "time"
    )

    func foobyref(n *int) {
        fmt.Println(*n)
    }

    func main() {
        for i := 0; i < 5; i++ {
            go foobyref(&i)
        }
        time.Sleep(100 * time.Millisecond)
    }





As a result, the following will be displayed:







    5
    5
    5
    5
    5







Understanding why we get exactly this result already gives us 80% of the understanding of the problem. So let's take some time to find the reasons.







The answer is right there in the Go language specification, which says:







Variables declared by the init statement are re-used in each iteration.

This means that while the program is running there is only one object, one piece of memory, for the variable i; a new one is not created for each iteration. This object simply takes on a new value at each iteration.
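The "one piece of memory" behavior can be observed directly. In the sketch below, pointersAfterLoop is a hypothetical helper, and the counter is deliberately declared outside the loop: Go 1.22 and later create a fresh variable per iteration for counters declared with := in the for clause, so an outer declaration is the assumption that models the sharing described here on any Go version.

```go
package main

import "fmt"

// pointersAfterLoop stores the address of the counter on every iteration
// and reads all of the stored pointers once the loop is over.
func pointersAfterLoop(n int) []int {
	// Declared outside the loop: a single variable for all iterations.
	var i int
	var ptrs []*int
	for i = 0; i < n; i++ {
		ptrs = append(ptrs, &i)
	}
	vals := make([]int, len(ptrs))
	for k, p := range ptrs {
		// Every pointer refers to the same i, which now holds n.
		vals[k] = *p
	}
	return vals
}

func main() {
	fmt.Println(pointersAfterLoop(3)) // [3 3 3]
}
```

All stored pointers are equal, so dereferencing them after the loop yields the final value of the counter every time.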







Let's look at the difference in the generated machine code [3] for the loop in examples 1 and 2. Let's start with example 1.







    0x0026 00038 (go-func-byval.go:14) MOVL $8, (SP)
    0x002d 00045 (go-func-byval.go:14) LEAQ "".foobyval·f(SB), CX
    0x0034 00052 (go-func-byval.go:14) MOVQ CX, 8(SP)
    0x0039 00057 (go-func-byval.go:14) MOVQ AX, 16(SP)
    0x003e 00062 (go-func-byval.go:14) CALL runtime.newproc(SB)
    0x0043 00067 (go-func-byval.go:13) MOVQ "".i+24(SP), AX
    0x0048 00072 (go-func-byval.go:13) INCQ AX
    0x004b 00075 (go-func-byval.go:13) CMPQ AX, $5
    0x004f 00079 (go-func-byval.go:13) JLT 33





The go statement turns into a call to the runtime.newproc function. The mechanics of this process are very interesting, but let's leave that for another article. For now we are more interested in what happens to the variable i. It is stored in the AX register, which is then passed by value through the stack to the foobyval function [4] as its argument. “By value” here means copying the value of the AX register onto the stack, so changing AX afterwards does not affect what was passed to foobyval.







And here is what example 2 looks like:







    0x0040 00064 (go-func-byref.go:14) LEAQ "".foobyref·f(SB), CX
    0x0047 00071 (go-func-byref.go:14) MOVQ CX, 8(SP)
    0x004c 00076 (go-func-byref.go:14) MOVQ AX, 16(SP)
    0x0051 00081 (go-func-byref.go:14) CALL runtime.newproc(SB)
    0x0056 00086 (go-func-byref.go:13) MOVQ "".&i+24(SP), AX
    0x005b 00091 (go-func-byref.go:13) INCQ (AX)
    0x005e 00094 (go-func-byref.go:13) CMPQ (AX), $5
    0x0062 00098 (go-func-byref.go:13) JLT 57





The code is very similar, with only one, but very important, difference. Now AX holds the address of i, not its value. Note also that the loop's increment and comparison operate on (AX), not on AX. So when we put AX on the stack, we are actually passing the address of i to the function, and any change to (AX) will be visible in the goroutine too.







No surprises: after all, we are passing a pointer to a number to the foobyref function.

In practice, the loop finishes before any of the created goroutines starts running. When they do start, they hold a pointer to the same variable i, not to a copy. And what is the value of i at that moment? It is 5, the very value at which the loop stopped. That is why all the goroutines print 5.
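The timing argument above can be made deterministic. The following sketch is an illustration under stated assumptions rather than the article's code: readThroughPointer and the start channel are hypothetical, the counter is declared outside the loop so that the result is the same on Go 1.22 and later (where a := loop counter is per-iteration), and the goroutines are held back until the loop has finished.

```go
package main

import (
	"fmt"
	"sync"
)

// readThroughPointer starts n goroutines that all receive the address of
// the same counter and read it only after the loop has finished.
func readThroughPointer(n int) []int {
	var (
		wg      sync.WaitGroup
		i       int
		results = make([]int, n)
		start   = make(chan struct{})
	)
	for i = 0; i < n; i++ {
		wg.Add(1)
		// slot is a copy of i; p is the address of the shared counter.
		go func(slot int, p *int) {
			defer wg.Done()
			// Block until the loop is over, then read through the pointer.
			<-start
			results[slot] = *p
		}(i, &i)
	}
	// The loop is done: release all goroutines and wait for them.
	close(start)
	wg.Wait()
	return results
}

func main() {
	fmt.Println(readThroughPointer(5)) // [5 5 5 5 5]
}
```

The slot argument is copied at the go statement and stays distinct for each goroutine, while the value read through p is the final value of the counter in every goroutine.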







Methods with value receivers vs. pointer receivers



Similar behavior can be observed when creating goroutines that invoke methods. The same wiki page points this out. Look at example 3:







    package main

    import (
        "fmt"
        "time"
    )

    type MyInt int

    func (mi MyInt) Show() {
        fmt.Println(mi)
    }

    func main() {
        ms := []MyInt{50, 60, 70, 80, 90}
        for _, m := range ms {
            go m.Show()
        }
        time.Sleep(100 * time.Millisecond)
    }





This example prints the elements of the ms slice, in random order, as expected. The very similar example 4 uses a pointer receiver for the Show method:







    package main

    import (
        "fmt"
        "time"
    )

    type MyInt int

    func (mi *MyInt) Show() {
        fmt.Println(*mi)
    }

    func main() {
        ms := []MyInt{50, 60, 70, 80, 90}
        for _, m := range ms {
            go m.Show()
        }
        time.Sleep(100 * time.Millisecond)
    }





Try to guess what the output will be: 90, printed five times. The reason is the same as in the simpler example 2. Here the problem is less noticeable because of Go's syntactic sugar for pointer receivers. When switching from example 1 to example 2 we changed i to &i, but here the call looks exactly the same: m.Show() in both examples, yet the behavior is different.







Not a happy combination of two Go features, it seems to me. Nothing at the call site indicates passing by reference. You have to look at the implementation of the Show method to see exactly how the call will happen (and the method, of course, can live in a completely different file or package).
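The same difference shows up in a single goroutine when methods are bound to a variable as method values. This is only an analogy to the go statement, offered as a sketch: the methods Val and Ptr and the bind helper are hypothetical names introduced for illustration. With a value receiver the receiver is copied when the method value is created; with a pointer receiver, m.Ptr is shorthand for (&m).Ptr and only the address is saved.

```go
package main

import "fmt"

type MyInt int

// Val has a value receiver: it works on a copy of the receiver.
func (mi MyInt) Val() int { return int(mi) }

// Ptr has a pointer receiver: it works on the original variable.
func (mi *MyInt) Ptr() int { return int(*mi) }

// bind creates both method values from m, then changes m before calling them.
func bind() (byValue, byPointer int) {
	m := MyInt(50)
	val := m.Val // the receiver is copied here
	ptr := m.Ptr // shorthand for (&m).Ptr: only the address of m is saved
	m = 90
	return val(), ptr()
}

func main() {
	fmt.Println(bind()) // 50 90
}
```

Both expressions look identical at the point of use, but only the pointer-receiver version observes the later assignment to m.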







In most cases, this feature is useful. We write cleaner code. But here, passing by reference leads to unexpected effects.







Closures



Finally we come to closures. Let's look at example 5:







    package main

    import (
        "fmt"
        "time"
    )

    func foobyval(n int) {
        fmt.Println(n)
    }

    func main() {
        for i := 0; i < 5; i++ {
            go func() {
                foobyval(i)
            }()
        }
        time.Sleep(100 * time.Millisecond)
    }





It prints the following:







    5
    5
    5
    5
    5







And this is despite the fact that i is passed by value to foobyval inside the closure, just as in example 1. But why? Let's look at the assembly for the loop:







    0x0040 00064 (go-closure.go:14) LEAQ "".main.func1·f(SB), CX
    0x0047 00071 (go-closure.go:14) MOVQ CX, 8(SP)
    0x004c 00076 (go-closure.go:14) MOVQ AX, 16(SP)
    0x0051 00081 (go-closure.go:14) CALL runtime.newproc(SB)
    0x0056 00086 (go-closure.go:13) MOVQ "".&i+24(SP), AX
    0x005b 00091 (go-closure.go:13) INCQ (AX)
    0x005e 00094 (go-closure.go:13) CMPQ (AX), $5
    0x0062 00098 (go-closure.go:13) JLT 57





The code is very similar to example 2: notice that i is represented by an address in the AX register. That is, we pass i by reference, even though it is foobyval that gets called. The body of the loop starts a function using runtime.newproc, but where does this function come from?







func1 is created by the compiler, and it represents the closure. The compiler has extracted the closure's code into a separate function and calls it from main. The main difficulty with such extraction is how to handle the variables that the closure uses but that are clearly not its arguments.







This is what the body of func1 looks like:







    0x0000 00000 (go-closure.go:14) MOVQ (TLS), CX
    0x0009 00009 (go-closure.go:14) CMPQ SP, 16(CX)
    0x000d 00013 (go-closure.go:14) JLS 56
    0x000f 00015 (go-closure.go:14) SUBQ $16, SP
    0x0013 00019 (go-closure.go:14) MOVQ BP, 8(SP)
    0x0018 00024 (go-closure.go:14) LEAQ 8(SP), BP
    0x001d 00029 (go-closure.go:15) MOVQ "".&i+24(SP), AX
    0x0022 00034 (go-closure.go:15) MOVQ (AX), AX
    0x0025 00037 (go-closure.go:15) MOVQ AX, (SP)
    0x0029 00041 (go-closure.go:15) CALL "".foobyval(SB)
    0x002e 00046 (go-closure.go:16) MOVQ 8(SP), BP
    0x0033 00051 (go-closure.go:16) ADDQ $16, SP
    0x0037 00055 (go-closure.go:16) RET





What is interesting here is that the function takes an argument at 24(SP), which is a pointer to int: look at the line MOVQ (AX), AX, which loads the value before passing it to foobyval. In effect, func1 looks something like this:







    func func1(i *int) {
        foobyval(*i)
    }

And the loop in main turns into something like this:

    for i := 0; i < 5; i++ {
        go func1(&i)
    }





We got the equivalent of example 2, which explains the output. In technical terms, i is a free variable inside the closure, and such variables are captured by reference in Go.
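Capture by reference is easy to observe without goroutines. The sketch below uses hypothetical helper names (counter, lateRead) and shows both directions: a closure modifying a free variable of the enclosing function, and a closure seeing an assignment made after it was created.

```go
package main

import "fmt"

// counter returns the value of x after the closure has incremented it twice.
func counter() int {
	x := 1
	// x is a free variable of the closure; the closure works on x itself.
	inc := func() { x++ }
	inc()
	inc()
	return x
}

// lateRead shows that a closure sees assignments made after it was created.
func lateRead() int {
	x := 1
	get := func() int { return x }
	x = 42
	return get()
}

func main() {
	fmt.Println(counter(), lateRead()) // 3 42
}
```

In both functions the closure and the enclosing function share a single variable x rather than each holding a copy.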







But is this always the case? Surprisingly, the answer is no. In some cases, free variables are captured by value. Here is a variation of our example:







    for i := 0; i < 5; i++ {
        ii := i
        go func() {
            foobyval(ii)
        }()
    }





This example will output 0, 1, 2, 3, 4 in random order. But why is the behavior here different from Example 5?







It turns out that this behavior is an artifact of the heuristic that the Go compiler uses when it works with closures.
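The ii := i variation can also be verified without time.Sleep. This is a sketch under assumptions: perIterationCopy is a hypothetical helper, and the results are gathered in a mutex-protected slice and sorted, because the goroutines run in no particular order.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// perIterationCopy starts n goroutines, each of which captures its own
// variable ii that is never reassigned after it is captured.
func perIterationCopy(n int) []int {
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex
		out []int
	)
	for i := 0; i < n; i++ {
		// A new variable is created on every iteration.
		ii := i
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			out = append(out, ii)
			mu.Unlock()
		}()
	}
	wg.Wait()
	sort.Ints(out)
	return out
}

func main() {
	fmt.Println(perIterationCopy(5)) // [0 1 2 3 4]
}
```

Since ii is declared inside the loop body, each closure refers to a different variable, so the result does not depend on when the goroutines actually run.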







Under the hood



If you are not familiar with the architecture of the Go compiler, I recommend reading my earlier articles on this topic: Part 1, Part 2.







The concrete (as opposed to abstract) syntax tree obtained by parsing the code looks like this:







    0: *syntax.CallStmt {
    . Tok: go
    . Call: *syntax.CallExpr {
    . . Fun: *syntax.FuncLit {
    . . . Type: *syntax.FuncType {
    . . . . ParamList: nil
    . . . . ResultList: nil
    . . . }
    . . . Body: *syntax.BlockStmt {
    . . . . List: []syntax.Stmt (1 entries) {
    . . . . . 0: *syntax.ExprStmt {
    . . . . . . X: *syntax.CallExpr {
    . . . . . . . Fun: foobyval @ go-closure.go:15:4
    . . . . . . . ArgList: []syntax.Expr (1 entries) {
    . . . . . . . . 0: i @ go-closure.go:15:13
    . . . . . . . }
    . . . . . . . HasDots: false
    . . . . . . }
    . . . . . }
    . . . . }
    . . . . Rbrace: syntax.Pos {}
    . . . }
    . . }
    . . ArgList: nil
    . . HasDots: false
    . }
    }





The function being called is represented by a FuncLit node, a function literal. When this tree is converted to the AST (abstract syntax tree), the function literal is extracted into a standalone function. This happens in the noder.funcLit method, which lives in gc/closure.go.







Then the type checker completes the transformation, and we get the following AST representation of the function:







    main.func1:
    . DCLFUNC l(14) tc(1) FUNC-func()
    . DCLFUNC-body
    . . CALLFUNC l(15) tc(1)
    . . . NAME-main.foobyval a(true) l(8) x(0) class(PFUNC) tc(1) used FUNC-func(int)
    . . CALLFUNC-list
    . . . NAME-main.i l(15) x(0) class(PAUTOHEAP) tc(1) used int





Note that the value passed to foobyval is NAME-main.i, that is, we explicitly point to the variable from the function that wraps the closure.







At this point a compiler stage called capturevars (“capture variables”) comes into play. Its job is to decide how to capture closed-over variables (that is, free variables used in closures). Here is a comment from the corresponding compiler function, which also describes the heuristic:







    // capturevars is called in a separate phase after all typechecking is done.
    // It decides whether each variable captured by a closure should be captured
    // by value or by reference.
    // We use value capturing for values <= 128 bytes that are never reassigned
    // after capturing (effectively constant).







When capturevars is called in Example 5, it decides that the loop variable i should be captured by reference, and adds the appropriate addrtaken flag to it. This is seen in the AST output:







    FOR l(13) tc(1)
    . LT l(13) tc(1) bool
    . . NAME-main.i a(true) g(1) l(13) x(0) class(PAUTOHEAP) esc(h) tc(1) addrtaken assigned used int





For the loop variable, the “by value” heuristic does not apply, since the variable changes its value after it is captured (remember the quote from the specification saying that the loop variable is reused at each iteration). Therefore the variable i is captured by reference.

In the variation of our example with ii := i, ii is never reassigned after being captured and is therefore captured by value [5].







Thus we see a striking example of two different language features overlapping in an unexpected way. Instead of using a new variable for each iteration of the loop, Go reuses the same one. This in turn makes the heuristic choose capture by reference, which leads to the unexpected result. The Go FAQ says that this behavior may have been a design mistake.







This behavior (not using a new variable for each iteration) was probably a mistake in the design of the language. Maybe we will fix it in a future version, but for backward compatibility we cannot change anything in Go version 1.

If you are aware of the problem, you will most likely not fall into this trap. But keep in mind that free variables may always be captured by reference. To avoid bugs, make sure that goroutines capture only read-only variables. This also matters because of potential data races.










[1] Some readers have pointed out that, strictly speaking, Go has no concept of “passing by reference”, because everything is passed by value, including pointers. In this article, when you see “pass by reference”, I mean “pass by address”; in some cases it is explicit (such as passing &n to a function that expects *int), and in others implicit, as in later parts of the article.







[2] Here and below I use time.Sleep as a quick and dirty way to wait for all goroutines to finish. Without it, main would exit before the goroutines start running. The proper way to do this would be to use something like a WaitGroup or a done channel.







[3] The assembler representation for all the examples in this article was obtained using the go tool compile -l -S command. The -l flag disables function inlining and makes assembler code more readable.







[4] foobyval is not called directly, since the call goes through go. Instead, its address is passed as the second argument (8(SP) in the listing) to the runtime.newproc function, and the argument to foobyval (i in this case) goes higher on the stack, at 16(SP).







[5] As an exercise, add ii = 10 as the last line of the for loop (after the go statement). What output did you get? Why?







