Common pitfalls when using goroutines
In this post I am going to cover some common cases and incidents that you are likely to experience when using goroutines and how to deal with them.
Table of contents:
Introduction
1. Waiting for the goroutines
2. Deadlocks
3. Getting unexpected results
4. Race conditions
Introduction
First, what’s a goroutine?
Golang is concurrent by nature.
To achieve concurrency, Go uses goroutines — functions or methods that run concurrently with other functions or methods.
Yes, even Golang’s main function is a goroutine!
Goroutines can be viewed as lightweight threads, but unlike threads they are managed by the Go runtime rather than by the operating system.
It’s very common for a Go application to have hundreds and even thousands of goroutines running concurrently.
(More on goroutines here)
Let’s start with a quick example and create a dummy hello.go file:
package main

import (
	"fmt"
	"time"
)

func hello() {
	fmt.Println("Hello")
}

func main() {
	go hello()
	time.Sleep(1 * time.Second)
	fmt.Println("main function")
}
and the output should be:
$ go run hello.go
Hello
main function
Cool, our goroutine executed successfully.
However, as you start adding more functionality to the goroutines, you may also end up facing one of these common cases below.
Part 1: Waiting for the goroutines
Let’s start with a simple one:
As you may have noticed, the use of time.Sleep is very common when demonstrating the basic functionality of goroutines.
So why is the sleep necessary here? Let’s check it out without the time.Sleep function:
package main

import (
	"fmt"
	// "time" is no longer imported: Go refuses to compile unused imports
)

func hello() {
	fmt.Println("Hello")
}

func main() {
	go hello()
	// time.Sleep(1 * time.Second) // now it's commented out
	fmt.Println("main function")
}
$ go run hello.go
main function
Hmm, now the goroutine’s output is missing. Why is that?
Because the program’s execution begins by initializing the main package and then invoking the function main. When that function invocation returns, the program exits. It does not wait for other (non-main) goroutines to complete.
In other words, when the main function returns, the whole program exits, even if other goroutines have not run yet.
So now that we understand why we need to wait for other goroutines, is there a more elegant and efficient way to do it than guessing how long a goroutine is going to take?
Yes, there is! It’s called a WaitGroup.
A WaitGroup allows us to block until all the goroutines registered with it finish their execution.
An example WaitGroup implementation:
package main

import (
	"fmt"
	"sync"
)

func hello(wgrp *sync.WaitGroup) {
	fmt.Println("Hello")
	wgrp.Done() // notifies the waitgroup that it finished
}

func main() {
	var wg sync.WaitGroup
	wg.Add(1) // adds an entry to the waitgroup counter
	go hello(&wg)
	wg.Wait() // blocks execution until the goroutine finishes
	fmt.Println("main function")
}
Run the code:
$ time go run hello.go
Hello
main function

real 0m0.230s
user 0m0.240s
sys 0m0.099s
Better and faster, because we don’t have to wait a fixed amount of time.
Part 2: Deadlocks
You may have seen this scary error before:
fatal error: all goroutines are asleep - deadlock!
A deadlock happens when a group of goroutines are waiting for each other and none of them is able to proceed.
Remember, the main function also runs in a goroutine.
package main

import (
	"fmt"
	"sync"
)

func hello(wgrp *sync.WaitGroup) {
	fmt.Println("Hello")
	wgrp.Done() // removing the wgrp.Done will cause a deadlock
}

func main() {
	var wg sync.WaitGroup
	wg.Add(2) // 2 as the value will cause a deadlock
	go hello(&wg)
	wg.Wait() // blocks execution until the goroutine finishes
	fmt.Println("main function")
}
1. wgrp.Done() decrements the WaitGroup counter to mark the goroutine as finished. Omitting it will also cause a deadlock.
2. wg.Add() receives the number of goroutines we should be waiting for.
Possible values:
0: wg.Wait() returns immediately, so main may exit before the goroutine runs; if the goroutine does run, wgrp.Done() panics because the counter goes negative
1: works as expected
2 and above: results in a deadlock
In both cases we’ll get a deadlock because the main function waits for a Done() call that never arrives:
Case 1: The goroutine never marks its execution on the WaitGroup as done.
Case 2: wg.Wait keeps waiting for more Done() calls than there are goroutines to make them.
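The opposite mismatch, calling Done() more often than Add() allows, does not deadlock but panics. A minimal sketch, assuming a hypothetical doneWithoutAdd helper that recovers the panic so we can look at its message:

```go
package main

import (
	"fmt"
	"sync"
)

// doneWithoutAdd is a hypothetical helper: it calls Done on a WaitGroup
// whose counter is still zero and returns the recovered panic message.
func doneWithoutAdd() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	var wg sync.WaitGroup
	wg.Done() // the counter would drop below zero, so this panics
	return ""
}

func main() {
	fmt.Println("panic:", doneWithoutAdd())
}
```

So the counter has to match the number of Done() calls exactly: too high and Wait() blocks forever, too low and Done() panics.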
Another case where you will get a deadlock is sending on an unbuffered channel when no other goroutine is ready to receive. The send blocks until a receiver takes the value, so the receive on the next line of the same goroutine is never reached:
package main

import "fmt"

func main() {
	c := make(chan string)
	c <- "hello"
	fmt.Println(<-c)
}
Instead, do this:
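package main

import (
	"fmt"
)

func main() {
	c := make(chan string)
	done := make(chan struct{})
	go func() {
		get := <-c
		fmt.Println("get value:", get)
		close(done) // signal that the value has been printed
	}()
	fmt.Println("push to channel c")
	c <- "hello" // send the value and wait until it's received.
	<-done       // wait for the goroutine to print before main exits
}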
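If you really do need to send and receive in the same goroutine, a buffered channel is another option. A minimal sketch, assuming a hypothetical roundTrip helper: with a capacity of 1, a single send completes without a receiver being ready.

```go
package main

import "fmt"

// roundTrip is a hypothetical helper: it sends and receives a value
// on the same goroutine using a buffered channel.
func roundTrip(msg string) string {
	c := make(chan string, 1) // capacity 1: one value fits in the buffer
	c <- msg                  // does not block, the buffer has room
	return <-c                // takes the buffered value back out
}

func main() {
	fmt.Println(roundTrip("hello"))
}
```

A second send before the receive would still block, and with no other goroutine running it would deadlock again.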
Part 3: Unexpected results
Adding a for loop to the mix:
package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	players := []string{"James Harden", "Kyrie Irving", "Kevin Durant"}
	wg.Add(len(players))
	for _, player := range players {
		go func() {
			fmt.Printf("printing player %s\n", player)
			wg.Done()
		}()
	}
	wg.Wait()
}
$ go run hello.go
printing player Kevin Durant
printing player Kevin Durant
printing player Kevin Durant
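Huh? Isn’t it supposed to print a different name on each iteration?
Well, it is, but in Go versions before 1.22 all iterations share a single player variable, and each closure captures that variable rather than the value it held when the goroutine was created.
The goroutines usually start running only after the loop has finished, so they all see the last value. (Since Go 1.22 each iteration gets its own copy of the variable, so the code above prints all three names.)
The workaround is fairly simple: we’ll just pass the current item of the iteration as an argument: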
package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	players := []string{"James Harden", "Kyrie Irving", "Kevin Durant"}
	wg.Add(len(players))
	for _, player := range players {
		go func(baller string) { // add the current iterated player
			fmt.Printf("printing player %s\n", baller)
			wg.Done()
		}(player) // add the current iterated player
	}
	wg.Wait()
}
$ go run hello.go
printing player James Harden
printing player Kevin Durant
printing player Kyrie Irving
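That’s better. Note that the order of the lines can change from run to run, because goroutines are not scheduled in any guaranteed order.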
Part 4: Race conditions and sharing data between goroutines
Now it’s becoming a bit more complex and interesting:
Imagine that you have a banking application, where a customer can deposit and withdraw money.
As long as the application is single-threaded and synchronous, there shouldn’t be any problem, but what happens if your application spins up hundreds or thousands of goroutines?
Consider this scenario:
A customer has a balance of $100, and deposits $50 to his account.
One goroutine sees the transaction, reads the current balance of $100 and adds $50 to the balance.
But wait, at the exact same time there was also a charge of $80 applied to the customer’s account to pay his bill at the local bar.
The second goroutine reads the then-current balance of $100, subtracts $80 from it, and updates the account balance.
The customer then checks his account balance and sees that it’s only $20 instead of $70, because the second goroutine overwrote the balance when it processed its transaction.
To work around this, we can use a mutex.
Mutex? A mutex (mutual exclusion) is a locking mechanism that ensures only one goroutine is in the critical section of code at any point in time.
More on Mutexes here.
This is how it’s going to look:
package main

import (
	"fmt"
	"sync"
)

var (
	mutex   sync.Mutex
	balance int
)

func init() {
	balance = 100
}

func deposit(val int, wg *sync.WaitGroup) {
	mutex.Lock() // lock
	balance += val
	mutex.Unlock() // unlock
	wg.Done()
}

func withdraw(val int, wg *sync.WaitGroup) {
	mutex.Lock() // lock
	balance -= val
	mutex.Unlock() // unlock
	wg.Done()
}

func main() {
	var wg sync.WaitGroup
	wg.Add(3)
	go deposit(20, &wg)
	go withdraw(80, &wg)
	go deposit(40, &wg)
	wg.Wait()
	fmt.Printf("Balance is: %d\n", balance)
}
$ go run mutex.go
Balance is: 80
Pay attention to the mutex.Lock() and mutex.Unlock() calls that make it happen.
We still use the WaitGroup the same way as explained earlier.
There is another way to solve it, this time using channels.
Channels are the pipes that connect concurrent goroutines. You can send values into channels from one goroutine and receive those values into another goroutine.
Remember, the main function is also a goroutine.
(More on channels here)
In this example, we use a buffered channel.
The buffered channel acts as a lock: it ensures that only one goroutine at a time can enter the critical section of code, which is the part that modifies the balance.
package main

import (
	"fmt"
	"sync"
)

var (
	balance int
)

func init() {
	balance = 100
}

func deposit(val int, wg *sync.WaitGroup, ch chan bool) {
	ch <- true // take the only slot; blocks while another goroutine holds it
	balance += val
	<-ch // free the slot
	wg.Done()
}

func withdraw(val int, wg *sync.WaitGroup, ch chan bool) {
	ch <- true // take the only slot; blocks while another goroutine holds it
	balance -= val
	<-ch // free the slot
	wg.Done()
}

func main() {
	var wg sync.WaitGroup
	ch := make(chan bool, 1) // create the channel
	wg.Add(3)
	go deposit(20, &wg, ch)
	go withdraw(80, &wg, ch)
	go deposit(40, &wg, ch)
	wg.Wait()
	fmt.Printf("Balance is: %d\n", balance)
}
$ go run buf.go
Balance is: 80
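We created a buffered channel with a capacity of 1 and passed it to the deposit/withdraw goroutines. Because the buffer holds only one value, a second ch <- true blocks until the goroutine currently in the critical section receives from the channel with <-ch, so only one goroutine modifies the balance at a time.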
So which one should we choose?
Generally, use channels when goroutines need to communicate with each other, and mutexes when you only need to make sure that one goroutine at a time accesses the critical section of code.
In this case, the best practice would be to use a mutex.
I hope you find this post useful.