SizedWaitGroup: a Golang WaitGroup with throttling
What is a WaitGroup?
In Golang, there is a very nice feature available in the sync package: WaitGroups.
Their purpose is to help developers launch many concurrent goroutines and easily wait for every routine in the group to end, hence the name.
Basic example:
package main

import (
	"math/rand"
	"sync"
	"time"
)

func main() {
	rand.Seed(time.Now().UnixNano())
	var wg sync.WaitGroup
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			query()
		}()
	}
	wg.Wait() // Past this point, all routines have finished.
}

func query() {
	// ... query the database ...
}
In this example, we start 50 goroutines which will be executed as fast as possible.
The wg.Wait() call ensures that past this point, all routines have ended. More precisely: all routines have called Done(). Even more precisely: Done() has been called the same number of times as Add(1).
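To make the counter semantics concrete, here is a small sketch that registers all routines with a single Add(n) call instead of n calls to Add(1). The runWorkers helper and the finished counter are illustrative names chosen for this example, not part of the sync package.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runWorkers (hypothetical helper) starts n goroutines, registers all of
// them with a single Add(n) call, and returns how many of them had
// finished by the time Wait returned.
func runWorkers(n int) int64 {
	var wg sync.WaitGroup
	var finished int64

	wg.Add(n) // same effect on the counter as calling Add(1) n times
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done() // decrements the counter by one
			atomic.AddInt64(&finished, 1)
		}()
	}

	wg.Wait() // blocks until the counter is back to zero
	return atomic.LoadInt64(&finished)
}

func main() {
	fmt.Println(runWorkers(50)) // prints 50
}
```

Because each routine increments finished before its deferred Done() runs, the value read after Wait() is always n.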
But sometimes, even if we want to execute things as fast as possible, we don’t necessarily want to overload everything.
Introducing SizedWaitGroup
SizedWaitGroup is quite similar to WaitGroup but adds the principle of throttling: you can specify the maximum number of routines allowed to run concurrently.
A typical use case would be to execute a set of queries as fast as possible without overloading the database being called. Example:
package main

import (
	"math/rand"
	"time"

	"github.com/remeh/sizedwaitgroup"
)

func main() {
	rand.Seed(time.Now().UnixNano())
	// Typical use case:
	// 50 queries must be executed as quickly as possible
	// but without overloading the database, so only
	// 8 routines should run concurrently.
	swg := sizedwaitgroup.New(8)
	for i := 0; i < 50; i++ {
		swg.Add()
		go func() {
			defer swg.Done()
			query()
		}()
	}
	swg.Wait()
}

func query() {
	// ... query the database ...
}
In this example, we can see that using the SizedWaitGroup is quite similar to using a WaitGroup, but there are two differences:
- You provide a limit when creating the SizedWaitGroup: in the example, only 8 routines can run concurrently. For a new routine to start once 8 are already running, one of those 8 must call Done().
- Add() doesn’t take an argument for the number of routines to count in the WaitGroup: this is because in a SizedWaitGroup, a call to Add() can block!
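One way to picture why Add() can block is a buffered channel used as a counting semaphore next to a regular sync.WaitGroup. The sketch below is only an illustration under that assumption: limitedGroup, newLimitedGroup and maxConcurrency are hypothetical names, and this is not presented as the actual implementation of the sizedwaitgroup package.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// limitedGroup is a hypothetical, simplified stand-in for a sized wait group.
type limitedGroup struct {
	slots chan struct{} // buffered channel used as a counting semaphore
	wg    sync.WaitGroup
}

func newLimitedGroup(limit int) *limitedGroup {
	return &limitedGroup{slots: make(chan struct{}, limit)}
}

// Add blocks while `limit` routines already hold a slot.
func (g *limitedGroup) Add() {
	g.slots <- struct{}{}
	g.wg.Add(1)
}

// Done frees a slot, which lets one blocked Add call proceed.
func (g *limitedGroup) Done() {
	<-g.slots
	g.wg.Done()
}

// Wait blocks until every routine registered with Add has called Done.
func (g *limitedGroup) Wait() { g.wg.Wait() }

// maxConcurrency runs `tasks` short jobs through a group capped at `limit`
// and reports the highest number of jobs observed running at the same time.
func maxConcurrency(limit, tasks int) int64 {
	g := newLimitedGroup(limit)
	var running, peak int64
	for i := 0; i < tasks; i++ {
		g.Add() // blocks here once `limit` jobs are in flight
		go func() {
			defer g.Done()
			n := atomic.AddInt64(&running, 1)
			// Record n as the new peak if it is higher than the current one.
			for {
				p := atomic.LoadInt64(&peak)
				if n <= p || atomic.CompareAndSwapInt64(&peak, p, n) {
					break
				}
			}
			time.Sleep(time.Millisecond) // stands in for a real query
			atomic.AddInt64(&running, -1)
		}()
	}
	g.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	fmt.Println(maxConcurrency(8, 50) <= 8) // prints true
}
```

Since a slot is taken before each goroutine starts and released only after it has finished its work, the number of running jobs can never exceed the limit, which is the throttling behaviour described above.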
Conclusion
Most of the time, this kind of tool is not needed: even for a database, receiving 100 queries "simultaneously" may not be a problem. But that doesn’t mean you want to use up the full set of a database's available connections, and this is another use case where the SizedWaitGroup proves useful.
Sources are available here and feedback is welcome!