sync.Cond Test Broadcast – why check in a loop?

Issue

I was trying to use sync.Cond – Wait and Broadcast. I could not understand some parts of it:

The comment for Wait calls says:

   41    // Because c.L is not locked when Wait first resumes, the caller
   42    // typically cannot assume that the condition is true when
   43    // Wait returns.  Instead, the caller should Wait in a loop:
   44    //
   45    //    c.L.Lock()
   46    //    for !condition() {
   47    //        c.Wait()
   48    //    }
   49    //    ... make use of condition ...
   50    //    c.L.Unlock()

What is the reason this is required?

So this means the following program may not be correct (all though it works):

package main

import (
    "bufio"
    "fmt"
    "os"
    "sync"
)

type processor struct {
    n       int
    c       *sync.Cond
    started chan int
}

func newProcessor(n int) *processor {

    p := new(processor)
    p.n = n

    p.c = sync.NewCond(&sync.Mutex{})

    p.started = make(chan int, n)

    return p
}

func (p *processor) start() {

    for i := 0; i < p.n; i++ {
        go p.process(i)
    }

    for i := 0; i < p.n; i++ {
        <-p.started
    }
    p.c.L.Lock()
    p.c.Broadcast()
    p.c.L.Unlock()

}

func (p *processor) process(f int) {

    fmt.Printf("fork : %d\n", f)

    p.c.L.Lock()
    p.started <- f
    p.c.Wait()
    p.c.L.Unlock()
    fmt.Printf("process: %d - out of wait\n", f)
}

func main() {

    p := newProcessor(5)
    p.start()

    reader := bufio.NewReader(os.Stdin)
    _,_ =reader.ReadString('\n')

}

Solution

Condition variables doesn’t stay signaled, they only wake up other go routines that are blocking in .Wait(). So this presents a race condition unless you have a predicate where you check if you even need to wait, or if the thing you want to wait for has already happened.

In your particular case, you have added synchronization between the go routines calling .Wait() and the one calling .BroadCast() by using your p.started channel, in a manner that as far as I can tell should not present the race condition I’m describing further on in this post. Though I wouldn’t bet on it, and personally I would just do it the idiomatic way like the documentation describes.

Consider your start() function is executing the broadcast in these lines:

p.c.L.Lock()
p.c.Broadcast()

At that specific point in time consider that one of your other go routines have come to this point in your process() function

fmt.Printf("fork : %d\n", f)

The next thing that go routine will do is lock the mutex (which it isn’t going to own at least until the go routine in start() releases that mutex) and wait on the condition variable.

p.c.L.Lock()
p.started <- f
p.c.Wait()

But Wait is never going to return, since at this point there’s noone that will signal/broadcast it – the signal has already happened.

So you need another condition that you can test yourself so you don’t need to call Wait() when you already know the condition has happened, e.g.

type processor struct {
    n       int
    c       *sync.Cond
    started chan int
    done    bool //added
}

...

func (p *processor) start() {

    for i := 0; i < p.n; i++ {
        go p.process(i)
    }

    for i := 0; i < p.n; i++ {
        <-p.started
    }
    p.c.L.Lock()
    p.done = true //added
    p.c.Broadcast()
    p.c.L.Unlock()

}

func (p *processor) process(f int) {

    fmt.Printf("fork : %d\n", f)

    p.c.L.Lock()
    p.started <- f
    for !p.done { //added
        p.c.Wait()
    }
    p.c.L.Unlock()
    fmt.Printf("process: %d - out of wait\n", f)
}

Answered By – nos

Answer Checked By – Clifford M. (GoLangFix Volunteer)

Leave a Reply

Your email address will not be published.