sync.WaitGroup in a loop causes inconsistent results. #6762

Open
amoyyy opened this issue Nov 6, 2020 · 2 comments
Labels
Bug This tag is applied to issues which reports bugs.

amoyyy commented Nov 6, 2020

V version: V 0.1.29 b14f779, commited at 2020-11-05 22:59:11 +0200
OS: Ubuntu 20.04

What did you do?


module main
import sync

const (
  data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
)

struct Foo{
mut:
    sum     i64
    mutex   &sync.Mutex
}

fn new_foo() &Foo{
    return &Foo {
        sum : 0
        mutex : sync.new_mutex()
    }
}

fn (mut foo Foo) refresh() {
    foo.mutex.m_lock()
    foo.sum = 0
    foo.mutex.unlock() 
}

fn (mut foo Foo) action() {
    mut wg := sync.new_waitgroup()
    wg.add(10)
    for num in data {
        go foo.op(num, mut wg)
    }
    wg.wait()
    println(foo.sum)
}

fn (mut foo Foo) op(num int, mut wg sync.WaitGroup) {
    foo.sum += num*(num-1)
    wg.done()
}

fn main() {
    mut foo := new_foo()
    for ii:=0;ii<1000;ii++{
        foo.refresh()
        foo.action()
    }
}

What did you expect to see?
The result printed in each loop iteration should be the same.

What did you see instead?
The printed results are not consistent from one iteration to the next.
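
For reference, the value that every iteration would be expected to print can be worked out without any concurrency. The sketch below reuses the data constant from the report and adds the terms sequentially; the helper name expected_sum is hypothetical and only serves this illustration.

```v
module main

const (
	data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
)

// expected_sum (hypothetical helper) adds num*(num-1) for every element
// sequentially, so no goroutine can interfere with the running total.
fn expected_sum() i64 {
	mut sum := i64(0)
	for num in data {
		sum += num * (num - 1)
	}
	return sum
}

fn main() {
	// 0 + 2 + 6 + 12 + 20 + 30 + 42 + 56 + 72 + 90
	println(expected_sum()) // prints 330
}
```

Any printed value other than 330 in the concurrent version means at least one update to foo.sum was lost.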


software-is-art commented Nov 6, 2020

It looks to me like you're triggering a race condition here:

foo.sum += num*(num-1)

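Incrementing an integer isn't atomic unless it is done with interlocked CPU instructions: the += is a separate load, add, and store, so two goroutines can both load the same old value and one update gets lost. I don't think V exposes atomic operations yet, though it does reserve an atomic keyword, which I can only assume will be used for this.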
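
To make the lost update concrete, here is a minimal single-threaded sketch that replays one unlucky interleaving by hand. The function name lost_update and the variables a and b are hypothetical; they stand in for the private copies that two goroutines would hold, and the real scheduler may of course interleave the steps differently.

```v
module main

// lost_update (hypothetical) replays, in one thread, an interleaving in
// which two goroutines both read the shared sum before either writes it.
fn lost_update() i64 {
	mut sum := i64(0)
	a := sum // goroutine A loads 0
	b := sum // goroutine B loads 0
	sum = a + 2 // A stores 0 + 2*(2-1) = 2
	sum = b + 6 // B stores 0 + 3*(3-1) = 6, overwriting A's update
	return sum
}

fn main() {
	println(lost_update()) // prints 6, not the 8 a serialized run gives
}
```

Holding a mutex around the whole load, add, and store sequence rules out this interleaving, because B cannot load sum until A has stored its result.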

Try changing op as follows and see if you can reproduce it still:

fn (mut foo Foo) op(num int, mut wg sync.WaitGroup) {
    foo.mutex.m_lock()
    foo.sum += num*(num-1)
    foo.mutex.unlock() 
    wg.done()
}


amoyyy commented Nov 6, 2020

Thanks for your advice.
After adding the lock in op(), the consistency problem seems to be solved on my machine.
But when I raise the loop bound for ii to a larger number (>100000, for instance), the program just hangs while running. Could this be a deadlock? By the way, using sync.channel shows the same problem in the loop test.

medvednikov added the Bug label Jul 19, 2022