quant quant - 1 year ago 118
C++ Question

How to avoid false positives with Helgrind?

My thread synchronisation "style" seems to be throwing helgrind off. Here's a simple program that reproduces the problem:

#include <thread>
#include <atomic>
#include <iostream>

int main()
{
std::atomic<bool> isReady(false);

int i = 1;

std::thread t([&isReady, &i]()
{
i = 2;
isReady = true;
});

while (!isReady)
std::this_thread::yield();

i = 3;

t.join();

std::cout << i;

return 0;
}


As far as I can tell, the above is a perfectly well-formed program. However, when I run helgrind using the following command I get errors:

valgrind --tool=helgrind ./a.out


The output of this is:

==6247== Helgrind, a thread error detector
==6247== Copyright (C) 2007-2015, and GNU GPL'd, by OpenWorks LLP et al.
==6247== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==6247== Command: ./a.out
==6247==
==6247== ---Thread-Announcement------------------------------------------
==6247==
==6247== Thread #1 is the program's root thread
==6247==
==6247== ---Thread-Announcement------------------------------------------
==6247==
==6247== Thread #2 was created
==6247== at 0x56FBB1E: clone (clone.S:74)
==6247== by 0x4E46189: create_thread (createthread.c:102)
==6247== by 0x4E47EC3: pthread_create@@GLIBC_2.2.5 (pthread_create.c:679)
==6247== by 0x4C34BB7: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==6247== by 0x5115DC2: std::thread::_M_start_thread(std::shared_ptr<std::thread::_Impl_base>, void (*)()) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==6247== by 0x4010EF: std::thread::thread<main::{lambda()#1}>(main::{lambda()#1}&&) (in /home/arman/a.out)
==6247== by 0x400F93: main (in /home/arman/a.out)
==6247==
==6247== ----------------------------------------------------------------
==6247==
==6247== Possible data race during read of size 1 at 0xFFF00035B by thread #1
==6247== Locks held: none
==6247== at 0x4022C3: std::atomic<bool>::operator bool() const (in /home/arman/a.out)
==6247== by 0x400F9F: main (in /home/arman/a.out)
==6247==
==6247== This conflicts with a previous write of size 1 by thread #2
==6247== Locks held: none
==6247== at 0x40233D: std::__atomic_base<bool>::operator=(bool) (in /home/arman/a.out)
==6247== by 0x40228E: std::atomic<bool>::operator=(bool) (in /home/arman/a.out)
==6247== by 0x400F4A: main::{lambda()#1}::operator()() const (in /home/arman/a.out)
==6247== by 0x40204D: void std::_Bind_simple<main::{lambda()#1} ()>::_M_invoke<>(std::_Index_tuple<>) (in /home/arman/a.out)
==6247== by 0x401FA3: std::_Bind_simple<main::{lambda()#1} ()>::operator()() (in /home/arman/a.out)
==6247== by 0x401F33: std::thread::_Impl<std::_Bind_simple<main::{lambda()#1} ()> >::_M_run() (in /home/arman/a.out)
==6247== by 0x5115C7F: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==6247== by 0x4C34DB6: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==6247== Address 0xfff00035b is on thread #1's stack
==6247== in frame #1, created by main (???:)
==6247==
==6247== ----------------------------------------------------------------
==6247==
==6247== Possible data race during write of size 4 at 0xFFF00035C by thread #1
==6247== Locks held: none
==6247== at 0x400FAE: main (in /home/arman/a.out)
==6247==
==6247== This conflicts with a previous write of size 4 by thread #2
==6247== Locks held: none
==6247== at 0x400F35: main::{lambda()#1}::operator()() const (in /home/arman/a.out)
==6247== by 0x40204D: void std::_Bind_simple<main::{lambda()#1} ()>::_M_invoke<>(std::_Index_tuple<>) (in /home/arman/a.out)
==6247== by 0x401FA3: std::_Bind_simple<main::{lambda()#1} ()>::operator()() (in /home/arman/a.out)
==6247== by 0x401F33: std::thread::_Impl<std::_Bind_simple<main::{lambda()#1} ()> >::_M_run() (in /home/arman/a.out)
==6247== by 0x5115C7F: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==6247== by 0x4C34DB6: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
==6247== by 0x4E476F9: start_thread (pthread_create.c:333)
==6247== by 0x56FBB5C: clone (clone.S:109)
==6247== Address 0xfff00035c is on thread #1's stack
==6247== in frame #0, created by main (???:)
==6247==
3==6247==
==6247== For counts of detected and suppressed errors, rerun with: -v
==6247== Use --history-level=approx or =none to gain increased speed, at
==6247== the cost of reduced accuracy of conflicting-access information
==6247== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)


Helgrind seems to be picking up my while loop as a race condition. How am I supposed to form this program to avoid helgrind throwing out false positives?

Answer Source

The problem is that Helgrind doesn't understand GCC's atomic builtins, so doesn't realise that they are race-free and impose ordering on the program.

There are ways to annotate your code to help Helgrind, see http://valgrind.org/docs/manual/hg-manual.html#hg-manual.effective-use (but I'm not sure how to use them here, I already tried what sbabbi shows and it only solves part of the problem).

I would avoid yielding in a busy loop anyway, it's a poor form of synchronization. It could be done with a condition variable like so:

#include <thread>
#include <atomic>
#include <iostream>
#include <condition_variable>

int main()
{
    bool isReady(false);
    std::mutex mx;
    std::condition_variable cv;

    int i = 1;

    std::thread t([&isReady, &i, &mx, &cv]()
    {
        i = 2;
        std::unique_lock<std::mutex> lock(mx);
        isReady = true;
        cv.notify_one();
    });

    {
        std::unique_lock<std::mutex> lock(mx);
        cv.wait(lock, [&] { return isReady; });
    }

    i = 3;

    t.join();

    std::cout << i;

    return 0;
}