m8mble m8mble - 1 month ago 5
C++ Question

Why doesn't gcc use memmove in std::uninitialized_copy?

`std::uninitialized_copy` copies into an uninitialized range of memory. This could be done using `memmove` for bitwise copyable types. I stepped through the example code below in gdb (compiling with gcc 5.2.0) and observed that `memmove` isn't used at all.

In the example, `__is_trivial(Bar)` is used to determine whether `memmove` may be used. It (correctly) evaluates to `false`, as `Bar` has a non-trivial default constructor (cf. the call to `std::__uninitialized_copy<false>::__uninit_copy(...)` in `bits/stl_uninitialized.h`, line 123 ff). But why is `__is_trivial` even relevant to `std::uninitialized_copy`? According to Bjarne, `std::is_trivially_copyable` should be sufficient. Note that the latter evaluates to `true` in the example, i.e. the `memmove` optimization is applicable.
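
To make the proposal concrete, here is a minimal sketch of a copy routine that dispatches on `std::is_trivially_copyable` instead of `__is_trivial`. It is not libstdc++ code: the namespace `sketch`, the helper names and the `demo` function are hypothetical, it only handles raw pointers, and it assumes `T` is copy constructible. Whether this trait alone is a safe condition for a standard library is exactly what the question asks, so the sketch simply takes it at face value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>
#include <type_traits>

namespace sketch {  // hypothetical names, for illustration only

// Trivially copyable T: copy all bytes in one call.
template <typename T>
T* uninit_copy_impl(T const* first, T const* last, T* dest, std::true_type) {
    assert(last >= first);
    std::size_t const n = static_cast<std::size_t>(last - first);
    if (n != 0)
        std::memmove(static_cast<void*>(dest), first, n * sizeof(T));
    return dest + n;
}

// Any other T: copy-construct element by element, and destroy the
// already constructed elements if a constructor throws.
template <typename T>
T* uninit_copy_impl(T const* first, T const* last, T* dest, std::false_type) {
    T* cur = dest;
    try {
        for (; first != last; ++first, ++cur)
            ::new (static_cast<void*>(cur)) T(*first);
    } catch (...) {
        for (; dest != cur; ++dest)
            dest->~T();
        throw;
    }
    return cur;
}

// Tag dispatch on is_trivially_copyable (the trait proposed in the question).
template <typename T>
T* uninit_copy(T const* first, T const* last, T* dest) {
    return uninit_copy_impl(first, last, dest, std::is_trivially_copyable<T>{});
}

// Same shape as Bar from the question: trivially copyable, but not trivial.
struct Bar {
    Bar() : v(42) {}
    int v;
};

// Copies four elements into malloc'ed storage and checks the result.
bool demo() {
    static_assert(std::is_trivially_copyable<Bar>::value, "takes the memmove path");
    static_assert(!std::is_trivial<Bar>::value, "non-trivial default constructor");
    Bar src[4];
    src[2].v = 7;
    Bar* dst = static_cast<Bar*>(std::malloc(sizeof src));
    if (!dst) return false;
    Bar* end = uninit_copy(src, src + 4, dst);
    bool const ok = (end == dst + 4) && dst[0].v == 42 && dst[2].v == 7;
    std::free(dst);
    return ok;
}

}  // namespace sketch
```

Calling `sketch::demo()` from `main` exercises the `memmove` branch for `Bar`; a type with a user-provided copy constructor would take the element-wise branch instead.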

I'm aware that the standard does not require any specific implementation of `std::uninitialized_copy`. I'm just wondering why `__is_trivial` is favored even though `std::is_trivially_copyable` is available as an applicable alternative in the gcc implementation.

Example Code:

    #include <cstddef>   // std::size_t
    #include <cstdlib>   // std::malloc, std::free
    #include <iostream>
    #include <memory>
    #include <vector>
    #include <type_traits>

    struct Bar
    {
        Bar() : v(42) {}  // user-provided, so Bar is not trivial
        Bar(Bar const &) = default;
        Bar(Bar &&) = default;
        Bar & operator=(Bar &&) = default;
        Bar & operator=(Bar const &) = default;
        ~Bar() = default;
        int v;
    };

    int main() {
        std::cout
            << std::is_trivially_move_constructible<Bar>::value
            << " " << std::is_trivially_copy_constructible<Bar>::value
            << " " << std::is_trivially_copyable<Bar>::value
            << " " << std::is_trivial<Bar>::value
            << " " << __is_trivial(Bar) << std::endl;
        std::size_t const num_elements = 1 << 27;
        std::vector<Bar> v(num_elements);
        // Raw, uninitialized storage for the copy.
        Bar * vc = static_cast<Bar *>(std::malloc(num_elements * sizeof(Bar)));
        std::uninitialized_copy(v.begin(), v.end(), vc);
        std::free(vc);
    }


Example Output:
1 1 1 0 0


Update: We ran some tests comparing the actual runtimes of `memmove`, `uninitialized_copy` and a simple `for` loop. If `Bar` is trivial (cf. `__is_trivial(Bar)`), `uninitialized_copy` is as fast as `memmove`; if it's not, `uninitialized_copy` is as fast as our `for` loop. Overall, `memmove` was only significantly faster (2x) on small `Bar`s (i.e. change `int v;` to `char v;`). Otherwise the performance was essentially the same.
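
A comparison along these lines could be set up roughly as follows. This is only a sketch and not the benchmark used for the numbers above: the namespace `bench_sketch`, the `Buffer`, `Timings` and `compare` names are hypothetical, each variant is timed once with `std::chrono::steady_clock`, and effects such as first-touch page faults or optimization flags can easily dominate the result.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <new>
#include <vector>

namespace bench_sketch {  // hypothetical names, for illustration only

struct Bar {
    Bar() : v(42) {}  // change `int v;` to `char v;` to test small elements
    int v;
};

// Owns a block of malloc'ed, uninitialized memory.
struct Buffer {
    explicit Buffer(std::size_t bytes) : p(std::malloc(bytes ? bytes : 1)) {
        if (!p) throw std::bad_alloc();
    }
    ~Buffer() { std::free(p); }
    Buffer(Buffer const&) = delete;
    Buffer& operator=(Buffer const&) = delete;
    void* p;
};

// Runs f once and returns the elapsed wall-clock time in milliseconds.
template <typename F>
double time_ms(F f) {
    auto const start = std::chrono::steady_clock::now();
    f();
    auto const stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

struct Timings {
    double memmove_ms;
    double uninit_copy_ms;
    double loop_ms;
    bool all_equal;  // true if all three destinations hold the same bytes
};

// Copies n elements with each of the three approaches and times them.
Timings compare(std::size_t n) {
    std::vector<Bar> const src(n);
    std::size_t const bytes = n * sizeof(Bar);
    Buffer a(bytes), b(bytes), c(bytes);
    Bar* const pa = static_cast<Bar*>(a.p);
    Bar* const pb = static_cast<Bar*>(b.p);
    Bar* const pc = static_cast<Bar*>(c.p);

    Timings t;
    t.memmove_ms = time_ms([&] {
        if (n != 0) std::memmove(a.p, src.data(), bytes);
    });
    t.uninit_copy_ms = time_ms([&] {
        std::uninitialized_copy(src.begin(), src.end(), pb);
    });
    t.loop_ms = time_ms([&] {
        for (std::size_t i = 0; i < n; ++i)
            ::new (static_cast<void*>(pc + i)) Bar(src[i]);
    });
    t.all_equal = n == 0 || (std::memcmp(pa, pb, bytes) == 0 &&
                             std::memcmp(pa, pc, bytes) == 0);
    return t;
}

}  // namespace bench_sketch
```

For meaningful numbers, `bench_sketch::compare` would need to be called with a large `n`, repeated several times, and built with optimizations enabled.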

Edit: Correct references to `std::is_trivially_...`. State title more precisely.

Answer

For future readers: This has been submitted as a gcc enhancement here.
