Alexis Wilke - 27 days ago
C++ Question

Why do I have to use std::string() even for "" to satisfy a template argument?

I wrote a very simple template to tokenize a string as shown below.

However, I have a problem calling that function: I cannot use a C string for the delimiters or trim_string arguments. These have to be std::string (or whatever type of string StringT is, e.g. std::wstring).

So the following fails:

std::vector<std::string> tokens;
std::string str = "This string, it will be split, in 3.";
int count = tokenize_string(tokens, str, ",", true, " ");


To fix the problem I have to write:

std::vector<std::string> tokens;
std::string str = "This string, it will be split, in 3.";
int count = tokenize_string(tokens, str,
std::string(","), true, std::string(" "));


Is there a way to avoid having to use std::string() around the standard C strings in such a situation?

The errors I get with g++ look like this:

/home/snapwebsites/snapwebsites/snapmanagercgi/daemon/snapmanagerdaemon.cpp: In member function ‘void snap_manager::manager_daemon::init(int, char**)’:
/home/snapwebsites/snapwebsites/snapmanagercgi/daemon/snapmanagerdaemon.cpp:103:71: error: no matching function for call to ‘tokenize_string(std::vector<std::__cxx11::basic_string<char> >&, const string&, const char [2], bool, const char [2])’
snap::tokenize_string(f_bundle_uri, bundle_uri, ",", true, " ");
^
In file included from /home/snapwebsites/snapwebsites/snapmanagercgi/daemon/snapmanagerdaemon.cpp:35:0:
/home/snapwebsites/BUILD/dist/include/snapwebsites/tokenize_string.h:46:8: note: candidate: template<class StringT, class ContainerT> size_t snap::tokenize_string(ContainerT&, const StringT&, const StringT&, bool, const StringT&)
size_t tokenize_string(ContainerT & tokens
^
/home/snapwebsites/BUILD/dist/include/snapwebsites/tokenize_string.h:46:8: note: template argument deduction/substitution failed:
/home/snapwebsites/snapwebsites/snapmanagercgi/daemon/snapmanagerdaemon.cpp:103:71: note: deduced conflicting types for parameter ‘const StringT’ (‘std::__cxx11::basic_string<char>’ and ‘char [2]’)
snap::tokenize_string(f_bundle_uri, bundle_uri, ",", true, " ");
^


The template:

template < class StringT, class ContainerT >
size_t tokenize_string(ContainerT & tokens
                     , StringT const & str
                     , StringT const & delimiters
                     , bool const trim_empty = false
                     , StringT const & trim_string = StringT())
{
    for(typename StringT::size_type pos(0), last_pos(0); last_pos < str.length(); last_pos = pos + 1)
    {
        pos = str.find_first_of(delimiters, last_pos);

        // no more delimiters?
        //
        if(pos == StringT::npos)
        {
            pos = str.length();
        }

        char const * start(str.data() + last_pos);
        char const * end(start + (pos - last_pos));

        if(start != end                 // if not (already) empty
        && !trim_string.empty())        // and there are characters to trim
        {
            // find first character not in trim_string
            //
            start = std::find_if_not(
                  start
                , end
                , [&trim_string](auto const c)
                    {
                        return trim_string.find(c) != StringT::npos;
                    });

            // find last character not in trim_string
            //
            if(start < end)
            {
                reverse_cstring<typename StringT::value_type const> const rstr(start, end);
                auto p = std::find_if_not(
                      rstr.begin()
                    , rstr.end()
                    , [&trim_string](auto const c)
                        {
                            return trim_string.rfind(c) != StringT::npos;
                        });
                end = p.get();
            }
        }

        if(start != end     // if not empty
        || !trim_empty)     // or user accepts empty
        {
            tokens.push_back(typename ContainerT::value_type(start, end - start));
        }
    }

    return tokens.size();
}

Answer

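The rule is that when you have three StringT const & parameters, StringT is deduced independently from each of the corresponding arguments, and all the deduced types must match. In the failing call, str yields std::string while the string literals yield char [2], which is exactly the conflict g++ reports.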
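
To see the rule in isolation, here is a minimal sketch. count_any, matching_types and explicit_type are hypothetical names invented for this illustration; they are not part of the question's code. It also suggests that the failure comes from deduction rather than from a missing conversion, since naming the type explicitly lets the literal convert as usual.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the question): both parameters use T,
// so T is deduced once from each argument and the results must agree.
template <class T>
std::size_t count_any(T const & text, T const & chars)
{
    std::size_t n(0);
    for(auto const c : text)
    {
        if(chars.find(c) != T::npos)
        {
            ++n;
        }
    }
    return n;
}

// Both arguments are std::string: T is deduced as std::string twice.
inline std::size_t matching_types()
{
    return count_any(std::string("a,b,c"), std::string(","));
}

// T is given explicitly, so nothing is deduced and "," converts normally.
inline std::size_t explicit_type()
{
    return count_any<std::string>(std::string("a,b,c"), ",");
}

// This would not compile: T is deduced as std::string from the first
// argument and as char [2] from the second, and the two conflict.
// std::size_t n = count_any(std::string("a,b,c"), ",");
```

Since StringT is the first template parameter of tokenize_string, the same explicit form, tokenize_string<std::string>(tokens, str, ",", true, " "), should also be accepted, with ContainerT still deduced from tokens.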

You can

  • Just use typename ContainerT::value_type for all three, if the container's value type is expected to be the correct string type; or
  • Block deduction of StringT from two of the three StringT-taking parameters,

    • Either at the call site by making the later two arguments non-deduced contexts with braced-init-lists:

      int count = tokenize_string(tokens, str, {","}, true, {" "});
      
    • Or in the function template itself, by wrapping the latter two StringT parameters into a non-deduced context:

      template < class StringT, class ContainerT >
      size_t tokenize_string(ContainerT & tokens
                           , StringT const & str
                           , typename std::decay<StringT>::type const & delimiters
                           , bool const trim_empty = false
                           , typename std::decay<StringT>::type const & trim_string = StringT())
      
  • Or take different type parameters for each and harmonize them later in the function template body.
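
The first and last options in the list above are described without code, so here is a minimal sketch of both. It assumes a simplified tokenizer that only splits (no trimming, and empty tokens are always kept); split_by_value_type and split_separate are hypothetical names, not the question's tokenize_string.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Option 1 (sketch): take the string type from the container. Only
// ContainerT is deduced; typename ContainerT::value_type is a
// non-deduced context, so string literals convert implicitly.
template <class ContainerT>
std::size_t split_by_value_type(ContainerT & tokens
                              , typename ContainerT::value_type const & str
                              , typename ContainerT::value_type const & delimiters)
{
    using StringT = typename ContainerT::value_type;

    for(typename StringT::size_type pos(0), last_pos(0); last_pos < str.length(); last_pos = pos + 1)
    {
        pos = str.find_first_of(delimiters, last_pos);
        if(pos == StringT::npos)
        {
            pos = str.length();
        }
        tokens.push_back(str.substr(last_pos, pos - last_pos));
    }

    return tokens.size();
}

// Last option (sketch): every argument gets its own type parameter,
// and the types are harmonized inside the body by converting both to
// the container's string type (this copies the input string).
template <class ContainerT, class StrT, class DelimT>
std::size_t split_separate(ContainerT & tokens
                         , StrT const & str
                         , DelimT const & delimiters)
{
    using StringT = typename ContainerT::value_type;

    StringT const s(str);
    StringT const d(delimiters);
    return split_by_value_type(tokens, s, d);
}
```

With either variant, a call such as split_separate(tokens, str, ",") compiles with a plain C string for the delimiters. The trade-off of the last option is the extra conversion (and copy) inside the body, whereas the first option relies on the container's value type being the intended string type.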