M. Twombley - 2 months ago
C++ Question

How do I take the output of a parse and use it to look up in a symbols

As can be seen from the code I'm taking the output of one parse and using it to look up the number from the symbols in a second parse.
How do I do this as a single rule? Looking at the docs and doing a lot of searching leads me to believe this can be done with a local var, but I can't figure out how to use my symbols quad on that var.

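#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <string>

namespace qi = boost::spirit::qi;

int main()
{
    using boost::phoenix::ref;
    using qi::_1;
    using qi::_val;
    using qi::no_case;
    using qi::_a;
    using qi::symbols;
    using qi::char_;
    using qi::omit;

    // Quadrant table: numeric and compass spellings map to the same value
    symbols<char, int> quad;
    quad.add
        ("1", 1)
        ("2", 2)
        ("3", 3)
        ("4", 4)
        ("NE", 1)
        ("SE", 2)
        ("SW", 3)
        ("NW", 4)
        ;

    std::wstring s = L"N44°30'14.950\"W";

    std::wstring out;
    int iQuad;
    // First parse: keep only the leading N/S and the trailing E/W
    qi::parse(s.begin(), s.end(),
        no_case[char_('N')] >> omit[*(qi::char_ - no_case[char_("NSEW")])] >> no_case[char_('W')],
        out);
    // Second parse: look the two letters up in the symbol table
    qi::parse(out.begin(), out.end(), quad, iQuad);
    return 0;
}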

Answer

Yes, it can be done with a local variable.

However, that demotes symbols to a regular map, so let's use a map instead.¹

1. The simplest thing

Firstly, I'd consider doing the simplest thing:

#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <iostream>

namespace Rules {
    namespace qi = boost::spirit::qi;

    qi::rule<std::wstring::const_iterator, int()> quad = qi::no_case [
        ('N' >> *~qi::char_("EW") >> 'E')[ qi::_val = 1 ] |
        ('S' >> *~qi::char_("EW") >> 'E')[ qi::_val = 2 ] |
        ('S' >> *~qi::char_("EW") >> 'W')[ qi::_val = 3 ] |
        ('N' >> *~qi::char_("EW") >> 'W')[ qi::_val = 4 ] 
    ];
}

int main() {
    for (std::wstring const s : {
            L"NE", L"SE", L"SW", L"NW",
            L"N44°30'14.950\"E", 
            L"N44°30'14.950\"W", 
            L"S44°30'14.950\"W", 
            L"S44°30'14.950\"E", 
            L"1", L"2", L"3", L"4",
        })
    {
        int iQuad;
        auto f = s.begin(), l = s.end();
        bool ok = parse(f, l, Rules::quad, iQuad);

        if (ok)
            std::wcout << L"Parsed: '" << s << L"' -> " << iQuad << L"\n";
        else
            std::wcout << L"Parse failed '" << s << L"'\n";

        if (f!=l)
            std::wcout << L"Remaining unparsed: '" << std::wstring(f,l) << L"'\n";
    }
}

Which prints

Parsed: 'NE' -> 1
Parsed: 'SE' -> 2
Parsed: 'SW' -> 3
Parsed: 'NW' -> 4
Parsed: 'N44?30'14.950"E' -> 1
Parsed: 'N44?30'14.950"W' -> 4
Parsed: 'S44?30'14.950"W' -> 3
Parsed: 'S44?30'14.950"E' -> 2
Parse failed '1'
Remaining unparsed: '1'
Parse failed '2'
Remaining unparsed: '2'
Parse failed '3'
Remaining unparsed: '3'
Parse failed '4'
Remaining unparsed: '4'

If you want the numerics to parse as well, just add an alternative for them:

qi::rule<std::wstring::const_iterator, int()> quad = qi::no_case [
    (qi::int_(1) | qi::int_(2) | qi::int_(3) | qi::int_(4)) [ qi::_val = qi::_1 ] |
    ('N' >> *~qi::char_("EW") >> 'E')[ qi::_val = 1 ] |
    ('S' >> *~qi::char_("EW") >> 'E')[ qi::_val = 2 ] |
    ('S' >> *~qi::char_("EW") >> 'W')[ qi::_val = 3 ] |
    ('N' >> *~qi::char_("EW") >> 'W')[ qi::_val = 4 ] 
];

All this can be optimized, but I'll venture the guess that it's already more efficient than anything based on symbols and a two-phase parse.
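
To make the control flow of those alternatives concrete, here is a hand-rolled sketch of roughly what the rule does, using only the standard library. The names parse_quad and upper are made up for this illustration and are not part of Spirit. It also simplifies the numeric case to a single digit 1 to 4, which is not exactly what qi::int_(n) does.

```cpp
#include <cassert>
#include <cwctype>
#include <string>

namespace {
// Uppercase a single wide character (hypothetical helper).
wchar_t upper(wchar_t c) { return static_cast<wchar_t>(std::towupper(c)); }
}

// Hypothetical stand-in for the Spirit rule. On success it stores the
// quadrant in `out` and advances `first` past the consumed input; on
// failure it leaves `first` untouched, like qi::parse does for this rule.
bool parse_quad(std::wstring::const_iterator& first,
                std::wstring::const_iterator last, int& out) {
    auto it = first;
    if (it == last) return false;

    // Numeric form: a single digit 1..4 (simplified compared to qi::int_(n)).
    if (*it >= L'1' && *it <= L'4') {
        out = static_cast<int>(*it - L'1') + 1;
        first = ++it;
        return true;
    }

    // Compass form: N or S, then anything that is not E or W, then E or W.
    wchar_t const ns = upper(*it++);
    if (ns != L'N' && ns != L'S') return false;
    while (it != last && upper(*it) != L'E' && upper(*it) != L'W') ++it;
    if (it == last) return false;
    wchar_t const ew = upper(*it++);

    // NE=1, SE=2, SW=3, NW=4, the same numbering as the rule above.
    out = (ns == L'N') ? (ew == L'E' ? 1 : 4) : (ew == L'E' ? 2 : 3);
    assert(out >= 1 && out <= 4);
    first = it;
    return true;
}
```

Called the same way as the Spirit version (pass the begin iterator by reference, then compare it to the end afterwards), this should report quadrant 4 for the sample input from the question and leave the iterator at the start for input it cannot match.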

2. Using a map lookup

Just... use a map:

template <typename It> struct MapLookup : qi::grammar<It, int()> {
    MapLookup() : MapLookup::base_type(start) {
        namespace px = boost::phoenix;

        start = qi::as_string [
            qi::char_("1234") | 
            qi::char_("nsNS") >> qi::omit[*~qi::char_("weWE")] >> qi::char_("weWE")
        ] [ qi::_val = px::ref(_lookup)[qi::_1] ];
    }
  private:
    struct ci {
        template <typename A, typename B>
        bool operator()(A const& a, B const& b) const { return boost::ilexicographical_compare(a, b); }
    };
    std::map<std::string, int, ci> _lookup = { 
        { "NE", 1 }, { "SE", 2 }, { "SW", 3 }, { "NW", 4 },
        { "1" , 1 }, { "2",  2 }, { "3",  3 }, { "4",  4 } };
    qi::rule<It, int()> start;
};

See it Live On Coliru too.
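
If the comparator is the unclear part, here is the same lookup in isolation, without Spirit or Boost. This is only a sketch: ci_less is a standard-library stand-in for the boost::ilexicographical_compare call above, and lookup_quad, with its "0 means not found" convention, is a name invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Case-insensitive "less than" for strings, so "ne" and "NE" are the same key.
struct ci_less {
    bool operator()(std::string const& a, std::string const& b) const {
        return std::lexicographical_compare(
            a.begin(), a.end(), b.begin(), b.end(),
            [](unsigned char x, unsigned char y) {
                return std::tolower(x) < std::tolower(y);
            });
    }
};

// Returns the quadrant for keys such as "ne", "SW" or "3".
// Returns 0 for unknown keys (a convention made up for this sketch).
int lookup_quad(std::string const& key) {
    static std::map<std::string, int, ci_less> const table = {
        {"NE", 1}, {"SE", 2}, {"SW", 3}, {"NW", 4},
        {"1", 1},  {"2", 2},  {"3", 3},  {"4", 4}};

    auto const it = table.find(key);
    if (it == table.end()) return 0;
    assert(it->second >= 1 && it->second <= 4); // table invariant
    return it->second;
}
```

Note that this sketch uses find, whereas the grammar above indexes the map with operator[], which would insert a default entry for a missing key. In the grammar that should not matter, because the parser only ever produces keys that are in the map.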

3. Optimizing it

qi::symbols uses tries. You might think that's faster. It is, in fact, pretty fast for lookups. But not on very small key sets, on a node-based container, using dynamically allocated temporary keys.

In other words, we can do much better:

template <typename It> struct FastLookup : qi::grammar<It, int()> {
    using key = std::array<char, 2>;

    FastLookup() : FastLookup::base_type(start) {
        namespace px = boost::phoenix;

        start = 
            qi::int_ [ qi::_pass = (qi::_1 > 0 && qi::_1 <= 4), qi::_val = qi::_1 ] |
            qi::raw [
                qi::char_("nsNS") >> qi::omit[*~qi::char_("weWE")] >> qi::char_("weWE")
            ] [ qi::_val = _lookup(qi::_1) ];
    }
  private:
    struct lookup_f {
        template <typename R> int operator()(R const& range) const {
            using key = std::tuple<char, char>;
            static constexpr key index[] = { key {'N','E'}, key {'S','E'}, key {'S','W'}, key {'N','W'}, };

            using namespace std;
            auto a = std::toupper(*range.begin());
            auto b = std::toupper(*(range.end()-1));
            return 1 + (find(begin(index), end(index), key(a, b)) - begin(index));
        }
    };

    boost::phoenix::function<lookup_f> _lookup;
    qi::rule<It, int()> start;
};

See it Live Again On Coliru
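
The lookup functor is easier to follow outside the grammar. The sketch below does the same thing on a plain std::string: it looks only at the first and last character of the matched text and scans a four-entry array. The name quadrant_of is hypothetical, and I've used std::pair where the grammar uses std::tuple.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <iterator>
#include <string>
#include <utility>

// Maps already-matched text such as "NE" or "n44 30'14.950\"w" to a quadrant.
// Precondition: the text starts with N or S and ends with E or W, which is
// what the raw[] parser in the grammar guarantees.
int quadrant_of(std::string const& matched) {
    assert(matched.size() >= 2);

    using key = std::pair<char, char>;
    // The position in this array, plus one, is the quadrant number.
    static const key index[] = {{'N', 'E'}, {'S', 'E'}, {'S', 'W'}, {'N', 'W'}};

    key const k(
        static_cast<char>(std::toupper(static_cast<unsigned char>(matched.front()))),
        static_cast<char>(std::toupper(static_cast<unsigned char>(matched.back()))));

    // Linear scan over four entries: no allocation, no node chasing.
    auto const pos = std::find(std::begin(index), std::end(index), k);
    assert(pos != std::end(index));
    return 1 + static_cast<int>(pos - std::begin(index));
}
```

With only four keys, a linear scan over a small contiguous array avoids both the temporary string and the tree walk of the map version, which is the point of this section.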


¹ If you insist, you can still use symbols in your own code.
