moduluses moduluses - 1 year ago 87
Java Question

Java StreamTokenizer taking a number and character without whitespace as separate tokens

I'm writing a parser with StreamTokenizer. I need an input like "8a" to echo an error that a number contains a char. Instead, it prints:

NUM: 8 ID: a


It seems to be identifying the char as a separate token, even though no whitespace separates them.

Is there a workaround?

Answer Source

You can override StreamTokenizer's parseNumbers method to disable the special handling of numeric characters. Be aware that this may be risky or otherwise unsuitable, because it relies on the constructor calling an overridable method.

As per the Javadoc for parseNumbers (https://docs.oracle.com/javase/7/docs/api/java/io/StreamTokenizer.html#parseNumbers()):

    When the parser encounters a word token that has the format of a
    double precision floating-point number, it treats the token as a
    number rather than a word, by setting the ttype field to the value
    TT_NUMBER and putting the numeric value of the token into the
    nval field.
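To make the default behaviour concrete, here is a minimal sketch that runs an unmodified StreamTokenizer over "8a". The class name DefaultTokenizerDemo and the helper tokenize are made up for illustration, and the NUM/ID labels only mirror the output shown in the question.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class DefaultTokenizerDemo {
    // Hypothetical helper: tokenizes the input with default settings
    // and returns a short description of every token.
    static List<String> tokenize(String input) throws IOException {
        StreamTokenizer tk = new StreamTokenizer(new StringReader(input));
        List<String> out = new ArrayList<>();
        while (tk.nextToken() != StreamTokenizer.TT_EOF) {
            if (tk.ttype == StreamTokenizer.TT_NUMBER) {
                out.add("NUM: " + (int) tk.nval);
            } else if (tk.ttype == StreamTokenizer.TT_WORD) {
                out.add("ID: " + tk.sval);
            } else {
                // Any other single character (punctuation and so on).
                out.add("CHAR: " + (char) tk.ttype);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Number parsing stops at the first non-numeric character,
        // so the digit and the letter come back as two tokens.
        System.out.println(tokenize("8a")); // prints [NUM: 8, ID: a]
    }
}
```

The split happens because digits are flagged as numeric by default, so the tokenizer commits to reading a number as soon as it sees the 8.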

Here is an example. The empty override means the 'numeric' attribute is never assigned to the characters normally used in numbers:

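    import java.io.IOException;
    import java.io.Reader;
    import java.io.StreamTokenizer;
    import java.io.StringReader;

    public class TokenizerDemo {
        public static void main(String[] args) throws IOException {
            final Reader rd = new StringReader("8a");
            final StreamTokenizer tk = new StreamTokenizer(rd) {
                @Override
                public void parseNumbers() {
                    // Deliberately empty: by not calling super.parseNumbers(),
                    // digits, '.' and '-' never get their numeric meaning.
                }
            };

            tk.wordChars('a', 'z');
            tk.wordChars('0', '9'); // digits are now ordinary word characters
            while (tk.nextToken() != StreamTokenizer.TT_EOF) {
                if (tk.ttype == StreamTokenizer.TT_WORD) {
                    System.out.println("TT_WORD " + tk.sval);
                } else if (tk.ttype == StreamTokenizer.TT_NUMBER) {
                    System.out.println("TT_NUMBER " + tk.nval);
                }
            }
        }
    }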

outputs:

TT_WORD 8a

With the above config, you get the String 8a as a single word token. You can then inspect its characters (for example with Character.isDigit or a regular expression) to detect a number that contains a letter.
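A rough sketch of that check follows. It assumes an alternative configuration that avoids subclassing: ordinaryChars and ordinaryChar clear the numeric flag, and wordChars then marks digits as word characters. The class NumberTokenCheck, the helper classify, and the NUM/ID/ERROR labels are hypothetical names chosen for illustration, and decimals or negative numbers are not handled.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class NumberTokenCheck {
    // Hypothetical helper: labels each word token as NUM, ID, or ERROR.
    static List<String> classify(String input) throws IOException {
        StreamTokenizer tk = new StreamTokenizer(new StringReader(input));
        // Clear the numeric flag set by default on digits, '.' and '-'.
        tk.ordinaryChars('0', '9');
        tk.ordinaryChar('.');
        tk.ordinaryChar('-');
        // Treat digits as word characters so "8a" stays in one token.
        tk.wordChars('0', '9');

        List<String> out = new ArrayList<>();
        while (tk.nextToken() != StreamTokenizer.TT_EOF) {
            if (tk.ttype != StreamTokenizer.TT_WORD) {
                continue; // punctuation is ignored in this sketch
            }
            String s = tk.sval;
            if (s.matches("[0-9]+")) {
                out.add("NUM: " + s);
            } else if (Character.isDigit(s.charAt(0))) {
                // Starts like a number but contains something else.
                out.add("ERROR: number contains a non-digit: " + s);
            } else {
                out.add("ID: " + s);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(classify("8a 42 abc"));
    }
}
```

Because the numbers arrive as strings here, a real parser would still need to convert them itself, for instance with Integer.parseInt.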
