speedRS - 5 months ago
Java Question

How do I parse delimited rows of text with differing field counts into objects, while allowing for extension?

An example is as follows:


Basically, each
line needs to be parsed into a corresponding object, defining what each of those fields are. Some, such as the third field in
will be parsed as a

Each object will generally stay the same but there may be instances in which an additional field may be added, like so:


At the moment, I'm thinking of using the following type of algorithm:

List<String> segments = Arrays.asList(string.split("\r")); // Will always be a CR.
List<String> fields;
String fieldName;
for (String segment : segments) {
    fields = Arrays.asList(segment.split("\\|"));
    fieldName = fields.get(0);
    SEG1 seg1;
    if (fieldName.equals("SEG1")) {
        seg1 = new SEG1();
    } else if (fieldName.equals("SEG2")) {
        // Handle SEG2.
    } else if (fieldName.equals("SEG3")) {
        // Handle SEG3.
    } else {
        // Erroneous/failure case.
    }
}

Some fields may be optional as well, depending on the object being populated. My concern is that if I add a new field to a class, any checks that rely on the expected field count will also need to be updated. How could I parse the rows while allowing for new or modified fields in the classes being populated?


If you can define a common interface for all of the classes to be parsed, I would suggest the following:

interface Segment {}

class SEG1 implements Segment {
    void setField1(final String field) {}
    void setField2(final String field) {}
    void setField3(final String field) {}
}

enum Parser {
    SEGMENT1("SEG1") {
        @Override
        protected Segment parse(final String[] fields) {
            final SEG1 segment = new SEG1();
            // Populate the segment from fields here.
            return segment;
        }
    };

    private final String name;

    private Parser(final String name) {
        this.name = name;
    }

    protected abstract Segment parse(String[] fields);

    public static Segment parse(final String segment) {
        final int firstSeparator = segment.indexOf('|');

        final String name = segment.substring(0, firstSeparator);
        final String[] fields = segment.substring(firstSeparator + 1).split("\\|");

        for (final Parser parser : values()) {
            if (parser.name.equals(name)) {
                return parser.parse(fields);
            }
        }

        // No parser is registered for this segment name.
        return null;
    }
}

For each type of segment, add an element to the enum and handle the different kinds of fields in the parse(String[]) method.
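To make the extension step concrete, here is a minimal sketch of the same enum with a second segment type that has an optional trailing field. Everything beyond the enum shape is an assumption for illustration: the Seg1 and Seg2 classes and their field names, the requiredFields count, and the optional helper are hypothetical and do not come from the question's data. It also throws on unknown or too-short rows instead of returning null, which is a design choice of this sketch rather than part of the suggestion above.

```java
import java.util.Arrays;

class SegmentDemo {
    public static void main(final String[] args) {
        final Seg2 bare = (Seg2) Parser.parse("SEG2|42");
        final Seg2 withNote = (Seg2) Parser.parse("SEG2|42|hello");
        System.out.println(bare.note);     // null
        System.out.println(withNote.note); // hello
    }
}

interface Segment {}

// Hypothetical segment with two required fields.
class Seg1 implements Segment {
    String field1;
    String field2;
}

// Hypothetical segment with one required field and one optional trailing field.
class Seg2 implements Segment {
    String id;
    String note; // null when the row omits it or leaves it empty
}

enum Parser {
    SEGMENT1("SEG1", 2) {
        @Override
        protected Segment parse(final String[] fields) {
            final Seg1 segment = new Seg1();
            segment.field1 = fields[0];
            segment.field2 = fields[1];
            return segment;
        }
    },
    SEGMENT2("SEG2", 1) {
        @Override
        protected Segment parse(final String[] fields) {
            final Seg2 segment = new Seg2();
            segment.id = fields[0];
            segment.note = optional(fields, 1);
            return segment;
        }
    };

    private final String name;
    private final int requiredFields; // minimum field count, excluding the segment name

    Parser(final String name, final int requiredFields) {
        this.name = name;
        this.requiredFields = requiredFields;
    }

    protected abstract Segment parse(String[] fields);

    // Returns the field at index, or null if the row is too short or the field is empty.
    static String optional(final String[] fields, final int index) {
        return index < fields.length && !fields[index].isEmpty() ? fields[index] : null;
    }

    public static Segment parse(final String line) {
        // A limit of -1 keeps trailing empty fields, which split drops by default.
        final String[] parts = line.split("\\|", -1);
        for (final Parser parser : values()) {
            if (parser.name.equals(parts[0])) {
                final String[] fields = Arrays.copyOfRange(parts, 1, parts.length);
                if (fields.length < parser.requiredFields) {
                    throw new IllegalArgumentException(
                            parts[0] + " needs at least " + parser.requiredFields + " field(s)");
                }
                return parser.parse(fields);
            }
        }
        throw new IllegalArgumentException("Unknown segment type: " + parts[0]);
    }
}
```

Because each constant only states its minimum field count and reads optional fields through the helper, appending a new optional field to a segment means touching that one constant and its class. No shared field-count check has to change.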