So I'm starting a project on directed graphs and topological sorting. I'm looking for the most efficient way to parse an input file of courses in the form:
void populateGraph(std::string filename, Graph* graph)
You're on the right track with a lot of things.
If you can assume that all the inputs you will encounter are just
COURSE_X and you can assume that all Xes form a continuous interval of ints starting from 1 (or 0) and span to the number of vertices, then you could just treat the vertices as numbers internally. If that is not the case, you can assign each vertex label a number (using std::unordered_map for example) and have this abstract structure anyway.
Now, if you choose to stick with this model, it is convenient to use, because your whole graph can be represented as
std::vector<std::list<int>>. You could substitute
int with some struct type if you want to store more info about the edge, like labels, weights, etc. Whenever you want to access a specific node's adjacency list, you just access the vector's cell under the node's id.
Obviously this is an adjacency list based solution. It's good for sparse graphs generally speaking. Think of it this way: if you're using a matrix for a sparse graph, a very big part of that allocated memory is not going to be used. Using adjacency graph eliminates this, but it takes away the constant access time to any arbitrary edge. This means that it can take linear time to check if a given edge exists. That being said, I wouldn't expect you to use that check in a topological sort. If you chose to use a matrix however, you would still need to map the vertices to numbers, to that part stays.
Last but not least, you can use pointers instead of integer ids. Basically you wouldn't look up vertices in the vector using an id, but you'd directly access a vertice through a stored pointer. You'd most likely still need to have the string -> pointer mapping, at least for the graph creation.