Paris P Paris P - 20 days ago 10
C++ Question

Graph representation with vectors vs lists in C++

So I had a programming contest problem which involved running a lot of DFS on different graphs.

At first I represented my graph (adjacency lists representation) as an vector of sets:

vector< set<int> > graph;


Also to initialize a graph according to the given number of nodes each time i used an empty set:

set<int> tmpSet;


and I initialized it like that:

for(int j=0;j<N;j++)//N was the number of nodes needed for the graph
graph.push_back(tmpSet);


and i used

graph.clear();


to empty the graph each time.

To insert edges afterwards i used the insert function of std::set.


//Insert directed edge from u to v
graph[u].insert(v);
graph[v].insert(u);


The result was the program consumed too much memory and was too slow to pass the tests. Same thing happened with std::list using the push_back function which is a constant time operation. Then when I changed to std::vector memory consumption became minimal and I passed the tests in 3secs while std::set and std::list couldn't pass them even in 20secs.


My quess is that it had something to do with freeing the space of the inner set and list, but how come vector behave differently?

So my question is if anyone can explain why was this happening so I can understand better how stl containers behave in situations like that where you have a container inside another container.

EDIT: Some extra info: The number of nodes was about N=3000 and the mumber of tests over 1000. That means I had to create over 1000 graphs all held in the variable 'graph'. Also I'm aware that set inserts in O(lgn) time while vector and list in O(1), so I understand why set would take only a bit longer than vector. But then why did the std::list failed also? Also let me mention that set and list finished with 100Mb usage of memory while vector finished with 3Mb

Ok final edit, here's my code to show exactly how I'm using the graph (the list version). Nowhere else in the program any freeing of memory or changing the graph data occurs.

vector< list<int> > graph;
list<int> tmpList;
int T; //num of test cases
int N; //num of nodes
int M; //num of edges
int main ()
{
int u,v;
scanf("%d",&T);//Read test cases
for(int i=0;i<T;i++){

scanf("%d %d",&N,&M);//Read number of nodes and number of edges
for(int j=0;j<N;j++)
graph.push_back(tmpList);

for(int j=0;j<M;j++){
scanf("%d %d",&u,&v);//Read edge from u to v
graph[u].push_back(v);
graph[v].push_back(u);
}
dfs();
graph.clear();
}
}

Answer

When you use std::set to hold adjacent node numbers you insert and get a element in logarithmic time which is slow. But when you use std::vector insert(push_back) and getting an element are done in constant time, therefore the difference in time. So you should use std::vector when you don't need to find some element in set, and use std::set otherwise.

The difference between std::list and std::vector may be because of clear function. For list it is linear but for vector it is atomized constant.

Comments