gsamaras - 3 months ago
Linux Question

Number of subdirectories in a directory?

How to find the number of subdirectories in a specified directory in HDFS?




When I run hadoop fs -ls /mydir/, I get a Java heap space error because the directory is too big. What I am actually interested in is the number of subdirectories in that directory. I tried:

[gsamaras@gwta3000 ~]$ hadoop fs -find /mydir/ -maxdepth 1 -type d -print | wc -l
find: Unexpected argument: -maxdepth
0


I know that the directory is not empty, so 0 cannot be correct:

[gsamaras@gwta3000 ~]$ hadoop fs -du -s -h /mydir
737.5 G /mydir

Answer

The command to use is: hdfs dfs -ls -R /path/to/mydir/ | grep "^d" | wc -l

On a directory this large, however, this will also fail with java.lang.OutOfMemoryError: Java heap space. To avoid the error, increase the client's Java heap size and then run the same command:

export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Xmx5g" and then

hdfs dfs -ls -R /path/to/mydir/ | grep "^d" | wc -l    # all subdirectories, recursively

OR

hdfs dfs -ls /path/to/mydir/ | grep "^d" | wc -l    # immediate subdirectories only (maxdepth=1)
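
If you run this often, the two steps above can be wrapped in small shell helpers. This is only a sketch: with_client_heap and count_dirs are names I made up, not part of Hadoop. The first one bumps the heap for a single command instead of exporting it for the whole session, and the second one does the same job as grep "^d" | wc -l.

```shell
# Hypothetical helper: run one command with a larger Hadoop client heap,
# without exporting the setting for the rest of the session.
# Usage: with_client_heap 5g hdfs dfs -ls /path/to/mydir/
with_client_heap() {
  heap="$1"
  shift
  # The assignment prefix applies only to the command that follows it.
  HADOOP_CLIENT_OPTS="${HADOOP_CLIENT_OPTS:-} -Xmx${heap}" "$@"
}

# Hypothetical helper: read `hdfs dfs -ls` output on stdin and print how many
# lines describe a directory (permission string starting with "d").
count_dirs() {
  # grep -c prints the count itself; "|| true" keeps a zero count from
  # being reported as a failure.
  grep -c '^d' || true
}
```

With those defined, the maxdepth=1 case becomes: with_client_heap 5g hdfs dfs -ls /path/to/mydir/ | count_dirs (add -R for the recursive count). If the recursive total is all you need, hdfs dfs -count /path/to/mydir/ may be worth trying as well: its first output column is DIR_COUNT, and the summary is computed on the NameNode rather than by listing every entry in the client. I believe that figure includes /path/to/mydir/ itself, so subtract one.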