Ayyappan Ayyappan - 4 months ago 37
Java Question

Regex for reading an SRT file in Android

I want to get the start time, end time, and subtitle text from a subtitle file (.srt) in an Android app. I placed the .srt file in the assets folder and I am using a regex to extract the content, but the pattern does not extract anything from the file; it returns null. Does the regex need to be modified? The regex code and the file content are given below.

code::

protected static final String nl = "\\\n";
protected static final String sp = "[ \\t]*";
Pattern pattern = Pattern.compile("(\\d+)" + sp + nl
        + "(\\d{1,2}):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + sp
        + "-->" + sp + "(\\d\\d):(\\d\\d):(\\d\\d),(\\d\\d\\d)" + sp
        + "(X1:\\d.*?)??" + nl + "((.|\\\\n)*?)" + nl + nl);


file content::

2
00:00:02,373 --> 00:00:03,999
Ohh wooaah

3
00:00:06,190 --> 00:00:07,798
Ohh wooaah


4
00:00:09,743 --> 00:00:12,966
Ohh wooaah

Answer

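Next time, please at least show what you have tried. By the way, here is a very good tutorial on regular expressions: http://www.regular-expressions.info/

A simpler pattern can be assembled from three small pieces, with the start time, end time, and text read back from the capture groups: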

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String lineNumberPattern = "(\\d+\\s)";   // cue number plus the single line break after it
String timeStampPattern = "([\\d:,]+)";   // a timestamp such as 00:00:02,373
String contentPattern = "(.*)";           // subtitle text, up to the end of that line only

// the complete regexp: "(\\d+\\s)([\\d:,]+)( --> )([\\d:,]+)(\\s)(.*)"

String sampleLine = "2\n00:00:02,373 --> 00:00:03,999\nOhh wooaah\n";
Matcher matcher = Pattern.compile(
        lineNumberPattern + timeStampPattern + "( --> )" + timeStampPattern + "(\\s)" + contentPattern)
        .matcher(sampleLine);

while (matcher.find()) {
    String start = matcher.group(2);    // start time
    String end = matcher.group(4);      // end time
    String content = matcher.group(6);  // first line of the subtitle text
    // store this information somewhere
}
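
Two details of the pattern in the question look likely to cause trouble. The fragment "((.|\\\\n)*?)" compiles to a regex that matches a literal backslash followed by the letter n, not a line break. The pattern also needs two line breaks after every subtitle, and the last entry in a file often lacks them. If the file uses Windows line endings (\r\n), nl will not match them either. The snippet above avoids most of this but captures only one line of text and assumes a single \n after the cue number. Below is a minimal sketch of a more tolerant version, assuming the whole .srt file has already been read into a String. The names SrtParser, Cue, and parse are hypothetical, chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SrtParser {

    /** One subtitle entry: start time, end time, and text (hypothetical holder type). */
    static final class Cue {
        final String start;
        final String end;
        final String text;

        Cue(String start, String end, String text) {
            this.start = start;
            this.end = end;
            this.text = text;
        }
    }

    // Cue number line, timing line, then text that runs (lazily) up to
    // a blank line or the end of the input. "\\r?\\n" accepts both \n and \r\n.
    private static final Pattern CUE = Pattern.compile(
            "(\\d+)[ \\t]*\\r?\\n"
            + "(\\d{1,2}:\\d\\d:\\d\\d,\\d{3})[ \\t]*-->[ \\t]*"
            + "(\\d{1,2}:\\d\\d:\\d\\d,\\d{3})[^\\r\\n]*\\r?\\n"
            + "(.*?)(?=\\r?\\n[ \\t]*\\r?\\n|\\s*\\z)",
            Pattern.DOTALL); // DOTALL lets '.' cross line breaks for multi-line text

    static List<Cue> parse(String srt) {
        List<Cue> cues = new ArrayList<>();
        Matcher m = CUE.matcher(srt);
        while (m.find()) {
            // group(1) is the cue number; groups 2 to 4 are start, end, text
            cues.add(new Cue(m.group(2), m.group(3), m.group(4).trim()));
        }
        return cues;
    }

    public static void main(String[] args) {
        String sample = "2\n00:00:02,373 --> 00:00:03,999\nOhh wooaah\n\n"
                + "3\r\n00:00:06,190 --> 00:00:07,798\r\nLine one\r\nLine two\r\n\r\n\r\n"
                + "4\n00:00:09,743 --> 00:00:12,966\nOhh wooaah\n";
        for (Cue cue : parse(sample)) {
            // Show multi-line text on one output line for readability
            System.out.println(cue.start + " | " + cue.end + " | "
                    + cue.text.replaceAll("\\r?\\n", " / "));
        }
    }
}
```

The lookahead at the end of the pattern is the main design choice: it stops the text at a blank line without consuming it, so the final entry still matches when the file ends right after the subtitle text. If matcher.find() returns false for a file that looks correct, checking the line endings and the file encoding is a reasonable next step.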