A. Pathan - 7 months ago
Java Question

Regex in Java to derive the first few characters of a line in a file and use them as the filename

I need to split one file into multiple files. The file looks like this:

1 T table1 "a,b,c,d,e,f"
2 W table1 "a,b,c,d,e,f"
3 D table1 "a,b,c,d,e,f"


I want to split this file into 3 files, named as follows:

1_T_table1, 2_W_table1 and 3_D_table1


I have already split the file into 3 files, but with naive names.
I want to name them as above.
Can anyone help me with naming these files? :)
Below is the code:

Note: test.txt is the file being split.

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;


public class Test {

    /**
     * @param args
     * @throws IOException
     */
    public static void main(String[] args) throws IOException {
        // TODO Auto-generated method stub
        String inputFile="C:\\path\\test.txt";
        BufferedReader br = new BufferedReader(new FileReader(new File(inputFile)));
        String line=null;
        StringBuilder sb = new StringBuilder();
        int count=1;
        try {
            while((line = br.readLine()) != null){
                Matcher match = Pattern.compile("^[0-9]+ [TWD]").matcher(line);
                while (match.find())
                {
                    sb.append("split"+"\r\n ");
                }

                if(sb.length()!=0){
                    sb.append(line+"\r\n ");
                }
            }

            int c = 0;
            Pattern p = Pattern.compile("split");
            Matcher m = p.matcher( sb.toString() );
            while (m.find()) {
                c++;
            }
            //System.out.println(c);
            int index = 0;
            for(int i=0;i<=c ;i++)
            {
                if(sb.length() > 0 && sb.toString().contains("split")){
                    File file = new File("C:\\path\\DOC_ID_"+i+".txt");
                    PrintWriter writer = new PrintWriter(file, "UTF-8");
                    index = sb.toString().indexOf("split",2);
                    //System.out.println(index);
                    if(index>0)
                    {
                        writer.println(sb.toString().substring(7,index));
                        sb.delete(0, index);
                    }
                    else
                    {
                        writer.println(sb.toString().substring(7,sb.length()));
                        sb.delete(0, sb.length());
                    }
                    writer.close();
                }
            }

        } catch (Exception ex) {
            ex.printStackTrace();
        }
        finally {
            br.close();
        }
    }

}

Answer

As far as I understand, you need this regex:

(\d+)\s+([TWD])\s+([\w-]+).*


Java Code

System.out.println(ln.replaceFirst("(\\d+)\\s+([TWD])\\s+([\\w-]+).*", "$1_$2_$3"));

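To tie this back to the splitting itself, here is a minimal sketch of how the regex could drive the file names directly, rather than building a "split" marker string first. The class name SplitByPrefix, the helpers fileNameFor and group, and the ".txt" extension are my own assumptions for illustration, not something from the question. Lines before the first matching header are skipped, and each header line is kept in its own output file, which is what the original code appears to do.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; all names here are illustrative.
class SplitByPrefix {

    // Same regex as above, compiled once.
    // Groups: 1 = number, 2 = type letter (T, W or D), 3 = table name.
    private static final Pattern HEADER =
            Pattern.compile("^(\\d+)\\s+([TWD])\\s+([\\w-]+).*");

    // Returns e.g. "1_T_table1" for a header line, or null if the line is not a header.
    static String fileNameFor(String line) {
        Matcher m = HEADER.matcher(line);
        return m.matches() ? m.group(1) + "_" + m.group(2) + "_" + m.group(3) : null;
    }

    // Groups lines under the name derived from the most recent header line.
    // Lines that appear before the first header are dropped.
    static Map<String, List<String>> group(List<String> lines) {
        Map<String, List<String>> parts = new LinkedHashMap<>();
        List<String> current = null;
        for (String line : lines) {
            String name = fileNameFor(line);
            if (name != null) {
                current = parts.computeIfAbsent(name, k -> new ArrayList<>());
            }
            if (current != null) {
                current.add(line);
            }
        }
        return parts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java SplitByPrefix <inputFile> <outputDir>");
            return;
        }
        Path input = Paths.get(args[0]);
        Path outDir = Paths.get(args[1]);
        List<String> lines = Files.readAllLines(input, StandardCharsets.UTF_8);
        for (Map.Entry<String, List<String>> e : group(lines).entrySet()) {
            // Assumption: output files get a ".txt" extension, as in the question's code.
            Files.write(outDir.resolve(e.getKey() + ".txt"), e.getValue(), StandardCharsets.UTF_8);
        }
    }
}
```

With the sample file from the question, running it as java SplitByPrefix C:\path\test.txt C:\path should produce 1_T_table1.txt, 2_W_table1.txt and 3_D_table1.txt in the output directory. If two header lines yield the same name, their lines end up in the same file; whether that is desirable depends on your data.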