dd90p dd90p - 11 days ago 6
Python Question

How to split a single txt file into multiple txt files with Python

I have a single txt file, and I would like to split it into many files according to the *TEXT ID.

For example, the single txt file looks like this:

*TEXT 017 01/04/63 PAGE 020
THE ALLIES AFTER NASSAU IN DECEMBER 1960, THE U.S . FIRST
PROPOSED TO HELP NATO DEVELOP ITS OWN NUCLEAR STRIKE FORCE . BUT EUROPE.....
*TEXT 018 01/04/63 PAGE 021
RUSSIA WHO'S IN CHARGE HERE ? IT WAS IN 1954 THAT NIKITA
KHRUSHCHEV LAUNCHED HIS GRANDIOSE " VIRGIN LANDS " GAMBLE . PART OF THE.....
*TEXT 019 01/04/63 PAGE 021
BERLIN ONE LAST RUN HANS WEIDNER HAD BEEN HOPING FOR MONTHS TO
ESCAPE DRAB EAST GERMANY AND MAKE HIS WAY TO THE WEST . THE ODDS WERE
AGAINST HIM, FOR WEIDNER, 40, WAS A....


How can I split it into multiple txt files, one per *TEXT block, named as follows?

filename:
TEXT017.txt

filename:
TEXT018.txt

filename:
TEXT019.txt

Answer

Inspired by @n1c9's answer, I modified it and added a few things to make it complete.

import re

raw_string = """*TEXT 017 01/04/63 PAGE 020
THE ALLIES AFTER NASSAU IN DECEMBER 1960, THE U.S . FIRST
PROPOSED TO HELP NATO DEVELOP ITS OWN NUCLEAR STRIKE FORCE . BUT EUROPE.....
*TEXT 018 01/04/63 PAGE 021
RUSSIA WHO'S IN CHARGE HERE ? IT WAS IN 1954 THAT NIKITA
KHRUSHCHEV LAUNCHED HIS GRANDIOSE " VIRGIN LANDS " GAMBLE . PART OF THE.....
*TEXT 019 01/04/63 PAGE 021
BERLIN ONE LAST RUN HANS WEIDNER HAD BEEN HOPING FOR MONTHS TO
ESCAPE DRAB EAST GERMANY AND MAKE HIS WAY TO THE WEST . THE ODDS WERE
AGAINST HIM, FOR WEIDNER, 40, WAS A...."""

# Split on the header lines; the capturing group keeps each `*TEXT ...` header in the result
split_strings = re.split(r'\n?(\*TEXT .*)\n', raw_string)
blocks = [s for s in split_strings if s]  # filter out empty strings (e.g. the one before the first header)

# `blocks` now alternates: header, content, header, content, ...
for i in range(0, len(blocks), 2):
    # extract `019` from `*TEXT 019 01/04/63 PAGE 021`
    num = re.search(r'TEXT (\d+)', blocks[i]).group(1)

    # save content to `TEXT019.txt`
    filename = 'TEXT%s.txt' % num
    content = blocks[i+1]
    with open(filename, 'w') as fp:
        fp.write(content)
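
The code above works on a string pasted into the script, while the question starts from a file on disk. Below is a minimal sketch of a line-by-line variant that reads the file directly, so the whole text never has to be held in memory. The function name `split_text_file` and the `out_dir` parameter are hypothetical names chosen for illustration, and the sketch assumes every block starts with a line beginning with `*TEXT ` followed by the numeric ID. Unlike the `re.split` version, each output file keeps the trailing newline of its last line.

```python
import re
from pathlib import Path

# A header line looks like `*TEXT 017 01/04/63 PAGE 020`; capture the ID digits
HEADER = re.compile(r'^\*TEXT (\d+)')


def split_text_file(src_path, out_dir='.'):
    """Write each *TEXT block of src_path to out_dir/TEXT<id>.txt.

    Returns the list of file names written, in order of appearance.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    out = None  # file object for the block currently being copied
    try:
        with open(src_path, 'r') as src:
            for line in src:
                match = HEADER.match(line)
                if match:
                    # New header: close the previous block and open the next file
                    if out is not None:
                        out.close()
                    name = 'TEXT%s.txt' % match.group(1)
                    out = open(out_dir / name, 'w')
                    written.append(name)
                elif out is not None:
                    # Body line: copy it to the current block's file
                    out.write(line)
                # Lines that appear before the first header are skipped
    finally:
        if out is not None:
            out.close()
    return written
```

Assuming the input is saved as `input.txt` (a placeholder name), calling `split_text_file('input.txt')` would create `TEXT017.txt`, `TEXT018.txt` and `TEXT019.txt` in the current directory. If two blocks share the same ID, the later one overwrites the earlier file, which is also true of the `re.split` version.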