Victor Victor - 6 months ago 9
Python Question

How to read some contents of xml files and write them into a text file?

I have the following XML file. I want to read the contents of the <seg> elements and save them to a plain text file with Python. I used the DOM module (xml.dom.minidom).



<?xml version="1.0"?>
<mteval>
<tstset setid="default" srclang="any" trglang="TRGLANG" sysid="SYSID">
<doc docid="ntpmt-dev-2000/even1k.cn.seg.txt">
<seg id="1">therefore , can be obtained having excellent properties ( good stability and solubility of the balance of the crystal as a pharmaceutical compound is not possible to predict .</seg>
<seg id="3">compound ( I ) are preferably crystalline , in particular , has good stability and solubility equilibrium and suitable for industrial prepared type A crystal is preferred .</seg>
<seg id="4">method B included in the catalyst such as DMF , and the like in the presence of a compound of formula ( II ) with thionyl chloride or oxalyl chloride to give an acyl chloride , in the presence of a base of the acid chloride with alcohol ( IV ) ( O ) by reaction of esterification .</seg>
</doc>
</tstset>
</mteval>





from xml.dom.minidom import parse
import xml.dom.minidom

dom = xml.dom.minidom.parse(r"path_to_xml file")
file = dom.documentElement
seg = dom.getElementsByTagName("seg")
for item in seg:
    sent = item.firstChild.data
    print(sent,sep='')

file = open(r'file.txt','w')
file.write(sent)
file.close()


When I run the code above, it prints all the lines to the screen successfully, but file.txt contains only one line, the last <seg> (seg id=4). I want to save all the sentences to the file. Is there something wrong with my code?

Answer

You're only calling file.write(sent) once, after the loop has finished, so only the last value of sent ends up in the file. Open the file before the loop and call file.write(sent) inside the loop:

file = open(r'file.txt','w')

for item in seg:
    sent = item.firstChild.data
    print(sent,sep='')
    file.write(sent + '\n')  # <---- this line; '\n' puts each sentence on its own line

file.close()
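
For reference, here is a minimal self-contained sketch of the same fix, using a with block so the file is closed automatically. The helper names extract_segments and write_segments, the shortened SAMPLE_XML string, and the temporary output path are illustrative assumptions, not part of the original code; it also assumes each <seg> contains plain text only.

```python
import os
import tempfile
from xml.dom.minidom import parseString

# Shortened stand-in for the XML in the question (assumption: same structure).
SAMPLE_XML = """<?xml version="1.0"?>
<mteval>
<tstset setid="default" srclang="any" trglang="TRGLANG" sysid="SYSID">
<doc docid="example">
<seg id="1">first sentence .</seg>
<seg id="3">second sentence .</seg>
<seg id="4">third sentence .</seg>
</doc>
</tstset>
</mteval>
"""


def extract_segments(xml_text):
    """Return the text of every <seg> element, in document order."""
    dom = parseString(xml_text)
    segments = []
    for item in dom.getElementsByTagName("seg"):
        # firstChild is None for an empty <seg></seg>, so skip those.
        if item.firstChild is not None:
            segments.append(item.firstChild.data.strip())
    return segments


def write_segments(segments, path):
    """Write one segment per line; the with block closes the file."""
    with open(path, "w", encoding="utf-8") as out:
        for sent in segments:
            # The write happens inside the loop, once per sentence.
            out.write(sent + "\n")


# Hypothetical output location: a temporary directory instead of file.txt.
output_path = os.path.join(tempfile.mkdtemp(), "file.txt")
write_segments(extract_segments(SAMPLE_XML), output_path)
```

To use it on a real file, replace parseString(xml_text) with xml.dom.minidom.parse(path_to_xml) and pass the desired output path to write_segments.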