Kiper Kiper - 1 year ago 226
Python Question

docx to list in python

I am trying to read a docx file and add its text to a list.
Each element of the list should be one line from the docx file.


docx file:

"Hello, my name is blabla,
I am 30 years old.
I have two kids."


Expected list:

['Hello, my name is blabla', 'I am 30 years old', 'I have two kids']

I can't get it to work.

I am using the docx2txt module from here:
github link

It has only one function, process, and it returns all the text from the docx file as a single string.

Also I would like it to keep the special characters like

Answer Source

The docx2txt module reads a docx file and converts it to plain text.

You need to split that output with splitlines() and store the resulting lines in a list.

Code (comments inline):

import docx2txt

# Extract all the text from the docx file as one string
text = docx2txt.process("a.docx")

# Print the raw text returned by docx2txt
print("After converting text is ", text)

content = []
for line in text.splitlines():
  # Ignore empty lines
  if line != '':
    # Append the line to the list
    content.append(line)

print("List is ", content)


After converting text is
 Hello, my name is blabla.

I am 30 years old.

I have two kids.

 List is  ['Hello, my name is blabla.', 'I am 30 years old. ', 'I have two kids.']
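
The loop above can also be written as a list comprehension. The following is only a sketch: the helper names text_to_lines and docx_to_lines are made up for illustration and are not part of docx2txt. Unlike the loop, this version strips surrounding whitespace from each line, so whitespace-only lines are dropped and the trailing space seen in 'I am 30 years old. ' is removed.

```python
def text_to_lines(text):
    """Return the non-blank lines of text, with surrounding whitespace removed."""
    # strip() is applied twice per line: once to filter out blank lines,
    # once to clean up the lines that are kept.
    return [line.strip() for line in text.splitlines() if line.strip()]


def docx_to_lines(path):
    """Read a docx file with docx2txt and return its non-blank lines."""
    # Imported here so text_to_lines stays usable without docx2txt installed.
    import docx2txt
    return text_to_lines(docx2txt.process(path))
```

With docx2txt installed, docx_to_lines("a.docx") should give the same kind of list as the loop, minus the stray whitespace.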
