tumbleweed - 1 month ago
Python Question

How to append a sequence of lists to a pandas dataframe?

I generated a list of strings as follows:

In:

for x in links:
    full_content = driver.find_elements_by_xpath('apath')
    full_content = [x.text for x in full_content]
    print(full_content)


Out: (a very large sequence of lists)

['Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.']
['Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip']
...
['Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.']


I tried to append them with:

full_content = pd.DataFrame([x.text for x in full_content])


However, instead of generating a single dataframe, it generates a separate one for each list. How can I append the aforementioned sequence of lists into a single pandas dataframe, without the quotes (' '), like this?

col1
0 Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.
1 Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip
...
3 Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

Answer

I think I understand what you are trying to do. You want to create a pandas dataframe for each full_content and then append it to a list of frames. Finally, you can merge all the dataframes with pd.concat.

import pandas as pd

frames = []
counter_from = 0  # first row label for the next page's rows
for x in links:
    driver.get(x)
    elements = driver.find_elements_by_xpath('.//*[@id="segment"]')
    full_content = [element.text for element in elements]
    len_items = len(full_content)
    counter_to = counter_from + len_items

    # Give each page's rows a continuous index so the labels stay unique after concat
    data = {'text': pd.Series(full_content,
                              index=range(counter_from, counter_to))}
    df = pd.DataFrame(data)
    frames.append(df)
    counter_from += len_items

# Stack the per-page frames into a single dataframe
result = pd.concat(frames)
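
If the manual index bookkeeping feels heavy, here is a minimal sketch of two possible shortcuts. It assumes the scraped text has already been collected as one list of strings per link; the `pages` list below is a hypothetical stand-in for that data, not real scraper output. The first option lets pd.concat renumber the rows with ignore_index=True, and the second flattens all the lists and builds the dataframe once.

```python
import pandas as pd

# Hypothetical stand-in for the scraped data: one list of strings per link,
# i.e. what [element.text for element in elements] would give for each page.
pages = [
    ["Lorem ipsum dolor sit amet", "consectetur adipiscing elit"],
    ["sed do eiusmod tempor"],
    ["incididunt ut labore", "et dolore magna aliqua"],
]

# Option 1: one dataframe per page, then concatenate.
# ignore_index=True renumbers the rows 0..n-1, so no counters are needed.
frames = [pd.DataFrame({"text": texts}) for texts in pages]
result = pd.concat(frames, ignore_index=True)

# Option 2: flatten the lists first and build a single dataframe directly.
all_texts = [text for texts in pages for text in texts]
flat = pd.DataFrame({"text": all_texts})

print(result)
```

Either way, the quotes in the question's output are only how Python prints a list of strings; once the strings are in a dataframe column, they are displayed without quotes.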