Andres Azqueta - 5 months ago
Python Question

How to skip items in a loop

I am trying to create a list with all the newspaper articles from 5 different sources. They are stored in JSON format. The articles are stored in separate files whose names contain the newspaper and the year (time span 2005-2015). The problem is that one of the newspapers is available only for 2014-15, so when I loop over everything together I get an error. This is my attempt:

import json
import nltk
import re
import pandas

appended_data = []

for i in range(2005,2016):
    df0 = pandas.DataFrame([json.loads(l) for l in open('SDM_%d.json' % i)])
    df1 = pandas.DataFrame([json.loads(l) for l in open('Scot_%d.json' % i)])
    df2 = pandas.DataFrame([json.loads(l) for l in open('APJ_%d.json' % i)])
    df3 = pandas.DataFrame([json.loads(l) for l in open('TH500_%d.json' % i)])
    df4 = pandas.DataFrame([json.loads(l) for l in open('DRSM_%d.json' % i)])
    appended_data.append(df0)
    appended_data.append(df1)
    appended_data.append(df2)
    appended_data.append(df3)
    appended_data.append(df4)


appended_data = pandas.concat(appended_data)

doc_set = appended_data.body


My questions are: does this code do what I am aiming for (creating a single list with the body of all articles from each newspaper over time)? And how can I write it so that it skips the years 2005-2013 for the first newspaper (SDM)?

Answer

For the skipping part, you can add a condition on the year:

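for i in range(2005, 2016):
    # SDM files exist only for 2014 and 2015, so skip the earlier years
    if i > 2013:
        df0 = pandas.DataFrame([json.loads(l) for l in open('SDM_%d.json' % i)])
        appended_data.append(df0)
    df1 = pandas.DataFrame([json.loads(l) for l in open('Scot_%d.json' % i)])
    appended_data.append(df1)
    # ... load and append the remaining newspapers (APJ, TH500, DRSM) the same way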
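
If more newspapers turn out to have gaps, the same year check can be driven by a small table instead of one if per source. The following is only a sketch under assumptions: the names FIRST_YEAR and load_records are made up for illustration, only the SDM start year comes from the question, and each file is assumed to hold one JSON object per line, as the question's code implies.

```python
import json
import os

# Assumed first available year per newspaper prefix; only SDM's start year
# comes from the question, the other entries mirror the 2005 start of the loop.
FIRST_YEAR = {"SDM": 2014, "Scot": 2005, "APJ": 2005, "TH500": 2005, "DRSM": 2005}


def load_records(directory=".", first_year=FIRST_YEAR, start=2005, stop=2016):
    """Collect one dict per article from files named '<prefix>_<year>.json'."""
    records = []
    for year in range(start, stop):
        for prefix, first in first_year.items():
            if year < first:
                continue  # this newspaper has no file for this year
            path = os.path.join(directory, "%s_%d.json" % (prefix, year))
            with open(path) as handle:
                # One JSON object per line, as in the question's code
                records.extend(json.loads(line) for line in handle if line.strip())
    return records
```

The returned list can be passed to pandas.DataFrame(records) in one step, which would replace the per-file DataFrames and the final concat.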

To know whether the code performs as expected we'd need some sample data.
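
In the meantime, one way to check the concat step is to run it on a couple of tiny hand-made frames. This is a sketch with invented sample rows, not the real articles; the only thing taken from the question is the column name body. Note that appended_data.body in the question is a pandas Series, not a Python list, so the sketch converts it with tolist().

```python
import pandas

# Made-up sample frames standing in for two of the yearly files
df_a = pandas.DataFrame([{"body": "first article"}, {"body": "second article"}])
df_b = pandas.DataFrame([{"body": "third article"}])

# ignore_index=True renumbers the rows; without it the index would repeat (0, 1, 0)
combined = pandas.concat([df_a, df_b], ignore_index=True)

# Plain Python list with the body of every article, in load order
doc_set = combined.body.tolist()
```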
