windboy - 6 months ago
JSON Question

Dividing a text file into two different parts

I have written a simple script that collects a list of titles from a JSON file and generates a text file containing the list.

The result is as follows:

Animal geography
Autobiogeography
Chorography
Economic geography
Footloose industry
Geomorphometry
Health geography
Human geography
Military geography
Philosophy of geography
Physical geography
Political geography
Regional geography
Satirical cartography
Settlement geography
Transport geography
Vernacular geography
Visual geography
Category:Cartography
Category:Economic geography
Category:Geodemography
Category:Human geography
Category:Military geography
Category:Physical geography
Category:Political geography
Category:Regional geography
Category:Settlement geography
Category:Topography
Category:Toponymy
Category:Transportation geography
Category:Vernacular geography
Category:Geography by place


Problem:

The problem that I am facing right now is how to split the text file into two parts:

The first part is a text file containing:

Animal geography
Autobiogeography
Chorography
Economic geography
Footloose industry
Geomorphometry
Health geography
Human geography
Military geography
Philosophy of geography
Physical geography
Political geography
Regional geography
Satirical cartography
Settlement geography
Transport geography
Vernacular geography
Visual geography


And a second text file containing the lines that begin with the word Category:

Category:Cartography
Category:Economic geography
Category:Geodemography
Category:Human geography
Category:Military geography
Category:Physical geography
Category:Political geography
Category:Regional geography
Category:Settlement geography
Category:Topography
Category:Toponymy
Category:Transportation geography
Category:Vernacular geography
Category:Geography by place


I am totally at a loss as to how to do it. Please advise.

Sorry for the confusing title. I have no idea how to explain my problem.

Thank you.

Edit

For example, I have extracted all the titles from this API response (https://en.wikipedia.org/w/api.php?action=query&format=json&list=categorymembers&cmtitle=Category%3ABranches%20of%20geography&cmlimit=100):

{
"batchcomplete":"",
"query":{
"categorymembers":[
{
"pageid":5259784,
"ns":0,
"title":"Animal geography"
},
{
"pageid":8670379,
"ns":0,
"title":"Autobiogeography"
},
{
"pageid":4254743,
"ns":0,
"title":"Chorography"
},
{
"pageid":177512,
"ns":0,
"title":"Economic geography"
},
{
"pageid":7907104,
"ns":0,
"title":"Footloose industry"
},
{
"pageid":5155886,
"ns":0,
"title":"Geomorphometry"
},
{
"pageid":2596739,
"ns":0,
"title":"Health geography"
},
{
"pageid":13372,
"ns":0,
"title":"Human geography"
},
{
"pageid":1794929,
"ns":0,
"title":"Military geography"
},
{
"pageid":5886597,
"ns":0,
"title":"Philosophy of geography"
},
{
"pageid":23263,
"ns":0,
"title":"Physical geography"
},
{
"pageid":1845092,
"ns":0,
"title":"Political geography"
},
{
"pageid":711230,
"ns":0,
"title":"Regional geography"
},
{
"pageid":42099944,
"ns":0,
"title":"Satirical cartography"
},
{
"pageid":33566568,
"ns":0,
"title":"Settlement geography"
},
{
"pageid":9710174,
"ns":0,
"title":"Transport geography"
},
{
"pageid":24644075,
"ns":0,
"title":"Vernacular geography"
},
{
"pageid":5329197,
"ns":0,
"title":"Visual geography"
},
{
"pageid":716309,
"ns":14,
"title":"Category:Cartography"
},
{
"pageid":2021084,
"ns":14,
"title":"Category:Economic geography"
},
{
"pageid":2245786,
"ns":14,
"title":"Category:Geodemography"
},
{
"pageid":1111700,
"ns":14,
"title":"Category:Human geography"
},
{
"pageid":7774333,
"ns":14,
"title":"Category:Military geography"
},
{
"pageid":2153059,
"ns":14,
"title":"Category:Physical geography"
},
{
"pageid":1898464,
"ns":14,
"title":"Category:Political geography"
},
{
"pageid":6645804,
"ns":14,
"title":"Category:Regional geography"
},
{
"pageid":44706236,
"ns":14,
"title":"Category:Settlement geography"
},
{
"pageid":6517504,
"ns":14,
"title":"Category:Topography"
},
{
"pageid":1086902,
"ns":14,
"title":"Category:Toponymy"
},
{
"pageid":41335672,
"ns":14,
"title":"Category:Transportation geography"
},
{
"pageid":24727902,
"ns":14,
"title":"Category:Vernacular geography"
}
]
}
}


I would really appreciate it if you could point me in the right direction on how to solve this problem.

Thank you all for your assistance and guidance.

3kt
Answer

You can try this:

with open('file.txt', 'r') as f:

    data = []
    category = []

    # Sort each line into one of the two lists
    for line in f:
        if line.startswith('Category:'):
            category.append(line)
        else:
            data.append(line)

# Each line keeps its trailing newline, so a plain join restores the layout
with open('category.txt', 'w') as cat_file:
    cat_file.write(''.join(category))

with open('data.txt', 'w') as data_file:
    data_file.write(''.join(data))

This reads file.txt line by line and tests whether each line begins with "Category". If it does, the line is added to the category list; otherwise it is added to the data list.

After processing the file, the program joins the lines in each list and writes them to category.txt and data.txt.
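If the JSON response is still at hand, another option is to split on the "ns" field rather than on the title text, since every "Category:" entry in the response above has "ns": 14 (the MediaWiki category namespace). The following is only a minimal sketch under that assumption: the names split_titles and CATEGORY_NS are made up for illustration, and the sample is a shortened copy of the response shown in the question.

```python
import json

CATEGORY_NS = 14  # assumption: namespace 14 marks "Category:" pages, as in the response above


def split_titles(response):
    """Return (articles, categories) title lists from a categorymembers response."""
    members = response["query"]["categorymembers"]
    categories = [m["title"] for m in members if m["ns"] == CATEGORY_NS]
    articles = [m["title"] for m in members if m["ns"] != CATEGORY_NS]
    return articles, categories


# Shortened sample in the same shape as the response in the question
sample = json.loads("""
{
  "batchcomplete": "",
  "query": {
    "categorymembers": [
      {"pageid": 5259784, "ns": 0, "title": "Animal geography"},
      {"pageid": 8670379, "ns": 0, "title": "Autobiogeography"},
      {"pageid": 716309, "ns": 14, "title": "Category:Cartography"}
    ]
  }
}
""")

articles, categories = split_titles(sample)
print(articles)
print(categories)
```

Each list can then be written to its own file with "\n".join(...), which avoids producing the combined text file in the first place.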

Hope it'll be helpful.
