Shivani - 5 months ago
Python Question

How do I maintain the order when splitting a string?

I am trying to create an ordered dictionary from a split string. How do I maintain the order of the pieces after splitting? Sorry, my original example was confusing and contradicted the idea of an ordered dictionary. This is a different problem, but I am not sure how to split the string to get the result below.

My sample file "practice_split.txt" is as follows:

§1 text for chapter 1 §2 text for chapter 2 §3 text for chapter 3


I want my ordered dictionary to look like:

OrderedDict([('§1', 'text for chapter 1'), ('§2', 'text for chapter 2'), ('§3', 'text for chapter 3')])


instead of:

OrderedDict([('1 text for chapter 1 ', '\xc2\xa7'), ('\xc2\xa7', '3 text for chapter 3'), ('2 text for chapter 2 ', '\xc2\xa7')])


This is my code:

# -*- coding: utf-8 -*-
import codecs
import collections
import re

with codecs.open('practice_split.txt', mode='r', encoding='utf-8') as document:
    o_dict = collections.OrderedDict()

    for line in document:
        conv = line.encode('utf-8')
        a = re.split('(§)', conv)
        a = a[1:len(a)]

        for i in range(1, len(a) - 1):
            o_dict[a[i]] = a[i+1]
    print o_dict


Thanks!

GWW
Answer

From my understanding of your code, your loop is incorrect. You want to pair the first § with the first text entry. You also want to step over the § elements rather than visit them as separate entries, so the loop needs a step of 2. Finally, you may want to strip whitespace from the start and end of the text.

for i in range(1, len(a), 2):
    o_dict["{}{}".format(a[i - 1], i / 2 + 1)] = a[i].strip()
print o_dict 

for k, v in o_dict.iteritems():
    print k.decode('utf-8'), v

Output:

OrderedDict([('\xc2\xa71', 'text for chapter 1'), ('\xc2\xa72', 'text for chapter 2'), ('\xc2\xa73', 'text for chapter 3')])

§1 text for chapter 1
§2 text for chapter 2
§3 text for chapter 3
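
To see why a step of 2 pairs things up, it can help to look at the list that re.split returns when the pattern contains a capturing group. This is only a small sketch, assuming Python 3 and the sample line from the question inlined as a string (the names `line` and `parts` are illustrative and not from the original code):

```python
import re

# Sample line from the question, inlined so the sketch is self-contained.
line = "§1 text for chapter 1 §2 text for chapter 2 §3 text for chapter 3"

# The capturing group makes re.split keep the delimiter in the result.
parts = re.split("(§)", line)

# parts[0] is the empty string before the first §, so drop it.
a = parts[1:]

# Even indices hold the delimiter, odd indices hold the text after it.
for i in range(1, len(a), 2):
    print(i, repr(a[i - 1]), repr(a[i]))
```

Each § sits at an even index of `a` and the text that follows it at the next odd index, so stepping by 2 visits every pair exactly once. Note that with this sample line the chapter number stays at the start of each text piece.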

Edit: I changed my code to reflect the edits to the OP's question.
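
For readers on Python 3, here is a hedged alternative sketch rather than the code above. It assumes every section starts with § followed by digits, and it splits on the whole marker (`§\d+`) so that the number ends up in the key instead of in the text. The helper name `split_sections` is made up for illustration:

```python
import re
from collections import OrderedDict


def split_sections(text):
    """Map each '§<number>' marker to the text that follows it, in order."""
    # The capturing group keeps each marker in the result list.
    parts = re.split(r"(§\d+)", text)
    # parts[0] is whatever precedes the first marker, so skip it.
    markers = parts[1::2]
    bodies = parts[2::2]
    return OrderedDict((m, b.strip()) for m, b in zip(markers, bodies))


# Sample line from the question, inlined instead of read from a file.
sample = "§1 text for chapter 1 §2 text for chapter 2 §3 text for chapter 3"
sections = split_sections(sample)
for key, value in sections.items():
    print(key, value)
```

In Python 3 the strings are already Unicode, so no encode/decode step is needed. To read the real file, something like `open('practice_split.txt', encoding='utf-8').read()` could be passed to the helper in place of `sample`.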
