Alex Alex - 1 month ago 7
Python Question

Sorting a list of strings based on the first element from split (datetime)

I have a long list of comma-separated strings (basically CSV files read line by line into strings, without splitting on the separator):

lines[0] = "2017-08-01 13:45:58,mytext,mytext2,mytext3,etc"
lines[1] = "2017-08-01 15:45:58,mytextx,mytext2x,mytext3x,etcx"
lines[2] = "2017-08-01 19:45:58,mytexty,mytext2y,mytext3y,etcy"
lines[3] = "..."


From this post I know that the following code should work if my lines consisted only of datetimes:

lines_sorted = sorted(lines, key=lambda x: datetime.datetime.strptime(x, '%Y-%m-%d %H:%M:%S'))


I thought I could use partition to extract tuples from all lines in the files, where the first element contains the datetime part:

for unsortedFile in glob('*.txt'):
    with open(unsortedFile, 'r') as file:
        lines = [line.rstrip('\n').partition(',') for line in file]
    lines_sorted = sorted(lines, key=lambda x: datetime.datetime.strptime(lines[0], '%Y-%m-%d %H:%M:%S'))


...but of course this does not work ("TypeError: list indices must be integers or slices, not str"), because lines[0] inside the lambda does not refer to the first element of the current tuple but to the first item of the lines list. I also tried .strptime(lines[lambda][0], '%Y-%m-%d %H:%M:%S')) but that does not work either.

I know I am doing something wrong. Any help is much appreciated.

[edit]
Here's the answer, from friendly comments below:

from glob import glob

for unsortedFile in glob('*.txt'):
    with open(unsortedFile, 'r', encoding="utf8") as file:  # read each unsorted file into lines (list)
        lines = [line.rstrip('\n') for line in file]
    # sort by the text before the first comma (the datetime part)
    lines_sorted = sorted(lines,
                          key=lambda x: x.split(',', maxsplit=1)[0]
                          )
    lines.clear()
    with open(unsortedFile, 'w', encoding="utf8") as file:  # overwrite file
        for line in lines_sorted:
            file.write(line + '\n')

Answer Source

Basically, the key argument of the sorted function must be a function that takes a list item and returns a comparable object.
The sorted function then orders the list by the values this function returns for the items (their image under the function), not by the items themselves.
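
To make that concrete, here is a minimal sketch. The sample lines and the first_field helper name are made up for illustration and are not from the question. It shows the keys that sorted compares, and that the result still holds the original, unmodified strings:

```python
# Hypothetical sample data in the same shape as the question's lines.
lines = [
    "2017-08-01 19:45:58,c",
    "2017-08-01 13:45:58,a",
    "2017-08-01 15:45:58,b",
]


def first_field(line):
    """Return the text before the first comma (the datetime part)."""
    return line.split(",", maxsplit=1)[0]


# These keys are what sorted() actually compares...
keys = [first_field(line) for line in lines]

# ...but the returned list contains the original strings, reordered.
lines_sorted = sorted(lines, key=first_field)

for line in lines_sorted:
    print(line)
```

Comparing the keys as plain strings is enough here because zero-padded timestamps in year-month-day order sort the same way alphabetically as they do chronologically.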

Here is an example, which is a mix of the suggested solutions:

lines_sorted = sorted(lines,
                      key=lambda x: x.split(',', maxsplit=1)[0]
                     )

With this code, all items that share the same datetime (the text before the first comma) are considered equal by sorted, so they keep their original relative order.
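
If the timestamps were in a format that does not sort correctly as text, the key could parse the first field into a real datetime, which is what the question first attempted. The following is only a sketch under that assumption. The sample lines and the parse_key name are hypothetical. It also shows that lines with equal keys stay in their input order, because sorted is a stable sort:

```python
import datetime

# Hypothetical sample data; two lines share the same timestamp.
lines = [
    "2017-08-01 15:45:58,second",
    "2017-08-01 13:45:58,first-a",
    "2017-08-01 13:45:58,first-b",
]


def parse_key(line):
    """Parse the text before the first comma into a datetime object."""
    stamp, _, _ = line.partition(",")
    return datetime.datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")


# The lambda in the question failed because it indexed the whole list;
# a key function only ever receives one item at a time.
lines_sorted = sorted(lines, key=parse_key)

for line in lines_sorted:
    print(line)
```

Parsing costs more than a plain string comparison, so for the format in the question the split-based key is the simpler choice.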