Peter Shaw - 1 month ago
Swift Question

Reading a string char by char is very slow in my Swift implementation

I have to read a file character by character in Swift. My approach is to read a chunk from a FileHandle into a string and then return (and remove) the first character of that string on each call.

This is my code so far:

/// Return next character, or nil on EOF.
func nextChar() -> Character? {
    precondition(fileHandle != nil, "Attempt to read from closed file")

    if atEof {
        return nil
    }

    if self.stored.characters.count > 0 {
        let c: Character = self.stored.characters.first!
        stored.remove(at: self.stored.startIndex)
        return c
    }

    let tmpData = fileHandle.readData(ofLength: (4096))
    print("\n---- file read ---\n" , terminator: "")
    if tmpData.count == 0 {
        return nil
    }

    self.stored = NSString(data: tmpData, encoding: encoding.rawValue) as String!
    let c: Character = self.stored.characters.first!
    self.stored.remove(at: stored.startIndex)
    return c
}


My problem is that returning a character this way is very slow.
This is my test implementation:

if let aStreamReader = StreamReader(path: file) {
    defer {
        aStreamReader.close()
    }
    while let char = aStreamReader.nextChar() {
        print("\(char)", terminator: "")
        continue
    }
}


Even without the print, it takes ages to read the file to the end.

For a 1.4 MB sample file, it took more than six minutes to finish the task:

time ./.build/debug/read a.txt
real 6m22.218s
user 6m13.181s
sys 0m2.998s


Do you have any suggestions for speeding up this part?

let c: Character = self.stored.characters.first!
stored.remove(at: self.stored.startIndex)
return c


Thanks a lot.
ps

++++ UPDATED FUNCTION ++++

func nextChar() -> Character? {
    //precondition(fileHandle != nil, "Attempt to read from closed file")

    if atEof {
        return nil
    }

    if stored_cnt > (stored_idx + 1) {
        stored_idx += 1
        return stored[stored_idx]
    }

    let tmpData = fileHandle.readData(ofLength: (chunkSize))
    if tmpData.count == 0 {
        atEof = true
        return nil
    }

    if let s = NSString(data: tmpData, encoding: encoding.rawValue) as String! {
        stored = s.characters.map { $0 }
        stored_idx = 0
        stored_cnt = stored.count
    }
    return stored[0]
}

Answer

Your implementation of nextChar is terribly inefficient.

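You create a String, then call characters on it for every character you return, and you also mutate that string on every call. Both characters.count and removing the first element take time proportional to the length of the string, so draining a single chunk costs quadratic time.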

Why not create the String once and store only a reference to its characters, then track an index into them? Instead of modifying the string on every call, simply increment the index and return the next character.

Once you get to the last character, read the next piece of the file. Create a new string, reset the characters and the index.
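
As a rough illustration of the approach described above, here is a minimal sketch. The class name ChunkedCharReader and its property names are hypothetical and do not come from the question's StreamReader. It also makes two assumptions: the file is UTF-8, and a chunk boundary never falls in the middle of a character (true for plain ASCII text, not for multi-byte sequences or a "\r\n" pair split across two chunks). It copies each chunk's characters into an array, much like the updated function in the question, rather than holding a reference to the string's character view.

```swift
import Foundation

/// Hypothetical reader: decodes one chunk at a time and walks it with an index.
final class ChunkedCharReader {
    private let fileHandle: FileHandle
    private let chunkSize: Int
    private var buffer: [Character] = []  // characters of the current chunk
    private var index = 0                 // position of the next character to return
    private var atEof = false

    init?(path: String, chunkSize: Int = 4096) {
        guard let handle = FileHandle(forReadingAtPath: path) else { return nil }
        self.fileHandle = handle
        self.chunkSize = chunkSize
    }

    /// Return the next character, or nil on EOF.
    func nextChar() -> Character? {
        // Refill only when the current chunk is exhausted.
        while index >= buffer.count {
            if atEof { return nil }
            let data = fileHandle.readData(ofLength: chunkSize)
            if data.isEmpty {
                atEof = true
                return nil
            }
            // Assumption: UTF-8 input with no character split across chunks.
            // Invalid byte sequences are decoded as U+FFFD.
            buffer = Array(String(decoding: data, as: UTF8.self))
            index = 0
        }
        // Constant-time step: no string mutation, no recounting.
        let c = buffer[index]
        index += 1
        return c
    }

    func close() {
        fileHandle.closeFile()
    }
}
```

Each call now does constant work except when a new chunk has to be read and decoded, so the total cost grows linearly with the file size.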