Etan Etan - 1 month ago 19
Swift Question

Compatibility of SubSequence indices

For most Swift Collections, indices of a Collection's SubSequence are compatible for use with the base Collection.

func foo<T: Collection>(_ buffer: T) -> T.Iterator.Element
    where T.Index == T.SubSequence.Index
{
    let start = buffer.index(buffer.startIndex, offsetBy: 2)
    let end = buffer.index(buffer.startIndex, offsetBy: 3)
    let sub = buffer[start ... end]
    return buffer[sub.startIndex]
}


This works fine for most collections:

print(foo([0, 1, 2, 3, 4])) // 2


And even for String.UTF8View:

print(foo("01234".utf8) - 0x30 /* ASCII 0 */) // 2


But when using String.CharacterView, things start breaking:

print(foo("01234".characters)) // "0"


For the CharacterView, SubSequences create completely independent instances, i.e. the Index starts again at 0. To convert a SubSequence index back to an index into the main String, one has to compute its offset with the distance function and add that offset to the startIndex of the SubSequence in the main String.

func foo<T: Collection>(_ buffer: T) -> T.Iterator.Element
    where T.Index == T.SubSequence.Index,
          T.SubSequence: Collection,
          T.SubSequence.IndexDistance == T.IndexDistance
{
    let start = buffer.index(buffer.startIndex, offsetBy: 2)
    let end = buffer.index(buffer.startIndex, offsetBy: 3)
    let sub = buffer[start ... end]

    let subIndex = sub.startIndex
    let distance = sub.distance(from: sub.startIndex, to: subIndex)
    let bufferIndex = buffer.index(start, offsetBy: distance)
    return buffer[bufferIndex]
}


With this, all three examples now correctly print 2.
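
The offset-based conversion above could be factored into a reusable helper. The following is only a sketch, assuming Swift 5 syntax; the name `translate(_:from:startingAt:in:)` and its parameter labels are hypothetical and not part of the standard library.

```swift
// Hypothetical helper (not a standard library API): maps an index of a slice
// to the corresponding index of the base collection by going through the
// offset from the slice's start, instead of reusing the slice index directly.
func translate<C: Collection>(
    _ sliceIndex: C.SubSequence.Index,
    from slice: C.SubSequence,
    startingAt start: C.Index,
    in base: C
) -> C.Index {
    // Offset of the index within the slice.
    let offset = slice.distance(from: slice.startIndex, to: sliceIndex)
    // Apply the same offset in the base, starting where the slice begins.
    return base.index(start, offsetBy: offset)
}

let digits = [10, 11, 12, 13, 14]
let start = digits.index(digits.startIndex, offsetBy: 2)
let slice = digits[start...]
let sliceIndex = slice.index(after: slice.startIndex)
let idx = translate(sliceIndex, from: slice, startingAt: start, in: digits)
print(digits[idx]) // 13
```

The caller has to supply the base index at which the slice starts, because a slice with independent indices does not carry that information itself.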





  • Why are String SubSequence indices not compatible with their base String? As long as everything is immutable, it doesn't make sense to me why Strings are a special case, even with all the Unicode stuff. I've also noticed that substring functions return Strings and not Slices as most other collections do. However, substring operations are still documented to return in O(1). Strange magic.

  • Is there a way to constrain a generic function so that it only accepts collections whose SubSequence indices are compatible with the base Collection?

  • Can one even assume that SubSequence indices are compatible for non-String collections, or is this just a coincidence, and one should always use distance(from:to:) to convert indices?


Answer

This has been discussed on swift-evolution, filed as bug report SR-1927 ("Subsequences of String Views don’t behave correctly"), and was recently fixed by a commit to StringCharacterView.swift.

With that fix, String.CharacterView behaves like other collections: its slices use the same indices for the same elements as the original collection.
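
As an illustration of the fixed behaviour, here is a minimal sketch assuming Swift 5 or later, where String is itself a Collection of Characters and the Collection protocol requires slices to share indices with their base. The function name `elementAtSliceStart` is made up for this example.

```swift
// Assumes Swift 5+: Collection guarantees SubSequence.Index == Index,
// so no extra `where` clause is needed.
func elementAtSliceStart<C: Collection>(_ buffer: C) -> C.Element {
    let start = buffer.index(buffer.startIndex, offsetBy: 2)
    let end = buffer.index(buffer.startIndex, offsetBy: 3)
    let sub = buffer[start...end]
    // The slice's startIndex is used directly on the base collection.
    return buffer[sub.startIndex]
}

print(elementAtSliceStart([0, 1, 2, 3, 4]))        // 2
print(elementAtSliceStart("01234".utf8) - 0x30)    // 2
print(elementAtSliceStart("01234"))                // 2
```

All three calls return the element at offset 2 of the base, so the distance-based conversion is not needed for these collections.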
