I am trying to extract n characters from a string using
mb_convert_encoding($mystring, "UTF-8", "Windows-1252")
UTF-8 is a variable-width encoding: it uses multi-byte sequences to extend beyond ASCII and accommodate many more characters. (Surrogates are a UTF-16 concept and do not apply here.)
A single UTF-8 character may be coded into one, two, three or four bytes, depending on the character.
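To see the variable width for yourself, here is a small sketch. It assumes PHP 7 or later (for the `\u{...}` escapes) and the mbstring extension; the helper name `utf8Widths` is made up for illustration.

```php
<?php
// Hypothetical helper: returns [bytes, characters] for a UTF-8 string.
function utf8Widths(string $s): array
{
    // strlen() counts bytes; mb_strlen() counts characters in the given encoding.
    return [strlen($s), mb_strlen($s, 'UTF-8')];
}

// One sample per width: "A", "é" (U+00E9), "€" (U+20AC), and U+1F600.
$samples = ["A", "\u{E9}", "\u{20AC}", "\u{1F600}"];

foreach ($samples as $ch) {
    [$bytes, $chars] = utf8Widths($ch);
    printf("%s: %d byte(s), %d character(s)\n", $ch, $bytes, $chars);
}
```

Each sample is a single character, but the byte count grows from one to four.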
You cut the string right in the middle of a multi-byte character:
    [<-character->]
    [byte-0|byte-1]
           ^
           You cut the string right here in the middle!

    [<-----character---->]
    [byte-0|byte-1|byte-2]
           ^      ^
           Or anywhere here if it's 3 bytes long.
So the decoder has the first byte(s) but can't read the entire character because the string ends prematurely.
This causes all the effects you are witnessing.
The solution to this problem is in Dezza's answer.
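For illustration, here is a minimal sketch of the failure and of two common ways around it. This is not necessarily what the linked answer proposes. It assumes the string is already valid UTF-8, PHP 7 or later, and the mbstring extension; the sample string and variable names are made up.

```php
<?php
// "é" is U+00E9, which UTF-8 encodes as the two bytes 0xC3 0xA9.
$s = "caf\u{E9}s";   // 5 characters, 6 bytes

// substr() works on bytes: 4 bytes is "caf" plus only the first half of "é".
$broken = substr($s, 0, 4);

// mb_substr() works on characters: 4 characters is "café".
$byChar = mb_substr($s, 0, 4, 'UTF-8');

// mb_strcut() takes a byte budget but backs off to a character boundary.
$byByte = mb_strcut($s, 0, 4, 'UTF-8');

echo bin2hex($broken), "\n";                    // dangling lead byte c3 at the end
var_dump(mb_check_encoding($broken, 'UTF-8'));  // bool(false): truncated sequence
var_dump(mb_check_encoding($byChar, 'UTF-8'));  // bool(true)
var_dump(mb_check_encoding($byByte, 'UTF-8'));  // bool(true)
```

Use `mb_substr()` when the limit is a number of characters, and `mb_strcut()` when the limit is a number of bytes (for example a database column size).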