Kirill Kirill - 24 days ago 50
Swift Question

Decode HTML string in Swift 3

How can I decode my HTML string from:

<span>Bj&ouml;rn</span>


to

<span>Björn</span>


in Swift 3?

Rob Rob
Answer

Do you really need to preserve the <span> tags while replacing the &ouml; entity? One technique, suggested by Leo Dabus in "Convert Unicode symbol or its XML/HTML entities into its Unicode number in Swift", converts the entities by round-tripping the string through an attributed string.

In Swift 3:

extension String {
    /// Parses the string as an HTML document and returns its plain text.
    /// Returns nil if the string cannot be encoded as UTF-8; throws if parsing fails.
    func convertHtmlSymbols() throws -> String? {
        guard let data = data(using: .utf8) else { return nil }

        return try NSAttributedString(data: data,
                                      options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
                                                NSCharacterEncodingDocumentAttribute: String.Encoding.utf8.rawValue],
                                      documentAttributes: nil).string
    }
}

Or, in Swift 2:

extension String {
    /// Parses the string as an HTML document and returns its plain text.
    /// Returns nil if the string cannot be encoded as UTF-8; throws if parsing fails.
    func convertHtmlSymbols() throws -> String? {
        guard let data = dataUsingEncoding(NSUTF8StringEncoding) else { return nil }

        return try NSAttributedString(data: data,
                                      options: [NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType,
                                                NSCharacterEncodingDocumentAttribute: NSUTF8StringEncoding],
                                      documentAttributes: nil).string
    }
}

This converts Bj&ouml;rn to Björn, but it also strips out the HTML tags.
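If the tags do need to survive, one alternative is to replace the entities directly instead of parsing the string as HTML. The following is only a rough sketch of that idea, not the technique from the linked answer. It assumes a current Swift toolchain (Swift 5), and the names namedEntities, decodeEntity and decodingHTMLEntities are made up for illustration. The entity table is a tiny, hand-picked subset of the named entities HTML defines; a real implementation would need the full list.

```swift
// Hypothetical lookup table: a small illustrative subset of HTML named entities.
let namedEntities: [String: Character] = [
    "amp": "&", "lt": "<", "gt": ">", "quot": "\"", "apos": "'",
    "auml": "ä", "ouml": "ö", "uuml": "ü",
    "Auml": "Ä", "Ouml": "Ö", "Uuml": "Ü",
    "szlig": "ß", "nbsp": "\u{00A0}"
]

// Resolves the text between "&" and ";" (for example "ouml", "#246" or "#xF6").
// Returns nil when the entity is unknown or malformed.
func decodeEntity(_ name: Substring) -> Character? {
    if name.hasPrefix("#x") || name.hasPrefix("#X") {
        // Hexadecimal numeric reference such as &#xF6;
        guard let code = UInt32(name.dropFirst(2), radix: 16),
              let scalar = Unicode.Scalar(code) else { return nil }
        return Character(scalar)
    }
    if name.hasPrefix("#") {
        // Decimal numeric reference such as &#246;
        guard let code = UInt32(name.dropFirst(), radix: 10),
              let scalar = Unicode.Scalar(code) else { return nil }
        return Character(scalar)
    }
    return namedEntities[String(name)]
}

// Replaces recognised entities and copies everything else, including tags, verbatim.
func decodingHTMLEntities(_ input: String) -> String {
    var result = ""
    var rest = input[...]
    while let amp = rest.firstIndex(of: "&") {
        // Copy the text before the ampersand unchanged.
        result.append(contentsOf: rest[..<amp])
        let afterAmp = rest.index(after: amp)
        if let semi = rest[afterAmp...].firstIndex(of: ";"),
           let decoded = decodeEntity(rest[afterAmp..<semi]) {
            result.append(decoded)
            rest = rest[rest.index(after: semi)...]
        } else {
            // Not a recognised entity: keep the ampersand as it is.
            result.append("&")
            rest = rest[afterAmp...]
        }
    }
    result.append(contentsOf: rest)
    return result
}

print(decodingHTMLEntities("<span>Bj&ouml;rn</span>"))  // <span>Björn</span>
```

Because unrecognised sequences are left alone, an input such as AT&T passes through unchanged, and each entity is decoded only once, so &amp;lt; becomes &lt; rather than <.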
