String Encoding and Decomposition

Created: November-22, 2018

A Swift String is made up of Unicode code points. It can be decomposed and encoded in several different ways.

let str = "ที่👌①!"

Decomposing Strings

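A string's characters are Unicode extended grapheme clusters. (In Swift 4 and later, String is itself a collection of Character values, so Array(str) gives the same result and the characters view is deprecated.)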

Array(str.characters)  // ["ที่", "👌", "①", "!"]

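unicodeScalars are the Unicode code points that make up the string (note that ที่ is a single grapheme cluster made of three code points, 3607, 3637, and 3656, so the resulting array has a different length than characters):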

str.unicodeScalars.map{ $0.value }  // [3607, 3637, 3656, 128076, 9312, 33]
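
To make the relationship between grapheme clusters and code points concrete, the following is a minimal sketch that lists the scalar values inside each character of the same string. It assumes Swift 5, where String can be iterated directly and Character exposes unicodeScalars; the name breakdown is only illustrative.

```swift
// Assumption: Swift 5, where String is itself a collection of Character.
let str = "ที่👌①!"

// For each Character (grapheme cluster), collect the code points it contains.
let breakdown: [[UInt32]] = str.map { ch in
    ch.unicodeScalars.map { $0.value }
}

print(breakdown)  // [[3607, 3637, 3656], [128076], [9312], [33]]
```

The first inner array has three entries, which is why the character count and the scalar count of this string differ.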

You can encode and decompose a string as UTF-8 (a sequence of UInt8 values) or UTF-16 (a sequence of UInt16 values):

Array(str.utf8)   // [224, 184, 151, 224, 184, 181, 224, 185, 136, 240, 159, 145, 140, 226, 145, 160, 33]
Array(str.utf16)  // [3607, 3637, 3656, 55357, 56396, 9312, 33]
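
Going the other way, the code units can be turned back into a String. This is a small sketch assuming Swift 4 or later, where the standard library provides String(decoding:as:); the variable names are illustrative only.

```swift
let str = "ที่👌①!"

// Encode: copy the code units out of the string's views.
let utf8Bytes: [UInt8] = Array(str.utf8)
let utf16Units: [UInt16] = Array(str.utf16)

// Decode: rebuild a String from the raw code units.
// Invalid sequences would be repaired with U+FFFD rather than causing a failure.
let fromUTF8 = String(decoding: utf8Bytes, as: UTF8.self)
let fromUTF16 = String(decoding: utf16Units, as: UTF16.self)

print(fromUTF8 == str)   // true
print(fromUTF16 == str)  // true
```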

String Length and Iteration

A string's characters, unicodeScalars, utf8, and utf16 views are all Collections, so you can get their count and iterate over them:

// NOTE: These operations are NOT necessarily fast/cheap! 

str.characters.count     // 4
str.unicodeScalars.count // 6
str.utf8.count           // 17
str.utf16.count          // 7

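for c in str.characters { /* ... */ }         // Character values
for u in str.unicodeScalars { /* ... */ }     // Unicode.Scalar values
for byte in str.utf8 { /* ... */ }            // UInt8 code units
for unit in str.utf16 { /* ... */ }           // UInt16 code units (not bytes)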
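
As a rough illustration of how the different counts relate, the sketch below walks the string one character at a time and records how many UTF-8 bytes each character needs. It assumes Swift 5 (iterating the String directly); byteCounts is a hypothetical name used only for this example.

```swift
// Assumption: Swift 5, where String is itself a collection of Character.
let str = "ที่👌①!"

// UTF-8 byte count of each Character, in order.
let byteCounts = str.map { String($0).utf8.count }

for (offset, ch) in str.enumerated() {
    print(offset, ch, byteCounts[offset])
}

print(byteCounts)               // [9, 4, 3, 1]
print(byteCounts.reduce(0, +))  // 17, the same as str.utf8.count
```

Because characters vary in size like this, counting them generally requires walking the whole string, which is the reason for the note above that these operations are not necessarily cheap.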