Swift 3: How to filter emoji out of a string

http://stackoverflow.com/questions/38408645/swift-regex-to-match-unicodes

The Unicode code point of the emoji you have shown is U+1F600.

(Unicode 9.0 Character Code Charts - Emoticons)

Your regex pattern [\uD800-\uDBFF\uDC00-\uDFFF] (which may work on a UTF-16 representation) matches every non-BMP character, U+10000...U+10FFFF. That range contains most emoji, but it also contains a huge number of non-emoji characters.

So, since you say "[\uD800-\uDBFF\uDC00-\uDFFF]" was working, the equivalent pattern in NSRegularExpression is "[\\U00010000-\\U0010FFFF]".

import Foundation

let s = "😀 emoji 😀"
let regex = try! NSRegularExpression(pattern: "[\\U00010000-\\U0010FFFF]", options: [])
// Swift 3 spelling of stringByReplacingMatchesInString(_:options:range:withTemplate:)
let replaced = regex.stringByReplacingMatches(in: s, options: [], range: NSRange(0..<s.utf16.count), withTemplate: "*") //->"* emoji *"
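To make the caveat about this pattern concrete, here is a minimal sketch. The helper name `mask` and the two sample strings are assumptions chosen for illustration, not part of the original answer. It suggests that the pattern also replaces a non-emoji character outside the BMP, while leaving an emoji that lives inside the BMP untouched:

```swift
import Foundation

// Same pattern as above: any code point outside the Basic Multilingual Plane.
let nonBMP = try! NSRegularExpression(pattern: "[\\U00010000-\\U0010FFFF]", options: [])

// Hypothetical helper (not from the original answer): replace every match with "*".
func mask(_ text: String) -> String {
    let range = NSRange(location: 0, length: text.utf16.count)
    return nonBMP.stringByReplacingMatches(in: text, options: [], range: range, withTemplate: "*")
}

// U+1D11E MUSICAL SYMBOL G CLEF: outside the BMP, but not an emoji.
let clef = mask("G clef: \u{1D11E}")
// U+2764 HEAVY BLACK HEART: an emoji, but inside the BMP.
let heart = mask("heart: \u{2764}")

print(clef)   // G clef: *
print(heart)  // heart: ❤ (unchanged)
```

If only emoji should be filtered, the character class would need to be narrowed to the specific emoji blocks you care about; which blocks to include is a design decision this sketch does not make.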

(Addition) To see Unicode code points in your string literal:

s.unicodeScalars.forEach {
    print(String(format: "U+%04X ", Int($0.value)))
}

For your example string, I get:

U+1F600
U+0020
U+0065
U+006D
U+006F
U+006A
U+0069
U+0020
U+1F600
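As a companion to the code point dump, the following sketch looks at the same string through its UTF-16 view, which is the unit NSRange counts in. The variable names are assumptions for illustration. It shows why the surrogate-range pattern from the question could appear to work: U+1F600 is stored as a high/low surrogate pair.

```swift
// Same string as the example, written with escapes so the source stays ASCII.
let s = "\u{1F600} emoji \u{1F600}"

// Code points (Unicode scalars) versus UTF-16 code units.
let scalarCount = s.unicodeScalars.count
let utf16Count = s.utf16.count

// Hex values of the two UTF-16 code units that encode the first character.
let firstUnits = Array(s.utf16.prefix(2)).map { String($0, radix: 16, uppercase: true) }

print(scalarCount)  // 9
print(utf16Count)   // 11
print(firstUnits)   // ["D83D", "DE00"]
```

Each U+1F600 counts as one scalar but two UTF-16 code units, which is why the range passed to NSRegularExpression is built from s.utf16.count rather than from the number of scalars.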

 

 
