Using collections.Counter To Count Emojis With Different Colors

I would like to use the collections.Counter class to count emojis in a string. It generally works fine, however, when I introduce colored emojis the color component of the emoji is

Solution 1:

You'll have to split your string into separate clusters. Counter iterates over a string one codepoint at a time, and each of your emoji is really two codepoints: the base emoji followed by an EMOJI MODIFIER FITZPATRICK TYPE-x codepoint that sets the skin tone:

>>> print(emoji_string[0])
👌
>>> print(emoji_string[1])
🏻
>>> print(emoji_string[:2])
👌🏻
>>> print(ascii(emoji_string[:2]))
'\U0001f44c\U0001f3fb'
>>> import unicodedata
>>> unicodedata.name(emoji_string[1])
'EMOJI MODIFIER FITZPATRICK TYPE-1-2'
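
To make the miscount concrete, here is a minimal sketch of what a plain Counter over the string produces. The string is rebuilt from escapes so the codepoints are explicit; the name `naive` is only illustrative and not part of the original question.

```python
from collections import Counter
import unicodedata

# Five OK HAND SIGN emoji, each followed by a different skin-tone
# modifier (U+1F3FB through U+1F3FF).
emoji_string = "".join("\U0001f44c" + chr(cp) for cp in range(0x1F3FB, 0x1F400))

# Counter walks the string one codepoint at a time, so the base emoji
# and every modifier end up as separate keys.
naive = Counter(emoji_string)

for char, count in naive.items():
    print(ascii(char), unicodedata.name(char), count)
```

The base emoji is tallied five times and each modifier once, which is why the colors appear to be split off from the emoji.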

You could use a regular expression to keep each modifier attached to the emoji that precedes it:

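import re

# Any single character, optionally followed by one of the five
# skin-tone modifiers (U+1F3FB through U+1F3FF).
char_with_modifier = re.compile(r'(.[\U0001f3fb-\U0001f3ff]?)')
split_emoji = char_with_modifier.findall(emoji_string)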

and count the result.

Demo:

>>> import re
>>> from collections import Counter
>>> emoji_string = "👌🏻👌🏼👌🏽👌🏾👌🏿"
>>> char_with_modifier = re.compile(r'(.[\U0001f3fb-\U0001f3ff]?)')
>>> Counter(char_with_modifier.findall(emoji_string))
Counter({'👌🏻': 1, '👌🏼': 1, '👌🏽': 1, '👌🏾': 1, '👌🏿': 1})
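
Note that the `.` in this pattern matches any character, not just emoji. The sketch below applies the same pattern to a mixed string to show how repeats are tallied and that plain text is counted too; the `sample` string is a made-up input, not taken from the question.

```python
import re
from collections import Counter

char_with_modifier = re.compile(r'(.[\U0001f3fb-\U0001f3ff]?)')

# Hypothetical input: two light-skin OK hands, one dark-skin OK hand,
# one OK hand without a modifier, then some ordinary text.
sample = (
    "\U0001f44c\U0001f3fb" * 2
    + "\U0001f44c\U0001f3ff"
    + "\U0001f44c"
    + " ok"
)

counts = Counter(char_with_modifier.findall(sample))
print(counts)
```

If only emoji should be counted, the non-emoji keys would need to be filtered out afterwards. The pattern also only knows about skin-tone modifiers, so other multi-codepoint emoji sequences would still be split.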

Solution 2:

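import regex  # third-party module (pip install regex), not the built-in re
from collections import Counter
emoji_string = "👌🏻👌🏼👌🏽👌🏾👌🏿"
# \X matches one extended grapheme cluster, so each emoji keeps its modifier
data = regex.findall(r'\X', emoji_string)
print(Counter(data))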

Expected output

Counter({'👌🏻': 1, '👌🏼': 1, '👌🏽': 1, '👌🏾': 1, '👌🏿': 1})
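
Since `regex` is a third-party package and may not be installed everywhere, one possible arrangement is to prefer `\X` when it is available and otherwise fall back to the narrower pattern from Solution 1. This is only a sketch: `split_clusters` and `FALLBACK` are hypothetical names, and the fallback is an assumption of this example rather than part of either answer.

```python
import re
from collections import Counter

try:
    import regex  # third-party package, assumed installed via `pip install regex`
except ImportError:
    regex = None

# Fallback: the Solution 1 pattern (any character plus an optional
# skin-tone modifier). It only approximates grapheme clusters.
FALLBACK = re.compile(r".[\U0001f3fb-\U0001f3ff]?")


def split_clusters(text):
    """Hypothetical helper: split text into emoji-plus-modifier units."""
    if regex is not None:
        # \X matches one extended grapheme cluster.
        return regex.findall(r"\X", text)
    return FALLBACK.findall(text)


# Five OK HAND SIGN emoji with the five skin-tone modifiers.
emoji_string = "".join("\U0001f44c" + chr(cp) for cp in range(0x1F3FB, 0x1F400))
counts = Counter(split_clusters(emoji_string))
print(counts)
```

For the string in the question both branches should yield the same five keys; they can differ on input that contains other combining sequences, where `\X` is the more general choice.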
