Julien Marcou edited this page Oct 28, 2021 · 12 revisions

Unicode Emoji Wiki

Data source

Component metadata (skin tones and hair styles):

  • code point
  • group
  • subgroup
  • version

And emoji metadata:

  • code points
  • group
  • subgroup
  • version
  • relationship between a base emoji and its variations
  • components used by a variation (skin tone & hair style)

All of this metadata is retrieved from the fully-qualified emojis and components listed in the emoji-test.txt file, which is available on Unicode's website (https://unicode.org/Public/emoji/).
Version 14.0 has been used. Because the file's structure has changed over time, versions prior to 12.1 cannot produce the complete data set.
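
To illustrate what a fully-qualified entry looks like, here is a minimal sketch of parsing one data line of emoji-test.txt. It assumes the line layout found in recent versions of the file (code points, status, then a comment holding the emoji, its version and its name). parseEmojiTestLine is a hypothetical helper written for this example, not the function used by this repository.

```javascript
// Hypothetical helper: parses one data line of emoji-test.txt.
// Assumed layout: "<code points> ; <status> # <emoji> E<version> <name>"
function parseEmojiTestLine(line) {
  const pattern = /^([0-9A-F ]+?)\s*;\s*([a-z-]+)\s*#\s*(\S+)\s+E(\d+\.\d+)\s+(.+)$/;
  const match = pattern.exec(line);
  if (!match) {
    return null; // Blank line, comment, or "# group:" / "# subgroup:" header
  }
  const [, codePoints, status, emoji, version, name] = match;
  return { codePoints: codePoints.split(' '), status, emoji, version, name };
}

// Sample line written by hand for this example
const sampleLine =
  '1F3F4 200D 2620 FE0F ; fully-qualified # \u{1F3F4}\u{200D}\u{2620}\u{FE0F} E11.0 pirate flag';
const entry = parseEmojiTestLine(sampleLine);
console.log(entry.codePoints); // ['1F3F4', '200D', '2620', 'FE0F']
console.log(entry.status); // fully-qualified
```

Lines that are not data lines (such as the group and subgroup headers) return null, so a real parser would need to track those headers separately.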

Component and emoji translations:

  • text-to-speech (description)
  • keywords

These translations are retrieved from the common/annotations/en.xml and common/annotationsDerived/en.xml files, which are available in Unicode's CLDR (Common Locale Data Repository) (https://github.com/unicode-org/cldr).
The release-40 tag has been used so that the file structure stays stable over time.
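
As a rough illustration, the sketch below extracts descriptions and keywords from annotation elements. It assumes the element layout I understand the CLDR annotation files to use (a cp attribute, an optional type="tts" attribute for the description, and keywords separated by |). The sample XML is written by hand, and parseAnnotations is a hypothetical helper based on a regular expression rather than a real XML parser.

```javascript
// Hypothetical helper: collects text-to-speech descriptions and keywords
// from CLDR-style <annotation> elements.
function parseAnnotations(xml) {
  const annotations = new Map();
  const pattern = /<annotation cp="([^"]+)"( type="tts")?>([^<]*)<\/annotation>/g;
  for (const [, emoji, isTextToSpeech, text] of xml.matchAll(pattern)) {
    const entry = annotations.get(emoji) || { description: null, keywords: [] };
    if (isTextToSpeech) {
      entry.description = text.trim();
    } else {
      entry.keywords = text.split('|').map(keyword => keyword.trim());
    }
    annotations.set(emoji, entry);
  }
  return annotations;
}

// Hand-written sample, assumed to mirror the structure of en.xml
const sampleXml = `
  <annotation cp="\u{1F600}">face | grin | grinning face</annotation>
  <annotation cp="\u{1F600}" type="tts">grinning face</annotation>
`;
const grinning = parseAnnotations(sampleXml).get('\u{1F600}');
console.log(grinning.description); // grinning face
console.log(grinning.keywords); // ['face', 'grin', 'grinning face']
```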

Data consolidation

The data has been consolidated to make it more useful.

An additional metadata field named category has been added to each emoji. It reflects how emojis are conventionally grouped on mobile devices, which makes the groups more balanced. The grouping differs slightly from Android's but is quite similar to it. If you don't like it, you can build your own grouping logic from the original group and subgroup metadata.
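
A custom grouping could look like the sketch below. The entry objects, the customCategoryByGroup mapping and the category names are all hypothetical and only serve the example. The group and subgroup values are taken from Unicode's own naming.

```javascript
// Hypothetical emoji entries; only the group and subgroup fields matter here.
const emojis = [
  { emoji: '\u{1F600}', group: 'Smileys & Emotion', subgroup: 'face-smiling' },
  { emoji: '\u{1F44D}', group: 'People & Body', subgroup: 'hand-fingers-closed' },
  { emoji: '\u{1F1EB}\u{1F1F7}', group: 'Flags', subgroup: 'country-flag' },
];

// Hypothetical mapping from an original group to a custom category.
const customCategoryByGroup = {
  'Smileys & Emotion': 'faces-and-people',
  'People & Body': 'faces-and-people',
  'Flags': 'flags',
};

function groupByCustomCategory(entries) {
  const grouped = {};
  for (const entry of entries) {
    // Groups without a mapping fall back to a catch-all category
    const category = customCategoryByGroup[entry.group] || 'other';
    if (!grouped[category]) {
      grouped[category] = [];
    }
    grouped[category].push(entry.emoji);
  }
  return grouped;
}

const grouped = groupByCustomCategory(emojis);
console.log(Object.keys(grouped)); // ['faces-and-people', 'flags']
console.log(grouped['faces-and-people'].length); // 2
```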

Translations

Only the en (American English) locale is currently provided, as translations are quite heavy (~500kB per locale).

Until there is a way for people to load only the locales they need, and so keep their projects small, I recommend generating translations yourself using this repository.
You just need to change the unicodeCldrLocale variable inside the generate-unicode-emoji.cjs file to any locale available in Unicode's CLDR, then run npm run build (or node generate-unicode-emoji.cjs).

Fitzpatrick scale

Skin tones are based on the Fitzpatrick scale (https://en.wikipedia.org/wiki/Fitzpatrick_scale):

Emoji Description Fitzpatrick scale
🏻 Light skin tone Type I and II
🏼 Medium-light skin tone Type III
🏽 Medium skin tone Type IV
🏾 Medium-dark skin tone Type V
🏿 Dark skin tone Type VI
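
If you need this table in code, one possible representation is a lookup keyed by code point. This is only a sketch: the fitzpatrickTypesBySkinTone object and the getFitzpatrickTypes helper are hypothetical names, not part of the generated data.

```javascript
// Fitzpatrick types per skin tone component, keyed by code point.
const fitzpatrickTypesBySkinTone = {
  '1F3FB': ['I', 'II'], // Light skin tone
  '1F3FC': ['III'], // Medium-light skin tone
  '1F3FD': ['IV'], // Medium skin tone
  '1F3FE': ['V'], // Medium-dark skin tone
  '1F3FF': ['VI'], // Dark skin tone
};

// Hypothetical helper: returns the Fitzpatrick types of a skin tone component,
// or null when the character is not a skin tone component.
function getFitzpatrickTypes(skinToneComponent) {
  const codePoint = skinToneComponent.codePointAt(0).toString(16).toUpperCase();
  return fitzpatrickTypesBySkinTone[codePoint] || null;
}

console.log(getFitzpatrickTypes('\u{1F3FB}')); // ['I', 'II']
console.log(getFitzpatrickTypes('\u{1F44D}')); // null
```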

Code points conversion

JavaScript natively supports code points in strings through Unicode escapes (\u).

Code points between U+0000 and U+FFFF do not need to be surrounded by {}, as long as they are written with exactly four hexadecimal digits:

const heartEmoji = '\u2764\uFE0F';
console.log(heartEmoji); // ❤️

Code points greater than U+FFFF (called astral code points) are internally represented as surrogate pairs. They must either be written as two \u escapes (one per half of the surrogate pair) or be surrounded by {}:

const grinningEmojiWithSurrogatePair = '\uD83D\uDE00';
console.log(grinningEmojiWithSurrogatePair); // 😀

const grinningEmojiWithAstralCodePoint = '\u{1F600}';
console.log(grinningEmojiWithAstralCodePoint); // 😀
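
For reference, the surrogate pair can be derived from the astral code point with the standard UTF-16 arithmetic. The sketch below shows that calculation. toSurrogatePair is a hypothetical helper name, and in practice String.fromCodePoint does this for you.

```javascript
// Hypothetical helper: splits an astral code point into its UTF-16 surrogate pair.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
    throw new RangeError('Only code points from U+10000 to U+10FFFF have a surrogate pair');
  }
  const offset = codePoint - 0x10000;
  const highSurrogate = 0xD800 + (offset >> 10); // Top 10 bits
  const lowSurrogate = 0xDC00 + (offset & 0x3FF); // Bottom 10 bits
  return [highSurrogate, lowSurrogate];
}

const [highSurrogate, lowSurrogate] = toSurrogatePair(0x1F600);
console.log(highSurrogate.toString(16), lowSurrogate.toString(16)); // d83d de00
console.log(String.fromCharCode(highSurrogate, lowSurrogate) === '\u{1F600}'); // true
```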

I recommend always surrounding code points with {} to avoid errors and improve readability.

const pirateFlagEmoji = '\u{1F3F4}\u{200D}\u{2620}\u{FE0F}';
console.log(pirateFlagEmoji); // 🏴‍☠️

If you prefer, you can also programmatically build an emoji from an array of code points like this:

const pirateFlagCodePoints = ['1F3F4', '200D', '2620', 'FE0F'];
const pirateFlagEmoji = String.fromCodePoint(
  ...pirateFlagCodePoints.map(codePoint => parseInt(codePoint, 16))
);
console.log(pirateFlagEmoji); // 🏴‍☠️

And retrieve the code points of an emoji like this:

const pirateFlagEmoji = '🏴‍☠️';
const pirateFlagCodePoints = Array.from(pirateFlagEmoji).map(character => {
  return character.codePointAt(0).toString(16).toUpperCase();
});
console.log(pirateFlagCodePoints); // ['1F3F4', '200D', '2620', 'FE0F']
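
The reason for going through Array.from (rather than indexing the string directly) is that JavaScript strings are indexed by UTF-16 code units, whereas the string iterator walks code points. A small sketch of the difference, using the same pirate flag sequence:

```javascript
const pirateFlag = '\u{1F3F4}\u{200D}\u{2620}\u{FE0F}';

// length counts UTF-16 code units: U+1F3F4 takes two, the other three take one each
console.log(pirateFlag.length); // 5

// Array.from iterates by code point
console.log(Array.from(pirateFlag).length); // 4

// charCodeAt only sees the high surrogate, codePointAt sees the full code point
console.log(pirateFlag.charCodeAt(0).toString(16)); // d83c
console.log(pirateFlag.codePointAt(0).toString(16)); // 1f3f4
```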

Flag emojis conversion

You can convert ISO 3166-1 country codes to flag emojis using the technique described in Code points conversion.

Fortunately, this is straightforward: Unicode encodes each flag as a pair of regional indicator symbols, which are the letters of the ISO 3166-1 alpha-2 country code shifted by a fixed offset (127397, the gap between the letter A at U+0041 and the regional indicator symbol letter A at U+1F1E6).

So, if you have a country code like US (United States of America), you convert each character to its code point, add the offset, and convert the result back to a character.

Here's how to convert a country code to its flag emoji equivalent:

const flagEmojiOffset = 127397;
const franceCountryCode = 'FR';
const franceFlagEmoji = String.fromCodePoint(
  ...Array.from(franceCountryCode).map(character => character.codePointAt(0) + flagEmojiOffset)
);
console.log(franceFlagEmoji); // '🇫🇷'

And here's how to convert a flag emoji back to its country code equivalent:

const flagEmojiOffset = 127397;
const franceFlagEmoji = '🇫🇷';
const franceCountryCode = String.fromCodePoint(
  ...Array.from(franceFlagEmoji).map(character => character.codePointAt(0) - flagEmojiOffset)
);
console.log(franceCountryCode); // 'FR'
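
The two conversions can be wrapped into reusable functions with some input validation. This is a sketch under my own assumptions: the function names are hypothetical, lowercase input is accepted and uppercased, and no check is made that the code is an assigned ISO 3166-1 country.

```javascript
// Gap between 'A' (U+0041) and regional indicator symbol letter A (U+1F1E6)
const flagEmojiOffset = 0x1F1E6 - 'A'.codePointAt(0); // 127397

// Hypothetical helper: 'FR' -> flag emoji
function countryCodeToFlagEmoji(countryCode) {
  if (!/^[A-Za-z]{2}$/.test(countryCode)) {
    throw new TypeError('Expected a two-letter ISO 3166-1 alpha-2 country code');
  }
  return String.fromCodePoint(
    ...Array.from(countryCode.toUpperCase()).map(
      character => character.codePointAt(0) + flagEmojiOffset
    )
  );
}

// Hypothetical helper: flag emoji -> 'FR'
function flagEmojiToCountryCode(flagEmoji) {
  if (!/^[\u{1F1E6}-\u{1F1FF}]{2}$/u.test(flagEmoji)) {
    throw new TypeError('Expected a pair of regional indicator symbols');
  }
  return String.fromCodePoint(
    ...Array.from(flagEmoji).map(
      character => character.codePointAt(0) - flagEmojiOffset
    )
  );
}

console.log(countryCodeToFlagEmoji('fr') === '\u{1F1EB}\u{1F1F7}'); // true
console.log(flagEmojiToCountryCode('\u{1F1FA}\u{1F1F8}')); // US
```

Note that this only covers country flags. Other flags, such as the pirate flag, are built differently and will be rejected by the validation.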

Code points composition

Complex emojis and emoji variations often consist of one or more base emojis.

Unicode uses the 200D code point (the zero-width joiner, or ZWJ) as a ligature code point between two base emojis to combine them:

const blackFlagEmoji = '\u{1F3F4}';
console.log(blackFlagEmoji); // 🏴

const skullAndCrossbonesEmoji = '\u{2620}\u{FE0F}';
console.log(skullAndCrossbonesEmoji); // ☠️

const ligatureCodePoint = '\u{200D}';
const pirateFlagEmoji =
  blackFlagEmoji +
  ligatureCodePoint +
  skullAndCrossbonesEmoji;
console.log(pirateFlagEmoji); // 🏴‍☠️

This even works for more complex compositions:

const womanEmoji = '\u{1F469}';
console.log(womanEmoji); // 👩

const heartEmoji = '\u{2764}\u{FE0F}';
console.log(heartEmoji); // ❤️

const kissEmoji = '\u{1F48B}';
console.log(kissEmoji); // 💋

const manEmoji = '\u{1F468}';
console.log(manEmoji); // 👨

const ligatureCodePoint = '\u{200D}';
const womanAndManKissingEmoji =
  womanEmoji +
  ligatureCodePoint +
  heartEmoji +
  ligatureCodePoint +
  kissEmoji +
  ligatureCodePoint +
  manEmoji;
console.log(womanAndManKissingEmoji); // 👩‍❤️‍💋‍👨
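
Since the composition is a plain join on the ligature code point, it can be written generically, and reversed with a split. The helper names below are hypothetical. Keep in mind that joining arbitrary emojis this way only renders as a single glyph when the resulting sequence is one that fonts actually support.

```javascript
const ligatureCodePoint = '\u{200D}';

// Hypothetical helper: joins base emojis with the ligature code point
function composeEmoji(...baseEmojis) {
  return baseEmojis.join(ligatureCodePoint);
}

// Hypothetical helper: splits a composed emoji back into its parts
function decomposeEmoji(emoji) {
  return emoji.split(ligatureCodePoint);
}

const kissingEmoji = composeEmoji(
  '\u{1F469}', // Woman
  '\u{2764}\u{FE0F}', // Red heart
  '\u{1F48B}', // Kiss mark
  '\u{1F468}' // Man
);
console.log(decomposeEmoji(kissingEmoji).length); // 4
```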

Skin tone components must not be joined with the ligature code point. Instead, place them directly after a base emoji that supports skin tone variations. However, if the base emoji ends with the FE0F code point (variation selector-16, which serves as a presentation selector), you'll need to remove it first:

// Emoji without presentation selector
const thumbsUpBaseEmoji = '\u{1F44D}';
console.log(thumbsUpBaseEmoji); // 👍

const lightSkinToneComponent = '\u{1F3FB}';
console.log(lightSkinToneComponent); // 🏻

const thumbsUpWithLightSkinToneEmoji =
  thumbsUpBaseEmoji +
  lightSkinToneComponent;
console.log(thumbsUpWithLightSkinToneEmoji); // 👍🏻

// Emoji with presentation selector
const victoryHandBaseEmoji = '\u{270C}\u{FE0F}';
console.log(victoryHandBaseEmoji); // ✌️

const darkSkinToneComponent = '\u{1F3FF}';
console.log(darkSkinToneComponent); // 🏿

const presentationSelectorCodePoint = '\u{FE0F}';
const victoryHandWithDarkSkinToneEmoji =
  victoryHandBaseEmoji.replace(presentationSelectorCodePoint, '') +
  darkSkinToneComponent;
console.log(victoryHandWithDarkSkinToneEmoji); // ✌🏿
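
Both cases can be handled by one function that inserts the skin tone right after the first code point and drops a presentation selector that immediately follows it. This is a sketch: applySkinTone is a hypothetical helper, and it assumes (without checking) that the first code point is a base emoji that supports skin tone variations.

```javascript
const presentationSelectorCodePoint = '\u{FE0F}';

// Hypothetical helper: places a skin tone component right after the first
// code point, removing the presentation selector if it directly follows.
function applySkinTone(baseEmoji, skinToneComponent) {
  const [firstCharacter, ...otherCharacters] = Array.from(baseEmoji);
  const remainingCharacters =
    otherCharacters[0] === presentationSelectorCodePoint
      ? otherCharacters.slice(1)
      : otherCharacters;
  return firstCharacter + skinToneComponent + remainingCharacters.join('');
}

// Without presentation selector: thumbs up + light skin tone
console.log(applySkinTone('\u{1F44D}', '\u{1F3FB}') === '\u{1F44D}\u{1F3FB}'); // true

// With presentation selector: victory hand + dark skin tone
console.log(applySkinTone('\u{270C}\u{FE0F}', '\u{1F3FF}') === '\u{270C}\u{1F3FF}'); // true
```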

You can now combine skin tone variations and ligature code points to create even more complex emojis:

const personFacepalmingEmoji = '\u{1F926}'; // Note this is a genderless emoji
console.log(personFacepalmingEmoji); // 🤦

const mediumSkinToneComponent = '\u{1F3FD}';
console.log(mediumSkinToneComponent); // 🏽

const femaleSignEmoji = '\u{2640}\u{FE0F}';
console.log(femaleSignEmoji); // ♀️

const ligatureCodePoint = '\u{200D}';
const womanFacepalmingWithMediumSkinToneEmoji =
  personFacepalmingEmoji +
  mediumSkinToneComponent +
  ligatureCodePoint +
  femaleSignEmoji;
console.log(womanFacepalmingWithMediumSkinToneEmoji); // 🤦🏽‍♀️

const womanEmoji = '\u{1F469}';
console.log(womanEmoji); // 👩

const mediumLightSkinToneComponent = '\u{1F3FC}';
console.log(mediumLightSkinToneComponent); // 🏼

const handshakeEmoji = '\u{1F91D}';
console.log(handshakeEmoji); // 🤝

const manEmoji = '\u{1F468}';
console.log(manEmoji); // 👨

const mediumDarkSkinToneComponent = '\u{1F3FE}';
console.log(mediumDarkSkinToneComponent); // 🏾

const ligatureCodePoint = '\u{200D}';
const womanWithMediumLightSkinToneAndManWithMediumDarkSkinToneHoldingHandsEmoji =
  womanEmoji +
  mediumLightSkinToneComponent +
  ligatureCodePoint +
  handshakeEmoji +
  ligatureCodePoint +
  manEmoji +
  mediumDarkSkinToneComponent;
console.log(womanWithMediumLightSkinToneAndManWithMediumDarkSkinToneHoldingHandsEmoji); // 👩🏼‍🤝‍👨🏾
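
Going the other way, skin tone components can be stripped from a composed emoji with a single replace over the U+1F3FB to U+1F3FF range. removeSkinTones is a hypothetical helper. As a known limitation of this sketch, it does not restore a presentation selector (FE0F) that had to be removed when the skin tone was applied.

```javascript
// Hypothetical helper: removes every skin tone component from an emoji.
function removeSkinTones(emoji) {
  // The u flag makes the character class match whole astral code points
  return emoji.replace(/[\u{1F3FB}-\u{1F3FF}]/gu, '');
}

// Woman facepalming with medium skin tone -> woman facepalming
const womanFacepalming = removeSkinTones('\u{1F926}\u{1F3FD}\u{200D}\u{2640}\u{FE0F}');
console.log(womanFacepalming === '\u{1F926}\u{200D}\u{2640}\u{FE0F}'); // true

// Both skin tones of a two-person emoji are removed
const holdingHands = removeSkinTones(
  '\u{1F469}\u{1F3FC}\u{200D}\u{1F91D}\u{200D}\u{1F468}\u{1F3FE}'
);
console.log(holdingHands === '\u{1F469}\u{200D}\u{1F91D}\u{200D}\u{1F468}'); // true
```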