Note
This project is a TypeScript implementation of namidaco/string_clean_utils.
- Replace diacritics, accents & look-alike (confusable) characters with plain text
const normalized = StringCleanUtils.normalize('𝒉𝒂𝒓𝒍𝒆𝒚𝒔 𝒊𝒏 𝒉𝒂𝒘𝒂𝒊𝒊 - 𝒌𝒂𝒕𝒚 𝒑𝒆𝒓𝒓𝒚');
console.log(normalized); // 'harleys in hawaii - katy perry';
const normalized2 = StringCleanUtils.normalize('𝑻𝒉𝒆 ℚ𝕦𝕚𝕔𝕜 Brown Fox 𝔍𝔲𝔪𝔭𝔢𝔡 ⓞⓥⓔⓡ ʇɥǝ 𝗟𝗮𝘇𝘆 𝙳𝚘𝚐');
console.log(normalized2); // 'The Quick Brown Fox Jumped over the Lazy Dog';
- Remove symbols from text
const normalized = StringCleanUtils.removeSymbols('The [Quick }Brown Fox %Jumped over ^the Lazy @Dog');
console.log(normalized); // 'The Quick Brown Fox Jumped over the Lazy Dog';
- Remove symbols & whitespaces from text
const normalized = StringCleanUtils.removeSymbolsAndWhitespaces('The [Quick }Brown Fox %Jumped over ^the Lazy @Dog');
console.log(normalized); // 'TheQuickBrownFoxJumpedovertheLazyDog';
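The sketch below illustrates one way calls like the ones above could work: a table-driven replacement that walks the string by code point, plus a regex-based symbol filter. `SAMPLE_RULES`, `applyRules`, and `stripSymbols` are hypothetical names invented for this illustration, and the tiny table only stands in for the generated rules. It is an assumption about the general technique, not the library's actual implementation.

```typescript
// Hypothetical lookup table (assumption); the real rules are generated and far larger.
const SAMPLE_RULES: ReadonlyMap<string, string> = new Map([
  ['𝒉', 'h'],
  ['𝒂', 'a'],
  ['ⓞ', 'o'],
  ['\u00E9', 'e'], // LATIN SMALL LETTER E WITH ACUTE
]);

// Replace every code point that has a rule; leave everything else untouched.
function applyRules(input: string, rules: ReadonlyMap<string, string>): string {
  let out = '';
  // for...of iterates by code point, so characters stored as a UTF-16
  // surrogate pair (such as '𝒉') are looked up as a single key.
  for (const ch of input) {
    out += rules.get(ch) ?? ch;
  }
  return out;
}

// Drop anything that is not a letter, a digit, or whitespace.
// Whether the library keeps digits is not stated in this README (assumption).
function stripSymbols(input: string): string {
  return input.replace(/[^\p{L}\p{N}\s]/gu, '');
}

console.log(applyRules('𝒉𝒂 caf\u00E9 ⓞk', SAMPLE_RULES)); // 'ha cafe ok'
console.log(stripSymbols('The [Quick }Brown %Fox')); // 'The Quick Brown Fox'
```

Iterating by code point matters here because most stylized letters live outside the Basic Multilingual Plane; indexing the string by UTF-16 unit would split them in half and miss the table entry.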
Important
Although this project uses Bun for testing, Bun is not required to use the library.
- To run tests, navigate to the test directory:
cd test
- Then run the following command:
bun test
- Original project source: https://github.com/namidaco/string_clean_utils/tree/main
- Confusable & diacritics rules are generated with confusable_to_map.ts, relying on confusables.txt & diacritics.ts
- confusable source: https://www.unicode.org/Public/security/latest/confusables.txt
- diacritics source: https://www.npmjs.com/package/diacritics-map
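As a rough illustration of the generation step, the sketch below parses text in the confusables.txt layout (`source ; target ; type # comment`, with code points written in hex) into a lookup map. `parseConfusables` is a hypothetical helper written for this README, and the sample lines are illustrative rather than copied from the Unicode file. How confusable_to_map.ts actually works is not shown here, so treat this as an assumption.

```typescript
// Convert a whitespace-separated list of hex code points into a string.
function hexListToText(hexList: string): string {
  return hexList
    .split(/\s+/)
    .map((hex) => String.fromCodePoint(parseInt(hex, 16)))
    .join('');
}

// Hypothetical parser (assumption): builds source -> target pairs from
// lines shaped like "<source hex> ; <target hex ...> ; <type> # <comment>".
function parseConfusables(text: string): Map<string, string> {
  const map = new Map<string, string>();
  for (const rawLine of text.split(/\r?\n/)) {
    // Drop the trailing comment; trim() also removes a leading BOM.
    const line = rawLine.split('#')[0].trim();
    if (line === '') continue; // blank or comment-only line
    const fields = line.split(';').map((field) => field.trim());
    if (fields.length < 2 || fields[0] === '' || fields[1] === '') continue;
    map.set(hexListToText(fields[0]), hexListToText(fields[1]));
  }
  return map;
}

// Illustrative input in the same layout as confusables.txt.
const sample = [
  '# illustrative lines only',
  '1D489 ;\t0068 ;\tMA\t# MATHEMATICAL BOLD ITALIC SMALL H',
  '24DE ;\t006F ;\tMA\t# CIRCLED LATIN SMALL LETTER O',
].join('\n');

const rules = parseConfusables(sample);
console.log(rules.get(String.fromCodePoint(0x1d489))); // 'h'
console.log(rules.get(String.fromCodePoint(0x24de))); // 'o'
```

A target may span several code points, which is why each field is split on whitespace before conversion.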