Releases: fergiemcdowall/stopword

Even more Ukrainian stopwords

06 Dec 16:09

What's Changed

Thanks to @imposeren for an even better Ukrainian stopword list.

Better Ukrainian stopword list

03 Dec 11:53

Improved Ukrainian stopword list, thanks to @imposeren.

Full Changelog: v3.1.1...v3.1.2

v3.1.1 - Swapped Danish and Swedish lists for versions with better licenses

18 Aug 20:30
  • Replaced the CC BY-SA-licensed Danish and Swedish lists with lists from MIT-licensed libraries, so they are in line with the stopword library's own license.
  • Updated the Swedish test, since one additional word is now defined as a stopword.
  • Updated 3rd-party.txt with the 3rd party license texts needed for the minified dist files.

Full Changelog: v3.0.1...v3.1.1

v3.0.1 Treeshaking for `module` and `browser`

27 Jan 06:59

In package.json, `module` and `browser` now point to ./src/stopword.js to make the library tree-shakeable, and `jsdelivr` points to ./dist/stopword.umd.min.js so that CDN usage is not affected.

The stopword module has grown over the years, so tree shaking makes it possible to avoid loading languages that are not used.

This is a non-breaking release in most cases, but the change is big enough to warrant a new major version, and there might be some corner cases that break.

Typo in Hungarian word

25 Mar 15:31

amelybol -> amelyből
Thanks to @dsdenes

Bug-fix: correct module-pointer to .mjs-file from package.json

10 Feb 11:31

One word per line in stopword files - better diffs

15 Jan 11:26
  • Gives better diffs for small word changes in existing stopword files. Minified dist files are just as compact as before.
  • Renamed some 3rd-party license files so that GitHub is less likely to be confused about which license the library uses.

Fixing three word-numbers in porBr - Brazilian Portuguese

05 Sep 20:08

Changing three numbers from Portuguese to Brazilian Portuguese

Number  por         porBr
19      Dezanove    Dezenove
16      Dezasseis   Dezesseis
17      Dezassete   Dezessete

Thanks to rodfeal for spotting the error and the PR 🎉

Breaking changes! Import destructuring + 3 letter language codes + lots more

28 Feb 18:40

Breaking changes:

  • Import destructuring. Only ESM cannot use the old sw. prefix; CJS can, and UMD works as before, if you prefer that. If you're using CJS without specifying a stopword language (relying on the default English stopword list), you should be fine.
  • 3-letter ISO 639-3 language codes (replacing ISO 639-1). This makes room for more languages in general and, in the short term, for several Sami languages specifically.

Documentation on staying almost backwards compatible:

  • How to keep using ISO 639-1 codes.
  • How to keep using the sw. prefix for functions and variables (arrays of stopwords).
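
The two compatibility notes above can be sketched roughly as follows. This is only an illustration: the `stopword` object is a tiny stand-in with a made-up three-word list so that the snippet runs without installing anything, and the stand-in `removeStopwords` is an assumption about the general shape of the function, not the library's actual source.

```javascript
// Stand-in for the package so this sketch runs on its own.
// With the library installed, this object would instead come from
// something like: const stopword = require('stopword')
const stopword = {
  eng: ['a', 'the', 'is'], // illustrative only, NOT the real English list
  removeStopwords: (tokens, stopwords = stopword.eng) =>
    tokens.filter(token => !stopwords.includes(token.toLowerCase()))
}

// New style: destructure what you need and rename the ISO 639-3 array
// `eng` to the old ISO 639-1 name `en` in the same step.
const { removeStopwords, eng: en } = stopword

// Old style: keep a namespace object and use the sw. prefix.
const sw = stopword

const tokens = 'the cat is a mammal'.split(' ')
const viaDestructuring = removeStopwords(tokens, en)
const viaPrefix = sw.removeStopwords(tokens, sw.eng)

console.log(viaDestructuring) // [ 'cat', 'mammal' ]
console.log(viaPrefix) // [ 'cat', 'mammal' ]
```

Renaming during destructuring keeps existing call sites that use the 2-letter names unchanged, while the namespace form keeps the sw. prefix.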

And lots more:

  • 5 languages added (stopword lists): Ukrainian, Lithuanian, Kurdish, Malay and Gujarati (thanks to stopwords-iso).
  • Using batr for building CJS, ESM and UMD + testing (StandardJS, Playwright, AvaJS and Rollup-stuff in one devDep)
  • UI-tests for demo (testing UMD) + ESM and CJS tests
  • Minified builds, with all licenses (stopword + 3rd party) in one file that is referenced from the minified builds. 62 languages in 130 kb
  • Numbers from 0-9 in different scripts moved to their own "language". Numbers should be handled by regex, as words-n-numbers can do easily, but we're keeping this as a way to also remove the numbers 0-9.
  • From TravisCI to GitHub Workflow for CI
  • To test newly added languages, we use words-n-numbers to extract words (and/or numbers)
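
As a rough illustration of the "numbers should be handled by regex" point in the list above, the sketch below filters out digit-only tokens with a plain regular expression. It uses neither words-n-numbers nor the library's numbers list; the helper name `isNumber` and the sample tokens are made up for this example.

```javascript
// Hypothetical helper: true when a token consists only of decimal digits.
// \p{Nd} matches decimal digits in any script, not just 0-9.
const isNumber = token => /^\p{Nd}+$/u.test(token)

// Sample tokens, including an Arabic-Indic digit (U+0663).
const tokens = ['chapter', '7', 'has', '42', 'pages', '٣']

// Keep only the tokens that are not pure numbers.
const wordsOnly = tokens.filter(token => !isNumber(token))

console.log(wordsOnly) // [ 'chapter', 'has', 'pages' ]
```

A pattern like this also covers multi-digit numbers, which a fixed list of the digits 0-9 cannot do.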

A leaner, more structured and more robust version

28 Jul 20:42
  • Now building CJS, ESM and UMD scripts, with minified alternatives.
  • import/require destructuring is now possible. The old style also works if you want the sw. prefix for functions and arrays.
  • ISO-639-3 language codes (swapping from ISO-639-1). Room for more languages.
  • Codes for lists that aren't fully standard start with the 3 characters that follow the standard, followed by 2 characters in camelCase.
  • lggo -> lgg. The 'o' was meant as 'official', but lgg is the official code in ISO-639-3, so the unofficial variant is now lggNd (the language array without diacritics).
  • If you want to use the old ISO-639-1 codes, you can rename either on import or after import, e.g. const en = eng
  • Better license handling and visibility for third-party libraries. There is now a combined license file listing all third-party licenses, which is referenced in the minified scripts.
  • Moved to batr test library (rollup + standardjs + playwright dependencies all in one)
  • CI-testing moved from TravisCI to GitHub workflow / actions.