forked from jekyll/classifier-reborn
**Background:** The slow step of LSI is the SVD (singular value decomposition) of a matrix. Even with a relatively small collection of documents (say, about 20 blog posts), the native Ruby implementation is too slow to be usable. To work around this problem, classifier-reborn lets you optionally use the `gsl` gem, which relies on the [GNU Scientific Library](https://www.gnu.org/software/gsl/) for matrix calculations. This is at least an order of magnitude faster than the Ruby-only matrix decomposition, and fast enough that using LSI with Jekyll finishes in a reasonable amount of time.

Unfortunately, [rb-gsl](https://github.com/SciRuby/rb-gsl) is unmaintained. Luckily, there's a commit on main that makes it compatible with Ruby 3, but nobody has released the gem, so the only way to use rb-gsl with Ruby 3 right now is to specify the git hash in your Gemfile. See SciRuby/rb-gsl#67.

**Changes:** In this PR, my goal is to provide an alternative matrix implementation that performs the singular value decomposition quickly and works with Ruby 3. Doing so will allow classifier-reborn to be used with Ruby 3 without depending on the unmaintained/unreleased GSL gem.

Options for Ruby matrix libraries are somewhat limited, but [Numo](https://github.com/ruby-numo) seems to be more actively maintained than rb-gsl, and Numo has a working Ruby 3 implementation that can perform a singular value decomposition. This requires [numo-narray](https://github.com/ruby-numo/numo-narray) and [numo-linalg](https://github.com/ruby-numo/numo-linalg).

My goal is to allow users to (optionally) use classifier-reborn the same way they would use it with GSL. That is, the user should install the `numo-narray` and `numo-linalg` gems, and classifier-reborn will detect and use them if they are found.
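The "detect and use if found" behaviour described above can be sketched roughly as follows. This is only an illustration of the optional-dependency pattern, not the code in this PR: `detect_matrix_backend` and `BACKENDS` are hypothetical names, and the ordering (Numo first, then GSL, then pure Ruby) is an assumption. The sketch only rescues `LoadError`, so any other failure while loading a backend would still propagate.

```ruby
# Hypothetical helper (not part of classifier-reborn): returns the name of
# the first backend whose libraries can all be required, or :ruby if none can.
def detect_matrix_backend(candidates)
  candidates.each do |name, libs|
    begin
      libs.each { |lib| require lib }
      return name
    rescue LoadError
      # One of the gems for this backend is not installed; try the next one.
      next
    end
  end
  :ruby
end

# Assumed preference order: Numo first, then GSL, then the pure-Ruby fallback.
BACKENDS = {
  numo: %w[numo/narray numo/linalg],
  gsl: %w[gsl]
}.freeze

backend = detect_matrix_backend(BACKENDS)
puts "Using #{backend} matrix backend"
```

Because a missing gem simply moves detection on to the next candidate, users without either optional backend installed still get the slower pure-Ruby path.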
Showing 8 changed files with 97 additions and 29 deletions.