Merge pull request #1 from atomflunder/v0.5.0

v0.5.0
atomflunder · Apr 27, 2022 · 0d55b57
2 parents 9111c80 + 7153742

Showing 10 changed files with 237 additions and 243 deletions.
diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md
@@ -0,0 +1,18 @@
+# Contributing to stringmatch
+
+First off, thanks for your interest in contributing to stringmatch! Every contribution is greatly appreciated. The following guidelines should help you get started. They are *guidelines*, not strict rules.
+
+If you just want to ask a question, go ahead and visit the [GitHub Discussions Tab](https://github.com/atomflunder/stringmatch/discussions).
+
+## Bug reports
+
+When submitting a bug report, follow the template and describe clearly how to reproduce the bug. If you already know how to fix it, either describe the fix in the report or submit a pull request directly.
+
+## Pull requests
+
+Submitting a pull request is just as straightforward as submitting a bug report. Follow the template and you will be fine.  
+If you change the functionality of the code, please test the change beforehand; writing tests is greatly encouraged.  
+It would also be greatly appreciated if you stick to the general style of the library, but this is not strictly required.
+
+Thanks again for your interest in contributing!  
+If you still have doubts about contributing to this library, I can assure you there is no such thing as a bad contribution.
diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml
@@ -1,4 +1,4 @@
-name: build
+name: Build
 
 on: [push, pull_request]
 

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -2,14 +2,26 @@
 
 This is a broad overview of the changes that have been made over the lifespan of this library.
 
+## v0.5.0 - 2022-04-27
+
+- Removed the `scorer` argument from the functions and added it to `__init__` in both `Match()` and `Ratio()`
+- Renamed the `*_with_score` functions to `*_with_ratio` for consistent naming
+    - This affects the three functions added in v0.4.0
+- Removed Exceptions
+    - Returning a score of 0 instead of raising EmptySearchException
+    - Using "levenshtein" as default instead of raising InvalidScorerException
+    - Setting no limit instead of raising InvalidLimitException, if a limit less than 1 is set
+    - Updated docstrings to reflect these changes
+    - Updated tests to reflect these changes
+
 ## v0.4.1 - 2022-04-27
 
 - Added proper Python Versions to setup classifiers
 
 ## v0.4.0 - 2022-04-27
 
 - Added match_with_score, get_best_match_with_score and get_best_matches_with_score functions
-- Added tests for those functions
+    - Added tests for those functions
 - Updated documentation a bit
 
 ## v0.3.1 - 2022-04-26
@@ -26,7 +38,7 @@ This is a broad overview of the changes that have been made over the lifespan of
 - Made library public and installable via git
 - Added multiple scorers
 - Added new kwargs to Match functions
-- Added tests for those
+    - Added tests for those
 - Improved various functions
 - Added exception type
 - Some documentation improvements

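Reviewer note on the v0.5.0 changelog entry: the three removed exceptions are replaced by fallbacks (a score of 0 for empty input, `"levenshtein"` for an unknown scorer, no limit for a limit below 1). The sketch below only illustrates those rules. `SketchRatio`, `best_matches` and `levenshtein_distance` are hypothetical names, and the scoring formula is a simple stand-in that is not taken from the library, so its numbers will not necessarily match the README examples.

```python
from typing import List


def levenshtein_distance(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


class SketchRatio:
    """Hypothetical stand-in for Ratio(); not the library's implementation."""

    KNOWN_SCORERS = ("levenshtein", "jaro", "jaro_winkler")

    def __init__(self, scorer: str = "levenshtein") -> None:
        # Changelog rule: an unknown scorer falls back to "levenshtein"
        # instead of raising InvalidScorerException.
        self.scorer = scorer if scorer in self.KNOWN_SCORERS else "levenshtein"

    def ratio(self, string1: str, string2: str) -> int:
        # Changelog rule: empty input scores 0 instead of raising
        # EmptySearchException.
        if not string1 or not string2:
            return 0
        # Assumption: the sketch always computes a Levenshtein-style score,
        # whatever self.scorer says; a real implementation would dispatch on it.
        distance = levenshtein_distance(string1, string2)
        return round(100 * (1 - distance / max(len(string1), len(string2))))


def best_matches(
    string: str, candidates: List[str], score: int = 70, limit: int = 5
) -> List[str]:
    """Return candidates scoring at least `score`, best first."""
    ratio = SketchRatio()
    scored = [(candidate, ratio.ratio(string, candidate)) for candidate in candidates]
    hits = sorted(
        (pair for pair in scored if pair[1] >= score),
        key=lambda pair: pair[1],
        reverse=True,
    )
    # Changelog rule: a limit below 1 means "no limit" instead of raising
    # InvalidLimitException.
    if limit < 1:
        return [candidate for candidate, _ in hits]
    return [candidate for candidate, _ in hits[:limit]]
```

With this design, callers no longer need `try`/`except` around matching calls, at the cost of silently accepting a mistyped scorer name or limit.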
diff --git a/README.md b/README.md
@@ -10,7 +10,13 @@ Inspired by [seatgeek/thefuzz](https://github.com/seatgeek/thefuzz), which did n
 - [Requirements](#requirements)
 - [Installation](#installation)
 - [Basic Usage](#basic-usage)
-  - [Additional Arguments](#additional-arguments)
+  - [Matching](#matching)
+  - [Ratios](#ratios)
+  - [Matching & Ratios](#matching--ratios)
+  - [Strings](#strings)
+- [Advanced Usage](#advanced-usage)
+    - [Keyword Arguments](#keyword-arguments)
+    - [Scoring Algorithms](#scoring-algorithms)
 - [Links](#links)
 
 ## Requirements
@@ -32,107 +38,149 @@ pip install -U git+https://github.com/atomflunder/stringmatch
 
 ## Basic Usage
 
+### Matching
+
+The match functions allow you to compare two strings and check if they are "similar enough" to each other, or get the best match(es) from a list of strings:
+
 ```python
-from stringmatch import Match, Ratio, Strings
+from stringmatch import Match
 
 match = Match()
-ratio = Ratio()
-strings = Strings()
 
-# Basic usage:
-match.match("searchlib", "srchlib")                   # returns True
-match.match("searchlib", "something else")            # returns False
+# Checks if the strings are similar.
+match.match("searchlib", "srchlib")           # returns True
+match.match("searchlib", "something else")    # returns False
 
-# Matching lists:
+# Returns the best match(es) found in the list.
 searches = ["searchli", "searhli", "search", "lib", "whatever", "s"]
-match.get_best_match("searchlib", searches)           # returns "searchli"
-match.get_best_matches("searchlib", searches)         # returns ['searchli', 'searhli', 'search']
+match.get_best_match("searchlib", searches)   # returns "searchli"
+match.get_best_matches("searchlib", searches) # returns ['searchli', 'searhli', 'search']
+```
+
+### Ratios
 
-# Ratios:
-ratio.ratio("searchlib", "searchlib")                 # returns 100
-ratio.ratio("searchlib", "srechlib")                  # returns 82
+You can get the "ratio of similarity" between strings like this:
+
+```python
+from stringmatch import Ratio
+
+ratio = Ratio()
+
+# Getting the ratio between the two strings.
+ratio.ratio("searchlib", "searchlib")   # returns 100
+ratio.ratio("searchlib", "srechlib")    # returns 82
+
+# Getting the ratio between the first string and the list of strings at once.
 searches = ["searchlib", "srechlib"]
-ratio.ratio_list("searchlib", searches)               # returns [100, 82]
+ratio.ratio_list("searchlib", searches) # returns [100, 82]
+```
+
+### Matching & Ratios
+
+You can also get both the match and the ratio together in a tuple using these functions:
 
-# Getting matches and ratios:
-match.match_with_score("searchlib", "srechlib")       # returns (True, 82)
+```python
+from stringmatch import Match
+
+match = Match()
 searches = ["test", "nope", "tset"]
-match.get_best_match_with_score("test", searches)     # returns ("test", 100)
-match.get_best_matches_with_score("test", searches)   # returns [("test", 100), ("tset", 75)]
-
-# Modify strings:
-# This is meant for internal use, but you can also use it yourself, if you choose to.
-strings.latinise("Héllö, world!")                     # returns "Hello, world!"
-strings.remove_punctuation("wh'at;, ever")            # returns "what ever"
-strings.only_letters("Héllö, world!")                 # returns "Hll world"
-strings.ignore_case("test test!", lower=False)        # returns "TEST TEST!"
+
+match.match_with_ratio("searchlib", "srechlib")       # returns (True, 82)
+match.get_best_match_with_ratio("test", searches)     # returns ("test", 100)
+match.get_best_matches_with_ratio("test", searches)   # returns [("test", 100), ("tset", 75)]
 ```
 
-### Additional Arguments
-You can pass in additional arguments for the `Match()` functions to customise your search further:
+### Strings
+
+This is primarily meant for internal usage, but you can also use this library to modify strings:
 
-#### `score=int`
+```python
+from stringmatch import Strings
 
+strings = Strings()
+
+strings.latinise("Héllö, world!")               # returns "Hello, world!"
+strings.remove_punctuation("wh'at;, ever")      # returns "what ever"
+strings.only_letters("Héllö, world!")           # returns "Hll world"
+strings.ignore_case("test test!", lower=False)  # returns "TEST TEST!"
+```
+
+## Advanced Usage
+
+### Keyword Arguments
+You can pass in additional arguments for the `Match()` functions to customise your search further:
+
+**`score=70`**  
 The score cutoff for matching, by default set to 70.
 
 ```python
 match("searchlib", "srechlib", score=85)    # returns False
 match("searchlib", "srechlib", score=70)    # returns True
 ```
 
-#### `limit=int`
+---
 
-The limit of how many matches to return. Only available for `Matches().get_best_matches()`. By default this is set to `5`.
+**`limit=5`**  
+The limit of how many matches to return. Only available for `Match().get_best_matches()`. If you want to return every match, set this to 0. By default this is set to `5`.
 
 ```python
 searches = ["limit 5", "limit 4", "limit 3", "limit 2", "limit 1", "limit 0"]
 get_best_matches("limit 5", searches, limit=2)  # returns ["limit 5", "limit 4"]
 get_best_matches("limit 5", searches, limit=1)  # returns ["limit 5"]
 ```
 
-#### `latinise=bool`
+---
 
+**`latinise=False`**  
 Replaces special unicode characters with their latin alphabet equivalents. By default turned off.
 
 ```python
 match("séärçh", "search", latinise=True)    # returns True
 match("séärçh", "search", latinise=False)   # returns False
 ```
 
-#### `ignore_case=bool`
+---
 
+**`ignore_case=False`**  
 If you want to ignore case sensitivity while searching. By default turned off.
 
 ```python
 match("test", "TEST", ignore_case=True)     # returns True
 match("test", "TEST", ignore_case=False)    # returns False
 ```
 
-#### `remove_punctuation=bool`
+---
 
-Removes commonly used punctuation symbols from the strings, like `.,;:!?` and so on. Be careful when using this, because if you pass in a string that is only made up of punctuation symbols, you will get an `EmptySearchException`. By default turned off.
+**`remove_punctuation=False`**  
+Removes commonly used punctuation symbols from the strings, like `.,;:!?` and so on. By default turned off.
 
 ```python
 match("test,---....", "test", remove_punctuation=True)  # returns True
 match("test,---....", "test", remove_punctuation=False) # returns False
 ```
 
-#### `only_letters=bool`
+---
 
-Removes every character that is not in the latin alphabet, a more extreme version of `remove_punctuation`. The same rules apply here, be careful when you use it or you might get an `EmptySearchException`. By default turned off.
+**`only_letters=False`**  
+Removes every character that is not in the latin alphabet, a more extreme version of `remove_punctuation`. By default turned off.
 
 ```python
 match("»»ᅳtestᅳ►", "test", only_letters=True)   # returns True
 match("»»ᅳtestᅳ►", "test", only_letters=False)  # returns False
 ```
 
-#### `scorer=str`
+### Scoring Algorithms
 
-The scoring algorithm to use, the available options are: [`"levenshtein"`](https://en.wikipedia.org/wiki/Levenshtein_distance), [`"jaro"`](https://en.wikipedia.org/wiki/Jaro–Winkler_distance#Jaro_similarity), [`"jaro_winkler"`](https://en.wikipedia.org/wiki/Jaro–Winkler_distance#Jaro–Winkler_similarity). Different algorithms will produce different results, obviously. By default set to `"levenshtein"`.
+You can pass in different scoring algorithms when initialising the `Match()` and `Ratio()` classes.  
+The available options are: [`"levenshtein"`](https://en.wikipedia.org/wiki/Levenshtein_distance), [`"jaro"`](https://en.wikipedia.org/wiki/Jaro–Winkler_distance#Jaro_similarity), [`"jaro_winkler"`](https://en.wikipedia.org/wiki/Jaro–Winkler_distance#Jaro–Winkler_similarity).   
+Different algorithms produce different results for the same pair of strings. The default is `"levenshtein"`.
 
 ```python
-match("test", "th test", scorer="levenshtein")  # returns True (score = 73)
-match("test", "th test", scorer="jaro_winkler") # returns False (score = 60)
+levenshtein_matcher = Match(scorer="levenshtein")
+jaro_winkler_matcher = Match(scorer="jaro_winkler")
+
+levenshtein_matcher.match("test", "th test")  # returns True (score = 73)
+jaro_winkler_matcher.match("test", "th test") # returns False (score = 60)
 ```
 
 

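Reviewer note on the README's Strings and Keyword Arguments sections: the four string helpers can be approximated with the standard library alone. This is a minimal sketch under stated assumptions, not the code in `stringmatch/strings.py`. The function names mirror the README, but the bodies are guesses, and `latinise` here only strips combining accents rather than transliterating arbitrary Unicode.

```python
import re
import string
import unicodedata


def latinise(text: str) -> str:
    """Strip accents by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def remove_punctuation(text: str) -> str:
    """Delete ASCII punctuation such as . , ; : ! ? and so on."""
    return text.translate(str.maketrans("", "", string.punctuation))


def only_letters(text: str) -> str:
    """Keep only unaccented latin letters and spaces (assumed behaviour)."""
    return re.sub(r"[^a-zA-Z ]", "", text)


def ignore_case(text: str, lower: bool = True) -> str:
    """Normalise case so comparisons become case-insensitive."""
    return text.lower() if lower else text.upper()


print(latinise("H\u00e9ll\u00f6, world!"))      # Hello, world!
print(remove_punctuation("wh'at;, ever"))       # what ever
print(only_letters("H\u00e9ll\u00f6, world!"))  # Hll world
print(ignore_case("test test!", lower=False))   # TEST TEST!
```

Applying such helpers to both strings before scoring is one plausible way the `latinise`, `ignore_case`, `remove_punctuation` and `only_letters` keyword arguments could work; the diff shown here does not confirm the order in which the library applies them.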
diff --git a/stringmatch/__init__.py b/stringmatch/__init__.py
@@ -1,8 +1,7 @@
 # flake8: noqa
-from .exceptions import *
 from .match import *
 from .ratio import *
 from .strings import *
 
 __title__ = "stringmatch"
-__version__ = "0.4.1"
+__version__ = "0.5.0"
diff --git a/stringmatch/exceptions.py b/stringmatch/exceptions.py