merge dev for release 0.9.7 #195

Merged on May 22, 2023 (236 commits)
fc6fdd0
merge python3_10 integration
davebulaval Jul 6, 2022
a4922ac
improve codecov script
davebulaval Jul 6, 2022
6f82c6e
improve codecov script with test verbosity
davebulaval Jul 6, 2022
0732c59
improve codecov script with test verbosity
davebulaval Jul 6, 2022
3f74aae
add script to run tests on all python version supported
davebulaval Jul 6, 2022
802793c
fix path management
davebulaval Jul 6, 2022
bee19af
change mod of executable file
davebulaval Jul 6, 2022
15f0908
add interactive shell to handle conda
davebulaval Jul 6, 2022
acaa54f
remove shebang arg
davebulaval Jul 6, 2022
6e2a809
fix contributing and minor typo in tests script
davebulaval Jul 6, 2022
b3f1b4b
improv example code and remove dead example
davebulaval Jul 8, 2022
ebaeeb0
squash handling from url branch
davebulaval Jul 8, 2022
df9651a
cleanup dead file
davebulaval Jul 8, 2022
1b9800b
improve speed test code
davebulaval Jul 8, 2022
38e6115
add num_workers test fasttext under windows os condition
davebulaval Jul 12, 2022
9154a1d
add tests case for num_workers test in parser
davebulaval Jul 12, 2022
a6b28fd
simplified tests case windows
davebulaval Jul 12, 2022
de55e63
update changelog
davebulaval Jul 12, 2022
6b24e0c
fix windows os failing test due to num workers gt 0
davebulaval Jul 12, 2022
b1748e9
fix missing lower cassing windows os name
davebulaval Jul 12, 2022
6051053
Merge branch 'master' into dev
davebulaval Jul 13, 2022
c0417e2
add missing downlaod_from_url deprecated message and redirect to new …
davebulaval Jul 13, 2022
3f92b05
add major release todo list to track function to remove
davebulaval Jul 13, 2022
ad6c8f0
update changelog
davebulaval Jul 13, 2022
9a4f6a8
add pragma no cover to skip codecovv
davebulaval Jul 13, 2022
d3c100e
Merge branch 'master' into dev
davebulaval Jul 13, 2022
958f456
Merge branch 'master' into dev
davebulaval Jul 13, 2022
ca51ceb
improve variable naming
davebulaval Jul 16, 2022
99b1d1c
refactor position of non protected method
davebulaval Jul 16, 2022
79b5513
bump pylint and add django for codacy
davebulaval Jul 26, 2022
3b4b821
fix deepparse tools pylint
davebulaval Jul 26, 2022
d20f51e
fix network pylint
davebulaval Jul 26, 2022
1aef169
fix vectorizer modules
davebulaval Jul 26, 2022
cd7909e
fix torch member and parser modules
davebulaval Jul 26, 2022
b3a018f
refactor arguments init in cli and cycling import
davebulaval Jul 26, 2022
d06cb9c
fix circular import
davebulaval Jul 26, 2022
d3c63be
fix last pylint errors
davebulaval Jul 26, 2022
50ad57d
fix error in csv column names versus column name
davebulaval Jul 26, 2022
69c8f1a
fix list csv column names missing nargs
davebulaval Jul 26, 2022
80bec06
remove duplicate detection and fix with statement for temporary direc…
davebulaval Jul 26, 2022
831b668
fix oylint on test
davebulaval Jul 26, 2022
164ff05
push to 0.8.1
davebulaval Jul 26, 2022
3876b0d
Merge branch 'master' into dev
davebulaval Jul 26, 2022
b218ef1
simplification skipif test testing
davebulaval Jul 26, 2022
259b7ac
bug fix issue 141
davebulaval Jul 27, 2022
889cec3
fix missing csv dataset in test for csv integration test
davebulaval Jul 27, 2022
e73f07d
merge improvement for error handling of retrain and test API
davebulaval Jul 27, 2022
e7f9fe5
merge master
davebulaval Jul 27, 2022
b79b3ee
linting yml file
davebulaval Jul 28, 2022
1dce60b
improve run all tests script
davebulaval Jul 28, 2022
48fe197
improve run tests python envs
davebulaval Jul 29, 2022
d3b5e69
fix naming of tests and some typos
davebulaval Aug 4, 2022
3c4ffde
add save_model_eights method (#147)
davebulaval Aug 4, 2022
6b7ff29
bumb actions version (checkout and setup-python
davebulaval Aug 5, 2022
b635cf0
fixed actions/checkout setted to 4 instead of 3
davebulaval Aug 5, 2022
77f2927
add dependabot
davebulaval Aug 5, 2022
e329471
bump stale to v5
davebulaval Aug 5, 2022
b4fb9a5
add python 3.11 in linting
davebulaval Aug 5, 2022
680ab19
remove python 3.11 since not supported for now and add 3.10 in window…
davebulaval Aug 5, 2022
ceb9eff
revert windoes python 3.10 since still fail
davebulaval Aug 5, 2022
b3d76a4
Add codeql (#148)
davebulaval Aug 5, 2022
abe5814
add deprecated warnings class type on deprecated download_from_url_fn
davebulaval Aug 10, 2022
7318093
refactored dataset containter creation into a factory
davebulaval Aug 10, 2022
bcdfe94
fix errors for parsing cases
davebulaval Aug 10, 2022
8578a9a
moved arguments in dataset factory
davebulaval Aug 10, 2022
aeb1559
add tests case for new factory tool fn
davebulaval Aug 10, 2022
2971fad
added val dataset handling
davebulaval Aug 10, 2022
9cc7201
fixed tests and remove major release todo
davebulaval Aug 10, 2022
639f35a
added cleaning conda env
davebulaval Aug 10, 2022
7c7211e
improved scirpt with warmup training
davebulaval Aug 10, 2022
bd59261
Merge branch 'dev' into improve_retrain_api
davebulaval Aug 10, 2022
9e5c71e
remove fine_tuning script since in branch
davebulaval Aug 10, 2022
64048e0
Merge branch 'dev' into improve_retrain_api
davebulaval Aug 10, 2022
6021e49
fixed tests
davebulaval Aug 10, 2022
dc3bfc3
fixed test without clear num_workers arg
davebulaval Aug 10, 2022
a936803
remove fn download_from_url
davebulaval Aug 11, 2022
76c2c22
removed unecessary retrain in test api tests
davebulaval Aug 11, 2022
d4e4928
added verbose for test and improved tests for retrain test integration
davebulaval Aug 11, 2022
b36765a
updated changelog
davebulaval Aug 11, 2022
caa616b
fixed missing hint typing, improved internal doc, fixed train_ratio a…
davebulaval Aug 12, 2022
e40dd4b
add pylint step on code examples
davebulaval Aug 12, 2022
01e3e08
added missing typing, uniformization of assertFileExist fn, added int…
davebulaval Aug 12, 2022
b3c9713
remove comment in linting ci to bug fix if failling problem
davebulaval Aug 12, 2022
a5bf182
fix dead verbose retrain api flag
davebulaval Aug 12, 2022
4983b3a
add ini option for django
davebulaval Aug 12, 2022
13ba1ac
remove linting of code example since fail due to pylint-django and I …
davebulaval Aug 12, 2022
3c968b6
fixed django settings
davebulaval Aug 12, 2022
73ccc94
add steps to install depparse for code examples linting
davebulaval Aug 12, 2022
4634523
remove install -e
davebulaval Aug 12, 2022
2fbb44b
reinstaller install -e .
davebulaval Aug 12, 2022
4c318c0
add skip=no-member since it is mostly flase positive
davebulaval Aug 12, 2022
61f7030
removed no-member pylint disable
davebulaval Aug 12, 2022
ad14e04
add docker image
davebulaval Aug 12, 2022
1a32a64
formating
davebulaval Aug 14, 2022
acc8aac
merge solved
davebulaval Aug 15, 2022
79d38d2
formated README
davebulaval Aug 15, 2022
1763fdb
update changelog
davebulaval Aug 19, 2022
18be17e
merge uk example and fixes to doc
davebulaval Aug 19, 2022
cc38ef7
hot-fix choices handling in cli.download
davebulaval Aug 19, 2022
9d4ed7e
merge master
davebulaval Aug 19, 2022
2b961a5
linting and security template mv
davebulaval Aug 21, 2022
288172e
improved deepparse server error handling
davebulaval Aug 21, 2022
f7ede49
merge offline parsing
davebulaval Aug 21, 2022
83fb9fc
fix typo in all test run
davebulaval Aug 21, 2022
1678c0e
fixed error in module name and refactored errors module
davebulaval Aug 22, 2022
72ccaae
fixed reference packaging other deepparse module
davebulaval Aug 22, 2022
10a4e77
added missing hint typing
davebulaval Aug 22, 2022
6fa8427
add missing urllib3 dependancies
davebulaval Aug 22, 2022
e13cc93
improve workflow
davebulaval Aug 22, 2022
ad226e1
improve doc
davebulaval Aug 22, 2022
18da9ec
add download_models, fix bug in cache path handling and fixed examples
davebulaval Aug 23, 2022
9bb818b
update changelog
davebulaval Aug 23, 2022
400129d
refactored test and add download_models tests
davebulaval Aug 23, 2022
3578f0d
merge refactoring of download cli fn
davebulaval Aug 23, 2022
448271e
moved code for licensing
davebulaval Aug 23, 2022
2514bac
fixed typo in doc
davebulaval Aug 23, 2022
79602ad
Update CHANGELOG.md
davebulaval Aug 24, 2022
7fc47be
Merge remote-tracking branch 'origin' into dev
davebulaval Sep 1, 2022
3ed2ff9
added factories and tests
MAYAS3 Sep 7, 2022
799e0cc
Merge branch 'dev' of https://github.com/GRAAL-Research/deepparse int…
MAYAS3 Sep 7, 2022
cbbb5f8
added offline argument to model factory
MAYAS3 Sep 8, 2022
40b9131
merged refactor_download_model
davebulaval Sep 9, 2022
9bb6922
added data padders & tests
MAYAS3 Sep 15, 2022
e0e8c49
Merge branch 'dev' of https://github.com/GRAAL-Research/deepparse int…
MAYAS3 Sep 15, 2022
6af6344
black formatting
MAYAS3 Sep 15, 2022
af93269
added data padder factory & tests
MAYAS3 Sep 15, 2022
7c6b1eb
added docstring & preparing to refactor padder
MAYAS3 Sep 19, 2022
335a0f2
refactored data padder to solve LSP issue
MAYAS3 Sep 19, 2022
4a01e38
refactored vectorizer factory & temporarily removed type hinting from…
MAYAS3 Sep 21, 2022
0d5be3e
adjusted docstring
MAYAS3 Sep 22, 2022
5acf589
Hotfix `SSLError` when downloading model weights of model type: `bpem…
ajndkr Sep 23, 2022
c9f0fde
moved context wrapper in bpemb embedding model
davebulaval Sep 23, 2022
bb1de73
removed unused as err
davebulaval Sep 23, 2022
a84e147
added pylint skip for broad except to hotfix code
davebulaval Sep 23, 2022
f5ec436
added pylint skip for broad except to hotfix code
davebulaval Sep 23, 2022
90bb512
bump version and changelog
davebulaval Sep 23, 2022
1c66b53
Merge branch 'master' into dev
davebulaval Sep 23, 2022
8196f4d
Merge branch 'main' into dev
davebulaval Sep 23, 2022
761b065
added DataPadder docstring
MAYAS3 Sep 23, 2022
d609cbb
applied refurb (#160)
davebulaval Oct 21, 2022
e5966dd
wip - added DataProcessor and tests
MAYAS3 Oct 21, 2022
d6dcf1f
tweaked process_for_training method
MAYAS3 Oct 26, 2022
802144f
finished DataProcessor and tests
MAYAS3 Oct 27, 2022
07bdf08
removed obsolete tests
MAYAS3 Oct 31, 2022
5a5e8be
added DataProcessor docstring
MAYAS3 Oct 31, 2022
98e79c1
Bump docker/metadata-action from 4.0.1 to 4.1.1 (#161)
dependabot[bot] Nov 1, 2022
c4c9077
Bump docker/login-action from 2.0.0 to 2.1.0 (#162)
dependabot[bot] Nov 1, 2022
ffdea56
Bump pylint from 2.15.3 to 2.15.5 (#163)
dependabot[bot] Nov 1, 2022
b0825e4
Bump docker/build-push-action from 3.1.1 to 3.2.0 (#164)
dependabot[bot] Nov 1, 2022
83c3188
Bump black from 22.8.0 to 22.10.0 (#165)
dependabot[bot] Nov 1, 2022
ba4af17
fix black dependancy pyproject.toml
davebulaval Nov 1, 2022
c4674d6
added DataProcessorFactory and tests
MAYAS3 Nov 9, 2022
4c909a7
fix error in arg train ratio example and added assert in deepparse.re…
davebulaval Nov 14, 2022
be484b2
added error handling for macos and improved windows for case of num_w…
davebulaval Nov 16, 2022
13d49d6
fixed failling test and improved test for test_api
davebulaval Nov 22, 2022
dd47380
fixed windows tests
davebulaval Nov 23, 2022
1b536f7
Update CHANGELOG.md
davebulaval Nov 23, 2022
6de6511
Feat/add new tags to retrain cli (#167)
davebulaval Nov 24, 2022
542a780
bump version and changelog
davebulaval Nov 24, 2022
78d02f5
Merge branch 'main' into dev
davebulaval Nov 24, 2022
522a1bc
fix typo in doc retrain CLI
davebulaval Nov 24, 2022
b0d57bd
fixed errors due to model naming conventions
MAYAS3 Jan 2, 2023
ec87131
added final docstring
MAYAS3 Jan 2, 2023
9b2af4f
fixed broken tests
MAYAS3 Jan 3, 2023
e558d8d
removed broken test patching
MAYAS3 Jan 3, 2023
cdd0ca4
cleaned-up parser after new changes integration
MAYAS3 Jan 4, 2023
52f4254
Merge branch 'main' of https://github.com/GRAAL-Research/deepparse in…
MAYAS3 Jan 4, 2023
eac2b5c
black formatting
MAYAS3 Jan 4, 2023
4b2e094
Merge branch 'main' of https://github.com/GRAAL-Research/deepparse in…
MAYAS3 Jan 4, 2023
2cb18d6
remove accidental unused import
MAYAS3 Jan 4, 2023
cf57f4f
fixed linting
MAYAS3 Jan 4, 2023
a349f03
black formatting
MAYAS3 Jan 4, 2023
1eec7d1
removed unnecessary args
MAYAS3 Jan 4, 2023
cc7f42c
patching factories in AddressParser tests to memory optimise
MAYAS3 Jan 5, 2023
131ea01
Merge branch 'dev' of https://github.com/GRAAL-Research/deepparse int…
MAYAS3 Jan 5, 2023
dbd524e
fixed brocken tests
MAYAS3 Jan 5, 2023
b626801
removed unused import
MAYAS3 Jan 5, 2023
f1dd9f7
fixed windows tests
MAYAS3 Jan 5, 2023
e0dbf40
fixed windows test
MAYAS3 Jan 5, 2023
d9b6f1f
removed unused modules after refactor
MAYAS3 Jan 5, 2023
62cc8fa
removed imports for removed modules
MAYAS3 Jan 5, 2023
2354204
add tensorboard dependancies in test/requirements since it make test …
davebulaval Jan 6, 2023
c085b9f
Update deepparse/parser/address_parser.py
MAYAS3 Jan 7, 2023
23353d3
added error handling to data processor factory
MAYAS3 Jan 7, 2023
a82c729
fixed linting
MAYAS3 Jan 7, 2023
7b3b13e
Update deepparse/converter/data_processor_factory.py
davebulaval Jan 7, 2023
40384da
fixed broken tests
MAYAS3 Jan 7, 2023
6c357d6
Merge branch 'refactor' of https://github.com/GRAAL-Research/deeppars…
MAYAS3 Jan 7, 2023
6ebef4d
fixed broken test
MAYAS3 Jan 7, 2023
d0514c8
Merge pull request #172 from GRAAL-Research/refactor
MAYAS3 Jan 7, 2023
d2548bb
Update CHANGELOG.md
davebulaval Jan 31, 2023
7e83d73
Bump docker/metadata-action from 4.1.1 to 4.3.0 (#173)
dependabot[bot] Feb 2, 2023
0a92d31
Bump pylint from 2.15.9 to 2.15.10 (#174)
dependabot[bot] Feb 2, 2023
8290245
Bump docker/build-push-action from 3.2.0 to 4.0.0 (#175)
dependabot[bot] Feb 2, 2023
2036d41
Bump black from 22.12.0 to 23.1.0 (#176)
dependabot[bot] Feb 2, 2023
c0c9d95
Merge branch 'main' into dev
davebulaval Feb 20, 2023
f4c4da7
bump version
davebulaval Feb 20, 2023
560f4de
black formatting
davebulaval Feb 20, 2023
5e38ec9
Merge branch 'main' into dev
davebulaval Feb 20, 2023
5c2e594
fixed missing update of tag converter in data processor
davebulaval Feb 23, 2023
7fa7d49
jupyter notebook formating
davebulaval Feb 23, 2023
2553496
fix typo in test
davebulaval Feb 23, 2023
1f5dc9c
bump version
davebulaval Feb 23, 2023
36876a1
Merge branch 'main' into dev
davebulaval Feb 23, 2023
0e44e97
improve documentation of codecov_push.sh
davebulaval Feb 23, 2023
cbffa0e
run docker ci monthly instead of weekly
davebulaval Mar 18, 2023
a347800
Bump pylint from 2.15.10 to 2.16.2 (#180)
dependabot[bot] Mar 18, 2023
7408fa2
Merge branch 'main' into dev
davebulaval Mar 18, 2023
3e66236
Merge branch 'dev' of github.com:GRAAL-Research/deepparse into dev
davebulaval Mar 18, 2023
184b008
add target to dev instead of main
davebulaval Mar 18, 2023
8bafe17
Python311 (#182)
davebulaval Mar 18, 2023
b9751d9
add missing with_hyphen_split argument in __call__
davebulaval Mar 21, 2023
43cac8f
Pre process address call argument (#183)
davebulaval Mar 21, 2023
e4baa2e
merge solved
davebulaval Mar 21, 2023
f7a0587
Various minor improvements (#185)
davebulaval Mar 27, 2023
f9f3eff
fix typo in doc
davebulaval Mar 27, 2023
67e901e
fix typo in doc
davebulaval Mar 27, 2023
817c8ea
drop python 3.7 (#187)
davebulaval Mar 27, 2023
d363f78
Bug fix retrain model loading (#188)
davebulaval Mar 30, 2023
bb9acf7
normalization of hint typing in doc
davebulaval Mar 30, 2023
cf52574
Lengths as list (#189)
davebulaval Mar 30, 2023
32dd5fd
bump version and changelog
davebulaval Mar 31, 2023
35969ba
improve documentation
davebulaval Apr 27, 2023
cd195be
improve ckpt error message and fix typos
davebulaval May 10, 2023
5452b95
improve error handling cache
davebulaval May 20, 2023
f0e7513
Aws s3 uri (#193)
davebulaval May 20, 2023
a97bb77
fix missing element in changelog
davebulaval May 20, 2023
8aa4b97
add override and better verbose handling in retrain
davebulaval May 20, 2023
5ff7e9f
update changelog
davebulaval May 20, 2023
36c3093
fix error in test
davebulaval May 20, 2023
9e92db5
fix tests
davebulaval May 20, 2023
e678cb6
Bug fix fasttext build (#194)
davebulaval May 21, 2023
92ca782
improve error handling weights load and fix broken tests
davebulaval May 22, 2023
fed024a
bump version and changelog
davebulaval May 22, 2023
abcda85
Merge branch 'main' into dev
davebulaval May 22, 2023
0a9b55e
fix import error
davebulaval May 22, 2023
8 changes: 4 additions & 4 deletions .github/CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,16 +7,16 @@ We love your input! We want to make contributing to this project as easy and tra
- Proposing new features
- Becoming a maintainer

## We Develop with Github
We use github to host code, to track issues and feature requests, as well as accept pull requests.
## We Develop with GitHub
We use GitHub to host code, to track issues and feature requests, as well as accept pull requests.

## We Use [Github Flow](https://guides.github.com/introduction/flow/index.html), So All Code Changes Happen Through Pull Requests
## We Use [GitHub Flow](https://guides.github.com/introduction/flow/index.html), So All Code Changes Happen Through Pull Requests
Pull requests are the best way to propose changes to the codebase. We actively welcome your pull requests:

1. Fork the repo and create your branch from the **`dev` branch**.
2. If you've added code that should be tested, you **must** ensure it is properly tested.
3. If you've changed APIs, update the documentation.
4. Ensure the Travis test suite passes.
4. Ensure the CI/CD test suite passes.
5. Make sure your code lints.
6. Submit that pull request!

Expand Down
1 change: 1 addition & 0 deletions .github/workflows/docs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@ jobs:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -Ur requirements.txt
pip install -Ur docs/requirements.txt
pip install -e .
Expand Down
1 change: 1 addition & 0 deletions .github/workflows/linting.yml
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ jobs:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -Ur requirements.txt
pip install -Ur styling_requirements.txt
pip install -Ur tests/requirements.txt
Expand Down
2 changes: 2 additions & 0 deletions .github/workflows/tests.yml
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,7 @@ jobs:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -Ur requirements.txt
pip install -Ur tests/requirements.txt
python setup.py develop
Expand All @@ -38,6 +39,7 @@ jobs:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -Ur requirements.txt
pip install -Ur tests/requirements.txt
python setup.py develop
Expand Down
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -144,3 +144,5 @@ deepparse/version.py
*.ckpt

*mlruns/

*model/
1 change: 1 addition & 0 deletions .release/bpemb.version
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
aa32fa918494b461202157c57734c374
1 change: 1 addition & 0 deletions .release/bpemb_attention.version
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
cfb190902476376573591c0ec6f91ece
1 change: 1 addition & 0 deletions .release/fasttext.version
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
f67a0517c70a314bdde0b8440f21139d
1 change: 1 addition & 0 deletions .release/fasttext_attention.version
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
a2b688bdfa2aa7c009bb7d980e352978
5 changes: 5 additions & 0 deletions .release/model_version_release.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
# How to Create a New Model's Version

1. `md5sum <model.ckpt> > model.version`
2. Remove the `model.ckpt` text from the `model.version` file
3. Update the latest BPEMB and FastText hashes in `tests/test_tools.py`
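
The first two steps above (hash the checkpoint, then keep only the hash in the version file) can also be sketched in Python. This is a minimal sketch, not part of the PR: the function name `write_version_file` and its parameters are hypothetical, and it assumes the version file should contain only the MD5 hex digest followed by a newline.

```python
import hashlib
from pathlib import Path


def write_version_file(checkpoint_path, version_path, chunk_size=1 << 20):
    """Write the MD5 hex digest of a checkpoint to a version file and return it.

    Hypothetical helper: mirrors `md5sum <model.ckpt> > model.version`
    followed by removing the file name from the output.
    """
    digest = hashlib.md5()
    # Read the checkpoint in chunks so large files are not loaded in memory at once.
    with open(checkpoint_path, "rb") as checkpoint:
        for chunk in iter(lambda: checkpoint.read(chunk_size), b""):
            digest.update(chunk)
    hex_digest = digest.hexdigest()
    # Only the hash is written, without the file name that md5sum appends.
    Path(version_path).write_text(hex_digest + "\n", encoding="utf-8")
    return hex_digest
```

Calling `write_version_file("model.ckpt", "model.version")` would produce a file in the same spirit as the `.release/*.version` files added in this PR; the hashes in `tests/test_tools.py` still need to be updated by hand.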
13 changes: 13 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -313,4 +313,17 @@
suggested in the [documentation](https://pytorch.org/tutorials//intermediate/torch_compile_tutorial.html). It
increases the performance by about 1/100.

## 0.9.7

- New models release with more meta-data.
- Add a feature to use an AddressParser from a URI.
- Add a feature to upload the trained model to a URI.
- Add an example of how to use URI for parsing from and uploading to.
- Improve error handling of `path_to_retrain_model`.
- Bug-fix pre-processor error.
- Add verbose override and improve verbosity handling in retrain.
- Bug-fix the broken FastText installation using `fasttext-wheel` instead of `fasttext` (
see [here](https://github.com/facebookresearch/fastText/issues/512#issuecomment-1534519551)
and [here](https://github.com/facebookresearch/fastText/pull/1292)).

## dev
2 changes: 1 addition & 1 deletion deepparse/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,4 +2,4 @@
from .fasttext_tools import *
from .tools import *
from .version import __version__
from .weights_init import *
from .weights_tools import *
2 changes: 1 addition & 1 deletion deepparse/cli/parse.py
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ def main(args=None) -> None:

.. code-block:: sh

parse fasttext ./dataset.csv parsed_address.pckl --path_to_retrained_model ./path
parse fasttext ./dataset.csv parsed_address.pckl --path_to_model_weights ./path

"""
if args is None: # pragma: no cover
Expand Down
2 changes: 1 addition & 1 deletion deepparse/cli/parser_arguments_adder.py
Original file line number Diff line number Diff line change
Expand Up @@ -108,7 +108,7 @@ def add_batch_size_arg(parser: ArgumentParser) -> None:
def add_path_to_retrained_model_arg(parser: ArgumentParser) -> None:
parser.add_argument(
"--path_to_retrained_model",
help=wrap("A path to a retrained model to use for testing."),
help=wrap("A path to a retrained model to use. It can be an S3-URI."),
type=str,
default=None,
)
Expand Down
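
To illustrate how the updated `--path_to_retrained_model` argument might be consumed, here is a minimal, self-contained sketch. It reproduces the argument definition from the diff without the project's `wrap` helper, and adds a hypothetical `is_s3_uri` check that is not part of deepparse; the bucket name is made up.

```python
from argparse import ArgumentParser


def add_path_to_retrained_model_arg(parser: ArgumentParser) -> None:
    # Same argument as in the diff, with a plain help string instead of wrap(...).
    parser.add_argument(
        "--path_to_retrained_model",
        help="A path to a retrained model to use. It can be an S3-URI.",
        type=str,
        default=None,
    )


def is_s3_uri(path) -> bool:
    # Hypothetical helper: treats any string starting with "s3://" as an S3 URI.
    return path is not None and path.startswith("s3://")


parser = ArgumentParser()
add_path_to_retrained_model_arg(parser)

# Hypothetical bucket and key, used only for illustration.
args = parser.parse_args(["--path_to_retrained_model", "s3://my-bucket/model.ckpt"])
```

With these arguments, `args.path_to_retrained_model` holds the URI as a plain string, and the argument stays `None` when the flag is omitted.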
2 changes: 1 addition & 1 deletion deepparse/network/decoder.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@
import torch
from torch import nn

from ..weights_init import weights_init
from .. import weights_init


class Decoder(nn.Module):
Expand Down
2 changes: 1 addition & 1 deletion deepparse/network/encoder.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@
from torch import nn
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

from ..weights_init import weights_init
from .. import weights_init


class Encoder(nn.Module):
Expand Down
19 changes: 10 additions & 9 deletions deepparse/network/seq2seq.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,14 +4,14 @@
import random
import warnings
from abc import ABC
from collections import OrderedDict
from typing import Tuple, Union, List

import torch
from torch import nn

from .decoder import Decoder
from .encoder import Encoder
from .. import handle_weights_upload
from ..tools import download_weights, latest_version


Expand Down Expand Up @@ -113,20 +113,21 @@ def _load_pre_trained_weights(self, model_type: str, cache_dir: str, offline: bo
)
download_weights(model_type, cache_dir, verbose=self.verbose)

all_layers_params = torch.load(model_path, map_location=self.device)
self.load_state_dict(all_layers_params)
self._load_weights(path_to_model_torch_archive=model_path)

def _load_weights(self, path_to_retrained_model: str) -> None:
def _load_weights(self, path_to_model_torch_archive: str) -> None:
"""
Method to load (into the network) the weights.

Args:
path_to_retrained_model (str): The path to the fine-tuned model.
path_to_model_torch_archive (str): The path to the fine-tuned model Torch archive.
"""
all_layers_params = torch.load(path_to_retrained_model, map_location=self.device)
if isinstance(all_layers_params, dict) and not isinstance(all_layers_params, OrderedDict):
# Case where we have a retrained model with a different tagging space
all_layers_params = all_layers_params.get("address_tagger_model")
all_layers_params = handle_weights_upload(
path_to_model_to_upload=path_to_model_torch_archive, device=self.device
)

# All the time, our torch archive include meta-data along with the model weights
all_layers_params = all_layers_params.get("address_tagger_model")
self.load_state_dict(all_layers_params)

def _encoder_step(self, to_predict: torch.Tensor, lengths: List, batch_size: int) -> Tuple:
Expand Down
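
The change to `_load_weights` above can be sketched with plain dictionaries, without Torch. Before this PR, a bare state dict (an `OrderedDict`) was loaded as-is and only a plain `dict` archive was unwrapped; after it, the archive is always expected to carry the weights under the `"address_tagger_model"` key alongside meta-data. The function names and placeholder values below are hypothetical, and the `"model_type"` meta-data key is an assumption used only for illustration.

```python
from collections import OrderedDict


def extract_state_dict_legacy(archive):
    """Old logic from the diff: unwrap only plain-dict archives."""
    if isinstance(archive, dict) and not isinstance(archive, OrderedDict):
        # Retrained model archive: the weights are stored under a key.
        return archive.get("address_tagger_model")
    # Bare state dict: use it directly.
    return archive


def extract_state_dict(archive):
    """New logic from the diff: the archive always includes meta-data."""
    return archive.get("address_tagger_model")


# Placeholder weights; real archives hold tensors.
weights = OrderedDict([("encoder.weight", [0.1, 0.2])])
archive_with_metadata = {"address_tagger_model": weights, "model_type": "fasttext"}
```

Under this sketch, both functions return the same weights for `archive_with_metadata`, while a bare state dict passed to `extract_state_dict` yields `None`, so older archives without the key would need separate handling.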