Fork bergamot-translator into this repository as inference #867

Merged 467 commits, Oct 1, 2024
Changes from 250 commits
fa2003e
Cleanup API: Refactor request on-complete transition (#80)
Apr 27, 2021
4be96a9
Handle empty translation requests
Apr 27, 2021
e5ec5bd
Control validating the config options via a boolean flag (#116)
abhi-agg Apr 29, 2021
de0abfd
JS bindings for loading model and shortlist files as bytes (#117)
abhi-agg Apr 29, 2021
3525af6
Make wasm test page work with bergamot-models repository
abhi-agg Apr 29, 2021
2788116
Better error logging for wasm test page
abhi-agg Apr 29, 2021
e286533
Update to marian-dev master
XapaJIaMnu Apr 30, 2021
d82e01e
Full windows support with ssplit from browsermt, not a fork (#109)
XapaJIaMnu Apr 30, 2021
f3a257d
Enabled gemm-precision in wasm test page
abhi-agg Apr 29, 2021
4908e40
Updated wasm/README file with instructions for byte loading APIs
abhi-agg Apr 29, 2021
36b3c72
WASM Bindings collapse (#87)
May 3, 2021
1a4add1
Improve script to patch wasm artifacts and load EN->DE vocabulary in …
abhi-agg May 3, 2021
8de368c
Improved wasm scripts and README (#128)
abhi-agg May 4, 2021
ec3a785
Merge remote-tracking branch 'upstream/main' into main
abhi-agg May 4, 2021
d8f7e51
Minor README change
abhi-agg May 4, 2021
c478a62
Updating ci scripts for the latest upstream changes
abhi-agg May 4, 2021
a63533b
Extension desired changes (#129)
motin May 4, 2021
743ebcd
Extension desired changes (#129)
motin May 4, 2021
c61b2bd
Fix busy loop in windows (#131)
kpu May 5, 2021
bc2e4ee
Making bytearray a commandline switch (#127)
May 5, 2021
b86c76b
Faithful to source-structure translation (#115)
May 6, 2021
5b02008
Enable vocabs pass as byte arrays (#122)
qianqianzhu May 7, 2021
21c1cae
Update ssplit submodule, removing absl (#132)
XapaJIaMnu May 7, 2021
bef1276
Minor rename: sentence_ranges -> annotation (#134)
May 7, 2021
87adb5d
Target master of ssplit-cpp
XapaJIaMnu May 7, 2021
354e7ac
Remove unused used types TokenRanges, SentenceTokenRanges, UPtr (#137)
May 9, 2021
ce01de9
Change USE_WASM_COMPATIBLE_SOURCE =OFF by default on native, force on…
kpu May 10, 2021
ce576c2
Export "addOnPreMain" function from wasm module
abhi-agg May 10, 2021
331216e
Enable Debugging information in wasm module builds
abhi-agg May 10, 2021
9f78985
JS bindings for vocabularies as bytes
abhi-agg May 10, 2021
5025285
Updated wasm test page to pass vocabulary files as bytes
abhi-agg May 10, 2021
d7cb859
Refactoring TranslationModelBindings class
abhi-agg May 11, 2021
451ab04
Merge remote-tracking branch 'upstream/main' into main
abhi-agg May 12, 2021
8a6c7b4
Avoid packaging vocab files into wasm binary in CI builds
abhi-agg May 12, 2021
e0b9bad
Updated wasm README to update for passing vocabs as bytes
abhi-agg May 12, 2021
0189500
Updated README to remove packaging steps for wasm compilation
abhi-agg May 12, 2021
6c063c6
Updated CMakeLists.txt to remove packaging steps for wasm compilation
abhi-agg May 12, 2021
6c7e615
Bundle AlignedMemory inputs with MemoryBundle (#147)
qianqianzhu May 13, 2021
77424a3
Enabling ccache on github builds for Ubuntu (#95)
May 17, 2021
5bd1fc6
Refactor vocabs in Service (#143)
qianqianzhu May 17, 2021
3e70587
Rewrite annotation class to remove corner cases (#135)
kpu May 17, 2021
c1ef6f2
Added cmake file to compute version information
abhi-agg May 17, 2021
c44868e
Import GetVersionFromFile cmake file in root level CMakeLists.txt
abhi-agg May 17, 2021
2e5880d
Modified wasm cmake file to include version information in built arti…
abhi-agg May 17, 2021
0ad583c
Generate project version file for native builds
abhi-agg May 17, 2021
067076f
Bumped version to 0.3.0
abhi-agg May 17, 2021
b73714e
Merge remote-tracking branch 'upstream/main' into main
abhi-agg May 18, 2021
7a973df
Corrected the version number
abhi-agg May 18, 2021
1c40cc8
Merge branch 'main' into main
motin May 18, 2021
10131c7
Marian submodule with unified loading (#157)
XapaJIaMnu May 18, 2021
813e81c
Merge branch 'main' into main
abhi-agg May 18, 2021
8b621de
Merge pull request #159 from mozilla/main
abhi-agg May 18, 2021
269edc7
Collapsing TranslationRequest -> ResponseOptions (#139)
May 18, 2021
b25f223
Rewriting batching for threadsafety (#155)
kpu May 18, 2021
89bd473
Use binary lexical shortlist in documentation (#152)
kpu May 19, 2021
7ad8d0a
initialise MemoryBundle members (#167)
qianqianzhu May 19, 2021
9dcf6ab
Adding clang-format and updating existing sources to adhere (#151)
May 19, 2021
0f8f8e0
Pin emsdk version to the same one used in Circle CI (#165)
motin May 20, 2021
4b177d5
GitHub action to push browsermt/main branch to mozilla/bergamot-trans…
motin May 20, 2021
4f8050b
Update tests
XapaJIaMnu May 20, 2021
f125372
Bumping BRT for hotfixes (#169)
May 20, 2021
22a1b91
Remove O(N^2) reallocation (#171)
May 21, 2021
576afae
Adding documentation action (#168)
May 25, 2021
8bec1b7
Fix failures when loading text shortlist (#154)
qianqianzhu May 25, 2021
eb579ed
Updating marian dev RelwithDebInfo -> Release (#178)
May 27, 2021
5d3ec9c
Single executable (#175)
May 31, 2021
ceaf21a
Deploy generated documentation only if browsermt (#179)
Jun 1, 2021
3308403
Including WASM documentation in sphinx build toc (#176)
Jun 1, 2021
73228bb
Updating marian-dev: intgemm with env variable matmul switches (#187)
Jun 3, 2021
5f0d396
Remove addSentenceWithPriority (#186)
kpu Jun 3, 2021
71a6240
Update native (ubuntu, mac) workflows with ccache (#181)
Jun 4, 2021
d39e027
Replace resize with possible negative range with pop_back() (#189)
Jun 4, 2021
3e46e33
Consistent EMSDK version and parallel make jobs in README and github …
abhi-agg Jun 8, 2021
dc2fb3d
CMake fixes: Generate project.h in binary dir, fix GetVersionFromFile…
Jun 9, 2021
3039dea
Fixing if syntax with YAML var subsitution (#188)
Jun 9, 2021
16eb47f
Generating cmake configured project version (.js) file in build folde…
abhi-agg Jun 9, 2021
e9e5ac6
Partial test-apps and tolerance in evaluations (#184)
Jun 14, 2021
4b01466
Removing alignments and quality-scores test-code (#196)
Jun 14, 2021
b00116c
Refactor wasm bindings to use consistent interface names as in native…
abhi-agg Jun 15, 2021
44aa70a
Account for EOS in both source and target annotations (#190)
Jun 15, 2021
13a1fe8
Load sentence-splitter (non-breaking prefixes) from ByteArray
Jun 21, 2021
cb855be
maxLengthBreak_ -> wrapStep bugfix (#200)
Jun 28, 2021
a202e35
Change ResponseBuilder to accept callback instead of future (#142)
Jul 5, 2021
6ad794f
Added public methods in Response class to return sentences
abhi-agg Jul 5, 2021
7052722
JS bindings to return sentence byte ranges
abhi-agg Jul 6, 2021
5a8fe20
Wasm: Enabled sentence byte ranges in the wasm test page
abhi-agg Jul 6, 2021
d31f963
Windows workflow: run-vcpkg7.{3->4}; vcpkg master (#208)
jerinphilip Jul 29, 2021
f3e00ae
Added build instructions to run on other browsers
abhi-agg Aug 11, 2021
9994d4a
Merge pull request #215 from abhi-agg/non-wormhole-builds
abhi-agg Aug 11, 2021
972d856
Add a clang-tidy run (#214)
jerinphilip Aug 13, 2021
b64ffce
Wasm test page using web workers now (#218)
abhi-agg Aug 26, 2021
ff391c6
Updated marian submodule to latest commit of master
abhi-agg Aug 24, 2021
cafb65e
Wasm builds without SharedArrayBuffer
abhi-agg Aug 24, 2021
8e43742
Circle CI wasm artifacts for non-wormhole builds
abhi-agg Aug 31, 2021
48e955c
BRT: Update sacrebleu to get tests back working (#217)
jerinphilip Sep 7, 2021
63120c1
QualityEstimation: Preliminary Implementation (#197)
abarbosa94 Sep 16, 2021
cf541c6
Multiple TranslationModels Implementation (#210)
jerinphilip Sep 21, 2021
c7b626d
Adapted wasm test page for new Service interface (#224)
abhi-agg Sep 28, 2021
a0cb1e4
Wasm test page UI for translating b/w non-English language pairs (#231)
abhi-agg Oct 19, 2021
c5167b3
Import matrix-multiply from a separate wasm module (#232)
abhi-agg Oct 27, 2021
d0d08c0
JS bindings for Quality Estimation (#239)
abhi-agg Oct 27, 2021
2b98c67
Cache for translations (#227)
jerinphilip Oct 27, 2021
45412ce
Set PR to any branch to trigger workflows (#230)
jerinphilip Oct 28, 2021
47e57c9
[ssplit-cpp] Enable position independent library when compiled from s…
jerinphilip Oct 29, 2021
9b44399
EXCLUDE_FROM_ALL for marian and ssplit-cpp 3rd-party libraries (#243)
jerinphilip Oct 31, 2021
c5bc3f5
Update config "skip-cost" to enable log probabilities for QE scores (…
abhi-agg Nov 1, 2021
806169c
Recover logging (#226)
jerinphilip Nov 1, 2021
0bb8095
Deprecate hardAlignment in favour of softAlignment (#250)
jerinphilip Nov 1, 2021
7693a1d
Updated marian submodule (#256)
abhi-agg Nov 3, 2021
fa4efb4
Update ssplit cpp, pcre2 source compile to fix broken builds (#258)
jerinphilip Nov 5, 2021
5a693b7
Fixes windows workflow for PCRE2 (#260)
jerinphilip Nov 5, 2021
d6a14b1
Fix badge to point to this repo instead mozilla's (#261)
andrenatal Nov 15, 2021
f9e55b3
Make script run from any directory (#262)
abhi-agg Nov 15, 2021
2b1b053
Import optimized gemm implementation (when available) for wasm target…
abhi-agg Nov 17, 2021
4036616
HTML input (#253)
kpu Nov 25, 2021
eea5554
HTML handling improvements (#266)
jelmervdl Nov 29, 2021
e8fd01e
Updated marian-dev submodule
abhi-agg Nov 30, 2021
8e79897
Updated configuration for html text translation to work in wasm test …
abhi-agg Dec 1, 2021
e75a9e1
More robust logic to import wasm gemm (#276)
abhi-agg Dec 14, 2021
571d312
Constrain mistune to fix docs CI (#278)
jerinphilip Dec 14, 2021
feb9c90
Additional logs in JS translation worker (#277)
abhi-agg Dec 14, 2021
8563f08
Proper arch setting on win32 (#275)
XapaJIaMnu Dec 14, 2021
420f12b
Remove value length limit from HTML parser & interpolated alignments …
jelmervdl Dec 15, 2021
8884b39
Disabled importing optimized gemm module (#282)
abhi-agg Dec 17, 2021
793d132
Adding circle ci job to push the wasm artifacts to github releases (#…
andrenatal Dec 17, 2021
1a27a8e
Increase HTML test coverage (#279)
jelmervdl Dec 20, 2021
bcbbfe1
Better command-line with isolation for both Services and co-located d…
jerinphilip Dec 21, 2021
f55377b
HTML transfer empty elements (#283)
jelmervdl Dec 21, 2021
9e1c1e8
CI: Circle CI config script update (#287)
abhi-agg Dec 21, 2021
6e6042c
GitHub CI: Update YAML to run all tests on marian-full (#292)
jerinphilip Dec 29, 2021
8eb238e
HTML basic integration tests (#291)
jerinphilip Dec 30, 2021
d209e4f
Fix typo in BRT args on CI runs (#294)
jerinphilip Dec 30, 2021
ddccc77
Turn logging off by default, allow turning on via config/cmdline (#295)
jerinphilip Jan 2, 2022
3883dd1
cache: threadsafety-fixes; optional stats collection (#245)
jerinphilip Jan 2, 2022
81c2192
Have alignments placed if HTML is on (#296)
jerinphilip Jan 3, 2022
dae02a3
HTML transfer script/style/etc elements (#285)
jelmervdl Jan 5, 2022
71b84b7
CI guaranteed example documentation (#300)
jerinphilip Jan 6, 2022
13c55e2
Defer model loading to parallel worker thread (#303)
jelmervdl Jan 14, 2022
e061b56
Treat most HTML elements as word-breaking (#286)
jelmervdl Jan 16, 2022
6a4f409
First class pivot translation capability (#236)
jerinphilip Jan 17, 2022
acbc46d
Accept XHTML-style self-closing void tags (#305)
jelmervdl Jan 19, 2022
7099b9e
Streamline memory-bundle loads (#307)
jerinphilip Jan 19, 2022
aef76c0
Add API to trigger fast shutdown of AsyncService (#297)
jelmervdl Jan 21, 2022
495f98d
Speed up Windows CI with ccache (#308)
jerinphilip Jan 22, 2022
3dde0fe
Remove unused compiler hash script (#309)
jerinphilip Jan 24, 2022
c0f311a
Batteries included python package (#310)
jerinphilip Jan 26, 2022
cfdda15
BRT: Update to fix QE download failures (#321)
jerinphilip Jan 31, 2022
95de806
Fix HTML with pivoting (#323)
jerinphilip Feb 1, 2022
19ae519
Remove obsolete workflow transferring source across forks (#326)
jerinphilip Feb 2, 2022
d95b014
Wasm/JS: Pivot translation API JS binding and test page update (#327)
abhi-agg Feb 2, 2022
91b2e06
emscripten: ccache and artefact upload (#325)
jerinphilip Feb 2, 2022
5e78260
Consolidate release artefacts (#329)
jerinphilip Feb 4, 2022
b1e5a48
Increment version to v0.4.0 (#328)
jerinphilip Feb 5, 2022
97bd6e3
Make default throw exception on abort for python (#333)
jerinphilip Feb 5, 2022
62ff781
Revert "Make default throw exception on abort for python (#333)"
kpu Feb 5, 2022
f6d9233
Revert "Revert "Make default throw exception on abort for python (#33…
kpu Feb 7, 2022
6b2a855
JS/WASM: Re-enable importing optimized gemm module for (#336)
abhi-agg Feb 7, 2022
80bd4e7
Print errors by default in WASM build (#343)
jelmervdl Feb 9, 2022
3478652
Add ability to load `.npz` models (#342)
jerinphilip Feb 9, 2022
ec46919
Allow per-input options (#346)
jerinphilip Feb 11, 2022
c76e630
JS/WASM: Passing ResponseOptions for every item for translation batch…
abhi-agg Feb 14, 2022
a94725b
Update aligned vector following intgemm 1b8cbd6f611c21011325cfe031294…
kpu Feb 14, 2022
9f55fb4
Improve cache (#347)
jerinphilip Feb 15, 2022
2844ced
JS: Refactoring wasm test page (#354)
abhi-agg Feb 17, 2022
6ccd4c6
Create github release via CircleCI only for mozilla fork (#349)
abhi-agg Feb 17, 2022
9eb2437
Bump version to 0.4.1 (#356)
abhi-agg Feb 17, 2022
1f98f97
Improve handling HTML special cases (#312)
jelmervdl Feb 22, 2022
96b0f82
Simplify cache config and bind for use in JS (#359)
jerinphilip Feb 23, 2022
fe3f398
Embed quality-scores as HTML tag attributes (#358)
jelmervdl Feb 25, 2022
1360941
Enable dependabot to automate updating dependencies (#365)
jerinphilip Mar 3, 2022
89a96bf
Use right range and threshold for showing "bad" words/sentences (#370)
abhi-agg Mar 3, 2022
ab7f84f
Bump version to 0.4.2 (#371)
abhi-agg Mar 7, 2022
22d6bc0
Bump 3rd_party/marian-dev from `08b1544` to `7e67124` (#372)
dependabot[bot] Mar 9, 2022
2c0e65c
JS: Reuse Model registry from firefox-translation-models for test pag…
abhi-agg Mar 14, 2022
0a52a6d
JS: Using supervised QE models for available language pairs (#378)
abhi-agg Mar 15, 2022
409b7d2
Bump 3rd_party/marian-dev from `7e67124` to `844800e` (#382)
dependabot[bot] Mar 18, 2022
ed31605
JS: Update languages & use Intl API for their display names (#379)
jelmervdl Mar 23, 2022
46882e7
JS: Fix swap button on test-page (#388)
jerinphilip Mar 24, 2022
1344335
Docs: Pin Jinja2 to last known working version (#389)
jerinphilip Mar 24, 2022
d2e3a82
Bump version to 0.4.3 (#392)
abhi-agg Mar 28, 2022
7d51d10
Bump bergamot-translator-tests from `d03a9d3` to `7984d14` (#394)
dependabot[bot] Mar 30, 2022
df5db52
Fix call to `isspace` (#396)
jelmervdl Mar 31, 2022
f18a883
Bump 3rd_party/ssplit-cpp from `a08d6bc` to `49fde6d` (#408)
dependabot[bot] Apr 14, 2022
98af594
Update and fix windows CI (#410)
jerinphilip Apr 15, 2022
e344206
Upgrade emsdk to 3.1.8 (#414)
abhi-agg Apr 19, 2022
5ae1b1e
Bump version to 0.4.4 (#415)
abhi-agg Apr 28, 2022
ad78165
Bump 3rd_party/marian-dev from `199201e` to `e88c1aa` (#416)
dependabot[bot] May 18, 2022
61d2c35
Set up python packaging for pypi distribution (#424)
jerinphilip Jun 20, 2022
8771078
Basic HTML property testing for WebAssembly (#425)
jerinphilip Jun 21, 2022
05a8778
Bump version to 0.4.5 (#427)
jerinphilip Jun 21, 2022
3ef85e1
Python package: pyyaml >= 5.1 (#429)
jerinphilip Jun 24, 2022
84c761b
Python: Work offline if models are available (#431)
jerinphilip Jun 25, 2022
7f79128
MacOS Wheels (#432)
graemenail Jun 29, 2022
06c31af
update download path
XapaJIaMnu Jan 17, 2023
21eff44
try to update coding_styles workflow
XapaJIaMnu Jan 17, 2023
6cefc43
Latest and greatest clang-format
XapaJIaMnu Jan 18, 2023
620c8b0
Bump qs and express in /wasm/test_page (#444)
dependabot[bot] Jan 18, 2023
6f2659f
Arm updated (#443)
XapaJIaMnu Jan 18, 2023
7d24908
Apply security update and formatting
XapaJIaMnu Jan 18, 2023
2834f04
Expand the node-test.js example code with documentation (#434)
jelmervdl Jan 18, 2023
8d5f877
More portable WASM demo (#437)
jelmervdl Jan 18, 2023
1ba7461
Fix compilation on x86
XapaJIaMnu Jan 19, 2023
82c276a
Fix path to example program
kpu Mar 1, 2023
eb0fe1b
Bump 3rd_party/marian-dev from `69e27d2` to `8ceb051` (#446)
dependabot[bot] May 4, 2023
fceb713
Update workflows
XapaJIaMnu May 4, 2023
3c2a667
Try harder to install gperftools
XapaJIaMnu May 4, 2023
b3d36bc
Bump 3rd_party/marian-dev from `8ceb051` to `bb65f47` (#447)
dependabot[bot] May 10, 2023
ada8c39
Fix compilation on newer gcc
XapaJIaMnu Jun 6, 2023
eaa2562
Sentencepiece windows compilation
XapaJIaMnu Jul 12, 2023
e333208
Bump 3rd_party/marian-dev from `6a6bbb6` to `aa0221e` (#452)
dependabot[bot] Jul 31, 2023
becb6e2
Fix Python formatting (Black) (#453)
graemenail Jul 31, 2023
cbfa839
Fix CI (#454)
graemenail Jul 31, 2023
8011f9c
Bump bergamot-translator-tests from `7984d14` to `a04432d` (#455)
dependabot[bot] Jul 31, 2023
4b0da8d
Enables model ensembles (#450)
graemenail Aug 1, 2023
2bdc493
Bump 3rd_party/ssplit-cpp from `ad2c5a5` to `a311f98` (#456)
dependabot[bot] Aug 8, 2023
ca95467
Bump 3rd_party/marian-dev from `aa0221e` to `8dbde0f` (#458)
dependabot[bot] Aug 11, 2023
534ed37
Remove wormhole references (#459)
XapaJIaMnu Aug 14, 2023
47024ec
Add more things to the gitignore that are not being ignored (#462)
gregtatum Aug 16, 2023
62770bb
Generate a compile_commands.json by default with cmake (#461)
gregtatum Aug 16, 2023
db38262
Report the wasm size on builds (#460)
gregtatum Aug 17, 2023
0b069ac
Bump 3rd_party/marian-dev from `300a50f` to `780df27` (#464)
dependabot[bot] Sep 11, 2023
321be8a
Bump 3rd_party/marian-dev from `780df27` to `11c6ae7` (#466)
dependabot[bot] Sep 20, 2023
73182d4
Pull in marian-dev with fixed CI and clang
kpu Dec 7, 2023
7774029
clang: marian-dev with newer fbgemm
kpu Dec 7, 2023
0367ae0
Fix MKL key URL
kpu Dec 7, 2023
983331b
More pendantic spm
XapaJIaMnu Dec 19, 2023
5261614
model url update in example script (#470)
Kirandevraj Mar 23, 2024
34acd8d
fix downloading of models in the python binding (#472)
bjesus Apr 19, 2024
9271618
Update submodule
XapaJIaMnu May 12, 2024
285cf6b
Add 'inference-engine/' from commit '9271618ebbdc5d21ac4dc4df9e72beb7…
nordzilla Sep 26, 2024
bbb8442
Move inference-engine git submodules to the repository root
nordzilla Sep 19, 2024
cad3963
Rename inference-engine/3rd_party/marian-nmt
nordzilla Sep 19, 2024
3da08c9
Remove bergamot-translator-tests dependency
nordzilla Sep 26, 2024
37d0113
Remove .circleci and .github files
nordzilla Sep 20, 2024
27e85d2
Remove unneeded Python code
nordzilla Sep 20, 2024
1019fdd
Remove unneeded CLI code
nordzilla Sep 26, 2024
b6906d9
Remove unneded doc code
nordzilla Sep 26, 2024
00f4a30
Add build-local script to inference-engine
nordzilla Sep 26, 2024
bb47eed
Add unit-tests script to inference-engine
nordzilla Sep 26, 2024
cf23bf7
Add clean script to inference-engine
nordzilla Sep 26, 2024
07e3216
Move build-wasm script to inference-engine/scripts directory
nordzilla Sep 26, 2024
c62bea0
Add review groups to CODEOWNERS
nordzilla Sep 25, 2024
72b6c9d
Rename inference-engine to inference
nordzilla Sep 30, 2024
8d2edd1
Reintroduce browsermt-marian-dev comment to .gitmodules file
nordzilla Sep 30, 2024
01e3af5
Remove sub-directory README files
nordzilla Sep 30, 2024
baf2d55
Move hidden clang files to the repository root
nordzilla Sep 30, 2024
39ee2c4
Remove inference/Doxyfile.in
nordzilla Oct 1, 2024
bdbb68a
Remove inference/MANIFEST.in
nordzilla Oct 1, 2024
3558f4f
Remove inference/LICENSE
nordzilla Oct 1, 2024
55f04b1
Add TODO for issue #869
nordzilla Oct 1, 2024
5 changes: 5 additions & 0 deletions .clang-format
@@ -0,0 +1,5 @@
BasedOnStyle: Google

# Maximum line length 80 is too low even for 1080p monitor. @XapaJIaMnu
# personally would like 120.
ColumnLimit: 120
4 changes: 4 additions & 0 deletions .clang-format-ignore
@@ -0,0 +1,4 @@
3rd_party
wasm/test_page
src/translator/aligned.h
src/translator/pcqueue.h
32 changes: 32 additions & 0 deletions .clang-tidy
@@ -0,0 +1,32 @@
Checks: >
    .*,
    bugprone-*,
    concurrency-*,
    google-*,
    portability-*,
    performance-*,
    clang-analyzer-*,
    readability-*,
    -readability-implicit-bool-conversion,
    -readability-isolate-declaration,
    -readability-uppercase-literal-suffix,
    misc-*,
    -misc-noexcept*,
    modernize-*,
    -modernize-deprecated-headers,
    -modernize-use-nodiscard,
    -modernize-raw-string-literal,
    -modernize-return-braced-init-list,
    -modernize-use-equals-delete,
    -modernize-use-trailing-return-type,



CheckOptions:
- { key: readability-identifier-naming.ClassCase, value: CamelCase }
- { key: readability-identifier-naming.ClassMethodCase, value: camelBack }
- { key: readability-identifier-naming.VariableCase, value: camelBack }
- { key: readability-identifier-naming.FunctionCase, value: camelBack }
- { key: readability-identifier-naming.PrivateMemberSuffix, value: _ }
- { key: readability-identifier-naming.ParameterCase, value: camelBack }
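
For illustration, the `CheckOptions` above translate into C++ roughly as follows. This is a minimal sketch and not code from this PR: `SentenceBuffer`, `addSentence`, `sentenceCount`, and `sumLengths` are hypothetical names chosen only to show each naming rule in one place.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// ClassCase: CamelCase.
class SentenceBuffer {
 public:
  // ClassMethodCase and ParameterCase: camelBack.
  void addSentence(const std::string &sentenceText) {
    assert(!sentenceText.empty());  // Demo-only precondition, not a project rule.
    sentences_.push_back(sentenceText);
  }

  std::size_t sentenceCount() const { return sentences_.size(); }

 private:
  // PrivateMemberSuffix: private members end with an underscore.
  std::vector<std::string> sentences_;
};

// FunctionCase: camelBack for free functions.
std::size_t sumLengths(const std::vector<std::string> &sentences) {
  // VariableCase: camelBack for local variables.
  std::size_t totalLength = 0;
  for (const std::string &sentence : sentences) {
    totalLength += sentence.size();
  }
  return totalLength;
}
```

A name such as `sentence_count()` or a private member without the trailing underscore would presumably be flagged by `readability-identifier-naming` under this configuration.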

30 changes: 28 additions & 2 deletions .github/CODEOWNERS
@@ -1,5 +1,31 @@
# Firefox Translations review group
.dockerignore @mozilla/firefox-translations
.github @mozilla/firefox-translations
.gitignore @mozilla/firefox-translations
.gitmodules @mozilla/firefox-translations
docker @mozilla/firefox-translations
docs @mozilla/firefox-translations
utils @mozilla/firefox-translations
CODE_OF_CONDUCT.md @mozilla/firefox-translations
LICENSE @mozilla/firefox-translations
poetry.lock @mozilla/firefox-translations
pyproject.toml @mozilla/firefox-translations
README.md @mozilla/firefox-translations
Taskfile.yml @mozilla/firefox-translations

# Translations Training review group
configs @mozilla/translations-training
pipeline @mozilla/translations-training
snakemake @mozilla/translations-training
tests @mozilla/translations-training
tracking @mozilla/translations-training

# Translations Inference review group
inference-engine @mozilla/translations-inference

# Taskcluster pipeline related files. Changes to these ought to be reviewed by
# RelEng to watch for security issues and best practices. These should also
# be reviewed by people familiar with the pipeline itself.
-.taskcluster.yml @mozilla/releng
-taskcluster @mozilla/releng
+.taskcluster.yml @mozilla/releng @mozilla/translations-training
+taskcluster @mozilla/releng @mozilla/translations-training

24 changes: 24 additions & 0 deletions .gitmodules
@@ -16,3 +16,27 @@
[submodule "3rd_party/preprocess"]
path = 3rd_party/preprocess
url = https://github.com/kpu/preprocess.git
[submodule "inference/3rd_party/ssplit-cpp"]
path = inference/3rd_party/ssplit-cpp
url = https://github.com/browsermt/ssplit-cpp
# This is the same dependency and repository as `3rd_party/browsermt-marian-dev` below.
#
# When forking `inference-engine` into this project, I made an earnest attempt to utilize the preexisting
# `3rd_party/browsermt-marian-dev` submodule within `inference-engine`. Unfortunately, I ran into several roadblocks:
#
# 1) I cannot directly add `3rd_party/browsermt-marian-dev` as a cmake subdirectory because cmake is aware that
# this path is not a subdirectory of the `inference-engine` project root.
#
# 2) Symbolic links do not appear to work for git submodule directories the way that they do for regular directories.
# Even if the symbolic link had linked correctly, it may have still failed due to the considerations of 1).
#
# 3) I tried using cmake to copy the files from `3rd_party/browsermt-marian-dev` into `inference-engine/3rd_party/browsermt-marian-dev`
# at build time, which would ensure that there is no duplicate reference to the URL in this file, however the upstream dependency itself
# has hard-coded expectations that the `.git` directory is only one level up, which appears to work correctly for the way git submodules are
# configured, but does not work if the files are copied over to a regular directory deeper in the repository's directory tree.
#
# It may be possible to remove `3rd_party/browsermt-marian-dev` to instead use `inference-engine/3rd-party/browsermt-marian-dev` everywhere
# within this repository, but I will leave that for a future commit if there is a need to do so.
[submodule "inference/3rd_party/browsermt-marian-dev"]
path = inference/3rd_party/browsermt-marian-dev
url = https://github.com/browsermt/marian-dev
24 changes: 24 additions & 0 deletions Taskfile.yml
@@ -75,6 +75,30 @@ tasks:
    cmds:
      - poetry run opuscleaner-server serve --host=0.0.0.0 --port=8000

  inference-clean:
    desc: Clean build artifacts from the inference directory.
    cmds:
      - >-
        task docker-run -- ./inference/scripts/clean.sh

  inference-build:
    desc: Build inference engine.
    cmds:
      - >-
        task docker-run -- ./inference/scripts/build-local.sh

  inference-test:
    desc: Run inference tests.
    cmds:
      - >-
        task docker-run -- ./inference/scripts/unit-tests.sh

  inference-build-wasm:
    desc: Build inference engine WASM.
    cmds:
      - >-
        task docker-run -- ./inference/scripts/build-wasm.sh

  lint-black:
    desc: Checks the styling of the Python code with Black.
    deps: [poetry-install-black]
30 changes: 30 additions & 0 deletions inference/.gitignore
@@ -0,0 +1,30 @@
# vim temporary files
*.swp
*.swo

# CMake
CMakeLists.txt.user
CMakeCache.txt
CMakeFiles
CMakeScripts
Testing
Makefile
cmake_install.cmake
install_manifest.txt
compile_commands.json
CTestTestfile.cmake
_deps


wasm/test_page/node_modules
/build
/build-local
/build-native
/build-wasm
/emsdk
models
wasm/module/worker/bergamot-translator-worker.*
wasm/module/browsermt-bergamot-translator-*.tgz

# VSCode
.vscode
32 changes: 32 additions & 0 deletions inference/3rd_party/CMakeLists.txt
@@ -0,0 +1,32 @@
# browsermt-marian-dev is tested elsewhere in both paths, turning off here.
set(COMPILE_TESTS OFF)
add_subdirectory(browsermt-marian-dev EXCLUDE_FROM_ALL)

if(COMPILE_WASM)
# This is a bad way of adding compilation flags. Will be improved soon.
add_compile_options(${WASM_COMPILE_FLAGS})
add_link_options(${WASM_LINK_FLAGS})
endif(COMPILE_WASM)

add_subdirectory(ssplit-cpp EXCLUDE_FROM_ALL)

# Add include directories for 3rd party targets to be able to use them anywhere in the
# project without explicitly specifying their include directories. Once they
# fix this problem, it can be removed.
get_property(INCDIRS DIRECTORY browsermt-marian-dev/src PROPERTY INCLUDE_DIRECTORIES)
target_include_directories(marian PUBLIC ${INCDIRS})

get_property(INCLUDE_DIRECTORIES DIRECTORY ssplit-cpp/src PROPERTY INCLUDE_DIRECTORIES)
target_include_directories(ssplit PUBLIC ${INCLUDE_DIRECTORIES})

get_property(COMPILE_DEFINITIONS DIRECTORY browsermt-marian-dev PROPERTY COMPILE_DEFINITIONS)
target_compile_definitions(marian PUBLIC ${COMPILE_DEFINITIONS})

get_property(COMPILE_OPTIONS DIRECTORY browsermt-marian-dev PROPERTY COMPILE_OPTIONS)
target_compile_options(marian PUBLIC ${COMPILE_OPTIONS})

# Compilation flags
get_directory_property(CMAKE_C_FLAGS DIRECTORY browsermt-marian-dev DEFINITION CMAKE_C_FLAGS)
get_directory_property(CMAKE_CXX_FLAGS DIRECTORY browsermt-marian-dev DEFINITION CMAKE_CXX_FLAGS)
set(CMAKE_C_FLAGS ${CMAKE_C_FLAGS} PARENT_SCOPE)
set(CMAKE_CXX_FLAGS ${CMAKE_CXX_FLAGS} PARENT_SCOPE)
1 change: 1 addition & 0 deletions inference/3rd_party/browsermt-marian-dev
Submodule browsermt-marian-dev added at 2781d7
1 change: 1 addition & 0 deletions inference/3rd_party/ssplit-cpp
Submodule ssplit-cpp added at a311f9
1 change: 1 addition & 0 deletions inference/BERGAMOT_VERSION
@@ -0,0 +1 @@
v0.4.5
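
As a rough illustration of how a `v<major>.<minor>.<patch>` string like the one above can be split into numeric components, here is a hedged C++ sketch. It is not the project's mechanism: the commit list mentions a `GetVersionFromFile` CMake file that handles versioning, and its contents are not shown here. `ProjectVersion` and `parseVersion` are hypothetical names.

```cpp
#include <cstdio>
#include <optional>
#include <string>

// Hypothetical holder for the three numeric parts of a version string.
struct ProjectVersion {
  int majorPart;
  int minorPart;
  int patchPart;
};

// Parses strings shaped like "v0.4.5". Returns std::nullopt when the leading
// 'v' or any of the three dot-separated numbers is missing. Anything after
// the patch number (such as a trailing newline) is ignored.
std::optional<ProjectVersion> parseVersion(const std::string &text) {
  ProjectVersion version{};
  const int matched =
      std::sscanf(text.c_str(), "v%d.%d.%d", &version.majorPart, &version.minorPart, &version.patchPart);
  if (matched != 3) {
    return std::nullopt;
  }
  return version;
}
```

Calling `parseVersion("v0.4.5")` yields the parts 0, 4, and 5, while an input without the `v` prefix is rejected.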
188 changes: 188 additions & 0 deletions inference/CMakeLists.txt
@@ -0,0 +1,188 @@
cmake_minimum_required(VERSION 3.5.1)
set(CMAKE_MODULE_PATH ${CMAKE_CURRENT_SOURCE_DIR}/cmake)

if (POLICY CMP0074)
cmake_policy(SET CMP0074 NEW) # CMake 3.12
endif ()

if (POLICY CMP0077)
cmake_policy(SET CMP0077 NEW)
endif()

project(bergamot_translator CXX C)

# Retrieve the parent-directory path of PROJECT_SOURCE_DIR and assign that to REPOSITORY_ROOT_DIR.
cmake_path(GET PROJECT_SOURCE_DIR PARENT_PATH REPOSITORY_ROOT_DIR)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

# Generate a compile_commands.json in the build directory. The compile commands allow
# code editors to understand the build process and provide static analysis of the code.
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)

# Note that with CMake MSVC build, the option CMAKE_BUILD_TYPE is automatically derived from the key
# 'configurationType' in CMakeSettings.json configurations
if(NOT CMAKE_BUILD_TYPE)
message(WARNING "CMAKE_BUILD_TYPE not set; setting to Release")
set(CMAKE_BUILD_TYPE "Release")
endif()

if(NOT COMPILE_WASM)
# Setting BUILD_ARCH to native invokes CPU intrinsic detection logic below.
# Prevent invoking that logic for WASM builds.
set(BUILD_ARCH native CACHE STRING "Compile for this CPU architecture.")

# Unfortunately, MSVC supports only a limited subset of BUILD_ARCH flags. Instead, guess which
# architecture to compile for by reading BUILD_ARCH and mapping it to MSVC values.
# references: https://clang.llvm.org/docs/UsersManual.html https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html https://gcc.gnu.org/onlinedocs/gcc-4.8.5/gcc/i386-and-x86-64-Options.html
# https://docs.microsoft.com/en-us/cpp/build/reference/arch-x86?redirectedfrom=MSDN&view=vs-2019&view=msvc-170 https://devblogs.microsoft.com/oldnewthing/20201026-00/?p=104397
# This is by no means an exhaustive list, but it should match the most common flags Linux programmers expect to pass to MSVC.
if(MSVC)
if(BUILD_ARCH STREQUAL "native") # avx2 is good default for native. Very few desktop systems support avx512
set(MSVC_BUILD_ARCH "/arch:AVX2")
elseif(BUILD_ARCH STREQUAL "skylake-avx512" OR BUILD_ARCH STREQUAL "cannonlake" OR BUILD_ARCH STREQUAL "x86-64-v4" OR BUILD_ARCH STREQUAL "tigerlake" OR BUILD_ARCH STREQUAL "cooperlake" OR BUILD_ARCH STREQUAL "cascadelake")
set(MSVC_BUILD_ARCH "/arch:AVX512")
elseif(BUILD_ARCH STREQUAL "core-avx2" OR BUILD_ARCH STREQUAL "haswell" OR BUILD_ARCH STREQUAL "x86-64-v3" OR BUILD_ARCH STREQUAL "broadwell" OR BUILD_ARCH STREQUAL "skylake")
set(MSVC_BUILD_ARCH "/arch:AVX2")
elseif(BUILD_ARCH STREQUAL "sandybridge" OR BUILD_ARCH STREQUAL "corei7-avx" OR BUILD_ARCH STREQUAL "core-avx-i" OR BUILD_ARCH STREQUAL "ivybridge")
set(MSVC_BUILD_ARCH "/arch:AVX")
elseif(BUILD_ARCH STREQUAL "nehalem" OR BUILD_ARCH STREQUAL "westmere" OR BUILD_ARCH STREQUAL "x86-64-v2" OR BUILD_ARCH STREQUAL "corei7" OR BUILD_ARCH STREQUAL "core2")
set(MSVC_BUILD_ARCH "/arch:SSE2") # This is MSVC default. We won't go down to SSE because we don't support that hardware at all with intgemm. Marian recommends to only go down to SSE4.1 at most
else()
message(WARNING "Unknown BUILD_ARCH ${BUILD_ARCH} provided. Default to SSE2 for Windows build")
set(MSVC_BUILD_ARCH "/arch:SSE2")
endif()
endif(MSVC)
endif()
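
The `BUILD_ARCH` to `/arch:` mapping in the MSVC branch above amounts to a lookup table with an SSE2 fallback. The same table can be sketched in C++ for readers who find the long `elseif` chain hard to scan. `msvcArchFlag` is a hypothetical helper written for illustration and is not part of the build.

```cpp
#include <string>
#include <unordered_map>

// Returns the MSVC /arch flag that the CMake branch above selects for a given
// BUILD_ARCH value. Unknown values fall back to /arch:SSE2, mirroring the
// else() branch (the CMake code additionally prints a warning in that case).
std::string msvcArchFlag(const std::string &buildArch) {
  static const std::unordered_map<std::string, std::string> archFlags = {
      {"native", "/arch:AVX2"},
      {"skylake-avx512", "/arch:AVX512"},
      {"cannonlake", "/arch:AVX512"},
      {"x86-64-v4", "/arch:AVX512"},
      {"tigerlake", "/arch:AVX512"},
      {"cooperlake", "/arch:AVX512"},
      {"cascadelake", "/arch:AVX512"},
      {"core-avx2", "/arch:AVX2"},
      {"haswell", "/arch:AVX2"},
      {"x86-64-v3", "/arch:AVX2"},
      {"broadwell", "/arch:AVX2"},
      {"skylake", "/arch:AVX2"},
      {"sandybridge", "/arch:AVX"},
      {"corei7-avx", "/arch:AVX"},
      {"core-avx-i", "/arch:AVX"},
      {"ivybridge", "/arch:AVX"},
      {"nehalem", "/arch:SSE2"},
      {"westmere", "/arch:SSE2"},
      {"x86-64-v2", "/arch:SSE2"},
      {"corei7", "/arch:SSE2"},
      {"core2", "/arch:SSE2"},
  };
  const auto found = archFlags.find(buildArch);
  return found == archFlags.end() ? std::string("/arch:SSE2") : found->second;
}
```

Note that `native` maps to AVX2 rather than to whatever the build machine supports, which matches the comment in the CMake code that very few desktop systems support AVX512.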

#MSVC can't seem to pick up correct flags otherwise:
if(MSVC)
add_definitions(-DUSE_SSE2=1) # Supposed to fix something in the sse_mathfun.h but not sure it does
set(INTRINSICS ${MSVC_BUILD_ARCH}) # ARCH we're targeting on win32. @TODO variable

set(CMAKE_CXX_FLAGS "/EHsc /DWIN32 /D_WINDOWS /DUNICODE /D_UNICODE /D_CRT_NONSTDC_NO_WARNINGS /D_CRT_SECURE_NO_WARNINGS /bigobj")
set(CMAKE_CXX_FLAGS_RELEASE "${CMAKE_CXX_FLAGS} /MT /O2 ${INTRINSICS} /MP /GL /DNDEBUG")
set(CMAKE_CXX_FLAGS_DEBUG "${CMAKE_CXX_FLAGS} /MTd /Od /Ob0 ${INTRINSICS} /RTC1 /Zi /D_DEBUG")

# ignores warning LNK4049: locally defined symbol free imported - this comes from zlib
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} /DEBUG /LTCG:incremental /INCREMENTAL:NO /ignore:4049")
set(CMAKE_EXE_LINKER_FLAGS_RELEASE "${CMAKE_EXE_LINKER_FLAGS} /NODEFAULTLIB:MSVCRT")
set(CMAKE_EXE_LINKER_FLAGS_DEBUG "${CMAKE_EXE_LINKER_FLAGS} /NODEFAULTLIB:MSVCRTD")
set(CMAKE_STATIC_LINKER_FLAGS "${CMAKE_STATIC_LINKER_FLAGS} /LTCG:incremental")
endif(MSVC)

include(CMakeDependentOption)

# Project specific cmake options
option(COMPILE_WASM "Compile for WASM" OFF)
cmake_dependent_option(USE_WASM_COMPATIBLE_SOURCE "Use wasm compatible sources" OFF "NOT COMPILE_WASM" ON)

# WASM disables a million libraries, including the unit-test library.
cmake_dependent_option(COMPILE_UNIT_TESTS "Compile unit tests" OFF "USE_WASM_COMPATIBLE_SOURCE" ON)
option(COMPILE_TESTS "Compile bergamot-tests" OFF)
cmake_dependent_option(ENABLE_CACHE_STATS "Enable stats on cache" ON "COMPILE_TESTS" OFF)
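# Sketch of how the dependent options above resolve, per CMakeDependentOption semantics
# (the command lines are illustrative, not taken from the project docs):
#   cmake -DCOMPILE_WASM=ON ..   -> USE_WASM_COMPATIBLE_SOURCE is forced ON
#   cmake -DCOMPILE_WASM=OFF ..  -> USE_WASM_COMPATIBLE_SOURCE is a user option, default OFF
#   cmake -DCOMPILE_TESTS=ON ..  -> ENABLE_CACHE_STATS is a user option, default ON
#   cmake -DCOMPILE_TESTS=OFF .. -> ENABLE_CACHE_STATS is forced OFF
# Printing the resolved values is one way to confirm what a given configuration produced.
message(STATUS "USE_WASM_COMPATIBLE_SOURCE: ${USE_WASM_COMPATIBLE_SOURCE}")
message(STATUS "COMPILE_UNIT_TESTS: ${COMPILE_UNIT_TESTS}")
message(STATUS "ENABLE_CACHE_STATS: ${ENABLE_CACHE_STATS}")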


# Set 3rd party submodule specific cmake options for this project
SET(COMPILE_CUDA OFF CACHE BOOL "Compile GPU version")
SET(USE_SENTENCEPIECE ON CACHE BOOL "Download and compile SentencePiece")
SET(USE_STATIC_LIBS ON CACHE BOOL "Link statically against non-system libs")
SET(SSPLIT_COMPILE_LIBRARY_ONLY ON CACHE BOOL "Do not compile ssplit tests")
if (USE_WASM_COMPATIBLE_SOURCE)
SET(COMPILE_LIBRARY_ONLY ON CACHE BOOL "Build only the Marian library and exclude all executables.")
SET(USE_MKL OFF CACHE BOOL "Compile with MKL support")
# Setting the ssplit-cpp submodule specific cmake options for wasm
SET(SSPLIT_USE_INTERNAL_PCRE2 ON CACHE BOOL "Use internal PCRE2 instead of system PCRE2")
endif()

# Documentation: https://cliutils.gitlab.io/modern-cmake/chapters/projects/submodule.html
# Ensures the submodules are set correctly during a build.
find_package(Git QUIET)
if(GIT_FOUND AND EXISTS "${REPOSITORY_ROOT_DIR}/.git")
# Update submodules as needed
option(GIT_SUBMODULE "Check submodules during build" ON)
if(GIT_SUBMODULE)
message(STATUS "Submodule update")
execute_process(COMMAND ${GIT_EXECUTABLE} submodule update --init --recursive
WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}
RESULT_VARIABLE GIT_SUBMOD_RESULT)
if(NOT GIT_SUBMOD_RESULT EQUAL "0")
message(FATAL_ERROR "git submodule update --init --recursive failed with ${GIT_SUBMOD_RESULT}, please check out submodules")
endif()
endif()
endif()
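# Usage sketch (illustrative command line, not taken from the project docs): the
# submodule check above can be skipped through the GIT_SUBMODULE option, for example
# when the submodules are managed by hand:
#   cmake -DGIT_SUBMODULE=OFF ..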

# Project versioning
include(GetVersionFromFile)
message(STATUS "Project name: ${PROJECT_NAME}")
message(STATUS "Project version: ${PROJECT_VERSION_STRING_FULL}")

if(COMPILE_WASM)
# See https://github.com/emscripten-core/emscripten/blob/main/src/settings.js
list(APPEND WASM_COMPILE_FLAGS
-O3
# Preserve whitespace in JS even for release builds; this doesn't increase wasm binary size
$<$<CONFIG:Release>:-g1>
# Emit relevant debug info only for RelWithDebInfo builds, as this increases wasm binary size
$<$<CONFIG:RelWithDebInfo>:-g2>
-fPIC
-mssse3
-msimd128
# -fno-exceptions # Can't do that because spdlog uses exceptions
-sDISABLE_EXCEPTION_CATCHING=1
-sSTRICT=1
)
list(APPEND WASM_LINK_FLAGS
-O3
# Preserve whitespace in JS even for release builds; this doesn't increase wasm binary size
$<$<CONFIG:Release>:-g1>
# Emit relevant debug info only for RelWithDebInfo builds, as this increases wasm binary size
$<$<CONFIG:RelWithDebInfo>:-g2>
-lembind
# Disable runtime assertions: saves some code size and gains some speed
-sASSERTIONS=0
-sDISABLE_EXCEPTION_CATCHING=1
# The intgemm functions we call will be undefined at link time, since they are linked at
# runtime by our own JavaScript.
-sLLD_REPORT_UNDEFINED
-sERROR_ON_UNDEFINED_SYMBOLS=0
# Strict mode: drop support for deprecated Emscripten build options (because we can!)
-sSTRICT=1
# Let the wasm memory grow at runtime (you know we need it)
-sALLOW_MEMORY_GROWTH=1
-sENVIRONMENT=web,worker
# No need to call main(), there's nothing there.
-sINVOKE_RUN=0
# No need for filesystem code in the generated JavaScript
-sFILESYSTEM=0
# If you turn this on, it will mangle names which makes the dynamic linking hard.
-sDECLARE_ASM_MODULE_EXPORTS=0
# Export all of the intgemm functions in case we need to fall back to using the embedded intgemm
-sEXPORTED_FUNCTIONS=[_int8PrepareAFallback,_int8PrepareBFallback,_int8PrepareBFromTransposedFallback,_int8PrepareBFromQuantizedTransposedFallback,_int8PrepareBiasFallback,_int8MultiplyAndAddBiasFallback,_int8SelectColumnsOfBFallback]
# Necessary for mozintgemm linking. This prepares the `wasmMemory` variable ahead of time as
# opposed to delegating that task to the wasm binary itself. This way we can link MozIntGEMM
# module to the same memory as the main bergamot-translator module.
-sIMPORTED_MEMORY=1
# Dynamic execution is either frowned upon or blocked inside browser extensions
-sDYNAMIC_EXECUTION=0
)
endif(COMPILE_WASM)
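# Sketch of how the two lists above might be consumed by a target. This is an
# assumption for illustration: the target name `example-wasm-target` is hypothetical
# and the real wiring lives elsewhere in the project.
#   target_compile_options(example-wasm-target PRIVATE ${WASM_COMPILE_FLAGS})
#   target_link_options(example-wasm-target PRIVATE ${WASM_LINK_FLAGS})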

# Needs to be enabled before including the folder containing tests (src/tests)
if(COMPILE_TESTS)
enable_testing()
endif(COMPILE_TESTS)

add_subdirectory(3rd_party)
add_subdirectory(src)

if(COMPILE_WASM)
add_subdirectory(wasm)
endif(COMPILE_WASM)

option(COMPILE_PYTHON "Compile python bindings. Intended to be activated with setup.py" OFF)
if(COMPILE_PYTHON)
add_subdirectory(bindings/python)
endif(COMPILE_PYTHON)
