Merge segmenter_lstm with segmenter #2087

Merged: makotokato merged 1 commit into unicode-org:main from merge-lstm-with-segmenter on Jun 24, 2022

Conversation

makotokato
Member

For #1654

@makotokato makotokato requested review from aethanyc, sffc and a team as code owners June 17, 2022 10:36
@makotokato makotokato force-pushed the merge-lstm-with-segmenter branch from 9010803 to 8e10cf6 Compare June 19, 2022 05:37
@jira-pull-request-webhook

Notice: the branch changed across the force-push!

  • Cargo.lock is different
  • Cargo.toml is different
  • experimental/segmenter/Cargo.toml is different

View Diff Across Force-Push

~ Your Friendly Jira-GitHub PR Checker Bot

@aethanyc aethanyc linked an issue Jun 19, 2022 that may be closed by this pull request
Contributor

@aethanyc aethanyc left a comment

Looks good to me.

@sffc it would be great if you can take another look. Thanks!

@@ -22,7 +22,6 @@ experimental/collator/ @hsivonen @echeran
experimental/normalizer/ @hsivonen @echeran
experimental/provider_ppucd/ @echeran
experimental/segmenter/ @aethanyc @makotokato
Contributor

Suggestion: If @sffc doesn't mind, I'd like to propose adding @sffc as a co-owner of segmenter, since he is the owner of the subsumed segmenter_lstm.

Member Author

@sffc, could I add you to this?

@@ -168,3 +169,94 @@ impl Lstm {
bies
}
}

#[cfg(test)]
mod tests {
Contributor

question: I assume these tests within this mod are adapted from lstm_test.rs without any change to the logic. If not, please let me know.

Member Author

`lstm_structs` and `lstm_bies` are private modules now, so I moved the tests from `lstm_test.rs` into `lstm_bies.rs`.
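
For readers unfamiliar with the pattern, here is a minimal sketch of why tests have to move inside a module once it becomes private: a child `tests` module can reach the parent's private items through `use super::*`, while an external test file cannot. The function names (`is_segment_end`, `segment_count`) and their logic are hypothetical illustrations, not code from this PR.

```rust
// Hypothetical stand-in for a private module such as `lstm_bies`.
mod lstm_bies {
    /// Private helper: true for tags that close a segment ('e' or 's').
    /// Invented for this sketch; not the crate's real logic.
    fn is_segment_end(tag: char) -> bool {
        tag == 'e' || tag == 's'
    }

    /// Crate-visible wrapper that counts segments in a BIES tag string.
    pub(crate) fn segment_count(bies: &str) -> usize {
        bies.chars().filter(|c| is_segment_end(*c)).count()
    }

    // Unit tests live inside the module, so they can exercise the
    // private helper directly.
    #[cfg(test)]
    mod tests {
        use super::*;

        #[test]
        fn private_helper_is_reachable() {
            assert!(is_segment_end('e'));
            assert!(!is_segment_end('b'));
            assert_eq!(segment_count("biesbe"), 3);
        }
    }
}
```

An integration test under `tests/` would only see the crate's public API, which is presumably why the standalone test file could not stay as it was.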

@@ -157,6 +157,14 @@ extern crate lazy_static;
// Use the LSTM when the feature is enabled.
#[cfg(feature = "lstm")]
mod lstm;
#[cfg(feature = "lstm")]
mod lstm_bies;
Member

Nit (for later): it would be cleaner to nest these under the lstm module.
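
A rough sketch of what that nesting might look like, assuming `lstm_bies` and `lstm_structs` become private submodules of `lstm`. The items inside (`ModelData`, `tag`, `bies_for`) are placeholders invented for illustration and do not reflect the crate's real types or algorithm. In the real crate the parent would presumably keep its `#[cfg(feature = "lstm")]` attribute, which would then cover both children, so the repeated attributes could go away.

```rust
// #[cfg(feature = "lstm")]  // one gate on the parent; commented out so the sketch compiles standalone
mod lstm {
    // Private children: nothing outside `lstm` can name them.
    mod lstm_structs {
        /// Placeholder for the model data (hypothetical fields).
        pub struct ModelData {
            pub default_tag: char,
        }
    }

    mod lstm_bies {
        use super::lstm_structs::ModelData;

        /// Placeholder tagger: emits the model's default tag for every
        /// character. This is NOT the real BIES inference.
        pub fn tag(model: &ModelData, input: &str) -> String {
            input.chars().map(|_| model.default_tag).collect()
        }
    }

    /// The only item visible outside `lstm`.
    pub fn bies_for(input: &str) -> String {
        let model = lstm_structs::ModelData { default_tag: 's' };
        lstm_bies::tag(&model, input)
    }
}
```

With this layout, callers elsewhere in the crate would go through `lstm::bies_for` and never see the helper modules.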

serde = { version = "1.0", default-features = false, features = ["derive", "alloc"], optional = true }
serde_json = { version = "1.0", default-features = false, features = ["alloc"] }
lazy_static = { version = "1.0", features = ["spin_no_std"] }
zerovec = { version = "0.7", path = "../../utils/zerovec", features = ["yoke"] }
databake = { version = "0.4", path = "../../utils/databake", optional = true, features = ["derive"] }
litemap = { version = "0.4.0", path = "../../utils/litemap", optional = true, features = ["serde"] }
ndarray = { git = "https://github.com/rust-ndarray/ndarray", rev = "31244100631382bb8ee30721872a928bfdf07f44", default-features = false, optional = true, features = ["serde"] }
unicode-segmentation = { version = "1.3.0", optional = true }
Member

Reminder: we hope to get rid of this dependency.

litemap = { version = "0.4.0", path = "../../utils/litemap", optional = true, features = ["serde"] }
ndarray = { git = "https://github.com/rust-ndarray/ndarray", rev = "31244100631382bb8ee30721872a928bfdf07f44", default-features = false, optional = true, features = ["serde"] }
unicode-segmentation = { version = "1.3.0", optional = true }
num-traits = { version = "0.2", optional = true }
Member

Request: See if you can get rid of this dependency as well.

@makotokato makotokato merged commit c9aef93 into unicode-org:main Jun 24, 2022
aethanyc added a commit to aethanyc/icu4x that referenced this pull request Jun 27, 2022
aethanyc added a commit that referenced this pull request Jun 28, 2022
samchen61661 pushed a commit to samchen61661/icu4x that referenced this pull request Jul 12, 2022
Development

Successfully merging this pull request may close these issues.

Move segmeter_lstm into segmenter
3 participants