Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OPENNLP-936: Add thread-safe versions of POSTaggerME, SentenceDetecto… #69

Merged
merged 3 commits into from
Oct 7, 2024

Conversation

twgoetz
Copy link
Contributor

@twgoetz twgoetz commented Jan 17, 2017

…rME and TokenizerME. Include test case as well.

I'm open to changing the names of the classes, if anybody has a better idea.

@kottmann
Copy link
Member

Thanks!

  • Indent by 2 spaces (not 4), checkstyle is giving many warnings about that (click the green check mark to see the travis-ci output)
  • Each new file needs to have AL 2.0 license header
  • Commit messages subject line is limited to 72 chars, please make it short and also use the body of it
  • We strongly recommend to set up Eclipse or Intellij with the style files from our code convention page (https://opennlp.apache.org/code-conventions.html)
  • Update the commit above e.g. via git commit --amend or squash and then force push (git push -f) to update the PR

@twgoetz
Copy link
Contributor Author

twgoetz commented Jan 17, 2017

Ok done.

*/
public class POSTaggerME_TS implements POSTagger {

private POSModel model;
Copy link

@linekin linekin Apr 11, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

POSModel is not thread-safe because GISModel uses HashMap which is not thread-safe.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to change the values in the map after the model is create. It should be a Collections.unmodifiableMap()... which would be Thread.Safe().

@mawiesne
Copy link
Contributor

This issue (OPENNLP-936) is still relevant to many users. Is there any form of consensus - in 2022 - on how to progress with the current code base?

@jzonthemtn
Copy link
Contributor

The comments on this PR seem fairly trivial. I have no objections to getting those comments addressed and merging this. I will go through the PR again and see if there's anything since then that should be addressed.

Thanks @mawiesne for bringing attention to this one.

@bachelarius
Copy link

Hi all, looks like this thread has broadly reached consensus back in 2022, but not yet been merged - what needs to happen to get it looked at again?

I've verified that the current version of the SentenceDetectorME indeed suffer from this issue and implementing the solution originally proposed by @twgoetz fixes it.

@rzo1
Copy link
Contributor

rzo1 commented Sep 21, 2024

@bachelarius Think it boils down to (a) rename the classes, (b) add the @threadsafe annotation for tracking and (c) rebase against latest changes from main.

@rzo1
Copy link
Contributor

rzo1 commented Sep 21, 2024

That being said: PRs are welcome.

Thread safe versions of POSTaggerME, SentenceDetectorME and TokenizerME.
Include test case as well.
@rzo1
Copy link
Contributor

rzo1 commented Oct 1, 2024

I rebased this PR, so we get CI feedback.

@rzo1
Copy link
Contributor

rzo1 commented Oct 1, 2024

Copy link
Contributor

@mawiesne mawiesne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx @twgoetz, @rzo1 and other feedback providers!

@mawiesne
Copy link
Contributor

mawiesne commented Oct 7, 2024

@mawiesne mawiesne merged commit df4a7ca into apache:main Oct 7, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants