Multithreading support #22

Open
lewiis opened this issue May 28, 2017 · 3 comments

Comments


lewiis commented May 28, 2017

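Would it be possible to support multithreaded word segmentation? That is, the model is loaded only once, and multiple threads then each process a different text.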

@MaJunhua
Collaborator

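Thank you for supporting THULAC. If you want to segment a single long text with multiple threads, you can try:
THULAC_result& multiTreadCut(const std::string &in, THULAC& lac, int thread); It takes the string to be segmented and POS-tagged, a THULAC instance, and the number of threads, and returns a value of type THULAC_result.

If you want to segment multiple texts, we do not provide a wrapper for that yet. You can create the threads yourself and call the cut function from each of them~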
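
For the multiple-texts case, the "create your own threads" suggestion could look roughly like the sketch below. It is only an illustration under stated assumptions: `ToySegmenter`, `cut_all`, and the `make_segmenter` factory are hypothetical names, and the toy `cut` just splits on spaces so the code runs without a model file. It is not THULAC's API. Each worker builds its own segmenter, which is the conservative choice when it is not known whether one instance can be shared across threads. The cost is that the model would be loaded once per thread rather than once overall.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a segmenter; NOT THULAC's API.
// It splits on spaces so the sketch can run without a model file.
struct ToySegmenter {
    std::vector<std::string> cut(const std::string& text) const {
        std::vector<std::string> words;
        std::string current;
        for (char c : text) {
            if (c == ' ') {
                if (!current.empty()) words.push_back(current);
                current.clear();
            } else {
                current.push_back(c);
            }
        }
        if (!current.empty()) words.push_back(current);
        return words;
    }
};

// Segments every text using `num_threads` workers. Each worker builds its own
// segmenter through `make_segmenter` (assumed safe to call concurrently) and
// writes only to its own result slots, so this function needs no locking.
template <typename MakeSegmenter>
std::vector<std::vector<std::string>> cut_all(
        const std::vector<std::string>& texts,
        std::size_t num_threads,
        MakeSegmenter make_segmenter) {
    assert(num_threads > 0);
    std::vector<std::vector<std::string>> results(texts.size());
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            auto segmenter = make_segmenter();  // one instance per thread
            // Worker t handles texts t, t + num_threads, t + 2 * num_threads, ...
            for (std::size_t i = t; i < texts.size(); i += num_threads) {
                results[i] = segmenter.cut(texts[i]);
            }
        });
    }
    for (auto& worker : workers) worker.join();
    return results;
}
```

A call such as `cut_all(texts, 4, [] { return ToySegmenter{}; })` returns one word list per input text, in input order. With a real segmenter, the factory would construct and initialize one instance per worker.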

Author

lewiis commented May 29, 2017

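Thanks for the reply!

Creating threads myself and calling cut from each of them would certainly work in principle, but my concern is whether calling cut from several threads in parallel could cause problems.

For example, suppose multiple threads call cut through a pointer or reference to the same THULAC object. cut calls TaggingDecoder::segment, which assigns to the allowed_label_lists array, so multiple threads would be reading and writing the same array. I did not see a mutex or any other synchronization mechanism there. Is this still thread-safe?

The same applies to the multiTreadCut implementation that was just committed: I do not see any thread synchronization there either, so can the results be guaranteed to be correct?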
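
To make the concern concrete, here is a minimal sketch of one common remedy: serializing calls on a shared object with a mutex. Everything in it is an assumption made for illustration. `ToyDecoder`, `LockedDecoder`, and `count_words_shared` are hypothetical names, and the scratch buffer only loosely mirrors the pattern described for allowed_label_lists. It is not THULAC code, and this thread does not establish whether THULAC actually needs such a lock.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical decoder that keeps per-call scratch state inside the object.
// NOT THULAC code; it only imitates the "shared array written on every call" pattern.
class ToyDecoder {
public:
    // Copies the text into the shared scratch buffer and reads it back.
    // Two unsynchronized concurrent calls on one object would be a data race.
    // Returns the token count, assuming tokens are separated by single spaces.
    std::size_t segment(const std::string& text) {
        scratch_.assign(text.begin(), text.end());
        std::size_t spaces = 0;
        for (char c : scratch_) {
            if (c == ' ') ++spaces;
        }
        return text.empty() ? 0 : spaces + 1;
    }

private:
    std::vector<char> scratch_;
};

// Lets several threads share one decoder by allowing only one call at a time.
class LockedDecoder {
public:
    std::size_t segment(const std::string& text) {
        std::lock_guard<std::mutex> guard(mutex_);
        return decoder_.segment(text);
    }

private:
    std::mutex mutex_;
    ToyDecoder decoder_;
};

// Calls segment() on a single shared LockedDecoder from `num_threads` threads
// and returns the total token count over all texts.
std::size_t count_words_shared(const std::vector<std::string>& texts,
                               std::size_t num_threads) {
    assert(num_threads > 0);
    LockedDecoder shared;
    std::vector<std::size_t> partial(num_threads, 0);
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < num_threads; ++t) {
        workers.emplace_back([&, t] {
            for (std::size_t i = t; i < texts.size(); i += num_threads) {
                partial[t] += shared.segment(texts[i]);  // serialized by the mutex
            }
        });
    }
    for (auto& worker : workers) worker.join();
    std::size_t total = 0;
    for (std::size_t n : partial) total += n;
    return total;
}
```

A lock like this keeps the shared state consistent, but it also means only one thread segments at a time, so it gives correctness without any parallel speedup. Giving each thread its own instance, or moving the scratch state into local variables, avoids the lock at the cost of extra memory or a code change.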

@sherlockhoatszx

"You can create the threads yourself and call"

I tried this. Calling cut from multiple threads easily triggers a NoneType error.
