OCR recognition problem #12
Comments
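The part of the code I modified is a member function of the CrackCode class in CrackVerifyCode.py.
Hello, I ran into the problem below while using it. How can I solve it?
CNKI has changed its verification scheme and now performs a second, IP-based verification (the page returned is no longer the target page), so the code for downloading papers no longer works.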
…------------------ Original Message ------------------
From: "dengwen168"<notifications@github.com>;
Sent: Tuesday, March 31, 2020, 3:48 PM
To: "CyrusRenty/CNKI-download"<CNKI-download@noreply.github.com>;
Cc: "蔡治成"<czc.cai@qq.com>; "Author"<author@noreply.github.com>;
Subject: Re: [CyrusRenty/CNKI-download] OCR recognition problem (#12)
Hello, I ran into the problem below while using it. How can I solve it?
    File "C:\Users\john1\Desktop\PI\cnki\CNKI-download-master\CNKI-download-master\CrackVerifyCode.py", line 34, in get_image
        self.current_url = re.search(r'(.*?)#', current_url).group(1)
    AttributeError: 'NoneType' object has no attribute 'group'
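For reference, the AttributeError in the traceback means that re.search found no '#' in current_url and returned None. A minimal sketch of guarding that call is shown below. The helper name strip_fragment and the fallback of returning the whole URL are assumptions for illustration, not code from the project, and this only avoids the crash: it does not address the change in CNKI's verification described in this thread.

```python
import re


def strip_fragment(current_url):
    """Return the part of the URL before the first '#'.

    Hypothetical helper: if the URL contains no '#', fall back to the
    whole URL instead of calling .group() on None.
    """
    match = re.search(r'(.*?)#', current_url)
    if match is None:
        # re.search returns None when the pattern does not match.
        return current_url
    return match.group(1)
```

With a guard like this, a URL without a fragment passes through unchanged, so any later failure would point at the returned page rather than at the regular expression.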
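Oh, actually I don't need to download the papers. I only need to scrape the keywords and abstract from the detail pages, so it should still work for that, right?
No, that won't work either. You would have to rewrite the logic that detects the verification-code (CAPTCHA) link, and use a cloud provider's OCR service to recognize the small image (the image is too small for reliable local OCR).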
…------------------ Original Message ------------------
From: "dengwen168"<notifications@github.com>;
Sent: Tuesday, March 31, 2020, 4:33 PM
To: "CyrusRenty/CNKI-download"<CNKI-download@noreply.github.com>;
Cc: "蔡治成"<czc.cai@qq.com>; "Author"<author@noreply.github.com>;
Subject: Re: [CyrusRenty/CNKI-download] OCR recognition problem (#12)
Oh, actually I don't need to download the papers. I only need to scrape the keywords and abstract from the detail pages, so it should still work for that, right?
OK, thanks. It looks like I will have to dig into this myself.
Problem description
The code I forked did not work out of the box,
so I made some modifications.
At the line
result = tesserocr.image_to_text(image)
a problem occurs: no matter how I run the recognition or preprocess the image, tesserocr always returns an empty result.
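One thing that may be worth trying, offered as a sketch rather than a confirmed fix: enlarge and binarize the image, then tell Tesseract to treat it as a single line of text. The function names, the threshold of 140 and the scale factor of 3 are illustrative assumptions, not values from the project. The sketch expects a Pillow image and uses tesserocr.image_to_text with its psm argument.

```python
def threshold_table(threshold):
    """Build a 256-entry lookup table mapping gray levels to black or white."""
    return [0 if level < threshold else 255 for level in range(256)]


def ocr_captcha(image, threshold=140, scale=3):
    """Preprocess a Pillow image and OCR it as a single text line.

    `threshold` and `scale` are guesses to tune, not project settings.
    """
    # Imported here so threshold_table() works without the OCR stack installed.
    import tesserocr

    gray = image.convert("L")  # convert to grayscale
    # Enlarge the image; very small glyphs are hard for Tesseract.
    enlarged = gray.resize((gray.width * scale, gray.height * scale))
    # Map every pixel to pure black or pure white.
    binary = enlarged.point(threshold_table(threshold))
    # Treat the image as one text line instead of a full page layout.
    text = tesserocr.image_to_text(binary, psm=tesserocr.PSM.SINGLE_LINE)
    return text.strip()
```

It would be called as ocr_captcha(Image.open("captcha.png")), where the file name is hypothetical. If the result is still empty, it may also be worth checking that tesserocr can find its language data, for example by inspecting what tesserocr.get_languages() returns.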