Extract Only Year from text #4
Comments
Great question. Give me 24 hours and I'll have a solution for you :)
Thanks man!
Hey, I thought about it a lot and this is what I came up with. You can set
Hi Daniel,
For example, my text is:
Output:
So here the year 1948 should not have been fetched. I think we can solve this issue if we implement logic to normalise only those 2 digits which are preceded by "-" and not followed by "th". Please let me know if you have any other solution to resolve this.
Thanks & Regards,
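The rule proposed above (two digits preceded by "-" and not followed by "th") can be sketched as a regular expression. This is only an illustration of the proposal, not the library's actual pattern: the name `TWO_DIGIT_YEAR` and the helper `find_two_digit_years` are hypothetical, and extending the check from "th" to the other ordinal suffixes ("st", "nd", "rd") is an added assumption.

```python
import re

# Hypothetical pattern, not taken from the library:
#   (?<=-)               the two digits must be directly preceded by "-"
#   (\d{2})              the candidate two-digit year
#   (?!\d|st|nd|rd|th)   not followed by another digit or an ordinal suffix
TWO_DIGIT_YEAR = re.compile(r"(?<=-)(\d{2})(?!\d|st|nd|rd|th)", re.IGNORECASE)


def find_two_digit_years(text):
    """Return every two-digit candidate that satisfies the proposed rule."""
    return [m.group(1) for m in TWO_DIGIT_YEAR.finditer(text)]


print(find_two_digit_years("report-98.pdf"))     # ['98']
print(find_two_digit_years("the 48th meeting"))  # []
print(find_two_digit_years("file_98.txt"))       # []
print(find_two_digit_years("12-05-98"))          # ['05', '98']
```

The last two calls show the trade-offs of a rule this strict: a filename that uses "_" instead of "-" is no longer picked up, and in a hyphenated date the month also satisfies the rule, so it would need further filtering.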
@swathimithran, thanks for the example. Your help is sincerely appreciated!
As a quick fix, I made it so it won't capture ordinal numbers that end in
However, we will need more discussion on what rule should be used for what precedes the 2 digits. Here are a few examples of 2-digit years:
Basically, I'm afraid that if we make the rule too strict, people won't be able to parse dates out of filenames. Here are a few possible solutions. What would you like?
Option 1
Add a parameter
Option 2
The second option could be allowing users to override existing patterns.
Option 3
The third option could be returning a confidence level,
Option 4
I'm open to suggestions as long as it doesn't prevent users from extracting years out of filenames.
Which do you prefer? What do you think?
Thanks for this great project.
Currently I am able to extract dates, but when the text contains only a year, for example "In year 2011 the incident happened.", the program retrieves "2011-01-01 00:00:00+00".
But we need to retrieve it as "2011-01-01 12:14:12+00".
Can you please let me know what I should change in the library to achieve this?
The basic aim is to differentiate the original "1st Jan 2011" from "2011".
Thanks
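One way to tell a year-only match apart from a full date, without encoding the difference in a sentinel time such as 12:14:12, is to return a precision label next to the datetime. The sketch below is standalone and does not use the library's API: `extract_with_precision`, both patterns, and the "day"/"year" labels are hypothetical names chosen for illustration, and it handles only the two shapes mentioned in this issue.

```python
import re
from datetime import datetime, timezone

# Hypothetical patterns covering only "1st Jan 2011"-style dates and bare years.
FULL_DATE = re.compile(
    r"\b(\d{1,2})(?:st|nd|rd|th)?\s+([A-Za-z]{3})[a-z]*\.?\s+(\d{4})\b"
)
YEAR_ONLY = re.compile(r"\b(\d{4})\b")
MONTHS = {
    name: number
    for number, name in enumerate(
        ["jan", "feb", "mar", "apr", "may", "jun",
         "jul", "aug", "sep", "oct", "nov", "dec"],
        start=1,
    )
}


def extract_with_precision(text):
    """Return (datetime, precision) for the first date found, or None."""
    m = FULL_DATE.search(text)
    if m and m.group(2).lower() in MONTHS:
        day, month, year = int(m.group(1)), MONTHS[m.group(2).lower()], int(m.group(3))
        try:
            return datetime(year, month, day, tzinfo=timezone.utc), "day"
        except ValueError:
            pass  # impossible day such as "31st Feb"; fall back to the year
    m = YEAR_ONLY.search(text)
    if m:
        return datetime(int(m.group(1)), 1, 1, tzinfo=timezone.utc), "year"
    return None


print(extract_with_precision("In year 2011 the incident happened."))
print(extract_with_precision("1st Jan 2011"))
```

Both inputs produce the same datetime (1 January 2011, midnight UTC), but the first is labelled "year" and the second "day", so the caller can distinguish them without altering the time component.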