mail addresses are split by tokenizer #511

Open
timohund opened this issue Jul 15, 2016 · 2 comments

@timohund
Contributor

Mail addresses in content, such as my.name@example.org, are split by the StandardTokenizerFactory into "my.name" and "example.org". According to http://unicode.org/reports/tr29/#Word_Boundaries the "@" is a word boundary, whereas certain punctuation between letters (such as the period) is not. As a result, "my.name" and "example.org" show up as separate entries in the autocomplete suggestions.

Alternative tokenizer: UAX29URLEmailTokenizerFactory. It is similar to the StandardTokenizerFactory, but also recognizes URLs and mail addresses as single tokens. This way the complete mail address would show up in the suggestions, which is easier for website users to understand.
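
To illustrate the difference, here is a rough Python sketch. It is not Lucene's implementation: both regular expressions and the function names `standard_like_tokens` and `email_aware_tokens` are simplified, hypothetical stand-ins that only mimic the behaviour described above (a period between letters does not split a word, "@" does, and the e-mail-aware variant tries to match a whole address first).

```python
import re

# Assumption: a crude approximation of the word-boundary behaviour described
# above. Runs of letters/digits may be joined by a single "." or "'";
# every other character, including "@", ends the token.
STANDARD_LIKE = re.compile(r"[A-Za-z0-9]+(?:[.'][A-Za-z0-9]+)*")

# Assumption: a deliberately simple e-mail pattern, tried before the
# word pattern so that a complete address is emitted as one token.
EMAIL_AWARE = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+"
    r"|" + STANDARD_LIKE.pattern
)


def standard_like_tokens(text):
    """Tokenize roughly the way the issue describes StandardTokenizerFactory."""
    return STANDARD_LIKE.findall(text)


def email_aware_tokens(text):
    """Tokenize while keeping mail addresses as single tokens."""
    return EMAIL_AWARE.findall(text)


if __name__ == "__main__":
    sample = "Contact my.name@example.org today"
    print(standard_like_tokens(sample))
    print(email_aware_tokens(sample))
```

With the sample sentence, the first function yields "my.name" and "example.org" as separate tokens, while the second keeps "my.name@example.org" intact, which is the behaviour wanted for the suggestions.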

https://forge.typo3.org/issues/48990

@timohund
Contributor Author

timohund commented Jul 15, 2016

Discussion:

1. Updated by Ingo Renner about 3 years ago

  • Target version set to 3.0
  • TYPO3 Version changed from 4.7 to 4.5

2. Updated by Bernhard Kraft over 2 years ago

You could configure an additional Solr field and make it a copy of the "content" field. Then let this Solr field get tokenized by the mentioned tokenizer and configure the suggest feature to take this Solr field into account.

3. Updated by Jigal van Hemert over 2 years ago

That's a workaround. There is a suitable tokenizer available, so why not use it? That way all users of EXT:solr have understandable suggestions without having to tinker with the server configuration.

@timohund
Contributor Author

Estimation:

Effort: 5
Value: 5
