Feature: result grouping by main dictionary sequence (along with some other changes) #95
The compiled template is already there, though
- use correct tags
- indicate popular and rare terms
- indicate definitions restricted to specific terms
- frequencies (Innocent Corpus)
Alt+P now works again in grouped/split mode. In merged mode, 「、」 is added even after the last term, but it's hidden in that case. This ensures consistent behavior with the voice button and tags.
Use ['gloss', 'ary'].concat('DictName') Known collision: 日本国有鉄道 in JMdict and JMnedict
The dictionary tags field can now have a '\t' in it, and it is used to separate tags associated with definitions and terms.
Thanks for the PR! This is pretty large, so I will be looking this over during the next couple of days (I will probably only get to running my normal tests on it this weekend).
Some initial thoughts on the code, going to actually run it next ; )
tmpl/terms.html
Outdated
{{/each}}
</div>
{{/if}}<!--
--></div><!--
What's the reason for consuming spaces here? It shouldn't impact the layout and arguably makes the template harder to understand.
These are probably the result of me trying to eliminate some whitespace between the term and the preceding comma. Seems like I wasn't aware of ~ at that moment.
tmpl/terms.html
Outdated
<span class="label label-default tag-frequency">{{dictionary}}:{{frequency}}</span>
{{/each}}
</div>
{{/if}}<!--
An easy way to consume space without making the template more complicated in areas next to handlebars blocks is to use the ~ character. I can't remember for sure if it works with closing tags, but I think you can do something like {{/if~}} to consume all spaces after the end of the if block.
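As a rough illustration of the whitespace-control suggestion above, the fragment below puts ~ on both the opening and closing block tags, so no HTML comments are needed to swallow the surrounding whitespace. The frequencies field and the tag-frequency class come from the snippet under review; the wrapper elements and their class names are assumptions, not the actual template.

```handlebars
{{!-- "~" strips whitespace on the side of the tag where it appears --}}
<div class="term-expression">
    {{~#if frequencies~}}
        <span class="frequencies">
            {{~#each frequencies~}}
                <span class="label label-default tag-frequency">{{dictionary}}:{{frequency}}</span>
            {{~/each~}}
        </span>
    {{~/if~}}
</div>
```

With this form the indentation stays readable in the source while the rendered output has no whitespace between the elements.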
tmpl/terms.html
Outdated
{{#if glossary.[1]}}
<ul>
<ul {{#if compactGlossaries}}class="compact"{{/if}}>
Prefer a more descriptive name for this class, so maybe compact-glossary?
tmpl/terms.html
Outdated
<span class="label label-default tag-{{category}}" title="{{notes}}">{{name}}</span>
{{/each}}
</div>
{{/if}}
{{#if only}}
<div {{#if compactGlossaries}}style="display: inline-block;"{{/if}}>
Prefer to use classes instead of style attributes, especially since this one is reused. Maybe something like compact-info or something else to that effect.
ext/bg/js/util.js
Outdated
function utilStringHashCode(string) {
    let hashCode = 0;

    if (string.length === 0) {
This conditional is unnecessary: if string.length == 0, the for loop will not execute and you will end up returning 0 anyway.
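A minimal sketch of the point above, assuming a typical 32-bit rolling string hash. stringHashCode is a hypothetical stand-in, not the function under review; it only shows that the loop already handles the empty string.

```javascript
// Hypothetical stand-in for the helper under review: a 32-bit string hash.
function stringHashCode(string) {
    let hashCode = 0;
    // For an empty string the loop body never runs, so 0 is returned as-is.
    for (let i = 0; i < string.length; ++i) {
        hashCode = ((hashCode << 5) - hashCode) + string.charCodeAt(i);
        hashCode |= 0; // truncate to a signed 32-bit integer
    }
    return hashCode;
}
```

Calling stringHashCode('') returns 0 without any special-case branch.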
ext/bg/js/options.js
Outdated
options.general.compactTags = false;
options.general.compactGlossaries = false;
if (utilStringHashCode(options.anki.fieldTemplates) !== -805327496) { // a3c8508031a1073629803d0616a2ee416cd3cccc
    options.anki.fieldTemplates = '{{#if merge}}\n' +
Prefer a template string, as it does not require explicit line breaks and you can just use all of the expressions inline without having to combine strings with the + operator.
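To make the comparison concrete, here is a small sketch with a made-up two-branch template (the expression and reading fields are placeholders, not the real field templates). Both constants hold the same string.

```javascript
// Hypothetical template text, built with explicit "\n" and "+".
const concatenated = '{{#if merge}}\n' +
    '    {{expression}}\n' +
    '{{else}}\n' +
    '    {{reading}}\n' +
    '{{/if}}';

// The same text as a template literal: line breaks are part of the string.
const templated = `{{#if merge}}
    {{expression}}
{{else}}
    {{reading}}
{{/if}}`;
```

One thing to watch for: a literal backtick or `${` inside the template text would need escaping in the template-literal form.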
ext/bg/js/options.js
Outdated
} else {
    options.general.resultOutputMode = 'split';
}
options.general.compactTags = false;
You don't need to set the compactTags and compactGlossaries defaults here since they are already specified in the default options object. The first thing that happens when options are loaded is that the default options are merged onto the user options, meaning any missing fields are filled in. The only time you have to manually do things in version blocks is if you want to base settings on converted settings from a prior options version.
Yeah, seems like they are useless at the moment. I wasn't sure if they should be set true by default for new users but false for existing users that haven't had the features in the first place.
Yeah, the idea is that you only have to write version blocks when you are transforming prior data or fixing up broken options data that you introduced with a bug, which I have totally never had to do before.
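The default-filling behavior described in this thread could look roughly like the sketch below. applyDefaults and isPlainObject are hypothetical names and the option values are illustrative; the real options loader may differ.

```javascript
function isPlainObject(value) {
    return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Fill in any field missing from userOptions, leaving existing values alone.
function applyDefaults(userOptions, defaults) {
    for (const [key, defaultValue] of Object.entries(defaults)) {
        if (!(key in userOptions)) {
            // Copy nested objects so the defaults object is not shared.
            userOptions[key] = isPlainObject(defaultValue) ? applyDefaults({}, defaultValue) : defaultValue;
        } else if (isPlainObject(defaultValue) && isPlainObject(userOptions[key])) {
            applyDefaults(userOptions[key], defaultValue);
        }
    }
    return userOptions;
}

// Illustrative data: stored options that predate the two new settings.
const defaults = {general: {resultOutputMode: 'group', compactTags: false, compactGlossaries: false}};
const stored = {general: {resultOutputMode: 'merge'}};
const loaded = applyDefaults(stored, defaults);
```

Here loaded.general keeps the stored resultOutputMode and gains compactTags and compactGlossaries from the defaults, which is why a version block is not needed just to introduce new fields.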
ext/bg/js/handlebars.js
Outdated
function handlebarsTermFrequencyColor(options) {
    const termFrequency = options.fn(this);

    if (termFrequency === 'popular') {
Probably better to do this with classes. Having to update js files to change the color scheme is odd.
ext/bg/js/dictionary.js
Outdated
function dictTermTagScore(tags) {
    let score = 0;

    const tagScores = {
Maybe I'm missing something, but aren't these actually stored in the metadata in the dictionary JSON?
The score for terms in term_bank_n.json is currently the result of termTags like P or news and definitionTags like arch. Maybe yomichan-import should be updated to have a score separately for both of these. But that would require changes to the sorting. tag_bank_n.json only has [Name, Category, Order, Notes].
In that case, I think it would be fine to introduce Score to the tag array (at the end, as usual). As you've probably noticed, a lot of effort is made to make sure that all data comes from data files only. The only exception is for styling (that's why it is OK that tag colors are in the style sheets).
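A sketch of how a trailing Score element might be consumed, assuming tag bank rows shaped as [Name, Category, Order, Notes, Score]. The rows, their values, and the helper names buildScoreIndex and tagScore are all made up for illustration.

```javascript
// Hypothetical tag bank rows: [Name, Category, Order, Notes, Score].
const tagBank = [
    ['P', 'popular', -10, 'popular term', 10],
    ['news', 'frequent', -2, 'appears frequently in news', 3],
    ['arch', 'archaism', -4, 'archaism', -5],
    ['n', 'partOfSpeech', -3, 'noun'] // older row without a Score
];

// Index scores by tag name; rows lacking a Score count as 0.
function buildScoreIndex(rows) {
    const index = new Map();
    for (const [name, , , , score = 0] of rows) {
        index.set(name, score);
    }
    return index;
}

// Sum the scores of the given tag names; unknown tags contribute nothing.
function tagScore(tagNames, scoreIndex) {
    let score = 0;
    for (const name of tagNames) {
        score += scoreIndex.get(name) || 0;
    }
    return score;
}
```

Because Score sits at the end of the array, existing rows with four elements still parse and simply score 0.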
ext/bg/js/dictionary.js
Outdated
}

for (const tag of definition.definitionTags) {
    if (typeof tag === 'string') {
Why again is it that definitionTags has tags that are either strings or objects? Is there a reason why they aren't always objects?
dictTermsMergeByGloss is used for definitions from Database.findTermsBySequence first (tags aren't expanded) and a mix of definitions from Translator.findTerms (tags expanded) and Database.findTermsExact (not expanded) after that. The tag metadata isn't needed in dictTermsMergeByGloss currently, so I didn't think it was necessary to expand the tags in the rest of the definitions at that point yet.
I think tags should be expanded from the start for everything, for clarity's sake. It's much harder to reason about code when the type is not explicit. Of course, in languages like JavaScript, explicit type declaration is not an option (unless you use TypeScript or something), so we have to rely on convention. By making tags be the same thing everywhere it's a lot easier to write bug-free code. Using typeof should be reserved only for checking if something that could have a false-like value is undefined.
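One way to follow this convention is to expand tag names into objects once, at the boundary, so downstream code never needs a typeof check. The sketch below is an assumption about shape only: expandTags, the tagMeta lookup, and the default field values are hypothetical.

```javascript
// Hypothetical metadata lookup keyed by tag name.
const tagMeta = new Map([
    ['arch', {category: 'archaism', notes: 'archaism', order: -4}]
]);

// Turn every tag name into an object, falling back to defaults for unknown tags.
function expandTags(tagNames, meta) {
    return tagNames.map(name => Object.assign(
        {name, category: 'default', notes: '', order: 0},
        meta.get(name) // undefined for unknown tags; Object.assign ignores it
    ));
}
```

After this step every element of the result has the same fields, whichever lookup path produced the definition.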
ext/bg/js/settings.js
Outdated
$('.dict-group').each((index, element) => {
    const dictionary = $(element);
    if (dictionary.data('title') !== mainDictionaryTitle) {
I don't think that this should be necessary if all of the radio buttons are in the same group (set by the name attribute). Setting one should unset the others.
I'm not sure I like the UX of the radio buttons for setting the "main" dictionary, but I don't yet know what would be better. Radio buttons are usually shown grouped closely together, and the fact that they are so far apart looks odd. Also, I notice that after clearing dictionaries (or perhaps even importing dictionaries for the first time), none of the dictionaries are set as "main". If we want to support not having a "main" dictionary, then a radio button is not the correct control to use. It should be a dropdown list with the names of all the dictionaries, with the first option being set to "None" or something similar.
I remember that I planned setting the first imported dictionary that has |
This can be achieved by expanding on the data contained in the database:
We already store meta information about the dictionaries; all you would have to do is add a column like "hasSequences" and set it to true if all of the sequence data coming out of the dictionary is valid ( |
The problem areas should be better now. The |
ext/bg/js/settings.js
Outdated
titles = titles.filter(title => options.dictionaries[title].enabled);
const formOptionsHtml = [];
let mainDictionarySelected = false;
for (title of titles) {
You probably mean for (const title of titles); not having const here makes this look kind of wonky.
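Beyond looking odd, the missing declaration changes behavior: in sloppy mode the loop assigns to an implicit global, and in strict mode it throws. A small sketch with hypothetical function names:

```javascript
function collectTitles(titles) {
    'use strict';
    const collected = [];
    // "const" gives each iteration its own block-scoped binding.
    for (const title of titles) {
        collected.push(title);
    }
    return collected;
}

function collectTitlesUndeclared(titles) {
    'use strict';
    const collected = [];
    // No declaration: strict mode rejects assignment to an undeclared name.
    for (title of titles) {
        collected.push(title);
    }
    return collected;
}
```

Calling collectTitlesUndeclared with a non-empty array raises a ReferenceError, while collectTitles returns the titles as expected.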
ext/bg/settings.html
Outdated
</select>
</div>

<div class="form-group options-merge">
Please move this under the "Dictionaries" section.
Looking good 👍 One last thing: I think it would be more visually appealing to have the |
Nice! I hope it doesn't break things when released and that users find it useful. It must be confusing that the mode doesn't work with old dictionaries but it should help that they don't show up in the dropdown. Feel free to rename things to make them easier to understand and remove the warning (experimental) if you think this feature is ready.
🎉 🎈 |
👍 |
I can't find anything major to change or implement, so I'm opening this pull request. Some minor things below:
- I'm not sure if compactTags and compactGlossaries should be on by default for new users. They will be off for existing users.
- I tested on a Windows Firefox that utilStringHashCode gives the same hash for the first revision of the field templates (it could have messed up with \r\n after formRead). Can we trust this or should the string be normalized before hashing?
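If normalization turns out to be wanted, one option is to convert all line endings to \n before hashing. This is only a sketch: normalizeNewlines is a hypothetical helper, and whether formRead actually alters line endings is not established here.

```javascript
// Convert Windows (\r\n) and old Mac (\r) line endings to \n.
function normalizeNewlines(text) {
    return text.replace(/\r\n?/g, '\n');
}

// Two copies of the same template text that differ only in line endings.
const unixTemplate = '{{#if merge}}\n{{expression}}\n{{/if}}';
const windowsTemplate = '{{#if merge}}\r\n{{expression}}\r\n{{/if}}';
const sameAfterNormalizing = normalizeNewlines(unixTemplate) === normalizeNewlines(windowsTemplate);
```

Hashing the normalized string would make the comparison against the stored hash independent of the platform's line endings.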