Skip to content
This repository has been archived by the owner on Feb 25, 2023. It is now read-only.

Feature: result grouping by main dictionary sequence (along with some other changes) #95

Merged
merged 54 commits into from
Oct 26, 2017

Conversation

siikamiika
Copy link
Collaborator

I can't find anything major to change or implement, so I'm opening this pull request. Some minor things below:

  • I'm not sure if compactTags and compactGlossaries should be on by default for new users. They will be off for existing users.

  • I tested on a Windows Firefox that utilStringHashCode gives the same hash for the first revision of the field templates (it could have messed up with \r\n after formRead). Can we trust this or should the string be normalized before hashing?

The compiled template is already there, though
- use correct tags
- indicate popular and rare terms
- indicate definitions restricted to specific terms
- frequencies (Innocent Corpus)
Alt+P now works again in grouped/split mode

In merged mode, 「、」 is added even after the last term, but it's
hidden for that. This ensures consistent behavior with voice button and
tags
Use ['gloss', 'ary'].concat('DictName')
Known collision: 日本国有鉄道 in JMdict and JMnedict
The dictionary tags field can now have a '\t' in it, and it is used to
separate tags associated with definitions and terms.
@FooSoft
Copy link
Owner

FooSoft commented Oct 19, 2017

Thanks for the PR! This is pretty large, so I will be looking this over the next couple of days (probably only get to running my normal tests on it this weekend).

Copy link
Owner

@FooSoft FooSoft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial thoughts on the code, going to actually run it next ; )

tmpl/terms.html Outdated
{{/each}}
</div>
{{/if}}<!--
--></div><!--
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for consuming spaces here? It shouldn't impact the layout and arguably makes the template harder to understand.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are probably result of me trying to eliminate some whitespace between the term and the preceding comma. Seems like I wasn't aware of ~ at that moment.

tmpl/terms.html Outdated
<span class="label label-default tag-frequency">{{dictionary}}:{{frequency}}</span>
{{/each}}
</div>
{{/if}}<!--
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An easy way to consume space without making the template more complicated in areas next to handlebars blocks is to use the ~ character. I can't remember for sure if it works with closing tags, but I think you can do something like {{/if~}} to consume all spaces after the end of the if block.

tmpl/terms.html Outdated
{{#if glossary.[1]}}
<ul>
<ul {{#if compactGlossaries}}class="compact"{{/if}}>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer a more descriptive name for this class, so maybe compact-glossary?

tmpl/terms.html Outdated
<span class="label label-default tag-{{category}}" title="{{notes}}">{{name}}</span>
{{/each}}
</div>
{{/if}}
{{#if only}}
<div {{#if compactGlossaries}}style="display: inline-block;"{{/if}}>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer not to use classes instead of style attributes, especially since this one is reused. Maybe something like compact-info or something else to that effect.

function utilStringHashCode(string) {
let hashCode = 0;

if (string.length === 0) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This conditional is unnecessary since if string.length == 0 the for loop will not execute and you will end up return 0 anyway.

options.general.compactTags = false;
options.general.compactGlossaries = false;
if (utilStringHashCode(options.anki.fieldTemplates) !== -805327496) { // a3c8508031a1073629803d0616a2ee416cd3cccc
options.anki.fieldTemplates = '{{#if merge}}\n' +
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer template string as it does not require explicit line breaks and you can just use all of the expressions inline without having to combine strings with + operator.

} else {
options.general.resultOutputMode = 'split';
}
options.general.compactTags = false;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to set compactTags and compcatGlossaries defaults here since they are already specified in the default options object. The first thing that happens when options are loaded is that the default options are merged onto the user options, meaning any missing fields are filled in. The only time you have to manually do things in version blocks is if you want to base settings on a converted version settings used in a prior options version.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, seems like they are useless at the moment. I wasn't sure if they should be set true by default for new users but false for existing users that haven't had the features in the first place.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the idea is that you only have to write versions that transform prior data or you are fixing up broken options data that you introduced with a bug, which I have totally never had to do before.

function handlebarsTermFrequencyColor(options) {
const termFrequency = options.fn(this);

if (termFrequency === 'popular') {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably better to do this with classes. Having to update js files to change color scheme is odd.

function dictTermTagScore(tags) {
let score = 0;

const tagScores = {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm missing something, but aren't these actually stored in the metadata in the dictionary JSON?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The score for terms in term_bank_n.json is currently the result of termTags like P or news and definitionTags like arch. Maybe yomichan-import should be updated to have a score separately for both of these. But that would require changes to the sorting. tag_bank_n.json only has [Name, Category, Order, Notes].

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, I think it would be fine to introduce Score to the tag array (at the end, as usual). As you've probably noticed, a lot of effort is made to make sure that all data comes from data files only. The only exception is for styling (that's why it is OK that tag colors are in the style sheets).

}

for (const tag of definition.definitionTags) {
if (typeof tag === 'string') {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why again is it that definitionTags has tags that are either strings or objects? Is there a reason why they aren't objects always?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dictTermsMergeByGloss is used for definitions from Database.findTermsBySequence first (tags aren't expanded) and a mix of definitions from Translator.findTerms (tags expanded) and Database.findTermsExact (not expanded) after that. The tag metadata isn't needed in dictTermsMergeByGloss currently, so I didn't think it was necessary to expand the tags in the rest of the definitions at that point yet.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think tags should be expanded from the start for everything for clarity sake. It's much harder to reason about code when the type is not explicit. Of course, in languages like JavaScript, explicit type declaration is not an option (unless you use Typescript or something), so we have to rely on convention. By making tags be the same thing everywhere it's a lot easier to write bug-free code. Using typeof should be reserved only for checking if something that could have a false-like value is undefined.


$('.dict-group').each((index, element) => {
const dictionary = $(element);
if (dictionary.data('title') !== mainDictionaryTitle) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that this should be necessary if all of the radio buttons are in the same group (set by name attribute). Setting one should unset the others.

@FooSoft
Copy link
Owner

FooSoft commented Oct 21, 2017

I'm not sure I like the UX of the radio buttons for setting the "main" dictionary, but I don't yet know what would be better. Radio buttons are usually shown grouped closely together, and the fact that they are so far apart looks odd. Also I notice that after clearing dictionaries (or perhaps even importing dictionaries for the first time), none of the dictionaries are set as "main". If we want to support not having a "main" dictionary, then radio button is not the correct control to use. It should be a dropdown list with the names of all the dictionaries with the first option being set to "None" or something similar.

@siikamiika
Copy link
Collaborator Author

I remember that I planned setting the first imported dictionary that has sequence to the main dictionary but failed to find a good way to do it without changing format in index.json. Even in that case there could be 0 dictionaries that have sequence so radio buttons would still be the wrong approach. A dropdown list (that appears when merged mode is enabled?) sounds like a good solution. Also, it would be nice if the dropdown only contained dictionaries that actually have sequence but I can't think of a good way to do this with the current dictionary format.

@FooSoft
Copy link
Owner

FooSoft commented Oct 23, 2017

Also, it would be nice if the dropdown only contained dictionaries that actually have sequence but I can't think of a good way to do this with the current dictionary format.

This can be achieved by expanding on the data contained in the database:

dictionaries: '++,title,version'

We already store meta information about the dictionaries; all you would have to do is add a column like "hasSequences" and set it to true if all of the sequence data coming out of the dictionary is valid (!= -1). Versioning the dictionaries table is not going to be required since hasSequences does not have to be a database key. You should be able to just grab all the dictionaries and iterate over them and check for the truthiness of hasSequences. Dictionaries imported with older versions of Yomichan will have a value of undefined for the column, which will will work just fine instead of false.

@siikamiika
Copy link
Collaborator Author

The problem areas should be better now. The <!-- comment --> issue in terms.html actually required some more of them to get fixed completely (but I think I also replaced some with ~). Not sure if Handlebars would have some cleaner way to do it but this is how I've done it in the past because it lets you keep the markup indented.

titles = titles.filter(title => options.dictionaries[title].enabled);
const formOptionsHtml = [];
let mainDictionarySelected = false;
for (title of titles) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably mean for (const title of titles); not having const here makes it this look kind of wonky.

</select>
</div>

<div class="form-group options-merge">
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please move this under the "Dictionaries" section.

@FooSoft
Copy link
Owner

FooSoft commented Oct 26, 2017

Looking good 👍

One last thing: I think it would be more visually appealing to have the (Not selected) text appear in a muted color (add the text-muted class to the option, drop the parens).

@siikamiika
Copy link
Collaborator Author

Nice! I hope it doesn't break things when released and that users find it useful. It must be confusing that the mode doesn't work with old dictionaries but it should help that they don't show up in the dropdown. Feel free to rename things to make them easier to understand and remove the warning (experimental) if you think this feature is ready.

@FooSoft FooSoft merged commit 25ae2e4 into FooSoft:dev Oct 26, 2017
@FooSoft
Copy link
Owner

FooSoft commented Oct 26, 2017

🎉 🎈

@siikamiika
Copy link
Collaborator Author

👍

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants