Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alignment information #52

Merged
merged 52 commits into from
Oct 29, 2021
Merged

Conversation

jelmervdl
Copy link
Collaborator

Implementation of #50

This pull requests adds a QSyntaxHighlighter to the translation output box that highlights the alignment probability of words (or sentencepiece tokens really) in the output box given the sentencepiece token currently under the cursor in the input box.

There's a toggle to turn it on/off in the View menu.

It also works when you select more than one word in the input box. It will then highlight all the parts that match that selection. The probability-based colour will make less sense in that case though.

Additionally, if you click on a word in the output box (the translation) it will move the cursor in the input box with the word that aligns best with it. However, it does not move the focus to the input box, so the only way to then use that moved cursor is through tapping [shift] + [tab]. I'm not yet sure about the usefulness of this feature.

image

@jelmervdl jelmervdl marked this pull request as draft October 1, 2021 09:26
@jelmervdl jelmervdl changed the title Draft: Alignment information Alignment information Oct 1, 2021
@jelmervdl
Copy link
Collaborator Author

jelmervdl commented Oct 1, 2021

It completely breaks down once you turn off "translate as I type" and start typing: It will still highlight stuff, but that doesn't make any sense at that point.

As far as I know the QTextEdit widget doesn't give me enough information or a data type that allows me to see which parts are original, and which are added, and backtrack where the cursor position would be if we'd remove all the changes since the last translation. Therefore I'm thinking of just disabling highlighting as soon as you start changing the input text.

@jerinphilip
Copy link

jerinphilip commented Oct 1, 2021

FYI, I intend to remove softAlignment/hardAlignment support in the cache PR (browsermt/bergamot-translator#202) most likely. I mean, it'll work given there's a commit for bergamot-translator the submodule points to for now, but whatever API for alignments provided is unlikely to survive @qianqianzhu's TagTree PR - browsermt/bergamot-translator#199 (conditioned on the functionality working).

If there is demand for the raw marian alignments to be exposed and kept, please let know. (My plan as of now is to discard and let @qianqianzhu operate with ranges on a binary flat-storage). /cc @kpu

I would think you can reuse the UI - but expect what comes after @qianqianzhu to be very different in terms of API, and possibly more useful.

@qianqianzhu
Copy link

FYI, I intend to remove softAlignment/hardAlignment support in the cache PR (browsermt/bergamot-translator#202) most likely. I mean, it'll work given there's a commit for bergamot-translator the submodule points to for now, but whatever API for alignments provided is unlikely to survive @qianqianzhu's TagTree PR - browsermt/bergamot-translator#199 (conditioned on the functionality working).

@jerinphilip feel free to go for your plan. Just one question: how can I get Marian alignment matrix after that?

@kpu
Copy link
Collaborator

kpu commented Oct 1, 2021

Please kill all hard alignments. I think there is a usecase for soft alignments to be returned to the user if requested in raw form. That will make pivoting easier. Keep in mind @qianqianzhu can be stacked on top, taking the soft alignments and tag information.

@XapaJIaMnu
Copy link
Owner

I can't get it to work with QT5 on Linux. I get a single error.

QObject::connect: Cannot queue arguments of type 'Translation'
(Make sure 'Translation' is registered using qRegisterMetaType().)

No translations are displayed at this point, regardless of whether i use translate as I type or the translate button. Showing alignments also doesn't make a difference.

Copy link
Owner

@XapaJIaMnu XapaJIaMnu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to solve the issue of being able to highlight inside long blocks of text, we probably need a more complicated container for the text, before it gets dispatched to qTextBox. Let's discuss offline.

src/Translation.h Show resolved Hide resolved
@jelmervdl
Copy link
Collaborator Author

In order to solve the issue of being able to highlight inside long blocks of text, we probably need a more complicated container for the text, before it gets dispatched to qTextBox. Let's discuss offline.

I found https://doc.qt.io/qt-5/qtextdocument.html#contentsChange which gives information about edits and their position. With that we should be able to track and backtrack where the cursor is given the original alignment offsets. But it will involve a bit of bookkeeping.

@XapaJIaMnu
Copy link
Owner

dheart@arbiter ..i_stuff/postdoc/translateLocally/build (git)-[show-alignment] % ./translateLocally
QObject::connect: Cannot queue arguments of type 'Translation::Direction'
(Make sure 'Translation::Direction' is registered using qRegisterMetaType().)

One issue that I have when copy/pasting large chunks of texts is that the two textbox screens are out of sync. Eg the input one would have scrolled to the end, whereas the output one would stay at the start. I think that on click in the input field, it should scroll the output field to the corresponding place and vice versa?

@jelmervdl
Copy link
Collaborator Author

QObject::connect: Cannot queue arguments of type 'Translation::Direction'

Thanks. I wonder whether we can test for these runtime errors that only occur at startup as well using Github actions. I will try to test it often on Ubuntu (as soon as I have an unmetered internet connection again.)

One issue that I have when copy/pasting large chunks of texts is that the two textbox screens are out of sync

I'm now working on making sure the highlighted parts of either box (depending on in which you click) is always scrolled into view. But this will only work if there is alignment information and if it is up-to-date: so immediately after the translation has loaded in, but not once you start editing it again.

The alternative approach would be to just sync the scroll position of both scroll bars, i.e. if scroll bar of the input box is at about 60%, make the one on the output box also 60%. No alignment information necessary, but might be slightly off when you're selecting stuff at the edges of your viewport. Example

@XapaJIaMnu
Copy link
Owner

QObject::connect: Cannot queue arguments of type 'Translation::Direction'

Thanks. I wonder whether we can test for these runtime errors that only occur at startup as well using Github actions. I will try to test it often on Ubuntu (as soon as I have an unmetered internet connection again.)

I am not exactly sure how we can do this with CI. We require to input translation, and to have alignment enabled to trigger the warning message. Maybe once I start implementing the command line interface we can see if I can trigger any of those in a console only environment.

One issue that I have when copy/pasting large chunks of texts is that the two textbox screens are out of sync

I'm now working on making sure the highlighted parts of either box (depending on in which you click) is always scrolled into view. But this will only work if there is alignment information and if it is up-to-date: so immediately after the translation has loaded in, but not once you start editing it again.

The alternative approach would be to just sync the scroll position of both scroll bars, i.e. if scroll bar of the input box is at about 60%, make the one on the output box also 60%. No alignment information necessary, but might be slightly off when you're selecting stuff at the edges of your viewport. Example

I think the alternative approach is good enough for 99% of the cases, let's do that. Do you thin it's necessary to put an option for it?

@jelmervdl
Copy link
Collaborator Author

I think the alternative approach is good enough for 99% of the cases, let's do that. Do you thin it's necessary to put an option for it?

I almost have a working implementation of this alternative approach (it's really simple!) that I'll push so you can play around with it yourself. I can always add a toggle for it in the preferences/view menu later on.

@jelmervdl
Copy link
Collaborator Author

In theory it seems to be as simple as adding this to MainWindow:

    connect(ui_->inputBox->verticalScrollBar(), &QAbstractSlider::valueChanged, [&](int value) {
        float percentage = (float) value / ui_->inputBox->verticalScrollBar()->maximum();
        ui_->outputBox->verticalScrollBar()->setValue((int) (ui_->outputBox->verticalScrollBar()->maximum() * percentage + 0.5f));
    });

In practice, this doesn't seem to be triggered when you move the cursor inside the text, and the text box scrolls to keep it in view. Which doesn't make sense to me! The scrollbar updates when you do this, and that can only happen when QAbstractSlider::setValue() is called. Which always emits QAbstractSlider::valueChanged(). But for some reason, it doesn't.

Surprisingly tricky to make feel okay
Still works okay on macOS
This yellow works well in dark mode as well.
Comments mostly, some const-correctness
@jelmervdl
Copy link
Collaborator Author

I think I got the synced scrolling nailed down now. Well, the easy implementation. It's intentionally only one way at the moment. Both because it was easier, no need to worry about a weird feedback loop. And to give you an escape hatch to still have both boxes at a different scroll position.

I think my ideal implementation would be the current one (one-way sync) with some scroll-into-view for the input box when you do select/highlight text for alignment. I'm not sure yet whether that should only work if alignment info is turned on. The current behaviour does not need the alignment information.

There should probably be a toggle to turn all this syncing off entirely. If not, I'm tempted to have a hard long think about whether it should always be on, or only when translate as you type is on.

Oh and I did already implement an option to pick your own colour! Could copy most of that out of an old project that had similar functionality :)

@XapaJIaMnu
Copy link
Owner

It works really good. My only comment is that "Highlight matching words in the presentation" is shown in the options, whereas in edit it's "show alignments".

Do you think after long copy/pastes we should scroll the input field up? I think that's what people would expect?

@jelmervdl
Copy link
Collaborator Author

It works really good. My only comment is that "Highlight matching words in the presentation" is shown in the options, whereas in edit it's "show alignments".

I'm tempted to change the menu option to "Highlight translation" or "Highlight matching word", "alignment" doesn't sound like something people would be familiar with.

Do you think after long copy/pastes we should scroll the input field up? I think that's what people would expect?

Right now the behaviour is the same as if you would paste a long text in a text editor: cursor always ends up at the end of the pasted segment. And scroll position is always so that the cursor is in view. I think I'd like to keep it that way.

There should probably be a toggle to turn all this syncing off entirely. If not, I'm tempted to have a hard long think about whether it should always be on, or only when translate as you type is on.

Thinking a bit more about the behaviour, I think it would make sense to have the synced scrolling only turned on when you also have "translate as I type" turned on. When that's toggled off, when a translation is ready, do scroll the output box to the scroll position of the input box, but only once. Subsequent scrolls of the input box don't automatically copy over to the output box.

It's the same behaviour cmd + s (or cmd + enter) as in e.g. Overleaf: the pdf scrolls to where you are currently editing, but it is not constantly kept in sync.

@XapaJIaMnu
Copy link
Owner

It works really good. My only comment is that "Highlight matching words in the presentation" is shown in the options, whereas in edit it's "show alignments".

I'm tempted to change the menu option to "Highlight translation" or "Highlight matching word", "alignment" doesn't sound like something people would be familiar with.

I agree

Do you think after long copy/pastes we should scroll the input field up? I think that's what people would expect?

Right now the behaviour is the same as if you would paste a long text in a text editor: cursor always ends up at the end of the pasted segment. And scroll position is always so that the cursor is in view. I think I'd like to keep it that way.

Probably correct

There should probably be a toggle to turn all this syncing off entirely. If not, I'm tempted to have a hard long think about whether it should always be on, or only when translate as you type is on.

Thinking a bit more about the behaviour, I think it would make sense to have the synced scrolling only turned on when you also have "translate as I type" turned on. When that's toggled off, when a translation is ready, do scroll the output box to the scroll position of the input box, but only once. Subsequent scrolls of the input box don't automatically copy over to the output box.

It's the same behaviour cmd + s (or cmd + enter) as in e.g. Overleaf: the pdf scrolls to where you are currently editing, but it is not constantly kept in sync.

I think having an option to scroll the input to the target on by default and making it togleable makes the most sense. The thing is, when you have a really big document, automatic scrolling can come in handy to be enabled, even if no on-the-fly edits are happening

Cheers,

Nick

@jelmervdl
Copy link
Collaborator Author

jelmervdl commented Oct 21, 2021 via email

@jelmervdl
Copy link
Collaborator Author

jelmervdl commented Oct 25, 2021

There's one more thing I'd like to add (when you click/select in the output box, scroll the highlighted area in the input box into view, while not affecting scroll position in the output box) but that can be added post-merge.

So I guess this one is ready for clean-up, testing, review, and merge.

@jelmervdl jelmervdl marked this pull request as ready for review October 25, 2021 07:45
Copy link
Owner

@XapaJIaMnu XapaJIaMnu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I want to run it on Windows first, because I am afraid it might have some unicode issues


void AlignmentHighlighter::setColor(QColor color) {
color_ = color;
highlight(alignments_);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some way to try to get the system default highlighting and start with this one on first run?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QGuiApplication::palette() should know it, but that differs based on the OS' light/dark mode. Which is why I picked orange: it's distinct from the default text selection colour to avoid confusion about which text box is focussed, and it works in both light and dark modes. At least on macOS, my Gnome desktop distro doesn't have dark mode support yet.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought was to keep the default for the system and then let the user change it? Or would be confusing, because it would look the same as selecting the text for copy?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll give it a try, but I think it will be confusing to use the highlight colour that you normally only see in text boxes when they have focus. But in our case that text box won't have focus.

It might also be a bit lazyness on my part. Getting the system highlight colour in there right, and staying consistent when the user switches their OS between dark and light mode, seems to be a bit tricky and OS specific.

@@ -0,0 +1,133 @@
#include "Translation.h"
#include "3rd_party/bergamot-translator/src/translator/response.h"
#include <QDebug>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outdated include?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes QDebug shouldn't be there. I'll get rid of it.

The response.h is there as Translation.h just does a forward declaration because of a pointer. I didn't want all files that include Translation.h also having to include bergamot-translator files. That privilege is currently limited to Translation.cpp and MarianInterface.cpp.

src/mainwindow.h Show resolved Hide resolved
@XapaJIaMnu
Copy link
Owner

Windows works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants