Bug 810636 - Poor copy & paste behavior with pdf.js #2989

Closed
jviereck opened this issue Mar 25, 2013 · 15 comments

Comments

@jviereck
Contributor

See also Bug 810636 on Bugzilla: https://bugzilla.mozilla.org/show_bug.cgi?id=810636 for reference.

Basic problem: The textLayer is built up from multiple <span> elements. If the user selects text across multiple spans and copies it, a newline is inserted between each span. In most cases it makes more sense to insert just a space. Firefox now supports changing the clipboardData during the copy/cut event (background: https://bugzilla.mozilla.org/show_bug.cgi?id=407983).

Implementation idea:

  1. Add onCopy and onCut event listeners to PDF.js
  2. When the event fires, look at the currently selected text in the textLayer
  3. Take the selected text of each span and concatenate the pieces, separated by a space
  4. Put the resulting string on the clipboardData object by using
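
The steps above could be sketched roughly as follows. This is only an illustration, not code from PDF.js: `joinSelectedSpans`, `handleCopy`, and the `getSelectedTexts` callback are hypothetical names, and how the selected spans are collected from the textLayer is left open. The only browser APIs assumed are `clipboardData.setData()` and `preventDefault()` on the copy/cut event.

```javascript
// Hypothetical helper (not part of PDF.js): joins the text of each
// selected span with a single space instead of a newline.
function joinSelectedSpans(texts) {
  return texts
    .map((t) => t.trim())
    .filter((t) => t.length > 0)
    .join(' ');
}

// Sketch of a copy/cut listener. `getSelectedTexts` is an assumed callback
// that returns the text of every selected span in the textLayer.
function handleCopy(event, getSelectedTexts) {
  const text = joinSelectedSpans(getSelectedTexts());
  if (!event.clipboardData || text.length === 0) {
    // Nothing to do: leave the browser's default copy behaviour in place.
    return false;
  }
  event.clipboardData.setData('text/plain', text);
  // Without this the browser would overwrite the data set above.
  event.preventDefault();
  return true;
}
```

In a browser this might be wired up with something like `document.addEventListener('copy', (e) => handleCopy(e, collectTexts))`, where `collectTexts` is again a placeholder for whatever reads the current selection.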

This shouldn't be hard to implement, and I have a plan for it but lack the time to do it myself and make sure it lands properly.

Let me know if someone is interested in fixing this.

@vyv03354
Contributor

The textLayer is built up from multiple <span> elements.

Actually, the textLayer is built up from <div> elements, not <span>. The newline is inserted because <div> is a block-level element.
Simply changing <div> to <span> will already improve the copy result without adding any JavaScript code.

@mduan
Contributor

mduan commented Mar 27, 2013

If we just changed all <div>s to <span>s, wouldn't we have the opposite problem? That there wouldn't be spaces between paragraphs? I guess in general this would still be better behaviour than having newlines where there should not be since it's less common to select multiple paragraphs.

@vyv03354
Contributor

Correct, but we can add <br> or something similar wherever a newline is expected. The opposite (removing a newline where it is inappropriate) is impossible without adding onCopy and onCut handlers.
Note that I don't oppose the clipboardData solution. However, the more natural markup will work as a fallback even if the clipboardData object is not supported or is disabled (for security reasons, for example).
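
To make the markup idea concrete, here is a minimal sketch that emits inline <span> elements and adds a <br> only where a newline is expected. It is not how PDF.js builds its textLayer: the `runs` array, its `endsParagraph` flag, and both function names are assumptions made up for this example, and real text layer elements also carry positioning styles that are omitted here.

```javascript
// Escapes the characters that would otherwise be parsed as markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Hypothetical builder: `runs` is an assumed array of
// { text: string, endsParagraph: boolean } objects.
function buildTextLayerMarkup(runs) {
  return runs
    .map((run) => {
      // Inline element, so no implicit newline is copied after it.
      const span = '<span>' + escapeHtml(run.text) + '</span>';
      // Explicit line break only where a newline is actually wanted.
      return run.endsParagraph ? span + '<br>' : span;
    })
    .join('');
}
```

Because the line breaks live in the markup itself, this approach would still behave sensibly when no copy/cut handler is available.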

@jviereck
Contributor Author

+1 for using <span>.

@SSk123
Contributor

SSk123 commented Apr 18, 2013

Hi, I am interested in fixing this issue. Can anyone mentor me? Where in the code should I start looking?

@brendandahl
Contributor

@rishibaldawa

I got the span working well but can't find the newline character. Is it getting trimmed somewhere earlier?

@timvandermeij
Contributor

@rishib1988 Are you still working on this? If so and you need any help, you can always contact us using IRC. It would be nice to have this feature in PDF.js :)

@lpy
Contributor

lpy commented Oct 26, 2013

Hello. I am interested in fixing this issue. Could anyone help me? Where should I start looking?

I will try to read viewer.js. Is there anything else?

@timvandermeij
Contributor

@lpy I think @SSk123 is also working on this, but I'm not 100% sure. If you want help with this, the best thing to do is to contact us using the PDF.js IRC channel (irc.mozilla.org, #pdfjs).

/cc @yurydelendik @SSk123

@SSk123
Contributor

SSk123 commented Nov 9, 2013

@lpy, good to see you are interested in fixing this issue. I just wanted to check whether you are working on it; otherwise I would be happy to work on it :)

@vagifverdi

Any update on this issue? We have to tell our customers to use Adobe Reader because of this problem. Copy/paste is a must-have for us.

@jviereck
Contributor Author

jviereck commented Jul 1, 2014

Just gave it a quick try and replaced the divs with spans. This removes the newlines as mentioned above; however, it also drops the space that should separate consecutive lines. E.g. the paragraph:

block, a trace can contain join nodes. Since a trace always only
follows one single path through the original program, however, join 

gets copied to a single line:

block, a trace can contain join nodes. Since a trace always onlyfollows one single path through the original program, however, join

Note the missing space in "onlyfollows".
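
One possible way to avoid this, sketched here under the assumption that the text of each line is available as a separate string, is to insert a space at every line boundary unless whitespace is already present. `joinLines` is a hypothetical helper and not part of the linked commit.

```javascript
// Hypothetical helper: joins consecutive lines into one string and makes
// sure exactly one separator exists at each line boundary when neither
// side already provides whitespace.
function joinLines(lines) {
  let result = '';
  for (const line of lines) {
    if (line.length === 0) {
      continue; // skip empty lines entirely
    }
    const needsSpace =
      result.length > 0 && !/\s$/.test(result) && !/^\s/.test(line);
    if (needsSpace) {
      result += ' ';
    }
    result += line;
  }
  return result;
}
```

Applied to the two lines quoted above, this yields "... always only follows one single path ..." with the space restored. Words hyphenated across a line break are not handled by this sketch and would need separate treatment.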

PS: you can find my work here: jviereck@47b97bf

@Snuffleupagus
Collaborator

There is also PR #4629, but work on that seems to have stopped.

@timvandermeij
Contributor

Closing since we now use span elements. For any remaining issues, please check if we already have an open issue. If not, please open a new one since this one has become too general to be actionable.
