Support paragraphs grouping in the same div #5482

Closed
fjaguero opened this issue Nov 9, 2014 · 3 comments

fjaguero commented Nov 9, 2014

Hello,

I have a question about how a PDF file is converted to HTML: is it possible to detect paragraphs?

In the following screenshot you can see that each line of a paragraph is rendered as its own div, so there is no way (short of comparing the top offsets of the divs, etc.) to know where each paragraph ends.

Demo PDF

Example:

[screenshot: p]

Right now:

<div class="textLayer">
 <div>Line of text of the first p,</div>
 <div>another line of text of the first one</div>
 <div>This is another p</div>
</div>

Expected:

<div class="textLayer">
 <div>Line of text of the first p, another line of text of the first one</div>
 <div>This is another p</div>
</div>

The same file shows the expected paragraph separation when opened in native viewers such as Adobe's or Safari's built-in viewer: copying the selected text and pasting it elsewhere preserves the paragraph breaks.
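
To make the "playing with tops" workaround concrete, here is a minimal sketch of how lines could be merged into paragraphs from their vertical positions. It is not PDF.js code and not something the library provides: the function name `groupLinesIntoParagraphs`, the `{ text, top, height }` line shape, and the `gapFactor` threshold are all assumptions made for illustration. The idea is that a new paragraph starts whenever the distance between two consecutive line tops is clearly larger than a normal line pitch.

```javascript
// Hypothetical helper, not part of PDF.js.
// Each line is a plain object { text, top, height }, where `top` is the
// distance from the top of the page and `height` is the line's text height
// (both in the same unit, e.g. CSS pixels).
function groupLinesIntoParagraphs(lines, gapFactor = 1.5) {
  // Work on a copy sorted from the top of the page to the bottom.
  const sorted = [...lines].sort((a, b) => a.top - b.top);

  const paragraphs = [];
  let previous = null;

  for (const line of sorted) {
    // Distance between the tops of two consecutive lines (the line pitch).
    const pitch = previous === null ? 0 : line.top - previous.top;

    // Assumption: a pitch larger than gapFactor times the previous line's
    // height means there is extra vertical space, i.e. a paragraph break.
    const startsNewParagraph =
      previous === null || pitch > previous.height * gapFactor;

    if (startsNewParagraph) {
      paragraphs.push([line.text]);
    } else {
      paragraphs[paragraphs.length - 1].push(line.text);
    }
    previous = line;
  }

  // Join the lines of each paragraph with a single space.
  return paragraphs.map((parts) => parts.join(" "));
}

// Made-up positions for the three lines from the example above.
const exampleLines = [
  { text: "Line of text of the first p,", top: 100, height: 12 },
  { text: "another line of text of the first one", top: 114, height: 12 },
  { text: "This is another p", top: 142, height: 12 },
];

console.log(groupLinesIntoParagraphs(exampleLines));
```

With these made-up positions the first two lines are 14 units apart and end up in one paragraph, while the third line is 28 units below the second and starts a new one. A heuristic like this is fragile: it ignores multi-column layouts, first-line indentation, and paragraphs that are separated only by indentation rather than extra spacing, so the threshold would need tuning per document.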

fjaguero changed the title from "Support line-breaks and paragraphs in the same div" to "Support paragraphs grouping in the same div" on Nov 9, 2014
@fjaguero (Author)

I see #4629 as a related solution. I need to check if that code works.

@timvandermeij (Contributor)

I'm closing this as a duplicate of many other issues in the 4-text-selection category (and others listed in #4629).

@francescovallone

Hello, sorry for reviving this issue, but I would like to know whether this feature has been added to PDF.js. I couldn't find any way to group a paragraph into a single span instead of getting a separate span for each line of text.
