Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lesson_check.py: harden single-line image/link pattern #597

Merged

Conversation

maxim-belkin
Copy link
Contributor

This change hardens the pattern that matches single-line
image or link:

  1. It extends the pattern to be matched in a heading
  2. It allows the line to contain {: ...} customizations
  3. It allows the line to end with \

Fixes #591

This change hardens the pattern that matches single-line
image or link:

1. It extends the pattern to be matched in a heading
2. It allows the line to contain {: ...} customizations
3. It allows the line to end with \
@maxim-belkin
Copy link
Contributor Author

@jhlegarreta, here is the list of "long lines" that I get for your lesson with this change:

./CONTRIBUTING.md: Line(s) too long: 3, 114, 126
./README.md: Line(s) too long: 2, 3, 4, 5, 6, 7, 10, 14, 17, 18, 23, 25, 59, 60, 61, 64, 65
./_episodes/constrained_spherical_deconvolution.md: Line(s) too long: 69, 77
./_episodes/deterministic_tractography.md: Line(s) too long: 44, 61, 267, 272, 274, 283
./_episodes/diffusion_tensor_imaging.md: Line(s) too long: 32, 48, 52, 53, 56, 121, 166, 237, 320
./_episodes/preprocessing.md: Line(s) too long: 137, 175, 193, 243, 245, 270, 271, 273, 347, 368, 392
./_episodes/probabilistic_tractography.md: Line(s) too long: 73, 92
./_extras/discuss.md: Line(s) too long: 76
./_extras/guide.md: Line(s) too long: 30
./index.md: Line(s) too long: 30

The lines you were concerned about seems to be no longer reported as "too long".

Line 190 in _episodes/constrained_spherical_deconvolution.md was triggering the "Line too long" warning with the previous pattern because it wasn't matching against it due to {:class=...} and \ at the end of the line. Before that, this line was skipped by this check because it begins with an !.

Line 458 in the same file was triggering the "Line too long" warning again due to {:class=...} and \ at the end of the line not letting the previous pattern to match it.

@maxim-belkin
Copy link
Contributor Author

I'm going to merge this one as all the checks have passed and I'm pretty confident it addresses the issues reported.

@jhlegarreta, let me know if there is some other extra syntax that you use (or would like to use) for links/images that I haven't captured in this regex.

@maxim-belkin maxim-belkin merged commit af7efa5 into carpentries:gh-pages May 9, 2021
@maxim-belkin maxim-belkin deleted the fix-link-image-pattern branch May 9, 2021 08:28
@jhlegarreta
Copy link

jhlegarreta commented May 9, 2021

@maxim-belkin thanks for the explanation in the commit message, and your time and effort !

The new changeset does fix the lines I pointed to 👍. The relevant part of my make lesson-chell-all output is exactly the same as the one you posted.

There are still three instances that are not handled correctly:
./_episodes/diffusion_tensor_imaging.md: 48, 56, 166

I am not sure in which ways those lines are different from other instances. I have noticed that they have a trailing whitespace, and they only contain a link, but apart from that I see no other noticeable different to the many similar instances in that file.

Some other lines, e.g. 37, do have trailing whitespaces, are link-only lines within a paragraph, exceed the 100 character limit, and are not triggering a warning.

I grant that trailing whitespaces should always be trimmed, and that we should configure our tools to do that. If you conclude that the reason behind the above lines not being handled correctly are the trailing whitespaces, I'll remove them all from the file. However, note that I went ahead and removed them locally (only for the pointed lines), then passed the check again and the warnings are still there.

So there is still something that is elusive to us.

@maxim-belkin
Copy link
Contributor Author

There are still three instances that are not handled correctly:
./_episodes/diffusion_tensor_imaging.md: 48, 56, 166

All of these lines are images and they have the following parts that don't match the pattern:

  • Line 48 has a dot at the end.
  • Line 56 has a dot at the end.
  • Line 166 has a comma at the end.

What's interesting (at least, I've never seen this done before) is that you treat images' alt text as part of the corresponding paragraphs. I think that due to accessibility considerations paragraphs should refer to images rather than use them this way.

Some other lines, e.g. 37, do have trailing whitespaces, are link-only lines within a paragraph, exceed the 100 character limit, and are not triggering a warning.

Line 37 is an image-only line, doesn't have neither a period nor comma at the end, so it matches the pattern and is, therefore, skipped by this check.

I grant that trailing whitespaces should always be trimmed

Two trailing whitespaces force line breaks in Markdown - see 740773a (PR #522), so you should keep them, if that's what you use them for.

@jhlegarreta
Copy link

Again, thanks for having had a look at this @maxim-belkin.

All of these lines are images and they have the following parts that don't match the pattern:
Line 48 has a dot at the end.
Line 56 has a dot at the end.
Line 166 has a comma at the end.

Thanks for identifying the reason. As I see it, having a punctuation mark at the end should not be a reason to miss these cases. But let me know if I'm wrong.

What's interesting (at least, I've never seen this done before) is that you treat images' alt text as part of the corresponding paragraphs. I think that due to accessibility considerations paragraphs should refer to images rather than use them this way.

Not sure if "treating images' alt text as part of the corresponding paragraphs" was intended. I do not see its effects or how this can be done otherwise. Alt texts were introduced so that all images, even the inline ones, have a caption when being listed in
https://carpentries-incubator.github.io/SDC-BIDS-dMRI/figures/index.html

So not sure how this comment and the above are intertwined so that we can fix the cases.

Two trailing whitespaces force line breaks in Markdown - see 740773a (PR #522), so you should keep them, if that's what you use them for.

The intention was not to force line breaks; maybe it was just the way the contributor added the text. But I did not know about the two trailing whitespaces. Interesting. Thanks.

@maxim-belkin
Copy link
Contributor Author

As I see it, having a punctuation mark at the end should not be a reason to miss these cases.

Currently, images in the lessons are displayed as block elements (line 256 below), meaning that they start on new lines and take up the whole width of the page:

article img {
display: block;
margin: 20px auto;
max-width: 100%;
}

Because of that, text that follows immediately after an image in Markdown ends up on a new line in HTML, see the screenshot below:

screenshot

rendered version of the "Diffusion Tensor Imaging" episode of the "Introduction to dMRI" lesson

So, all of that was to say that our lessons don't currently support inline images. But they could and, probably, should. I'll submit a PR proposing this and another PR to allow punctuation marks after links and images because it seems logical to be able to use punctuation marks after the inline images.

@jhlegarreta
Copy link

Thanks for having had a close look at this 💯.

Currently, images in the lessons are displayed as block elements (line 256 below), meaning that they start on new lines and take up the whole width of the page:
Because of that, text that follows immediately after an image in Markdown ends up on a new line in HTML, see the screenshot below:

I see. Yes, I had noticed that what we pretended to be inline images were in fact not so. I did not investigate further, though. Now I know the reason 👍.

So, all of that was to say that our lessons don't currently support inline images. But they could and, probably, should. I'll submit a PR proposing this and another PR to allow punctuation marks after links and images because it seems logical to be able to use punctuation marks after the inline images.

Excellent ! Both endeavors make sense to me as separate PRs. They are by no means urgent. At least we now know better the reasons behind.

@maxim-belkin
Copy link
Contributor Author

maxim-belkin commented May 11, 2021

@jhlegarreta, I submitted PR #599 that allows up to 3 non-word characters to appear on the same line before and after a link or an image. I chose 3 characters as a middle ground between "too few" and "too many". I think it's OK to allow non-word characters in the beginning of the image because I noticed that in the carpentries-incubator/SDC-BIDS-dMRI lesson there is at least one case where an image is surrounded by parentheses.

@jhlegarreta
Copy link

jhlegarreta commented May 12, 2021

Thanks @maxim-belkin !

Looks like our diffusion_tensor_imaging.md episode gets a nice benefit from the PR:

./_episodes/diffusion_tensor_imaging.md: Line(s) too long: 32, 53, 121, 320

vs. without the changes in the PR:

./_episodes/diffusion_tensor_imaging.md: Line(s) too long: 32, 48, 52, 53, 56, 121, 166, 237, 320

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow skipping line length checks for link-only lines
2 participants