Skip to content

Commit

Permalink
Strip trailing slashes and match \r\n line breaks
Browse files Browse the repository at this point in the history
Unfortunately this commit introduces two changes because I haven't
figured out how to make them independently of each other:

* Strip trailing backslashes from the lines of an address
* When replacing line breaks with "<br>"s in the parsing of address blocks,
  replace the whole "\r\n" line break and not just the "\n" character

_Strip trailing backslashes from the lines of an address_

Kramdown has a feature where it will replace two trailing backslashes
from the input Govspeak with a line break in the output HTML. I'm in the
process of making some changes that will stop that feature from working
inside address blocks.

It doesn't appear to be used in the wild in any address blocks and we
don't officially support the feature in address blocks, so I think that
intentionally disabling it is reasonable.

To be clear, while Kramdown only treats pairs of backslashes this way,
we're actually removing all trailing backslashes because even an odd
number of them interferes with our code that handles address blocks.

_When replacing line breaks with "<br>"s in the parsing of address blocks,
replace the whole "\r\n" line break and not just the "\n" character_

The line breaks in the Govspeak in Whitehall's database and in the
strings that Govspeak Preview hands to the parser are composed of both
a return char and a newline char. In the resulting HTML, address blocks
contain the return chars still intact because the address block's
search-and-replace code was only matching and removing the newline
chars.

The intact return chars might actually be serving the purpose of
producing line breaks in our source HTML (i.e. not affecting the way the
HTML is rendered in the browser, but admittedly producing more readable
plain HTML). But if this is the case and if this turns out to have been
desirable behaviour I'd prefer that we do that more deliberately.

Since the example Govspeak in the test suite sometimes uses standalone
newlines and sometimes uses combined returns-and-newlines, it seems
safest to assume that both options are possible and so I've opted for
optionally matching either.
  • Loading branch information
mike3985 committed Dec 18, 2024
1 parent dc833c3 commit fd30721
Show file tree
Hide file tree
Showing 3 changed files with 26 additions and 1 deletion.
6 changes: 6 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,11 @@
# Changelog

## Unreleased

* Strip trailing backslashes from the lines of an address
* When replacing line breaks with "&lt;br&gt;"s in the parsing of address blocks,
replace the whole "\r\n" line break and not just the "\n" character

## 8.7.0

* Allow data attributes in spans ([#364](https://github.com/alphagov/govspeak/pull/364))
Expand Down
2 changes: 1 addition & 1 deletion lib/govspeak.rb
Original file line number Diff line number Diff line change
Expand Up @@ -305,7 +305,7 @@ def render_image(image)
<<~BODY
<div class="address"><div class="adr org fn"><p markdown="1">
#{body.sub("\n", '').gsub("\n", '<br />')}
#{body.sub(/\r?\n/, '').gsub(/\\*\r?\n/, '<br />')}
</p></div></div>
BODY
end
Expand Down
19 changes: 19 additions & 0 deletions test/govspeak_test.rb
Original file line number Diff line number Diff line change
Expand Up @@ -109,6 +109,25 @@ class GovspeakTest < Minitest::Test
assert_equal %(\n<div class="address"><div class="adr org fn"><p>\n123 Test Street<br>Testcase Cliffs<br>Teston<br>0123 456 7890 \n</p></div></div>\n), doc.to_html
end

test "newlines in address block content are replaced with <br>s" do
input = %($A\n123 Test Street\nTestcase Cliffs\nTeston\n0123 456 7890\n$A)
doc = Govspeak::Document.new(input)
assert_equal %(\n<div class="address"><div class="adr org fn"><p>\n123 Test Street<br>Testcase Cliffs<br>Teston<br>0123 456 7890<br>\n</p></div></div>\n), doc.to_html
end

test "combined return-and-newlines in address block content are replaced with <br>s" do
input = %($A\r\n123 Test Street\r\nTestcase Cliffs\r\nTeston\r\n0123 456 7890\r\n$A)
doc = Govspeak::Document.new(input)
assert_equal %(\n<div class="address"><div class="adr org fn"><p>\n123 Test Street<br>Testcase Cliffs<br>Teston<br>0123 456 7890<br>\n</p></div></div>\n), doc.to_html
end

test "trailing backslashes are stripped from address block content" do
# two trailing backslashes would normally be converted into a line break by Kramdown
input = %($A\r\n123 Test Street\\\r\nTestcase Cliffs\\\nTeston\\\\\r\n0123 456 7890\\\\\n$A)
doc = Govspeak::Document.new(input)
assert_equal %(\n<div class="address"><div class="adr org fn"><p>\n123 Test Street<br>Testcase Cliffs<br>Teston<br>0123 456 7890<br>\n</p></div></div>\n), doc.to_html
end

test "should convert barchart" do
input = <<~GOVSPEAK
|col|
Expand Down

0 comments on commit fd30721

Please sign in to comment.