Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enable spotless formatter #7449

Merged
merged 9 commits into from
Apr 27, 2020
Merged

Conversation

J-N-K
Copy link
Member

@J-N-K J-N-K commented Apr 22, 2020

  • apply "mvn spotless:apply"
  • enable spotless checks by default
  • exclude itest-inputs for feed

Supersedes #6874

Signed-off-by: Jan N. Klug jan.n.klug@rub.de

@mhilbush
Copy link
Contributor

It's really old, but it works (except for this one formatting issue). Having used it to write 11 bindings, I was comfortable with it. And, having read numerous remarks about the challenges of getting new installs working, I've been putting off upgrading for a long time. I guess it's time to upgrade. 😦

@wborn
Copy link
Member

wborn commented Apr 28, 2020

We should probably file issues for that.

There's now also an issue for the karaf-maven-plugin verify goal leaking threads:

https://issues.apache.org/jira/browse/KARAF-6697

@nedtwigg
Copy link

@mhilbush fwiw, a big part of spotless is that you don't have to worry about how your IDE formats. Just run mvn spotless:apply and it will fix the formatting, so it doesn't matter how your IDE is setup.

@wborn
Copy link
Member

wborn commented Apr 30, 2020

Are there ways for improving the Spotless check performance @nedtwigg?

Looks like our Jenkins build times increase more than 2.5x by enabling the Spotless check, see:

https://ci.openhab.org/job/PR-openHAB-Core/buildTimeTrend

The check was enabled with openhab/openhab-core#1446 (job #2195). As a test I added a commit to openhab/openhab-core#1443 in which the Spotless check is disabled (job #2213).

So if we would also enable the check for this repo, build times might also increase 2.5x, i.e. go from ~2 hours to 5+ hours (see PR build time trend).

I hope part of this may also be fixed by the solution for diffplug/spotless#559.

@wborn
Copy link
Member

wborn commented Apr 30, 2020

I think the decreased performance is mainly due to the amount of extra memory being used. We have another build job for that project that's using -Xmx4096m instead of -Xmx768m which doesn't seem to be having these issues (see build time trend). I'll change the -Xmx setting of the PR job which hopefully fixes it for now.

@nedtwigg
Copy link

Indeed, there are multiple big spotless-maven performance problems noted in #559. There may be other ways to improve, but those are the lowest-hanging fruit

@andrewfg
Copy link
Contributor

andrewfg commented May 11, 2020

Has anyone noticed that the Spotless formatter is trashing special characters? For example it trashes the degree symbol characters in my Units-Measure-Code as follows. I think this a breaking change! Or??

Before Spotless

if ("°C".equals(descr)) {
    return SIUnits.CELSIUS;
} else if ("°F".equals(descr)) {
    return ImperialUnits.FAHRENHEIT;
} else if ("K".equals(descr)) {
    return Units.KELVIN;
}

After Spotless

if ("�C".equals(descr)) {
    return SIUnits.CELSIUS;
} else if ("�F".equals(descr)) {
    return ImperialUnits.FAHRENHEIT;
} else if ("K".equals(descr)) {
    return Units.KELVIN;
}

@nedtwigg
Copy link

Spotless doesn’t do any auto-detection on encoding, it defaults to UTF-8. You can set to something else (maybe cp1252?) with this.

@andrewfg
Copy link
Contributor

@nedtwigg many thanks for the observation. I am sure that there are many ways to change the out of the box settings of MVN spotless, and solve future problems. But that was not my point..

My point was that with the current spotless settings set by the OH team, and the current out of the box settings of Eclipse, the formatter changed the functionality of the above piece of code; (not only how it looks, but also how it works).

This means that the recent automated application of spotless over the whole existing code-base of OpenHAB might have broken the existing functionality in some places. - and you guys might not discover it until breaks user's live system.

@nedtwigg
Copy link

the formatter changed the functionality of the above piece of code

That is an excellent point! Of all the times people have used Spotless and had encoding issues, you are actually the first one to correctly report it as a bug, which it is. We've just opened this PR which will error loudly when/if the encoding is set incorrectly in the future. It might not have helped with this particular situation, as it seems to have encoded to valid codepoints, just different ones.

Also, these format-the-world commits are currently an unavoidable way to get started with enforced formatting, but they are dangerous. To fix this, we have the ratchet feature on the way (hasn't shipped yet) which lets you enforce formatting only on files which have changed since origin/master or v3.1.5, so that you can adopt auto-formatting change-by-change rather than all at once.

@nedtwigg
Copy link

To fix your immediate problem, you could:

  • checkout the commit before spotless apply
  • set encoding correctly
  • run spotless apply
  • diff the correct-encoding apply with the incorrect-encoding apply

If you then apply the patch from that diff to your current working tree, any regressions will be fixed. Perhaps a good argument for snapshot-testing that this regression was not caught by a test suite.

@andrewfg
Copy link
Contributor

Hi @nedtwigg I have been experimenting, and found out that the best way of eliminating spotless issues seems to be by using inline tags in the source files..

//@formatter:off 
..
//@formatter:on

I used this to prevent modification of special characters as mentioned already; and in my Junit test class where I have some huge, but nicely formatted, JSON strings for testing the parser. Without applying spotless, the JSON is nicely indented according to elements, sub elements etc. Whereas spotless basically munges the whole string into maximum line length blocks, which looks awful, and is impossible for a human to read..

@nedtwigg
Copy link

Yes, those tags are useful! You are using the eclipse formatter which is highly configurable. I agree that it’s a bummer to reorganize existing string literals, I don’t which flag enables that exactly, but I do know that this configuration doesn’t do that:

https://github.com/diffplug/spotless/blob/master/gradle/spotless.eclipseformat.xml

@andrewfg
Copy link
Contributor

Crikey! But, usefully the formatter on/off tags have the same impact on Eclipse as they do on Spotless..

@andrewfg
Copy link
Contributor

andrewfg commented May 14, 2020

Just to give a final word on this..

For the binding that I am working on, the hardware manufacturer delivers its temperature strings with a degree symbol prefix "°C" which is coded in plain ASCII as decimal code=176 / hex=B0 / binary=10110000 / octal=260.

For whatever historical reason, Java strings allow ASCII characters to be included using octal escaping; so the temperature unit string"°C" can be coded as "\260C".

I have done a few tests before and after spotless, and I can confirm that the code "\260C" is able to "survive" being run through spotless. So in summary, the first assertion below works before spotless, whereas the second assertion below fails because spotless has changed "°" to "�"

// passes before running spotless
assertEqual("\260C",  "°C"); 

// fails after running spotless, because spotless changed the symbol
assertEqual("\260C",  "�C"); 

@nedtwigg
Copy link

nedtwigg commented May 14, 2020

Java strings allow ASCII characters Unicode codepoints (which may require up to 4 'characters' in UTF-8, or 2 characters in UTF-16) to be included using octal escaping

In Cp-1252, '°' encodes to 0xB0, and B0 is also the unicode codepoint. But in UTF-8 (which is Spotless default if you do not set encoding), the B0 codepoint encodes to 0xC2 0xB0. Because Spotless is set to UTF-8, 0xB0 on its own is invalid, and decodes to the codepoint U+FFFD � REPLACEMENT CHARACTER, which encodes in UTF-8 to 0xEF 0xBF 0xBD. If you read those bytes as Cp-1252:

EF BF BD
ï  ¿  ½

So it's not random, it's what you would expect if you roundtrip Cp-1252 through UTF-8.

@J-N-K
Copy link
Member Author

J-N-K commented May 14, 2020

IIRC UTF-8 would be the correct encoding for our files. The exception to this rule are the i18n-files. So we should check which files had the wrong encoding before applying spotless. @wborn, since you are the script-guru, do you have an oder how to find those files? I would then check if something went wrong while applying spotless and provide a fix for that.

@andrewfg
Copy link
Contributor

Hi @nedtwigg , I was not suggesting that it is random. My point is that the binding is reading a byte stream over the wire (not a word stream). And in that byte stream are embedded degree symbol characters that are encoded as one single byte xB0. And the binding Java code has to be able to compare these incoming stream strings with a Java string, in order to set the UoM on the data points. My point is that in Eclipse I can also write the Java string with a single byte B0 character, and everything works fine. But when I run spotless over the code, it replaces the single byte character with three bytes. So that when I recompile it, the binding does not work any more. I am just suggesting that you need to find a solution where spotless does not break the functionality of a binding that uses single byte special characters.

@nedtwigg
Copy link

@andrewfg Agreed, and I appreciate that you have correctly identified this as a bug in Spotless. Many have had encoding issues, you are the first to correctly identify it as such.
diffplug/spotless#575 will fix this for most scenarios (including this one).

@wborn
Copy link
Member

wborn commented May 15, 2020

Doesn't seem to me that there were unwanted file encoding changes (depending on how well file -i detects these):

$ git checkout c5fdd32420f98af893bba13c1be39e0f4d5e92eb
$ find . -type f | xargs -i@ file -i @ | awk -F ':' '{print $2, $1}' | awk '{print $2, $3}' | sort > ../encodings-before
$ git checkout 7628135323df6f07e3c2993d7b816a16e334d849
$ find . -type f | xargs -i@ file -i @ | awk -F ':' '{print $2, $1}' | awk '{print $2, $3}' | sort > ../encodings-after
$ diff ../encodings-before ../encodings-after 
2d1
< charset=binary ./bundles/org.openhab.binding.amazonechocontrol/src/main/java/org/openhab/binding/amazonechocontrol/internal/WebSocketConnection.java
12049a12049
> charset=us-ascii ./bundles/org.openhab.binding.amazonechocontrol/src/main/java/org/openhab/binding/amazonechocontrol/internal/WebSocketConnection.java
20574a20575
> charset=us-ascii ./.gitattributes

The WebSocketConnection we discussed in #7449 (comment).

You can also easily scroll through the big changes of that PR on the CLI with:

git show 7628135323df6f07e3c2993d7b816a16e334d849

When I reapply Spotless on c5fdd32 (on Linux), then rebase the result on 7628135 there aren't any major differences as well.

LoungeFlyZ pushed a commit to LoungeFlyZ/openhab2-addons that referenced this pull request Jun 8, 2020
Signed-off-by: Jan N. Klug <jan.n.klug@rub.de>
J-N-K added a commit to J-N-K/openhab-addons that referenced this pull request Jul 14, 2020
Signed-off-by: Jan N. Klug <jan.n.klug@rub.de>
CSchlipp pushed a commit to CSchlipp/openhab-addons that referenced this pull request Jul 26, 2020
Signed-off-by: Jan N. Klug <jan.n.klug@rub.de>
Signed-off-by: CSchlipp <christian@schlipp.de>
andrewfg pushed a commit to andrewfg/openhab-addons that referenced this pull request Aug 31, 2020
Signed-off-by: Jan N. Klug <jan.n.klug@rub.de>
andrewfg pushed a commit to andrewfg/openhab-addons that referenced this pull request Aug 31, 2020
Signed-off-by: Jan N. Klug <jan.n.klug@rub.de>
andrewfg pushed a commit to andrewfg/openhab-addons that referenced this pull request Aug 31, 2020
Signed-off-by: Jan N. Klug <jan.n.klug@rub.de>
andrewfg pushed a commit to andrewfg/openhab-addons that referenced this pull request Aug 31, 2020
Signed-off-by: Jan N. Klug <jan.n.klug@rub.de>
DaanMeijer pushed a commit to DaanMeijer/openhab-addons that referenced this pull request Sep 1, 2020
Signed-off-by: Jan N. Klug <jan.n.klug@rub.de>
Signed-off-by: Daan Meijer <daan@studioseptember.nl>
markus7017 pushed a commit to markus7017/openhab-addons that referenced this pull request Sep 19, 2020
Signed-off-by: Jan N. Klug <jan.n.klug@rub.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
infrastructure Build system and Karaf related issues and PRs
Projects
None yet
Development

Successfully merging this pull request may close these issues.