Clang format proposal v4 #5186

roligugus · 2020-07-15T05:28:07Z

Continuation of #5092

Make sure these boxes are signed before submitting your Pull Request -- thank you.

I have read the contributing guide lines at https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Contributing
I have signed the Open Information Security Foundation contribution agreement at https://suricata-ids.org/about/contribution-agreement/
I have updated the dev guide (in doc/devguide/) to reflect the changes made (if applicable)

Link to redmine ticket:
https://redmine.openinfosecfoundation.org/issues/3736

Describe changes:

Addressed a few comments only
Updated code-style doc (in code, not wiki)
Added github action to run formatting check on pull request and push (of local branch). Currently produces a warning if formatting is not ok.
Added make targets for easier usage (I hope) for devs. Especially when reformatting whole branches. See code-style doc.

Next Steps

Get rid of TODOs and make decisions. Clean up pull request and open a "real one".

Requires

Minimal clang-format-9 due to some settings. See code-style.rst

Decisions to make

Bikeshedding some settings? Please see TODOs in code-style.rst
Should github formatting-check produce an error or a warning?
I might have forgotten something? Any TODO I have
Minimal .clang-format files, vs complete. I recommend the minimal .clang-format file.
Decide what to do with the source code sample. I've only copied some things into the code-style doc

Updated code-style.rst

Please see TODOs there that need a decision

Github Action

New formatting-check
Currently only produces a warning if formatting is off. Kind of a compromise to not pee everyone off that they have to update their branch. However, this kind of limits the usefulness of it.
Run on pull request
Run on push (useful when pushing new commits to private fork)
That integration was painful. Learned more about github actions than what I ever wanted to.

Tested github action and script

See "Check formatting step" in formatting-check log on how the formatting check results are presented.

Examples of what is produced. Note, even if formatting is not ok, the status is green as we only produce warnings.

Branch off current master:

No formatting required (branched off current master): https://github.com/roligugus/suricata/runs/872041858?check_suite_focus=true
Formatting required (newest commit needs formatting): https://github.com/roligugus/suricata/runs/871938082?check_suite_focus=true
Formatting required (first commit needs formatting): https://github.com/roligugus/suricata/runs/871972045?check_suite_focus=true

Branched off older master commit:

No formatting required: https://github.com/roligugus/suricata/runs/872001307?check_suite_focus=true
Formatting required (newest commit needs formatting): https://github.com/roligugus/suricata/runs/871994986?check_suite_focus=true
Formatting required (first commit needs formatting): https://github.com/roligugus/suricata/runs/871992927?check_suite_focus=true

Other

Other clang versions, etc: https://github.com/roligugus/suricata/actions/runs/169573386

Make targets and helper scripts

added make targets that are used in github action and can be used
added scripts/clang-format.sh to handle the branch reformatting and checking of formatting logic
added a customized version of git-clang-format as I depend on a feature that does not exist yet: listing of files needing formatting. I will try to upstream this change to clang.

clang-format allows to auto-format C code. The settings here are set up to follow the code style, see https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Coding_Style. Some bike shedding should still happen. Includes sample formatted code. One can set up .clang-format as "differential setting based on an existing project" or specify all settings. Includes both for people to compare. DO NOT MERGE AS IS.

.github/workflows/formatting.yml

victorjulien · 2020-07-15T05:31:41Z

@jasonish can you have a look at the github-actions integration?

roligugus · 2020-07-15T05:31:43Z

.github/workflows/formatting.yml

+name: formatting-check
+
+on:
+  push:


This enables running the checks on pushes to private forks which I kinda like. No need to create local pull requests to have formatting checked.

.github/workflows/formatting.yml

roligugus · 2020-07-15T05:35:24Z

.github/workflows/formatting.yml

+                wget \
+                zlib1g \
+                zlib1g-dev
+      - name: Install packages for clang-format 9


Clang-format 9 is minimal requirement for the chosen .clang-format settings.

Tested with clang 10, clang 11 (trunk) as well. See overall pull request comments

.github/workflows/formatting.yml

roligugus · 2020-07-15T05:37:20Z

.github/workflows/formatting.yml

+          elif [ $rc -eq 1 ]; then
+              # limit output as it might be a lot in the worst case
+              tail -n 100 check_formatting_log.txt
+              echo "::warning ::Formatting is not following code style guide!"


It would be trivial to switch formatting problems to an error here.

I'm not too keen on that right now.

Can we make it an error but start out with having the job be allowed to fail? This way we'll get visual feedback, but it won't fail the build. On of our travis tests is set up this way. Not sure if github actions allows for it.

Just played around a bit.

There's the continue-on-error one can set for a job which would allow us to set an "error" in the check-formatting job. However, the visual representation still sucks in github. It'll still set the job to green.

From what i can gather it's a limitation of github compared to e.g. Travis (and most other CI solutions). See open feature request: https://github.com/actions/toolkit/issues/399

There's also a way to have a job fail and then still run follow-on jobs with e.g. if: always(). However, that means configuring the follow-on jobs to still run if any of the previous jobs failed. No idea, how github decides on the order of jobs and dependencies...

To be more precise, setting continue-on-error while not perfect is much better.

It'll show the "formatting-check" as failed in the pull-request's checks overview. See e.g. roligugus#5
Don't know if that would prevent you from merging a pull request if formatting was not ok.

Compare that with the previous approach of a warning only and formatting-check always succeeded.

Looks like that the continue-on-error setting is almost useless as it seems to only affect the yaml's jobs it is in. It could be interesting if ever a rustfmt/cargo fmt check was added into the same formatting-check.yaml though.

Anyhow, it seems one can pick which status checks that have to pass in your "Branch Protection Rules", see https://docs.github.com/en/github/administering-a-repository/enabling-required-status-checks

So as long as you don't require the formatting-check to pass, you can merge pull requests.

Hence, I'd recommend the formatting-check to error out in case of wrong formatting problems and not pick it as a required "status check" for merging purposes. That way, you'd see the red failure indication in the pull request, yet your merging is not blocked. I.e. it's up to you as maintainers to decide whether to request a formatting change on the pull request or merge something in anyways.

@jasonish and @victorjulien Any thoughts on such an approach?

I think that makes sense, we can always disable it again if we find it to be too intrusive.

roligugus · 2020-07-15T05:40:26Z

Makefile.am

+
+#--- clang-formatting ---
+# Check if branch changes formatting is correct
+.PHONY: check-style-branch


These targets help with github actions. More importantly, they help with reformatting code and whole branches. See https://github.com/roligugus/suricata/blob/clang-format-proposal-v4/doc/devguide/codebase/code-style.rst#use-make-targets

The makefile targets, as well as the underlying script that gets called, also isolate the dev from knowing which git-flang-format version to call (on ubuntu).

Do these script wrappers in the Makefile actually help with github actions over just calling the scripts directly? I ask as they seem to be direct wrappers over scripts, and often the scripts are easier to discover, so I wonder the value of providing these as mark targets.

Nope, they do not matter for the actions, as that is all scripted. In fact I no longer use make as it
hides the scripts exit code" there.

I only added the make targets as an option and alternative to use for people essentially. The idea was that it might be easier for people to remember some of the more complicated calls such as make diffstat-style-branch than ./scripts/clang-format.sh check-branch --make --diffstat --show-commits. Having said that, for the ones that mostly count for devs, there is not much difference directly calling the script.

Some projects like having that option, some not. I am usually in the latter part to be honest, but wanted to give you guys an option. It can easily be removed if not needed.

Any quick yay or nay on whether to keep the make targets? I get the feeling from @jasonish they should rather be removed?

I removed the make targets for v5. Agree with @jasonish that they are not that useful

clang-format.full.do_not_merge

roligugus · 2020-07-15T05:41:20Z

doc/devguide/codebase/code-style.rst

@@ -6,12 +6,165 @@ Suricata uses a fairly strict coding style. This document describes it.
 Formatting
 ~~~~~~~~~~

+clang-format


If the diff sucks, see https://github.com/roligugus/suricata/blob/clang-format-proposal-v4/doc/devguide/codebase/code-style.rst for how it's displayed

roligugus · 2020-07-15T05:42:09Z

doc/devguide/codebase/code-style.rst

+********************************************
+clang-format 9 is required.
+
+TODO - REMOVEME: We could possibly provide an "install-clang-format" script if we wanted to.


All the TODOs in this file here need addressing and decisions.

.clang-format

doc/devguide/codebase/code-style.rst

roligugus · 2020-07-15T05:51:56Z

doc/devguide/codebase/code-style.rst

+clang-format:
+ - ColumnLimit: 80
+ - ContinuationIndentWidth: 8
+ - ReflowComments: true


@jasonish I've enabled reformatting of comments. Also in the actual .clang-format file

roligugus · 2020-07-15T05:53:16Z

doc/devguide/codebase/code-style.rst

+
+NOTE - REMOVEME: tab default width of 8, not 4, as that's most editors default
+
+NOTE - REMOVEME: Old sample code function parameter indentation was 4. Indentation for next line of function parameters follows


clang only support on "Continuation Indent", hence function wrapping should be indented by 8 as well

So is this broader than just the function parameter wrapping? For that it makes sense to me.

Yes, ContinuationIndentWith is used for any statement needing to span over multiple lines in general, i.e. ifs, fors, variable initialization, functions, etc.

Having said that, some of them have more knobs where one can override "Continuation Indent". For functions, one can specify multiple alignments, in fact.

I set up DontAlign which uses "Continuation Indent" for function continuation as I found that approach was used most often in the code I checked. Plus, I'd also recommend using that setting.

However, I found some code using the Align approach. That latter setting is useless for C++ code due to people's misguided use of java-naming-approaches to class, namespace, method and type names. It's less bad for C code in general, but the longer the function name and return value, the further to the right you get all parameters. Just not very pretty imho, as you might end up with many parameters on their own line in the worst case. My preference is the "denser" approach with fewer lines.

Please let me know if you'd like to change that setting.

Here are the possible values:
AlignAfterOpenBracket (BracketAlignmentStyle)
If true, horizontally aligns arguments after an open bracket.

This applies to round brackets (parentheses), angle brackets and square brackets. Among them, function

Possible values:

BAS_Align (in configuration: Align) Align parameters on the open bracket, e.g.:

someLongFunction(argument1, argument2);

BAS_DontAlign (in configuration: DontAlign) Don’t align, instead use ContinuationIndentWidth, e.g.:

someLongFunction(argument1, argument2);

BAS_AlwaysBreak (in configuration: AlwaysBreak) Always break after an open bracket, if the parameters don’t fit on a single line, e.g.:

someLongFunction( argument1, argument2);

roligugus · 2020-07-15T05:55:05Z

doc/devguide/codebase/code-style.rst

+
+.. code-block:: c
+
+    //vvv--- REMOVEME


Code samples that should help with the decision on what configuration to chose. Hence tagged with REMOVEME

doc/devguide/codebase/code-style.rst

sample_formatted_code.do_not_merge.c

doc/devguide/codebase/code-style.rst

roligugus · 2020-07-15T06:02:43Z

doc/devguide/codebase/code-style.rst

+
+Format your Changes
+*******************
+Before opening a pull request, please also try to ensure it is formatted


This is almost verbatim from the llvm dev guide.

sample_formatted_code.do_not_merge.c

roligugus · 2020-07-15T06:11:31Z

scripts/clang-format.sh

@@ -0,0 +1,591 @@
+#!/bin/bash
+#


I was bashing around. Sorry. Kinda became a complete script.

The goal of this script is

to be the backing of the make targets

provide reformatting of all commits on a branch functionality

provide hints to the user what to call to fix the formatting. Used in github actions log.

provide nicer error messages if something is not nicely set up, e.g. missing install of clang-format

isolate the ubuntu clang versionitis, e.g "oh, I installed clang-format-10, so I should call git clang-format-10, not git clang-format..."

Best reviewed from the bottom up, i.e. start at the end.

scripts/git-clang-format-custom

jasonish

The github workflow looks fine, just a little unclear around the need for a ref at all in the checkout.

.clang-format

jasonish · 2020-07-15T20:50:11Z

Makefile.am

+
+#--- clang-formatting ---
+# Check if branch changes formatting is correct
+.PHONY: check-style-branch


Do these script wrappers in the Makefile actually help with github actions over just calling the scripts directly? I ask as they seem to be direct wrappers over scripts, and often the scripts are easier to discover, so I wonder the value of providing these as mark targets.

.github/workflows/formatting.yml

jasonish · 2020-07-15T21:18:58Z

.github/workflows/formatting.yml

+          elif [ $rc -eq 1 ]; then
+              # limit output as it might be a lot in the worst case
+              tail -n 100 check_formatting_log.txt
+              echo "::warning ::Formatting is not following code style guide!"


I'm not too keen on that right now.

jasonish · 2020-07-15T21:26:56Z

Some formatting oddities I found:

const SCPlugin PluginSpec = { 
        .name = OUTPUT_NAME,
        .author = "Some Developer",
        .license = "GPLv2",
        .Init = TemplateInit,
};

with our current code guidline I believe that would get an indent of 4, but its reformatted to 8. I believe statements like this should get the 4 indent level.

It does get it correct for something like this:

-enum LogFileType { LOGFILE_TYPE_FILE,
-                   LOGFILE_TYPE_SYSLOG,
-                   LOGFILE_TYPE_UNIX_DGRAM,
-                   LOGFILE_TYPE_UNIX_STREAM,
-                   LOGFILE_TYPE_REDIS,
-                   LOGFILE_TYPE_PLUGIN };
+enum LogFileType {
+    LOGFILE_TYPE_FILE,
+    LOGFILE_TYPE_SYSLOG,
+    LOGFILE_TYPE_UNIX_DGRAM,
+    LOGFILE_TYPE_UNIX_STREAM,
+    LOGFILE_TYPE_REDIS,
+    LOGFILE_TYPE_PLUGIN
+};

but gets it wrong for something like this:

     const char *builtin[] = {
-        "regular",
-        "syslog",
-        "unix_dgram",
-        "unix_stream",
-        "redis",
-        NULL,
+            "regular",
+            "syslog",
+            "unix_dgram",
+            "unix_stream",
+            "redis",
+            NULL,
     };

It appears to want to put the { on a new line after a macro expansion like TAILQ.. I do belive this may be a tricky one though:

     ConfNode *plugin = NULL;
-    TAILQ_FOREACH(plugin, &conf->head, next) {
+    TAILQ_FOREACH(plugin, &conf->head, next)
+    {

jasonish · 2020-07-15T21:31:58Z

Oh, I also noticed the formatting workflow didn't run when I pushed to my own repo (I added some code changes to test). Hard to know if this is something wrong with the workflow file (it looks fine) or GitHub, as I've noticed this happen recently on GitHub for another repo as well.

roligugus · 2020-07-16T00:03:50Z

Oh, I also noticed the formatting workflow didn't run when I pushed to my own repo...

If also seen that sometimes. My theory was that in my case I've done too many changes in a short time frame, but that might just be hot air from my side.

I agree that the workflow seems correct and covers the case of updating a pull request as well as pushing new versions of a branch. But maybe we're missing something...

roligugus · 2020-07-16T00:08:38Z

It appears to want to put the { on a new line after a macro expansion like TAILQ.. I do belive this may be a tricky one though:
     ConfNode *plugin = NULL;
-    TAILQ_FOREACH(plugin, &conf->head, next) {
+    TAILQ_FOREACH(plugin, &conf->head, next)
+    {

See comment https://github.com/OISF/suricata/pull/5186/files#r454813018 and following lines.

One can configure "foreach macros" that it pulls the opening parens up as you'd like in .clang-format with ForEachMacros, see https://github.com/OISF/suricata/pull/5186/files#diff-69a68f7ce1ea3687a38d8e4705e6fa48R31. I did configure 'json_array_foreach', 'json_object_foreach' to pull the parens up as you'd like. I have not configured TAILQ...() and other "foreach macros" as the current code is awfully inconsistent and the ones I checked used parens on next line.

Anyhow, it's trivial to add them. I've listed all "foreach macros" I found in the code in sample_formatted_code.do_not_merge.c as well, see https://github.com/OISF/suricata/pull/5186/files#diff-c10e7f7ebb587c6259db37a9ce170740R458 (crap, click on "load diff" for sample_formatted_code.do_not_merge.c for the link to work). There might be more.

This would be a good time to make it consistent and force all known "foreach macros" to have the opening parens with the macro. ;)

roligugus · 2020-07-16T00:42:55Z

Some formatting oddities I found:
const SCPlugin PluginSpec = { 
        .name = OUTPUT_NAME,
        .author = "Some Developer",
        .license = "GPLv2",
        .Init = TemplateInit,
};
with our current code guidline I believe that would get an indent of 4, but its reformatted to 8. I believe statements like this should get the 4 indent level.

...

but gets it wrong for something like this:

     const char *builtin[] = {
-        "regular",
-        "syslog",
-        "unix_dgram",
-        "unix_stream",
-        "redis",
-        NULL,
+            "regular",
+            "syslog",
+            "unix_dgram",
+            "unix_stream",
+            "redis",
+            NULL,
     };

This is because I chose the setting that does that for variable definitions. ;) My point-of-view was that the variable definition is really a continuation.

One can change that behaviour with Cpp11BracedListStyle. Despite it's name, this applies to C and C++ as with all settings. It'll use the IndentWidth, i.e. 4 spaces, for braced-style variable definitions if set to false.

So if you prefer that, we change it. No biggie.

A discussion on that behaviour with some samples is in sample_formatted_code.do_not_merge., see https://github.com/OISF/suricata/pull/5186/files#diff-c10e7f7ebb587c6259db37a9ce170740R340 (crap, click on "load diff" for sample_formatted_code.do_not_merge.c for the link to work). I call it "variable init continuation behaviour" and "designated initializer continuation behaviour" there.

That sample code uses 8 spaces ContinuationIndent as that's how it's currently set up.

The formatting of the sample code with Cpp11BracedListStyle:false which uses IndentWith, i.e. 4 spaces, for continuation lines would be:

    //--- variable init continuation behaviour
    // Cpp11BracedListStyle impacts if space at beginning and end of brace
    // e.g. false: Bla bla[] = { 1, 2, 3 };
    //      true:  Bla bla[] = {1, 2, 3};
    // Note, the different indentation of continuation line:
    //  - Cpp11BracedListStyle:false does NOT use ContinuationIndentWidth (8)
    //    but rather the regular IndentWidth (4)! Bug? Feature?
    //  - Cpp11BracedListStyle:true uses ContinuationIndentWidth (8)
    int list[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
    int list[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
        18 };
    int list[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19, 20, 21, 22 };
    PcapStats64 last = { 0, 1234, 8765 };
    // trailing comma forces one-value-by-line
    PcapStats64 trailing_comma = {
        0,
        1234,
        8765,
    };
    struct pcap_stat current = { 12, 2345, 9876, 1345 };
    struct pcap_stat current = { 12, 2345, 9876, 1345, 333, 444, 5555, 66666,
        7777777 };
    current = { 12, 2345, 9876, 1345, 333, 444, 5555, 66666, 7777777, 888888,
        9999999 };

    //--- designated initializer continuation behaviour
    // Current clang-format disables BinPacking for designated intializers when
    // continuing on more than one line.
    PcapStats64 last = { .ps_recv = 0, .ps_drop = 1234, .ps_ifdrop = 8765 };
    // Cpp11BracedListStyle:false puts end brace onto separate line iff
    // continuation line can hold all intializers. Bug? Feature?
    struct pcap_stat current = {
        .ps_recv = 12, .ps_drop = 2345, .ps_ifdrop = 9876, .ps_what = 134
    };
    pcap_stat current = {
        .ps_recv = 12, .ps_drop = 2345, .ps_ifdrop = 9876, .ps_what = 134
    };
    // One designated initializer per line if it does not fit into one
    // continuation line as BinPacking is disabled for designated intializer.
    struct pcap_stat current = { .ps_recv = 12,
        .ps_drop = 2345,
        .ps_ifdrop = 9876,
        .ps_what = 1345,
        .ps_more = 333 };
    current = { .ps_recv = 12,
        .ps_drop = 2345,
        .ps_ifdrop = 9876,
        .ps_what = 1345,
        .ps_more = 333 };

.clang-format

roligugus · 2020-07-16T16:14:06Z

doc/devguide/codebase/code-style.rst

+
+in order to update the last commit with all pending changes.
+
+Use make targets


Will replace that with scripts/clang-format.sh calls if it's decided that the make targets go away.

doc/devguide/codebase/code-style.rst

sample_formatted_code.do_not_merge.c

victorjulien · 2020-07-17T08:35:20Z

Thanks for this effort Roland, it's much more work that I could have imagined. Really appreciate it!

roligugus · 2020-07-21T18:36:53Z

One can change that behaviour with Cpp11BracedListStyle. Despite it's name, this applies to C and C++ as with all settings. It'll use the IndentWidth, i.e. 4 spaces, for braced-style variable definitions if set to false.

So if you prefer that, we change it. No biggie.

Fixed in v5

roligugus · 2020-07-21T18:38:43Z

This would be a good time to make it consistent and force all known "foreach macros" to have the opening parens with the macro. ;)

Will be fixed in v5

roligugus · 2020-07-27T15:22:31Z

Continued in #5222

roligugus added 4 commits July 10, 2020 23:21

Add dev code-style

af5cbd8

Add make support for clang-format

7df3813

Run formatting check on pull request

8429123

This was referenced Jul 15, 2020

Clang format proposal v3 - github actions testing - please ignore #5147

Closed

Add clang-format file #5092

Closed

victorjulien reviewed Jul 15, 2020

View reviewed changes

.github/workflows/formatting.yml Show resolved Hide resolved

victorjulien assigned jasonish Jul 15, 2020