Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Warning for possibly misspelled built-in attributes #372

Closed
wants to merge 8 commits into from

Conversation

Octachron
Copy link
Member

This pull request adds a simplistic safety net against misspelled built-in attributes.

Since some builtin attributes can be silently deactivated by a spelling error, this patch tries to catch basic spelling mistakes and expose them before they can cause any harms.

The chosen model is a set of opt-in functions for detecting possibly misspelled attribute in parsing/attr_helper using the edit_distance already implemented in util/misc. To decrease the number of false positive the maximal edit distance for identifying misspelling is quite small.

In a semi-related way, this patch also corrects a spelling mistake on the tailcall attribute in the changelog.


During attribute identification, a warning is emitted if the nearest
name in the union of names and context is in the list of possible
names and is at a distance inferior to [attribute_max_distance] *)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what context is from this explanation. From the way you use it later I would guess that the words in the context are closed words that we know are not mispellings of the attributes (the implementation seems to somewhat confirm), but nothing in the "context" name or in the explanation here allows me to be sure it works this way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. The original meaning was "contextual information useful for reducing the false positive rate". And it currently hold names that might legitimately appear in the same context as the identifier and should not be identified as spelling errors. I will update the explanation and name to better reflect this point.

@alainfrisch
Copy link
Contributor

This is an interesting contribution. One would need some experimenting in the large to see if we get too many false positive. One potential problem is that the new warning implicitly reserves many attribute names; how many of them would be useful for other projects?

One benefit of allowing the form ocaml.XXX is that it makes it easier to spot errors. It could be useful to report any use of an ocaml.XXX warning which is not among those actually known by the compiler. (Similarly, each ppx could report any use of an unknown attribute in "its" namespace.)

@gasche
Copy link
Member

gasche commented Dec 23, 2015

One potential problem is that the new warning implicitly reserves many attribute names; how many of them would be useful for other projects?

I was thinking along these lines, but in fact there should be no problem if we assume that -ppx rewriters "upstream" (en amont) remove the occurrences attributes that they are in charge of, instead of just leaving them for the compiler. We should discuss it with @diml (or other people knowledgeable about ppx_driver for example), but if this is the common model (which is only the only one that allows for robust warnings/errors on misspelled attributes) we can assume that the only attributes left for the compiler are meant for it.

@alainfrisch
Copy link
Contributor

there should be no problem if we assume that -ppx rewriters "upstream" (en amont) remove the occurrences attributes that they are in charge of

It's not clear to me that this is necessarily the right model, and this it would completely solve the problem anyway:

  • The same attribute might be understood by multiple independent ppx (for instance, a notion of "default value" for a field).
  • The compiler and ppx are not the only "clients" of attributes. Tools can react on attributes in .cmi/.cmt files, for instance; or, on the other side, standalone tools that analyze source code.

It would be an interesting project to design a system allowing each package (ppx or other tools) to specify in a declarative way which attributes it understands (and in which context, and perhaps with which payload -- although an error in the payload would typically be detected by the package itself). A dedicated tool (or the compiler itself?) could parse a list of such specifications and apply the corresponding checks. Findlib packages would indicate the spec file so that they can be passed to the checker (or the compiler) easily.

One might not be able to express 100% of contextual constraints (e.g. complex nesting of attributes and/or extension nodes), but this could detect many simple typos, and also provide a good foundation for IDEs/Merling (such as autocompletion on attribute names depending on the context, or "jump to documentation" for a given attribute, assuming the specs contain such information).

@ghost
Copy link

ghost commented Dec 24, 2015

With ppx_driver removing attributes that are interpreted by a rewriter is the common model. The only case where the rewriter itself does not remove attributes is type-driven code generators. In this case ppx_type_conv takes care of removing attributes that have been used after all generators have been executed.
So far it's the only case of attribute sharing between rewriters we had. This model could work with ppx_deriving too.

Note that removing attributes has the nice property of making rewriters idempotent.

To deal with attributes that are not used by any rewriters, for instance the ones inserted by merlin, ppx_driver has an option to white-list a namespace. In this case it only checks that these attributes are not dropped by mistake.

Having a system to declare attributes would certainly be nice. Technically we already have that in ppx_core. For us, I don't think it would completely replace the checks done by ppx_core/ppx_driver as they are more strict and have proven to be quite useful so far.

@alainfrisch
Copy link
Contributor

Dropping attributes makes them invisible to tools parsing .cmt/.cmi/.cmti files. For instance, a documentation generator based on .cmti files might want to react on such attributes, even if they have been already intepreted by a ppx (e.g. to show in a compact way the info about automatically generated functions, or to keep e.g. "default" attributes in type declarations).

@ghost
Copy link

ghost commented Dec 24, 2015

Currently removing them is the only way for a rewriter to mark them as processed. I guess we wouldn't need to remove them if there was another way, for instance if attributes carried a flag for that.

I'm not sure I understand what you are talking about for documentation, certainly the doc tool couldn't know about all rewriters. It would be more viable to have the ppx rewriters themselves generate specific attributes that the doc tool would interpret, and these could be namespaced.

@alainfrisch
Copy link
Contributor

certainly the doc tool couldn't know about all rewriters

Of course, but it could know about some of them (those from most popular packages), it could be extensible itself to support more attributes, or it could just display generically raw attributes as part of the documentation.

@Octachron
Copy link
Member Author

I agree that false positive and the implit reservation of names could be quite problematic in general. But for the current set of built-in attributes, I do not think it is too problematic: at an edit distance of 1, the valid english words that could be misidentified (at least in the aspell corpus) are

-warning: waning, warnings, earning, darning, warming, warding, warping, warring
-inline: inlined, online, incline
-inlined: inline, unlined, inclined
-unboxed: unbowed
-untagged: (none)

Even for an edit distance of 2, except for "inline" and "warning", the number
of implictely reserved useful names is limited:

-unboxed: unbowed, inboxes, unfixed, unmixed, unbaked, boxed, unbolted, unbound, unyoked
-untagged: untamed, unwaged, untanned, untapped, nagged, tagged, untangled,
snagged, untasted
-inlined: unlined, inland, online, mainlined, lined, inclines, relined, incised,
incited, indited, invited

With few exceptions, these words do not seem to be a problable choice for attribute name.
Moreover, the few exceptions could be added to the list of known false positives.
For instance, the list [unfixed; unmixed; online;line; mainlined; warding; tagged; boxed;unbound] seems quite reasonnable. It could also make sense to adapt the maximal edit distance identifier by identifier to decrease it to 1 for "inline" and "warning".

This commits add a warning and some new functions for detecting
potential misspelling of built-in attribute in `parsing/attr_helper`.
These new function are then used to detect possible misspelling of
all existing built-in attributes.
@Octachron Octachron force-pushed the misspelled_attribute_warning branch from a7e1aea to f1a4ce1 Compare December 27, 2015 17:06
@damiendoligez damiendoligez added this to the 4.04-or-later milestone Jan 22, 2016
@mshinwell mshinwell modified the milestones: 4.05-or-later, 4.04 Sep 7, 2016
@bobzhang
Copy link
Member

bobzhang commented Sep 19, 2016

I came across the same issue, how about adding a mutable field processed in the payload, so it will be marked as processed instead of being dropped.

Btw, another item I wish to be mutable is location

@damiendoligez damiendoligez modified the milestone: 4.05-or-later Feb 15, 2017
@Octachron
Copy link
Member Author

This PR has gone stale and I am not fond of the current implementation; I am therefore closing this PR. Since this seems to be the right kind of target for a compiler plugin, I may reimplement the proposed warning as such at a later date.

@Octachron Octachron closed this Jul 16, 2017
anmolsahoo25 pushed a commit to anmolsahoo25/ocaml that referenced this pull request Aug 25, 2020
mshinwell added a commit to mshinwell/ocaml that referenced this pull request Mar 26, 2021
stedolan pushed a commit to stedolan/ocaml that referenced this pull request Dec 13, 2021
chambart added a commit to chambart/ocaml-1 that referenced this pull request Jan 4, 2022
Co-authored-by: Mark Shinwell <mshinwell@gmail.com>
chambart pushed a commit to chambart/ocaml-1 that referenced this pull request Feb 1, 2022
23a7f73 flambda-backend: Fix some Debuginfo.t scopes in the frontend (ocaml#248)
33a04a6 flambda-backend: Attempt to shrink the heap before calling the assembler (ocaml#429)
8a36a16 flambda-backend: Fix to allow stage 2 builds in Flambda 2 -Oclassic mode (ocaml#442)
d828db6 flambda-backend: Rename -no-extensions flag to -disable-all-extensions (ocaml#425)
68c39d5 flambda-backend: Fix mistake with extension records (ocaml#423)
423f312 flambda-backend: Refactor -extension and -standard flags (ocaml#398)
585e023 flambda-backend: Improved simplification of array operations (ocaml#384)
faec6b1 flambda-backend: Typos (ocaml#407)
8914940 flambda-backend: Ensure allocations are initialised, even dead ones (ocaml#405)
6b58001 flambda-backend: Move compiler flag -dcfg out of ocaml/ subdirectory (ocaml#400)
4fd57cf flambda-backend: Use ghost loc for extension to avoid expressions with overlapping locations (ocaml#399)
8d993c5 flambda-backend: Let's fix instead of reverting flambda_backend_args (ocaml#396)
d29b133 flambda-backend: Revert "Move flambda-backend specific flags out of ocaml/ subdirectory (ocaml#382)" (ocaml#395)
d0cda93 flambda-backend: Revert ocaml#373 (ocaml#393)
1c6eee1 flambda-backend: Fix "make check_all_arches" in ocaml/ subdirectory (ocaml#388)
a7960dd flambda-backend: Move flambda-backend specific flags out of ocaml/ subdirectory (ocaml#382)
bf7b1a8 flambda-backend: List and Array Comprehensions (ocaml#147)
f2547de flambda-backend: Compile more stdlib files with -O3 (ocaml#380)
3620c58 flambda-backend: Four small inliner fixes (ocaml#379)
2d165d2 flambda-backend: Regenerate ocaml/configure
3838b56 flambda-backend: Bump Menhir to version 20210419 (ocaml#362)
43c14d6 flambda-backend: Re-enable -flambda2-join-points (ocaml#374)
5cd2520 flambda-backend: Disable inlining of recursive functions by default (ocaml#372)
e98b277 flambda-backend: Import ocaml#10736 (stack limit increases) (ocaml#373)
82c8086 flambda-backend: Use hooks for type tree and parse tree (ocaml#363)
33bbc93 flambda-backend: Fix parsecmm.mly in ocaml subdirectory (ocaml#357)
9650034 flambda-backend: Right-to-left evaluation of arguments of String.get and friends (ocaml#354)
f7d3775 flambda-backend: Revert "Magic numbers" (ocaml#360)
0bd2fa6 flambda-backend: Add [@inline ready] attribute and remove [@inline hint] (not [@inlined hint]) (ocaml#351)
cee74af flambda-backend: Ensure that functions are evaluated after their arguments (ocaml#353)
954be59 flambda-backend: Bootstrap
dd5c299 flambda-backend: Change prefix of all magic numbers to avoid clashes with upstream.
c2b1355 flambda-backend: Fix wrong shift generation in Cmm_helpers (ocaml#347)
739243b flambda-backend: Add flambda_oclassic attribute (ocaml#348)
dc9b7fd flambda-backend: Only speculate during inlining if argument types have useful information (ocaml#343)
aa190ec flambda-backend: Backport fix from PR#10719 (ocaml#342)
c53a574 flambda-backend: Reduce max inlining depths at -O2 and -O3 (ocaml#334)
a2493dc flambda-backend: Tweak error messages in Compenv.
1c7b580 flambda-backend: Change Name_abstraction to use a parameterized type (ocaml#326)
07e0918 flambda-backend: Save cfg to file (ocaml#257)
9427a8d flambda-backend: Make inlining parameters more aggressive (ocaml#332)
fe0610f flambda-backend: Do not cache young_limit in a processor register (upstream PR 9876) (ocaml#315)
56f28b8 flambda-backend: Fix an overflow bug in major GC work computation (ocaml#310)
8e43a49 flambda-backend: Cmm invariants (port upstream PR 1400) (ocaml#258)
e901f16 flambda-backend: Add attributes effects and coeffects (#18)
aaa1cdb flambda-backend: Expose Flambda 2 flags via OCAMLPARAM (ocaml#304)
62db54f flambda-backend: Fix freshening substitutions
57231d2 flambda-backend: Evaluate signature substitutions lazily (upstream PR 10599) (ocaml#280)
a1a07de flambda-backend: Keep Sys.opaque_identity in Cmm and Mach (port upstream PR 9412) (ocaml#238)
faaf149 flambda-backend: Rename Un_cps -> To_cmm (ocaml#261)
ecb0201 flambda-backend: Add "-dcfg" flag to ocamlopt (ocaml#254)
32ec58a flambda-backend: Bypass Simplify (ocaml#162)
bd4ce4a flambda-backend: Revert "Semaphore without probes: dummy notes (ocaml#142)" (ocaml#242)
c98530f flambda-backend: Semaphore without probes: dummy notes (ocaml#142)
c9b6a04 flambda-backend: Remove hack for .depend from runtime/dune  (ocaml#170)
6e5d4cf flambda-backend: Build and install Semaphore (ocaml#183)
924eb60 flambda-backend: Special constructor for %sys_argv primitive (ocaml#166)
2ac6334 flambda-backend: Build ocamldoc (ocaml#157)
c6f7267 flambda-backend: Add -mbranches-within-32B to major_gc.c compilation (where supported)
a99fdee flambda-backend: Merge pull request ocaml#10195 from stedolan/mark-prefetching
bd72dcb flambda-backend: Prefetching optimisations for sweeping (ocaml#9934)
27fed7e flambda-backend: Add missing index param for Obj.field (ocaml#145)
cd48b2f flambda-backend: Fix camlinternalOO at -O3 with Flambda 2 (ocaml#132)
9d85430 flambda-backend: Fix testsuite execution (ocaml#125)
ac964ca flambda-backend: Comment out `[@inlined]` annotation. (ocaml#136)
ad4afce flambda-backend: Fix magic numbers (test suite) (ocaml#135)
9b033c7 flambda-backend: Disable the comparison of bytecode programs (`ocamltest`) (ocaml#128)
e650abd flambda-backend: Import flambda2 changes (`Asmpackager`) (ocaml#127)
14dcc38 flambda-backend: Fix error with Record_unboxed (bug in block kind patch) (ocaml#119)
2d35761 flambda-backend: Resurrect [@inline never] annotations in camlinternalMod (ocaml#121)
f5985ad flambda-backend: Magic numbers for cmx and cmxa files (ocaml#118)
0e8b9f0 flambda-backend: Extend conditions to include flambda2 (ocaml#115)
99870c8 flambda-backend: Fix Translobj assertions for Flambda 2 (ocaml#112)
5106317 flambda-backend: Minor fix for "lazy" compilation in Matching with Flambda 2 (ocaml#110)
dba922b flambda-backend: Oclassic/O2/O3 etc (ocaml#104)
f88af3e flambda-backend: Wire in the remaining Flambda 2 flags (ocaml#103)
678d647 flambda-backend: Wire in the Flambda 2 inlining flags (ocaml#100)
1a8febb flambda-backend: Formatting of help text for some Flambda 2 options (ocaml#101)
9ae1c7a flambda-backend: First set of command-line flags for Flambda 2 (ocaml#98)
bc0bc5e flambda-backend: Add config variables flambda_backend, flambda2 and probes (ocaml#99)
efb8304 flambda-backend: Build our own ocamlobjinfo from tools/objinfo/ at the root (ocaml#95)
d2cfaca flambda-backend: Add mutability annotations to Pfield etc. (ocaml#88)
5532555 flambda-backend: Lambda block kinds (ocaml#86)
0c597ba flambda-backend: Revert VERSION, etc. back to 4.12.0 (mostly reverts 822d0a0 from upstream 4.12) (ocaml#93)
037c3d0 flambda-backend: Float blocks
7a9d190 flambda-backend: Allow --enable-middle-end=flambda2 etc (ocaml#89)
9057474 flambda-backend: Root scanning fixes for Flambda 2 (ocaml#87)
08e02a3 flambda-backend: Ensure that Lifthenelse has a boolean-valued condition (ocaml#63)
77214b7 flambda-backend: Obj changes for Flambda 2 (ocaml#71)
ecfdd72 flambda-backend: Cherry-pick 9432cfdadb043a191b414a2caece3e4f9bbc68b7 (ocaml#84)
d1a4396 flambda-backend: Add a `returns` field to `Cmm.Cextcall` (ocaml#74)
575dff5 flambda-backend: CMM traps (ocaml#72)
8a87272 flambda-backend: Remove Obj.set_tag and Obj.truncate (ocaml#73)
d9017ae flambda-backend: Merge pull request ocaml#80 from mshinwell/fb-backport-pr10205
3a4824e flambda-backend: Backport PR#10205 from upstream: Avoid overwriting closures while initialising recursive modules
f31890e flambda-backend: Install missing headers of ocaml/runtime/caml (ocaml#77)
83516f8 flambda-backend: Apply node created for probe should not be annotated as tailcall (ocaml#76)
bc430cb flambda-backend: Add Clflags.is_flambda2 (ocaml#62)
ed87247 flambda-backend: Preallocation of blocks in Translmod for value let rec w/ flambda2 (ocaml#59)
a4b04d5 flambda-backend: inline never on Gc.create_alarm (ocaml#56)
cef0bb6 flambda-backend: Config.flambda2 (ocaml#58)
ff0e4f7 flambda-backend: Pun labelled arguments with type constraint in function applications (ocaml#53)
d72c5fb flambda-backend: Remove Cmm.memory_chunk.Double_u (ocaml#42)
9d34d99 flambda-backend: Install missing artifacts
10146f2 flambda-backend: Add ocamlcfg (ocaml#34)
819d38a flambda-backend: Use OC_CFLAGS, OC_CPPFLAGS, and SHAREDLIB_CFLAGS for foreign libs (#30)
f98b564 flambda-backend: Pass -function-sections iff supported. (#29)
e0eef5e flambda-backend: Bootstrap (#11 part 2)
17374b4 flambda-backend: Add [@@Builtin] attribute to Primitives (#11 part 1)
85127ad flambda-backend: Add builtin, effects and coeffects fields to Cextcall (#12)
b670bcf flambda-backend: Replace tuple with record in Cextcall (#10)
db451b5 flambda-backend: Speedups in Asmlink (#8)
2fe489d flambda-backend: Cherry-pick upstream PR#10184 from upstream, dynlink invariant removal (rev 3dc3cd7 upstream)
d364bfa flambda-backend: Local patch against upstream: enable function sections in the Dune build
886b800 flambda-backend: Local patch against upstream: remove Raw_spacetime_lib (does not build with -m32)
1a7db7c flambda-backend: Local patch against upstream: make dune ignore ocamldoc/ directory
e411dd3 flambda-backend: Local patch against upstream: remove ocaml/testsuite/tests/tool-caml-tex/
1016d03 flambda-backend: Local patch against upstream: remove ocaml/dune-project and ocaml/ocaml-variants.opam
93785e3 flambda-backend: To upstream: export-dynamic for otherlibs/dynlink/ via the natdynlinkops files (still needs .gitignore + way of generating these files)
63db8c1 flambda-backend: To upstream: stop using -O3 in otherlibs/Makefile.otherlibs.common
eb2f1ed flambda-backend: To upstream: stop using -O3 for dynlink/
6682f8d flambda-backend: To upstream: use flambda_o3 attribute instead of -O3 in the Makefile for systhreads/
de197df flambda-backend: To upstream: renamed ocamltest_unix.xxx files for dune
bf3773d flambda-backend: To upstream: dune build fixes (depends on previous to-upstream patches)
6fbc80e flambda-backend: To upstream: refactor otherlibs/dynlink/, removing byte/ and native/
71a03ef flambda-backend: To upstream: fix to Ocaml_modifiers in ocamltest
686d6e3 flambda-backend: To upstream: fix dependency problem with Instruct
c311155 flambda-backend: To upstream: remove threadUnix
52e6e78 flambda-backend: To upstream: stabilise filenames used in backtraces: stdlib/, otherlibs/systhreads/, toplevel/toploop.ml
7d08e0e flambda-backend: To upstream: use flambda_o3 attribute in stdlib
403b82e flambda-backend: To upstream: flambda_o3 attribute support (includes bootstrap)
65032b1 flambda-backend: To upstream: use nolabels attribute instead of -nolabels for otherlibs/unix/
f533fad flambda-backend: To upstream: remove Compflags, add attributes, etc.
49fc1b5 flambda-backend: To upstream: Add attributes and bootstrap compiler
a4b9e0d flambda-backend: Already upstreamed: stdlib capitalisation patch
4c1c259 flambda-backend: ocaml#9748 from xclerc/share-ev_defname (cherry-pick 3e937fc)
00027c4 flambda-backend: permanent/default-to-best-fit (cherry-pick 64240fd)
2561dd9 flambda-backend: permanent/reraise-by-default (cherry-pick 50e9490)
c0aa4f4 flambda-backend: permanent/gc-tuning (cherry-pick e9d6d2f)

git-subtree-dir: ocaml
git-subtree-split: 23a7f73
EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024
* line editing for industrial.eml

* more capitalisation / title changes

Signed-off-by: Christine Rose <professor.rose@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants