Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dep optimization #1

Open
wants to merge 73 commits into
base: main
Choose a base branch
from
Open

Dep optimization #1

wants to merge 73 commits into from

Conversation

nojb
Copy link
Owner

@nojb nojb commented Aug 5, 2024

No description provided.

@nojb
Copy link
Owner Author

nojb commented Aug 5, 2024

@MA0010 this PR is for comments on your branch; you should receive an email notification each time there is a new comment/review.

Comment on lines 49 to 51
- ``(order_only <dep_spec>)`` read the s-expression in ``<dep_spec>`` and interpret it as
a dependency specification (i.e. any type mentioned in this list), and declares this dependency as order_only,
meaning that if a change occurs to it, the rule will not be run again.
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not quite right; (order_only ...) does not read any s-expression. Also the description is not as accurate as it could be. I suggest:

``(order_only <deps>)`` declares ``<deps>`` as "order only" dependencies. This means that these
dependencies will be guaranteed to be up-to-date before building the target, but changes in them will
not cause the target to be rebuilt.

Comment on lines 2 to 1
[@@@alert "-unstable"]

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diff is not needed. You can test this change with 5.1 or even 4.14 to avoid the bug in OCaml.

Comment on lines 6 to 12
let maybe_async f =

(match Config.(get background_actions) with
| `Enabled -> Scheduler.async_exn
| `Disabled -> fun f -> Fiber.return (f ()))
| `Enabled -> Scheduler.async_exn f
| `Disabled -> Fiber.return (f ()))
in
fun f -> (Lazy.force maybe_async) f
fun f -> maybe_async f
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this diff needed?

{ done_or_more_deps : Action_plugin.done_or_more_deps;

needed_deps : Dep.Set.t
}
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't forget to ocamlformat!

Comment on lines 510 to 522
let complete_path = function
| Dep.File x -> Dep.file (Path.relative eenv.working_dir (Path.to_string x))
| Dep.File_selector x ->
let path = Path.to_string (File_selector.dir x) in
let dir = Path.relative eenv.working_dir path in
Dep.file_selector (File_selector.edit_dir ~dir x)
(*| Dep.Alias x ->
let name = Alias.name x in
let dir = Path.to_string (Alias.dir x) in
let dir = Path.relative eenv.working_dir dir in
let alias = Alias.make name dir in
Dep.alias alias *)
| x -> x
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, this logic should be part of the decoder.

Comment on lines 141 to 145
type action_res =
{ done_or_more_deps : Action_plugin.done_or_more_deps;

needed_deps : Dep.Set.t
}
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This type should be defined in its own submodule with a union function instead of action_res_union below.

@@ -264,14 +278,15 @@ let exec_run_dynamic_client ~display ~ectx ~eenv prog args =
that is incompatible with this version of dune."
prog_name
]
| Ok Done -> Done
| Ok Done -> {done_or_more_deps = Done; needed_deps = Dep.Set.empty}
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest define return : ?needed_deps:Dep.Set.t -> done_or_more_deps -> action_res and use that instead of these record literals.

@@ -59,6 +59,7 @@ type t = private
; mode : Mode.t
; info : Info.t
; loc : Loc.t
; requires_a_deps : bool
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this wasn't necessary anymore.

Comment on lines 356 to 358
(if List.length xs > 1
then Syntax.since ~what:"Passing several arguments to 'needed_deps'" Stanza.syntax (3, 4)
else return ())
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this for?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should not be here. This was used in the cat parser to alert passing several arguments, since then the execution of cat depends on the version of dune.

rgrinberg and others added 3 commits August 6, 2024 10:25
None of this code needs to exist in the engine and lives just as happily
as a separate library.

Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
In 35b3e0d we started passing the
configured parallelism to package builds by setting their `jobs`
variable to the degree of parallelism configured for dune, but it
seems this change was only applied for builds started with
`--watch`. This commit applies the configured parallelism to package
builds when dune is run without `--watch`.

Signed-off-by: Stephen Sherratt <stephen@sherra.tt>
@MA0010
Copy link

MA0010 commented Aug 6, 2024

@nojb I have added a new commit containing the suggested tweaks + some other changes. I still need to figure out the issue with working_dir type; it seems for now that it is only pointing to the build directory _build/default, but it is used in a lot of functions that require a Path.t type instead of the more specific Path.Build.t.

Comment on lines 157 to 166
let return ?needed_deps done_or_more_deps =
match needed_deps with
| None -> { done_or_more_deps; needed_deps = Dep.Set.empty }
| Some needed_deps -> { done_or_more_deps; needed_deps }
;;
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let return ?needed_deps done_or_more_deps =
match needed_deps with
| None -> { done_or_more_deps; needed_deps = Dep.Set.empty }
| Some needed_deps -> { done_or_more_deps; needed_deps }
;;
let return ?needed_deps done_or_more_deps =
let needed_deps = Option.value ~default:Dep.Set.empty needed_deps in
{ done_or_more_deps; needed_deps }
;;

Action_res.return Done
| Needed_deps needed ->
let ds = Dep.decode eenv.working_dir in
(*the files should first be parsed*)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(*the files should first be parsed*)

Comment on lines 30 to 34
let edit_dir ~dir t =
let predicate = t.predicate in
let only_generated_files = t.only_generated_files in
of_predicate_lang ~dir ~only_generated_files predicate
;;
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not necessary, right?

@@ -350,6 +351,10 @@ let cstrs_dune_file t =
, Syntax.since Stanza.syntax (2, 7)
>>> let+ script = sw in
Cram script )
; ( "needed_deps"
, Syntax.since Stanza.syntax (2, 7)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The version seems wrong.

Comment on lines 260 to 262
(*let l = List.fold_left ~init:[] ~f:(fun acc x -> match (dep expander x) with
| Simple _ as a -> (to_action_builder a) :: acc
| Other x -> x :: acc) s*)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(*let l = List.fold_left ~init:[] ~f:(fun acc x -> match (dep expander x) with
| Simple _ as a -> (to_action_builder a) :: acc
| Other x -> x :: acc) s*)

Comment on lines 646 to 647
match exec_result.action_exec_result.needed_deps with
| deps, facts -> Some (deps, Dep.Facts.digest facts ~env:action.env)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use let deps, fact = ....

@@ -9,14 +9,19 @@ module Workspace_local = struct
{ rule_digest : Digest.t
; dynamic_deps_stages : (Dep.Set.t * Digest.t) list
; targets_digest : Digest.t
; needed_deps : (Dep.Set.t * Digest.t) option
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the option is not needed, right?

rgrinberg and others added 13 commits August 6, 2024 14:44
Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
This solves a lot of the waiting leftover HTTP processes etc. The new
fake curl will read the list of files to serve from `fake-curls` and map
each line to a port.

To match the behavior of the oneshot-webserver it will only serve each
port once, subsequent requests for the same port will be answered by 404
(so not exactly the same but similar enough), the ports that were served
are recorded into an `already-served` file.

The same file can be served multiple times, as long as it is present
multiple times in `fake-curls`.

Signed-off-by: Marek Kubica <marek@tarides.com>
Signed-off-by: Stephen Sherratt <stephen@sherra.tt>
This changes the variable that enables toolchains from
DUNE_CONFIG__TOOLCHAINS_ENABLED=[true|false] to
DUNE_CONFIG__TOOLCHAINS=[enabled|disabled], following the established
convention for such variables in dune, and for compatibility with
upcoming support for compile-time configuration.

Signed-off-by: Stephen Sherratt <stephen@sherra.tt>
Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
do not pass entire package when we only need the depopts

Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
And separate some Memo code into separate functions

Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
First pass at progress indication for lockdir generation, where the
status line displays a message while the package repos are being
updated, and a second message during solving. This might be updated in
the future to display finer-grained status updates.

Signed-off-by: Stephen Sherratt <stephen@sherra.tt>
@MA0010
Copy link

MA0010 commented Aug 12, 2024

Hello @nojb, I have added 2 new commits. The first contains some code tweaks and corrections, and the second contains tests written for the new features.

There are two tests for now, but I think that the needed-deps.t is enough (there is no need for the separate order-only.t, because the feature is tested inside needed-deps.t).

Please when you are free, try to look at the test files and give your opinion on the structure and any suggestions that you come up with. I will now gaurd the new features under the new dune version and do a primary commit.

Thanks!
Have a nice holiday!

Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
unreachable packages are those that aren't reachable via the
dependencies of local packages

typically, such packages are included when solving via post deps

Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
Leonidas-from-XIV and others added 2 commits August 13, 2024 11:57
Signed-off-by: Marek Kubica <marek@tarides.com>
Co-authored-by: Etienne Millon <me@emillon.org>
@@ -46,6 +46,9 @@ Dependencies in ``dune`` files can be specified using one of the following:
- ``(include <file>)`` read the s-expression in ``<file>`` and interpret it as
additional dependencies. The s-expression is expected to be a list of the
same constructs enumerated here.
- ``(order_only <deps>)`` declares ``<deps>`` as "order only" dependencies. This means that these
dependencies will be guaranteed to be up-to-date before building the target, but changes in them will
not cause the target to be rebuilt.
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please resepect existing style (indentation, line width, etc).


Example::

(needed_deps deps)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a full example; a full example and an explanation of the expected behaviour should be added.

``file <filename>
alias <alias_name>
file_selector <glob>
universe``
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we need a fuller description of each constructor. See for example the section on "Dependency Specification" in the Dune manual. https://dune.readthedocs.io/en/stable/concepts/dependency-spec.html

@@ -300,6 +303,7 @@ let is_useful_to memoize =
| System _ -> true
| Bash _ -> true
| Extension (module A) -> A.Spec.is_useful_to ~memoize
| Needed_deps _ -> memoize
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does memoize mean here?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It means that it makes sense to store the rule in the cache, but it does not make sense to look up the target of the action in the distributed (or shared) cache. However, giving a second look at our final definition of needed_deps, I think that both of these operations are not necessary, so this should be set to false.

Comment on lines 40 to 41
(* let dir = Path.Build.of_string (Path.to_string w_dir) in*)
(*let dir = Path.Build.of_string "_build/.actions/default" still under test*)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove stale comments.

@@ -3,6 +3,7 @@ open Import
module Name : sig
include module type of Dune_util.Alias_name with type t = Dune_util.Alias_name.t

val of_string : string -> t
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change needed?

@@ -28,6 +29,30 @@ module T = struct
let universe = Universe
let file_selector g = File_selector g

let decode w_dir =
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let decode w_dir =
let decode dir =

@@ -103,8 +105,13 @@ let decode files =
, let+ () = Syntax.since Stanza.syntax (3, 1)
and+ filename = filename in
Include filename )
; ( "order_only"
, let+ () = Syntax.since Stanza.syntax (3, 18)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last released version of Dune is 3.16; isn't 3.17 enough?

Suggested change
, let+ () = Syntax.since Stanza.syntax (3, 18)
, let+ () = Syntax.since Stanza.syntax (3, 17)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that is the case, then yes. I thought that 3.17 was released because the version was bumped in the dune codebase.

@MA0010
Copy link

MA0010 commented Aug 14, 2024

Hello @nojb, just to explain my last commits. I have removed the version bumping of dune. Also, I applied the suggested tweaks to the code, and finally I expanded the docs.
Please when you have time, check the new written docs and give me any suggestions you might have.

Thanks!

Dune_sexp belongs to dune's frontend and the engine has no business
relying on it.

This is a change we've done internally, so this is just syncing.

Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
emillon and others added 16 commits September 4, 2024 10:54
Signed-off-by: Etienne Millon <me@emillon.org>
* Better patch format parsing

Signed-off-by: Ambre Austen Suhamy <ambre@tarides.com>

* Clarify that we don't care about the timestamp itself

Signed-off-by: Ambre Austen Suhamy <ambre@tarides.com>

* Undo style changes

Signed-off-by: Ambre Austen Suhamy <ambre@tarides.com>

---------

Signed-off-by: Ambre Austen Suhamy <ambre@tarides.com>
Signed-off-by: Marek Kubica <marek@tarides.com>
* Use backslashes when passing staged pps driver on Windows

Signed-off-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
* perf: better names for trace events

Signed-off-by: Javier Chávarri <javier.chavarri@gmail.com>
* feat: create an alias from pkg to package

Signed-off-by: Etienne Marais <dev@maiste.fr>
Signed-off-by: Rudi Grinberg <me@rgrinberg.com>
support for ocamlformat is hardcoded anyway, so just setup the rules
with the extra expansion step for actions defined using the dune lang.

Co-authored-by: Stephen Sherratt <stephen@sherra.tt>
Signed-off-by: Rudi Grinberg <me@rgrinberg.com>

<!-- ps-id: ba96eea9-9b6c-4eb4-be11-cea45f0c9f51 -->
Make it possible to attempt to read a lockdir from a directory without
raising a [User_error] if the lockdir is not present.

Signed-off-by: Stephen Sherratt <stephen@sherra.tt>
…void useless recompilations when Dune's stderr is redirected (ocaml#10883)

Signed-off-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
Signed-off-by: Javier Chávarri <javier.chavarri@gmail.com>
)

Signed-off-by: Marek Kubica <marek@tarides.com>
* fix: enabled_if with `dune describe`

Fixes ocaml#10779.

Signed-off-by: Alpha DIALLO <moyodiallo@gmail.com>
…aml#10899)

Signed-off-by: Marek Kubica <marek@tarides.com>
This makes it possible to check whether the user is currently inside a
dune project. This is needed for dev-tools to have fallback behaviour
when run outside of a dune project which is necessary to allow users
to configure their editors to invoke dev-tools via dune. For example,
if a user configures their editor to start ocamllsp with the command
`dune ocamllsp`, dune needs to be able to handle the case where the
user edits an ocaml file outside of a dune project, for example by
attempting to invoke ocamllsp from PATH.

Signed-off-by: Stephen Sherratt <stephen@sherra.tt>
@MA0010
Copy link

MA0010 commented Sep 10, 2024

Hello @nojb, I have added 3 new commits that (hopefully) contain a functional code for the shared cache and build_deps edit, ready to be tested.

However, for now, I removed passing the Loc.t to the function that parses the csexp of the file stored in the shared cache. Also, it remains to pass the correct location of the rule when raising the error while building needed_deps. I will work on that, but I wanted to commit the code so that it can be tested, since there is some behaviour that needs to be verified.

@nojb
Copy link
Owner Author

nojb commented Sep 10, 2024

Thanks, I'll take a look.

"Cannot parse file digest %s in cache metadata entry"
deps_digest)))
| _ -> Error (Failure "Cannot parse cache metadata entry")
;;*)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove debug code.

@@ -108,6 +108,7 @@ let rec encode : Action.For_shell.t -> Dune_lang.t =
| Pipe (outputs, l) ->
List (atom (sprintf "pipe-%s" (Outputs.to_string outputs)) :: List.map l ~f:encode)
| Extension ext -> List [ atom "ext"; ext ]
| Needed_deps xs -> List (atom "needed_deps" :: List.map xs ~f:path)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"needed deps" does not read very well; I suggest to change it to "actual_deps" everywhere.

@@ -46,6 +46,9 @@ Dependencies in ``dune`` files can be specified using one of the following:
- ``(include <file>)`` read the s-expression in ``<file>`` and interpret it as
additional dependencies. The s-expression is expected to be a list of the
same constructs enumerated here.
- ``(order_only <deps>)`` declares ``<deps>`` as "order only" dependencies.
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similarly, maybe we could replace "order_only" by "maybe_deps" ?

@@ -138,6 +139,27 @@ module Exec_result = struct
;;
end

module Action_res = struct
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does Action_res stand for? Maybe Action_deps would be better?

Comment on lines 577 to 578
Produce.return
(Action_res.return need ~needed_deps)
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is Produce.return action_res, right?

let* action_res = exec t ~display ~ectx ~eenv in
(match action_res with
| { done_or_more_deps = d; needed_deps } ->
Produce.return (Action_res.return d ~needed_deps))
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is Produce.return action_res?

}
;;

let return ?needed_deps done_or_more_deps =
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you never pass needed_deps or Need_more_deps _ at the same time, I suggest to have instead:

val no_deps: t
val need_more_deps: ... -> t
val actual_deps: Dep.Set.t -> t

| false -> Fiber.return (Miss Miss_reason.Dynamic_deps_changed))
in
loop prev_trace.dynamic_deps_stages
loop prev_trace.dynamic_deps_stages prev_trace.needed_deps
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is overly complicated; the check should have the form check1 prev_trace.dynamic_deps_stages || check2 prev_trace.neede_deps, there is no need to mix the check for both types of dependencies.

@nojb
Copy link
Owner Author

nojb commented Sep 11, 2024

Let's hold off on renaming "order_only" to "maybe_deps" and "needed_deps" to "actual_deps", we may have other ideas later.

Signed-off-by: HasanA <mhmd_alameen1023@outlook.com>
@MA0010 MA0010 force-pushed the dep_optimization branch 4 times, most recently from 8f6a480 to eae7315 Compare September 16, 2024 13:59
Signed-off-by: HasanA <mhmd_alameen1023@outlook.com>
Signed-off-by: HasanA <mhmd_alameen1023@outlook.com>
@MA0010
Copy link

MA0010 commented Sep 17, 2024

Hello @nojb, the code should be ready for review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.