Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DRAFT: Simplify the API and inner implementation #85

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

elichai
Copy link
Contributor

@elichai elichai commented Apr 14, 2024

As discussed in #84 this refactors the API to replace RAII based pushd/popd into subtree like structure where you get a subshell by cloning/push_dir.

This is a draft for an initial implementation, I'm also playing with the idea of having our own trait AsOsStr with some kind of private enum like:

enum OsStr {
    Shared(Arc<OsStr>),
    Static(&'static str),
}

that way things like sh.env("foo", "bar") will not do any heap alloctation aside for pushing into the hashmap

@elichai elichai marked this pull request as draft April 14, 2024 08:42
@elichai
Copy link
Contributor Author

elichai commented Apr 15, 2024

With this things like #83 become much easier

Copy link
Owner

@matklad matklad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also playing with the idea of having our own trait AsOsStr with some kind of private enum like:

I'd be against that:

  • traits and custom enums are complexity. xshell favors simplicity very much
  • This should not be a performance bottleneck. You use xshell to read files and/or run externall processes. The relevant systcalls should dwarf any nanoseconds we spend on allocation and memcpying

In general xshell is control plane, not the data plane. It is out of program's hot loop, so it should be happy to allocate if that provides better user experience!


#[derive(Debug, Default, Clone)]
struct CmdData {
pub struct Cmd {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, we are getting rid of lifetime on Cmd? But this is excellent! This makes me think that we do want to go this way, feels so much simpler now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yess :D My main motivation is to remove the lifetime and the RefCells that complicated passing refrences due to borrow handles

src/lib.rs Outdated
Comment on lines 387 to 388
cwd: Arc<Path>,
env: HashMap<Arc<OsStr>, Arc<OsStr>>,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also do

struct Shell {
    inner: Arc<SHellInner>
}
struct SHellINnner {
    parent: Option<Arc<ShellInner>>, 
    env: ...
    cwd: ...
}

and then Cmd can hold a reference to ShellInner.

I think it doesn't matter much in the end of the day what exactly we use here. What matters is that Arcs should not bleed into the public API, which I think is true of the current implementation, so that's all good!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm that's interesting, Then in a Cmd if you modify either the cwd or the env you'll CoW it via Arc::get_mut.

As for keeping the parent, are you imagining some API that exposes it somehow? or do you want the env to be applied recursively? (that'll probably cost more than cloning the full env)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or do you want the env to be applied recursively? (that'll probably cost more than cloning the full env)

This is what I had in mind, but, yeah, thats probably overkill, and its better to just clone the hash map itself.

The good news is that ultimately it doesn't matter, because in Rust muttable and persistant data structures have the same API.

}

impl std::panic::UnwindSafe for Shell {}
impl std::panic::RefUnwindSafe for Shell {}

/// You can use `Shell` in a tree manner by cloning the shell and modifying the `cwd`/`env` as needed.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should exand on this quite a bit in the "guide" section in the module-level docs. But that can be left to a follow up, after the dust on the API settles.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course :) I didn't invest any time into docs/examples/tests because I wanted to first get some feedback on the ideas

@@ -406,35 +406,31 @@ impl Shell {
/// All relative paths are interpreted relative to this directory, rather
/// than [`std::env::current_dir`].
#[doc(alias = "pwd")]
pub fn current_dir(&self) -> PathBuf {
self.cwd.borrow().clone()
pub fn current_dir(&self) -> &Path {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this change!

None => Err(VarError::NotPresent),
}
.map_err(|err| Error::new_var(err, key.to_os_string()))
.map_err(|err| Error::new_var(err, key.into()))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to think through holistic set of names for env&path manipulation functions. Perhaps

current_dir
set_current_dir
with_current_dir

env_var
set_env_var
remove_env_var
with_env_var

env_vars

Notes:

  • either we have set_env_var with option or we have set/remove pair
  • in the latter case, I don't think we need without_env_var --- this is a rare operation, let the caller explicitly clone and remove env var
  • we need os versions as well
  • ideally, we also need clear_env. But that's hard, because windows just doesn't work if you clear entire env. So, we shouold probably have clear_env which on windows acutally leaves some of the vars intact, and clear_env_fully which makes it completely clear. No need to add this API right of the bat, but we should make sure it'll fit with our naming conventions otherwise.
  • env_vars probably wants to return an iterator, like the stdlib? Although I don't understand why std returns an iterator here, it holds a verctor internally anyway, and could have returned a vec. Need to make a decission betwen sticking with mirroring std, or trying to improve upon it.

@@ -475,7 +466,7 @@ fn string_escapes() {

#[test]
fn nonexistent_current_directory() {
let sh = setup();
let mut sh = setup();
sh.change_dir("nonexistent");
let err = cmd!(sh, "ls").run().unwrap_err();
let message = err.to_string();
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a small ask, but consider whether, to better judge if the APIs we add are ergonomic, we want to add an extended example.

I imagine something like examples/api_walkthrough which show cases all common methods of the API, using a somewhat real-world-ish examples. I imagine the example can be organized as a set of fucntion, where each function does some "real world" task, and the set of functions together covers most of the public APIs.

The benefits here wouold be:

  • for us, we'd get a better feel for the API looking at the call-site, rather than the definition site
  • for users, they'll get another excellent documentation option.

@elichai
Copy link
Contributor Author

elichai commented Apr 15, 2024

@matklad Thank you for the thorough feedback and review!

This should not be a performance bottleneck. You use xshell to read files and/or run externall processes. The relevant systcalls should dwarf any nanoseconds we spend on allocation and memcpying

Yeah I was going back and forth a lot trying to find a "good" common ground between cloning everything and having a simple API and library :)

@elichai
Copy link
Contributor Author

elichai commented May 21, 2024

Hi, I'm back from a long vacation :)
pushed a small update, if you're Ok with the direction I can soon find time to work on updating tests and examples to better show the intended use-cases

Copy link
Owner

@matklad matklad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, I definitely love the direction this is going into!

Comment on lines 475 to 481
if let Some((key, _)) =
self.env_vars_os().find(|(a, b)| a.to_str().or(b.to_str()).is_none())
{
return Err(Error::new_var(VarError::NotUnicode(key.into()), key.into()));
}
Ok(self.env_vars_os().map(|(k, v)| (k.to_str().unwrap(), v.to_str().unwrap())))
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, this looks weird. I'd expect either Resut<Vec<_>> or impl Iterator (with a panic on next).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Panic on next? Did you mean Result on next?
If so, this is not super ergonomic unless you call collect on it (hard to chain maps/filters and another variable in a loop)

Should we just return Result<HashMap<... >>? Like, converting to a Vec of pairs also feels a little weird, but I'm really not sure what's the "right" option here

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant panic on next, like the std does.

the API which I think makes sense is returning Result Vec. The api that matches std is a panicky iterator. To be honest, I am a bit confused how to chose between the two.

But returning an iterator while also doing proactive utf8 validation seems inferior to either

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants