Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify fc.webPath to be able to generate paths starting with // #4896

Closed
astlouisf opened this issue Apr 12, 2024 · 19 comments
Closed

Modify fc.webPath to be able to generate paths starting with // #4896

astlouisf opened this issue Apr 12, 2024 · 19 comments

Comments

@astlouisf
Copy link

astlouisf commented Apr 12, 2024

🚀 Feature Request

fc.webPath should be able to generate paths starting with '//'.

The following grammar derivation (starting from the path definition in rfc3986) suggests that they should be valid paths

path = path-abempty / path-absolute / path-noscheme / path-rootless / path-empty
     = path-abempty
     = *( "/" segment )
     = ( "/" segment ) *( "/" segment )
     = ( "/" segment ) ( "/" segment ) *( "/" segment )
     = "/" segment "/" segment *( "/" segment )
     = "/" *pchar "/" segment *( "/" segment )
     = "/" "/" segment *( "/" segment )

I haven't dug in the code yet, but the current implementation doesn't seem to ever produce paths matching this pattern:

it('never generate //', () => {
  fc.assert(fc.property(fc.webPath(), (pathname) => {
    expect(pathname).not.toMatch(/^\/\/.*/);
  }), {numRuns: 100000});
});

Motivation

It would catch more bugs :) and seems like a current blind spot of the webPath implementation.

Example

The following test should fail (sometimes).

it('preserve origin', () => {
  fc.assert(fc.property(fc.webPath(), fc.webUrl(), (pathname, urlString) => {
    const url = new URL(urlString);
    expect(new URL(pathname, urlString).origin).toBe(url.origin);
  }));
});
@astlouisf astlouisf changed the title Modify webPath to be able to generate paths starting with // Modify fc.webPath to be able to generate paths starting with // Apr 12, 2024
@gruhn
Copy link
Contributor

gruhn commented Apr 13, 2024

I looked into this and I was surprised initially. Looks like webPath is just constructed from a joined fc.array(fc.webSegment()) and since webSegment can generate empty strings, webPath should totally be able to generate "//". However, looks like the default size parameter for the array is xsmall so only very few path segments are generated. I computed some statistics and looks like webPath pretty much never generates more than 2 segments (which would be necessary to generate "//"):

fc.statistics(fc.webPath(), path => String(path.split('/').length >= 3), 10_000)
// ==> false..100.00%

However, if I manually increase the size parameter (fc.webPath({ size: '+1' })) I can generate "//" sometimes and make both your tests fail.

@astlouisf
Copy link
Author

astlouisf commented Apr 15, 2024

I've experimented a bit more and it doesn't seem to generate paths directly starting with a segment either

fc.statistics(fc.webPath({size:'+1'}), (path) => { return path.match(/^[^/]/) ? 'initial segment': 'absolute or empty'; });
// ==> absolute or empty..100.00%

this is expected considering

but it raises the question about which part of RFC 3986 fc.webPath corresponds to since path-rootless would allow an initial segment.

@gruhn
Copy link
Contributor

gruhn commented Apr 17, 2024

@dubzzz could we have

  return safeJoin(
    segments,
    '/',
  );

instead of

  return safeJoin(
    safeMap(segments, (v) => `/${v}`),
    '',
  );

That would still allow web paths with leading "/" when the first segment is an empty string, right?

@dubzzz
Copy link
Owner

dubzzz commented Apr 17, 2024

I believe segments don't come with the /, so we might only have one leading slash with that approach and no others.

That said, that might be a good idea.

@gruhn
Copy link
Contributor

gruhn commented Apr 17, 2024

I mean if we the array of segments looks like this: ["", "", "foo", "bar"] and we join it with / we get: //foo/bar, no?

@dubzzz
Copy link
Owner

dubzzz commented Apr 17, 2024

Oh yes. In theory (except bug), it should

@dubzzz
Copy link
Owner

dubzzz commented Apr 17, 2024

I quickly looked into the code and definitely I feel that // should be produced at some point. I'll try to get what goes wrong. From my quick scan of the code I'd expect it to work fine but apparently it doesn't.

@gruhn
Copy link
Contributor

gruhn commented Apr 17, 2024

Also just checked when I change segmentsToPathMapper to

safeJoin(segments, '/');

the distribution changes quite a bit:

function countLeadingSlashes(path: string): number {
  if (path.charAt(0) === '/') {
    return 1 + countLeadingSlashes(path.slice(1))
  } else {
    return 0
  }
}

fc.statistics(fc.webPath({ size: '+1' }), path => countLeadingSlashes(path) + " leading slashes", 100_000)   
// 0 leading slashes..92.74%
// 1 leading slashes...6.72%
// 2 leading slashes...0.49%
// 3 leading slashes...0.04%
// 4 leading slashes...0.01%

So while previously there was always at least one leading slash, now there is no leading slash most of the time. Not sure if that's desired.

Side note: how do you treat distribution changes like this? Is that a minor/major/patch/etc version bump?

Also not sure if this is completely faithful to the spec now.

@astlouisf
Copy link
Author

astlouisf commented Apr 17, 2024

If I understand correctly, fc.webPath used to be an internal arbitrary that was used to implement fc.webUrl. This lead me to believe that paths generated by fc.webPath correspond only to path-abempty from the spec since fc.webUrl seems to always generate an authority.

This would mean that a leading slash is expected.

@dubzzz
Copy link
Owner

dubzzz commented Apr 17, 2024

Well actually from my understanding it's theoretically feasible to get such "//" value out of the current version of the implementation. I'm currently trying to understand why it seems to never happen.

The code:

fc.array(fc.webSegment({ size: "small" }), { size: "small" }) //1.
  .map(data => data.map((v) => `/${v}`).join('')) //2.

Should be able to create an instance being equal to ["",""] in 1. And then to result into //.

Currently checking what I'm missing.

@dubzzz
Copy link
Owner

dubzzz commented Apr 17, 2024

So strange, I'll have to dig further:

// Fails (expected)
fc.assert(
  fc.property(
    fc.array(fc.webSegment({ size: "small" }), { size: "small" }),
    p => {
      return !(p[0] === p[1] && p[0] === '');
    }
  ), {numRuns:1000000}
);

// Fails (expected)
fc.assert(
  fc.property(
    fc.array(fc.webSegment({ size: "small" }), { size: "small" })
      .map(data => data.map((v) => `/${v}`).join('')),
    p => {
      return !p.startsWith('//');
    }
  ), {numRuns:1000000}
);

// No fail (not expected)
fc.assert(
  fc.property(
    fc.webPath(),
    p => {
      return !p.startsWith('//');
    }
  ), {numRuns:1000000}
);

@dubzzz
Copy link
Owner

dubzzz commented Apr 17, 2024

I made a typo: it should be array(webSegment({ size: 'small' }), { size: 'xsmall' })

@dubzzz
Copy link
Owner

dubzzz commented Apr 17, 2024

Ok, so by adding the "+1" for the size requested to the webPath I get the two //.

fc.assert(
  fc.property(
    fc.webPath({size:'+1'}),
    p => {
      return !p.startsWith('//');
    }
  ), {numRuns:1000000}
);

It also makes your initial test fails:

fc.assert(
  fc.property(
    fc.webPath({size:'+1'}),
    p => {
      return !/^\/\/.*/.test(p);
    }
  ), {numRuns:1000000}
);

Regarding your question: yes it used to be

If I understand correctly, fc.webPath used to be an internal arbitrary that was used to implement fc.webUrl.


Regarding rootless, maybe we should add a withRootLess option to support this need.

but it raises the question about which part of RFC 3986 fc.webPath corresponds to since path-rootless would allow an initial segment.

@dubzzz
Copy link
Owner

dubzzz commented Apr 17, 2024

As justified by @astlouisf, I think we need the first / because of webUrl today. So the fix consisting into doing the following:

safeJoin(segments, '/');

Is probably not enough alone. as such it would be a breaking for the contract that users may expect from webPath (given today they can see it always starts by a /). I'd go for an option to include root-less for this major and possibly drop it and make it the default for the next major of the library.

@astlouisf
Copy link
Author

astlouisf commented Apr 18, 2024

@dubzzz Thank you for investigating this!

I'm not sure root-less should necessarily be added as part of fc.webPath (even though I brought it up 😅). The more I dug into RFC 3986 the more I realized that paths really depend on contexts and, sometimes, those contexts are conflicting.

If other type of paths are desired, I think specific arbitraries should be implemented.

In any cases, I also think the documentation for fc.webPath could potentially be improved by specifying that fc.webPath generates path compatible with path-abempty.

@dubzzz
Copy link
Owner

dubzzz commented Apr 18, 2024

We can definitely just update the documentation then. Opened to PR clarifying it

@astlouisf
Copy link
Author

I still think fc.webPath should generate path of the form //* in it's default configuration.

@dubzzz
Copy link
Owner

dubzzz commented Apr 18, 2024

Good point, I have a non breaking change in mind to handle that need in a timely manner. I'll deal with it.

For the update of the documentation, I prefer letting you doing it. You spent time on this issue so I'll be happy to have your PR and add you as a contributor 👍

samhh added a commit to samhh/fp-ts-std that referenced this issue Apr 19, 2024
Avoiding fast-check for this until this is
resolved:
  dubzzz/fast-check#4896

Not using Jest snapshots as they're not supported
by the Node.js or Bun test runners.

Co-authored-by: Alexandre St-Louis Fortier <alexandre@unsplash.com>
dubzzz added a commit that referenced this issue Apr 26, 2024
Even if it used to be already the case, users had to specify a larger than default size to be able to see paths containing a "//" into them. As "//" is crucial in paths and totally part of the spec, they need to be generated with our default size too.

The current PR makes sure we have a better balance between number of segments in a path and length of the segments. Our previous version (with default size) used to sacrify the number of segments and to prefer longer one. We now mix both.

Related to #4896
dubzzz added a commit that referenced this issue Apr 26, 2024
Even if it used to be already the case, users had to specify a larger
than default size to be able to see paths containing a "//" into them.
As "//" is crucial in paths and totally part of the spec, they need to
be generated with our default size too.

The current PR makes sure we have a better balance between number of
segments in a path and length of the segments. Our previous version
(with default size) used to sacrify the number of segments and to prefer
longer one. We now mix both.

Related to #4896

<!-- Context of the PR: short description and potentially linked issues
-->

<!-- ...a few words to describe the content of this PR... -->
<!-- ... -->

<!-- Type of PR: [ ] unchecked / [ ] checked -->

**_Category:_**

- [x] ✨ Introduce new features
- [ ] 📝 Add or update documentation
- [ ] ✅ Add or update tests
- [ ] 🐛 Fix a bug
- [ ] 🏷️ Add or update types
- [ ] ⚡️ Improve performance
- [ ] _Other(s):_ ...
  <!-- Don't forget to add the gitmoji icon in the name of the PR -->
  <!-- See: https://gitmoji.dev/                                  -->

<!-- Fixing bugs, adding features... may impact existing ones -->
<!-- in order to track potential issues that could be related to your PR
-->
<!-- please check the impacts and describe more precisely what to expect
-->

**_Potential impacts:_**

<!-- Generated values: Can your change impact any of the existing
generators in terms of generated values, if so which ones? when? -->
<!-- Shrink values: Can your change impact any of the existing
generators in terms of shrink values, if so which ones? when? -->
<!-- Performance: Can it require some typings changes on user side?
Please give more details -->
<!-- Typings: Is there a potential performance impact? In which cases?
-->

- [x] Generated values: anything related on web-path
- [x] Shrink values: anything related on web-path
- [x] Performance: anything related on web-path might be slightly
slower, but probably unnoticeable
- [ ] Typings
- [ ] _Other(s):_ ...
@dubzzz
Copy link
Owner

dubzzz commented Apr 28, 2024

Released at 3.18.0. Feel free to re-open in case it does not fix the original issue.

@dubzzz dubzzz closed this as completed Apr 28, 2024
thewilkybarkid added a commit to PREreview/prereview.org that referenced this issue Aug 28, 2024
CI started failing as there was an unexpected update to fast-check. This version now generates URLs containing '//' as the beginning of the path, which then break's constructing URL objects which consider it to be an invalid protocol relative URL.

Refs #1834, 8f6b150, dubzzz/fast-check#4896
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants