-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add generators for URLs and domain names #18
base: master
Are you sure you want to change the base?
Conversation
Thanks a lot for the PR! I'll try to review in the next couple of days. Can you please add your copyright on top of the new files? |
Will do. I also realized I should add Examples, so I'll push them as well. |
80bed69
to
86653bc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay! Finally managed to add some comments.
The thing I am concerned with the most is that only a tiny subset of possible URLs is generated. In general, I want rapid generators to produce every possible kind of valid data, to make sure that it is easy to catch overlooked corner cases in the things tested with rapid.
tld.go
Outdated
ZW | ||
` | ||
|
||
var tlds = strings.Split(tldsByAlpha, "\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First and last elements of tlds
are blank strings, is this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch, no I need to runtldsByAlpha
through TrimSpace
.
url.go
Outdated
assertf(4 <= maxLength, "maximum length (%v) should not be less than 4, to generate a two character domain and a one character subdomain", maxLength) | ||
assertf(maxLength <= 255, "maximum length (%v) should not be greater than 255 to comply with RFC 1035", maxLength) | ||
assertf(1 <= maxElementLength, "maximum element length (%v) should not be less than 1 to comply with RFC 1035", maxElementLength) | ||
assertf(maxElementLength <= 63, "maximum element length (%v) should not be greater than 63 to comply with RFC 1035", maxElementLength) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer removing DomainOf
entirely for now, and using named constants for 255
and 63
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do
url.go
Outdated
} | ||
|
||
func (g *domainNameGen) String() string { | ||
return fmt.Sprintf("Domain(maxLength=%v, mmaxElementLength%v)", g.maxLength, g.maxElementLength) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I try to make String()
close to the code using to construct the generator, so after removing DomainOf
it should just return Domain()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack.
url.go
Outdated
|
||
var b strings.Builder | ||
elements := newRepeat(1, 126, 1) | ||
b.Grow(elements.avg()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
b
is unused below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was me cargo-culting from another implementation that used a repeat. I'll clean it up.
domain = subDomain + "." + domain | ||
} | ||
|
||
return domain |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RFC 1035 specifies that <domain> ::= <subdomain> | " "
, should we support domain being space or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's in the spec, then we probably should, but I don't really grasp what " " as a domain even means. Guess I'll go back and re-read the spec.
url_external_test.go
Outdated
) | ||
|
||
func TestURL(t *testing.T) { | ||
pathEscapeRegex := regexp.MustCompile(`^[0-9A-Fa-f]{2}`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to compile at the top-level; also t.Parallel()
is missing for this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
url.go
Outdated
|
||
func (g *urlGenerator) value(t *T) value { | ||
scheme := SampledFrom(g.schemes).Draw(t, "scheme").(string) | ||
domain := Domain().Draw(t, "domain").(string) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we generate URLs with host component containing an IP (v4 or v6) address as well? That does not sound easy (especially given the IPv6 syntax variants), but otherwise we are omitting quite an important URL subset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. If I add those options, should I do so via public IPv4 and IPv6 Generators?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sounds good. Maybe a single generator of net.IP
with an option like v6 bool
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding the IPv4 and IPv6 generators is pretty straightforward, but I'm not sure how to incorporate them into the URL generator.
I can think of three ways to do this:
- One generator,
URL()
, that generates both DNS URLs and IP URLs - Two generators,
URL()
that generates DNS URLs andURLOf(schemes []string, domain *Generator)
. - Three generators,
URL()
,URLIPv4(), and
URLIPv6()`
I think you're looking for the first option, but wanted to float the other two and see what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think going with just URL()
for now is the safest option (because it can be extended in the future based on real feedback) -- in other cases we are guessing/speculating a bit about the use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks! I hope I'll be able to review/merge all this before the end of the week.
url.go
Outdated
func (g *urlGenerator) value(t *T) value { | ||
scheme := SampledFrom(g.schemes).Draw(t, "scheme").(string) | ||
domain := Domain().Draw(t, "domain").(string) | ||
port := IntRange(0, 2^16-1). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RFC does not limit the port range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My intuition is that the RFC is refraining from specifying something that's out of the scope of the standard, rather than positively asserting that there is no limit on the range of valid ports.
Your call, but I think limiting the generation to the standard 16-bit valid port range is reasonable.
StringOf(RuneFrom(nil, unicode.PrintRanges...)).Map(url.PathEscape), | ||
).Draw(t, "path").([]string)...) | ||
|
||
return url.URL{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not generating query or fragment parts, seems like quite a big omission.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that, but wasn't sure what to do.
With the very simple API currently exposed, there's no configurability of whether or not to generate query parameters or fragments. I suppose the consumer could use a Filter or Map to strip the parts they don't want. What are your thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure, too -- but client-side stripping (if required) sounds like an OK option.
url_example_test.go
Outdated
// Z.CU | ||
// r.ABBotT | ||
// r.AcCoUNTaNT | ||
// R. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
R.
is probably not what we wanted to generate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's probably coming from the two empty string entries that are incorrectly included.
b8127fb
to
0a22d63
Compare
By the way, any idea how to make CI work for this PR? It seems stuck. |
I don't think anything is being scheduled for the PR, but I'm not sure why. Edit: just looked at the workflow file, and there's no trigger for pull_requests. I can add that if you want. Here's one of my projects for comparision: https://github.com/wfscheper/stentor/blob/main/.github/workflows/build.yml |
Shouldn't "on (any) push" be enough? Also, can you please try rebasing on top of current master? Maybe it can help CI to unstuck itself as well. |
I think GitHub treats pushes to pull requests differently than pushes to
"normal" branches.
I'll try a rebase later today.
…On Thu, Nov 12, 2020, 11:28 Gregory Petrosyan ***@***.***> wrote:
Shouldn't "on (any) push"
<https://github.com/flyingmutant/rapid/blob/master/.github/workflows/ci.yml#L3>
be enough?
Also, can you please try rebasing on top of current master? Maybe it can
help CI to unstuck itself as well.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#18 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAPTM5GSAMBL5HO2WDFC7TSPQEL3ANCNFSM4TGGHHMA>
.
|
0a22d63
to
5122cc0
Compare
I've pushed the Can you please rebase this PR on top of current master and look into the CI failure? I think once the CI is happy I'll merge the PR, and fix the nitpicks (if I'll find something else to nitpick, that is) afterwards. |
5122cc0
to
40936f2
Compare
I'm at a loss on how to address the 1.13 issue. I think I may need to install go 1.13 and try stepping through the example with delve. I should have some time tonight. |
3f9a86a
to
07032e1
Compare
Hey, apologies for leaving this hanging for so long. I see that you added some build tags to try and get around the example tests not lining up correctly. I also added a bash script to update the tld constant, which was out of date. Something to wire into a github action, perhaps? |
No problem, rapid is not changing too rapidly :-) I use build tags mainly to work around Unicode database updating from release to release.
I think just doing manual update every couple of releases might be good enough for now. I don't want to complicate the automation too much. |
This adds a Generator for RFC 3986 compliant URLs, and two generators for RFC 1035 compliant domain names. Closes flyingmutant#17 Signed-off-by: Walter Scheper <walter.scheper@gmail.com>
Signed-off-by: Walter Scheper <walter.scheper@gmail.com>
Signed-off-by: Walter Scheper <walter.scheper@gmail.com>
Need to strip the leading and trailing newlines so we don't get empty strings in our top-level domain list.
This removes the configurable domain generation in favor of one that generates domains within the full rnage allowed by the specification.
To make the generated URL conform to how `url.Parse` works, Path and Fragment are set to the unescaped form. RawPath and RawFragment are left unset, and we simply rely on the URL struct to handle the escaping as needed.
These generators return `net.IP`, and are also used in generating URLs.
This adds a simple shell script to update the tld.go constant with the latest contents of https://data.iana.org/TLD/tlds-alpha-by-domain.txt
07032e1
to
1499ace
Compare
Hello! GitHub randomly reminded me about this abandoned PR of mine. Thought I'd take a bit and update it, and was pleasantly surprised by the new API using generics. Awesome stuff! Only hiccup I had was the loss of the old Generator.Map() functionality. Is there an equivalent that I missed? |
Yep, I was pleasantly surprised how little changes (conceptual, not LoC) were required to make rapid fully type-safe.
Methods can't have their own generic parameters, because of that |
ip_addr.go
Outdated
return fmt.Sprintf("IP(ipv6=%v)", g.ipv6) | ||
} | ||
|
||
func (g *ipGen) value(t *T) net.IP { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it is good idea to use netip
? With generics we already require Go 1.18.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Done.
url_example_test.go
Outdated
|
||
// String generation depends on the Unicode tables, which change with Go versions: | ||
//go:build go1.16 | ||
// +build go1.16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need, now we require Go 1.18 anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
This also removes the unnecessary build tag in the example tests.
This adds an RFC 1035 compliant domain name generator and an RFC 3986 compliant URL generator.
Closes #17
Signed-off-by: Walter Scheper ratlaw@gmail.com