Add typescript types to bin #538

Merged
merged 14 commits into from
Aug 13, 2024
Changes from 10 commits
1 change: 1 addition & 0 deletions .ncurc.cjs
@@ -1,6 +1,7 @@
"use strict";

module.exports = {
"dep": ["prod", "dev", "packageManager"],
"reject": [
"chai", // Moved to es6
"@types/chai", // Should match chai
1 change: 1 addition & 0 deletions .npmignore
@@ -29,3 +29,4 @@ tsconfig*.json
web-test/
yarn.lock
pnpm-workspace.yaml
bin/tsconfig.json
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -12,6 +12,8 @@ Released: TBD

### New features

- [#477](https://github.com/peggyjs/peggy/issues/477) Option to output .d.ts
files next to .js from CLI.
- [#530](https://github.com/peggyjs/peggy/issues/531) Allow es6 plugins from CLI
- [#532](https://github.com/peggyjs/peggy/issues/532) Allow es6 options files
from the CLI
113 changes: 113 additions & 0 deletions bin/generated_template.d.ts
@@ -0,0 +1,113 @@
/** Provides information pointing to a location within a source. */
export interface Location {
/** Line in the parsed source (1-based). */
line: number;
/** Column in the parsed source (1-based). */
column: number;
/** Offset in the parsed source (0-based). */
offset: number;
}

/** The `start` and `end` positions of an object within the source. */
export interface LocationRange {
/** Any object that was supplied to the `parse()` call as the `grammarSource` option. */
source: any;

The comments say this is an object. If this cannot be trusted then you should update the comment and pursue a different change. I can work with you on this.


To start off I see this code block:

    if (offset && peg$source && (typeof peg$source.offset === "function")) {
      res.start = peg$source.offset(res.start);
      res.end = peg$source.offset(res.end);
    }

offset seems to be true only when options.trace (within the compiler itself) is true. However, this still indicates a shape.

Maybe type it like so?

// A mutable object can be assigned to a readonly object just fine so this really does allow any object.
type AnyObject = {
    // This MUST be `unknown` instead of `any` because `any` allows arrays and functions as well.
    // Functions and arrays are allowed with `any` because both of these types can be indexed by strings successfully.
    // There is no reasonable explanation why `unknown` acts differently. They're literally just special cased differently in indexed types.
    // See https://github.com/microsoft/TypeScript/wiki/Breaking-Changes/83af27fca396d172b4d895d480b10c3bacf89112#-k-string-unknown--is-no-longer-a-wildcard-assignment-target
    readonly [K: string]: unknown;
}

// Based upon your remark "any object" I'm assuming arrays, functions, numbers, etc. are invalid so I typed it as `AnyObject`.
export interface Source extends AnyObject {
    /**
     * This function is used when trace is set to true to do ???
     * TODO: Better documentation. I don't know what's going on here entirely to be honest.
     */
    offset?: (location: Location) => Location
}

export interface LocationRange {
    source: Source;

    ...
}

The inline comments there are for the benefit of explanation; I would not leave them in.


On the other hand I see some internal code that suggests that it could be a string or a File or anything else. In which case I would suggest

type Source = File | string | {} | null | undefined;

I wrote it this way for intellisense purposes. The type {} actually means "any type that isn't null or undefined" and then I added null and undefined to allow both of those. This is better than File | string | unknown because in intellisense it'd just collapse and show unknown.

Though I bet you probably just want some subset like objects/arrays/so on? For instance you probably don't want a function but I guess I don't know.

Contributor Author

Don't worry about being too pedantic. I indeed did ask for it. :)

Yeah, the comment is wrong. grammarSource is often a string containing the file name, but could also be an object that produces a useful string when String(grammarSource) is applied, e.g. something with a toString() method. They also need to be testable for equality with ===. As such, numbers would work also, as long as they aren't NaN.
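
The two requirements described above (a useful `String(grammarSource)` and comparability with `===`) can be illustrated with a small sketch. `SourceText`, `findSourceText`, and `FileSource` are hypothetical names made up for this illustration and are not Peggy APIs; the lookup only assumes that sources are matched by strict equality, as the comment describes.

```typescript
// Hypothetical shape: a source identifier paired with the text that was parsed.
interface SourceText {
  source: unknown;
  text: string;
}

// Look up the text for a grammarSource using strict equality (===).
function findSourceText(
  sources: readonly SourceText[],
  grammarSource: unknown
): string | undefined {
  const hit = sources.find(s => s.source === grammarSource);
  return hit === undefined ? undefined : hit.text;
}

// Hypothetical object source that stringifies to something useful.
class FileSource {
  readonly path: string;
  constructor(path: string) {
    this.path = path;
  }
  toString(): string {
    return this.path;
  }
}

const fileSource = new FileSource("grammar.peggy");
const knownSources: SourceText[] = [
  { source: "input.txt", text: "a" },
  { source: fileSource, text: "b" },
  { source: 42, text: "c" },
  { source: NaN, text: "d" },
];

// Strings and numbers match by value; objects match by identity only.
const byString = findSourceText(knownSources, "input.txt");
const byObject = findSourceText(knownSources, fileSource);
const byNumber = findSourceText(knownSources, 42);
// NaN !== NaN, so a NaN source can never be found again.
const byNaN = findSourceText(knownSources, NaN);
```

A second `FileSource` with the same path would not match either, because `===` compares object identity rather than the string form.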

Contributor Author

The GrammarLocation class in lib/grammar-location.js is a good example of why you might want an object as the grammarSource.

Contributor Author

And you're right that GrammarLocation has a few methods that the generated parser can use if they are there. They should be documented.

Yeah so my final type; type Source = File | string | {} | null | undefined; in the last comment may be close to what you want. It'll allow literally anything with some intellisense hints about common uses.

But here's another try at it based upon your new constraints:

// I'm sure you can come up with a better name for this
// This will contain all of the methods you mentioned that the parser can use if they're there.
interface ObjectSource {
    ... // Methods go here, you can make them required if you want as `AnyObject` will catch it in.
}

// `AnyObject` is a bit broader than you talked about; there's no guarantee it has a useful `toString` method. You might think that you could fix this with the type `{ toString(): string }` but unfortunately `Object` itself defines a `toString` method that returns `"[object Object]"`.
//
// You can probably decide for yourself if `null` and `undefined` makes sense. They most likely make sense to mean "not present" rather than a valid value though.
// It sounds like arrays probably don't make sense here. If they do you could add `readonly unknown[]` to the union to allow any array (readonly or not)
// `unique symbol` - the return of `Symbol(...)`  may not be usable because of the `===`/`toString` constraints?
// `symbol` - the return of `Symbol.for()` may make more sense but still a bit strange to put here.
// It sounds like functions almost certainly don't make sense here.
type GrammarSource = ObjectSource | AnyObject | string | number | bigint;

Notably this new version does NOT just allow anything but I left a comment about everything it omits.

Contributor Author

@hildjj Jul 30, 2024

I've got this:

export interface GrammarSourceObject {
  readonly source?: undefined | string | unknown;
  readonly offset?: undefined | ((loc: Location) => Location);
  readonly toString: () => string;
}

export type GrammarSource = string | GrammarSourceObject | unknown;

That unknown is too broad, and I'd be open to just removing it in favor of the things we know work and are useful.

Alright, what things are known to work and are useful? I can help you figure that out.

Contributor Author

I'm driving across Kansas today, currently charging the car in Maple Hill. Sorry in advance for spotty connectivity.

I think I'm probably the only one who ever uses anything other than a string, and the only other thing I ever use is GrammarLocation, which conforms to GrammarSourceObject. Let's just take the unknown out and let people file bugs when they want more. :)
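
A rough sketch of where this thread ends up, with `unknown` dropped from the union. `SketchLocation` stands in for the template's `Location` (renamed here only to avoid clashing with the DOM global), and `EmbeddedSource` and `relocate` are hypothetical helpers written for this illustration; the real `GrammarLocation` class may behave differently.

```typescript
// Stand-in for the template's Location interface.
interface SketchLocation {
  line: number;
  column: number;
  offset: number;
}

// As proposed in the thread above.
interface GrammarSourceObject {
  readonly source?: undefined | string | unknown;
  readonly offset?: undefined | ((loc: SketchLocation) => SketchLocation);
  readonly toString: () => string;
}

// The union with `unknown` removed, as agreed.
type GrammarSource = string | GrammarSourceObject;

// Hypothetical source for text embedded at `start` inside a larger file.
class EmbeddedSource implements GrammarSourceObject {
  readonly source: string;
  readonly start: SketchLocation;
  constructor(source: string, start: SketchLocation) {
    this.source = source;
    this.start = start;
  }
  toString(): string {
    return this.source;
  }
  // Assumed behaviour: translate a position in the embedded text into
  // a position in the enclosing file.
  offset(loc: SketchLocation): SketchLocation {
    return {
      line: this.start.line + loc.line - 1,
      column: loc.line === 1 ? this.start.column + loc.column - 1 : loc.column,
      offset: this.start.offset + loc.offset,
    };
  }
}

// Apply the optional offset() only when the source provides one.
function relocate(gs: GrammarSource, loc: SketchLocation): SketchLocation {
  if (typeof gs !== "string" && typeof gs.offset === "function") {
    return gs.offset(loc);
  }
  return loc;
}

const embedded = new EmbeddedSource("doc.md", { line: 10, column: 5, offset: 100 });
const moved = relocate(embedded, { line: 1, column: 3, offset: 2 });
const untouched = relocate("plain.txt", { line: 1, column: 3, offset: 2 });
```

Both a plain string and the object form go through the same `relocate` call, which is the main thing the narrower union buys over `unknown`.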

/** Position at the beginning of the expression. */
start: Location;
/** Position after the end of the expression. */
end: Location;
}

export interface LiteralExpectation {
type: "literal";
text: string;
ignoreCase: boolean;
}
export interface ClassParts extends Array<string | ClassParts> {
}
export interface ClassExpectation {
type: "class";
parts: ClassParts;
inverted: boolean;
ignoreCase: boolean;
}
export interface AnyExpectation {
type: "any";
}
export interface EndExpectation {
type: "end";
}
export interface OtherExpectation {
type: "other";
description: string;
}
export type Expectation =
| AnyExpectation
| ClassExpectation
| EndExpectation
| LiteralExpectation
| OtherExpectation;

declare class _PeggySyntaxError extends Error {
/**
* Constructs the human-readable message from the machine representation.
*
* @param expected Array of expected items, generated by the parser
* @param found Any text that will appear as found in the input instead of
* expected
*/
static buildMessage(expected: Expectation[], found: string | null): string;
message: string;
expected: Expectation[];
found: string | null;
location: LocationRange;
name: string;
constructor(
message: string,
expected: Expectation[],
found: string | null,
location: LocationRange,
);
format(sources: {
source?: any;

Is there any way I could avoid any here?

Contributor Author

It's either that, unknown, or we put in another generic type, which will make the whole thing kind of a mess.

It really can be almost anything, as long as it can be converted to a string with String(source), and it matches the grammarSource that got passed in to parse.

Since you asked for a more full review, yeah I'd suggest unknown or Source here. You said it should match grammarSource hence Source (which I talked about typing in a different comment) if that even makes sense.

text: string;
}[]): string;
}

export interface ParserTracer {
trace(event: ParserTracerEvent): void;
}

Pedantic; it's a soundness hole you practically have to try to cause in this case but it's an easy fix.


After this preface I'm going to jump into a deep explanation of the intentional unsoundness of TypeScript. You can ignore all of what I'm about to say and simply address a possible unsoundness with this simple diff:

Suggested change
- export interface ParserTracer {
-   trace(event: ParserTracerEvent): void;
- }
+ export interface ParserTracer {
+   trace: (event: ParserTracerEvent) => void
+ }

By defining it this way you say that "trace" is a method. TypeScript handles methods specially and compares their parameters "bivariantly", a term practically exclusive to TypeScript, which makes the check rather unsound. You can see this FAQ entry for their explanation of what bivariance is, but in this specific case it lets you do unsound stuff like this:

// This expects the extra property `abc`.
type WithABC = ParserTracerEvent & { abc: number }

const foo: ParserTracer = {
  // This signature is accepted
  trace(event: WithABC): void {
    event.abc // Which means you can do whatever you want with `abc` even though it definitely doesn't exist.
  }
}

See this playground if you want to see for yourself.
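
To make the runtime consequence concrete, here is a small self-contained sketch. `SketchTraceEvent` and `MethodTracer` are simplified stand-ins invented for this illustration, not the real `ParserTracerEvent` and `ParserTracer`; the point is only that the method-style declaration lets the narrower parameter type through.

```typescript
// Simplified stand-in for ParserTracerEvent.
type SketchTraceEvent = { type: "rule.enter"; rule: string };

// An event type that claims an extra property the caller never supplies.
type WithABC = SketchTraceEvent & { abc: number };

// Method-style declaration: parameters are compared bivariantly.
interface MethodTracer {
  trace(event: SketchTraceEvent): void;
}

const seen: unknown[] = [];

// Accepted by the compiler even though WithABC is narrower than SketchTraceEvent.
const methodTracer: MethodTracer = {
  trace(event: WithABC): void {
    // Typed as number, but nothing guarantees it exists at runtime.
    seen.push(event.abc);
  },
};

// A perfectly valid event according to the interface.
methodTracer.trace({ type: "rule.enter", rule: "start" });

// The value recorded above is undefined, not a number.
const leaked = seen[0];
```

With the property form (`trace: (event: SketchTraceEvent) => void`), the same assignment should be rejected when `strictFunctionTypes` is enabled, which is what the suggested change relies on.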

Contributor Author

Nod. That seems easy enough to fix.

I can't mark this as resolved but it can be.


export type ParserTracerEvent
= { type: "rule.enter"; rule: string; location: LocationRange }
| { type: "rule.fail"; rule: string; location: LocationRange }
| { type: "rule.match"; rule: string; result: any; location: LocationRange };

Looking at the source this is generated in two places; generateRuleFooter which is called with stack.source() which is always a string and in generateRuleHeader which is called with asts.indexOfRule(ast, rule.name) which is always a number.

There may be a reason I'm missing to keep it more generic but I believe string | number would suffice for the current runtime possibilities? Perhaps this will be changed in the future and is meant to be assumed to be arbitrary. In that case I would suggest unknown.

Contributor Author

I think you might have been looking in the wrong place. Generate a grammar with --trace turned on to see the code that will run. Look for peg$tracer.trace calls. The result is anything that a rule could return from an action block. Example:

NaN = "nan" { return NaN }

Aha! Yeah you're right, I see it in source code now too. My folly was checking for the string "rule.match" which only included the things I mentioned.

Looks like unknown is the best type for the job unless action blocks can't return, say, functions or something.

Contributor Author

I can totally imagine them returning a function. For instance, you might implement an XPath parser that way... (/me checks. No, I didn't do that in my XPath parser, but it would have been reasonable)
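
Since an action block can return anything, including `NaN` or a function, a tracer written against `result: unknown` has to narrow before using the value. The following is a minimal sketch of that; `SketchRange`, `SketchTracerEvent`, and `describeEvent` are hypothetical, simplified stand-ins rather than the types from the template.

```typescript
// Simplified stand-in for LocationRange.
interface SketchRange {
  start: number;
  end: number;
}

// Same discriminated-union shape as ParserTracerEvent, with `result: unknown`.
type SketchTracerEvent =
  | { type: "rule.enter"; rule: string; location: SketchRange }
  | { type: "rule.fail"; rule: string; location: SketchRange }
  | { type: "rule.match"; rule: string; result: unknown; location: SketchRange };

// Switching on `type` narrows the union; `result` exists only on matches.
function describeEvent(ev: SketchTracerEvent): string {
  switch (ev.type) {
    case "rule.enter":
    case "rule.fail":
      return `${ev.type} ${ev.rule}`;
    case "rule.match": {
      const r = ev.result;
      // `unknown` forces an explicit check before treating the value specially.
      const shown = typeof r === "function" ? "[function]" : String(r);
      return `${ev.type} ${ev.rule} -> ${shown}`;
    }
  }
}

const span: SketchRange = { start: 0, end: 3 };
const nanLine = describeEvent({ type: "rule.match", rule: "NaN", result: NaN, location: span });
const fnLine = describeEvent({ type: "rule.match", rule: "fn", result: () => 1, location: span });
const enterLine = describeEvent({ type: "rule.enter", rule: "start", location: span });
```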

I can't mark this as resolved but it can be.


export type StartRules = $$$StartRules$$$;
export interface ParseOptions<T extends StartRules = $$$DefaultStartRule$$$> {
/**
* Object that will be attached to each `LocationRange` object created by
* the parser. For example, this can be path to the parsed file or even the
* File object.
*/
grammarSource?: any;
startRule?: T;
tracer?: ParserTracer;

// Internal use only:
peg$library?: boolean;
// Internal use only:
peg$currPos?: number;
// Internal use only:
peg$silentFails?: number;
// Internal use only:
peg$maxFailExpected?: Expectation[];
// Extra application-specific properties
[key: string]: any;
}

export declare const parse: typeof ParseFunction;
export declare const SyntaxError: typeof _PeggySyntaxError;
export type SyntaxError = _PeggySyntaxError;

// Overload of ParseFunction for each allowedStartRule