Introduce symbol table API into pad #22850
base: blead
51e4140 to 9719855
…fication
Using dedicated C tokens instead of literal values will improve readability of source code by separating intention from other usages of literal characters:
- sigil (as language structure)
- predefined variable name
- prototype
It will also make it feasible to add functionality, like:
- new sigil-less symbol tables (attributes, patterns, contracts, ...)
- add support for custom symbol tables
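The difference between a literal and a dedicated token can be sketched as follows. This is only an illustration: `Perl_Symbol_Table_Scalar` and `Perl_Symbol_Table_Code` are named in this discussion, while the `Array` and `Hash` members, the enum type itself, the choice of sigil characters as values, and the two helper functions are assumptions made for the example.

```c
#include <assert.h>

/* Hypothetical token set. Only the Scalar and Code names appear in the
 * discussion; Array, Hash and the values are assumptions. */
typedef enum {
    Perl_Symbol_Table_Scalar = '$',
    Perl_Symbol_Table_Array  = '@',
    Perl_Symbol_Table_Hash   = '%',
    Perl_Symbol_Table_Code   = '&'
} Perl_Symbol_Table;

/* Literal form: '&' could be a sigil, a variable name or a prototype. */
static int demo_is_code_literal(char first)
{
    return first == '&';
}

/* Token form: the comparison says a symbol table is meant. */
static int demo_is_code_token(char first)
{
    return first == Perl_Symbol_Table_Code;
}
```

Both helpers behave identically; only the stated intent differs.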
…m symbol table
Returns a true value when the provided `PADNAME *` represents a symbol belonging to the provided symbol table.
By wrapping the implementation into a single intent-expressing token, it can evolve without affecting its usage.
Using a macro instead of direct access:
- identifies code where such a value is used / required
- hides any future changes from code which uses it, e.g. a type change from 'char' to 'U32'
Returns the name without the symbol type, so it will keep working when we change the symbol table id type or add new symbol types. For backward compatibility, users of this macro can rely on -1 being a valid index.
stored in pad, excluding symbol table id (sigil), e.g.:
- `$` => 0
- `$_` => 1
- `$self` => 4
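The three lengths listed above can be reproduced with a small stand-alone sketch. The struct, the macro name and the helper are hypothetical stand-ins, not the real `PADNAME` layout or the macro added by this PR; the sketch assumes a one-byte sigil stored as the first byte of the name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a pad name: the stored string starts with the sigil. */
struct demo_padname {
    const char *pv;   /* e.g. "$self" */
    size_t      len;  /* stored length, sigil included */
};

/* Hypothetical macro: length of the name without the one-byte sigil. */
#define DEMO_PADNAME_SYMBOL_LEN(pn) ((pn)->len - 1)

/* Helper for the example only: build a stand-in from a C string. */
static struct demo_padname demo_padname_from(const char *s)
{
    struct demo_padname pn;
    pn.pv  = s;
    pn.len = strlen(s);
    return pn;
}
```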
into handy macros to avoid code duplication as well as simplify future changes.
eg: scalar, array, and hash are "variable"
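A grouping such as "variable" could be built on top of a per-table predicate roughly like this. All identifiers below are hypothetical and the struct is a stand-in; the sketch only shows the shape of the idea, not the macros from the PR.

```c
#include <assert.h>

/* Hypothetical symbol table ids. */
typedef enum {
    DEMO_ST_SCALAR,
    DEMO_ST_ARRAY,
    DEMO_ST_HASH,
    DEMO_ST_CODE
} demo_symbol_table;

/* Stand-in for a pad name that carries its symbol table id. */
struct demo_padname {
    demo_symbol_table table;
    const char       *name;
};

/* True when the pad name belongs to the given symbol table. */
#define DEMO_PADNAME_IS(pn, st) ((pn)->table == (st))

/* Scalar, array and hash are grouped as "variable"; code is not. */
#define DEMO_PADNAME_IS_VARIABLE(pn)          \
    (  DEMO_PADNAME_IS((pn), DEMO_ST_SCALAR)  \
    || DEMO_PADNAME_IS((pn), DEMO_ST_ARRAY)   \
    || DEMO_PADNAME_IS((pn), DEMO_ST_HASH))
```

Callers test the group in one place, so adding a new kind of variable later would mean touching a single macro.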
Move the printf format and the proper expansion of its parameters into macros so that, when the code evolves, the change stays isolated from usage.
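One way to pair a format macro with an argument macro is sketched below. The macro names, the struct and the message text are made up for illustration; they are not the macros introduced by this commit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in pad name with the sigil kept apart from the name. */
struct demo_padname {
    char        sigil;
    const char *name;
};

/* Hypothetical pair: the format fragment and its matching arguments. */
#define DEMO_PNf         "%c%s"
#define DEMO_PNfARG(pn)  (pn)->sigil, (pn)->name

/* Callers never spell out how a pad name is printed. */
static int demo_describe(char *buf, size_t size,
                         const struct demo_padname *pn)
{
    return snprintf(buf, size, "variable " DEMO_PNf " is declared",
                    DEMO_PNfARG(pn));
}
```

If the stored representation changes, only the two macros need to follow it.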
extracting common part of condition and using `Perl_Symbol_Table_*` constants
…nts first
A common problem arises when a macro expands to multiple arguments. Without an intermediate step that lets the arguments expand before the macro call, there will be an error, because the expansion is treated as a single argument.
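The preprocessor behaviour described here can be shown with a generic sketch. The macro names are hypothetical, and the forwarding layer uses a C99 variadic macro, which is an assumption about what the code base allows rather than the technique used in the commit.

```c
#include <assert.h>

/* A macro that expands to two comma-separated arguments. */
#define DEMO_PAIR 2, 3

/* Implementation macro that needs exactly two arguments. */
#define DEMO_ADD_IMPL(a, b) ((a) + (b))

/* DEMO_ADD_IMPL(DEMO_PAIR) does not compile: arguments are counted
 * before DEMO_PAIR is expanded, so only one argument is seen. */

/* Intermediate step: the argument is expanded while passing through
 * this layer, so DEMO_ADD_IMPL then receives two arguments. */
#define DEMO_ADD(...) DEMO_ADD_IMPL(__VA_ARGS__)

static int demo_sum(void)
{
    return DEMO_ADD(DEMO_PAIR);
}
```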
to improve readability of what is actually tested
There were two instances of the same condition. Since the code looks like a 3-state dispatch, I guess it was meant to be `1`.
9719855 to 7f7f7c7
What is the point of this very large change? AFAICT, it's not a bug fix or a response to an open issue. Does it respond to some previously expressed need?
@jkeenan already discussed that on IRC. This PR is quite huge; it in fact combines parts of 3 changesets (in my eyes, maybe more in someone else's). I will extract the first two changes into separate PRs, each announced first for discussion on the mailing list.
If we are making brand new functions/APIs,
"()" is unlikely to have any reuse in libperl or xs__.so or reuse in B::CC/B::CC-like "op structs to C" compilers.
Newxz(alloc2,
      STRUCT_OFFSET(struct padname_with_str, xpadn_str[0]) + len + 1,
      char);
alloc = (struct padname_with_str *)alloc2;
Why Newxz() when 4 lines below Copy() will write over the null bytes? No need for interp backend or libc to do a redundant memset().
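The point about the redundant zero fill can be illustrated in plain C, with `malloc`/`memcpy` standing in for `Newx()`/`Copy()`. The struct and function below are hypothetical and much smaller than the real `struct padname_with_str`. One caveat is an assumption worth checking against the real code: a zeroing allocation also clears the header fields, so dropping it is only safe if every field is then set explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a header followed by the inline name string. */
struct demo_padname_with_str {
    size_t len;
    char   str[];   /* C99 flexible array member */
};

/* Allocate without zero fill; every byte that is read later is
 * written explicitly below. */
static struct demo_padname_with_str *demo_new(const char *name, size_t len)
{
    struct demo_padname_with_str *pn =
        malloc(offsetof(struct demo_padname_with_str, str) + len + 1);
    if (pn == NULL)
        return NULL;
    pn->len = len;               /* header field set by hand */
    memcpy(pn->str, name, len);  /* overwrites the name bytes */
    pn->str[len] = '\0';         /* only the terminator is extra */
    return pn;
}
```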
It is the current implementation, modified to treat the symbol table and name as separate arguments.
&/@/%/$ are easier on the eyes, and seeing #2-#4 in C/XS code, instantly means the C statement is interacting with a PP-concept or the PP-type system. Tokens Perl C/XS uses AV/HV/CV tokens everywhere. Short to type. 2 char tokens won't instantly blow past 80 char EOL/justify/format. Short tokens are a historical nice feature of P5 C API. The very long "Perl_Symbol_Table_*" prefix feels like another prog lang/framework. The proposed ident "Perl_Symbol_Table_Scalar" can only fit 3x before IDE EOL overflow @80ch. Even tokens SVt_PVLV (SVt_NULL ????)/SVt_PVAV/SVt_PVHV/SVt_PVCV like in old
new
I really don't like that this code is adding a 3rd arg, when the old API was just 2. And this new API forever adds runtime overhead at all the callers vs the current API which has an overhead of 1 byte!!! for the sigil (embedded into C lit string).
On -O1 AMD64, function arg constant int 0x1 has 5 bytes of overhead. If "0x1" or "0x24" the ""argument"" gets embedded in the C lit, its overhead is 1 byte, not 5 bytes (separate C arg), assuming A better way to do it, this example might have flaws, don't copy paste it without thoroughly thinking it over. old
new/proposed
I see current blead, and meaning of U32 flags
bits 0xF0 or 0xE0 are unassigned. 2 bits can be taken to encode $%&@. A lot better than 5 bytes at every caller. Adding 4 functions This branch would be better off as a "xs_english.h" with "#define PERL_XS_ENGLISH" vs the ABI changes/runtime bloat changes done, to "improve" code readability, when I find it's less readable b/c of very long tokens, and blurring what C statements/sub expressions are "C as C" vs "P5 as C".
If someone thinks "$"/"&"/"@"/"%" are unreadable as "Perl in C", Even though I offer improvements/suggestions above, my suggestions are generic talking or abstract talking. The proposed padname API has too many flaws on ASCII src code level (future hacking/maint/reading), and ABI (after C compiler) level. I don't think it's fixable.
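The suggestion in the comment above, taking two unassigned bits of the existing U32 flags argument to carry the sigil, might look roughly like this. The bit positions, masks, numeric ids and function names are all assumptions for illustration; which flag bits are actually free in blead has not been verified here.

```c
#include <assert.h>

typedef unsigned int demo_u32;   /* stand-in for U32 */

/* Hypothetical: two bits inside the 0xF0 range hold the sigil id. */
#define DEMO_SIGIL_SHIFT 4
#define DEMO_SIGIL_MASK  (0x3u << DEMO_SIGIL_SHIFT)   /* 0x30 */

enum {
    DEMO_SIGIL_SCALAR = 0,   /* $ */
    DEMO_SIGIL_ARRAY  = 1,   /* @ */
    DEMO_SIGIL_HASH   = 2,   /* % */
    DEMO_SIGIL_CODE   = 3    /* & */
};

/* Store the sigil id in the flags word, keeping the other bits. */
static demo_u32 demo_flags_with_sigil(demo_u32 flags, unsigned sigil)
{
    return (flags & ~DEMO_SIGIL_MASK)
         | ((sigil & 0x3u) << DEMO_SIGIL_SHIFT);
}

/* Read the sigil id back out of the flags word. */
static unsigned demo_sigil_from_flags(demo_u32 flags)
{
    return (flags & DEMO_SIGIL_MASK) >> DEMO_SIGIL_SHIFT;
}

/* Map the two-bit id back to the sigil character. */
static char demo_sigil_char(unsigned sigil)
{
    return "$@%&"[sigil & 0x3u];
}
```

With this layout the caller passes one flags constant instead of a separate argument, which is the size argument made in the comment.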
@bulk88 re constants for names:
with my change (fully extended, will be another PR), some of these lines will look like (the rest I didn't identify properly yet):
Changes summary
For the purpose of these changes, let's distinguish between:
For implementation of sigil-less symbols we will still need their symbol type, eg:
Replace literals representing type of symbol stored in pad with explicit tokens
For example, '&' in code can also mean:
This set of changes replaces its usage as symbol type with the C token Perl_Symbol_Table_Code.
Extend padname API with "split" macros
Instead of relying on the layout of the string content, explicit macros providing the symbol type and symbol name are introduced, eg:
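A minimal sketch of what such split macros could look like, using hypothetical names and a stand-in struct rather than the PR's actual macros or the real `PADNAME`. It assumes the sigil is still stored as the first byte, which is exactly the detail the macros are meant to hide.

```c
#include <assert.h>
#include <string.h>

/* Stand-in pad name: sigil and name share one string today. */
struct demo_padname {
    const char *pv;   /* e.g. "@list" */
};

/* Hypothetical split accessors; callers stop indexing pv directly. */
#define DEMO_PADNAME_SYMBOL_TABLE(pn) ((pn)->pv[0])
#define DEMO_PADNAME_SYMBOL_NAME(pn)  ((pn)->pv + 1)

/* Example caller: works only through the two accessors. */
static int demo_is_array_named(const struct demo_padname *pn,
                               const char *name)
{
    return DEMO_PADNAME_SYMBOL_TABLE(pn) == '@'
        && strcmp(DEMO_PADNAME_SYMBOL_NAME(pn), name) == 0;
}
```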
Provide padname functions with split symbol type and symbol name parameters
Alternatives to existing functions but with two parameters.
Example:
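As an illustration of the split-parameter idea only, the sketch below compares a lookup that takes one sigil-prefixed string with one that takes the symbol table and the name separately. Both functions and the array-of-strings pad are hypothetical and far simpler than the real pad functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Joined form: the sigil is the first byte of the searched string. */
static int demo_find_joined(const char *const *pad, size_t n,
                            const char *sigil_and_name)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (strcmp(pad[i], sigil_and_name) == 0)
            return (int)i;
    return -1;
}

/* Split form: symbol table and name are separate parameters. */
static int demo_find_split(const char *const *pad, size_t n,
                           char table, const char *name)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (pad[i][0] == table && strcmp(pad[i] + 1, name) == 0)
            return (int)i;
    return -1;
}
```

The split form needs no temporary sigil-prefixed string at the call site, at the cost of the extra argument discussed in the review.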
Possible follow-ups
Perl_Var_Match
Perl_Prototype_Code
Open questions
perldelta?