Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential compiler segfault in lookup.c #4153

Open
stefandd opened this issue Jun 23, 2022 · 21 comments
Open

Potential compiler segfault in lookup.c #4153

stefandd opened this issue Jun 23, 2022 · 21 comments
Labels
bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed

Comments

@stefandd
Copy link
Contributor

stefandd commented Jun 23, 2022

static deferred_reification_t* lookup_nominal(pass_opt_t* opt, ast_t* from,
ast_t* orig, ast_t* type, const char* name, bool errors, bool allow_private)
{
pony_assert(ast_id(type) == TK_NOMINAL);
typecheck_t* t = &opt->check;
ast_t* def = (ast_t*)ast_data(type);
AST_GET_CHILDREN(def, type_id, typeparams);
const char* type_name = ast_name(type_id);
if(is_name_private(type_name) && (from != NULL) && (opt != NULL)
&& !allow_private)
{

This code is a bit bogus. If opt can be NULL, the declaration typecheck_t* t = &opt->check; could crash the compiler. In any case, it looks like the t declaration could be safely moved into the scope of the if-branch, where it is properly guarded against opt == NULL.

@ponylang-main ponylang-main added the discuss during sync Should be discussed during an upcoming sync label Jun 23, 2022
@SeanTAllen SeanTAllen added bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed labels Jun 23, 2022
@SeanTAllen
Copy link
Member

Thanks for reporting this one. This should be a straightforward PR for anyone who wants to take it.

@stefandd
Copy link
Contributor Author

Thanks for reporting this one. This should be a straightforward PR for anyone who wants to take it.

I am happy to submit a PR but wanted to not do it since there is a current PR (private types confusing error...) that involves this file, and I was worried that two edits of lookup.c are potentially floating around. If that is unfounded, I am happy to quickly create the PR

@ergl
Copy link
Member

ergl commented Jun 23, 2022

The variable t is also used outside of the if scope below:

if(t->frame->type != def)
{
if(errors)
{
ast_error(opt->check.errors, from,
"can't lookup private fields from outside the type");
}
return NULL;
}
break;
case TK_NEW:
case TK_BE:
case TK_FUN:
{
if(ast_nearest(def, TK_PACKAGE) != t->frame->package)
{

I actually don't know if opt is ever allowed to be NULL here, this function in particular is very liberal when it comes to dereferencing things inside of it (see all the instances of opt->check.errors).

Also, we have several pony_assert(opt != NULL); spread through the compiler, but it could be that there's some context in which opt can be NULL

@stefandd
Copy link
Contributor Author

@ergl Should I close this?

@SeanTAllen
Copy link
Member

I don't think this should be closed.

I think we should remove the opt != NULL and could reasonably add an assert that it isn't null to the function.

@jemc jemc removed the discuss during sync Should be discussed during an upcoming sync label Jun 28, 2022
stefandd added a commit to stefandd/ponyc that referenced this issue Jul 27, 2022
This attempts to fix issues ponylang#4130 and ponylang#4153.

The issue was when a private type in another package was used as a default value in a method call.

Fix to ponylang#4130
Since it has been decided to treat this as a bug instead of a missing error, this PR implements the fix suggested by @ergl, namely using `lookup_try()` instead of lookup() in call.c's `check_partial_function_call()` since it allows to permit private types.

Fix to ponylang#4153
This is also a simply fix to lookup.c that prevents a potential segfault by a dereferencing of `opt` (`typecheck_t* t = &opt->check;`) *before* `opt != NULL` was checked. As pointed out by @SeanTAllen, opt should not be NULL to begin with when lookup_nominal is called, and instead, an assert should be added and the NULL checks in that function removed.
stefandd added a commit to stefandd/ponyc that referenced this issue Jul 27, 2022
This attempts to fix ponylang#4130

This crash stems from the use of a private type as defined in another package when it was used as a default value in a method call. Since it has been decided to treat this as a bug instead of a missing error, this PR implements the fix suggested by @ergl, namely using `lookup_try()` instead of lookup() in call.c's `check_partial_function_call()` since the former can be configured to permit private types.

Fix to ponylang#4153
This is a simply change to `lookup_nominal()` in lookup.c that prevents a potential segfault by a dereferencing of `opt` (`typecheck_t* t = &opt->check;`) *before* `opt != NULL` was checked. As pointed out by @SeanTAllen, opt should not be NULL to begin with when `lookup_nominal()` is called, and instead, an assert should be added and the NULL checks in that function removed.

With this PR, the two examples below that crashed the compiler now both compile:

Original example:
```pony
// inside the "useful" package

primitive _PrivateDefault

actor Useful[A: Any val]
  fun tag config(value: (A | _PrivateDefault) = _PrivateDefault): Useful[A] => this

// inside "main"

use "useful"

primitive This
primitive That

type Stuff is (This | That)

actor Main
  new create(env: Env) =>
    let u = Useful[Stuff].config()
```

Minimal example:
```pony
// In the "lib" pacakge

primitive _Private

primitive Public
  fun apply[T](v: (T | _Private) = _Private): None => None

// In main
use lib = "lib"

actor Main
  new create(env: Env) =>
    let p = lib.Public.apply[U8]()
    env.out.print(p.string())
```
@stefandd
Copy link
Contributor Author

stefandd commented Jul 27, 2022

I don't think this should be closed.

I think we should remove the opt != NULL and could reasonably add an assert that it isn't null to the function.

I tried your suggestion exactly:
527864d#diff-44ccf14c19a6015e168dc16ed55bd0fdcfd0bf88351427bbbe76449a9e257495L56-L64%5D

It does compile and work fine in Release mode but failed on all platforms in Debug mode with an assertion error.

It is unclear to me why, if opt is NULL in Debug mode during that call, the next line dereferencing opt hasn't crashed the current compiler in Debug mode

typecheck_t* t = &opt->check;

@ponylang-main ponylang-main added the discuss during sync Should be discussed during an upcoming sync label Jul 27, 2022
@jemc
Copy link
Member

jemc commented Aug 2, 2022

We discussed this during sync - the next step would be to use a debugger like lldb to catch the assert fail and figure out what code path is calling a NULL opt, then fix it to not do that.

@stefandd
Copy link
Contributor Author

stefandd commented Aug 3, 2022

Here is the backtrace from the compiler built in Debug mode with this patch applied:

C:\ponyc\src\libponyc\type\lookup.c:60: lookup_nominal: Assertion `opt != NULL` failed.

Backtrace:
  (ponyint_assert_fail) [00007FF61CCBD5A6]
  (lookup_nominal) [00007FF61CC3AD7A]
  (lookup_base) [00007FF61CC3A950]
  (lookup_try) [00007FF61CC3A7CA]
  (add_special) [00007FF61CC4C4D0]
  (add_nominal) [00007FF61CC4D413]
  (add_type) [00007FF61CC4A279]
  (reachable_method) [00007FF61CC4A351]
  (reach) [00007FF61CC49943]
  (genexe) [00007FF61CC66064]
  (codegen) [00007FF61CBDE8E7]
  (generate_passes) [00007FF61CB6F37C]
  (compile_package) [00007FF61CB633C1]
  (main) [00007FF61CB63714]
  (__scrt_common_main_seh) [00007FF61F32DD48]
  (BaseThreadInitThunk) [00007FFAD69D7034]
  (RtlUserThreadStart) [00007FFAD7DE2651]

@stefandd
Copy link
Contributor Author

stefandd commented Aug 3, 2022

@jemc I think I found the offending call/ code path

static void add_special(reach_t* r, reach_type_t* t, ast_t* type,
const char* special, pass_opt_t* opt)
{
special = stringtab(special);
deferred_reification_t* find = lookup_try(NULL, NULL, type, special, false);

I think the newly added assertion is tripped by the call to lookup_try in add_special in reach.c where opt is set NULL!

From the design of the function it is likely it was written with the possibility of opt==NULL in mind, and removing the opt!=NULL checks and adding an assertion might not be the way forward.

However, if this call with opt NULL makes it way into lookup_nominal, why does it not currently crash the compiler (without the assertion) in either Debug or Release config when NULL is dereferend in the typecheck_t* t = &opt->check; line?

static deferred_reification_t* lookup_nominal(pass_opt_t* opt, ast_t* from,
ast_t* orig, ast_t* type, const char* name, bool errors, bool allow_private)
{
pony_assert(ast_id(type) == TK_NOMINAL);
typecheck_t* t = &opt->check;
ast_t* def = (ast_t*)ast_data(type);
AST_GET_CHILDREN(def, type_id, typeparams);
const char* type_name = ast_name(type_id);
if(is_name_private(type_name) && (from != NULL) && (opt != NULL)
&& !allow_private)
{

@jemc
Copy link
Member

jemc commented Aug 26, 2022

@stefandd - We already have the opt here in this scope (it's the last parameter received by add_special) so in my opinion there's no reason to pass NULL to lookup_try - we should pass the real opt.

@jemc
Copy link
Member

jemc commented Aug 26, 2022

As to the question of how it's not currently crashing the compiler, I don't know and couldn't say further on that without some investigation.

@SeanTAllen SeanTAllen removed the discuss during sync Should be discussed during an upcoming sync label Aug 30, 2022
stefandd added a commit to stefandd/ponyc that referenced this issue Sep 18, 2022
@stefandd
Copy link
Contributor Author

Unfortunately this definitely does seem more involved. In #4191, I tried first to only implement the suggestion (provide opt as parameter to the lookup_try call in add_special), but this failed to build, and then I tried to replace the other lookup_try(NULL, NULL,.. and lookup(NULL, NULL,.. calls in that module wherever opt was actually in the function scope, and this now still does not build.

Unsure how to proceed with this.

@ponylang-main ponylang-main added the discuss during sync Should be discussed during an upcoming sync label Sep 18, 2022
@jemc
Copy link
Member

jemc commented Sep 18, 2022

I tried first to only implement the suggestion (provide opt as parameter to the lookup_try call in add_special), but this failed to build

Can you give more detail - maybe a compiler error output?

@stefandd
Copy link
Contributor Author

It fails with the newly inserted assertion opt!=NULL being tripped. There is one lookup(NULL, NULL, t->ast, name) call left in reach.c, in add_method_name that could be causing this, but opt isn't in scope there, and I have no debugger access right now.

@jemc
Copy link
Member

jemc commented Sep 18, 2022

To make it work, we need to add opt everywhere - if it's not in scope, we need to make sure it is in scope by adding it to the function signature and passing it in from the caller.

The compiler was written with the assumption that opt is always non-null, and any code that is supplying null for it needs to be replaced with code that supplies non-null.

@stefandd
Copy link
Contributor Author

stefandd commented Sep 19, 2022

@jemc
I did that, replacing the remaining two calls to lookup..(NULL, ...) by bringing opt into scope. Now the compiler actually crashes because lookup in genmatch.c returns NULL in eq_param_type, crashing the AST_GET_CHILDREN line.

static ast_t* eq_param_type(compile_t* c, ast_t* pattern)
{
  ast_t* pattern_type = deferred_reify(c->frame->reify, ast_type(pattern),
    c->opt);
  deferred_reification_t* fun = lookup(c->opt, pattern, pattern_type, c->str_eq);

  AST_GET_CHILDREN(fun->ast, cap, id, typeparams, params, result, partial);

The cause for the returned NULL is in lookup_nominal
https://github.com/ponylang/ponyc/blame/24cf1efd9eab28d6ee2dbe8001919523c5a4864b/src/libponyc/type/lookup.c#L67-L77

Also, I know too little about the compiler to understand why ast_error(opt->check.errors, from, "can't lookup fields or methods " "on private types from other packages"); does not generate an error message at all ??? No textual output is generated and NULL is returned, causing the crash in eq_param_type.

Overall, for me it currently looks as if certain checks in lookup_nominal are suppressed by passing NULL for opt into the function. The attempt to now move into scope opt everywhere is likely breaking the assumption by whomever wrote these parts of the compiler that they can turn off certain checks by calling lookup functions with NULL.

@jemc
Copy link
Member

jemc commented Sep 19, 2022

why ast_error does not generate an error message at all ??? No textual output is generated

The ast_error appends an error into the error list held at opt->check.errors. It does not halt the compiler immediately nor print an error immediately - it simply adds an error to be printed at the end.

Overall, for me it currently looks as if certain checks in lookup_nominal are suppressed by passing NULL for opt into the function, and that the attempt to move into scope opt everywhere is breaking the assumption by whoever wrote these parts of the compiler, that they can turn off certain checks by calling lookup functions with NULL.

I think your assessment is correct, but this was the wrong approach by whomever wrote this code - they should have probably passed in an extra flag argument called bool suppress_errors or something like that. They definitely shouldn't have made the opt argument nullable, because it's used for so much more than just accumulating errors, and because it's not meant to be nullable anywhere else.

@stefandd
Copy link
Contributor Author

stefandd commented Sep 20, 2022

It seems then that resolving this is a bit of a rewrite of certain parts of the compiler which is a task better left to one of the core developers.

To summarize the situation:

  • in lookup_nominal (code of the first post): if an assert(opt != NULL) is inserted, it is tripped by quite a few calls of the style lookup..(NULL, NULL, ..) in reach.c and genmatch.c. This means that the typecheck_t* t = &opt->check; line in lookup_nominal accesses invalid memory in such cases (since the NULL object is not an opt struct). Despite that, it does currently not lead to any reported crashes. There are various (opt != NULL) guards in that function that were likely inserted to allow such naughty lookup..(NULL, NULL, ..) calls.

  • a remedy seemed to be to bring opt into scope/and use it to replace all lookup..(NULL, NULL, ..) calls with lookup..(opt, NULL, ..) calls and to remove the (opt != NULL) guards in lookup_nominal. This however lead to more trouble (crashes) since the function was apparently written with such opt == NULL use in mind.

One way forward would simply be to slightly alter lookup_nominal to prevent any invalid memory access by e.g. moving the declaration typecheck_t* t = &opt->check; within the guarded if scopes.

Another way forward is to continue to make the changes as suggested by @jemc and @SeanTAllen which would however need significant modifications to lookup_nominal in order to prevent spurious error messages to be generated (which were previously guarded against with the (opt != NULL) guards.

Please advise on how to proceed!

@jemc
Copy link
Member

jemc commented Sep 20, 2022

My suggested approach is still the same as in my previous comment - that the code should be updated in whatever was is necessary to make it no longer assume that opt can sometimes be null - probably by passing in a new flag argument indicating whether errors should be suppressed or not (rather than treating a null opt as an indication to suppress errors).

@stefandd
Copy link
Contributor Author

stefandd commented Sep 20, 2022

Ok, I can do this but would have to assume that all guarding (opt != NULL) in lookup_nominal were meant to serve as an error suppression flag and can instead by replaced by !flagXXX -- safe assumption?

@jemc
Copy link
Member

jemc commented Sep 20, 2022

Yes, I think so.

@SeanTAllen SeanTAllen removed the discuss during sync Should be discussed during an upcoming sync label Oct 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working good first issue Good for newcomers help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

5 participants