Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

build: Add Data Relocation and Protection flags (-z relro -z now) to address potential security issue #20122

Closed
tingshao opened this issue Apr 18, 2018 · 9 comments
Labels
build Issues and PRs related to build files or the CI.

Comments

@tingshao
Copy link
Contributor

tingshao commented Apr 18, 2018

  • Version: All
  • Platform: All supported platforms
  • Subsystem:

This issue is created separately to track the Data Relocation and Protection flag from the issue #18671 to make it independent and more clear to track as #18671 contains several compile flags.

The flag is for linker: -z relro -z now

The RELRO flag was a mitigation technique to harden the data sections of an ELF binary/process and could prevent the modification of GOT entries of a process. Thus able to make node more secure.

I made some investigation and tests based on the flag, It passed the builtin functional tests and have no obvious performance impact on the builtin benchmark tests.

I also made a performance test by loading some native add-ons (all enabled the -z relro -z now flag), and the result shows that the time spend on loading add-ons only increased 0.54%. Below is my testing steps:

  1. My machine is:
    cpu: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz with 8 cores
    memory: 32G
    os: Ubuntu 16.04
  2. Get the code of node-librealsense, node-glfw-3, and opencv4nodejs. Then made minor changes to enable -z relro -z now on them. Opencv4nodejs and node-librealsense totally exported about 600 methods and total library size is about 4MB.
  3. time the duration of below command with relro enabled node and 3 relro enabled add-ons as TR1 by:
    time /path/to/relro/enabled/node -e "require('node-librealsense');require('node-glfw-3');require('opencv4nodejs');
  4. time the duration of below command with relro enabled node and only 1 relro enabled add-on as TR2 by:
    time /path/to/relro/enabled/node -e "require('node-glfw-3')
  5. time the duration of below command with default built node and 3 default built add-ons as TD1 by:
    time /path/to/default/node -e "require('node-librealsense');require('node-glfw-3');require('opencv4nodejs');
  6. time the duration of below command with default built node and only 1 default built add-on as TD2 by:
    time /path/to/default/node -e "require('node-glfw-3')
  7. run the above test 1000 times, extract the real time and get the average result:
    TR1 (node-librealsense+opencv4nodejs+node-glfw-3): 140.23 ms
    TR2 (node-glfw-3): 45.804 ms
    TD1 (node-librealsense+opencv4nodejs+node-glfw-3): 139.331 ms
    TD2 (node-glfw-3): 45.415 ms

So the time spent on loading node-librealsense+opencv4nodejs is:
when relro enabled: DR1 = TR1 - TR2 = 94.426 ms
when relro not enabled(default built): DD1 = TD1 - TD2 = 93.916 ms
Time increased rate for loading node-librealsense+opencv4nodejs is: (DR1 - DD1)/DD1 = 0.0054. It's 0.54%.

Besides, I compared the time of running: time /path/to/node -e "". And result shows a 0.68% time increase.

Based on the above result, it seems to me that adding '-z relro -z now' doesn't impact the performance a lot and we can get more security, what do you think guys?

@bnoordhuis bnoordhuis added the build Issues and PRs related to build files or the CI. label Apr 18, 2018
@bnoordhuis
Copy link
Member

I think the problem with -z now was (and is) that it breaks add-ons with missing symbols. I don't know how common that is but it will almost inevitably break some add-ons.

We could either:

  1. hack the build so it's only turned on for node itself but not add-ons, or
  2. decide the extra security is worth some ecosystem fallout

(2) can be checked up to a point through citgm.

@tingshao
Copy link
Contributor Author

Yes, based on the consideration, It seems option 1 that doesn't affect add-ons is better.

@bnoordhuis
Copy link
Member

We could do (1) first and (2) in a major release (say, Node.js 11) since people recompile their add-ons anyway after a major upgrade.

@tingshao
Copy link
Contributor Author

I know some but still not quite clear about how -z now would cause add-ons with missing symbols fail to load, building issues some times are quite confusing. 😀

I could figure out the following 2 cases causing missing symbols, and please let me know if I missed some others.

Case 1: Make a shared library with a symbol labeled undefined in the symbol table. This could be done by the following code, and the UndefinedFunc() would be labeled undefined in symbol table.
`
#include <stdio.h>

typedef void (*FuncPtr)();
void UndefinedFunc(); // with no definition and would be marked undefined
void DefinedFunc(FuncPtr func) {}
void DefinedFunc2() {
DefinedFunc(UndefinedFunc); //reference the UndefinedFunc to reserve it in symbol table
}
`
And we can check the symbol table by ‘objdump -T libmyso.so’, it shows:

0000000000000000 D UND 0000000000000000 UndefinedFunc

When we dlopen the generated shared library, it results in an error saying undefined symbol: UndefinedFunc. If we load it implicitly, the linker would report error during linking.

Case 2: If a add-on was compiled against a new version library but runs with an old version library with less symbols. If lazy binding is used, the add-on may work well if those missing symbols are never called, But when -z now flag is open, it would cause loading error. I have verified it using a simple example.

Let’s summary it,

  1. Case 1 seems not relevant with the -z now flag, as before we use this flag it already failed.
  2. Case 2 seems relevant, and it seems a deployment issue and -z now would make it even worth.

Is case 2 the issue you mentioned? Thanks.

@bnoordhuis
Copy link
Member

(1) and (2) both. A variation of (1) where the missing symbol is external weak instead of strong won't be rejected by the dynamic linker, it's resolved on first use. (Sometimes. It's complicated.)

My hunch is that it's predominantly a problem with add-ons that link against shared libraries.

@tingshao
Copy link
Contributor Author

I investigated the weak symbol, as I never used it before :)
Then I created a shared lib with one external weak symbol, and loaded the lib using a simple main app. Yes, it didn’t fail until when the symbol was actually called. I then added the -z now flag, and it turned out that the result was the same: It loaded successfully and only failed when I actually call the symbol.

It seemed that the flag has no impact on weak symbol resolving?

Here is my code:
The shared library code so.c:

#include <stdio.h>
void __attribute__((weak)) UndefinedFunc(); // with no definition
void DefinedFunc() {
  printf("DefinedFunc\n");
  UndefinedFunc();
}

The main.c code:

#include <dlfcn.h>
#include <stdio.h>

extern void DefinedFunc();
typedef void (*FuncType)();

int main(int argc, char* argv[]) {
  void* handle = dlopen("./libmyso.so", RTLD_LAZY);
  if (!handle) {
    printf("load so failed!\n");
    return -1;
  }

  void* func_addr = (FuncType)dlsym(handle, "DefinedFunc");
  if (!func_addr) {
    printf("load DefinedFunc() error: %s\n",dlerror());
    return -1;
  } else {
    printf("load DefinedFunc() success\n");
  }

  FuncType func = (FuncType)func_addr;
  if (argc >1)
    func();
}

My compile commands:

gcc -shared -fPIC -Wl,-z,relro,-z,now so.c  -o libmyso.so
gcc main.c -Wl,-z,relro,-z,now  -L. -lmyso  -ldl -o out

The output:

tshao@tshao-dev:~/tmp$ ./out
load DefinedFunc() success
tshao@tshao-dev:~/tmp$ ./out 1
load DefinedFunc() success
DefinedFunc
Segmentation fault (core dumped)

When I ran ./out, the program only loaded the library, but didn't call the weak symbol 'UndefinedFunc' and the program didn't fail. When I ran ./out 1, it failed.

The UndefinedFunc() in libmy.so is a external undefined weak symbol:
0000000000000000 w D *UND* 0000000000000000 UndefinedFunc

@bnoordhuis
Copy link
Member

In your example, UndefinedFunc is weak data (note the w and D - it's essentially a function pointer), not a weak code stub. Circling back to #20122 (comment), you want to do (2) targeting master?

@tingshao
Copy link
Contributor Author

tingshao commented May 2, 2018

Thank you for the note. Above was just a try on the impacts. It seems quite complex than I originally thought. And I also think option (1) is the better choice and will focus on that. I will prepare a PR.

tingshao added a commit to tingshao/node that referenced this issue May 4, 2018
These flags could make some sections and the GOT entries of node
process read only to avoid being modified after dynamic linking is
done, thus the security could be enhanced.

Fixes: nodejs#20122
@tingshao
Copy link
Contributor Author

tingshao commented May 4, 2018

I made a PR #20513 for this issue, PTAL, thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build Issues and PRs related to build files or the CI.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants