build: Add Data Relocation and Protection flags (-z relro -z now) to address potential security issue #20122

tingshao · 2018-04-18T08:33:44Z

Version: All
Platform: All supported platforms
Subsystem:

This issue is created separately to track the Data Relocation and Protection flag from the issue #18671 to make it independent and more clear to track as #18671 contains several compile flags.

The flag is for linker: -z relro -z now

The RELRO flag was a mitigation technique to harden the data sections of an ELF binary/process and could prevent the modification of GOT entries of a process. Thus able to make node more secure.

I made some investigation and tests based on the flag, It passed the builtin functional tests and have no obvious performance impact on the builtin benchmark tests.

I also made a performance test by loading some native add-ons (all enabled the -z relro -z now flag), and the result shows that the time spend on loading add-ons only increased 0.54%. Below is my testing steps:

My machine is:
cpu: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz with 8 cores
memory: 32G
os: Ubuntu 16.04
Get the code of node-librealsense, node-glfw-3, and opencv4nodejs. Then made minor changes to enable -z relro -z now on them. Opencv4nodejs and node-librealsense totally exported about 600 methods and total library size is about 4MB.
time the duration of below command with relro enabled node and 3 relro enabled add-ons as TR1 by:
time /path/to/relro/enabled/node -e "require('node-librealsense');require('node-glfw-3');require('opencv4nodejs');
time the duration of below command with relro enabled node and only 1 relro enabled add-on as TR2 by:
time /path/to/relro/enabled/node -e "require('node-glfw-3')
time the duration of below command with default built node and 3 default built add-ons as TD1 by:
time /path/to/default/node -e "require('node-librealsense');require('node-glfw-3');require('opencv4nodejs');
time the duration of below command with default built node and only 1 default built add-on as TD2 by:
time /path/to/default/node -e "require('node-glfw-3')
run the above test 1000 times, extract the real time and get the average result:
TR1 (node-librealsense+opencv4nodejs+node-glfw-3): 140.23 ms
TR2 (node-glfw-3): 45.804 ms
TD1 (node-librealsense+opencv4nodejs+node-glfw-3): 139.331 ms
TD2 (node-glfw-3): 45.415 ms

So the time spent on loading node-librealsense+opencv4nodejs is:
when relro enabled: DR1 = TR1 - TR2 = 94.426 ms
when relro not enabled(default built): DD1 = TD1 - TD2 = 93.916 ms
Time increased rate for loading node-librealsense+opencv4nodejs is: (DR1 - DD1)/DD1 = 0.0054. It's 0.54%.

Besides, I compared the time of running: time /path/to/node -e "". And result shows a 0.68% time increase.

Based on the above result, it seems to me that adding '-z relro -z now' doesn't impact the performance a lot and we can get more security, what do you think guys?

The text was updated successfully, but these errors were encountered:

bnoordhuis · 2018-04-18T08:46:21Z

I think the problem with -z now was (and is) that it breaks add-ons with missing symbols. I don't know how common that is but it will almost inevitably break some add-ons.

We could either:

hack the build so it's only turned on for node itself but not add-ons, or
decide the extra security is worth some ecosystem fallout

(2) can be checked up to a point through citgm.

tingshao · 2018-04-20T09:32:22Z

Yes, based on the consideration, It seems option 1 that doesn't affect add-ons is better.

bnoordhuis · 2018-04-20T10:07:52Z

We could do (1) first and (2) in a major release (say, Node.js 11) since people recompile their add-ons anyway after a major upgrade.

tingshao · 2018-04-26T03:36:54Z

I know some but still not quite clear about how -z now would cause add-ons with missing symbols fail to load, building issues some times are quite confusing. 😀

I could figure out the following 2 cases causing missing symbols, and please let me know if I missed some others.

Case 1: Make a shared library with a symbol labeled undefined in the symbol table. This could be done by the following code, and the UndefinedFunc() would be labeled undefined in symbol table.
`
#include <stdio.h>

typedef void (*FuncPtr)();
void UndefinedFunc(); // with no definition and would be marked undefined
void DefinedFunc(FuncPtr func) {}
void DefinedFunc2() {
DefinedFunc(UndefinedFunc); //reference the UndefinedFunc to reserve it in symbol table
}
`
And we can check the symbol table by ‘objdump -T libmyso.so’, it shows:

0000000000000000 D UND 0000000000000000 UndefinedFunc

When we dlopen the generated shared library, it results in an error saying undefined symbol: UndefinedFunc. If we load it implicitly, the linker would report error during linking.

Case 2: If a add-on was compiled against a new version library but runs with an old version library with less symbols. If lazy binding is used, the add-on may work well if those missing symbols are never called, But when -z now flag is open, it would cause loading error. I have verified it using a simple example.

Let’s summary it,

Case 1 seems not relevant with the -z now flag, as before we use this flag it already failed.
Case 2 seems relevant, and it seems a deployment issue and -z now would make it even worth.

Is case 2 the issue you mentioned? Thanks.

bnoordhuis · 2018-04-26T07:58:53Z

(1) and (2) both. A variation of (1) where the missing symbol is external weak instead of strong won't be rejected by the dynamic linker, it's resolved on first use. (Sometimes. It's complicated.)

My hunch is that it's predominantly a problem with add-ons that link against shared libraries.

tingshao · 2018-04-28T06:04:32Z

I investigated the weak symbol, as I never used it before :)
Then I created a shared lib with one external weak symbol, and loaded the lib using a simple main app. Yes, it didn’t fail until when the symbol was actually called. I then added the -z now flag, and it turned out that the result was the same: It loaded successfully and only failed when I actually call the symbol.

It seemed that the flag has no impact on weak symbol resolving?

Here is my code:
The shared library code so.c:

#include <stdio.h>
void __attribute__((weak)) UndefinedFunc(); // with no definition
void DefinedFunc() {
  printf("DefinedFunc\n");
  UndefinedFunc();
}

The main.c code:

#include <dlfcn.h>
#include <stdio.h>

extern void DefinedFunc();
typedef void (*FuncType)();

int main(int argc, char* argv[]) {
  void* handle = dlopen("./libmyso.so", RTLD_LAZY);
  if (!handle) {
    printf("load so failed!\n");
    return -1;
  }

  void* func_addr = (FuncType)dlsym(handle, "DefinedFunc");
  if (!func_addr) {
    printf("load DefinedFunc() error: %s\n",dlerror());
    return -1;
  } else {
    printf("load DefinedFunc() success\n");
  }

  FuncType func = (FuncType)func_addr;
  if (argc >1)
    func();
}

My compile commands:

gcc -shared -fPIC -Wl,-z,relro,-z,now so.c  -o libmyso.so
gcc main.c -Wl,-z,relro,-z,now  -L. -lmyso  -ldl -o out

The output:

tshao@tshao-dev:~/tmp$ ./out
load DefinedFunc() success
tshao@tshao-dev:~/tmp$ ./out 1
load DefinedFunc() success
DefinedFunc
Segmentation fault (core dumped)

When I ran ./out, the program only loaded the library, but didn't call the weak symbol 'UndefinedFunc' and the program didn't fail. When I ran ./out 1, it failed.

The UndefinedFunc() in libmy.so is a external undefined weak symbol:
0000000000000000 w D *UND* 0000000000000000 UndefinedFunc

bnoordhuis · 2018-04-30T09:43:35Z

In your example, UndefinedFunc is weak data (note the w and D - it's essentially a function pointer), not a weak code stub. Circling back to #20122 (comment), you want to do (2) targeting master?

tingshao · 2018-05-02T03:54:05Z

Thank you for the note. Above was just a try on the impacts. It seems quite complex than I originally thought. And I also think option (1) is the better choice and will focus on that. I will prepare a PR.

These flags could make some sections and the GOT entries of node process read only to avoid being modified after dynamic linking is done, thus the security could be enhanced. Fixes: nodejs#20122

tingshao · 2018-05-04T06:12:59Z

I made a PR #20513 for this issue, PTAL, thanks.

tingshao mentioned this issue Apr 18, 2018

build: Utilize Modern Compiler Flags to Address Potential Security Issues #18671

Closed

bnoordhuis added the build Issues and PRs related to build files or the CI. label Apr 18, 2018

tingshao mentioned this issue May 4, 2018

build: add '-z relro -z now' linker flags #20513

Closed

2 tasks

addaleax closed this as completed in 2d4dd10 May 14, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

build: Add Data Relocation and Protection flags (-z relro -z now) to address potential security issue #20122

build: Add Data Relocation and Protection flags (-z relro -z now) to address potential security issue #20122

tingshao commented Apr 18, 2018 •

edited

Loading

bnoordhuis commented Apr 18, 2018

tingshao commented Apr 20, 2018

bnoordhuis commented Apr 20, 2018

tingshao commented Apr 26, 2018

bnoordhuis commented Apr 26, 2018

tingshao commented Apr 28, 2018

bnoordhuis commented Apr 30, 2018

tingshao commented May 2, 2018 •

edited

Loading

tingshao commented May 4, 2018

build: Add Data Relocation and Protection flags (-z relro -z now) to address potential security issue #20122

build: Add Data Relocation and Protection flags (-z relro -z now) to address potential security issue #20122

Comments

tingshao commented Apr 18, 2018 • edited Loading

bnoordhuis commented Apr 18, 2018

tingshao commented Apr 20, 2018

bnoordhuis commented Apr 20, 2018

tingshao commented Apr 26, 2018

bnoordhuis commented Apr 26, 2018

tingshao commented Apr 28, 2018

bnoordhuis commented Apr 30, 2018

tingshao commented May 2, 2018 • edited Loading

tingshao commented May 4, 2018

tingshao commented Apr 18, 2018 •

edited

Loading

tingshao commented May 2, 2018 •

edited

Loading