C/C++ header files randomly go missing, then reappear #1735

Closed
Warblefly opened this issue Feb 28, 2017 · 17 comments

@Warblefly

Warblefly commented Feb 28, 2017

On Build 15042, while compiling in parallel using the MinGW version of GCC with make -j 8, header files are randomly unavailable to the compiler (pre-processor?). Sometimes, the compiler finds them in the wrong path then, upon trying to use them, cannot open them; and sometimes, they simply don't open from their correct place.

In previous Windows Insider builds up to and including 15019, my compilation ran perfectly. After this, another issue prevented me from using the Linux subsystem.

Here is an example of the output:

make[2]: stat: /mnt/e/Users/john/Documents/MultimediaTools-mingw-w64/sandbox/mingw-w64-x86_64/include/boost/config/stdlib/libcpp.hpp: Invalid argument
make[2]: *** No rule to make target '/mnt/e/Users/john/Documents/MultimediaTools-mingw-w64/sandbox/mingw-w64-x86_64/include/boost/config/stdlib/libcpp.hpp', needed by 'samples/cpp/CMakeFiles/tutorial_non_linear_svms.dir/tutorial_code/ml/non_linear_svms/non_linear_svms.cpp.obj'. Stop.
CMakeFiles/Makefile2:3772: recipe for target 'samples/cpp/CMakeFiles/tutorial_non_linear_svms.dir/all' failed
make[1]: *** [samples/cpp/CMakeFiles/tutorial_non_linear_svms.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....

You'll notice that the pre-processor prints "Invalid argument" (rather than something like "file not found"), and that the error seems to be returned from a call to stat(). The file referred to above most certainly exists at the path where it is named, and can be opened and edited by other programs, e.g. vim.
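To tell "Invalid argument" (EINVAL) apart from "file not found" (ENOENT) outside of a build, one option is to stat the suspect headers repeatedly and tally the errno names. The following is only a minimal sketch: the helper names `probe` and `probe_many` and the `rounds` parameter are hypothetical, and nothing here reproduces the failure itself.

```python
import errno
import os


def probe(path):
    """Return 'ok' if stat() succeeds, else the symbolic errno name."""
    try:
        os.stat(path)
    except OSError as exc:
        # errno.errorcode maps numbers to names such as 'ENOENT' or 'EINVAL'.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return "ok"


def probe_many(paths, rounds=100):
    """Stat each path `rounds` times and count the outcomes per path."""
    results = {}
    for path in paths:
        counts = {}
        for _ in range(rounds):
            outcome = probe(path)
            counts[outcome] = counts.get(outcome, 0) + 1
        results[path] = counts
    return results
```

Running `probe_many` over a handful of header paths while a parallel build is in progress would show whether any of them intermittently report `EINVAL` rather than `ok`.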

Other large builds that fail include cross-compiling Qt 5.8 (similar problem) and, also, FFmpeg likewise failed. When I build with make -j 1, failures still occur occasionally. Re-starting the compilation makes the problem go away temporarily, until another header file goes missing further down the Makefile.

If you'd like me to help with strace outputs, in case this is a new bug, I'll try: but the failures are random so I can't guarantee they'll be repeatable. They also occur usually after the build has been running for some minutes.

The rest of the system, in use as an editing machine for Avid Media Composer, is working just fine.

@aseering
Contributor

Hi @Warblefly -- hm, that's interesting. Just to confirm, though -- you say that you're using the MinGW version of GCC? Are you using the new beta Bash on Windows / Windows Subsystem for Linux feature as well? The two are not the same; they are different ways to achieve the same end.

@Warblefly
Author

Thank you for asking. I start a new console using the "Bash on Ubuntu on Windows" shortcut; then, within that window, I compiled my own x86_64-w64-mingw32-gcc (and all other tools). It is that gcc that fails to find a header one moment, then finds it the next.

The filesystem this is running on is a mounted NTFS system, as the pathnames may make clear.

Since Bash on Ubuntu on Windows came in, my large parallel compile cycles have all worked perfectly, until the build that followed 15019.

It's just failed again: but at a different point in the compilation. If I run this again, the compilation will pass this point, then fail somewhere else. It's that "invalid argument" that gets to me. And, in this particular case, the file isn't even where the error print-out says it is. In fact, it's in the top directory of the include tree. I can't work out why gcc is finding it in the wrong place; then, of course, unable to open it for pre-processing.

In file included from /mnt/e/Users/john/Documents/MultimediaTools-mingw-w64/sandbox/mingw-w64-x86_64/include/minwindef.h:163:0,
                 from /mnt/e/Users/john/Documents/MultimediaTools-mingw-w64/sandbox/mingw-w64-x86_64/include/windef.h:8,
                 from /mnt/e/Users/john/Documents/MultimediaTools-mingw-w64/sandbox/mingw-w64-x86_64/include/windows.h:69,
                 from /mnt/e/Users/john/Documents/MultimediaTools-mingw-w64/sandbox/mingw-w64-x86_64/include/winsock2.h:23,
                 from libavformat/os_support.h:112,
                 from libavformat/internal.h:28,
                 from libavformat/rawdec.c:24:
/mnt/e/Users/john/Documents/MultimediaTools-mingw-w64/sandbox/mingw-w64-x86_64/include/winnt.h:628:21: fatal error: /mnt/e/Users/john/Documents/MultimediaTools-mingw-w64/sandbox/mingw-w64-x86_64/x86_64-w64-mingw32/include/bs2b/guiddef.h: Invalid argument
 #include <guiddef.h>
                     ^
compilation terminated.
common.mak:60: recipe for target 'libavformat/rawdec.o' failed
make: *** [libavformat/rawdec.o] Error 1
make: *** Waiting for unfinished jobs....
Build failure. Please see error messages above.
john@JW-AVID-2016:/mnt/e/Users/john/Documents/MultimediaTools-mingw-w64$ 

@Warblefly
Author

Yep. It's compiled the previous file, where there was an error, perfectly. But then it went hunting down the glib-2.0 include path. That's not even a path I give to the compiler. Still returns "invalid argument" not "file not found". Odd.


/mnt/e/Users/john/Documents/MultimediaTools-mingw-w64/sandbox/mingw-w64-x86_64/bin/x86_64-w64-mingw32-gcc -I. -I./ -D_ISOC99_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -U__STRICT_ANSI__ -D__USE_MINGW_ANSI_STDIO=1 -D__printf__=__gnu_printf__ -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600 -DPIC -DOPJ_STATIC -DZLIB_CONST -DHAVE_AV_CONFIG_H -std=c11 -fomit-frame-pointer -pthread -I/mnt/e/Users/john/Documents/MultimediaTools-mingw-w64/sandbox/mingw-w64-x86_64/x86_64-w64-mingw32/include -Iinclude/ -mms-bitfiIn file included from /mnt/e/Users/john/Documents/MultimediaTools-mingw-w64/sandbox/mingw-w64-x86_64/include/winsock2.h:23:0,
                 from libavformat/os_support.h:112,
                 from libavformat/internal.h:28,
                 from libavformat/redspark.c:26:
/mnt/e/Users/john/Documents/MultimediaTools-mingw-w64/sandbox/mingw-w64-x86_64/include/windows.h:102:22: fatal error: /mnt/e/Users/john/Documents/MultimediaTools-mingw-w64/sandbox/mingw-w64-x86_64/x86_64-w64-mingw32/lib/glib-2.0/include/winspool.h: Invalid argument
 #include <winspool.h>

@aseering
Contributor

aseering commented Mar 1, 2017

Hm... I'm not sure what is going on. I haven't hit this issue in probably 10 or 20 wall-clock-hours of compiling stuff in DrvFs on build 15042 so far. But I'm not using mingw (clever setup, by the way!); I'm using Ubuntu's stock gcc. Also, I have hit some slightly different weirdness as documented in #1729 .

Maybe there's a timing hole of some sort in WSL's logic for setting up new process environments? Just a guess.

(I'm not a WSL dev myself; I was kinda hoping that one would drop in with some thoughts :-) )

@Warblefly
Author

It's very odd. It happens only at times of greatest stress: big suites, e.g. FFmpeg, DJV viewer, Qt 5.8. Most of the rest of the compilation progresses smoothly. I have never seen this trouble before in months of compiling this huge suite almost daily. And because the failure comes at seemingly random moments, I'm at a loss to diagnose it. Maybe it's time for a strace.

@Warblefly
Author

Come to think of it, I have had other occasional fails with "invalid argument". These cases have been where the argument appears to be a directory, not a file (then quickly reappears as a file) — which is nonsense for a C or C++ header file and, hence, quite reasonably causes the compiler to barf saying "invalid" rather than "missing".

@aseering
Contributor

aseering commented Mar 1, 2017

Hm... I wonder if this might be an IO race condition?

My test machine has a very fast disk (NVM SSD) and a less-fast CPU (dual-core mobile i7). @Warblefly , do you have a slower disk and/or faster CPU? Just a thought. If so, that might help explain why you see so many more failures than I do.

@Warblefly
Author

Yes. An i7 at 4.5GHz with hyperthreading, but a traditional HDD on a 6Gbit/s SATA connection as the data drive, which is accessed through DrvFs. The system boots off an SSD (which would be accessed through VolFs were I to move the files to /home/john rather than /mnt/e...), though the SSD is also on a 6Gbit/s interface.

@SvenGroot
Member

Hi @Warblefly,

Could you share repro steps for this problem, so we can try to diagnose what is happening? Make sure to include how you set up your build environment to build and use the MinGW gcc.

Thanks,
Sven

@Warblefly
Author

Certainly. I'm just going to run an overnight build attempt of my suite on Insider Preview 15046, which arrived today (been at a trade show today). Will write more in the morning (UK time), and send links to source code and a strace, if the problem persists.

@Warblefly
Author

I am very sorry to report that, on Insider Preview 15046, my entire build proceeds without error. No files go missing or change their types; nothing fails, even under stressful multi-process compilation. Of course, that's actually very good news! Something has been fixed, either in WSL or in Windows itself.

My repo: https://github.com/Warblefly/MultimediaTools-mingw-w64#readme

@sunilmut
Member

@Warblefly - Closing it out since the issue is resolved for you now. If I misread your post and the issue is not resolved, please let us know and we will reopen the issue. Thanks for using WSL and the feedback!

@Warblefly
Author

Warblefly commented Mar 24, 2017

MY ERROR: the directory 'build', shown below, was open in another terminal after I had tried to delete it here. This is a manifestation of the way Linux subsystem on Windows is implemented; as soon as I took the other shell window out of the 'build' directory, the odd entry went away.

(ORIGINAL REPORT)
For the first time in a long time, this error has reappeared. Here's the output from ls -l — you'll see that 'build' (created a few minutes earlier) has stuck around in an unfortunate state. This is on Windows Insider Build 15063, and no third-party antivirus.

ls: cannot access 'build': No such file or directory
total 78688
drwxrwxrwx 0 root root  4096 Mar 24 10:31 3rdparty
drwxrwxrwx 0 root root  4096 Mar 24 10:31 apps
d????????? ? ?    ?        ?            ? build
drwxrwxrwx 0 root root  4096 Mar 24 10:31 cmake
-rwxrwxrwx 1 root root 51021 Mar 24 10:31 CMakeLists.txt
-rwxrwxrwx 1 root root 51002 Mar 24 10:31 CMakeLists.txt.orig
-rwxrwxrwx 1 root root   191 Mar 24 10:31 CONTRIBUTING.md
drwxrwxrwx 0 root root  4096 Mar 24 10:31 data
drwxrwxrwx 0 root root  4096 Mar 24 10:31 doc
drwxrwxrwx 0 root root  4096 Mar 24 10:31 include
-rwxrwxrwx 1 root root   509 Mar 24 10:31 index.rst
-rwxrwxrwx 1 root root  1772 Mar 24 10:31 LICENSE
-rwxrwxrwx 1 root root  1025 Mar 24 11:02 lsout.txt
drwxrwxrwx 0 root root  4096 Mar 24 10:31 modules
-rwxrwxrwx 1 root root   437 Mar 24 10:31 opencv-boost-thread.patch
-rwxrwxrwx 1 root root     0 Mar 24 10:31 opencv-boost-thread.patch.done
drwxrwxrwx 0 root root  4096 Mar 24 10:31 platforms
-rwxrwxrwx 1 root root   566 Mar 24 10:31 README.md
drwxrwxrwx 0 root root  4096 Mar 24 10:31 samples
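An entry shown as `d?????????` is one whose name the directory listing returns but whose metadata cannot be read. A rough way to find such entries from a script is sketched below. The function name `unstatable_entries` and its `stat_fn` parameter are hypothetical; `stat_fn` exists only so that the behaviour can be exercised without a broken directory to hand.

```python
import errno
import os


def unstatable_entries(directory, stat_fn=os.lstat):
    """Return (name, errno_name) for entries that are listed but cannot be stat'ed."""
    broken = []
    for name in sorted(os.listdir(directory)):
        try:
            # lstat mirrors what `ls -l` does: it does not follow symlinks.
            stat_fn(os.path.join(directory, name))
        except OSError as exc:
            broken.append((name, errno.errorcode.get(exc.errno, str(exc.errno))))
    return broken
```

Calling `unstatable_entries(".")` in a directory in the state shown above would be expected to report the `build` entry together with the errno name, and to return an empty list once the other shell has left that directory.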

@kwaegel

kwaegel commented Nov 30, 2017

This just started happening for me in build 16299_rs3_release, using the stock compiler GCC 5.4.0.

@therealkenc
Collaborator

If you are on 16299 then you're seeing #2448, which is different (confusingly) to this one reported back in February here. Try out the work-around in that issue or try Insider 17035 or later.

@kwaegel

kwaegel commented Nov 30, 2017

Workaround fixes the build issue, but causes the entire desktop to lock up after compiling ~40 files at -j8. Looks like I'll need to wait for the official Fall Creators update patch, since I can't really run an Insider release at work.

In the meantime, it looks like manually re-running make -j1 multiple times will eventually build everything; it just takes 10-15x as long.
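Re-running the build by hand can be automated with a small retry loop. This is only a sketch of that stopgap: the function name `run_until_success` and the `max_attempts` limit are assumptions, and it relies on each re-run of make picking up where the previous one stopped.

```python
import subprocess


def run_until_success(command, max_attempts=20):
    """Re-run `command` until it exits with status 0.

    Returns the number of attempts used, or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        # check=False: a non-zero exit status is expected and handled here.
        result = subprocess.run(command, check=False)
        if result.returncode == 0:
            return attempt
    return None
```

For example, `run_until_success(["make", "-j1"])` keeps invoking make until a run finishes cleanly or the attempt limit is reached. A genuine compile error would also be retried, so the limit should stay small.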

@therealkenc
Collaborator

If you are holding out, consider compiling your stuff on LxFs (/home). It is in general a better experience, even discounting the present bug.
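A build script could warn when its working tree sits on a Windows drive rather than under /home. The check below is a heuristic sketch that assumes DrvFs mounts appear as /mnt/<single drive letter>, as in the /mnt/e paths earlier in this thread; the function name `looks_like_drvfs` is hypothetical, and relative paths are not resolved.

```python
import posixpath
import re

# Assumption: Windows drives are mounted at /mnt/<one lowercase letter>.
_DRVFS_PATTERN = re.compile(r"/mnt/[a-z](/|$)")


def looks_like_drvfs(path):
    """Return True if an absolute path appears to live on a mounted Windows drive."""
    normalized = posixpath.normpath(path)
    return _DRVFS_PATTERN.match(normalized) is not None
```

With this, `looks_like_drvfs("/mnt/e/Users/john/Documents")` is true and `looks_like_drvfs("/home/john")` is false.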
