Tests fail on Debian stretch with beignet #231
Thanks for reporting. Could you give a bit more info though? Which device are you testing on? And can you post the results of the failing test runs? |
Lenovo Thinkpad X230. How do I get those results? Is it just the terminal output? |
Would be helpful to run CMake just runs the test executables, but stores the output somewhere else. You can probably find that in subfolders on disk. Consult the CMake/CTest documentation to get more info. Otherwise, you can just manually run the test executables, e.g. |
|
Can CLBlast fall back to usual OpenBLAS for unsupported operations by the way? |
OK thanks, probably
What do you mean exactly? And which routines? Are you using the Netlib API by the way instead of the OpenCL API? That's not recommended for speed as you might already know, definitely not on small GPUs such as the Intel GPU you have. But anyway, let's fix the tests first. One thing you can try is to run the tuners (see README), because perhaps the defaults are not suitable for your particular GPU? I've tested on other Intel GPUs with Beignet with success. |
I see messages in the tests about failed operations because of missing GPU features ("Unsupported precision", such as double or half floats). With the Netlib API I expect all GPU details to be fully abstracted from the user application, but this dependence on GPU features, with exceptions when something is missing, is an abstraction leak. A proper way would be to fall back to a CPU implementation if the GPU can't do something. I don't know which exact routines (never programmed for BLAS so far), but I expect CLBlast with the Netlib API to be a drop-in replacement for OpenBLAS (or something like that). It may even include
for i in clblast_test_*; do echo $i; DISPLAY= ./$i; echo $?; done &> log ? |
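For anyone who would rather collect the exit codes programmatically, the shell loop above can be sketched in Python. This is only an illustration: the function name `run_test_binaries` is made up here, and the sketch assumes the test executables sit in one directory and match `clblast_test_*`, as in the shell command.

```python
import glob
import os
import subprocess


def run_test_binaries(directory=".", pattern="clblast_test_*", log_path="log"):
    """Run every matching executable, write its output to one log, return exit codes."""
    env = dict(os.environ, DISPLAY="")  # mirrors `DISPLAY= ./$i` in the shell loop
    exit_codes = {}
    # buffering=0 so our header lines and the child's output stay in order
    with open(log_path, "wb", buffering=0) as log:
        for path in sorted(glob.glob(os.path.join(directory, pattern))):
            if not (os.path.isfile(path) and os.access(path, os.X_OK)):
                continue  # skip directories and non-executable files
            name = os.path.basename(path)
            log.write((name + "\n").encode())
            proc = subprocess.run([os.path.abspath(path)], stdout=log,
                                  stderr=subprocess.STDOUT, env=env)
            exit_codes[name] = proc.returncode
            log.write(("exit code: %d\n" % proc.returncode).encode())
    return exit_codes
```

Calling `run_test_binaries()` from the build directory returns a dictionary such as `{"clblast_test_xgbmv": <exit code>, ...}`, which makes it easy to list only the tests with a non-zero exit code.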
Tried tuning (log and jsons), but the python script fails afterwards:
|
@vi To get rid of the python error, delete the |
Thanks for sharing the output of the tests! A quick glance shows that it might be just the reduce and matrix-multiplication kernels failing, they are used in quite a few cases. Let's first see if the tuning can fix them.
Don't think that's going to work, since he just got a fresh copy anyway.
OK, thanks for sharing the JSONs. I'll take a look myself this weekend at what's going wrong and I'll try to fix it and make the error message more meaningful for future cases. I'll report back as soon as I have something for you.
I understand your point, but that might be less trivial to implement than you suggest it. First of all, you'll have to query OpenCL to see what's supported and what not. Then, you'll have to call a BLAS routine, which is not trivial to do since CLBlast also uses |
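To make the two steps concrete (check what the device supports, then call a CPU routine when it does not), here is a minimal sketch of the fallback idea. Everything in it is hypothetical: `FakeGpu`, `UnsupportedPrecision`, `cpu_gemm` and `gemm_with_fallback` are illustration names, not CLBlast or OpenBLAS APIs, and the "GPU" is simulated in plain Python.

```python
class UnsupportedPrecision(Exception):
    """Hypothetical error raised when the device lacks a data type."""


def cpu_gemm(a, b):
    """Plain-Python reference matrix multiply, standing in for a CPU BLAS."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class FakeGpu:
    """Hypothetical device wrapper; a real one would query OpenCL for capabilities."""

    def __init__(self, supported_precisions):
        self.supported_precisions = set(supported_precisions)

    def gemm(self, a, b, precision):
        if precision not in self.supported_precisions:
            raise UnsupportedPrecision(precision)
        return cpu_gemm(a, b)  # placeholder for launching a GPU kernel


def gemm_with_fallback(device, a, b, precision):
    """Try the device first; fall back to the CPU routine if unsupported."""
    try:
        return device.gemm(a, b, precision), "gpu"
    except UnsupportedPrecision:
        return cpu_gemm(a, b), "cpu"
```

The sketch hides the hard parts mentioned above: a real implementation would need the capability query, a CPU BLAS to link against, and a decision about where the data lives when the fallback is taken.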
OK, I've just tested with your JSON files (thanks again for sharing), and I didn't encounter any issue. So it is likely that some things changed in the database in the meantime since the release of CLBlast v1.2.0. So there are two things you could do:
|
6. Revert what Same running time, same |
OK, thanks for testing. Let's first try to resolve the
This error message is from Beignet, not from CLBlast. It suddenly cannot find your device anymore, which is strange. This seems to suggest something is wrong with your OpenCL set-up, or there is perhaps a bug in beignet? I could not find any issue on the Beignet Bugzilla tracker, but perhaps you can search and file one? First also check if it is reproducible, i.e. does it always fail at exactly the same test? Then there is another issue with |
Shall I try downgrading beignet to v1.2? |
|
Is the system supposed to be unused during tuning/testing, or is it OK to browse around (ignoring graphics lags)? |
Not sure, you could try perhaps. I believe there was at least one other CLBlast user with your GPU, since there were already tuning results, so it must have worked correctly on some system at some point.
On my Debian 9 test system I get exactly the same output when I run
When I test with Beignet I also run X at the same time and I don't see any issues. So I would search the issue with Beignet or with the GPU drivers. Try other versions perhaps, or otherwise report the issue with Beignet. Could well be that the other failing tests are related to this as well... |
So, did you have any luck with another version of Beignet? Or did you report this issue with the developers of Beignet? |
Not yet. And I'm not sure what steps I should take for the reporting. Is there some minimal failing case which is supposed to work, but doesn't? |
I'm not sure we need a minimal failing case here. If you look at the error you are getting returned from
However, it did use your GPU in the tests just moments before that. So it seems some time related instability. First also check if it is reproducible, i.e. does it always fail at exactly the same test instance? Beignet bugs can be filed here. You could refer to this issue perhaps? First also double-check the list of existing Beignet bugs |
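One rough way to check reproducibility is to run the same test executable several times and compare where each run stops. The sketch below is an assumption-laden illustration: the helper names are invented, and it treats "exit code, number of output lines, last output line" as the signature of a run, which only works if the test output contains no timings or other run-dependent text.

```python
import subprocess


def failure_signatures(argv, runs=5):
    """Run a command several times; record (exit code, line count, last line) per run."""
    signatures = []
    for _ in range(runs):
        proc = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                              universal_newlines=True)
        lines = proc.stdout.splitlines()
        signatures.append((proc.returncode, len(lines), lines[-1] if lines else ""))
    return signatures


def is_deterministic(signatures):
    """True if every run ended with the same exit code at the same point."""
    return len(set(signatures)) == 1
```

For example, `is_deterministic(failure_signatures(["./clblast_test_xgbmv"]))` returning `False` would suggest that the failure does not always happen at the same test instance.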
Tried running utests_run, it seems to work...
|
Notes:
I thought about reducing the clblast_test_xgbmv to something smaller, but the testing system is too complicated and I stopped trying after observing this. How, for example, move the |
OK, thanks for trying. In general I don't think anything can be done from the CLBlast side. Because if calling But you are right that trying to pinpoint whether it is always the same test that fails is a good idea. What you can first do indeed is only to test that particular case. I'll help you out. Let's try two steps:
|
I instead tried this:
diff --git a/test/correctness/testblas.cpp b/test/correctness/testblas.cpp
index aa4b478..be28ed3 100644
--- a/test/correctness/testblas.cpp
+++ b/test/correctness/testblas.cpp
@@ -23,7 +23,7 @@ namespace clblast {
// The transpose configurations to test with: template parameter dependent
template <> const std::vector<Transpose> TestBlas<half,half>::kTransposes = {Transpose::kNo, Transpose::kYes};
-template <> const std::vector<Transpose> TestBlas<float,float>::kTransposes = {Transpose::kNo, Transpose::kYes};
+template <> const std::vector<Transpose> TestBlas<float,float>::kTransposes = {Transpose::kNo, Transpose::kNo};
template <> const std::vector<Transpose> TestBlas<double,double>::kTransposes = {Transpose::kNo, Transpose::kYes};
template <> const std::vector<Transpose> TestBlas<float2,float2>::kTransposes = {Transpose::kNo, Transpose::kYes, Transpose::kConjugate};
template <> const std::vector<Transpose> TestBlas<double2,double2>::kTransposes = {Transpose::kNo, Transpose::kYes, Transpose::kConjugate};
diff --git a/test/correctness/testblas.hpp b/test/correctness/testblas.hpp
index 4e02fd2..9c0830b 100644
--- a/test/correctness/testblas.hpp
+++ b/test/correctness/testblas.hpp
@@ -144,7 +144,7 @@ template <typename T, typename U> const std::vector<size_t> TestBlas<T,U>::kMatS
template <typename T, typename U> const std::vector<size_t> TestBlas<T,U>::kVecSizes = {0, kBufferSize - 1, kBufferSize};
// The layout/triangle options to test with
-template <typename T, typename U> const std::vector<Layout> TestBlas<T,U>::kLayouts = {Layout::kRowMajor, Layout::kColMajor};
+template <typename T, typename U> const std::vector<Layout> TestBlas<T,U>::kLayouts = {Layout::kRowMajor, Layout::kRowMajor};
template <typename T, typename U> const std::vector<Triangle> TestBlas<T,U>::kTriangles = {Triangle::kUpper, Triangle::kLower};
template <typename T, typename U> const std::vector<Side> TestBlas<T,U>::kSides = {Side::kLeft, Side::kRight};
template <typename T, typename U> const std::vector<Diagonal> TestBlas<T,U>::kDiagonals = {Diagonal::kUnit, Diagonal::kNonUnit};
and got this:
Is there a simple program that just calls clGetDeviceIDs in endless loop? (I'm not familiar with OpenCL/Cuda/GPU world in general yet). |
Hmm, interesting, so you think it would perhaps always happen at the n-th call to that function? Something like this could help you perhaps:
|
This does not fail (increased NUM_RUNS and removed the main printf), even if I also start clblast_test_xgbmv in parallel. Running the test under valgrind (snippet):
There are a lot of After
|
In your last example it fails at a different place than before, so the location is not deterministic? |
It fails when file descriptors run out. CLBlast test (or some dep) opens them, but does not close properly.
|
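A descriptor leak like the one described can be watched from outside the process on Linux by looking at `/proc/<pid>/fd`. The following is a small sketch of that idea, assuming a Linux system with `/proc` mounted; the function names are made up for illustration.

```python
import os


def count_open_fds(pid="self"):
    """Return the number of open file descriptors of a process (Linux /proc only)."""
    return len(os.listdir("/proc/%s/fd" % pid))


def list_open_fds(pid="self"):
    """Map each descriptor number to what it points at (file, socket, device...)."""
    fd_dir = "/proc/%s/fd" % pid
    targets = {}
    for fd in os.listdir(fd_dir):
        try:
            targets[int(fd)] = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            pass  # descriptor was closed between listdir() and readlink()
    return targets
```

Polling `count_open_fds(pid)` for the PID of a running test shows whether the number keeps growing, and `list_open_fds(pid)` shows what the leaked descriptors refer to. Inspecting another user's process usually requires the same user or root.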
OK, never heard of those. CLBlast doesn't open any files while testing. Must be Beignet related then I guess? Can you then re-run all the tests with that 'fix' applied and report which ones still have open issues? |
For completeness: the GPU may have felt a little bit sick at the time of test. At least the graphical scaling glitch is still here. |
Sorry, I forgot about this issue. Thanks for testing. The tuner result looks OK. Not sure how to continue though, since I can't reproduce the issue and therefore can't test and debug it myself. Perhaps this other issue #149 might help you out. It seems it is also the same GPU, but a different OpenCL (not Beignet but Apple OpenCL). |
If needed I can run special modified versions for debugging or maybe give access for remote debugging on my laptop. But maybe I should "play" with Beignet versions first. I've already built one from source code, but not sure yet how to install it into Debian (or can it be used without installation). |
You can |
How do I select the right one, ensuring no pieces of the wrong one are in the way, and also without disruptive changes to the system from root? Is it just |
Not sure, I'm not an expert on that... But what I meant was 'select' in the OpenCL platform sense. If you do it right, you might have both Beignets co-existing on your system, |
Any updates here? Or should we conclude it is not CLBlast-related? |
Not experimented yet with other Beignets. Not sure if it is appropriate to report bugs there without trying fresher build... Maybe CLBlast is doing things OK, but also can contain workaround for broken platforms... If/when I come back to experimenting with OpenCL in general and Beignet and/or CLBlast in particular, I'll comment. |
Intel now has a new open-source implementation that is replacing Beignet. Perhaps it is time to try the new Intel NEO? |
Is it something new-ish? Unlikely that it would work on my laptop. |
Indeed, it seems that your hardware is not supported. Neo is new indeed, Beignet is now discontinued, so that won't lead to solving this issue either it seems. How do you suggest we proceed? Do you still have time to test things? We could also close this issue and say that older hardware is not properly supported in all cases... |
Yes, I'm constantly trying various things (CLBlast being a detour from experimenting with various deep learning toys and thinking "what if I can workaround missing OpenCL support for ... by using CLBlast instead of usual BLAS library").
Maybe like previously, me trying an updated Beignet (or just waiting until an updated Beignet eventually comes to Debian Stable), then maybe reporting additional issues to Beignet. |
Any updates from your side? |
Not yet. Is it something urgent or do you just not want a dangling open issue? I'll report results here if/when I resume experimentation regardless of closedness status of this issue. For now I just treat my laptop as Not Ready For GPU Computing. |
OK. Yes, I see this as a list of things I have to work on :-) I can also add your setup to the list of known issues, close this issue, and we can follow-up later with you and/or Intel when you have time to see if anything can be fixed? |
Got round and installed beignet from master. 3 passing tests in Beignet's own tests almost succeed: Using clblast dda1e56 and beignet 591d387327ce35f03a6152d4c823415729e221f2. |
Tried beignet 1.2.1 (097365ed1a79cd03dc689b37b03552e455eb3854) and seeing more successful tests. |
Tests now look much better:
Opened filehandles of Shall I run the tuning process? |
OK, that is good news, so Beignet 1.2.1 works quite well. One failing test it seems, shall we try and see if we can solve that? It is a bit of a special thing though, not really needed in all cases. But perhaps you can give me the output when running |
|
First phase of tuning succeeded (there are 40 JSONs), but database.py failed:
|
Results and output of the first phase of tuning: https://vi-server.org/pub/clblast_beignet_gen3_tuning.7z |
OK, thanks for the feedback.
So that's definitely a Beignet bug, so let's forget about that. This 'preprocessor' is not enabled anyway for your GPU, so a failed test won't harm you. Good to see that the tuning also works! About the Python script, I tried to reproduce with your results but didn't get your issue. Perhaps you have an old database on disk? You could try to remove |
After If the database format changes without changing the download URL, does it mean that old CLBlast versions can no longer be tuned? Maybe it should download not from master, but from the current commit? |
Yes, you are right. It should ideally be a git submodule or something. But not a super urgent thing I guess, because it is mostly power users that do this and the use-case of tuning first and then a few months later again is not so common. So, what I'll do now is add your new results to the latest master and also make a note that Beignet 1.2.1 is the one to go for with your device. And then we can close this issue, am I right? |
After the tuning, the tests seem to run longer:
What about the connections leak? It seems like CLBlast (or Beignet, or at least the tests and tuners) opens something and does not close it properly. |
Seems OK. This issue is already a bit long and takes some browser resources to load and render. New issues can be opened about other problems like the connections leak. |
Could very well be, the tests typically test corner cases and very small matrices, so time is actually mostly taken by CPU reference code, CPU-GPU copy, and a bit by (perhaps slower) kernels. Since the main issue is solved, I'll close this indeed. |
With beignet 1.3.2-1 and CLBlast v1.2.0 it fails multiple tests:
Additionally, matmul built with NETLIB CLBlast fails multiplication if the matrix is big enough:
On master branch it also fails.