Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Rendering Replica models in the simulator (under CONSTRUCTION) #132

Closed
wants to merge 39 commits into from

Conversation

bigbike
Copy link
Contributor

@bigbike bigbike commented Aug 6, 2019

Motivation and Context

Initialize the PR to get early feedbacks.

Purpose:
This PR is to upgrade the simulator so that it can render the Replica model released by FRL (https://github.com/facebookresearch/Replica-Dataset).

The master version can render the semantic scene of the replica model correctly. However, it cannot render the replica model (with the old ptex textures (.rgb format), or the new ones (.hdr format)) on either MacOS or Linux.

How Has This Been Tested

Types of changes

  • Docs change / refactoring / dependency upgrade
  • Bug fix (non-breaking change which fixes an issue)
  • [x ] New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • [x ] My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have completed my CLA (see CONTRIBUTING)
  • I have added tests to cover my changes.
  • All new and existing tests passed.

-) fixed a couple of loggings;
-) use join in corrade to generate the file path;
-) fix bug: numBytes
-) load customized image (.rgb file) using file to memory mapping;
@bigbike bigbike requested a review from mosra August 6, 2019 22:50
@facebook-github-bot facebook-github-bot added the CLA Signed Do not delete this pull request or issue due to inactivity. label Aug 6, 2019
Cr::Containers::Array<const char, Cr::Utility::Directory::MapDeleter> data =
Cr::Utility::Directory::mapRead(rgbFile);
const int dim = static_cast<int>(std::sqrt(data.size() / 3)); // square
const size_t numBytes = io::fileSize(rgbFile);
Copy link
Contributor Author

@bigbike bigbike Aug 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to load the textures from the file. Such textures are not in common picture format (e.g., png, jpg, tga), but some customized format in binary.

In ReplicaSDK, they mapped a texture (rgbFile) to the memory, and uploaded it to a texture (atlas). The code is as follows.

const size_t numBytes = std::experimental::filesystem::file_size(rgbFile);
// Open file
int fd = open(std::string(rgbFile).c_str(), O_RDONLY, 0);
void* mmappedData = mmap(NULL, numBytes, PROT_READ, MAP_PRIVATE | MAP_POPULATE, fd, 0);
meshes[i]->atlas.Upload(mmappedData, GL_RGB, GL_UNSIGNED_BYTE);
munmap(mmappedData, numBytes);
close(fd);

@mosra : Do you think the current implementation is proper?
Do you have better ideas in terms of leveraging Magnum to do the same work?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait wait but this is practically reverting #106 to what was there before, and that change was done in order to help with #84 (and besides that, reducing platform-specific code). The code you're showing is precisely what mapRead() does if I see correctly, apart from the non-portable MAP_POPULATE flag

Is there any issue with the way it was done using Directory::mapRead()? Because in my testing I couldn't see any problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allow me to reply this one first before I read the other comments.
I found an issue in the previous code that the "dim" was wrong, which resulted from "numBytes".

After reading your comment, assuming the mapRead() did exactly the same, then it means getting the "numBytes" using "data.size()" is incorrect.

What is the correct way to get this number?

I would change it back.

(I am not aware of the #106 or #84 since I was on leave for a couple of months. Good to know.)

Thank you very much for the comment.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh. The array returned from mapRead() should have its size equal to number of bytes in that file (and my tests so far were confirming that). The rest of the code looks exactly the same, so that would mean mapRead() returns a wrong size?

What does numBytes give and what data.size()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know. It is also possible my testing code has some bugs.
Allow me to take another look.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just tested it again. The numbers are identical. There must be something wrong in my previous debugging code. My bad. I sincerely apologize for the confusion. Sorry, Vladimir.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem ;) In any case, I'm not claiming my implementation is bug-free either, so if you see something strange with it, let me know.

Magnum::ImageView2D image(Magnum::PixelFormat::RGB8UI, {dim, dim}, data);
renderingBuffers_[iMesh]
->tex.setWrapping(Magnum::GL::SamplerWrapping::ClampToEdge)
.setMagnificationFilter(Magnum::GL::SamplerFilter::Linear)
.setMinificationFilter(Magnum::GL::SamplerFilter::Linear)
// .setStorage(1, GL::TextureFormat::RGB8UI, image.size())
// .setStorage(1, Magnum::GL::TextureFormat::RGB8UI, image.size())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mosra :
The code in the master version commented it out.
Do I need to set the storage here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was commited this way by @msavva in October 2018, I don't know what was the original reason :) Generally, calling setSubImage() without setStorage() or setImage() being called first will result in a GL error and nothing being done, so I guess that's what is happening here? (Unless the texture storage is being set up somewhere else, but I don't see any such code anywhere else.)

If you are on linux you can run the viewer with --magnum-gpu-validation on and it'll tell you in the console output about any GL errors. (Mac doesn't implement GL debug output, so there you'd need to test for GL::Renderer::error() manually by hand.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am developing everything on the Mac. I guess I will have to require a linux box. :)

So here, if I understand you correctly, I should actively call setStorage(). Correct?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, or change the below to setImage() ... but setStorage() is the preferred thing to do.

@bigbike
Copy link
Contributor Author

bigbike commented Aug 6, 2019

Another question:
I quote your comment in the previous PR: "Looking at the sources, the only thing that requires GLSL 4.30 is the SSBO in atlas.glsl (which could be replaced with GL::BufferTexture, as is already done here?). So I'd say choosing GL 4.1/4.2 on both would make the code simpler?"
So my understandings are,

  • On macOS, we use 4.1 while on linux, we use 4.2;
  • If we use GL::bufferTexture to handle the SSBO in atlas.glsl, which requires 4.2, that implies rendering Replica models cannot be supported on macOS;

Do I understand it correctly?
Thank you very much!

@dhruvbatra
Copy link
Contributor

CC: @jstraub

Copy link
Collaborator

@mosra mosra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got a bit frigtened by the reverted mmap change, can you elaborate? :) The other is just minor stuff, hinting at APIs you could use.

} else if (endsWith(path, "mesh.ply")) {
// Warning:
// The order of if clause matters. cannot move "mesh.ply" before
// "ptex_quad_mesh.ply" or "semantic_quad_mesh.ply"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would do

ASSERT(!endsWith("quad_mesh.ply"));

here instead of the comment :)

Cr::Containers::Array<const char, Cr::Utility::Directory::MapDeleter> data =
Cr::Utility::Directory::mapRead(rgbFile);
const int dim = static_cast<int>(std::sqrt(data.size() / 3)); // square
const size_t numBytes = io::fileSize(rgbFile);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait wait but this is practically reverting #106 to what was there before, and that change was done in order to help with #84 (and besides that, reducing platform-specific code). The code you're showing is precisely what mapRead() does if I see correctly, apart from the non-portable MAP_POPULATE flag

Is there any issue with the way it was done using Directory::mapRead()? Because in my testing I couldn't see any problem.

src/esp/assets/PTexMeshData.cpp Show resolved Hide resolved
// officially released Replica dataset
return (Corrade::Utility::String::stripSuffix(filename, "mesh.ply") +
"textures");
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why the lambda is needed if it's used just on a single place :) Also there's Directory::path() to make this an oneliner:

const std::string atlasDir = Corrade::Utility::Directory::join(
  Corrade::Utility::Directory::path(filename), "textures");

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new format is using "textures" as the folder name, while the old one was using "ptex_textures". Cannot put them in a single line.

Well, I can use Ternary Operator instead of the lambda.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh well, sorry, missed that part, thought it's just cutting off a different filename 🙈

const std::string rgbFile = Corrade::Utility::Directory::join(
atlasFolder_, std::to_string(iMesh) + "-color-ptex.rgb");

ASSERT(io::exists(rgbFile), Error : Cannot find the rgb file);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's CORRADE_ASSERT() btw, which could make esp/logging.h obsolete -- and it supports printing everything that Utility::Debug can :)

Suggested change
ASSERT(io::exists(rgbFile), Error : Cannot find the rgb file);
CORRADE_ASSERT(io::exists(rgbFile), "Error : Cannot find the rgb file");

Also, for io::exists there's Directory::exists() ... and and and

... 🤔 should I open a PR moving everything from esp/logging.h and io/io.h to builtin Corrade functionality? 😄 (cc: @msavva @erikwijmans)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logging.h and io.h are legacy.
I totally agree to move them to Corrade. But I do not think it is that urgent.

Magnum::ImageView2D image(Magnum::PixelFormat::RGB8UI, {dim, dim}, data);
renderingBuffers_[iMesh]
->tex.setWrapping(Magnum::GL::SamplerWrapping::ClampToEdge)
.setMagnificationFilter(Magnum::GL::SamplerFilter::Linear)
.setMinificationFilter(Magnum::GL::SamplerFilter::Linear)
// .setStorage(1, GL::TextureFormat::RGB8UI, image.size())
// .setStorage(1, Magnum::GL::TextureFormat::RGB8UI, image.size())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was commited this way by @msavva in October 2018, I don't know what was the original reason :) Generally, calling setSubImage() without setStorage() or setImage() being called first will result in a GL error and nothing being done, so I guess that's what is happening here? (Unless the texture storage is being set up somewhere else, but I don't see any such code anywhere else.)

If you are on linux you can run the viewer with --magnum-gpu-validation on and it'll tell you in the console output about any GL errors. (Mac doesn't implement GL debug output, so there you'd need to test for GL::Renderer::error() manually by hand.)

@mosra
Copy link
Collaborator

mosra commented Aug 7, 2019

If we use GL::bufferTexture to handle the SSBO in atlas.glsl, which requires 4.2, that implies rendering Replica models cannot be supported on macOS

No. SSBOs are since GL 4.3 and those aren't supported on macOS, as it's stuck on 4.1 + some extensions on top. GL::BufferTexture requires just GL 3.1, and it definitely is supported there. So if we go with 4.1 and GL::BufferTexture we're all good on Mac.

@bigbike
Copy link
Contributor Author

bigbike commented Aug 7, 2019

@jstraub:
I found "calculateAdjacency()" was time-consuming. I believe that is why the openmp is employed.
Can we pre-compute it (or compute it once on the 1st run) and store the adjacency to the disk?

@bigbike
Copy link
Contributor Author

bigbike commented Aug 7, 2019

If we use GL::bufferTexture to handle the SSBO in atlas.glsl, which requires 4.2, that implies rendering Replica models cannot be supported on macOS

No. SSBOs are since GL 4.3 and those aren't supported on macOS, as it's stuck on 4.1 + some extensions on top. GL::BufferTexture requires just GL 3.1, and it definitely is supported there. So if we go with 4.1 and GL::BufferTexture we're all good on Mac.

Good to know. This is great news.
Before, I thought GL::BufferTexture would require 4.2.

@mosra
Copy link
Collaborator

mosra commented Aug 7, 2019

I found "calculateAdjacency()" was time-consuming.

What does the function do, exactly? To me it looks like for every face it finds out IDs of adjacent faces (plus some orientation encoding after)? If that's so, that could be done without the hashmap, without nested vectors, in O(n) time and just one or two allocations instead of ~3n. I did this very recently for MeshTools::generateSmoothNormals() and have it still fesh in my head, so I could extract the code out to a reusable utility that we could use here. (The code in question, for reference, is here.)

@bigbike
Copy link
Contributor Author

bigbike commented Aug 7, 2019

Have you seen my comment ? :) I think the algorithm could be rewritten in a way that doesn't make it take ages to run (and thus no need to save/restore it), but I need a clarification that my assumption is correct.

Yes, I saw it, which is very cool if we can optimize their algorithm to O(N). But such code is not immediately available yet, and I have to debug and run the code hundreds of times a day. To ease my pain, such save/load functions are rather helpful at the moment. Hope you understand.

We can treat them as temporary code here, and remove them later if they are indeed not necessary.
My opinion is that I prefer to save/load any data that can be pre-computed or pre-processed regardless the computation complexity.

Based on offline sync, we would like to only support the public released replica models.
- add gamma, saturation;
- change the line_adjacency;
- change the shader: texelFetch(...).r;
Copy link
Contributor Author

@bigbike bigbike left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mosra :
Updated the code with the latest shaders from ReplicaSDK. They are modified so that they can be compiled with GL4.10.
However, it still does not work as before. The program gave a "grey" screen. You may download my branch and play it with any released Replica model.

I left a couple of questions in the code. Hope you can help me to accelerate the debugging. Thanks!


void PTexMeshDrawable::draw(const Magnum::Matrix4& transformationMatrix,
Magnum::SceneGraph::Camera3D& camera) {
adjTex_.bind(1);
adjFaces_.bind(1);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mosra : adjFaces is the buffer texture, and it is bound to 1. Not sure if 1 is correct binding point.

Copy link
Collaborator

@mosra mosra Aug 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know -- I don't see any layout(binding = N) on the shader side, or anything related in the constructor, for either this texture or the other, which leads me to think that this is the root of the problem. To be clear -- Magnum doesn't have any implicit binding points or anything, so you can't expect the shader to pick (samplers, uniforms, attributes) up unless you explicitly define the location / binding points for these as well. If things seemed to work with other shaders until now, it's just a lucky accident (as far as I can see, the GenericShader also only ever uses one texture at a time).

(This comment is getting a bit long, so bear with me please.)

What's the designated way of doing this in Magnum (and what you are doing here differently) is, step-by-step (docs for reference -- look there for more details about each point):

  • For each uniform, defining layout(location = N) in the shader code, and then using the same N on C++ side. (Extensions needed for that are available on all systems we care about, so no need to save results of uniformLocation() on the C++ side.) Especially not using setUniform(uniformLocation("some string"), value) each time in every frame, all those string operations are extremely wasteful.

  • For each input attribute, typedefing it in the header with location corresponding to the layout() qualifier in the shader code and type matching the type you are most likely to have the vertex data in. GL pads the remaining components with (0, 0, 0, 1) so for example, if you pass a Vector2 and have vec4 on the shader side, you'll get (x, y, 0, 1) there -- no need to supply the 0 and 1 explicitly. Here you're using Vector4 and I don't see a reason to supply the fourth component, so it could be Vector3 I think, saving 25% of memory. Of course the PTexMeshData needs to be updated as well.

  • For each sampler, either doing layout(binding = N) in the shader code or calling setUniform(uniformLocation("sampler name"), N) in the constructor. Here you need the latter, as macOS doesn't have GL_ARB_shading_language_420pack. Then, for each sampler providing bindSomething() APIs on the C++ side, using the same N as a parameter to texture.bind(). Here you have bindTexture() just for one of the samplers and it has the binding point as a parameter, which doesn't make sense, and in the other case you're calling bind() manually, with the 1 having no relation to the shader at all. What I'm usually doing is defining an enum with the binding point IDs and then using those enums for both the bind() and setUniform() calls -- so for example:

    enum: UnsignedInt {
      AtlasTextureUnit = 0,
      AdjacencyTextureUnit = 1
    };
    
    ...
    
    PTexMeshShader::PTexMeshShader() {
      setUniform(uniformLocation("atlasTex"), AtlasTextureUnit);
      setUniform(uniformLocation("meshAdjFaces"), AdjacencyTextureUnit);
    }
    
    PTexMeshDhader& PTexMeshShader::bindAtlasTexture(GL::Texture2D& texture) {
      texture.bind(AtlasTextureUnit);
      return *this;
    }
    
    PTexMeshDhader& PTexMeshShader::bindAdjacencyTexture(GL::BufferTexture& texture) {
      texture.bind(AdjacencyTextureUnit);
      return *this;
    }

    This way the shader API on the C++ side is complete and you can be sure that you are always binding the right type of a texture to the right binding point. It's generally advised to choose mutually exclusive texture units for each shader (unless it's expected to use the same textures in each), as you save on texture rebinding when switching between different shaders. So e.g. if the generic shader makes use of units 0-3, the PTex shader would take units 4 and 5.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For each uniform, defining layout(location = N) in the shader code

I thought about it. But "Explicit uniform location" is not supported until 4.3. So I cannot use 'layout(location = N)' for the uniform in the shaders;

setUniform(uniformLocation("some string"), value) each time in every frame, all those string operations are extremely wasteful.

To save the call, I was thinking for certain uniforms, e.g., float "exposure". In C++, can we store a copy (denoted as "current_exposure") in the PTexMeshShader class to store the current value? In draw(), we first check if the exposure in the drawable is the same as the "current_exposure". If not, we setUniform(uniformLocation("exposure"), new_value), and let current_exposure = new_value;
It makes things complicated. But do you think it is necessary or it can reduce the overhead?

Magnum doesn't have any implicit binding points or anything, so you can't expect the shader to pick (samplers, uniforms, attributes) up unless you explicitly define the location / binding points for these as well.

Thank you for the clarification. If possible, I would even suggest you to put it in your docs. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for the informative and invaluable comment! It saves me from a lot of guesswork on Magnum usage.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But "Explicit uniform location" is not supported until 4.3.

Right, sorry. For some reason I thought it got in much earlier.

It makes things complicated. But do you think it is necessary or it can reduce the overhead?

Uh oh, i meant something else -- the uniformLocation(), due to all its string operations, is the problem, not setUniform(). So call it once in the constructor, save the ID and then do just setUniform(exposureUniform_, value). Check the performance plot in #151 -- just by going from GenericShader (which calls uniformLocation() every time) to magnum's Shaders::Flat (which caches the locations), I was able to shave some additional milliseconds.

Long time ago I tried to cache the uniform values as well, but it didn't prove any measurable performance improvement. And the maintenance overhead was absolutely not worth it. What could make sense is going with UBOs, but those are known to be even slower than classic setUniform() calls on Mac and some Intel GPUs, so ... ¯\_(ツ)_/¯

@@ -29,7 +29,7 @@ const uint FACE_MASK = 0x3FFFFFFF;

int GetAdjFace(int face, int edge, out int rot) {
// uint data = meshAdjFaces[face * 4 + edge];
uint data = uint(texelFetch(meshAdjFaces, face * 4 + edge));
uint data = texelFetch(meshAdjFaces, face * 4 + edge).r;
Copy link
Contributor Author

@bigbike bigbike Aug 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mosra : Is it the correct way to fetch the unsigned int in the buffer texture?
BTW, please note: I changed the type of meshAdjFaces from sampleBuffer to usamplerBuffer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the shader compiler doesn't complain, yes :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous code will not be complained by the compiler either. But I doubt if that is correct. So I changed it.

currentMesh->abo.setData(adjFaces[iMesh],
Magnum::GL::BufferUsage::StaticDraw);

// experiment code (may not work):
// using GL_LINES_ADJACENCY here to send quads to geometry shader
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mosra: I believe it is the most suspicious part.
The purpose is to send quads to geometry shader using GL_LINES_ADJACENCY.
The corresponding code in RelicaSDK is:
glDrawElements(GL_LINES_ADJACENCY, mesh.ibo.num_elements, mesh.ibo.datatype, 0);
Did I set it correctly using magnum?

Copy link
Collaborator

@mosra mosra Aug 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think so.

@bigbike
Copy link
Contributor Author

bigbike commented Aug 14, 2019

Updated a version based on your comment.

The code is suffering segfault now. I am debugging it.

@mosra mosra closed this Aug 28, 2019
@mosra mosra reopened this Aug 28, 2019
@bigbike
Copy link
Contributor Author

bigbike commented Aug 28, 2019

@mosra:
Interesting. The segfault shows up in the test.
(I did not use the magnum within the simulator, but cloned one from your github to a different, independent folder.)

(base) ~/temp/magnum/build/src/Magnum/GL/Test$ ./GLBufferTextureTest
Starting Magnum::GL::Test::BufferTextureTest with 2 test cases...
OK [1] constructNoCreate()
OK [2] constructCopy()
Finished Magnum::GL::Test::BufferTextureTest with 0 errors out of 4 checks.
(base) ~/temp/magnum/build/src/Magnum/GL/Test$ ./GLBufferTextureGLTest
Renderer: AMD Radeon Pro 555X OpenGL Engine by ATI Technologies Inc.
OpenGL version: 4.1 ATI-2.11.20
Using optional features:
GL_ARB_ES2_compatibility
GL_ARB_separate_shader_objects
GL_ARB_texture_filter_anisotropic
GL_ARB_texture_storage
GL_ARB_vertex_array_object
GL_EXT_debug_label
GL_EXT_debug_marker
Using driver workarounds:
no-layout-qualifiers-on-old-glsl
Starting Magnum::GL::Test::BufferTextureGLTest with 7 test cases...
OK [1] construct()
OK [2] wrap()
OK [3] bind()
SKIP [4] bindImage()
GL_ARB_shader_image_load_store is not supported.
OK [5] setBuffer()
Segmentation fault: 11

@bigbike
Copy link
Contributor Author

bigbike commented Aug 28, 2019

@mosra :

When I use build.sh, the first thing it would do is to checkout the acf06eb6d.

In this case, how can I test the simulator with the "magnum" master?

@mosra
Copy link
Collaborator

mosra commented Aug 28, 2019

Oh, that's great it's reproducible with the test alone. Can you get a backtrace of that, please? If that's a macOS driver bug, I could try to work around it on the engine side.

For the submodules, you can pass --no-update-submodules to setup.py (got added in #38) :)

@bigbike
Copy link
Contributor Author

bigbike commented Aug 28, 2019

Oh, that's great it's reproducible with the test alone. Can you get a backtrace of that, please? If that's a macOS driver bug, I could try to work around it on the engine side.

For the submodules, you can pass --no-update-submodules to setup.py (got added in #38) :)

Good to know. Thanks.

The following is the bt:

* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0xc8)
    frame #0: 0x00007fff37c44fba GLEngine`gleUpdateCtxDirtyStateForBufStampChange + 331
GLEngine`gleUpdateCtxDirtyStateForBufStampChange:
->  0x7fff37c44fba <+331>: movl   0xc8(%rcx), %ecx
    0x7fff37c44fc0 <+337>: subl   0xb2fc(%rbx), %ecx
    0x7fff37c44fc6 <+343>: testl  %ecx, %ecx
    0x7fff37c44fc8 <+345>: jle    0x7fff37c4500d            ; <+414>
Target 0: (GLBufferTextureGLTest) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0xc8)
  * frame #0: 0x00007fff37c44fba GLEngine`gleUpdateCtxDirtyStateForBufStampChange + 331
    frame #1: 0x00007fff37c45140 GLEngine`gleDirtyBufferObjectCurrentBindPoints + 109
    frame #2: 0x0000000100167cf8 libMagnumGL-d.2.dylib`Magnum::GL::Buffer::dataImplementationDefault(this=0x00007ffeefbfe840, size=16, data=0x0000000100013c10, usage=StaticDraw) at Buffer.cpp:454:5
    frame #3: 0x0000000100166a92 libMagnumGL-d.2.dylib`Magnum::GL::Buffer::setData(this=0x00007ffeefbfe840, data=(_data = 0x0000000100013c10, _size = 16), usage=StaticDraw) at Buffer.cpp:321:5
    frame #4: 0x000000010000478e GLBufferTextureGLTest`Magnum::GL::Test::(anonymous namespace)::BufferTextureGLTest::setBufferEmptyFirst(this=0x00007ffeefbff4e8) at BufferTextureGLTest.cpp:222:12
    frame #5: 0x00000001001039f0 libCorradeTestSuite.2.dylib`Corrade::TestSuite::Tester::exec(std::__1::basic_ostream<char, std::__1::char_traits<char> >*, std::__1::basic_ostream<char, std::__1::char_traits<char> >*) + 10038
    frame #6: 0x0000000100003298 GLBufferTextureGLTest`main(argc=1, argv=0x00007ffeefbff550) at BufferTextureGLTest.cpp:276:1
    frame #7: 0x00007fff59e383d5 libdyld.dylib`start + 1

@bigbike
Copy link
Contributor Author

bigbike commented Aug 29, 2019

The simulator was suffering the segfault as well with the latest magnum master.
It was also from the "setData". I tried to disable it, and simply used:

    glBindBuffer(GL_TEXTURE_BUFFER, currentMesh->adjFacesBuffer.id());
    glBufferData(GL_TEXTURE_BUFFER, sizeof(uint32_t) * adjFaces[iMesh].size(), (void*)adjFaces[iMesh].data(), GL_STATIC_DRAW);

It did not work either.

It has nothing to do with the order sequence now. What was the change in the setBuffer? It may cause the failure here.

But, if the setBuffer() was disabled, the segfault disappeared. Something is wrong with the mosra/magnum@7b43ab5

@mosra
Copy link
Collaborator

mosra commented Aug 30, 2019

Sorry for the late replies, having too much on my plate :/

Something is wrong with the mosra/magnum@7b43ab5

I'm pretty confident it's a macOS driver bug, not a problem with magnum. Nevertheless, the engine has to deal with the problem and make it non-crashy, somehow. The createIfNotAlready() is simply calling glBindBuffer() in case the buffer isn't created yet. So equivalent to the crashy pure-GL code you tried above -- again I don't think this is due to anything conflicting with magnum. Proof by statistics -- as far as I have checked, this order of operations isn't causing any crashes or GL errors anywhere else, only on macOS.

So we're looking at a driver bug. At this point I'm not sure about anything anymore, to be frank, and I guess I really need some access to a Mac to debug this.

A totally random idea -- can you apply the following patch to the test and run it again? It's silly (and not required by the spec at all), but maybe that is the missing bit that causes the driver to blow up.

diff --git a/src/Magnum/GL/Test/BufferTextureGLTest.cpp b/src/Magnum/GL/Test/BufferTextureGLTest.cpp
index 29d95d90d..104953f4f 100644
--- a/src/Magnum/GL/Test/BufferTextureGLTest.cpp
+++ b/src/Magnum/GL/Test/BufferTextureGLTest.cpp
@@ -207,7 +207,7 @@ void BufferTextureGLTest::setBufferEmptyFirst() {
     #endif
 
     BufferTexture texture;
-    Buffer buffer;
+    Buffer buffer{Buffer::TargetHint::Texture};
     texture.setBuffer(BufferTextureFormat::RGBA8UI, buffer);
 
     MAGNUM_VERIFY_NO_GL_ERROR();

@bigbike
Copy link
Contributor Author

bigbike commented Aug 30, 2019

@mosra :

Tried. Segfault.
Do not worry. I will help you to find this bug and have a workaround.

If you are a contingent worker at FB, can you ask the HR or related staff to get a macbook pro? See if they can ship one to your country. I do not think it is a problem!

Sorry, I later have 1 hour meeting, will be back in the afternoon. Feel free to go to sleep:)

@mosra
Copy link
Collaborator

mosra commented Aug 30, 2019

Uh.

Plan Z, then: can you try with plain textures? Similarly to what I did with the Primitive ID texture in the Primitive ID shader. I'm pretty sure this will fail on larger models and be slower than BufferTexture, but could be a workable short-term solution to get this running on Mac:

  • TextureFormat::R32UI
  • texelFetch() in the shader to avoid texture filtering and all that unnecessary stuff
  • Texture1D instead of Texture2D might get you bigger max size limits

@bigbike
Copy link
Contributor Author

bigbike commented Aug 30, 2019

Uh.

Plan Z, then: can you try with plain textures? Similarly to what I did with the Primitive ID texture in the Primitive ID shader. I'm pretty sure this will fail on larger models and be slower than BufferTexture, but could be a workable short-term solution to get this running on Mac:

  • TextureFormat::R32UI
  • texelFetch() in the shader to avoid texture filtering and all that unnecessary stuff
  • Texture1D instead of Texture2D might get you bigger max size limits

On Mac, I disabled the buffer texture in the shader. And it gave fairly good results:

test rgba 00000

test rgba 00000

The current status:
It works 100% on Linux. It gives pretty good result on Mac with the workaround.

On Mac:
I believe somehow the buffer texture works (comparing it did not work at all), but I cannot confirm the value (adjFaces within) is correct on GPU when the shader uses it. (by the way, Do you have any ideas how to verify it?). Once I can verify the buffer texture values are correct, I can go further to debug the "fish net" artifacts.

Current workaround: Disable the buffer texture. Just not use the adjFaces.

(Simulator segfaults with the magnum master, So I had to roll back to use the default magnum shipped with the simulator, and added "glBindBuffer, glTexBuffer" before the setBuffer (I still kept the setBuffer in front of setData() though)).

Thinking
adjFaces are ONLY used in texture filtering. The effects brought by it can only be visible when you zoom in quite a bit on the texture. I understand such seamless filtering is rather critical for industry level, film production (e.g., to avoid lighting artifacts when using displacement maps). But considering our project and application, I think the image quality is excellent. So at this moment, I think the practical solution is to fully enable all the functionalities on Linux, while on Mac we disable the buffer texture. (We will try to completely fix it on Mac, once we resolve the segfault in Magnum).
Keep this in our mind that we only use Mac for development visualization the scene. We never train any models using the MacBook. We do the training in the linux boxes. So as long as the linux version is OK, we can go ahead like this.

About the segfault on Mac, and the magnum solution:
As a computer graphics guy for many years, I feel the pain from you. I would like to thoroughly understand what happened, and if there is a workaround. So I will keep debugging it with you.

For the Texture1D idea, I think it is a good idea to try. I would like to see at least some types of "buffer" work, and can pass the adjFaces to the shader correctly on Mac. But like you said, it would fail when the number of texels was big, plus it was slower. So I may not adopt it as our solution here. I will still use the texture buffer in this case. How do you think?

@mosra
Copy link
Collaborator

mosra commented Aug 31, 2019

Do you have any ideas how to verify it?

Magnum has a framework for shader tests and it recently got improved even further for easier iteration. The most used builtin shaders are now all tested with it (see https://github.com/mosra/magnum/tree/master/src/Magnum/Shaders/Test). So if it would be feasible to craft small enough data (and expected output) to test on, then I think this would be a way to go. With the test scaffolding ready, the adjacency could be then (for example) verified by a specially crafted texture where the adjacent faces have a completely different color (and the test checks that the gradient in between is monotonic and doesn't include colors from wrong faces).

Yes, it's some nontrivial work upfront, but it would be useful far beyond this bug I think -- later refactorings will be much easier to do, not to mention opening possibilities for benchmark-guided optimizations and saving us a lot of time when we get to making things work on Vulkan.

Simulator segfaults with the magnum master

This is unfortunate, but I still believe Magnum does the right thing (and so I don't think the change should be reverted) -- the test that triggers the crash on Mac caused GL_INVALID_OPERATION on all drivers before, now it works everywhere except Mac.

On Mac, I disabled the buffer texture in the shader.

So it's basically not used at all now, right? Which means if you'd #ifdef out the crashy part on Mac, the rendering wouldn't change and you could keep using current Magnum master, right?

I would like to thoroughly understand what happened, and if there is a workaround.

Unless I get access to a Mac, I'm afraid I can't debug this any further. I have some ideas, but with this way of "remote debugging" it would cost both you and me an extreme amount of time and I think that time could be spent more wisely elsewhere.

@bigbike
Copy link
Contributor Author

bigbike commented Sep 4, 2019

@mosra:
Good to see your suggestions. The debugTools seems very interesting. I am going to learn more from your online docs.
As suggested by the other team members in the engineering meeting, I am going to split this PR into a couple of small ones to make the code review easier for the others.

@bigbike
Copy link
Contributor Author

bigbike commented Oct 14, 2019

We divided it into a couple of PRs, which were committed in.
The system now can support PTex Mesh rendering.
Many thanks are given to all the reviewers. especially @mosra !
Close this PR.

@bigbike bigbike closed this Oct 14, 2019
@bigbike bigbike deleted the replica branch October 14, 2019 22:45
eundersander pushed a commit to eundersander/habitat-sim that referenced this pull request Aug 6, 2020
…research#132)

Limit requirements search scope to habitat_baselines folder.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed Do not delete this pull request or issue due to inactivity.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants