Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Please consider adding a non locking file mode for usdc files ( USD Memory Asset Class ) #1852

Closed
koen-v opened this issue May 5, 2022 · 11 comments

Comments

@koen-v
Copy link

koen-v commented May 5, 2022

The way usd handles usdc file by default locks the file from writing, this is to prevent corruption.

USD by default uses ArFilesystemAsset class to hold on to files. This class keeps the file open while the asset is in use and allows assets to be delayed read from disk. The problem with this approach is that the file may be written to by another thread or process while the file is left open. Subsequent reads will read garbage and either corrupt the asset or crash.

To solve this in our internal build, a new asset class, ArMemoryAsset, has been created. This new class will read the entire file up front and close it. All read operations are serviced using an in-memory buffer. Files are now opened with _SH_SECURE which prevent files to be opened for write while the file is being read from. This will prevent any corruption or crashes due to reads occurring after writes.

Fixes for USD

  • USDC files being locked while a usdStage is open for read when using _SH_SECURE.
  • USDC corruption if the file is written to while open for read when using _SH_DENYNO.

We are happy to share this code, we have a CLA in place. However, we have not implemented a nice way to choose between the options, we completely replace all usdc reading with this, which won't be appropriate in many cases. Some options we considered, try to solve it with a resolver, perhaps an env var, or setting in the pluginfo, or perhaps even adding ".usdb" as a new file extension.

The memory asset class:

/// \class ArMemoryAsset
///
/// ArAsset implementation for asset represented by a block of memory.
class ArMemoryAsset
    : public ArAsset
{
public:
    /// Constructs an ArMemoryAsset for the given \p file. 
    /// The ArMemoryAsset object takes ownership of \p file and will
    /// delete the memory on destruction.
    AR_API
    explicit ArMemoryAsset(std::shared_ptr<const char>& file, size_t fileSize);

    /// Constructs an ArMemoryAsset for the given \p file. 
    /// The ArMemoryAsset object takes ownership of \p file and will
    /// close it immediately.
    AR_API
    explicit ArMemoryAsset(FILE* file);

    /// Returns the size of the file held by this object.
    AR_API
    virtual size_t GetSize() override;

    /// Creates a read-only memory map for the file held by this object
    /// and returns a pointer to the start of the mapped contents.
    AR_API
    virtual std::shared_ptr<const char> GetBuffer() override;
    
    /// Reads \p count bytes from the file held by this object at the
    /// given \p offset into \p buffer.
    AR_API
    virtual size_t Read(void* buffer, size_t count, size_t offset) override;

    /// Returns the FILE* handle this object was created with and an offset
    /// of 0, since the asset's contents are located at the beginning of the
    /// file.
    AR_API        
    virtual std::pair<FILE*, size_t> GetFileUnsafe() override;

private:
    std::shared_ptr<const char> _buffer;
    size_t _size = 0;
};

And the change we made to the resolver ( v2 shown here)

std::shared_ptr
ArDefaultResolver::_OpenAsset(
const ArResolvedPath& resolvedPath)
{
FILE* f = ArchOpenFile(resolvedPath.GetPathString().c_str(), "rb");
if (!f) {
return nullptr;
}

return std::shared_ptr<ArAsset>(new ArMemoryAsset(f));

}


Thanks,
Koen

@spiffmon
Copy link
Member

spiffmon commented May 7, 2022

Thank you for filing, @koen-v , and for the details! I think we will want to implement this feature at the UsdUsdcFileFormat level, which gives us one way of selecting the behavior (SdfFileFormatArgs on the reference/subLayer assetPath), but more importantly allows us to take further advantage of having the whole layer in memory: when we Reload() or TransferContents the layer, we will be able to do the same fine-grained change-processing that we currently can do for usda format layers, rather than just throwing up our hands and saying "everything in the layer may have changed". This may have a positive impact on workflows where you actually do regenerate a layer in one app, and update a consuming app to see the changes.

@koen-v
Copy link
Author

koen-v commented May 7, 2022

Hello Spiff, that sounds like a great solution, thank you very much for considering a solution to this!

Please let me know if there is anything we can do to help from our side.
Cheers,
Koen

@jilliene
Copy link

jilliene commented May 9, 2022

Filed as internal issue #USD-7361

@hybridherbst
Copy link
Contributor

I'd also like to see this in usdview for USDZ files – for us it's quite common to regenerate the viewed files frequently, and currently one needs to close usdview, regenerate, open the file in the viewer again. Better would be a "refresh" button/shortcut, or even automatic reload :)

@spiffmon
Copy link
Member

spiffmon commented Jun 3, 2022

That's good to know, @hybridherbst ... however, as far as usdview is concerned, "File > Reopen Stage" should work for your purposes, since it first (attempts to, given python lifetime management) closes the current stage and all layers, before re-opening the root layer. Does that not work for you?

@hybridherbst
Copy link
Contributor

It doesn't work since the file is locked and can't be overwritten.

Our usecase is specifically viewing monolithic/self-contained USDZ files – since the viewer locks those when opening they can't be overwritten without closing the viewer, so I can't really test the "Reopen Stage" workflow (since it's locked it's always the same content).

@spiffmon
Copy link
Member

spiffmon commented Jun 3, 2022

Ah, of course - sorry I didn't see that!

@oumad
Copy link

oumad commented Jun 22, 2022

Any updates regarding this issue ?
This is very problematic for us when a different user updates an asset while it is staged by someone else in maya-usd or Unreal Engine, the asset appears updated for everyone except for that person who had it staged, and no stage reloading or mute/unmuting works, the only way to see the updates in that instance is to close the process and re-open it. This ofcourse doesn't happen with usda data, only the payloads which are usually in usdc.

@spiffmon
Copy link
Member

Hi @oumad , while we agree this will be a useful and welcome feature, we are currently fully engaged with upgrading USD to handle new platforms and dependent packages without regression, general support, and a small number of features relevant to lighting and rendering workflows. We are also working on expanding our team, but it may still be some time before we can tackle this.

@oumad
Copy link

oumad commented Jun 27, 2022

Hi @oumad , while we agree this will be a useful and welcome feature, we are currently fully engaged with upgrading USD to handle new platforms and dependent packages without regression, general support, and a small number of features relevant to lighting and rendering workflows. We are also working on expanding our team, but it may still be some time before we can tackle this.

I understand the priorities, thank you for getting back to me. Good luck with all of your challenges ! I hope there is anything I could help with.

@sunyab
Copy link
Contributor

sunyab commented Aug 18, 2022

Hi all, just wanted to mention that we're actively working on this right now.

Our current plan is to add API in Sdf that allows users to specify layers that should be read fully into memory and detached from the underlying backing store. This will use new API on SdfFileFormat that requires layer data to be read into memory and new API in Ar to retrieve an "in-memory" asset from an existing ArAsset. We'll provide default implementations that should just work, but may be optimized for particular SdfFileFormat or ArAsset implementations.

The Sdf API will accept some form of pattern so that you can easily express things like "all layers should be read into memory" or "all layers except these should be read into memory." Users will also be able to specify these expressions via an environment variable. We'll also add command-line arguments to usdview to support this feature.

pixar-oss pushed a commit that referenced this issue Sep 1, 2022
overwritten with a crate-based layer. This would result
in the contents of the crate layer being copied into the
existing in-memory SdfData object instead of the layer
taking ownership of the streaming CrateData object and
avoiding the cost of copying all of its data into memory.

See #1852

(Internal change: 2246984)
pixar-oss pushed a commit that referenced this issue Sep 1, 2022
Sometimes the call to copyfile does not change the mtime on
the destination file, which causes the call to Reload to
be skipped. Force the reload to avoid this issue.

See #1852

(Internal change: 2247029)
pixar-oss pushed a commit that referenced this issue Sep 1, 2022
pixar-oss pushed a commit that referenced this issue Sep 3, 2022
A detached layer is detached from its serialized backing
store, isolating it from any external changes to the
serialized data. Prior to this change, whether a layer was
detached was determined solely by the layer's file format
plugin and the ArAsset implementation (if one was used by the
file format).

For example, .usdc layers are typically *not* detached layers.
The Crate file format with the default ArFilesystemAsset holds
on to a memory mapping for the file on disk and reads data from
it as needed. In this case, having a .usdc layer open in a
process and overwriting the file in another can lead to crashes
or other workflow issues. The new detached layer feature lets
users to specify layers that should not maintain persistent
connections to their data store. In the example above, a
detached .usdc layer would not maintain a memory mapping to
the source file on disk. This would allow other processes to
overwrite the file without introducing the issues described
above.

Detached layers are specified via include/exclude patterns
in the new SdfLayer::DetachedLayerRules object. SdfLayer
will do simple substring matches against these patterns. For
example, a user could specify that all layers should be
detached, except for those that contain "Sim" or "FX" in
their identifier.

New virtual methods for initializing and reading to a detached
layer were added to the SdfFileFormat interface to support this
feature. The default implementations use an SdfData object, which
is the same in-memory data container used by the .usda file format.

A new virtual method was also added to SdfAbstractData to query
whether that data object is detached. By default, this returns
true if the data object is not streaming, false otherwise.

The default implementation for SdfFileFormat::_ReadDetached will
issue a warning if it detects that the file format produced a
non-detached data object from Read and perform a (potentially
expensive) copy of the layer data into a new SdfData object.
File formats that are able to provide a more efficient
implementation should do so, but for convenience they can
opt in to the copying behavior by just calling
_ReadAndCopyLayerDataToMemory in their implementation.

The main takeaway is that non-streaming file formats typically
will not have to implement anything new, since they are already
classified as detached by default. Streaming file formats will
need to implement at least _ReadDetached.

Fixes #1852

(Internal change: 2247699)
pixar-oss pushed a commit that referenced this issue Sep 3, 2022
Note that support for the .usda file format was added in
a previous change.

Fixes #1852

(Internal change: 2247700)
(Internal change: 2247748)
pixar-oss pushed a commit that referenced this issue Sep 3, 2022
pixar-oss pushed a commit that referenced this issue Sep 12, 2022
Detached layers in usdview can be specified using the arguments
--detachLayers, --detachLayersInclude, and --detachLayersExclude.

Fixes #1852

(Internal change: 2247862)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants