Please consider adding a non locking file mode for usdc files ( USD Memory Asset Class ) #1852

koen-v · 2022-05-05T12:15:18Z

The way usd handles usdc file by default locks the file from writing, this is to prevent corruption.

USD by default uses ArFilesystemAsset class to hold on to files. This class keeps the file open while the asset is in use and allows assets to be delayed read from disk. The problem with this approach is that the file may be written to by another thread or process while the file is left open. Subsequent reads will read garbage and either corrupt the asset or crash.

To solve this in our internal build, a new asset class, ArMemoryAsset, has been created. This new class will read the entire file up front and close it. All read operations are serviced using an in-memory buffer. Files are now opened with _SH_SECURE which prevent files to be opened for write while the file is being read from. This will prevent any corruption or crashes due to reads occurring after writes.

Fixes for USD

USDC files being locked while a usdStage is open for read when using _SH_SECURE.
USDC corruption if the file is written to while open for read when using _SH_DENYNO.

We are happy to share this code, we have a CLA in place. However, we have not implemented a nice way to choose between the options, we completely replace all usdc reading with this, which won't be appropriate in many cases. Some options we considered, try to solve it with a resolver, perhaps an env var, or setting in the pluginfo, or perhaps even adding ".usdb" as a new file extension.

The memory asset class:

/// \class ArMemoryAsset
///
/// ArAsset implementation for asset represented by a block of memory.
class ArMemoryAsset
    : public ArAsset
{
public:
    /// Constructs an ArMemoryAsset for the given \p file. 
    /// The ArMemoryAsset object takes ownership of \p file and will
    /// delete the memory on destruction.
    AR_API
    explicit ArMemoryAsset(std::shared_ptr<const char>& file, size_t fileSize);

    /// Constructs an ArMemoryAsset for the given \p file. 
    /// The ArMemoryAsset object takes ownership of \p file and will
    /// close it immediately.
    AR_API
    explicit ArMemoryAsset(FILE* file);

    /// Returns the size of the file held by this object.
    AR_API
    virtual size_t GetSize() override;

    /// Creates a read-only memory map for the file held by this object
    /// and returns a pointer to the start of the mapped contents.
    AR_API
    virtual std::shared_ptr<const char> GetBuffer() override;
    
    /// Reads \p count bytes from the file held by this object at the
    /// given \p offset into \p buffer.
    AR_API
    virtual size_t Read(void* buffer, size_t count, size_t offset) override;

    /// Returns the FILE* handle this object was created with and an offset
    /// of 0, since the asset's contents are located at the beginning of the
    /// file.
    AR_API        
    virtual std::pair<FILE*, size_t> GetFileUnsafe() override;

private:
    std::shared_ptr<const char> _buffer;
    size_t _size = 0;
};

And the change we made to the resolver ( v2 shown here)

std::shared_ptr
ArDefaultResolver::_OpenAsset(
const ArResolvedPath& resolvedPath)
{
FILE* f = ArchOpenFile(resolvedPath.GetPathString().c_str(), "rb");
if (!f) {
return nullptr;
}

return std::shared_ptr<ArAsset>(new ArMemoryAsset(f));

}

Thanks,
Koen

The text was updated successfully, but these errors were encountered:

spiffmon · 2022-05-07T04:20:18Z

Thank you for filing, @koen-v , and for the details! I think we will want to implement this feature at the UsdUsdcFileFormat level, which gives us one way of selecting the behavior (SdfFileFormatArgs on the reference/subLayer assetPath), but more importantly allows us to take further advantage of having the whole layer in memory: when we Reload() or TransferContents the layer, we will be able to do the same fine-grained change-processing that we currently can do for usda format layers, rather than just throwing up our hands and saying "everything in the layer may have changed". This may have a positive impact on workflows where you actually do regenerate a layer in one app, and update a consuming app to see the changes.

koen-v · 2022-05-07T11:49:47Z

Hello Spiff, that sounds like a great solution, thank you very much for considering a solution to this!

Please let me know if there is anything we can do to help from our side.
Cheers,
Koen

jilliene · 2022-05-09T17:49:14Z

Filed as internal issue #USD-7361

hybridherbst · 2022-06-03T15:26:04Z

I'd also like to see this in usdview for USDZ files – for us it's quite common to regenerate the viewed files frequently, and currently one needs to close usdview, regenerate, open the file in the viewer again. Better would be a "refresh" button/shortcut, or even automatic reload :)

spiffmon · 2022-06-03T19:26:34Z

That's good to know, @hybridherbst ... however, as far as usdview is concerned, "File > Reopen Stage" should work for your purposes, since it first (attempts to, given python lifetime management) closes the current stage and all layers, before re-opening the root layer. Does that not work for you?

hybridherbst · 2022-06-03T19:40:40Z

It doesn't work since the file is locked and can't be overwritten.

Our usecase is specifically viewing monolithic/self-contained USDZ files – since the viewer locks those when opening they can't be overwritten without closing the viewer, so I can't really test the "Reopen Stage" workflow (since it's locked it's always the same content).

spiffmon · 2022-06-03T21:29:58Z

Ah, of course - sorry I didn't see that!

oumad · 2022-06-22T18:56:11Z

Any updates regarding this issue ?
This is very problematic for us when a different user updates an asset while it is staged by someone else in maya-usd or Unreal Engine, the asset appears updated for everyone except for that person who had it staged, and no stage reloading or mute/unmuting works, the only way to see the updates in that instance is to close the process and re-open it. This ofcourse doesn't happen with usda data, only the payloads which are usually in usdc.

spiffmon · 2022-06-27T18:51:32Z

Hi @oumad , while we agree this will be a useful and welcome feature, we are currently fully engaged with upgrading USD to handle new platforms and dependent packages without regression, general support, and a small number of features relevant to lighting and rendering workflows. We are also working on expanding our team, but it may still be some time before we can tackle this.

oumad · 2022-06-27T19:33:48Z

Hi @oumad , while we agree this will be a useful and welcome feature, we are currently fully engaged with upgrading USD to handle new platforms and dependent packages without regression, general support, and a small number of features relevant to lighting and rendering workflows. We are also working on expanding our team, but it may still be some time before we can tackle this.

I understand the priorities, thank you for getting back to me. Good luck with all of your challenges ! I hope there is anything I could help with.

sunyab · 2022-08-18T22:04:07Z

Hi all, just wanted to mention that we're actively working on this right now.

Our current plan is to add API in Sdf that allows users to specify layers that should be read fully into memory and detached from the underlying backing store. This will use new API on SdfFileFormat that requires layer data to be read into memory and new API in Ar to retrieve an "in-memory" asset from an existing ArAsset. We'll provide default implementations that should just work, but may be optimized for particular SdfFileFormat or ArAsset implementations.

The Sdf API will accept some form of pattern so that you can easily express things like "all layers should be read into memory" or "all layers except these should be read into memory." Users will also be able to specify these expressions via an environment variable. We'll also add command-line arguments to usdview to support this feature.

overwritten with a crate-based layer. This would result in the contents of the crate layer being copied into the existing in-memory SdfData object instead of the layer taking ownership of the streaming CrateData object and avoiding the cost of copying all of its data into memory. See #1852 (Internal change: 2246984)

Sometimes the call to copyfile does not change the mtime on the destination file, which causes the call to Reload to be skipped. Force the reload to avoid this issue. See #1852 (Internal change: 2247029)

See #1852 (Internal change: 2247063)

A detached layer is detached from its serialized backing store, isolating it from any external changes to the serialized data. Prior to this change, whether a layer was detached was determined solely by the layer's file format plugin and the ArAsset implementation (if one was used by the file format). For example, .usdc layers are typically *not* detached layers. The Crate file format with the default ArFilesystemAsset holds on to a memory mapping for the file on disk and reads data from it as needed. In this case, having a .usdc layer open in a process and overwriting the file in another can lead to crashes or other workflow issues. The new detached layer feature lets users to specify layers that should not maintain persistent connections to their data store. In the example above, a detached .usdc layer would not maintain a memory mapping to the source file on disk. This would allow other processes to overwrite the file without introducing the issues described above. Detached layers are specified via include/exclude patterns in the new SdfLayer::DetachedLayerRules object. SdfLayer will do simple substring matches against these patterns. For example, a user could specify that all layers should be detached, except for those that contain "Sim" or "FX" in their identifier. New virtual methods for initializing and reading to a detached layer were added to the SdfFileFormat interface to support this feature. The default implementations use an SdfData object, which is the same in-memory data container used by the .usda file format. A new virtual method was also added to SdfAbstractData to query whether that data object is detached. By default, this returns true if the data object is not streaming, false otherwise. The default implementation for SdfFileFormat::_ReadDetached will issue a warning if it detects that the file format produced a non-detached data object from Read and perform a (potentially expensive) copy of the layer data into a new SdfData object. File formats that are able to provide a more efficient implementation should do so, but for convenience they can opt in to the copying behavior by just calling _ReadAndCopyLayerDataToMemory in their implementation. The main takeaway is that non-streaming file formats typically will not have to implement anything new, since they are already classified as detached by default. Streaming file formats will need to implement at least _ReadDetached. Fixes #1852 (Internal change: 2247699)

Note that support for the .usda file format was added in a previous change. Fixes #1852 (Internal change: 2247700) (Internal change: 2247748)

Fixes #1852 (Internal change: 2247701)

Detached layers in usdview can be specified using the arguments --detachLayers, --detachLayersInclude, and --detachLayersExclude. Fixes #1852 (Internal change: 2247862)

pixar-oss pushed a commit that referenced this issue Sep 1, 2022

Refactor to centralize creation of a layer's data object.

25e1f5e

See #1852 (Internal change: 2247063)

pixar-oss pushed a commit that referenced this issue Sep 3, 2022

Add support for detached layers to .usd/c/z file formats.

df9ba3e

Note that support for the .usda file format was added in a previous change. Fixes #1852 (Internal change: 2247700) (Internal change: 2247748)

pixar-oss pushed a commit that referenced this issue Sep 3, 2022

Add support for detached layers to Alembic file format plugin

0789ed3

Fixes #1852 (Internal change: 2247701)

pixar-oss closed this as completed in bc06715 Oct 21, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Please consider adding a non locking file mode for usdc files ( USD Memory Asset Class ) #1852

Please consider adding a non locking file mode for usdc files ( USD Memory Asset Class ) #1852

koen-v commented May 5, 2022

spiffmon commented May 7, 2022

koen-v commented May 7, 2022

jilliene commented May 9, 2022

hybridherbst commented Jun 3, 2022

spiffmon commented Jun 3, 2022

hybridherbst commented Jun 3, 2022

spiffmon commented Jun 3, 2022

oumad commented Jun 22, 2022

spiffmon commented Jun 27, 2022

oumad commented Jun 27, 2022

sunyab commented Aug 18, 2022

Please consider adding a non locking file mode for usdc files ( USD Memory Asset Class ) #1852

Please consider adding a non locking file mode for usdc files ( USD Memory Asset Class ) #1852

Comments

koen-v commented May 5, 2022

spiffmon commented May 7, 2022

koen-v commented May 7, 2022

jilliene commented May 9, 2022

hybridherbst commented Jun 3, 2022

spiffmon commented Jun 3, 2022

hybridherbst commented Jun 3, 2022

spiffmon commented Jun 3, 2022

oumad commented Jun 22, 2022

spiffmon commented Jun 27, 2022

oumad commented Jun 27, 2022

sunyab commented Aug 18, 2022