How to write a NeXus metadata file with the lambda2m? #13

Open
prjemian opened this issue Aug 25, 2023 · 13 comments

@prjemian
Contributor

@qzhang234 asks (on Teams):

I believe the first step to saving a Nexus file with Lambda2M is to write a Bluesky plan? Would you happen to have a template with Nexus writing to help me get started?

@prjemian prjemian added the question Further information is requested label Aug 25, 2023
@prjemian
Contributor Author

RE(bp.count([lambda2m])) is the most basic test. Inside a plan:

def my_plan():
    yield from bp.count([lambda2m])

Then run RE(my_plan()).

@prjemian
Contributor Author

Template with NeXus writing: we are using the NXWriter from apstools, and there is no documentation yet on how to customize it. Here is the guide demonstrating the NXWriter:

https://bcda-aps.github.io/apstools/dev/examples/fw_nxwriter.html

Instead of NXWriter, we're using NXWriterAPS, since it understands the APS. NXWriter is for use outside of APS beamlines.

@prjemian
Contributor Author

In a related issue, writing the run metadata could use standardized methods. We identified in our conf call today that the standardized methods could be more general. The related issue (BCDA-APS/bluesky_training#244) attempts to generalize the standardized methods.

@qzhang234 asked:

... how to use this with the Run Engine? Should I create a plan just like the one shown above, but replace motor_start_preprocessor with bp.count([lambda2m]), and then the plan would execute the count command and create the Nexus file?

Once this generalized method is implemented in the XPCS instrument package, it will not be necessary to change how the plan is written. You can use the standard plans (bp.count, for example), and any ophyd objects labeled ad_metadata will be written to the new stream label_start_ad_metadata.

@qzhang234
Contributor

@prjemian I see. So should I add this part to my Bluesky plan and then proceed to the normal bp.count?

image

@prjemian
Contributor Author

That's not the way it works. This part:

RE.subscribe(nxwriter.receiver)

configures the session to record a new NeXus file with every run. So if you called

RE(bp.count([some_area_detector]))

you would get a NeXus file. Then if you called

RE(bp.scan([det1, det2], motor, 10, 20, 11))

you'd get another NeXus file. Same for

RE(AD_Acquire())

you'd get yet another NeXus file. No extra setup code needed.
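To see why no extra setup is needed, here is a toy sketch of the callback pattern (not the actual NXWriter code). The RunEngine sends each subscribed callback a sequence of (name, document) pairs; every run begins with a "start" document and ends with a "stop" document, so a writer that finalizes its output on "stop" produces one file per run. The class name, the file naming, and the fake documents below are assumptions for illustration only.

```python
class OneFilePerRun:
    """Toy stand-in for a file-writing callback (hypothetical, not NXWriter)."""

    def __init__(self):
        self.files_written = []  # names of the "files" produced so far
        self._current_uid = None
        self._event_count = 0

    def receiver(self, name, doc):
        """Handle one (name, document) pair, as a RunEngine callback would."""
        if name == "start":
            # A new run begins: remember its uid and reset the counter.
            self._current_uid = doc["uid"]
            self._event_count = 0
        elif name == "event":
            self._event_count += 1
        elif name == "stop":
            # The run is over: "write" one file for it (naming is made up).
            self.files_written.append(f"{self._current_uid}.h5")
            self._current_uid = None


writer = OneFilePerRun()

# Simulate the documents emitted by two separate RE(...) calls.
for uid in ("run-a", "run-b"):
    writer.receiver("start", {"uid": uid})
    writer.receiver("event", {"data": {}})
    writer.receiver("stop", {"run_start": uid})

print(writer.files_written)
```

In the real session, RE.subscribe(nxwriter.receiver) takes the place of the simulation loop: the RunEngine calls the receiver for you during every run.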

@prjemian
Contributor Author

As we discussed on Friday, NXWriter() by default tries to copy the area detector image from the EPICS IOC-created file into the bluesky-created NeXus file. That's not what we want for XPCS. We'll need to make some local changes to the standard NXWriter code.

@prjemian
Contributor Author

Note for me: write_stream_external() is called when the area detector HDF plugin is used. This is probably the method to revise or replace.

@prjemian
Contributor Author

prjemian commented Aug 29, 2023

Not sure we want this as the final code in the MyNXWriter class, but it is a starting point:

    def write_stream_external(self, parent, d, subgroup, stream_name, k, v):
        """Link to the area detector image file instead of copying its data."""
        resource_id = self.get_unique_resource(d)
        fname = self.getResourceFile(resource_id)

        # Record the file name, with attributes describing where the data lives.
        ds = subgroup.create_dataset("file", data=fname.name)
        h5addr = "/entry/data/data"
        ds.attrs["target"] = ds.name
        ds.attrs["source_file"] = str(fname)
        ds.attrs["source_address"] = h5addr
        ds.attrs["resource_id"] = resource_id
        ds.attrs["shape"] = v.get("shape", "")

        # Point to the image data in the IOC-created file.
        subgroup["external"] = h5py.ExternalLink(str(fname), h5addr)

    def get_unique_resource(self, d):
        # count number of unique resources (expect only 1)
        resource_id_list = []
        for datum_id in d:
            resource_id = self.externals[datum_id]["resource"]
            if resource_id not in resource_id_list:
                resource_id_list.append(resource_id)
        if len(resource_id_list) != 1:
            # fmt: off
            raise ValueError(
                f"{len(resource_id_list)}"
                f" unique resource UIDs: {resource_id_list}"
            )
            # fmt: on
        return resource_id_list[0]

On my local workstation, it results in this HDF5 structure:

            adsimdet --> /entry/instrument/bluesky/streams/primary/adsimdet_image
            adsimdet_image:NXdata
              @NX_class = "NXdata"
              @signal_type = "detector"
              @target = "/entry/instrument/bluesky/streams/primary/adsimdet_image"
              EPOCH:NX_FLOAT64 = 1693267642.3699787
                @long_name = "epoch time (s)"
                @target = "/entry/instrument/bluesky/streams/primary/adsimdet_image/EPOCH"
                @units = "s"
              external: missing external file
                @file = "/tmp/docker_ioc/iocad/tmp/adsimdet/2023/08/28/e355b37c-0d71-4dc5-a5b4_000000.h5"
                @path = "/entry/data/data"
              file:NX_CHAR = b'e355b37c-0d71-4dc5-a5b4_000000.h5'
                @resource_id = "5357ad07-3d01-46eb-b9de-969c9ae708cf"
                @shape = [ 100 1024 1024]
                @source_address = "/entry/data/data"
                @source_file = "/tmp/docker_ioc/iocad/tmp/adsimdet/2023/08/28/e355b37c-0d71-4dc5-a5b4_000000.h5"
                @target = "/entry/instrument/bluesky/streams/primary/adsimdet_image/file"
              time:NX_FLOAT64 = 0.0
                @long_name = "time since first data (s)"
                @start_time = 1693267642.3699787
                @start_time_iso = "2023-08-28T19:07:22.369979"
                @target = "/entry/instrument/bluesky/streams/primary/adsimdet_image/time"
                @units = "s"

@prjemian
Contributor Author

Note: external: missing external file is specific to my workstation. If the external file were accessible, the external link would show the data in that file.

@qzhang234
Contributor

@prjemian This is fantastic! Should I do a git pull on Kouga?

Also, per our discussion today, how do I unsubscribe nxwriter? Not super important at this moment, just curious

@prjemian
Contributor Author

Nothing to pull. We'll use the old copy-and-paste technique.

To unsubscribe, we need the integer key that was returned to us when we first subscribed. Since we did not store that key, it's not easy to get it later. That's why I had you comment out that part in the setup.
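For future sessions, a minimal sketch of keeping that key: RE.subscribe() returns an integer token, and RE.unsubscribe(token) removes the callback again. The FakeRunEngine class below is only a hypothetical stand-in so the sketch runs without bluesky installed; the helper names and the tokens dictionary are assumptions too. In a real session, pass RE and nxwriter.receiver instead.

```python
import itertools


class FakeRunEngine:
    """Hypothetical stand-in that mimics the subscribe/unsubscribe tokens."""

    def __init__(self):
        self.callbacks = {}  # token -> callback
        self._next_token = itertools.count()

    def subscribe(self, func):
        token = next(self._next_token)
        self.callbacks[token] = func
        return token

    def unsubscribe(self, token):
        del self.callbacks[token]


tokens = {}  # remembers the token for each named subscription


def subscribe_named(run_engine, name, callback):
    """Subscribe a callback and store its token under a readable name."""
    tokens[name] = run_engine.subscribe(callback)
    return tokens[name]


def unsubscribe_named(run_engine, name):
    """Unsubscribe using the stored token, then forget it."""
    run_engine.unsubscribe(tokens.pop(name))


def receiver(name, doc):
    """Placeholder for nxwriter.receiver."""


engine = FakeRunEngine()
subscribe_named(engine, "nxwriter", receiver)
subscribed_count = len(engine.callbacks)

unsubscribe_named(engine, "nxwriter")
remaining_count = len(engine.callbacks)
```

Storing the token at subscription time is the whole trick; the dictionary just makes it easy to find again by name.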

@prjemian
Contributor Author

Revision to write_stream_external() from above:

    def write_stream_external(self, parent, d, subgroup, stream_name, k, v):
        resource_id = self.get_unique_resource(d)
        fname = self.getResourceFile(resource_id)

        h5addr = "/entry/data/data"  # TODO: check the path
        # An h5py.ExternalLink object has no .attrs or .name, so the
        # descriptive attributes stay on a small "file" dataset.
        ds = subgroup.create_dataset("file", data=fname.name)
        ds.attrs["target"] = ds.name
        ds.attrs["source_file"] = str(fname)
        ds.attrs["source_address"] = h5addr
        ds.attrs["resource_id"] = resource_id
        ds.attrs["shape"] = v.get("shape", "")
        # The link itself is now named "value".
        subgroup["value"] = h5py.ExternalLink(str(fname), h5addr)

@prjemian
Contributor Author

prjemian commented Jun 10, 2024
