Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initial burst metadata reader #1

Merged
merged 25 commits into from
Dec 16, 2021
Merged

Conversation

LiangJYu
Copy link
Contributor

@LiangJYu LiangJYu commented Dec 1, 2021

Features:

  • Create ISCE3-compatible Sentinel1 burst class given S1 SAFE, subswath index, and polarization.
  • Monotonically increasing bursts IDs.

Install:

  1. Set up and activate virtual environment with ISCE3.
  2. Clone repository.
$ cd ~/src
$ git clone https://github.com/LiangJYu/sentinel1-reader.git
  1. Install into virtual environment with pip. From clone directory:
$ cd sentinel1-reader
$ pip install .

Usage:

To process a single burst from a S1 SAFE zip and print burst ID + center lon/lat, from sentinel1-reader/bin directory:

$ python s1_read.py <path to S1 SAFE zip> <subswath index> <polarization>

Todo:

  • Figure out why pip install is not working.

Copy link
Contributor

@hfattahi hfattahi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @LiangJYu. Very exciting that we are getting very close to have this reader work. I have not made it through the PR and test extensively. But here is few minor comments mainly suggestions on names of the class members. I will get to the rest of the PR soon.

sentinel1_reader/sentinel1_burst_slc.py Outdated Show resolved Hide resolved
sentinel1_reader/sentinel1_burst_slc.py Outdated Show resolved Hide resolved
sentinel1_reader/sentinel1_burst_slc.py Outdated Show resolved Hide resolved
sentinel1_reader/sentinel1_burst_slc.py Outdated Show resolved Hide resolved
sentinel1_reader/sentinel1_burst_slc.py Outdated Show resolved Hide resolved
sentinel1_reader/sentinel1_burst_slc.py Outdated Show resolved Hide resolved
sentinel1_reader/sentinel1_burst_slc.py Outdated Show resolved Hide resolved
Copy link
Contributor

@vbrancat vbrancat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much @LiangJYu . I left some minor comments and I hope to try the PR out asap. I was wondering if you have some Jupiter notebook displaying the usage of the reader. Might be worth uploading them in a folder in this repo.

bin/s1_read.py Outdated Show resolved Hide resolved
item_valid = lambda item, sat_id: os.path.isfile(item) and sat_id in item
orbit_files = [item for item in os.listdir(orbit_dir)
if item_valid(f'{orbit_dir}/{item}', self.platform_id)]
if not orbit_files:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my curiosity. At the moment the reader is assuming to find the orbit file allocated somewhere in a directory. Would it make sense to introduce functionality to download the orbits in case these files are missing? We do have prototypes for that so it should not be too much work.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point @vbrancat I was actually thinking the same. But then I was hesitant if we want to have the reader to be isolated as a pure reader and let workflows at higher level be worried about downloading orbits and all the junk around (connection failure etc). I don't have a strong opinion really but I just wanted to make sure you also note that point.
Anyways I think the reader should accept external orbit and that should be the default behavior.
A middle ground would be to have a flag for orbit_download which is off by default. So if the orbit is not found and if the user specifically asks to download, then the reader attempts to download.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. The higher-level workflow should properly handle this (e.g. check the internet connection and all this stuff). However, I do not see any harm in providing an optional functionality to download the orbit (if not present). Maybe the functionality (it can be a simple flag) should not have too much logic in it (e.g. not checking the internet connection). I will let @LiangJYu illuminate us if this solution does not violate the "separation of functionalities" principle (https://en.wikipedia.org/wiki/Single-responsibility_principle)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me as far as it is optional to use.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Especially when running in production, I also agree the higher-level workflow should download the orbits. For other use cases, I'm not opposed to including a download option. Perhaps something like the following?

def get_local_orbit_file(t_start, t_stop, orbit_dir):
    # try to find and return path file in orbit_dir else return empty string

def get_online_orbit_file(t_start, t_stop):
    # try to find and return file from online source else return empty string

def get_isce3_orbit(..., online_ok=False):
    ...
    # try local first
    orbit_file = get_local_orbit_file(...)

    # try online if nothing local and ok to go online
    if not orbit_file and online_ok:
        orbit_file = get_online_orbit_file(...)

        # check if nothing found online
        if not orbit_file:
            raise ValueError("no orbit file found")
    ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me.

sentinel1_reader/sentinel1_reader.py Outdated Show resolved Hide resolved
sentinel1_reader/sentinel1_reader.py Outdated Show resolved Hide resolved
center_lat = np.mean(burst_lats)
center_pts[i] = (center_lon, center_lat)

boundary_pts[i] = [(lon, lat) for lon, lat in zip(burst_lons,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if it makes sense to allocate them as WKT (might be more easily readable by other software).

sentinel1_reader/sentinel1_reader.py Outdated Show resolved Hide resolved
sentinel1_reader/sentinel1_reader.py Outdated Show resolved Hide resolved
setup.cfg Outdated Show resolved Hide resolved
d_seconds = 0.5 * (self.shape[0] - 1) * self.azimuth_time_interval
return self.sensing_start + datetime.timedelta(seconds=d_seconds)

def get_isce3_orbit(self, orbit_dir: str):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to load orbit while the XML is read and to make it a member of Sentinel1BurstSlc. It'll be highly unlikely that a swath will span multiple orbit files, so repeated calls current function is doing a lot repeated/redundant file reading.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that orbit may be a member of sentinel-1 burst SLC. But still you need a method to return the orbit as we will need orbit for workflows as you know.
The extra call to get_orbit did not bother me. actually it's more clear what is going on.

bin/s1_read.py Outdated Show resolved Hide resolved
bin/s1_read.py Outdated Show resolved Hide resolved
orbit made a member of sentinel1 burst class
orbit parsed from state vector list in annotation reader
orbit state vector load stand alone function
@LiangJYu
Copy link
Contributor Author

After the most recent commits, I think the interfaces will be changing much so this will be ready to use in workflow development.

Orbit download from ASF not implemented. Is it immediately required for workflow development?

@vbrancat
Copy link
Contributor

@LiangJYu, orbit downloading in the sentinel reader are not necessary at the moment. They are "nice to have" but they can be added later. Plus, we will have the S1 CSLC managing that. Let's get the s1_reader merged and start working on a draft of the S1 CSLC workflow.

@LiangJYu
Copy link
Contributor Author

@yunjunz Do have a preference on the repository directory/file structure? Do you prefer something closer to you've done in this PR?

@yunjunz
Copy link
Member

yunjunz commented Dec 10, 2021

@LiangJYu I do like the src/{module_name} structure more, it would be great if we could keep the same style here too.

@LiangJYu LiangJYu changed the title WIP initial burst metadata reader initial burst metadata reader Dec 10, 2021
@hfattahi
Copy link
Contributor

@LiangJYu would you please update the README with the instruction that you have in the PR description above.
Also you may add few lines of example as you had in a script. e.g.:

from sentinel1_reader import sentinel1_reader, sentinel1_orbit_reader

zip_path = "S1A_IW_SLC__1SDV_20190909T134419_20190909T134446_028945_03483B_B9E1.zip"
i_subswath = 2
pol = "HH"

#returns the list of the bursts
bursts = sentinel1_reader.zip2bursts(zip_path, osv_list, i_subswath, pol)

@@ -0,0 +1,131 @@
*.DS_Store
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we do not need the .gitignore

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like having an ignore file as an easy way of ensuring ancillary/extraneous files that get created in the repo directory don't get committed into the project. e.g. Calling git commit -am "some message" when there's random non-source files in the repo.

Are there any specific reason(s) why your find the ignore file unnecessary?

Comment on lines 88 to 89
inwidth = self.last_valid_sample - self.first_valid_sample
inlength = self.last_valid_line - self.first_valid_line
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should you add one to these lines here? If the last_valid_sample is 3 and first_valid_sample is 1, you have three lines, i.e. 1, 2, 3. The current code will return 2 lines.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see this comment addressed yet. Please let me know if it makes sense or not. Thank you.


# determine start and end times from file name
file_name_tokens = os.path.basename(zip_path).split('_')
platform_id = file_name_tokens[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are probably expecting platform_id to be S1A or S1B. Maybe you can add a small check here to check whether the value is correct. The user could have provided a ZIP file containing Sentinel-1 data but renamed to a non-standard name.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Please, let me know if you think the check is not needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See lines 35-37 below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add in 7de0f9a

Comment on lines 20 to 23
t_str : string
Time string to be parsed.
fmt : string
Format of string provided. Defaults to az time format found in annotation XML.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since datetime formats can vary, it's probably helpful if you add one example to the parameters' description. Something like:

Suggested change
t_str : string
Time string to be parsed.
fmt : string
Format of string provided. Defaults to az time format found in annotation XML.
t_str : string
Time string to be parsed (e.g., "2021-12-10T12:00:0.0").
fmt : string
Format of string provided. Defaults to az time format found in annotation XML (e.g., "%Y-%m-%dT%H:%M:%S.%f").

Returns:
------
_ : datetime.datetime
datetime.dateime object parsed from given time string.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
datetime.dateime object parsed from given time string.
datetime.datetime object parsed from given time string.

starting_slant_range : float
Starting slant range of the burst.
slant_range_res : float
Slant range resolution of the burst.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean "pixel spacing". We use "resolution" for measuring the expected length of a point target which depends on processing parameters.

Suggested change
Slant range resolution of the burst.
Slant-range pixel spacing of the burst.

Comment on lines 155 to 161
pos = [float(osv[i].text) for i in range(4,7)]
vel = [float(osv[i].text) for i in range(7,10)]
if t_orbit > sensing_start - pad:
orbit_sv.append(isce3.core.StateVector(isce3.core.DateTime(t_orbit),
pos, vel))
if t_orbit > sensing_stop + pad:
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Early escape to simplify code.

Suggested change
pos = [float(osv[i].text) for i in range(4,7)]
vel = [float(osv[i].text) for i in range(7,10)]
if t_orbit > sensing_start - pad:
orbit_sv.append(isce3.core.StateVector(isce3.core.DateTime(t_orbit),
pos, vel))
if t_orbit > sensing_stop + pad:
break
if t_orbit < sensing_start - pad:
continue
if t_orbit > sensing_stop + pad:
break
pos = [float(osv[i].text) for i in range(4,7)]
vel = [float(osv[i].text) for i in range(7,10)]
orbit_sv.append(isce3.core.StateVector(isce3.core.DateTime(t_orbit),
pos, vel))

first_valid_samples[last_line])
last_sample = min(last_valid_samples[first_valid_line],
last_valid_samples[last_line])
n_valid_samples = last_sample - first_valid_sample
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above. If last_sample is 3 and first_valid_sample is 1, shouldn't n_valid_samples equal 3 rather than 2?

@LiangJYu
Copy link
Contributor Author

All issues/comments have been addressed. If there's nothing else, I believe this PR is ready to be merged.

@vbrancat
Copy link
Contributor

@LiangJYu Would you mind fix the "codacy" issues before merging?

@LiangJYu
Copy link
Contributor Author

I'm giving up on addressing Codacy formatting issues. In acaa997, I addressed issues flagged from 9aed622 with suggestions from the 9aed622 Codacy summary. Codacy issues in acaa997 seem to conflict with issues and suggestions from 9aed622.

If we're going to do formatting checks, I much prefer using tools that can also be run on both GitHub and on the local development environment. This arrangement would reduce the number of commits attempting to appease the black box formatting test.

@LiangJYu
Copy link
Contributor Author

@yunjunz Thanks for addressing the formatting in the README! I'm surprised fe4df29 didn't encounter all the flags in Python that acaa997 encountered (see below).
image

@yunjunz
Copy link
Member

yunjunz commented Dec 15, 2021

Codacy has a learning curve as well. For the flag, for example, I just disabled this pattern in the codacy review process, so it won't bother us next time.

Copy link
Contributor

@hfattahi hfattahi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @LiangJYu. This looks great. The metadata seems correct and the extracted raster is identical to that extracted by isce2. I have only few minor comments which should be easy to address.

README.md Outdated
2. Clone repository.

```bash
cd ~/src
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a residual from the initial structure. You might just delete it.

bursts : list
List of Sentinel1BurstSlc objects found in annotation XML.
'''
id_str = f'iw{n_subswath}-slc-{pol}'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shall we change pol to pol.lower() so that it won't fail if the user passes "HH" or even "Hv"?

src/sentinel1_reader/sentinel1_reader.py Show resolved Hide resolved
Copy link
Contributor

@vbrancat vbrancat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍 . Thank you so much for have been so patience and having addressed all the issues

README cleanup
subswath index and polarization value checks
Copy link
Contributor

@gshiroma gshiroma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @LiangJYu , I think my comments were not addressed yet. Maybe there is some commit left to push?

Comment on lines 88 to 89
inwidth = self.last_valid_sample - self.first_valid_sample
inlength = self.last_valid_line - self.first_valid_line
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see this comment addressed yet. Please let me know if it makes sense or not. Thank you.


# determine start and end times from file name
file_name_tokens = os.path.basename(zip_path).split('_')
platform_id = file_name_tokens[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Please, let me know if you think the check is not needed.

@LiangJYu
Copy link
Contributor Author

@gshiroma width and length addressed here with 20bc7ad

Were you expecting something else?

Copy link
Contributor

@gshiroma gshiroma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the changes now. I don't know why GitHub was not showing me your updates previously. Thank you for addressing all my comments. Nice update, @LiangJYu ! LGTM!

@LiangJYu LiangJYu merged commit 8e46b5b into isce-framework:main Dec 16, 2021
LiangJYu pushed a commit to LiangJYu/s1-reader that referenced this pull request Aug 2, 2022
Updating branch `annotation_reader` in this for to date
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants