Skip to content
This repository has been archived by the owner on Jun 28, 2018. It is now read-only.

Optical images handling and some other changes #6

Closed
wants to merge 21 commits into from

Conversation

Ren22
Copy link
Contributor

@Ren22 Ren22 commented Feb 14, 2018

Changes and additions to delete optical image (imageupload.js and resolvers)
Schema additions to retrieve fdr numbers
Other fixes

config/test.js Outdated
@@ -2,7 +2,7 @@ let config = {};

config.port = 3011;
config.ws_port = 5667;
config.iso_img_port = 4202;
config.iso_img_port = 4201;
Copy link
Contributor

@iprotsyuk iprotsyuk Feb 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reasons for this?

Copy link
Member

@intsco intsco Feb 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used a different port (4202) for testing on purpose to avoid conflicts with the dev version running locally.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I changed it by mistake

resolvers.js Outdated
let {datasetId} = args;
const payload = jwt.decode(args.jwt, config.jwt.secret);
try {
logger.info(args);
Copy link
Contributor

@iprotsyuk iprotsyuk Feb 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth skipping this, or at least printing the requested dataset ID instead of the whole payload

resolvers.js Outdated
const body = JSON.stringify({
id: datasetId
});
await fetch(url, {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd check the return value here as per MDN, since the query may not have been successful even if no exception has been thrown

resolvers.js Outdated
logger.error(e);
})
// zoom = -1 a trick to get all optical images IDs for a single dataset
if (zoom == -1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of introducing an obscure magic number "-1" here, I'd change the schema.graphql to remove default value for zoom (it's 1 currently), thereby allowing it to be undefined, and add a comment to the schema that undefined is supposed to return image IDs for all zoom levels.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, one may simply move this new code into another method, since the return type is actually different in this case: array of strings vs string. At least I don't see any advantage of having these two queries hidden behind the same GraphQL call

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ren22 is it really needed? You can get all optical image ids for a dataset inside deleteOpticalImage, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@intsco hmm, that's true, I just thought that there is a difference in what each methods does. Basically,

  • deleteOpticalImage
    fetches POST query to sm-engine, in which afterwards del_optical_image method is called to delete all entries in DB
  • opticalImageUrl
    just returns strings of image IDs in optical_image folder (in case of zoom=-1) that are further handled by web-app, which sends for every ID DELETE request to be deleted from File system (handled by Express app.delete). I think in any case retrieving all ids is needed here, or maybe there is a better way for that?

@iprotsyuk another method not to mess up it with the current one is a good idea, thanks! I also though of removing value for zoom to undefined, but then I thought that it may cause some unpredictable behavior, that's why I decided to pick -1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ren22, I'd suggest concentrating all deletion actions at one place, which sm-engine (del_optical_image hanlder) seems to be a best place for. Otherwise, one single action of image removal gets too much distributed

schema.graphql Outdated
@@ -70,6 +70,7 @@ type Dataset {
status: JobStatus
inputPath: String
uploadDateTime: String
fdrCounts: String
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd propose introducing a small data type (with fields fdrLevel & annotationCount) for the return value here and letting GraphQL do all format handling on its own instead of manually doing stringify and parse

imageUpload.js Outdated
await fs.unlink(imgPath);
await fs.remove(dirPath);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ren22 why to remove the whole subdirectory? It might contain other images.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to have more than one image in the same folder? I just found it to be so that for every image there's only one corresponding folder. But this is easy to change

All deletion steps are done in one method async deleteOpticalImage using async/await
Added extra type FdrCountsNumber to GraphQL schema
@intsco
Copy link
Member

intsco commented Feb 20, 2018

@Ren22 please check if optical images get deleted when a datasets is being deleted

resolvers.js Outdated

fdrCounts(ds) {
// Get 10% FDR string -> counts[1], 5% -> counts[0]
let fdrAt10Val = ds._source.annotation_counts[0].counts[1];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation is too specific here. One of ways to make it more general and more useful in the future is adding the needed FDR level as a parameter

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the name of the method I'd expect it to return a map of the form fdr_level -> count_n
I thought we want to display this information in the web app on the datasets page

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the expectation here, but since we don't really need such a map anywhere, I'd transform it to a bit more specific function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think the initial idea was to get all FDR levels with number of counts for each of them. However, after discussion with Theo , only 10% FDR counts were chosen to be shown. Adding FDR level as a param would make this code much more flexible. Another option is to return all FDR levels with counts and then decide on the web-app side, which level to show. I think the latter is the better option, becasue then we can allow user to chose for instance for the level himself w/o extra requests to GQL. What do you think @iprotsyuk @intsco ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess if there're no other FDR-related feature requests for the web-app which could reuse these GQL endpoints, I wouldn't go for the general solution, as it brings more unnecesary data transfer over the network. We can always add the general method once it's needed. So I'd stick to a parametrical FDR count request for the time being.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, thank you @intsco @iprotsyuk
So, I will send all 4 pairs of FDR with counts to web-app with only 10% displayed at the moment.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ren22 you could get the best of all worlds and accept an array of fdr values to return as an input...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @andy-d-palmer! will do that

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ren22, bear in mind that this approach is a bit complicated with argument handling. What if requested FDR level is not one of the pre-computed ones by SM-engine, including cases when it's outside of [0, 100]% interval. Also there's unnecessary degree of freedom in a sense that values in input array can be repeated. Overall, it seems that returning all available FDR counts is more preferable option here.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iprotsyuk great points about the input handling. I added a feature request for making the backend less coupled to the web app (i.e. storing float fdr values rather than discrete levels for display) so it makes sense to have more general api.

resolvers.js Outdated
await checkPermissions(datasetId, payload);
let opticalImageIds = await pg.select('id').from('optical_image').where('ds_id', '=', datasetId);
let rawOpticalImage = await pg.select('optical_image').from('dataset').where('id','=',datasetId);
Array.prototype.push.apply(opticalImageIds, rawOpticalImage);
Copy link
Contributor

@iprotsyuk iprotsyuk Feb 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd go for concat() here to make it more readable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it, thanks @iprotsyuk
Also seems that concat is a bit faster than Array.prototype.push.apply for Safari
just fyi https://jsperf.com/array-prototype-push-apply-vs-concat/17

resolvers.js Outdated
opticalImageIds.forEach((Image) => {
if (Image.id){ //to delete zoomed Image
let url = basePath + `${config.img_upload.categories.optical_image.path}delete/${Image.id}`;
fetch(url, {
Copy link
Member

@intsco intsco Feb 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ren22 Try to await for the result of "fetch" to ensure that exceptions inside a promise are handled well.
And call "fetch" on an incorrect url to see if an exception is raised here and handled below.

schema.graphql Outdated
level: Int
counts: Int
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The type might also need to be updated. See the comment above

schema.graphql Outdated
@@ -70,6 +75,7 @@ type Dataset {
status: JobStatus
inputPath: String
uploadDateTime: String
fdrCounts: FdrCountsNumber
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The field might also need to be updated. See the comment above

resolvers.js Outdated
let opticalImageIds = await pg.select('id').from('optical_image').where('ds_id', '=', datasetId);
let rawOpticalImage = await pg.select('optical_image').from('dataset').where('id','=',datasetId);
Array.prototype.push.apply(opticalImageIds, rawOpticalImage);
opticalImageIds.forEach((Image) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd change the logic here a little bit to reduce the code duplication.

how it works now:

  1. fetch IDs of optical images
  2. fetch ID of raw image
  3. merge the arrays of IDs into 1 array
  4. iterate over the array, and depending on the type of image ID send DELETE request to a specific URL

how I'd put it:

  1. fetch IDs of optical images
  2. generate an array of URLs to send DELETE request to
  3. fetch ID of raw image
  4. generate 1 more URL to send DELETE request to
  5. send DELETE requests to each of generated URLs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @iprotsyuk , I will send changes with a new commit (coming)

schema.graphql Outdated
@@ -70,6 +75,7 @@ type Dataset {
status: JobStatus
inputPath: String
uploadDateTime: String
fdrCounts: FdrCountsNumber
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you should decide whether counts for all FDR levels should be returned by this field (then the use of this custom type FdrCountsNumber is perfectly fine), or a count for specific FDR level is returned (then the level property doesn't make much sense).

Based on the current usage of this field (we don't need counts for all FDR levels), I'd suggest putting it as a function here with fdrLevel parameter, s.t. it'd return just one number.

@Ren22
Copy link
Contributor Author

Ren22 commented Feb 22, 2018

Changes after 2nd code review:

  • Rewritten deleteOpticalImage with only one fetch request per each URL
  • fdrCounts now takes a min and max value of fdrLevel and should return the list of all fdrCounts between them.

resolvers.js Outdated
let allLvlCounts = ds._source.annotation_counts[0].counts;
let fdrLevels = [];
let fdrCounts = [];
allLvlCounts.forEach(Level => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

variables and argument names should start with a small letter

resolvers.js Outdated
await checkPermissions(datasetId, payload);
let opticalImageIds, rawOpticalImage = await pg.select('id').from('optical_image').where('ds_id', '=', datasetId)
let rawOpticalImage = await pg.select('optical_image').from('dataset').where('id', '=', datasetId);
await opticalImageIds.forEach(Image => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

schema.graphql Outdated
@@ -44,6 +44,11 @@ enum JobStatus {
FAILED
}

type fdrCountsNumber {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GraphQL types must start with a capital letter
see examples http://graphql.org/learn/schema/#object-types-and-fields

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@intsco thanks for pointing on that problem and links. I will correct it

Processing requested Fdrlevels, deletion of opt Image
schema.graphql Outdated
@@ -44,6 +44,11 @@ enum JobStatus {
FAILED
}

type FdrCountsNumber {
Copy link
Contributor

@iprotsyuk iprotsyuk Feb 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rename it to FdrCounts as it's shorter and conveys the same meaning

schema.graphql Outdated
@@ -70,6 +75,7 @@ type Dataset {
status: JobStatus
inputPath: String
uploadDateTime: String
fdrCounts(inpFdrLvls: [Float]): FdrCountsNumber
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use ! as a non-null specification everywhere in this declaration and definition of FdrCountsNumber

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @iprotsyuk, I will add it

schema.graphql Outdated
@@ -235,7 +241,7 @@ type Query {
allDatasets(orderBy: DatasetOrderBy = ORDER_BY_DATE,
sortingOrder: SortingOrder = DESCENDING,
filter: DatasetFilter = {}, simpleQuery: String,
offset: Int = 0, limit: Int = 10): [Dataset!]!
offset: Int = 0, limit: Int = 10, inpFdrLvls: [Float]): [Dataset!]!
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somehow, the change in the function signature didn't affect its implementation. Forgot to remove?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's indeed a redundant argument that I passed while experimenting with an input, will delete it

resolvers.js Outdated
allUrls.push(basePath + `${config.img_upload.categories.raw_optical_image.path}delete/${Image.optical_image}`)
});
await allUrls.forEach(url => {
fetch(url, {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I suggested in one of prior reviews to check the return value of fetch as it can be a sign of unsuccessful query as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that calling catch on a promise returned from fetch is enough for that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what I'm missing here is checking for a response.ok as per MDN fetch promise is rejected when there's network problem.

An accurate check for a successful fetch() would include checking that the promise resolved, then checking that the Response.ok property has a value of true.

fetch(url).then((response) => {
  if (response.ok) {
    //do smth
  } else {
    throw new Error('Something went wrong');
  }
})
.catch((error) => {
  console.log(error)
});

Would it be correct? @intsco @iprotsyuk

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, I think so

resolvers.js Outdated
});
return {
'levels': outfdrLvls,
'counts': outFdrCounts
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

outfdrLvls & outFdrCounts can be undefined if the dataset status is not FINISHED

resolvers.js Outdated
let inpAllLvlCounts = ds._source.annotation_counts[0].counts;
let outfdrLvls = [], outFdrCounts = [];
inpFdrLvls.forEach(lvl => {
for (let lvlCount of inpAllLvlCounts) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use inpAllLvlCounts.find() instead of iteration through the whole array. It'd do the same thing, but read better

resolvers.js Outdated
@@ -77,6 +77,15 @@ function baseDatasetQuery() {
});
}

function checkFetchRes(resp) {
if (resp.status >= 200 && resp.status < 300) {
Copy link
Contributor

@iprotsyuk iprotsyuk Mar 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd check resp.ok flag here, as it has the same meaning as per MDN

Copy link
Contributor Author

@Ren22 Ren22 Mar 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iprotsyuk that's the interesting case, what I found with fetch is that resp will be ok even if 404 status is raised. e.g. when one sends a full url , so this was a way to catch it that I came up with
More:
JakeChampion/fetch#155

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually I checked it once again and seems that resp.ok works fine too. Don't know what was the thing before

resolvers.js Outdated
if(ds._source.annotation_counts && ds._source.ds_status === 'FINISHED') {
let inpAllLvlCounts = ds._source.annotation_counts[0].counts;
inpFdrLvls.forEach(lvl => {
inpAllLvlCounts.find(lvlCount => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit confusing usage of find(). The callback function should just check that lvlCount.level === lvl. Then, find() would return the corresponding count, and you extract necessary data from it. Keep in mind that find() may return undefined.

resolvers.js Outdated
await allUrls.push(basePath +
`${config.img_upload.categories.raw_optical_image.path}delete/${rawOpticalImage[0].optical_image}`);
for (let image of opticalImageIds) {
await allUrls.push(basePath + `${config.img_upload.categories.optical_image.path}delete/${image.id}`)}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems push() is synchronous, so I'd remove await from this line and the one above. Also please put the closing brace to a new line :)

resolvers.js Outdated
for (let url of allUrls) {
let res = await fetch(url, {
method: 'delete'});
await checkFetchRes(res);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checkFetchRes is also synchronous, so await doesn't do anything

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true, thanks for pointing on that @iprotsyuk

@iprotsyuk
Copy link
Contributor

A general comment: it seems that optical image removal doesn't take place upon dataset deletion. If this is the case indeed, It should be addressed in another PR to sm-engine.

@Ren22
Copy link
Contributor Author

Ren22 commented Mar 2, 2018

@iprotsyuk it is properly deleted by daleteDatasetQuery from sm-web app without messing things up in resolver on GQL, I will commit it soon :) (but I think I already added it in the previous commit to sm-web-app)

@iprotsyuk
Copy link
Contributor

👍

intsco added a commit that referenced this pull request Mar 19, 2018
@intsco
Copy link
Member

intsco commented Mar 19, 2018

Merged into master as a single commit (cb97991). The PR has too many merge commits

@intsco intsco closed this Mar 19, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants