Should newlines be allowed on metadata upload #3149

Closed
wasade opened this issue Oct 1, 2021 · 4 comments

Comments

@wasade
Contributor

wasade commented Oct 1, 2021

I recently pushed metadata to Qiita that, unbeknownst to me, contained newlines in some cells. I discovered this when I downloaded the pushed metadata afterward and saw what appeared to be incorrect sample names in the resulting tab-delimited file, caused by the embedded newlines.

Should Qiita detect and error, or detect and replace, newline characters prior to accepting metadata updates?
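
To make the two options concrete, here is a minimal sketch of "detect and error" versus "detect and replace". It is only an illustration: `clean_cells` and its `on_newline` parameter are hypothetical names, not part of Qiita, and it works on plain lists of strings rather than on Qiita's metadata templates.

```python
import re

# Matches any run of carriage returns and/or line feeds inside a cell.
_NEWLINES = re.compile(r"[\r\n]+")


def clean_cells(rows, on_newline="replace"):
    """Return a copy of rows with embedded newlines handled.

    Hypothetical helper for illustration only.
    on_newline="replace": each newline run becomes a single space.
    on_newline="error": raise ValueError at the first offending cell.
    """
    if on_newline not in ("replace", "error"):
        raise ValueError(f"unknown mode: {on_newline!r}")

    cleaned = []
    for row_idx, row in enumerate(rows):
        new_row = []
        for col_idx, cell in enumerate(row):
            if _NEWLINES.search(cell):
                if on_newline == "error":
                    raise ValueError(
                        f"newline found in row {row_idx}, column {col_idx}"
                    )
                cell = _NEWLINES.sub(" ", cell).strip()
            new_row.append(cell)
        cleaned.append(new_row)
    return cleaned


example = [["10317.000001", "felt nauseous\r\n2 times"]]
print(clean_cells(example))
```

Replacing is friendlier to the uploader but silently changes the submitted values; erroring keeps the data untouched and pushes the fix back to the source.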

@antgonza
Member

antgonza commented Oct 4, 2021

Thank you for reporting.

A couple of questions:

  1. Do you have a small example of the erroneous data? Basically, I'm not sure whether the newlines were in the sample names, in the metadata columns, or both.
  2. Were you using the GUI or an endpoint?

@wasade
Contributor Author

wasade commented Oct 5, 2021

  1. qiita-rc under 10317 should have invalid sample names in the metadata download (unless it's been wiped since last week). An example of what was observed in the download from qiita-rc is below. Newlines were in variable values.
$ grep -v "^10317" 10317_20210930-123208.txt | cut -c 1-20 | head
sample_name	acid_ref
fresh water fish: Se
fresh water fish: Se
fresh water fish: Se
fresh water fish: Se
fresh water fish: Se
Note: CAT died very 
have had diarrhea 2-
felt nauseous 2 time
Chickens
  2. REST endpoint
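
For anyone wanting to reproduce the effect without access to qiita-rc, the sketch below shows how a newline inside a value turns into an apparent extra sample name once the table is written as unquoted tab-delimited text. The sample IDs, the `notes` column, and the values are made up for illustration, and the naive serialization is an assumption, not Qiita's actual download code.

```python
# Hypothetical two-sample metadata table; the second value contains a newline.
rows = [
    ("sample_name", "notes"),
    ("10317.000001", "healthy"),
    ("10317.000002", "felt nauseous 2 times\nhave had diarrhea"),
]

# Naive tab-delimited serialization with no quoting or escaping (assumed).
tsv_text = "\n".join("\t".join(row) for row in rows) + "\n"

# A line-oriented reader, like grep or cut, treats each physical line as a record.
first_columns = [line.split("\t")[0] for line in tsv_text.splitlines()]

# Keep first fields that are neither the header nor a 10317-prefixed ID.
bogus_names = [
    name
    for name in first_columns
    if name != "sample_name" and not name.startswith("10317")
]
print(bogus_names)
```

The tail of the multi-line value ends up in the first column of its own line, which matches the pattern in the grep output above.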

@antgonza
Member

antgonza commented Oct 6, 2021

I can confirm this issue and figured out why it is happening: the REST endpoint is not calling load_template_to_dataframe, which validates and cleans up the metadata to create the dataframe.

We basically would need to replace this line here:

data = qiita_db.metadata_template.util.load_template_to_dataframe(self.request.body)

@wasade
Contributor Author

wasade commented Oct 6, 2021

Okay cool. Let's see if #3152 does it
