Duplicate values in the database #1079

Closed
10 tasks done
mattdon opened this issue Oct 19, 2019 · 2 comments · Fixed by #1366
Labels
defect priority_normal RAVENv2.0 Defects and Features in release of RAVEN v2.0 RAVENv2.1 All tasks and defects that will go in RAVEN v2.1

Comments

@mattdon

mattdon commented Oct 19, 2019


Defect Description

Describe the defect

What did you expect to see happen?

Being able to sync time series that contain duplicated values.

What did you see instead?

Hi guys, I encountered a problem using RAVEN with a Database that contains duplicate values of the variable "Time".
I can load the database into the RAVEN framework, but I cannot use some post-processors, such as HistorySetSync or the External PP.
RAVEN returns the following error:
ValueError: cannot reindex or align along dimension 'Time' because the index has duplicate values.

In my specific case, I fixed the problem by adding the pandas DataFrame.drop_duplicates() method to the MELCOR interface. However, I think it would be useful to add a post-processor to RAVEN that removes duplicates from Databases.
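
For illustration only, here is a minimal sketch of the kind of workaround described above. It is not the actual MELCOR interface code: the DataFrame, the column name `pressure`, and the choice to keep the first occurrence are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical time series in which the Time value 0.2 appears twice.
history = pd.DataFrame({
    "Time": [0.0, 0.1, 0.2, 0.2, 0.3],
    "pressure": [1.0, 1.1, 1.2, 1.25, 1.3],
})

# Keep the first row for each Time value and discard later repeats.
# Note that this deletes data: the row with pressure 1.25 is dropped.
deduplicated = (
    history.drop_duplicates(subset="Time", keep="first")
    .reset_index(drop=True)
)
```

Passing `keep="last"` instead would retain the final row recorded for each repeated time; which one is appropriate depends on how the code producing the data writes its output.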

Thanks.

Matteo

Do you have a suggested fix for the development team?

N/A

Describe how to Reproduce
Steps to reproduce the behavior:

  1. Duplicated values in a time series
  2. Apply Sync PP
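
The two steps above can be approximated outside RAVEN with xarray alone. This is only a sketch under assumptions: it does not run HistorySetSync, the data values are made up, and `try_sync` is a hypothetical helper that simply reindexes along "Time", which is the operation the reported error message refers to.

```python
import xarray as xr

# Step 1: a time series whose "Time" coordinate contains 0.2 twice.
series = xr.DataArray(
    [1.0, 1.1, 1.2, 1.25, 1.3],
    dims=("Time",),
    coords={"Time": [0.0, 0.1, 0.2, 0.2, 0.3]},
)

def try_sync(data, new_times):
    """Reindex along Time; return the error text if xarray refuses."""
    try:
        return data.reindex(Time=new_times)
    except ValueError as err:
        return str(err)

# Step 2: reindexing onto a new time grid fails because of the duplicate.
result = try_sync(series, [0.0, 0.1, 0.2, 0.3, 0.4])
```

The exact wording of the message may differ between xarray versions.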

Screenshots and Input Files
Please attach the input file(s) that generate this error. The simpler the input, the faster we can find the issue.

Platform (please complete the following information):

  • OS: Linux
  • Version: N/A
  • Dependencies Installation: CONDA

For Change Control Board: Issue Review

This review should occur before any development is performed as a response to this issue.

  • 1. Is it tagged with a type: defect or task?
  • 2. Is it tagged with a priority: critical, normal or minor?
  • 3. If it will impact requirements or requirements tests, is it tagged with requirements?
  • 4. If it is a defect, can it cause wrong results for users? If so an email needs to be sent to the users.
  • 5. Is a rationale provided? (Such as explaining why the improvement is needed or why current code is wrong.)

For Change Control Board: Issue Closure

This review should occur when the issue is imminently going to be closed.

  • 1. If the issue is a defect, is the defect fixed?
  • 2. If the issue is a defect, is the defect tested for in the regression test system? (If not explain why not.)
  • 3. If the issue can impact users, has an email to the users group been written (the email should specify if the defect impacts stable or master)?
  • 4. If the issue is a defect, does it impact the latest release branch? If yes, is there any issue tagged with release (create if needed)?
  • 5. If the issue is being closed without a pull request, has an explanation of why it is being closed been provided?
@alfoa
Collaborator

alfoa commented Oct 20, 2019

@mattdon can you attach an input file (maybe using an ExternalModel) that replicates the problem for us? Or a small database with duplicated values and an input file that loads it and shows the error?
Thanks

@alfoa alfoa added defect RAVENv2.0 Defects and Features in release of RAVEN v2.0 priority_normal labels Oct 20, 2019
@alfoa alfoa mentioned this issue Nov 3, 2020
9 tasks
@wangcj05 wangcj05 mentioned this issue Nov 12, 2020
9 tasks
@joshua-cogliati-inl
Contributor

joshua-cogliati-inl commented Dec 2, 2020

FYI: I am not sure we want to use this code since it deletes data. To remove duplicate coordinate values in an XArray DataArray, the following code works (for the variable toClear):

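import numpy as np
import xarray as xr

# Example array whose "x" coordinate contains the value 2.0 twice
toClear = xr.DataArray([[1,2,3],[4,5,6]], dims=("y","x"), coords={"x": [2.0, 3.0, 2.0], "y": [1.0, 2.0]})
# For each dimension, keep only the first entry of every unique coordinate value
# (np.unique also returns the coordinate values in sorted order)
for dim in toClear.coords.dims:
    toClear = toClear.isel({dim:np.unique(toClear[dim], return_index=True)[1]})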
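
Since the concern above is that this deletes data, a possible variant is sketched below. It keeps the original coordinate order and also reports how many entries were removed, so that a caller could warn the user. The function name `drop_duplicate_coords` is hypothetical and not part of RAVEN or xarray; the sketch relies only on `DataArray.get_index`, `pandas.Index.duplicated`, and `DataArray.isel`.

```python
import numpy as np
import xarray as xr

def drop_duplicate_coords(data, dim):
    """Keep the first entry for each coordinate value along dim.

    Returns the reduced array and the number of entries dropped.
    The original ordering of the coordinate is preserved.
    """
    # Boolean mask that is True for every repeat after the first occurrence.
    duplicated = data.get_index(dim).duplicated(keep="first")
    keep = np.flatnonzero(~duplicated)
    return data.isel({dim: keep}), int(duplicated.sum())

to_clear = xr.DataArray(
    [[1, 2, 3], [4, 5, 6]],
    dims=("y", "x"),
    coords={"x": [2.0, 3.0, 2.0], "y": [1.0, 2.0]},
)
cleaned, dropped = drop_duplicate_coords(to_clear, "x")
```

Here `dropped` is 1, because the third column repeats the coordinate value 2.0. Whether silently dropping it is acceptable remains the open question raised in this comment.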

@alfoa alfoa added the RAVENv2.1 All tasks and defects that will go in RAVEN v2.1 label Dec 8, 2020