fms_find_unique #938
// Finds the number of unique strings in an array
int fms_find_unique(char** arr, int *n)
{
  int i;
  int nfind;

  nfind=1;
  //printf("n is %i", *n);
  for(i=1; i<*n; i++){
    //printf("Comparing %s and %s \n",arr[i], arr[i-1]);
    if (strcmp(arr[i], arr[i-1]) != 0){ nfind = nfind + 1;}
  }

  //printf("nfind=%i",nfind);
  return nfind;
}
Unless I'm misreading the logic, this only works for a sorted array. To do this without a pre-sort, you'd need a nested loop structure. You can test this by moving the fms_find_unique call in your unit test to line 59, before the sort. If you want to stick with a single loop, you'd need to add an fms_sort_this call before the search loop.
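A minimal sketch of the nested-loop alternative mentioned above, for input that is not sorted. The name count_unique_unsorted is hypothetical and not part of FMS; it takes the length by value for brevity.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not part of FMS): counts the distinct strings in an
 * array that does not need to be sorted. Uses O(n^2) comparisons. */
int count_unique_unsorted(char **arr, int n)
{
    int count = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        int seen_before = 0;

        /* Look for an earlier occurrence of arr[i]. */
        for (int j = 0; j < i; j++) {
            if (strcmp(arr[i], arr[j]) == 0) {
                seen_before = 1;
                break;
            }
        }
        if (!seen_before) {
            count++;
        }
    }
    return count;
}
```

The quadratic cost is the trade-off for dropping the sorted-input requirement, which is why sorting first is usually preferable for large arrays.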
This was only intended to work with a sorted array:
FMS/string_utils/fms_string_utils.F90
Lines 71 to 81 in 3db45b1
!> @brief c function that finds the number of unique strings in a SORTED array of c pointers
!! @return number of unique strings
function fms_find_unique(my_pointer, p_size) bind(c)&
  result(ntimes)
  use iso_c_binding
  type(c_ptr), intent(in) :: my_pointer(*) !< Array of sorted c pointer
  integer(kind=c_int), intent(in) :: p_size !< Size of the array
  integer(kind=c_int) :: ntimes
end function fms_find_unique
But the documentation should be more clear.
You could add a call to the fms_sort_this function to ensure a sort is done prior to the comparison search. That would remove the requirement to pass in only sorted arrays.
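A rough sketch of the sort-then-count idea. Since the signature of fms_sort_this is only partly visible in this thread, the sketch uses qsort from stdlib.h directly and sorts a copy of the pointer array; count_unique_sort_first and compare_strings are hypothetical names, not FMS functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for an array of char pointers. */
static int compare_strings(const void *a, const void *b)
{
    return strcmp(*(char *const *)a, *(char *const *)b);
}

/* Hypothetical helper (not part of FMS): sorts a copy of the pointer array,
 * then counts adjacent differences. Returns -1 if allocation fails. */
int count_unique_sort_first(char **arr, int n)
{
    assert(n >= 0);
    if (n == 0) {
        return 0;
    }

    char **sorted = malloc((size_t)n * sizeof *sorted);
    if (sorted == NULL) {
        return -1;
    }
    memcpy(sorted, arr, (size_t)n * sizeof *sorted);
    qsort(sorted, (size_t)n, sizeof *sorted, compare_strings);

    int count = 1;
    for (int i = 1; i < n; i++) {
        if (strcmp(sorted[i], sorted[i - 1]) != 0) {
            count++;
        }
    }
    free(sorted);
    return count;
}
```

Sorting a copy leaves the caller's array order untouched; sorting in place would save the allocation if the caller does not care about the original order.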
I added the call
Is there a quick way to test if it's sorted? Is the sorting of a sorted array fast enough that it doesn't matter that you sort it again?
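One possible answer to the first question, as a sketch: a single pass that compares neighbours detects sorted input in O(N) string comparisons. The name is_sorted is hypothetical and not an existing FMS function.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not part of FMS): returns 1 if the strings are in
 * non-decreasing strcmp order, 0 otherwise. Empty and single-element
 * arrays count as sorted. */
int is_sorted(char **arr, int n)
{
    assert(n >= 0);
    for (int i = 1; i < n; i++) {
        if (strcmp(arr[i - 1], arr[i]) > 0) {
            return 0;
        }
    }
    return 1;
}
```

A caller could skip the sort whenever this returns 1, at the cost of one extra pass over unsorted input.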
  //printf("n is %i", *n);
  for(i=1; i<*n; i++){
    //printf("Comparing %s and %s \n",arr[i], arr[i-1]);
    if (strcmp(arr[i], arr[i-1]) != 0){ nfind = nfind + 1;}
  }

  //printf("nfind=%i",nfind);
The unused printfs should be cleaned up.
Fixed
// Finds the number of unique strings in an array
int fms_find_unique(char** arr, int *n)
These C function variables should be documented: https://www.doxygen.nl/manual/commands.html#cmdparam
You also need a \return for the return value.
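As an illustration of what this request might look like, here is a sketch using the Doxygen \brief, \param and \return commands. The function name find_unique_sorted and the early return for an empty array are assumptions of this sketch, not the code in the PR.

```c
#include <assert.h>
#include <string.h>

/**
 * \brief Counts the unique strings in an array that is already sorted.
 *
 * \param[in] arr Array of pointers to NUL-terminated strings, in sorted order
 * \param[in] n   Pointer to the number of elements in arr
 * \return Number of unique strings in arr (0 if the array is empty)
 */
int find_unique_sorted(char **arr, const int *n)
{
    assert(arr != NULL && n != NULL);
    if (*n <= 0) {
        return 0; /* assumption of this sketch: empty input has no strings */
    }

    int nfind = 1;
    for (int i = 1; i < *n; i++) {
        if (strcmp(arr[i], arr[i - 1]) != 0) {
            nfind++;
        }
    }
    return nfind;
}
```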
Fixed, I think
  int nfind; // Number of unique strings in an array
  int * ids = calloc(*n, sizeof(int)); // Array of integers initialized to 0

  fms_sort_this(arr, n, ids);
@uramirez8707
The fms_sort_this function uses the qsort function from stdlib.h. I looked at the code and at comments on the web, and it seems to use the known, published tricks to avoid the worst (O(N^2)) cases, but I am not 100% sure. To be closer to sure, one can check whether the array is already sorted or reverse sorted. This is fast (O(N)) in comparison to qsort. One can also test and measure the performance as a function of input size, with test data in sorted order, reverse-sorted order, and random order.
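A sketch of how such a measurement could be set up, calling qsort directly rather than fms_sort_this. All names here (input_order, make_strings, time_sort, free_strings) are hypothetical helpers invented for the illustration, and the zero-padded numeric strings are just convenient test data.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

enum input_order { ORDER_SORTED, ORDER_REVERSED, ORDER_RANDOM };

static int compare_strings(const void *a, const void *b)
{
    return strcmp(*(char *const *)a, *(char *const *)b);
}

/* Releases an array created by make_strings. */
void free_strings(char **arr, int n)
{
    for (int i = 0; i < n; i++) {
        free(arr[i]);
    }
    free(arr);
}

/* Builds n zero-padded strings ("00000000", "00000001", ...) in the
 * requested order. Returns NULL if allocation fails. */
char **make_strings(int n, enum input_order order)
{
    assert(n > 0);
    char **arr = malloc((size_t)n * sizeof *arr);
    if (arr == NULL) {
        return NULL;
    }
    for (int i = 0; i < n; i++) {
        int value = (order == ORDER_REVERSED) ? n - 1 - i : i;
        arr[i] = malloc(16);
        if (arr[i] == NULL) {
            free_strings(arr, i);
            return NULL;
        }
        snprintf(arr[i], 16, "%08d", value);
    }
    if (order == ORDER_RANDOM) {
        /* Fisher-Yates shuffle of the sorted data. */
        for (int i = n - 1; i > 0; i--) {
            int j = rand() % (i + 1);
            char *tmp = arr[i];
            arr[i] = arr[j];
            arr[j] = tmp;
        }
    }
    return arr;
}

/* Sorts arr in place and returns the CPU time used, in seconds. */
double time_sort(char **arr, int n)
{
    clock_t start = clock();
    qsort(arr, (size_t)n, sizeof *arr, compare_strings);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_sort on arrays from make_strings for each of the three orders, over a range of sizes, would show whether the timings grow roughly like N log N in every case on a given platform.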
Description
Adds a function to string_utils_mod to determine the number of unique strings in a sorted array and adds a test for it. This will be used in the diag_manager update to determine the number of unique diag_fields in a diag_yaml.
How Has This Been Tested?
CI, including the added unit test.
Checklist:
make distcheck passes