Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate patient id is a relative reference #117

Merged
merged 4 commits into from
Jul 7, 2021

Conversation

wi-y
Copy link
Contributor

@wi-y wi-y commented Jun 24, 2021

Validates that the relative reference patient identifier does not contain any URI restricted characters.

If the patient identifier contains restricted URI characters, then we log a NotSupportedException. We want to to this validation up front because the subsequent calls to FHIR will fail and we want to make it clear that the failure occurred due to certain characters contained in the identifier.

@wi-y wi-y requested review from namalu, dustinburson and a team June 24, 2021 16:27
var patientId = device.Patient?.GetId<Model.Patient>() ?? throw new FhirResourceNotFoundException(ResourceType.Patient);

// only allow unreserved URI characters
if (!Regex.IsMatch(patientId, @"^[A-Za-z0-9_.\-~]+$"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than throwing an error, I think a better approach might be to make our GetId logic more robust.

We might want to consider updating the code in Microsoft.Health.Extensions.Fhir.ModelExtensions class, mainly because we currently support only one reference format (e.g. Patient/{FHIR_RESOURCE_ID}). It's possible that a reference could be a full URL (e.g. https://fhirserver.com/Patient/{FHIR_RESOURCE_ID}) or contain a '/' prior to the Resource type name.

We could potently use regex to capture the Id and ignore the rest of the string. We could still throw an error or just return null like we currently do, if no valid Id could be extracted from the string. However, if we do throw an error, we should be very explicit with what the customer can do to fix the issue.

// Should contain 2 groups, the second group should just be the id.
var match = Regex.Match(reference, typeof(TResource).Name + @"\/([A-Za-z0-9\-\.]{1,64})");

Copy link
Member

@dustinburson dustinburson Jun 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point I think we want to just enforce relative references for purposes in this PR, https://www.hl7.org/fhir/references.html#2.3.0.

Which are in the pattern ResourceName/identifier.

We currently make assumptions that the references can be resolved on the current FHIR service. Literal, logical (which is what is causing the error), or absolute open us up to other issues that aren't being properly accounted for.

Agreed this should be moved to the GetId method since it applies equally to any id we are trying to use today. As we support more reference types we can add support there.

In general I think it would be valuable to build checks off the URI (Device.Patient.Url or Reference.Url ) so we can validate it is relative and doesn't contain parameters. Hopefully there is exiting code for URIs that you can leverage to make these distinctions. We would be looking for a valid relative URI with no parameters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the latest commit I've moved the code to GetId and made it a bit more robust. It now locates a matching ResourceName/identifier if it happens to exist anywhere as long as it does not contain any reserved URI characters. We return null if we do not find a match.

Unfortunately it seems like regex is the best option. The Uri classes do have some useful methods for parsing the Uri but they unfortunately only work for absolute Uris. The Uri methods throw exceptions if the Uri is relative and are unusable. There is a workaround to just turn the Uri into an absolute Uri using some fake host name like http://foo/ResouceType/id but but that seemed ugly.

var patientId = device.Patient?.GetId<Model.Patient>() ?? throw new FhirResourceNotFoundException(ResourceType.Patient);

// only allow unreserved URI characters
if (!Regex.IsMatch(patientId, @"^[A-Za-z0-9_.\-~]+$"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there might be options to validate the reference in the FHIR library that I am double checking but if we keep this as is it would be good to have this as private static compiled Regex to avoid the extra perf hit to compile the expression per execution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a compiled regex caching mechanism for GetId. I did try and look through the firely .net library but didn't find anything that we could easily re-use for our use case

var patientId = device.Patient?.GetId<Model.Patient>() ?? throw new FhirResourceNotFoundException(ResourceType.Patient);

// only allow unreserved URI characters
if (!Regex.IsMatch(patientId, @"^[A-Za-z0-9_.\-~]+$"))
Copy link
Member

@dustinburson dustinburson Jun 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point I think we want to just enforce relative references for purposes in this PR, https://www.hl7.org/fhir/references.html#2.3.0.

Which are in the pattern ResourceName/identifier.

We currently make assumptions that the references can be resolved on the current FHIR service. Literal, logical (which is what is causing the error), or absolute open us up to other issues that aren't being properly accounted for.

Agreed this should be moved to the GetId method since it applies equally to any id we are trying to use today. As we support more reference types we can add support there.

In general I think it would be valuable to build checks off the URI (Device.Patient.Url or Reference.Url ) so we can validate it is relative and doesn't contain parameters. Hopefully there is exiting code for URIs that you can leverage to make these distinctions. We would be looking for a valid relative URI with no parameters.


namespace Microsoft.Health.Extensions.Fhir
{
public static class ModelExtensions
{
private static ConcurrentDictionary<string, Regex> _idMatcherRegexCache = new ConcurrentDictionary<string, Regex>();
Copy link
Member

@dustinburson dustinburson Jun 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than define these dynamically and a regex per resource type can we do some simple string manipulation as a preprocessing. I.E. do a .StartsWith("ResourceType") and if true do a substring on the remaining reference . This will allow us to just have one compiled Regex statically and improve processing throughput and memory utilization.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

@@ -39,9 +42,28 @@ public static Hl7.Fhir.Model.ResourceReference ToReference<TResource>(this strin
/// <returns>The id for the specified resource type if it exists, null otherwise.</returns>
public static string GetId<TResource>(this Hl7.Fhir.Model.ResourceReference reference)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have some tests for just the GetId extension method? Would be good to verify for the reference types we support today (Device & Patient)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added ModelExtensionsTests

@wi-y wi-y merged commit adc433f into master Jul 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants