Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: /api/v1/strucvars/csq endpoint with OpenAPI (#607) #612

Conversation

holtgrewe
Copy link
Contributor

@holtgrewe holtgrewe commented Nov 11, 2024

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced a new API endpoint /api/v1/strucvars/csq for querying structural variant consequences.
    • Added new query parameters for the endpoint, enhancing user input options.
    • Enhanced OpenAPI documentation to include new schemas and endpoint details.
  • Improvements

    • Updated existing endpoints for gene transcripts, including a new handler function.
    • Improved type specificity in query and response structures for better clarity.
  • Deprecations

    • The previous /genes/txs endpoint is now deprecated.

@holtgrewe holtgrewe linked an issue Nov 11, 2024 that may be closed by this pull request
Copy link
Contributor

coderabbitai bot commented Nov 11, 2024

Walkthrough

The pull request introduces a new API endpoint /api/v1/strucvars/csq to the OpenAPI schema, allowing users to query the consequences of structural variants. This includes several required and optional query parameters. The response structure is defined for successful and error responses, with new schemas added for better data representation. Additionally, significant renaming and restructuring occur in the codebase, particularly in enums and structs related to transcript effects and variant types. The server is updated to handle the new endpoint, enhancing the API's overall functionality.

Changes

File Path Change Summary
openapi.schema.yaml Added new endpoint /api/v1/strucvars/csq with query parameters and response schemas.
src/annotate/strucvars/csq.rs Renamed enums and structs for consistency; updated methods to return new types.
src/common/mod.rs Added utoipa::ToSchema derive macro to GenomeRelease enum.
src/server/run/actix_server/gene_txs.rs Renamed endpoint from /seqvars/csq to /api/v1/genes/transcripts; added new handler function.
src/server/run/actix_server/mod.rs Added service for the new handle_with_openapi method for the new endpoint.
src/server/run/actix_server/strucvars_csq.rs Introduced StrucvarsCsqQuery and StrucvarsCsqResponse; updated endpoint handling logic.
src/server/run/mod.rs Updated OpenAPI documentation with new imports and paths for structural variant consequences.

Possibly related PRs

Poem

🐇 In fields of code where rabbits play,
New paths emerge, bright as day.
A variant's tale, now we can see,
With structures and schemas, oh so free!
Let's hop along, with joy we sing,
For every change, a new spring! 🌼


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

…alternative-with-openapi-schema-for-strucvarscsq
@holtgrewe
Copy link
Contributor Author

CC @tedil here you can see the issues I mentionsed as= not working and recursive schema extraction not working.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🧹 Outside diff range and nitpick comments (16)
src/server/run/actix_server/mod.rs (1)

Line range hint 1-77: Consider architectural improvements for parameter handling and schema extraction.

Given the reported issues with the as= parameter and recursive schema extraction:

  1. Consider implementing a custom middleware for parameter handling to ensure consistent parameter processing across all endpoints.
  2. For recursive schema extraction, consider implementing a caching mechanism to improve performance, especially if the schemas are relatively static.

Example middleware pattern for Actix:

pub struct ParamTransformMiddleware;

impl<S> Transform<S, ServiceRequest> for ParamTransformMiddleware {
    // Implementation details...
}

Would you like me to provide a detailed implementation for either of these suggestions?

src/server/run/mod.rs (1)

Line range hint 108-124: Remove duplicate hint for structural variants endpoint.

There are two identical hints for the /strucvars/csq endpoint. The second one appears to be a copy-paste with a typo in the comment (structvars vs strucvars).

Apply this diff to remove the duplicate:

    // The endpoint `/strucvars/csq` computes the consequence of an SV.
    tracing::info!(
        "  try: http://{}:{}/strucvars/csq?genome_release=grch37\
        &chromosome=17&start=48275360&&stop=48275370&sv_type=DEL",
        args.listen_host.as_str(),
        args.listen_port
    );
-    // The endpoint `/structvars/csq` computes the consequence of an SV.
-    tracing::info!(
-        "  try: http://{}:{}/strucvars/csq?genome_release=grch37\
-        &chromosome=17&start=48275360&&stop=48275370&sv_type=DEL",
-        args.listen_host.as_str(),
-        args.listen_port
-    );
openapi.schema.yaml (1)

194-213: Enhance enum documentation.

The StrucvarsSvType and StrucvarsTranscriptEffect enums would benefit from descriptions for each possible value to improve API documentation.

     StrucvarsSvType:
       type: string
       description: Structural Variant type.
       enum:
       - DEL
       - DUP
       - INS
       - INV
       - BND
+      enumDescriptions:
+        DEL: Deletion
+        DUP: Duplication
+        INS: Insertion
+        INV: Inversion
+        BND: Breakend
     StrucvarsTranscriptEffect:
       type: string
       description: Enumeration for effect on transcript.
       enum:
       - transcript_variant
       - exon_variant
       - splice_region_variant
       - intron_variant
       - upstream_variant
       - downstream_variant
       - intergenic_variant
+      enumDescriptions:
+        transcript_variant: Affects the transcript
+        exon_variant: Affects an exon
+        splice_region_variant: Affects a splice region
+        intron_variant: Affects an intron
+        upstream_variant: Affects upstream region
+        downstream_variant: Affects downstream region
+        intergenic_variant: Affects intergenic region
src/common/mod.rs (1)

49-49: LGTM! Consider adding schema documentation.

The addition of utoipa::ToSchema is appropriate for OpenAPI schema generation. However, consider enhancing the API documentation by adding a description to the schema.

Add documentation comments that will be included in the OpenAPI schema:

 /// Select the genome release to use.
+#[doc = "Genome release version used for variant annotation."]
 #[derive(
     clap::ValueEnum,
     serde::Serialize,
     serde::Deserialize,
     Clone,
     Copy,
     Debug,
     PartialEq,
     Eq,
     Hash,
     Default,
     utoipa::ToSchema,
 )]
 #[serde(rename_all = "snake_case")]
 pub enum GenomeRelease {
+    #[doc = "GRCh37 (hg19) genome release"]
     #[default]
     Grch37,
+    #[doc = "GRCh38 (hg38) genome release"]
     Grch38,
 }
src/server/run/actix_server/gene_txs.rs (4)

1-5: Enhance deprecation notice with migration guidance.

While the documentation clearly indicates that /genes/txs is deprecated, it would be helpful to add migration guidance for users, including the timeline for removal and steps to migrate to the new endpoint.


Line range hint 78-342: Consider enhancing input validation and error messages.

A few suggestions for improvement:

  1. The page size validation could be extracted into a separate function for better reusability.
  2. Error messages could be more descriptive, especially for validation failures.

Consider this refactor:

+ fn validate_page_size(size: Option<i32>) -> i32 {
+     size.unwrap_or(PAGE_SIZE_DEFAULT)
+         .min(PAGE_SIZE_MAX)
+         .max(1)
+ }

  fn genes_tx_impl(
      data: Data<super::WebServerData>,
      query: GeneTranscriptsQuery,
  ) -> Result<GeneTranscriptsResponse, CustomError> {
      let GeneTranscriptsQuery {
          genome_build,
          hgnc_id,
          page_size,
          next_page_token,
      } = query;
-     let page_size = page_size
-         .unwrap_or(PAGE_SIZE_DEFAULT)
-         .min(PAGE_SIZE_MAX)
-         .max(1);
+     let page_size = validate_page_size(page_size);

Line range hint 342-359: Consider adding rate limiting and caching.

The endpoint implementation is solid, but consider implementing:

  1. Rate limiting to prevent abuse
  2. Response caching to improve performance, especially since gene transcript data doesn't change frequently

Line range hint 78-341: Consider using custom error types instead of anyhow::Error.

The current implementation uses anyhow::Error for error handling in type conversions. Consider creating custom error types for better error handling and API documentation:

#[derive(Debug, thiserror::Error)]
pub enum ConversionError {
    #[error("Invalid biotype: {0:?}")]
    InvalidBiotype(i32),
    #[error("Invalid transcript tag: {0:?}")]
    InvalidTranscriptTag(i32),
    // Add other specific error variants
}

This would provide:

  • More specific error handling
  • Better API documentation
  • Improved error messages for clients
src/server/run/actix_server/strucvars_csq.rs (3)

206-206: Avoid unnecessary cloning of query

In line 206, query.clone().into_inner() is used, which clones the entire query object before consuming it. Since compute_tx_effects only needs a reference, you can use query.get_ref() to obtain a reference to the inner StrucvarsCsqQuery, avoiding the clone and improving performance.

Apply this diff to eliminate the unnecessary clone:

-        let result = predictor.compute_tx_effects(&query.clone().into_inner());
+        let result = predictor.compute_tx_effects(query.get_ref());

Line range hint 24-68: Refactor to reduce code duplication between Query and StrucvarsCsqQuery

Both Query and StrucvarsCsqQuery structs, along with their implementations of interface::StrucVar, contain similar fields and methods. This duplication can make maintenance harder and increase the chance of inconsistencies. Consider refactoring by introducing a common struct or trait that encapsulates the shared functionality.

One possible approach is to define a common struct or trait:

#[derive(serde::Serialize, serde::Deserialize, Debug, Clone)]
struct BaseQuery {
    pub genome_release: GenomeRelease,
    pub chromosome: String,
    pub start: i32,
    pub stop: Option<i32>,
}

impl interface::StrucVar for BaseQuery {
    // Implement common methods...
}

// Then have Query and StrucvarsCsqQuery embed BaseQuery
struct Query {
    #[serde(flatten)]
    pub base: BaseQuery,
    pub sv_type: String,
}

struct StrucvarsCsqQuery {
    #[serde(flatten)]
    pub base: BaseQuery,
    pub sv_type: StrucvarsSvType,
}

This refactoring reduces duplication and centralizes shared logic.

Also applies to: 117-164


210-210: Handle version retrieval errors more gracefully

In line 210, if there's a problem determining the version, the error message returned to the client includes the internal error detail. Consider returning a more user-friendly error message to avoid exposing internal details.

Apply this diff to improve error handling:

-            .map_err(|e| CustomError::new(anyhow::anyhow!("Problem determining version: {}", e)))?;
+            .map_err(|_| CustomError::new(anyhow::anyhow!("Unable to determine version information")))?;

This change avoids leaking internal error details to the client.

src/annotate/strucvars/csq.rs (5)

15-24: Consider Deriving the Hash Trait for StrucvarsTranscriptEffect

To maintain consistency and enable usage in hash-based collections like HashMap or HashSet, consider deriving the Hash trait for StrucvarsTranscriptEffect, as you have done for StrucvarsSvType.


27-28: Address the TODO Comment Regarding Schema Renaming

The TODO comment suggests renaming back to TranscriptEffect once utoipa's as= is fixed. It's advisable to track such tasks using an issue tracker to avoid leaving TODO comments in the code.

Would you like me to open a GitHub issue to track this task?


149-149: Ensure Consistent Derivation of Traits for TxRegion

The TxRegion struct contains the effect field of type StrucvarsTranscriptEffect. To ensure consistency and enable potential uses in collections or comparisons, consider deriving additional traits such as serde::Serialize, serde::Deserialize, and Hash if needed.


159-159: Consider Deriving PartialEq and Eq for StrucvarsGeneTranscriptEffects

The StrucvarsGeneTranscriptEffects struct does not currently derive PartialEq or Eq. If you plan to compare instances or store them in sets, deriving these traits would be beneficial.


482-492: Ensure All Variants of StrucvarsSvType Are Handled

In the compute_tx_effects method, verify that all possible variants of StrucvarsSvType are covered in the match expression. If new variants are added in the future, consider adding a wildcard arm to handle unexpected cases.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 9132312 and 9cab99e.

📒 Files selected for processing (7)
  • openapi.schema.yaml (3 hunks)
  • src/annotate/strucvars/csq.rs (20 hunks)
  • src/common/mod.rs (1 hunks)
  • src/server/run/actix_server/gene_txs.rs (3 hunks)
  • src/server/run/actix_server/mod.rs (1 hunks)
  • src/server/run/actix_server/strucvars_csq.rs (4 hunks)
  • src/server/run/mod.rs (1 hunks)
🔇 Additional comments (5)
src/server/run/mod.rs (2)

19-26: LGTM: Well-organized imports for structural variant support.

The new imports are properly organized and align with the PR's objective of adding the structural variants consequence endpoint.


36-36: LGTM: Comprehensive OpenAPI documentation.

The OpenAPI schema is well-structured with all necessary components for the new structural variants endpoint.

Also applies to: 42-48

openapi.schema.yaml (1)

162-179: Verify recursive schema resolution.

The PR comments mention issues with recursive schema extraction. The StrucvarsCsqResponse schema has nested references that might be causing this issue. Please verify that your OpenAPI tooling can correctly resolve these nested schemas:

  • versionVersionsInfoResponseSoftwareVersions & DataVersionEntryAssembly
  • queryStrucvarsCsqQueryAssembly & StrucvarsSvType
  • resultStrucvarsGeneTranscriptEffectsStrucvarsTranscriptEffect
✅ Verification successful

Schema references are properly structured without circular dependencies

Based on the verification results, the schema references are well-structured and follow a clear hierarchical pattern without any circular dependencies:

  • StrucvarsCsqResponseVersionsInfoResponseSoftwareVersions (simple object with version fields)
  • StrucvarsCsqResponseStrucvarsCsqQueryGenomeRelease (string enum)
  • StrucvarsCsqResponseStrucvarsGeneTranscriptEffectsStrucvarsTranscriptEffect (string enum)

All terminal references resolve to either simple objects or string enums, ensuring proper schema resolution.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify OpenAPI schema validity and resolution
# Note: Using standard tools to validate schema and check for circular references

# Test 1: Validate OpenAPI schema
npx @stoplight/spectral-cli lint openapi.schema.yaml

# Test 2: Check for circular references in schema
jq -r '.components.schemas | 
  to_entries[] | 
  select(.value.properties != null) | 
  .key as $parent | 
  .value.properties | 
  to_entries[] | 
  select(.value["$ref"] != null) | 
  [$parent, .value["$ref"]] | 
  @tsv' openapi.schema.yaml

Length of output: 428


Script:

#!/bin/bash
# Let's try a different approach using grep to trace the schema references

# First, let's look at the direct references from StrucvarsCsqResponse
echo "=== Direct references from StrucvarsCsqResponse ==="
rg -A 10 "StrucvarsCsqResponse:" openapi.schema.yaml

# Then check each referenced schema
echo -e "\n=== VersionsInfoResponse definition ==="
rg -A 10 "VersionsInfoResponse:" openapi.schema.yaml

echo -e "\n=== StrucvarsCsqQuery definition ==="
rg -A 10 "StrucvarsCsqQuery:" openapi.schema.yaml

echo -e "\n=== StrucvarsGeneTranscriptEffects definition ==="
rg -A 10 "StrucvarsGeneTranscriptEffects:" openapi.schema.yaml

# Check for potential circular references
echo -e "\n=== Checking for potential circular references ==="
rg "\$ref.*StrucvarsCsqResponse" openapi.schema.yaml

Length of output: 1982


Script:

#!/bin/bash
# Let's check the deeper nested references to complete our analysis

# Check SoftwareVersions schema
echo "=== SoftwareVersions definition ==="
rg -A 5 "SoftwareVersions:" openapi.schema.yaml

# Check GenomeRelease schema
echo -e "\n=== GenomeRelease definition ==="
rg -A 5 "GenomeRelease:" openapi.schema.yaml

# Check transcript_effects field type
echo -e "\n=== Checking transcript_effects type ==="
rg -A 10 "transcript_effects:" openapi.schema.yaml

# Check for any StrucvarsTranscriptEffect references
echo -e "\n=== StrucvarsTranscriptEffect definition ==="
rg -A 10 "StrucvarsTranscriptEffect:" openapi.schema.yaml

# Look for any other potential circular references in the schema
echo -e "\n=== Checking other potential circular references ==="
rg "\$ref.*#/components/schemas/Strucvars" openapi.schema.yaml

Length of output: 1631

src/annotate/strucvars/csq.rs (2)

76-77: Address the TODO Comment Regarding Schema Renaming

Similar to the previous TODO, this comment indicates a pending schema renaming. Consider creating an issue to track this change instead of leaving a TODO in the code.


Line range hint 213-237: Verify All Enum Variant References Are Updated

Ensure that all references to the old TranscriptEffect enum variants have been updated to StrucvarsTranscriptEffect throughout the codebase, not just in this file. This includes any pattern matches, function parameters, or variable declarations.

You can run the following script to check for any remaining references:

Also applies to: 246-255

✅ Verification successful

Let me search for more specific information about the enum variants and their usage.


Based on the search results, I can now generate the final response:

All Enum Variant References Are Correctly Updated

The codebase shows consistent usage of StrucvarsTranscriptEffect enum variants in the structural variants module (strucvars/csq.rs). The only reference to TranscriptEffect is in a comment about schema renaming. The variants UpstreamVariant, DownstreamVariant, and SpliceRegionVariant in other files (like seqvars/ann.rs and seqvars/csq.rs) belong to different, unrelated enums for sequence variant annotation.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Find any remaining references to `TranscriptEffect` enum variants.

# Test: Search for `TranscriptEffect` in Rust files. Expect: No matches.
rg '\bTranscriptEffect\b' --type rust

Length of output: 181


Script:

#!/bin/bash
# Search for both old and new enum variants to ensure complete migration
rg -A 2 'UpstreamVariant|DownstreamVariant|SpliceRegionVariant' --type rust

# Also check for any struct or enum definitions related to these effects
ast-grep --pattern 'enum $_ {
  $$$
  UpstreamVariant,
  $$$
}'

Length of output: 10036

@@ -62,6 +62,7 @@ pub async fn main(
.service(gene_txs::handle_with_openapi)
.service(seqvars_csq::handle)
.service(strucvars_csq::handle)
.service(strucvars_csq::handle_with_openapi)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Inconsistent OpenAPI handler pattern detected

The search results reveal that while gene_txs has both regular and OpenAPI handlers, seqvars_csq only has a regular handler without an OpenAPI counterpart. This inconsistency should be addressed.

  • gene_txs: Has both .handle and .handle_with_openapi
  • seqvars_csq: Only has .handle
  • strucvars_csq: Has both .handle and .handle_with_openapi
  • versions: Only has .handle
🔗 Analysis chain

LGTM! OpenAPI handler integration follows established pattern.

The addition of strucvars_csq::handle_with_openapi follows the same pattern as other endpoints (e.g., gene_txs) that have both regular and OpenAPI-enabled handlers.

Let's verify the endpoint registration pattern is consistent:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify that other endpoints follow the same pattern of having both regular and OpenAPI handlers

# Test: Check for handler registration patterns across the codebase
rg -A 1 '\.service\(' src/server/run/actix_server/mod.rs

Length of output: 425

Comment on lines +118 to +123
GenomeRelease:
type: string
description: Select the genome release to use.
enum:
- grch37
- grch38
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Remove duplicate schema definition.

The GenomeRelease schema duplicates the existing Assembly schema. Both define the same enum values. Consider using the existing Assembly schema instead.

-    GenomeRelease:
-      type: string
-      description: Select the genome release to use.
-      enum:
-      - grch37
-      - grch38

Then update references to use Assembly:

-          $ref: '#/components/schemas/GenomeRelease'
+          $ref: '#/components/schemas/Assembly'

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +12 to +64
/api/v1/strucvars/csq:
get:
tags:
- strucvars_csq
summary: Query for consequence of a variant.
operationId: strucvarsCsq
parameters:
- name: genome_release
in: query
description: The assembly.
required: true
schema:
$ref: '#/components/schemas/GenomeRelease'
- name: chromosome
in: query
description: Chromosome.
required: true
schema:
type: string
- name: start
in: query
description: 1-based start position.
required: true
schema:
type: integer
format: int32
- name: stop
in: query
description: 1-based stop position, ignored for INS.
required: false
schema:
type: integer
format: int32
nullable: true
- name: sv_type
in: query
description: The variant type to use for annotation.
required: true
schema:
$ref: '#/components/schemas/StrucvarsSvType'
responses:
'200':
description: Strucvars consequence information.
content:
application/json:
schema:
$ref: '#/components/schemas/StrucvarsCsqResponse'
'500':
description: Internal server error.
content:
application/json:
schema:
$ref: '#/components/schemas/CustomError'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add missing query parameter and enhance endpoint documentation.

The endpoint specification needs several improvements:

  1. The as= parameter mentioned in PR comments is missing from the specification.
  2. Add validation for numeric parameters:
    • start: Add minimum value (should be >= 1)
    • stop: Add minimum value (should be >= 1)
  3. Add common HTTP status codes:
    • 400: For invalid input parameters
    • 404: For when no results are found
  4. Add example requests and responses using OpenAPI's example or examples field
      - name: start
        in: query
        description: 1-based start position.
        required: true
        schema:
          type: integer
          format: int32
+         minimum: 1
+         example: 1000000
      - name: stop
        in: query
        description: 1-based stop position, ignored for INS.
        required: false
        schema:
          type: integer
          format: int32
          nullable: true
+         minimum: 1
+         example: 1000100
+     - name: as
+       in: query
+       description: Output format specification
+       required: false
+       schema:
+         type: string
+         enum: [vcf, json]
+         default: json
      responses:
        '200':
          description: Strucvars consequence information.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/StrucvarsCsqResponse'
+             example:
+               version: {...}
+               query: {...}
+               result: [...]
+       '400':
+         description: Invalid input parameters.
+         content:
+           application/json:
+             schema:
+               $ref: '#/components/schemas/CustomError'
+       '404':
+         description: No results found.
+         content:
+           application/json:
+             schema:
+               $ref: '#/components/schemas/CustomError'
        '500':
          description: Internal server error.

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +191 to +216
async fn handle_with_openapi(
data: Data<super::WebServerData>,
_path: Path<()>,
query: web::Query<StrucvarsCsqQuery>,
) -> actix_web::Result<Json<StrucvarsCsqResponse>, CustomError> {
let predictor = data
.strucvars_predictors
.get(&query.genome_release)
.ok_or_else(|| {
super::CustomError::new(anyhow::anyhow!(
"genome release not supported: {:?}",
&query.genome_release
))
})?;

let result = predictor.compute_tx_effects(&query.clone().into_inner());

let result = StrucvarsCsqResponse {
version: VersionsInfoResponse::from_web_server_data(data.into_inner().as_ref())
.map_err(|e| CustomError::new(anyhow::anyhow!("Problem determining version: {}", e)))?,
query: query.into_inner(),
result,
};

Ok(Json(result))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Add error handling for invalid query parameters

In the handle_with_openapi function, there is no validation for the query parameters beyond checking the genome release. If invalid parameters are provided (e.g., negative positions, empty chromosome strings), it could lead to unexpected behavior. Consider adding validation to ensure all query parameters are valid before processing.

You could add validation logic at the beginning of the handler:

if self.start < 1 {
    return Err(CustomError::new(anyhow::anyhow!("`start` must be a positive integer")));
}
if self.chromosome.is_empty() {
    return Err(CustomError::new(anyhow::anyhow!("`chromosome` must not be empty")));
}
// Add additional validations as needed

Comment on lines +150 to +154
if self.sv_type() == interface::StrucvarsSvType::Ins {
self.start
} else {
self.stop.unwrap_or(self.start)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Ensure stop is provided for structural variants other than insertions

In the stop() method for StrucvarsCsqQuery, if sv_type is not Ins, and stop is None, it defaults to self.start. For structural variants like deletions or duplications, this might not be accurate. Consider validating that stop is provided for the appropriate sv_type values.

Apply this diff to add validation:

 fn stop(&self) -> i32 {
     if self.sv_type() == interface::StrucvarsSvType::Ins {
         self.start
     } else {
-        self.stop.unwrap_or(self.start)
+        self.stop.unwrap_or_else(|| {
+            panic!("`stop` must be provided for sv_type {:?}", self.sv_type());
+        })
     }
 }

Alternatively, return a user-friendly error instead of panicking to handle this case gracefully.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if self.sv_type() == interface::StrucvarsSvType::Ins {
self.start
} else {
self.stop.unwrap_or(self.start)
}
if self.sv_type() == interface::StrucvarsSvType::Ins {
self.start
} else {
self.stop.unwrap_or_else(|| {
panic!("`stop` must be provided for sv_type {:?}", self.sv_type());
})
}

Comment on lines +61 to +68
fn sv_type(&self) -> interface::StrucvarsSvType {
match self.sv_type.to_uppercase().as_ref() {
"DEL" => interface::StrucVarType::Del,
"DUP" => interface::StrucVarType::Dup,
"INS" => interface::StrucVarType::Ins,
"BND" => interface::StrucVarType::Bnd,
"INV" => interface::StrucVarType::Inv,
_ => interface::StrucVarType::Del,
"DEL" => interface::StrucvarsSvType::Del,
"DUP" => interface::StrucvarsSvType::Dup,
"INS" => interface::StrucvarsSvType::Ins,
"BND" => interface::StrucvarsSvType::Bnd,
"INV" => interface::StrucvarsSvType::Inv,
_ => interface::StrucvarsSvType::Del,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Handle unrecognized sv_type values explicitly

Currently, the sv_type() method defaults to Del when an unrecognized sv_type is provided. This could lead to unexpected behavior if invalid input is given. Consider handling unrecognized sv_type values explicitly by returning an error or adding an Unknown variant to StrucvarsSvType.

Apply this diff to handle invalid sv_type values:

 fn sv_type(&self) -> interface::StrucvarsSvType {
     match self.sv_type.to_uppercase().as_ref() {
         "DEL" => interface::StrucvarsSvType::Del,
         "DUP" => interface::StrucvarsSvType::Dup,
         "INS" => interface::StrucvarsSvType::Ins,
         "BND" => interface::StrucvarsSvType::Bnd,
         "INV" => interface::StrucvarsSvType::Inv,
-        _ => interface::StrucvarsSvType::Del,
+        _ => {
+            // Return an error or handle the invalid value appropriately
+            return interface::StrucvarsSvType::Unknown;
+        }
     }
 }

Ensure that StrucvarsSvType includes an Unknown variant to handle invalid cases gracefully.

Committable suggestion skipped: line range outside the PR's diff.

@holtgrewe holtgrewe merged commit cce8a6d into main Nov 11, 2024
10 checks passed
@holtgrewe holtgrewe deleted the 607-provide-endpoint-alternative-with-openapi-schema-for-strucvarscsq branch November 11, 2024 09:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Provide endpoint alternative with OpenAPI schema for /strucvars/csq
1 participant