Breaking up large swagger.yaml files #254

Closed
patrickw276 opened this issue Jul 6, 2016 · 65 comments · Fixed by #1648

Comments

@patrickw276

Is there any way to break up large swagger.yaml files into smaller pieces? I've tried using external references, but it doesn't seem like Connexion supports these. I've also tried using Jinja features to break the swagger files up, but this doesn't work (and it's ugly).

Am I missing a feature that would let me do this? If not, is relative referencing a feature Connexion would be interested in? I know that the jsonschema library used internally supports relative referencing so it might not be too difficult to implement. If you guys are interested, I can take a crack at adding the feature.

@hjacobs
Contributor

hjacobs commented Jul 7, 2016

@patrickw276 I think it's not supported right now in Connexion, but "people" might be interested in this feature. I personally don't have the need for it as IMHO "microservices" should not have large Swagger files 😏.

@rafaelcaricio
Collaborator

rafaelcaricio commented Jul 7, 2016

As @hjacobs said, people using Connexion might be interested in this kind of feature. It would be interesting to support. Please send a PR and we will review it. 😃

@patrickw276
Author

@hjacobs My use case doesn't involve very much processing but the model's schema is somewhat complex. It doesn't make much sense to break the API into smaller microservices (IMO).

I've started working on a PR where the swagger document's external references are resolved into a single dictionary before being passed to the swagger validator. For example:

# doc1.yaml
key1: value1
key2:
  $ref: 'doc2.yaml#/key3'

# doc2.yaml
key3: value3

will resolve to a dictionary that looks like

{'key1': 'value1', 'key2': 'value3'}

Recursive references will be handled by pointing the recursive $ref to its new location in the combined document. Do you guys see any issues with this approach? Also, I don't plan on implementing non-local reference resolution (e.g. using http).
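
A minimal sketch of the resolution step described above, not the code from the eventual PR. It assumes the documents are already parsed into dictionaries and that a hypothetical `load` callable maps a file name to its parsed content; recursive references, local references inside the external file, and JSON pointer escaping are deliberately left out.

```python
from typing import Any, Callable, Dict

# Hypothetical loader type: maps a file name to its parsed content.
Loader = Callable[[str], Dict[str, Any]]


def resolve_pointer(document: Any, pointer: str) -> Any:
    """Follow a simple JSON pointer such as '/key3' into a parsed document."""
    node = document
    for part in pointer.lstrip("/").split("/"):
        if part:
            node = node[part]
    return node


def resolve_external_refs(node: Any, load: Loader) -> Any:
    """Return a copy of node with every external $ref replaced by its target."""
    if isinstance(node, dict):
        ref = node.get("$ref")
        if isinstance(ref, str) and not ref.startswith("#"):
            file_name, _, pointer = ref.partition("#")
            target = resolve_pointer(load(file_name), pointer)
            # The target may itself contain external references.
            return resolve_external_refs(target, load)
        return {key: resolve_external_refs(value, load) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_external_refs(item, load) for item in node]
    return node


# In-memory stand-ins for doc1.yaml and doc2.yaml from the example above.
documents = {
    "doc1.yaml": {"key1": "value1", "key2": {"$ref": "doc2.yaml#/key3"}},
    "doc2.yaml": {"key3": "value3"},
}

combined = resolve_external_refs(documents["doc1.yaml"], documents.__getitem__)
print(combined)
```

In a real setup the loader would read and parse the YAML file relative to the referencing document, which is where most of the complexity lives.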

@rafaelcaricio
Collaborator

rafaelcaricio commented Jul 22, 2016

A simpler way, since Connexion already uses Jinja2 for template rendering, would be to use the {% include "something.yaml" %} directive. The only change needed to make this work is to configure the Template#environment attribute in the Connexion code to use the jinja2.FileSystemLoader loader. Then you would be able to split your Swagger/OpenAPI definition across several files.

@patrickw276
Author

@rafaelcaricio That would definitely be simpler to implement, but you can run into some weird issues with local references. These references would have to be declared relative to the complete concatenated document, not the document they actually exist in. I'm not sure how often this would be an issue in a real API, but it's something to consider.

I favor the reference resolution approach I outlined above because it implements a feature as outlined in the Swagger/OpenAPI Specification.
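
To illustrate the local-reference concern, here is a small sketch with made-up data. The `fragment` and `merged` dictionaries and the `resolve_local` helper are hypothetical; the nesting simulates what a textual include does to a fragment that was written as a standalone file.

```python
from typing import Any


def resolve_local(document: Any, ref: str) -> Any:
    """Resolve a local reference such as '#/definitions/Pet' against document."""
    node = document
    for part in ref.lstrip("#/").split("/"):
        node = node[part]
    return node


# Hypothetical fragment file: on its own, "#/Pet" points at its top-level "Pet" key.
fragment = {
    "Pet": {"type": "object"},
    "Pets": {"type": "array", "items": {"$ref": "#/Pet"}},
}

# Simulate a textual include: the fragment ends up nested under "definitions".
merged = {"swagger": "2.0", "definitions": fragment}

ref = fragment["Pets"]["items"]["$ref"]

# The reference is valid relative to the file it was written in...
print(resolve_local(fragment, ref))

# ...but not relative to the concatenated document.
try:
    resolve_local(merged, ref)
except KeyError:
    print("'#/Pet' does not exist in the merged document")

# It would have to be written as '#/definitions/Pet' instead.
print(resolve_local(merged, "#/definitions/Pet"))
```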

@rafaelcaricio
Collaborator

I do not dislike your solution. I am just trying to show more possibilities. ✌️

Anyway, where is this outlined in the Swagger/OpenAPI Specification?

@patrickw276
Author

patrickw276 commented Jul 22, 2016

@rafaelcaricio It's good to be open to other approaches, so I definitely appreciate the input (and all the work you guys do on Connexion in general).

The spec mentions them here and a few other places.

EDIT: Actually, here is a better place in the document to read, along with the link to canonical dereferencing.

@rafaelcaricio
Collaborator

@patrickw276 Sounds good. 👍

@ibigpapa
Contributor

If you are looking for an alternative while this gets baked in: I used a Node app to help, because I have a fairly large Swagger API that is split across 15 to 20 files.

The tool I used makes it really simple to create a single Swagger file that Connexion will work with, as it fully dereferences pointers for you.

https://www.npmjs.com/package/swagger-cli

here's a quick start for you

npm install -g swagger-cli
swagger validate <path_to_root_spec>

Once it's validated, it can produce a single file with the following command:

swagger bundle -r -o <output_path> <path_to_root_spec>

The -r is to fully dereference.

@do3cc

do3cc commented Dec 7, 2016

Out of curiosity, how is that handled in Zalando?
Is this file https://api.zalando.com/schema/swagger.json also bundled?

@ibigpapa
Contributor

ibigpapa commented Dec 7, 2016

Looks like you can find more on the shop api here https://github.com/zalando/shop-api-documentation

@hjacobs
Contributor

hjacobs commented Dec 7, 2016

@do3cc I personally don't know anybody in Zalando who splits up Swagger files (mostly it's "microservices", right?). Our RESTful API Guidelines also don't cover this.

@patrickw276
Author

I came to the conclusion that to do this cleanly, loading the individual files instead of bundling, the yelp swagger_spec_validator would need to support yaml directly. I opened up a PR on that repo but never got any feedback.

I do think there is a valid use case for this feature that can't be solved with microservices. I work with earth science metadata where the schemas can be quite complex. It wouldn't really make sense to break an api storing this data into microservices but breaking the swagger spec up could help with managing the complexity.

@advance512

Any update on this? Using a single file is quite uncomfortable. Why not allow multiple files, as the OpenAPI spec does?

@rafaelcaricio
Collaborator

@advance512 It is not a matter of allowing it or not. We would be happy to accept a PR that solves this issue, but so far there is none.

@cigano

cigano commented Jun 22, 2019

Used @tuankiet65's solution with a few tweaks:

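from pathlib import Path
from typing import Any, Dict

import prance

def get_bundled_specs(main_file: Path) -> Dict[str, Any]:
    # Parse lazily, then resolve every $ref into one combined specification.
    parser = prance.ResolvingParser(str(main_file.absolute()),
                                    lazy=True, backend='openapi-spec-validator')
    parser.parse()
    return parser.specification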

justkrys added a commit to justkrys/racecard that referenced this issue Apr 16, 2020
Connexion cannot currently handle $ref to external files.
prance merges OpenAPI files and dereferences $ref entries.
This works around Connexion's issue.
See for details: spec-first/connexion#254
@tomghyselinck

Unfortunately, the solution using prance does not work for us, since we use recursive schemas.

I.e. something like the following won't work:

    RecursiveItem:
      title: Tree of items
      type: object
      properties:
        name:
          type: string
        children:
          $ref: "#/components/schemas/RecursiveItemList"
      additionalProperties: false
    RecursiveItemList:
      title: collection of recursive items
      type: array
      items:
        $ref: "#/components/schemas/RecursiveItem"
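
A likely reason such schemas are hard to bundle (an assumption, not something confirmed in this thread) is that a resolver that inlines every $ref can never finish on a reference chain that loops back on itself. The sketch below is one hypothetical way to detect such a loop up front; `iter_refs` and `find_cycle` are illustrative names, not part of prance or Connexion.

```python
from typing import Any, Dict, Iterator, List, Optional


def iter_refs(node: Any) -> Iterator[str]:
    """Yield every $ref string found anywhere inside node."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                yield value
            else:
                yield from iter_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_refs(item)


def find_cycle(schemas: Dict[str, Any]) -> Optional[List[str]]:
    """Return one chain of schema names that loops back on itself, or None."""
    prefix = "#/components/schemas/"

    def visit(name: str, path: List[str]) -> Optional[List[str]]:
        if name in path:
            return path[path.index(name):] + [name]
        for ref in iter_refs(schemas.get(name, {})):
            if ref.startswith(prefix):
                found = visit(ref[len(prefix):], path + [name])
                if found:
                    return found
        return None

    for name in schemas:
        cycle = visit(name, [])
        if cycle:
            return cycle
    return None


# The two schemas from the example above, as parsed dictionaries.
schemas = {
    "RecursiveItem": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "children": {"$ref": "#/components/schemas/RecursiveItemList"},
        },
    },
    "RecursiveItemList": {
        "type": "array",
        "items": {"$ref": "#/components/schemas/RecursiveItem"},
    },
}

print(find_cycle(schemas))
```

A resolver can use a check like this to leave references inside a cycle untouched instead of inlining them.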

@tahmidefaz

tahmidefaz commented Oct 29, 2020

@tomghyselinck Anything you guys ended up doing to work around it? Currently, dealing with the same issue.

@HRogge

HRogge commented Oct 30, 2020

I decided not to look at OpenAPI again until they got to version 3.1
I lost too much time on the inconsistencies between JSON-Schema and (OpenAPI3)-JSON-Schema.

@tomghyselinck

tomghyselinck commented Nov 6, 2020

Hi @tahmidefaz, unfortunately I haven't been able to work around it.

We haven't split up the (meanwhile very large) OpenAPI yaml file.

I have been playing around with the connexion and openapi-spec-validator code but haven't been able to make some mature changes.

@tahmidefaz

@tomghyselinck thank you for the reply!

@RonnyPfannschmidt

I took over the maintenance of prance and started working towards supporting recursive refs and OpenAPI 3.1.

@Glutexo
Contributor

Glutexo commented May 18, 2021

Recursive references are going to be supported very shortly. There is already an alternative parser ready that handles this correctly.

Update 1: A pull request here – RonnyPfannschmidt/prance#101
Update 2: Merged!

@aniketbhatnagar

Thank you for fixing this. Any clue when this will be released?

@aniketbhatnagar

Thank you for fixing this. Any ETA on when to expect recursive references to work?

@advance512

@Glutexo is this already part of the latest released version?

@RonnyPfannschmidt

@advance512 recursive references are part of the latest release, but it needs a correct parser configuration atm

@alexandr-san4ez

One more way to use $ref with separate files:

import connexion
from connexion.json_schema import default_handlers as json_schema_handlers

app = connexion.App(__name__, specification_dir='swagger/')

json_schema_handlers[''] = lambda uri: json_schema_handlers['file'](str(app.specification_dir / uri))
app.add_api('spec_v1.yaml')

I tested it on connexion==2.13.1. Not sure how it will work with other versions.

Perhaps this will be useful to someone.
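
A rough sketch of why registering a handler under the empty string can work, assuming the reference resolver picks a handler by URI scheme. The `dispatch` function and `fake_file_handler` below are simplified, hypothetical stand-ins, not the jsonschema or Connexion code.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def fake_file_handler(uri: str) -> str:
    """Hypothetical stand-in for a real 'file' handler; it only reports its input."""
    return f"file handler got {uri}"


def dispatch(uri: str, handlers: dict) -> str:
    """Pick a handler from a scheme-keyed table, based on the URI's scheme."""
    scheme = urlsplit(uri).scheme
    if scheme not in handlers:
        raise LookupError(f"no handler for scheme {scheme!r}")
    return handlers[scheme](uri)


handlers = {"file": fake_file_handler}
specification_dir = PurePosixPath("swagger")  # assumed directory, as in the snippet above

# A bare relative reference has an empty scheme, so it needs a handler under "".
handlers[""] = lambda uri: handlers["file"](str(specification_dir / uri))

print(repr(urlsplit("definitions.yaml").scheme))
print(dispatch("definitions.yaml", handlers))
```

The handler under the empty key only rewrites the relative name into a path inside the specification directory and hands it to the existing file handler.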

@wackykid

wackykid commented Aug 3, 2022

i really do wish this feature is implemented too... i am involved in a large scale project where there are multiple developers contributing to the development so having a single YAML file is too huge and difficult to manage... so we split it into multiple files by different modules each of them being a microservice... but collectively they are all considered as part of the same system...

the way we have to deal with this is to have a script to generate flask server for each YAML file and having each of them run independently with its own flask instance using different ports... ideally we would want to be able to just run a single flask server so that we don't have to deal with a different port for each module/microservice...

@erans

erans commented Aug 3, 2022

The above solution (#254 (comment)) helps load multiple YAML files referenced from a main one, but you also need this:

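# Imports this function relies on (locations assumed from connexion 2.x).
from collections.abc import Mapping
from copy import deepcopy

from connexion.utils import deep_get
from jsonschema import RefResolver

def my_resolve_refs(spec, store=None, handlers=None):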
    """
    Resolve JSON references like {"$ref": <some URI>} in a spec.
    Optionally takes a store, which is a mapping from reference URLs to a
    dereferenced objects. Prepopulating the store can avoid network calls.
    """
    from connexion.json_schema import default_handlers

    spec = deepcopy(spec)
    store = store or {}
    handlers = handlers or default_handlers
    resolver = RefResolver("", spec, store, handlers=handlers)

    def _do_resolve(node):
        if isinstance(node, Mapping) and "$ref" in node:
            path = node["$ref"][2:].split("/")
            try:
                # resolve known references
                node.update(deep_get(spec, path))
                del node["$ref"]
                return node
            except KeyError:
                # resolve external references
                with resolver.resolving(node["$ref"]) as resolved:
                    # if not fixed:
                    #     return resolved
                    # else:
                    return _do_resolve(resolved)
                    # endfix
        elif isinstance(node, Mapping):
            for k, v in node.items():
                node[k] = _do_resolve(v)
        elif isinstance(node, (list, tuple)):
            for i, _ in enumerate(node):
                node[i] = _do_resolve(node[i])
        return node

    res = _do_resolve(spec)
    return res

This is a slightly modified $ref resolver that is recursive, so if it sees $ref: fileb.yml/MyObject somewhere inside the files, it will know how to resolve it (even if it's not in the main file).

Once you have this function in the code you can patch it by applying:

spec.resolve_refs = my_resolve_refs

Prior to calling add_api.

This is basically a combination of a few things I've found online that helped us manage it without forking connexion.

@zdanial

zdanial commented Dec 6, 2022

tried a bunch of the solutions here but couldn't get any to work. opted for the quick and dirty approach as follows:

import connexion

app = connexion.App(__name__, specification_dir='swagger/')
with open('./myapi/swagger/myapi_merged.yml', 'w') as out, \
    open('./myapi/swagger/myapi.yml', 'r') as main, \
    open('./myapi/swagger/myapi_schemas.yml', 'r') as schemas:

    main_yml = main.read()
    merged_yml = main_yml.replace('{{schemas}}', schemas.read())

    out.write(merged_yml)

app.add_api('myapi_merged.yml')

so the last part of main.yml is:

components:
    ...
    schemas:
    {{schemas}}

you don't need to use {{schemas}}; I figured jinja syntax would be most readable, but any unique marker works here.
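
One thing to watch with plain text replacement is indentation, since YAML nesting depends on it. The sketch below is a hypothetical variant of the approach above: `merge_placeholder` is an illustrative helper, and it assumes the placeholder sits alone on its line, indented to the depth where the fragment should be nested.

```python
import textwrap


def merge_placeholder(main_text: str, placeholder: str, fragment_text: str) -> str:
    """Replace the line holding placeholder with fragment_text, indented to match."""
    merged_lines = []
    for line in main_text.splitlines():
        if line.strip() == placeholder:
            # Reuse the placeholder line's own leading whitespace for every fragment line.
            indent = line[: len(line) - len(line.lstrip())]
            merged_lines.append(textwrap.indent(fragment_text.rstrip("\n"), indent))
        else:
            merged_lines.append(line)
    return "\n".join(merged_lines) + "\n"


# Made-up file contents standing in for the main spec and the schemas file.
main_text = "components:\n  schemas:\n    {{schemas}}\n"
fragment_text = "Pet:\n  type: object\n"

merged = merge_placeholder(main_text, "{{schemas}}", fragment_text)
print(merged)
```

With this, the schemas file can be written at column zero and still land at the right nesting level in the merged file.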

@phasath

phasath commented Feb 4, 2023

Is referencing not working yet? The issue is a bit confusing: some answers say support was added and merged, yet it still doesn't seem to work?

RobbeSneyders added a commit that referenced this issue Feb 22, 2023
Fixes #254 
Fixes #967 

This PR fixes the very long-standing issue of being able to handle
relative references, which allows users to split their specification
into multiple files.