Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: use rewriter to modify wrapping #3

Merged
merged 1 commit into from
Nov 21, 2015
Merged

WIP: use rewriter to modify wrapping #3

merged 1 commit into from
Nov 21, 2015

Conversation

visr
Copy link
Member

@visr visr commented Oct 13, 2015

So far this PR should cover all manual edits to the wrapped code in RasterIO.jl, namely:

  1. CPLErr was added manually: solved by wrapping cpl_error.h
  2. Replace ::Cint with ::Integer as in wkearn/RasterIO.jl@7da9573
  3. Let wrap_gdal.jl comment out some lines in common.jl, as in ad44fac

This rewriting functionality is quite powerful, but I am not yet sure of all useful applications or what to do with those. Some ideas:

  • padfSrcGeoTransform::Ptr{Cdouble}: rewrite to accept Vector{Float64}
  • pszDstWKT::Ptr{UInt8}: rewrite to accept AbstractString

Another extreme would be to omit types in function signatures altogether, like is done in CUDArt.jl. We could also convert all function names to lowercase for instance. But I'm currently not convinced these last two are good ideas. Any input or other ideas welcome.

@visr
Copy link
Member Author

visr commented Oct 13, 2015

Error checking all functions that return a CPLErr is probably also a good idea, this is also used in the wrap_cuda.jl that I linked.

@yeesian
Copy link
Member

yeesian commented Oct 13, 2015

💯

It'll be good to figure out the appropriate names for the enums too, e.g. https://github.com/visr/GDAL.jl/blob/rewriter/src/clang/common.jl#L227 versus https://github.com/wkearn/RasterIO.jl/blob/master/src/clang/gdal_cdefs.jl#L55

@yeesian
Copy link
Member

yeesian commented Oct 13, 2015

I'm not sure about going all the way in removing all the types from the function signatures either, even if it has been unnecessarily restrictive on a number of occasions

@visr
Copy link
Member Author

visr commented Oct 18, 2015

A few other things have snuck into this PR, because I need to start testing/using this API to find the parts that need improvement.

For instance, GDALGetMetadata returns a Ptr{Ptr{UInt8}}, though I think this should really just return a Dict, see the code here. Looking at the GDAP API Tutorial under Techniques for Creating Files, it shows the C and C++ examples calling a method CSLFetchBoolean that we haven't wrapped, whereas in the Python example a dict is returned as well.

@yeesian
Copy link
Member

yeesian commented Oct 18, 2015

I'm really happy with the progress you're making here.

A few other things have snuck into this PR, because I need to start testing/using this API to find the parts that need improvement.

I personally prefer to

  • keep the original GDAL functions generated by Clang too (we should still replace Cint by Integer, etc, but not to the extent of rewriting the pointers into julian objects). I might write a scraper to fetch documentation from the gdal website as docstrings at some point. If it's cluttering the namespace, we can put them all in a submodule. That way we don't need to resort to ccall-ing the original GDAL functions, and can have the API documentation at the REPL.
  • Provide higher-level julia functions (auto-generated or handwrapped) that take-in and emit julian types (Dict/Vector/UTF8String/etc), that we'll have to write documentation for. We can either do multi-dispatch on the original GDAL names (which might be confusing), or provide our own set of julian function names.

I think it'll be good if we can do the error-checking at the low-level GDAL layer~~, but it sounds like a tall order for the rewrite phase though. We might need auxiliary information from the documentation too, so it's probably better to wait before doing another PR for error-checking.~~ Edit: I just saw your work on checking null pointers, and emitting error messages, and it's great! Am I right to check that only the CPLErrors are left?

For instance, GDALGetMetadata returns a Ptr{Ptr{UInt8}}, though I think this should really just return a Dict, see the code here.

Sure. But my preference will be for it to not overwrite the original GDAL function (see above).

@visr
Copy link
Member Author

visr commented Oct 18, 2015

Thanks for your comments. It is good to decide where to draw the line between the low-level GDAL interface and the higher-level Julia functions. I'm not trying to take the step from low-level to high-level directly in this PR.

The way I see it, this is all still the low-level GDAL interface. It's good to have the whole thing available for if somebody wants to use a function that is not in the higher-level API. This means it should be directly usable from Julia, without it being too awkward.

At the same time, I agree the higher level API should be written entirely from function calls to the lower level API, no ccalls. Therefore I understand you don't want too much rewriting, but for me it's still not completely clear where you want to draw the line:

keep the original GDAL functions generated by Clang too (we should still replace Cint by Integer, etc, but not to the extent of rewriting the pointers into julian objects)

What I currently have in mind for the low-level API:

  • Keep all original function names intact
  • In general, keep the original function signatures intact, except for:
    • Replacing basic data types: input/output strings, not Ptr{Uint}; loosen Cint to Integer; OrderedDict instead of Ptr{Ptr{UInt8}}
  • Don't replace main GDAL types, which are typealiased Ptr{Void}.

For the case of GDALGetMetadata returning a Ptr{Ptr{UInt8}}, why would you want to also retain a version returning Ptr{Ptr{UInt8}}? I'm asking since I don't think we want to get three API layers, one low-level without any rewriter, a second 'nice' low-level, autogenerated from the first, and a high-level API. Or is that what you would prefer?

Regarding the low-level error checking, with a8c750b we are hooked into GDAL's internal error checking. If an error is raised, the type, code and message gets sent to our Julia function to deal with. This should take care of most low-level error handling needs, and we can change the Julia function when we're not happy. On top of that I added the C_NULL return check for selected return types, because some things GDAL doesn't consider an error that we may want to be an error. I could remove this automatic C_NULL check though.

@yeesian
Copy link
Member

yeesian commented Oct 18, 2015

For the case of GDALGetMetadata returning a Ptr{Ptr{UInt8}}, why would you want to also retain a version returning Ptr{Ptr{UInt8}}? I'm asking since I don't think we want to get three API layers, one low-level without any rewriter, a second 'nice' low-level, autogenerated from the first, and a high-level API. Or is that what you would prefer?

I think that's what I was hinting at. I guess I wasn't thinking of having a high-level API in this package. You might be able to convince me otherwise though.

What I have in mind:

module GDAL
    module capi
        # directly from Clang, i.e. "really" low-level
        # pointers/error-messages/etc handled here
        function GDALGetAsyncStatusTypeByName(arg1::Ptr{UInt8})
        ...
        function GDALDestroyDriver(arg1::Ptr{Void})
    end

    # auto re-written, i.e. "nice" low-level
    # pointers should be disambiguated into the corresponding GDAL types
    abstract MajorObject
    immutable Dataset <: MajorObject
        ptr::Ptr{Void}
    end
    immutable Driver <: MajorObject
        ptr::Ptr{Void}
    end

    # takes in and emits julian types
    # (function names generated by removing GDAL/OGR/CPL in front,
    #  and lowercasing everything)
    function getasyncstatustypebyname(name::String)
    ...
    destroydriver(d::Driver) = capi.GDALDestroyDriver(d.ptr)
end

I don't think it's the responsibility of this package to provide a nicer API than that -- that sounds like the job of downstream packages (e.g. RasterIO.jl, or OGR.jl or any other package) that wants to provide a nice way of working with GDAL, e.g.

module RasterIO # downstream package
    import GDAL
    type RasterDataset
        dataset::GDAL.Dataset
        nbands::Int
        width::Int
        height::Int
    end
    asyncstatustype(name::String) = GDAL.getasyncstatustypebyname(name)
    destroy(d::GDAL.Driver) = GDAL.destroydriver(d)
end

module OGR
    import GDAL
    type FeatureDataset
        dataset::GDAL.Dataset
        ...
    end
    ...
end

You might ask -- why have module capi at all? Some reasons off the top of my head:

  1. Sometimes it might be better to just get the pointer output, and pass it on to another GDAL function, without having to load it into a bytestring/array/etc. There might be people who want to bypass that step, and not load objects into julia's ownership (to reduce GC churn), which our "nice" low-level won't allow for.
  2. It allows us to document with docstrings (fetched directly from GDAL's documentation) what users otherwise have to search through gdal's online documentation for (when ccall-ing).

@yeesian
Copy link
Member

yeesian commented Oct 18, 2015

I don't think it's the responsibility of this package to provide a nicer API than that -- that sounds like the job of downstream packages

This package will also need to provide build scripts, maintain documentation for re-written functions, and test that the wrappers are working etc, for (potentially) different GDAL versions in the future. I think that sounds like enough responsibility on its own, without having to also worry about the friendliness of the package for end-users?

@visr
Copy link
Member Author

visr commented Oct 18, 2015

Ok that's clear. I understand the possible benefits of having a capi submodule now. I'll get rid of most of the rewriter stuff, and instead reapply (and intensify) it to generate the GDAL module. Though I don't think I'll have much time available this week.

I also think the true high-level API is better suited for other packages, it was a 'thin' wrapper after all.

function rewriter(ex::Expr)
# comment out predefined expressions
if in(ex, skip_expr)
return string("# ", e)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be return string("# ", ex) ?

@meggart
Copy link
Member

meggart commented Oct 19, 2015

Sorry for coming so late into this thread, thanks for starting this. I really like the model proposed by @yeesian. However, can we still keep the error handling as it is right now or do you think this is too high-level?

@visr
Copy link
Member Author

visr commented Oct 19, 2015

@meggart thanks for the comments. I addressed them in the last commit. Also stopped the rewriting and put everything in a submodule C. Sorry it's such a mess right now, I will squash later. With the error handling, do you mean the checknull function? Or the error handling? I removed the checknull for now, but we could consider bringing it back.

Add an __init__ function that always runs GDALAllRegister on startup.
Add tests that one ore more drivers exist.

Use GDAL's custom error handler funcitonality:
all errors internally reported to CPLError() are now handled by a julia function
currently this function always throws an error such that we don't need to check for them

For other functions such as GetDriverByName, when a driver cannot be found,
this is not reported to CPLError() and a NULL pointer is returned.
In this case we ofter still want an error, that is what checknull() does.
@yeesian
Copy link
Member

yeesian commented Nov 20, 2015

What's left here to wrap things up? Might I be able to help in any way?

@visr
Copy link
Member Author

visr commented Nov 20, 2015

I think it is indeed best to wrap up this PR such that we can think about
the next steps that need to be done. Are you mostly satisfied with what's
in here? If so I can squash and merge tomorrow.

On Fri, 20 Nov 2015 20:04 Yeesian Ng notifications@github.com wrote:

What's left here to wrap things up? Might I be able to help in any way?


Reply to this email directly or view it on GitHub
#3 (comment).

@yeesian
Copy link
Member

yeesian commented Nov 21, 2015

Yup, I've checked it out, and it works for me. @meggart?

@meggart
Copy link
Member

meggart commented Nov 21, 2015

I can test this evening.

@meggart
Copy link
Member

meggart commented Nov 21, 2015

Looks good for me, too. Please merge when you are ready

visr added a commit that referenced this pull request Nov 21, 2015
use rewriter to modify wrapping
@visr visr merged commit 22b70b7 into master Nov 21, 2015
@yeesian
Copy link
Member

yeesian commented Nov 21, 2015

🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants