Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

profile naming and interface #124

Closed
roll opened this issue Oct 10, 2016 · 8 comments
Closed

profile naming and interface #124

roll opened this issue Oct 10, 2016 · 8 comments
Assignees
Labels
general General improvements

Comments

@roll
Copy link
Member

roll commented Oct 10, 2016

Overview

There are two questions:

  • profile is also in use for datapackage profiles
  • profile interface could be updated

Naming

TBD

Interface

@pwalsh has wrote:


I'm not a fan of passing in data (errors and table in this case), modifying it, and then not returning that modified data. https://github.com/frictionlessdata/goodtables-py/blob/next-initial/goodtables/profiles/datapackage.py#L15

Based on the way profiles are used here, I don't see why the profile function needs errors and table passed in as arguments - they are always empty. Why not get rid of the empty list assignments and have

errors, tables = profile_handler(source, **options)  # sidenote: handler because more meaningful than func?

The reason why now profiles gets predefined errors and tables - API consistency with checks (but that's true we could removed errors and tables from profile signature):

profile(errors, tables, ...)
check(errors, columns, ...)

Also cc @amercader

@roll roll added this to the goodtables-v1 milestone Oct 10, 2016
@roll roll added general General improvements priority labels Oct 10, 2016
@roll roll changed the title Replace profile as a name for different datasets (table/datapackage/etc) profile naming and interface Oct 13, 2016
@pwalsh
Copy link
Member

pwalsh commented Oct 13, 2016

why does check need to have the same API as profile?

@roll
Copy link
Member Author

roll commented Oct 13, 2016

@pwalsh
It doesn't.

Here I just use principle like don't surprise your user. After writing first custom check user could expect that passing errors to a function is a standard pattern for this API. May be not perfect but consistent across the framework.

Not having a strong preference here esp after your words (that it's confusing). Just interested in how others look on this problem.

@roll roll added current and removed priority labels Oct 18, 2016
@roll roll self-assigned this Oct 20, 2016
@roll
Copy link
Member Author

roll commented Oct 26, 2016

About profile naming. There is a collision with datapackage profile but may be because it means the same in a good way (of collision). Like there is a dataset which could be:

  • table
  • ckan instance
  • datapackage
  • fiscal datapackage
  • etc

And distinction between is a dataset profile. So datapackage profiles are subset of overall dataset profiles.

But if collision is still seems bad may use something safe like handler?

@roll roll added the helpme label Oct 26, 2016
@pwalsh
Copy link
Member

pwalsh commented Oct 26, 2016

@roll let's hear from others. @amercader @vitorbaptista @akariv please jump in when you can.

For me:

I prefer handler or anything that is not profile.

For the other issue above, I don't like the arbitrary enforcement of the same API for profile and check. But I honestly don't care if it stays like that because you do prefer it :).

@amercader
Copy link
Member

Sorry it's hard to follow this discussion if you don't know the context.

  1. What is/does the profile object/function in the context of goodtables as opposed to the profiles in Data Packages? I agree that if not the same thing they should be named differently
  2. Consistency is good but if it means having to always pass empty params is probably pushing it too far. As long as all functions on the API are well documented, I don't think we need to enforce a pattern

@roll
Copy link
Member Author

roll commented Oct 27, 2016

@amercader
Both profile are pretty close as a dataset characteristic. But technically it's still different things. I'm also a big fan of not having different things with the same names. But this one was tricky)

So let's:

  • rename profile to handler (we get input from user and handler handles it)
  • remove empty arguments from handler signature

@roll
Copy link
Member Author

roll commented Oct 27, 2016

I've started to update README using handler but there was a feeling that for end-user this word has to less sense. So I've tried to use word preset (in goodtables could be used different presets to process input data like datapackage preset) - https://github.com/frictionlessdata/goodtables-py/pull/137

@pwalsh
Copy link
Member

pwalsh commented Oct 27, 2016

@roll I saw and approved the PR

roll pushed a commit that referenced this issue Oct 27, 2016
* renamed profile to preset

* removed errors, tables arguments from preset
@roll roll closed this as completed Oct 27, 2016
@roll roll removed the review label Oct 27, 2016
roll pushed a commit that referenced this issue Nov 2, 2016
* Rebased on goodtables.next codebase (#118)

* removed current codebase

* added updated codebase

* fixed linting

* updated readme

* updated readme

* updated readme

* fixed source checks

* added dataset checks

* min style change

* removed ecode filter from filter_checks

* renamed cells back to columns + row_number

* added dataset check stubs

* implemented dataset checks

* fixed linting

* moved __inspect_table next to inspect for better reading

* fixed list.clear for python2

* added error limit to dataset errors

* updated readme

* updated readme

* added breaking note to readme

* added custom checks support

* implemented custom profiles

* fixed linting

* added options order_fields and infer_fields

* fixed extra_header

* added comments

* updated spec

* renamed unordered_headers to non_matching_header

* renamed col-number to column-number

* min

* updated added dataset errors to readme

* splitted error and check concpets

* fixed linting

* min

* fixed readme

* updated readme

* updated readme

* fixed readme

* fixed readme

* added guard assertion to checks

* updated custom checks API

* fixed linting, readme

* fixed linting, readme

* updated readme

* updated readme

* typo

* moved table errors to Inspector, deleted checks

* added ability to profilies to return errors

* fixed head checks not columns break

* rebased check on in-place erorrs update

* rebased profiles on in-place errors

* fixed readme

* fixed readme

* fixed readme

* fixed readme

* fixed readme

* no extra-header error if infer_fields is True

* implemented proper non-matching-header without ordering

* added custom_profiles, custom_checks arguments instead of global
registry

* moved default args to signatures

* removed make release (use github releases instead!)

* added return code to cli

* improved cli error formatting

* improved error messages, tests

* updated examples

* added custom examples

* added inspector tests

* added limit tests

* fixed tests

* fixed spec link

* added checks options to cli

* fixed profiles

* added description to setup

* added entry_points, keywords

* moved ckan profile to examples

* removed report from spec

* updated version to v1

* updated install instruction for now

* Fixed jsontableschema-error message (#133)

* Rebased on granular tabulator exceptions (#115)

* updated dependencies

* rebases on new tabulator exceptions

* Renamed profile to preset with simplified API (#124)

* renamed profile to preset

* removed errors, tables arguments from preset

* Added infer_schema option, updated preset API (#128)

* minor improvements

* added infer_schema option false by default

* Added support for schema constraints (#55)

* updated jsontableschema version

* implemented all constraints except unique

* implemented unique constraint check

* updated to jsontableschema-v0.8.2

* fixed linting

* Added tables preset (#125)

* added tables preset

* fixed linting

* added tables test

* Implemented order_fields option (#123)

* fixed column producing for body context

* removed column from schema checks only if name slugs are different

* implemented order_fields algo

* improved comments

* Rebased on external spec (#131)

Rebased on external spec

* Improved tests (#127)

* added prev version data examples

* implemented feature tests

* moved files to data

* updated verson

* added features

* marked spec test as xfail

* Rebased on spec-v1.0.0-alpha1 (#131)

* updated spec, added spec to API

* added config with checks order

* rebased in inspector on updated spec

* updated @check API

* rebased on spec message templates

* fixed cutom checks

* fixed linting

* Updated readme note
@roll roll moved this to Done in Open Knowledge Dec 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
general General improvements
Projects
Archived in project
Development

No branches or pull requests

3 participants