-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fluent interfaces #67
Comments
What is the result if you do this? return (
sa.sql
.select([sa.sql.func.count(membership.c.user_id)])
.where(membership.c.group_id == cls.id)
.group_by(membership.c.group_id)
.label('members_count')
) |
Black doesn't support fluent interfaces or any other hand-formatted idioms. Backslashes are always removed, I responded as to why in #64. This will be solved by |
And for the record, I personally find the Black-formatted version easier on the eyes. Leading dots are invalid syntax in Python without backslash magic. |
No plans to change Black's formatting approach here; we already have #5 for |
I will reconsider this. My current idea is as follows:
By fluent interface I understand treating "." as a delimiter that comes first on the line. Is there anything else? |
@ambv I had never heeard of it called fluent interface before, so I'm not sure :) I do stuff like this a lot in Warehouse though: release = (
request.db.query(Release)
.filter(Release.project == project)
.order_by(
Release.is_prerelease.nullslast(),
Release._pypi_ordering.desc())
.limit(1)
.one()
) project_names = [
r[0] for r in (
request.db.query(Project.name)
.order_by(Project.zscore.desc().nullslast(),
func.random())
.limit(5)
.all())
]
release_a = aliased(
Release,
request.db.query(Release)
.distinct(Release.name)
.filter(Release.name.in_(project_names))
.order_by(Release.name,
Release.is_prerelease.nullslast(),
Release._pypi_ordering.desc())
.subquery(),
)
trending_projects = (
request.db.query(release_a)
.options(joinedload(release_a.project))
.order_by(func.array_idx(project_names, release_a.name))
.all()
)
latest_releases = (
request.db.query(Release)
.options(joinedload(Release.project))
.order_by(Release.created.desc())
.limit(5)
.all()
)
counts = dict(
request.db.query(RowCount.table_name, RowCount.count)
.filter(
RowCount.table_name.in_([
Project.__tablename__,
Release.__tablename__,
File.__tablename__,
User.__tablename__,
]))
.all()
) I pretty much only do it with SQLAlchemy, but that's also the only thing I really use whose API is a long chain of methods like that. The reason I do it that way (and afaik prior to learning others did it, I didn't know it had a name!) was because it matches the way I would format the SQL the most and groups the important aspects together on one line (or a few lines), so like taking a look at: release = (
request.db.query(Release)
# The "Filter by Release.project" is one logical "thought"
.filter(Release.project == project)
# The Order by is another logical though, but is too long
# for one line, so it has to get split.
.order_by(
Release.is_prerelease.nullslast(),
Release._pypi_ordering.desc())
# again, a single logical though, which isn't really modified at all by
# the previous lines, they could be deleted/added and this would
# stay the same.
.limit(1)
# and again, one logical thought.
.one()
) The black way of formatting that, which is: if True:
if True:
release = (
request.db.query(Release).filter(Release.project == project).order_by(
Release.is_prerelease.nullslast(), Release._pypi_ordering.desc()
).limit(
1
).one()
) The individual lines end up doing more than one logical thing, and you end up having to be much more careful about having to read through it to actually understand what's going on, because it each line ends up doing multiple things (the Anyways, I'm glad you're reconsidering this. The conditions as described sounds like it would work fine for me. |
Oh just for completness sake, heres the Python code and the SQL that it's mimic'ing: request.db.query(Release)
.filter(Release.project == project)
.order_by(
Release.is_prerelease.nullslast(),
Release._pypi_ordering.desc())
.limit(1)
.one() SELECT * FROM releases
WHERE releases.project = 'foo'
ORDER BY
releases.is_prerelease NULLS LAST,
releases._pypi_ordering DESC
LIMIT 1 That SQL isn't completely valid, but you can get the idea? |
@ambv Thanks. @ambv & @dstufft : Fluent interfaces have been originally discussed by Martin Fowler in 2005 in: https://martinfowler.com/bliki/FluentInterface.html and Eric Evans in the "Domain Driven Design" (blue) book. There are some discussions in https://en.wikipedia.org/wiki/Fluent_interface , more specifically: https://en.wikipedia.org/wiki/Fluent_interface#Python though not very deep. In the Python context, I have seen them used mostly for creating DSLs (as stated in the top of the Wikipedia article), more specifically for ORM and ORM-like libraries and frameworks (SQLAlchemy, others). My and @dstufft 's examples give a pretty good idea of what is considered "canonical" code when building queries with the SQLAlchemy DSL. Regarding the Python standard library, the only cas I can think of strings substitutions. Here's an example where Black reformats Flask's code in a less than savoury way: - rv = dumps(obj, **kwargs) \
- .replace(u'<', u'\\u003c') \
- .replace(u'>', u'\\u003e') \
- .replace(u'&', u'\\u0026') \
- .replace(u"'", u'\\u0027')
+ rv = dumps(obj, **kwargs).replace(u'<', u'\\u003c').replace(
+ u'>', u'\\u003e'
+ ).replace(
+ u'&', u'\\u0026'
+ ).replace(
+ u"'", u'\\u0027'
+ ) Regarding how to deal with it: IMHO the best way would be to consider that code with trailing backslashes has already been crafted manually by the programmer, and don't change it. This is, more or less, the current behaviour of Yapf or PyCharm's fomatter. @ambv It you still don't agree with this idea, what you propose seems to make sense. I note, however, that programmers will have to explicitly add parenthesis about long fluent expressions (as well as non-fluent one, like complex boolean expressions with no function calls) in order for Black to split the lines and not create a pep8 violation. |
Backslashes are bad form. Organizational parentheses are much preferred. |
I generally don't use backslashes for these, because multiple lines of backslashes always seemed off to me, parentheses seemed to always make the most sense to me. |
some thoughts. the pattern is breaking it into pieces that should (preferably) end up on "logical lines":
each of these could be one line, or more than one if the line becomes too long. when wrapping, i think normal black rules could apply, with the exception that the closing paren should not go on a line of its own. re. indenting of the whole statement, multiple-of-four indents should be required, just like elsewhere. that can.be easily achieved like this:
the final closing paren is debatable, and could go on a new line. |
Final closing parens are always dedented and on a separate line in Black. |
For the final closing paren, I've switched back and forth between new line and inline with the final line, I think either works fine. I think the only tweak I'd make here is I'd align the query = (
a.b.c
.d()
.e(
...,
...
)
) This actually ends up having the dot leading lines at not a multiple of 4, but in my experience, these things tend to be "normal" python attribute access, until you get to the chain of methods portion, and I find it easier to read down the order of operations if they're all aligned. I think this is more obvious with the example I posted above: release = (
request.db.query(Release)
.filter(Release.project == project)
.order_by(
Release.is_prerelease.nullslast(),
Release._pypi_ordering.desc())
.limit(1)
.one()
) Here release = (
request.db.query(Release)
.filter(Release.project == project)
.order_by(
Release.is_prerelease.nullslast(),
Release._pypi_ordering.desc())
.limit(1)
.one()
) because it still provides some visual distinction that indicates there is a reason why these lines are starting with That being said, something like: release = (
request.db
.query(Release)
.filter(Release.project == project)
.order_by(
Release.is_prerelease.nullslast(),
Release._pypi_ordering.desc())
.limit(1)
.one()
) Isn't the worst thing ever, you just basically have to mentally ignore the first line completely when parsing the fluent interface portion. I think it's worse than aligning with the Although I guess we'd have the middle set of parens on it's own line too, so it would end up looking like: release = (
request.db.query(Release)
.filter(Release.project == project)
.order_by(
Release.is_prerelease.nullslast(),
Release._pypi_ordering.desc()
)
.limit(1)
.one()
) If you do the "align with the The downside to this, is you end with two "partial" indents until you're back on always multiples of 4. I think this is OK because I think it's far more readable. Another option is: release = (
request.db.query(Release)
.filter(Release.project == project)
.order_by(
Release.is_prerelease.nullslast(),
Release._pypi_ordering.desc()
)
.limit(1)
.one()
) Which puts 20 spaces before the |
indents >4 are avoided by black, and personally i mostly agree with that (wastes space and ugly), but in any case i don't see why fluent apis should be an exception. |
Black never aligns with anything while indenting. It will indent four characters within any block or bracket pair and that's it. This always gives us maximum space available for the line content. Also, I feel that being consistent is more important than hand-optimizing particular edge cases. The simplest implementation will be to simply treat the dot as a delimiter (like "and", "or", etc.). I don't particularly like hand-optimizing leading attribute accesses because attribute accesses might just as well happen later, eg.: result = (
request
.db
.query(Release)
.filter(Release.project == project)
.some_attribute # note
.limit(1)
.one()
.release_name # note
) A reverse situation might also be possible where the first line is a call, too: result = (
Request(with='', some='', data='')
.ditto
.ditto()
.one()
) I'm inclined to think that if somebody really hates wasting lines on the leading attribute accesses, assigning those to a name is the way to go. Oh, one more thing. If the bracket contents fit in one line, they will (like everywhere else in Black): result = (
request.db.query(Release).limit(1).one().release_name
) And to be clear, if the entire thing fits in one line, it will. result = request.db.query(Release).limit(1).one().release_name Summing up, I want Black's implementation of fluent interfaces to be as trivial and consistent as the rest of the formatting it applies. |
I was playing around with it, and I think something like: release = (
request.db
.query(Release)
.filter(Release.project == project)
.order_by(
Release.is_prerelease.nullslast(),
Release._pypi_ordering.desc(),
)
.limit(1)
.one()
) You still end up with an extra level of indentation, but I think that ends up more readable, 🤷♂️ That being said, I find all of these examples more readable than what black does now so any of them are an improvement. I'm also perfectly fine with fitting things onto one line when possible (which is what I generally do already). |
@ambv the only "special" (quite common in practice) case would be leading attribute accesses. preferring a single line for this seems worth it. when mixing calls and attr access, one per line indeed, as you suggest. |
I would also vote for one level indented version
rather than
|
anything but a 4-space indent would be inconsistent with how black works elsewhere, and would be in direct conflict with black's consistency goals. @ambv was very clear about that in an earlier comment:
that said, i do advocate for a sole exception which is not related to indentation, but to line breaking. prefer single lines for attribute accesses at the start of the chain (see my my earlier comment). black's (upcoming) support for ‘fluent apis’ seems only about line breaks (not about indentation), and i think it's reasonable to expect black to acknowledge that both ‘continuation lines are special’ and ‘leading attribute accesses are special’ since that is the predominant (only?) style for fluent apis, e.g. sqlalchemy + web frameworks. the example would then become: release = (
request.db
.query(Release)
.filter(Release.project == project)
.order_by(
Release.is_prerelease.nullslast(),
Release._pypi_ordering.desc(),
)
.limit(1)
.one()
) |
@wbolster consistency is good but I believe that we should consider user experience too, otherwise we can consistently produce code that majority of people don't like and never get into wide adoption in the community. |
black always uses 4 spaces, and that's not considered a ‘user experience’ problem for normal code, so what exactly is the ‘user experience’ problem with 4 space indents for fluent apis, as shown here? |
@wbolster for example I like to see entry level separated from the body because it helps quickly scan the beginning of chain and skip whole body if I'm not interested in it.
vs
Although I agree that in python we have parents and it helps to parse the chain as one sentence for me second version looks more readable but maybe I'm polluted by javascript code style :) |
For whatever it's worth, while I have preferences, I think either option is good enough that I'm happy enough with either. |
what @dstufft said. i suggested both because both are valid choices, and i do not have a strong preference myself. anything except this monstrosity, basically: 😉 x = (
db
.session
.query(...)
.all()
) |
Yea to be clear, I'm fine with that too, I'd probably end up writing my code slightly different, but that's about all. |
FYI when I actually went and implemented it, it turned out that singling out attributes always looks weird. So the new rule is: fluent interfaces only break calls and indexing. Example from the tests: def example(session):
result = (
session.query(models.Customer.id)
.filter(
models.Customer.account_id == account_id,
models.Customer.email == email_address,
)
.order_by(models.Customer.id.asc())
.all()
) I think this is what everybody wanted. Note that the first call is hugged to the attribute but this is necessary for all those cases where you just have one call but multiple attributes along the way. Consider: traceback.TracebackException.from_exception(
exc,
limit=limit,
lookup_lines=lookup_lines,
capture_locals=capture_locals,
# copy the set of _seen exceptions so that duplicates
# shared between sub-exceptions are not omitted
_seen=set(_seen),
) It would look very weird to break up the call from the attributes. |
That looks excellent :) |
thanks @ambv! |
Be cool if black could respect parens when only one dot is present (not minimum of two). I might be missing something in the docs. I respect the non-compromising aspect of the project - I just feel like I'm missing something. |
Imo, one needs to rethink the hugging of the first attribute. For example out of this observations = (
measurements_experiments # original dataframe
.join(measurements_bioreactor, on=["measurement_time"]) # add bioreactor data
.join(setpoints["InducerConcentration"].dropna(), how="outer") # add setpoints
) black makes this ugly thing: observations = measurements_experiments.join( # original dataframe
measurements_bioreactor, on=["measurement_time"]
).join( # add bioreactor data
setpoints["InducerConcentration"].dropna(), how="outer"
) # add setpoints I think it would make sense to special case:
This way, the example from #67 (comment) would become def example(session, models):
result = (
session
.query(models.Customer.id)
.filter(
models.Customer.account_id == account_id,
models.Customer.email == email_address,
)
.order_by(models.Customer.id.asc())
.all()
) Traceback would stay the same: traceback.TracebackException.from_exception(
exc,
limit=limit,
lookup_lines=lookup_lines,
capture_locals=capture_locals,
# copy the set of _seen exceptions so that duplicates
# shared between sub-exceptions are not omitted
_seen=set(_seen),
) |
traceback.TracebackException.from_exception(...) The This can be done by having different rules for attributes without a call [such as |
Hello, I was surprised to find that this code: obj = obj.filter("oooooooooooooooooooooooooooooooooooo").filter("oooooooooooooooooooooooooooooo") is formatted to the awkward: obj = obj.filter("oooooooooooooooooooooooooooooooooooo").filter(
"oooooooooooooooooooooooooooooo"
) while if I add a third chained call: obj = obj.filter("oooooooooooooooooooooooooooooooooooo").filter("oooooooooooooooooooooooooooooo").filter("oooooooooooooooooooooooooooooooooo") black makes it pretty: obj = (
obj.filter("oooooooooooooooooooooooooooooooooooo")
.filter("oooooooooooooooooooooooooooooo")
.filter("oooooooooooooooooooooooooooooooooo")
) I could not find a way to convince black to reformat something like the first snippet to something like the last. What am I missing? |
@eguiraud I guess it's the normal current behaviour. I don't see any occurrence where the result you call "awkward" (because it mixes two diffent styles in the same statement) could be considered the correct answer, but I may missing something. I would treat it as a feature request or a bug. |
A previous comment seems to indicate that one needs at least two dots for black to treat the call chain as a fluent API, although the docs don't mention this. With the example above I was kind of hoping to show that the current behavior when there is only one dot is not ideal. But ok, I should probably open a new issue at this point :) |
Operating system: Mac OS
Python version: 3.6.4
Black version: 18.3a3
Does also happen on master: yes
Black removes trailing backlashes, which can be used, for instance with yapf, to signal "don't join these line".
This is problematic, for instance, if one uses the "fluent interface" idiom, which can lead to long chains of attribute accesses in a given expression.
Here's an example:
The easiest solution would be to keep the line break in the presence of a backslash.
The text was updated successfully, but these errors were encountered: