
Python Differences

K Lange edited this page Aug 14, 2022 · 3 revisions

Syntax

Some syntax differences are necessary and intentional; others may be resolved eventually.

  • Kuroko uses a let keyword to declare variables with block scoping and does not have a global or nonlocal keyword as these are not necessary.
  • Because Kuroko has block scoping rather than function scoping, some consideration must be given when using block constructs at the "module level". Top-level declarations bind globals, but block constructs introduce new scopes, and names bound within these scopes will not become attributes of the module.
  • Several syntax constructs have implicit declaration. Essentially any syntax that would naturally introduce a new name implicitly declares that name in the current scope. def and class are the most obvious, and declare new names in their enclosing scope, but many other block constructs like for ... in ... and with ... as ... also act as implicit declarations while introducing new scopes.
  • As a consequence of this, iterator destructuring (for x,y,z in ...) uses declaration lists, not "assignment targets". Assignment to an existing variable or to a complex target is not supported. This is considered an unfortunate compromise, and it would be desirable to support non-ambiguous cases such as dot and square bracket accesses for assignment targets in the future. The common use case of for x in y automatically binding x won out.
  • Class bodies, while still technically functions in a data model sense, are not parsed as functions syntactically; they are a special construct that accepts docstrings, member assignments, decorators, and functions. Arbitrary expressions that are not assigned to a class property, and other statements such as conditional blocks, are not allowed in a class body. Much of this is considered a deficiency, though the special treatment of variable assignment was intended to allow Python-style class attributes without the need for let, and actual uses of block constructs in class bodies are exceptionally rare in Python. For compatibility with common Python idioms like assigning methods to alternative names, name bindings for class properties within a class body are treated specially and may be referenced in expressions later in the class body. Supporting block constructs and arbitrary expressions in class bodies is also likely to lead to confusion, as it would suffer from the same issue that top-level block constructs run into: what to do with names in enclosed scopes.
  • The expressions used in default arguments to functions are evaluated at call time as if they were inlined into the function body. This is intentional: a common pitfall of Python is the use of a mutable object like a list in a default argument leading to unexpected behavior. While there are legitimate uses for this behavior, it seems that mistakes are far more common, and if the behavior is intended it can be emulated by introducing a non-local just before the function declaration and referencing it in the default expression. The Python behavior may be supported by a compiler extension in the future.
  • Several async syntax constructs are still not supported.
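
The default-argument difference is easiest to see from the Python side. The following is a minimal Python sketch, not Kuroko code: the function names are hypothetical, and the None-sentinel version only approximates in Python the call-time evaluation that Kuroko performs for every default expression.

```python
def append_shared(item, bucket=[]):
    # Python evaluates `[]` once, when the `def` statement runs,
    # so every call that omits `bucket` shares the same list.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    # Approximates call-time evaluation (the behavior described for Kuroko):
    # the default value is built inside the body on each call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first_shared = append_shared(1)
second_shared = append_shared(2)   # same list object as first_shared

first_fresh = append_fresh(1)
second_fresh = append_fresh(2)     # a new list on each call
```

In Python the two shared calls return one growing list, while the sentinel version returns an independent list each time. Going the other way, the shared behavior can be recovered under call-time evaluation by creating the list just before the function and referring to it in the default expression.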

Data Model

Kuroko was built "from scratch" and was not originally intended to be a Python implementation - it was merely designed to look similar. Over time, Kuroko's data model has "evolved" to better match Python but some differences still exist.

  • __getattribute__: Kuroko has __getattr__ as well as __setattr__, but the "always called" __getattribute__ is absent.
  • __new__: This is supposed to be a static method found on classes, and __init__ is supposed to be called after __new__ produces an object of the right type, but instead __init__ is expected to return the created object (and the compiler specially supports __init__ to do this). There's no good reason for this beyond the fact that it will take a lot of cleanup of builtin types to fix.
  • __del__: Kuroko has had a tracing garbage collector from the start, and __del__ is a tricky thing even in reference-counted CPython. Native types support a cleanup method, which must be a C function, but currently there is no way to run cleanup code for an object in managed code.
  • __ne__: Kuroko's compiler always compiles != to an equivalent instruction sequence of !(... == ...).
  • __slots__ and __dict__: These are internal implementation details of CPython that leaked into the greater specification of "Python" as a language, and Kuroko just doesn't work this way.
  • __delattr__: Haven't gotten around to it yet.
  • __delete__: Haven't gotten around to it yet.
  • __init_subclass__: Haven't gotten around to it yet.
  • __next__: The entire approach to iterators is different enough that it's unclear whether switching to __next__ is worthwhile, and there's a bunch of old Kuroko code that relies on being able to __call__ an iterator object (including several that are just functions).
  • The iterator model: Iterators do not raise a StopIteration exception when they are exhausted (they return themselves and are checked by identity). This always seemed like a rather obtuse way to end iteration, especially when it involves creating a new exception object... and then that exception has to be specially handled. I considered StopIteration being a special sentinel value so it could be returned instead of raised as an exception, but ended up with the current behavior as it was more straightforward to implement.
  • .../Ellipsis: Maybe this will be added eventually? It's an object with no purpose other than to be there for external libraries like NumPy.
  • Primitive values vs. heap objects vs. instances: Kuroko, at its core, is based on Lox, and Lox had primitive values, so Kuroko has primitive values. While this adds some complication, when combined with NaN boxing it does mean Kuroko avoids a weird corner of CPython: small integers aren't a bunch of singletons with weird semantics. It also means an array of small integers or floats exists as an array of small integers or floats, rather than an array of object references off somewhere on the heap, so Kuroko can pull some tricks CPython would need NumPy for. The distinction between "instances" and other heap objects is an odd wart, though, and could do with some cleanup; it is also partly responsible for why __dict__ isn't a thing...
  • Weak references: Would be a nice thing to support...
  • raise can be used with anything, not just things that inherit from BaseException. Kuroko's original exception model was to raise strings, and exception objects came much later. There's little reason to restrict exceptions to a particular type when most of the "plumbing" for what actually ends up in an exception object is implementation-defined anyway.
  • Special methods must be functions: This is an optimization in how special method caching works, requiring special methods to be either native functions or managed functions (and to not be static methods). Might go away. Unclear if the 'object slot' approach is even necessary with a good regular method cache...
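
The iterator convention described above (call the iterator, and treat getting the iterator itself back as the end of iteration) can be emulated in Python for illustration. This is a sketch under assumptions, not Kuroko's implementation: Countdown and drain are hypothetical names, and the loop in drain only mimics what a for loop would do under this protocol.

```python
class Countdown:
    """Hypothetical iterator using the self-as-sentinel convention."""

    def __init__(self, start):
        self.remaining = start

    def __call__(self):
        if self.remaining <= 0:
            # Exhausted: return the iterator itself instead of raising.
            return self
        self.remaining -= 1
        return self.remaining + 1


def drain(iterator):
    """Collect values until the iterator returns itself."""
    values = []
    while True:
        value = iterator()
        if value is iterator:  # identity check marks the end
            break
        values.append(value)
    return values


collected = drain(Countdown(3))
```

No exception object is created or caught at the end of the loop; the cost is that an iterator can never yield itself as a value.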