Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make BuiltinByteString and BuiltInString more opaque #4206

Closed
wants to merge 5 commits into from

Conversation

treeowl
Copy link

@treeowl treeowl commented Nov 11, 2021

Hopefully fixes #4193.

Pre-submit checklist:

  • Branch
    • Tests are provided (if possible)
    • Commit sequence broadly makes sense
    • Key commits have useful messages
    • Relevant tickets are mentioned in commit messages
    • Formatting, materialized Nix files, PNG optimization, etc. are updated
  • PR
    • Self-reviewed the diff
    • Useful pull request description
    • Reviewer requested

@treeowl
Copy link
Author

treeowl commented Nov 11, 2021

@michaelpj, let me know what you think.

@treeowl
Copy link
Author

treeowl commented Nov 11, 2021

@phadej, your review would be appreciated too, of course. And I am rather curious what type of owl has possessed you.

@treeowl
Copy link
Author

treeowl commented Nov 11, 2021

Note that in this representation, a BuiltinByteString in memory looks exactly the same as a ByteString; it just won't ever be unpacked. In theory, GHC could break this, by treating BuiltinByteString as a product type, but it doesn't do so at present. If that ever changes, we can work around it by making the Void field lazy. But that will be a bit more verbose to deal with and perhaps increase (off-chain) code size, so I think we should avoid that unless and until it's necessary.

`GHC.Exts.lazy`, like `GHC.Exts.noinline`, is a magical function
with no unfolding. GHC erases it in Core Prep, which happens after
we interrupt GHC's compilation process, so we have to erase it
ourselves. We also need to `NOINLINE` the definition of `error`,
because that's what the Plutus compiler is looking for.
@treeowl
Copy link
Author

treeowl commented Nov 12, 2021

Some tests fail because compilation has changed. How do I determine if these changes are acceptable?

@phadej
Copy link

phadej commented Nov 12, 2021

Note that in this representation, a BuiltinByteString in memory looks exactly the same as a ByteString;

Is it? I think there is an additional word for a header of wrapper data (somewhere the constructor tag has to go). Not bad, but not exactly the same. (EDIT: If that word matters, maybe ShortByteString should be used in the first place).

@treeowl
Copy link
Author

treeowl commented Nov 12, 2021

@phadej, no, a heap object (e.g., any lifted data) always has a header word to point to the info table and entry code. This will be used by the mutator for lifted data any time the pointer to the object hasn't been tagged (or when there are too many constructors and the tag isn't sufficient to identify which). It's also used by the garbage collector every time to get the object size, pointer layout, etc.

@treeowl
Copy link
Author

treeowl commented Nov 12, 2021

@michaelpj , your thoughts on the test deviations would be appreciated. I'm also curious how I can run those particular tests locally.

@treeowl
Copy link
Author

treeowl commented Nov 12, 2021

@michaelpj, a lot of the compilation changes seem to be removal of redundant eta expansion. I'm guessing this has to do with the fact that GHC can now see the arities of the builtin functions. What's a bit odd is that strict bindings are then replaced by nonstrict ones. Guess: the Plutus compiler is assuming that things like addInteger might be thunks that shouldn't be evaluated. This is wrong. And the Plutus compiler should be able to see that it's wrong, by looking at the function arities calculated by GHC. Thunks can only ever have arity 0, so nothing with a positive arity is a thunk.

trickier part is getting the `Rep` right. The `Rep` includes metadata about
strictness, unpacking, constructor names, record selectors, etc. Some aspects
are difficult or impossible to get right manually, particularly the exact
package name and the "decided strictness". So instead of building the `Rep`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can get the "package name" (actually the unit-id) with CURRENT_PACKAGE_KEY CPP macro.
OTOH using the fake type with generic trick is quite clever :)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phadej, using a fake type is the only way I know to be sure the DecidedStrictness is right no matter how the module is compiled, and I think it's easier to see that the construction is correct than it would be to do it all by hand anyway. Copying the module name from the fake type makes it robust against module renaming too. For the sake of my curiosity, what is a unit-id?

Copy link

@phadej phadej Nov 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is a unit-id

It's a difficult question to answer exhaustively, but a short version is:

unit-ids are the identifiers libraries (or more generally units) are registered with in package db (or more generally identified by GHC). Package id (which is package version + package number) is metadata used by humans. For ghc boot libraries unit-ids and package-ids are the same, for cabal installed packages (in store) they are different:

For example:

% head -n 10 /cabal/store/ghc-9.0.1/package.db/aeson-2.0.2.0-ceb356429ca3538b1da748cbb4d8187275f311ecac1e7bd3d68bd0e65c65d68a.conf
name:                 aeson
version:              2.0.2.0
visibility:           public
id:
    aeson-2.0.2.0-ceb356429ca3538b1da748cbb4d8187275f311ecac1e7bd3d68bd0e65c65d68a

key:
    aeson-2.0.2.0-ceb356429ca3538b1da748cbb4d8187275f311ecac1e7bd3d68bd0e65c65d68a

license:              BSD-3-Clause

(key was the predecessor of unit-ids before Backpack, IIRC, now there for compatibility reasons).

Unit-ids may be arbitrary strings (restricted alphabet), they don't need to start with package-id, it's there to help debugging.

@treeowl
Copy link
Author

treeowl commented Nov 12, 2021

Hmm... Looking at the compiler code, it seems there already is logic for determining whether applications of builtins are saturated, so I don't know why this is happening. Hopefully someone who knows more about the Plutus compiler can explain.

@kk-hainq
Copy link
Contributor

@treeowl I have no idea what is going on but it looks very interesting! I'll see if I could find anything on the interesting effects here.

Also, you can run local tests with cd plutus-tx-plugin && cabal test. You can also nuke all golden files find . -type f -name '*.golden' -delete, to re-generate with the new compiler changes in the next cabal test run. I wonder if committing the changes to this PR makes it easier for others to add inline comments?

@michaelpj
Copy link
Contributor

Sorry, we had a workshop all last week. I'll try to look at this next week.

@L-as
Copy link
Contributor

L-as commented Jan 19, 2022

Any chance this could be merged?

@michaelpj
Copy link
Contributor

Sorry I have been procrastinating on this one. Part of the issue is that I don't really understand the problem this is trying to solve, nor the solution. I would at least like to see a test or two that behave badly before this PR and better afterwards.

@anton-k
Copy link
Contributor

anton-k commented Jan 20, 2022

Sorry I have been procrastinating on this one. Part of the issue is that I don't really understand the problem this is trying to solve, nor the solution. I would at least like to see a test or two that behave badly before this PR and better afterwards.

Typical case is to use BuiltinByteString operations with bang patterns. Issue is that we can not use bangs with ByteStrings in plutus code. Bangs are used in plutus to force let-expressions. And when we append two BS (for example we construct token name from Int and BS or two BSs), plutus can not compile the code. So BS computations of this kind are repeated if they are used in let-expressions.

It often bites you when you try to create custom token names and assemble them from parts:

let !tok = computeFooToken (...) (...) 
    !a = expr1 tok 
    !b = expr2 tok
in  expr3 a b

@phadej
Copy link

phadej commented Feb 7, 2022

An example test is

From b33dfbfc0993c3f0d78a7b7843e84185c1b90e9e Mon Sep 17 00:00:00 2001
From: Oleg Grenrus <oleg@well-typed.com>
Date: Mon, 7 Feb 2022 16:19:03 +0200
Subject: [PATCH] Strict example

---
 plutus-tx-plugin/plutus-tx-plugin.cabal       |  1 +
 plutus-tx-plugin/test/Plugin/Spec.hs          |  2 +
 plutus-tx-plugin/test/Plugin/Strict/Spec.hs   | 41 +++++++++++++++++++
 .../test/Plugin/Strict/strictAdd.plc.golden   |  3 ++
 .../Plugin/Strict/strictAppend.plc.golden     |  0
 5 files changed, 47 insertions(+)
 create mode 100644 plutus-tx-plugin/test/Plugin/Strict/Spec.hs
 create mode 100644 plutus-tx-plugin/test/Plugin/Strict/strictAdd.plc.golden
 create mode 100644 plutus-tx-plugin/test/Plugin/Strict/strictAppend.plc.golden

diff --git a/plutus-tx-plugin/plutus-tx-plugin.cabal b/plutus-tx-plugin/plutus-tx-plugin.cabal
index 9b26d454e..7581b145d 100644
--- a/plutus-tx-plugin/plutus-tx-plugin.cabal
+++ b/plutus-tx-plugin/plutus-tx-plugin.cabal
@@ -101,6 +101,7 @@ test-suite plutus-tx-tests
         Plugin.Typeclasses.Spec
         Plugin.Typeclasses.Lib
         Plugin.Coverage.Spec
+        Plugin.Strict.Spec
         Plugin.Lib
         StdLib.Spec
         TH.Spec
diff --git a/plutus-tx-plugin/test/Plugin/Spec.hs b/plutus-tx-plugin/test/Plugin/Spec.hs
index 3e01d0976..c8b58e2f1 100644
--- a/plutus-tx-plugin/test/Plugin/Spec.hs
+++ b/plutus-tx-plugin/test/Plugin/Spec.hs
@@ -12,6 +12,7 @@ import Plugin.NoTrace.Spec
 import Plugin.Primitives.Spec
 import Plugin.Profiling.Spec
 import Plugin.Typeclasses.Spec
+import Plugin.Strict.Spec
 
 tests :: TestNested
 tests = testNested "Plugin" [
@@ -25,4 +26,5 @@ tests = testNested "Plugin" [
   , typeclasses
   , profiling
   , coverage
+  , strict
   ]
diff --git a/plutus-tx-plugin/test/Plugin/Strict/Spec.hs b/plutus-tx-plugin/test/Plugin/Strict/Spec.hs
new file mode 100644
index 000000000..fefebf000
--- /dev/null
+++ b/plutus-tx-plugin/test/Plugin/Strict/Spec.hs
@@ -0,0 +1,41 @@
+{-# LANGUAGE BangPatterns        #-}
+{-# LANGUAGE DataKinds           #-}
+{-# LANGUAGE OverloadedStrings   #-}
+{-# LANGUAGE ScopedTypeVariables #-}
+{-# LANGUAGE TypeApplications    #-}
+{-# OPTIONS_GHC -ddump-simpl -dsuppress-all #-}
+{-# OPTIONS_GHC -fplugin PlutusTx.Plugin #-}
+{-# OPTIONS_GHC -fplugin-opt PlutusTx.Plugin:defer-errors #-}
+{-# OPTIONS_GHC -fplugin-opt PlutusTx.Plugin:debug-context #-}
+
+module Plugin.Strict.Spec (strict, strictAddExample, strictAppendExample) where
+
+import Test.Tasty.Extras
+
+import PlutusCore.Test
+import PlutusTx.Builtins qualified as Builtins
+import PlutusTx.Code
+import PlutusTx.Lift
+import PlutusTx.Plugin
+import PlutusTx.Prelude qualified as P
+import PlutusTx.Test
+
+import Data.Proxy
+
+strict :: TestNested
+strict = testNested "Strict" [
+    goldenPir "strictAdd" strictAdd
+  , goldenPir "strictAppend" strictAppend
+  ]
+
+strictAdd :: CompiledCode (Integer -> Integer -> Integer)
+strictAdd = plc (Proxy @"strictLet") strictAddExample
+
+strictAddExample :: Integer -> Integer -> Integer
+strictAddExample !x !y = Builtins.addInteger x y
+
+strictAppend :: CompiledCode (P.BuiltinByteString -> P.BuiltinByteString -> P.BuiltinByteString)
+strictAppend = plc (Proxy @"strictLet") strictAppendExample
+
+strictAppendExample :: P.BuiltinByteString -> P.BuiltinByteString -> P.BuiltinByteString
+strictAppendExample !x !y = Builtins.appendByteString x y
diff --git a/plutus-tx-plugin/test/Plugin/Strict/strictAdd.plc.golden b/plutus-tx-plugin/test/Plugin/Strict/strictAdd.plc.golden
new file mode 100644
index 000000000..8f2b6246b
--- /dev/null
+++ b/plutus-tx-plugin/test/Plugin/Strict/strictAdd.plc.golden
@@ -0,0 +1,3 @@
+(program
+  (lam x (con integer) (lam y (con integer) [ [ (builtin addInteger) x ] y ]))
+)
\ No newline at end of file
diff --git a/plutus-tx-plugin/test/Plugin/Strict/strictAppend.plc.golden b/plutus-tx-plugin/test/Plugin/Strict/strictAppend.plc.golden
new file mode 100644
index 000000000..e69de29bb
-- 
2.34.1

In current master it fails with

      strictAppend:            FAIL
        Exception: Error: Unsupported feature: Type constructor: Addr#
        Context: Compiling type: Addr#
        Context: Compiling data constructor type: PS
        Context: Compiling type: ByteString
        Context: Compiling expr: case x `cast` <Co:1> of nt_snNO { PS _ _ _ _ ->
                                 case y `cast` <Co:1> of nt_snNW { PS _ _ _ _ ->
                                 appendByteString (nt_snNO `cast` <Co:2>) (nt_snNW `cast` <Co:2>)
                                 }
                                 }
        Context: Compiling expr: case x `cast` <Co:1> of nt_snNO { PS _ _ _ _ ->
                                 case y `cast` <Co:1> of nt_snNW { PS _ _ _ _ ->
                                 appendByteString (nt_snNO `cast` <Co:2>) (nt_snNW `cast` <Co:2>)
                                 }
                                 }
        Context: Compiling expr: \ y ->
                                   case x `cast` <Co:1> of nt_snNO { PS _ _ _ _ ->
                                   case y `cast` <Co:1> of nt_snNW { PS _ _ _ _ ->
                                   appendByteString (nt_snNO `cast` <Co:2>) (nt_snNW `cast` <Co:2>)
                                   }
                                   }
        Context: Compiling expr: \ x y ->
                                   case x `cast` <Co:1> of nt_snNO { PS _ _ _ _ ->
                                   case y `cast` <Co:1> of nt_snNW { PS _ _ _ _ ->
                                   appendByteString (nt_snNO `cast` <Co:2>) (nt_snNW `cast` <Co:2>)
                                   }
                                   }
        Context: Compiling definition of: strictAppendExample
        Context: Compiling expr: strictAppendExample
        Context: Compiling expr at: test/Plugin/Strict/Spec.hs:38:41-59
        Context: Compiling expr: strictAppendExample
        Context: Compiling expr at "strictLet"
        Use -p '/Strict/&&/strictAppend/' to rerun this test only.

@phadej phadej mentioned this pull request Feb 7, 2022
@michaelpj michaelpj closed this Feb 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BuiltinByteString isn't opaque enough
6 participants