Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

show doesn't have a handler for Uchar.t #174

Open
pmetzger opened this issue Oct 5, 2018 · 25 comments
Open

show doesn't have a handler for Uchar.t #174

pmetzger opened this issue Oct 5, 2018 · 25 comments

Comments

@pmetzger
Copy link

pmetzger commented Oct 5, 2018

show doesn't have a built-in handler for Uchar.t, which makes it messy to try to derive a show function for something containing a Uchar.t.

@pmetzger
Copy link
Author

pmetzger commented Oct 5, 2018

(BTW, what's a reasonable temporary workaround?)

@hcarty
Copy link

hcarty commented Oct 5, 2018

In general, if there's a type without an existing deriver you can create a type alias and define your own deriver. In the case of show you'd need a pretty printer for the type. For example, something very close to the following should work if you fill in an appropriate definition for pp_uc.

type uc = Uchar.t
let pp_uc pp uc = ...
type t = { u : uc } [@@deriving show]

@whitequark
Copy link
Collaborator

Please submit a PR for Uchar.t, this would be very welcome.

@pmetzger
Copy link
Author

pmetzger commented Oct 5, 2018

I can submit a PR for Uchar.t, but we need to decide on a printed representation that we're comfortable with. (That is to say, can someone suggest a color for the bikeshed?)

@whitequark
Copy link
Collaborator

Format.printf "U+%04X" codepoint

@whitequark
Copy link
Collaborator

whitequark commented Oct 5, 2018

Or maybe Uchar.of_char '%c' for printable and Uchar.of_int 0x%04X for non-printable...

@pmetzger
Copy link
Author

pmetzger commented Oct 5, 2018

That latter suggestion isn't far from what @Drup suggested, though figuring out what's printable given only the tools in the stdlib isn't easy, so it might need to default to the latter.

@pmetzger
Copy link
Author

pmetzger commented Oct 5, 2018

U+HHHH seems problematic because it can't be a valid OCaml read syntax ever, at least not without nasty special casing.

@whitequark
Copy link
Collaborator

You can conservatively stick to ASCII. 0x20..0x7E?

@pmetzger
Copy link
Author

pmetzger commented Oct 5, 2018

You can conservatively stick to ASCII. 0x20..0x7E?

Sorry, can you expand on that?

@pmetzger
Copy link
Author

pmetzger commented Oct 5, 2018

@hcarty btw, in your example:

let pp_uc pp uc = ...

I am guessing the uc is the actual Uchar.t, but what's the pp argument?

@whitequark
Copy link
Collaborator

Sorry, can you expand on that?

I mean, convert it to int and check that it's between 0x20 and 0x7E. That should cover all printable ASCII range...

@pmetzger
Copy link
Author

pmetzger commented Oct 5, 2018

My current prototype is:

let pp_uchar f uc =
  let ui = Uchar.to_int uc in
  if ui < 128
  then Format.fprintf f "(Uchar.of_char '%s')" (Char.escaped (Uchar.to_char uc))
  else Format.fprintf f "(Uchar.of_int 0x%04x)" ui

@whitequark
Copy link
Collaborator

You need to skip 0x00 to 0x1F inclusive, too.

@pmetzger
Copy link
Author

pmetzger commented Oct 5, 2018

Why? Tab is better expressed as '\t' etc.

@pmetzger
Copy link
Author

pmetzger commented Oct 5, 2018

(I can see suggesting that things other than \t, \n, \r etc. should be expressed in the "normal" way though.)

@whitequark
Copy link
Collaborator

Oh sorry, I missed Char.escaped. You are right.

@pmetzger
Copy link
Author

pmetzger commented Oct 5, 2018

One suggestion for an OCaml Uchar.t direct syntax that’s evolved on the discord channel:

let pi : Uchar.t = \u'π'

(for direct entry of Unicode chars in source)

and

let alsopi: Uchar.t = \u{3C0}

(for entry of chars by their hex codepoint.)

It’s gross, but finding something less gross seems hard…

@pmetzger
Copy link
Author

pmetzger commented Oct 5, 2018

This convention could be implemented in a printer as:

let pp_uchar f uc =
  let ui = Uchar.to_int uc in
  if (ui > 31 && ui < 127) || (ui = 9) || (ui = 10) || (ui = 13)
  then Format.fprintf f "\\u'%s'" (Char.escaped (Uchar.to_char uc))
  else Format.fprintf f "\\u{%x}" ui

@pmetzger
Copy link
Author

Okay, so I started looking at implementing this in ppx_deriving.show and realized that I really should be proposing implementing it in Printf so it could be standard across OCaml, but for the moment, I'm not sure I entirely understand the code in ppx_deriving_show.cppo.ml in the expr_of_type function. Do I have the right place, though? That seems to be where this would below.

@whitequark
Copy link
Collaborator

Yes, I believe that is the right place to implement support for Uchar.t.

@pmetzger
Copy link
Author

One question is, should I start there, or should I propose something for Printf or even the lexer?

@whitequark
Copy link
Collaborator

Why not both? The stdlib/compiler changes will likely take a lot of time.

@pmetzger
Copy link
Author

I suppose the "why not both" is that I've gotten gunshy about proposing stdlib and compiler changes, but yah, I suppose I should start somewhere.

Is the proposed syntax tasteful enough to you?

@whitequark
Copy link
Collaborator

@pmetzger I like the syntax.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants