Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serde is potentially simplifying information twice, creating a bad nested structure #2424

Closed
cereallarceny opened this issue Aug 1, 2019 · 7 comments
Labels
Good first issue 🎓 Perfect for beginners, welcome to OpenMined!

Comments

@cereallarceny
Copy link
Member

Describe the bug

I'll attempt to explain this one as best as I can. Forgive me, there may be a few places where I'm not entirely familiar with the Python language or Serde/PySyft.

I've noticed while writing the Serde parser in Javascript a few strange places where Serde in PySyft appears to unnecessarily nest information. I think it would be in the best interest if PySyft cleaned up the Serde simplify functions to get a cleaner, shorter, more consistent exported value. I know that myself and @mccorby would greatly benefit from some de-duplication in the data structure being produced, and it would cause information being sent over the wire to be significantly smaller.

I want to be clear: I'm not familiar with all of the places where Serde is duplicating information... this issue is merely my initial understanding of where duplication may be taking place. Grain of salt. 😉

To Reproduce

import torch as th
import syft as sy
sy.create_sandbox(globals(), download_data=False, verbose=False)
@sy.func2plan(verbose=False)
def plan_double_abs(x):
    x = x + x
    x = th.abs(x)
    return x

x = th.tensor([1,2,3]).send(bob)

sy.serde._simplify(x.child)
sy.local_worker.verbose=False
plan_double_abs.build(th.tensor([1., -2.]))

sy.serde._simplify(plan_double_abs)

The above will produce the following output:

(18,
 (((6,
    (1,
     (6,
      ((6,
        ((5, (b'__add__',)),
         (19,
          (23885703668, 30300883787, 85156589176, None, (13, (2,)), False)),
         (6,
          ((19,
            (23885703668,
             30300883787,
             85156589176,
             None,
             (13, (2,)),
             False)),)),
         (0, ()))),
       (6, (53361601662,)))))),
   (6,
    (1,
     (6,
      ((6,
        ((5, (b'torch.abs',)),
         None,
         (6,
          ((19, (50671613206, 53361601662, 85156589176, None, None, True)),)),
         (0, ()))),
       (6, (68554228008,)))))),
   (6, (9, 53361601662))),
  85156589176,
  (1, (30300883787,)),
  (6, (68554228008,)),
  (5, (b'plan_double_abs',)),
  None,
  None,
  True))

To reproduce this exact object in Javascript I must write the following code:

const firstPointerTensor = new PointerTensor(
  23885703668,
  30300883787,
  85156589176,
  null,
  new TorchSize([2]),
  false
);

const secondPointerTensor = new PointerTensor(
  50671613206,
  53361601662,
  85156589176,
  null,
  null,
  true
);

const operations = [
  Tuple(
    1,
    Tuple(
      Tuple(
        '__add__',
        firstPointerTensor,
        Tuple(firstPointerTensor),
        new Map()
      ),
      Tuple(53361601662)
    )
  ),
  Tuple(
    1,
    Tuple(
      Tuple('torch.abs', null, Tuple(secondPointerTensor), new Map()),
      Tuple(68554228008)
    )
  ),
  Tuple(9, 53361601662)
];

const plan = new Plan(
  operations,
  85156589176,
  [30300883787],
  Tuple(68554228008),
  'plan_double_abs',
  null,
  null,
  true
);

Expected behavior

You'll see in the above Javascript code that the majority of the duplication comes in the operations variable.

  1. We have multiple nested Tuples, most of which I feel could probably be eliminated.
  2. We also see in the first operation two references to the same PointerTensor: firstPointerTensor, the second of which is contained within a Tuple for some reason. The second operation, for whatever reason, does not have the duplicate reference.
  3. Each PointerTensor has an empty Map (a Dictionary in Python) as the final argument. This may serve a purpose, but I'm not aware of one.

These are the duplications that I've been able to suss out. Either way, this issue, combined with my other two issues on Serde referenced below, are crucial to having a common language between Python and Javascript (and Kotlin and other languages that will be ported in the future).

Desktop

  • OS: MacOS
  • Version 10.14.6

Additional context

I've written about the addition of unnecessary commas in another issue (#2423) as well as the use of parens over square brackets in another issue (#2422)

@kakirastern
Copy link
Contributor

Hi, I would like to work on this issue to try and resolve it

@cereallarceny
Copy link
Member Author

Go for it @kakirastern !

@kakirastern
Copy link
Contributor

Yeah, I see the same phenomenon too, with my output of the Python version being:

(19,
 (((25,
    (1,
     (6,
      ((6,
        ((5, (b'__add__',)),
         (20,
          (97982647266, 93721612411, 94566476898, None, (13, (2,)), False)),
         (6,
          ((20,
            (97982647266,
             93721612411,
             94566476898,
             None,
             (13, (2,)),
             False)),)),
         (0, ()))),
       (6, (30225810401,)))))),
   (25,
    (1,
     (6,
      ((6,
        ((5, (b'torch.abs',)),
         None,
         (6,
          ((20, (83640269041, 30225810401, 94566476898, None, None, True)),)),
         (0, ()))),
       (6, (8729013785,)))))),
   (25, (9, 30225810401))),
  94566476898,
  (1, (93721612411,)),
  (6, (8729013785,)),
  (5, (b'plan_double_abs',)),
  None,
  None,
  True))

Am actively checking the code base to see where things went wrong...

@kakirastern
Copy link
Contributor

Setting sy.local_worker.verbose=False yields the following additional info...

worker <VirtualWorker id:me #objects:0> sending 2 tensor([1, 2, 3]) to <VirtualWorker id:bob #objects:1>
worker <VirtualWorker id:me #objects:0> sending 9 31422344287 to <VirtualWorker id:bob #objects:2>

And if I turn the verbose option back on by changing to sy.local_worker.verbose=True, I see what I would expect to see...

worker <VirtualWorker id:me #objects:0> sending 2 tensor([1, 2, 3]) to <VirtualWorker id:bob #objects:1>
worker <VirtualWorker id:me #objects:0> sending 9 82966780001 to <VirtualWorker id:bob #objects:2>
worker <VirtualWorker id:me #objects:0> sending 2 tensor([ 1., -2.]) to <Plan plan_double_abs id:69601957138 owner:me>
worker <VirtualWorker id:me #objects:0> sending 1 (('__add__', [PointerTensor | me:64383816141 -> 69601957138:44578402202], ([PointerTensor | me:64383816141 -> 69601957138:44578402202],), {}), (57269619476,)) to <Plan plan_double_abs id:69601957138 owner:me>
worker <VirtualWorker id:me #objects:0> sending 1 (('torch.abs', None, ([PointerTensor | me:61746159920 -> 69601957138:57269619476],), {}), (551558933,)) to <Plan plan_double_abs id:69601957138 owner:me>
worker <VirtualWorker id:me #objects:0> sending 9 57269619476 to <Plan plan_double_abs id:69601957138 owner:me>

where a few lines correspond with the some of the extra output which may not be needed and could cause some overheads issues during transmission. So from the context of the output as cited previously #2424 (comment) probably will need to add some truly global verbose=False feature to serde to tidy up the output a bit...

@cereallarceny
Copy link
Member Author

That sounds good to me @kakirastern. Feel free to submit a PR when you've got something. @iamtrask - any thoughts here?

@kakirastern
Copy link
Contributor

Let me not wait on @iamtrask for feedback and start look into it deeper. Sorry about the delay.

@kakirastern
Copy link
Contributor

It is indeed a bug I have come to realise, since in the line https://github.com/OpenMined/PySyft/blob/dev/syft/serde/serde.py#L289 one is supposed to be able to turn off details for the deserialise function in order to hide all the gory details in the deserialised object, but that is not the case at the moment.

If I attempt to set the details attribute to False the following error would be thrown:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-8192b19c7fd5> in <module>
      5     return x
      6 
----> 7 x = th.tensor([1,2,3]).send(bob)
      8 
      9 sy.serde._simplify(x.child)

~/PySyft/syft/frameworks/torch/tensors/interpreters/native.py in send(self, inplace, local_autograd, preinitialize_grad, no_wrap, garbage_collect_data, *location)
    394                 local_autograd=local_autograd,
    395                 preinitialize_grad=preinitialize_grad,
--> 396                 garbage_collect_data=garbage_collect_data,
    397             )
    398 

~/PySyft/syft/workers/base.py in send(self, obj, workers, ptr_id, garbage_collect_data, **kwargs)
    354 
    355         # Send the object
--> 356         self.send_obj(obj, worker)
    357 
    358         return pointer

~/PySyft/syft/workers/base.py in send_obj(self, obj, location)
    532                 receive the object.
    533         """
--> 534         return self.send_msg(messaging.ObjectMessage(obj), location)
    535 
    536     def request_obj(self, obj_id: Union[str, int], location: "BaseWorker") -> object:

~/PySyft/syft/workers/base.py in send_msg(self, message, location)
    236 
    237         # Step 2: send the message and wait for a response
--> 238         bin_response = self._send_msg(bin_message, location)
    239 
    240         # Step 3: deserialize the response

~/PySyft/syft/workers/virtual.py in _send_msg(self, message, location)
      8 class VirtualWorker(BaseWorker, FederatedClient):
      9     def _send_msg(self, message: bin, location: BaseWorker) -> bin:
---> 10         return location._recv_msg(message)
     11 
     12     def _recv_msg(self, message: bin) -> bin:

~/PySyft/syft/workers/virtual.py in _recv_msg(self, message)
     11 
     12     def _recv_msg(self, message: bin) -> bin:
---> 13         return self.recv_msg(message)

~/PySyft/syft/workers/base.py in recv_msg(self, bin_message)
    265         msg = sy.serde.deserialize(bin_message, worker=self)
    266 
--> 267         (msg_type, contents) = (msg.msg_type, msg.contents)
    268 
    269         if self.verbose:

AttributeError: 'tuple' object has no attribute 'msg_type'

So I guess I will need to work on perfecting the verbose option for the worker(s) say the local_worker at least.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Good first issue 🎓 Perfect for beginners, welcome to OpenMined!
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants