Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate: what if naz reads 4bytes and they are not part of the header? #135

Closed
komuw opened this issue Jun 8, 2019 · 7 comments · Fixed by #147
Closed

Investigate: what if naz reads 4bytes and they are not part of the header? #135

komuw opened this issue Jun 8, 2019 · 7 comments · Fixed by #147
Labels
bug Something isn't working

Comments

@komuw
Copy link
Owner

komuw commented Jun 8, 2019

naz reads 4 bytes from socket

naz/naz/client.py

Lines 1727 to 1728 in 14fe3be

# https://github.com/komuw/naz/issues/116
command_length_header_data = await self.reader.read(4)

and those 4 bytes are the smpp length field of smpp header(smpp header has four fields; length, commnd_id, status, sequence_number)

naz/naz/client.py

Line 1762 in 14fe3be

total_pdu_length = struct.unpack(">I", command_length_header_data)[0]

But what if naz read 4bytes and they happen to be in the middle somewhere. ie, the four bytes we read are not the length smpp header field but let's say they are part of the body.

Investigate how naz would/should handle that situation.

@komuw
Copy link
Owner Author

komuw commented Jun 8, 2019

this may(or may not) be related. it was picked up when doing some benchmarks:

{'timestamp': '2019-06-08 12:07:57,705', 'event': 'MyRedisQueue.dequeue', 'stage': 'start'}
{'timestamp': '2019-06-08 12:07:57,822', 'event': 'naz.Client.receive_data', 'stage': 'end', 'state': 'unable to read from SMSC', 'error': '[Errno 54] Connection reset by peer', 'project': 'naz_benchmarks', 'smsc_host': 'smsc_host, 'system_id': 'smppclient1', 'client_id': 'HY6CK1YC3LIBOWAQ3', 'pid': 82393}
Traceback (most recent call last):
  File "/Users/home/mystuff/naz/naz/client.py", line 1810, in receive_data
    chunk = await self.reader.read(min(MSGLEN - bytes_recd, 2048))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/streams.py", line 630, in read
    yield from self._wait_for_data("read")
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/streams.py", line 461, in _wait_for_data
    yield from self._waiter
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/selector_events.py", line 714, in _read_ready
    data = self._sock.recv(self.max_size)
ConnectionResetError: [Errno 54] Connection reset by peer
{'timestamp': '2019-06-08 12:07:57,822', 'event': 'naz.Client._parse_response_pdu', 'stage': 'start', 'project': 'naz_benchmarks', 'smsc_host': 'smsc_host, 'system_id': 'smppclient1', 'client_id': 'HY6CK1YC3LIBOWAQ3', 'pid': 82393}
{'timestamp': '2019-06-08 12:07:57,823', 'event': 'naz.cli.main', 'stage': 'end', 'error': 'unpack requires a buffer of 4 bytes'}
Traceback (most recent call last):
  File "/Users/home/mystuff/naz/cli/cli.py", line 88, in main
    async_main(client=client, logger=logger, loop=loop, dry_run=dry_run)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
    return future.result()
  File "/Users/home/mystuff/naz/cli/cli.py", line 115, in async_main
    await tasks
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/coroutines.py", line 129, in throw
    return self.gen.throw(type, value, traceback)
  File "/Users/home/mystuff/naz/naz/client.py", line 1830, in receive_data
    await self._parse_response_pdu(full_pdu_data)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/coroutines.py", line 110, in __next__
    return self.gen.send(None)
  File "/Users/home/mystuff/naz/naz/client.py", line 1852, in _parse_response_pdu
    command_id = struct.unpack(">I", command_id_header_data)[0]
struct.error: unpack requires a buffer of 4 bytes
{'timestamp': '2019-06-08 12:07:57,824', 'event': 'naz.cli.main', 'stage': 'end'}

@komuw komuw added the bug Something isn't working label Jun 8, 2019
@komuw
Copy link
Owner Author

komuw commented Jun 8, 2019

We have added debug log for the full pdu data we get from smsc so if the exception above happens again, we should be able to have some data to help debug issues:

naz/naz/client.py

Lines 1829 to 1837 in 3f0a6e1

full_pdu_data = command_length_header_data + b"".join(chunks)
self._log(
logging.DEBUG,
{
"event": "naz.Client.receive_data",
"stage": "end",
"full_pdu_data": full_pdu_data,
},
)

@komuw
Copy link
Owner Author

komuw commented Jun 8, 2019

import naz
import asyncio

x = naz.Client(
    smsc_host="host",
    smsc_port=2775,
    system_id="smppclient1",
    password="pass",
    loglevel="DEBUG",
    outboundqueue=naz.q.SimpleOutboundQueue(maxsize=1000),
)

pdu = b"\x00\x00\x00\x10\x80\x00\x00\x15\x00\x00\x00\x00\x00\x00\x03f"
loop = asyncio.get_event_loop()

loop.run_until_complete(x._parse_response_pdu(pdu=pdu))

produces:

{
    "timestamp": "2019-06-08 12:34:07,280",
    "event": "naz.Client._parse_response_pdu",
    "stage": "start",
    "smsc_host": "host",
    "system_id": "smppclient1",
    "client_id": "8U6W8OGK4LYDAHS1K",
    "pid": 83257,
}
{
    "timestamp": "2019-06-08 12:34:07,280",
    "event": "naz.Client.command_handlers",
    "stage": "start",
    "smpp_command": "enquire_link_resp",
    "log_id": "",
    "command_status": 0,
    "state": "Success",
    "smsc_host": "host",
    "system_id": "smppclient1",
    "client_id": "8U6W8OGK4LYDAHS1K",
    "pid": 83257,
}
{
    "timestamp": "2019-06-08 12:34:07,280",
    "event": "naz.Client._parse_response_pdu",
    "stage": "end",
    "smpp_command": "enquire_link_resp",
    "log_id": "",
    "command_status": 0,
    "smsc_host": "host",
    "system_id": "smppclient1",
    "client_id": "8U6W8OGK4LYDAHS1K",
    "pid": 83257,
}

however this;

loop.run_until_complete(x._parse_response_pdu(pdu=pdu[:4]))

produces:

{
    "timestamp": "2019-06-08 12:35:13,067",
    "event": "naz.Client._parse_response_pdu",
    "stage": "start",
    "smsc_host": "host",
    "system_id": "smppclient1",
    "client_id": "8U6W8OGK4LYDAHS1K",
    "pid": 83257,
}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
    return future.result()
  File "/Users/home/mystuff/naz/naz/client.py", line 1860, in _parse_response_pdu
    command_id = struct.unpack(">I", command_id_header_data)[0]
struct.error: unpack requires a buffer of 4 bytes

@komuw
Copy link
Owner Author

komuw commented Jun 8, 2019

It happened again and this is what we have

{
    "timestamp": "2019-06-08 12:42:27,957",
    "event": "naz.Client.receive_data",
    "stage": "end",
    "full_pdu_data": b"\x00\x00\x00\x15\x80",
    "project": "naz_benchmarks",
    "smsc_host": "134.209.80.205",
    "system_id": "smppclient1",
    "client_id": "DRNTKRMTCJR97D9OO",
    "pid": 83197,
}

  File "/Users/home/mystuff/naz/naz/client.py", line 1838, in receive_data
    await self._parse_response_pdu(full_pdu_data)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/coroutines.py", line 110, in __next__
    return self.gen.send(None)
  File "/Users/home/mystuff/naz/naz/client.py", line 1860, in _parse_response_pdu
    command_id = struct.unpack(">I", command_id_header_data)[0]
struct.error: unpack requires a buffer of 4 bytes
{'timestamp': '2019-06-08 12:42:27,959', 'event': 'naz.cli.main', 'stage': 'end'}

@komuw
Copy link
Owner Author

komuw commented Jun 12, 2019

we added a mitigation;

naz/naz/client.py

Lines 1869 to 1895 in 321dbdd

except (ConnectionError, asyncio.TimeoutError) as e:
self._log(
logging.ERROR,
{
"event": "naz.Client.receive_data",
"stage": "end",
"state": "unable to read from SMSC",
"error": str(e),
},
)
if self.SHOULD_SHUT_DOWN:
return None
_read_smsc_interval = 62.00
self._log(
logging.DEBUG,
{
"event": "naz.Client.receive_data",
"stage": "end",
"state": "unable to read from SMSC. sleeping for {0:.2f} seconds".format(
_read_smsc_interval
),
"error": str(e),
},
)
await asyncio.sleep(_read_smsc_interval)
continue # important so that we do not hit the bug: issues/135

@komuw
Copy link
Owner Author

komuw commented Jun 19, 2019

With the following diff;

diff --git a/naz/client.py b/naz/client.py
index ad24442..b0dccf1 100644
--- a/naz/client.py
+++ b/naz/client.py
@@ -1891,7 +1891,14 @@ class Client:
                         # make mypy happy; https://github.com/python/mypy/issues/4805
                         assert isinstance(self.reader, asyncio.streams.StreamReader)

-                    chunk = await self.reader.read(min(MSGLEN - bytes_recd, 2048))
+                    chunk = await self.reader.read(min(MSGLEN - bytes_recd, 5))
+                    import time
+                    print("\n\n\t SLEEPING.")
+                    print("MSGLEN: ", MSGLEN)
+                    print("bytes_recd: ", bytes_recd)
+                    print("chunk: ", chunk)
+                    print("\n\n")
+                    time.sleep(25)
                     if chunk == b"":
                         # TODO: maybe we also need todo; `self.writer=None`
                         # so that the `re_establish_conn_bind` mechanism can kick in.
  • start smsc and redis containers
  • start naz-cli
  • after the first print statement, kill the SMSC container

The following log is produced;

{
  'timestamp': '2019-06-19 14:22:10,275', 
  'event': 'naz.Client.receive_data', 
  'stage': 'start', 
  'smsc_host': '127.0.0.1'
}

	 SLEEPING.
MSGLEN:  20
bytes_recd:  0
chunk:  b'\x80\x00\x00\t\x00'

	 SLEEPING.
MSGLEN:  20
bytes_recd:  5
chunk:  b'\x00\x00\x00\x00\x00'


	 SLEEPING.
MSGLEN:  20
bytes_recd:  10
chunk:  b'\x00\x02SMP'


	 SLEEPING.
MSGLEN:  20
bytes_recd:  15
chunk:  b'PSim\x00'

{
  'timestamp': '2019-06-19 14:23:50,325', 
  'event': 'naz.Client.command_handlers', 
  'stage': 'start', 
  'smpp_command': 'bind_transceiver_resp', 
  'log_id': 'l5aklrkdzbwaqs7an', 
  'command_status': 0, 
  'state': 'Success', 
  'smsc_host': '127.0.0.1'
}

naz-cli seems to just work fine.

@komuw
Copy link
Owner Author

komuw commented Jun 21, 2019

I asked on gophers slack performance channel and;

I dont know if there are formal definitions for this sort of thing, 
but if you run go-fuzz on your parser, I imagine you'll find those sorts of potential infinite recursion bugs. 
My best advice is 'bail early and bail often' if input seems off. 
The fuzzer should help find those sorts of inputs too.   
- mdlayher
Fuzz. And close the network connection if you encounter a protocol error. 
They're generally impossible to recover from safely. It's not like parsing code.   
- dgryski

komuw added a commit that referenced this issue Jun 22, 2019
What:
- If `naz` were to encounter an SMPP protocol error, it now bails early by unbinding and closing connection.
  The usual mechanisms of `Client.re_establish_conn_bind` will kick in at a later time and re-connect/re-bind and things will start over afresh
- also fixed a few documentation issues

Why:
- Fixes: #135
komuw added a commit that referenced this issue Aug 31, 2019
What:
- make sire that `naz` reads exacly the first 16bytes of the smpp header  
  this builds on the earlier work; #153 but now `naz` takes it a step further and will unbind & close connection if it is unable to read the entire SMPP header

Why:
-  this is done to prevent inconsistency and also to try and be faithful to the smpp spec.
- updates; #135
- Other[1] smpp clients also do something similar

References:
1. https://github.com/fiorix/go-smpp/blob/6dbf72b9bcea72cea4a035e3a2fc434c4dabaae7/smpp/pdu/header.go#L67-L73
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant