SAM firmware reverse engineering #64
Oh, very nice work! Regarding debug logs: How are they sent to the host? Standard events? I think it would be nice if we could provide some way to enable them via sysfs and dump them to the kernel logs. |
Regarding terminology: I think SurfLink is the charging/dock connector. Blade (at least on the SPX) is the keyboard connector, which also uses a UART. I don't know the specifics of that on earlier generations like the SP7. Also there's an older talk from Alex Ionescu about the SP4 firmware (including KaOS, IIRC). Might interest you: https://recon.cx/media-archive/2017/mtl/recon2017-mtl-04-alex-ionescu-Fun-with-Sam-Inside-the-Surface-Aggregator-Module.mp4. (Note that the SP4 uses the old HID interface instead of the "new" UART one). |
Not events I think, just regular messages. If you do:
Then you'll start seeing "dropping unexpected command message" errors in dmesg.
"Blades" was MS's term for Surface accessories, but all the news articles about it are from 2013, so I figured it was dead technology. Makes sense that the pins are still there on the keyboard connector, but is the Blade stuff actually used for the keyboard? I thought the keyboard used the KIP or HID messages?
Definitely very interesting! Skimmed it quickly. His description of Kaos seems very similar to what I've seen, but most of the other stuff he mentions sounds different. He says he spent months reverse engineering almost everything, I wonder why he never published anything beyond this talk. |
Ah, I think that's what I meant with "events" (messages sent by the EC that are not a response to a direct previous command). Might have a look at that later today.
Ah, I kind of feared that. But makes sense if they have some tracing infrastructure set up. Might be worth a try checking how that windows kernel driver trace stuff works and see if there are similarities.
So I'm not sure how the SP4 to SP7+ handle the keyboard stuff since the kernel sees that as USB, but on the SPX, SP8, and SP9, the connector thing for the keyboard is some custom serial/UART thing, which I think is what they call the blade interface. That then gets handled/translated into HID messages by SAM, which it presents via the HID interface. I think the KIP subsystem (likely keyboard and integrated peripherals or something like that) is also involved (also that might be some separate processor due to extra firmware I think), so maybe that translates it instead of SAM. So in essence: I'm wondering whether the blade thing is relevant for the gens before that at all... maybe SAM somehow has a USB interface on those gens and the blade interface is still used? Or maybe in some reduced capacity? Otherwise I'd have expected it only on the newer Pros and maybe on the Surface Books.
Yeah, he has the presentation somewhere as PDF but that's unfortunately all I could find. Also a lot has probably changed since the SP4 days. Especially due to the new interface and added components. |
Yeah, normal events. I also had to enable them via
|
So CID=0x48 seems to send some null-terminated strings, but the majority of data seems to be in some binary format. Also my guess is that there's some timestamp and other header stuff before each record. |
Well, Alex also discusses the Blade stuff he found in the old firmware, which includes communication and authentication. And the SP7 firmware also has various Blade related tasks/etc. There's some old info here which suggests the old type cover also uses a serial protocol: http://edwardsh.in/keyboard%20cover/2015/08/13/applying-logic-to-the-surface-touch-cover So my guess would be that all the type covers use the "Blade" protocol, and they've just changed where/how the translation to HID takes place.
Ah, ok. From a quick look at your code I figured events had small request ids whereas the ids in the errors looked more random. I need to have a closer look at the event stuff.
Yeah, it should be similar/identical to the TCL data. I'll see if I can figure out the format. |
I think that makes sense.
On my SB2, the request ID seems to always be 0x0007 (normally request ID should match the target category, so that checks out here). But other values (command ID and especially instance ID) seem all over the place. I assume instance ID is some subsystem ID. Haven't checked on the SPX yet though, so maybe things are a bit different on newer devices. |
And you are absolutely correct with the request IDs on the newer devices... they're all over the place. With the new format it looks like
and I don't have to enable any events (in fact trying to do that will return some error code). So, there's the |
I've added some info about the debug log data format. And I was wrong about it being related to TCL, it doesn't look like the log data ends up in the TCL buffers. There's a fault handler function that seems to fill TCL buffer 1. It's not yet clear to me how the other TCL buffers are filled.
Yeah, AFAICT that just means the message was intended to go to TID 3 = Debug. When you override the log target, it doesn't bother to set the "correct" TID. |
If I haven't messed anything up it's actually a bit weirder than that: For "normal" messages you have e.g.:
But we have |
I think it's possible this could also be interpreted as debug to SAM and SAM to host, but I'm not sure if that would fit into the whole KIP perspective and I'd also have thought that debug messages originate from SAM itself. Any chance you can find out more about the two target ID bytes, especially on how they seem to be used? |
The bytes are just target ID and source ID. |
Ah, got it. That actually makes much more sense, thanks. I'll update the docs accordingly. |
Alright, I've improved the handling for unknown/unsupported TIDs a bit: linux-surface/kernel@32815a5...351805f. Mostly just linux-surface/kernel@f1b2c93, which means that instead of trying to match up the request ID to something that in the best case doesn't exist and in the worst case is a wrong match, it ignores and drops anything that isn't addressed to the host directly. I guess if we want to properly handle the debug messages, we'll have to handle them separately from regular messages anyways, meaning we'll need a |
Looks good! Meanwhile I've managed to flash modified firmware to the SAM. Interestingly there seems to be some code in the firmware update logic to do something with hashes and (maybe) signatures, but it's not actually used for the firmware images MS provides. So all you need to do is extract the firmware image, modify it, update the CRC16 at the end of the file, and then upload it using TC 9 CID 3 and CID 4. So now I've added a command to the firmware to write arbitrary memory. And it works. :) I've published some scripts here: https://github.com/quo/sam-fw-tools |
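For anyone wanting to try the "modify, then fix up the CRC16" step by hand, here is a minimal Python sketch. Big caveat: the thread does not say which CRC16 variant the SAM uses, which bytes it covers, or the byte order of the stored value. The polynomial 0x1021, the initial value 0xFFFF, the "everything except the last two bytes" coverage, and the little-endian storage are all placeholder assumptions, and the function names are made up for illustration. The scripts linked above are the place to check the real parameters.

```python
def crc16(data: bytes, poly: int = 0x1021, init: int = 0xFFFF) -> int:
    """Bitwise, MSB-first CRC16. The default parameters are an ASSUMPTION
    (CRC-16/CCITT-FALSE), not something confirmed for the SAM firmware."""
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def with_updated_crc(image: bytes) -> bytes:
    """Return a copy of `image` with its last two bytes replaced by a fresh CRC.

    ASSUMPTIONS: the CRC covers everything before the last two bytes and is
    stored little-endian. Both need to be verified against the real format.
    """
    if len(image) < 2:
        raise ValueError("image too short to carry a CRC16")
    body = image[:-2]
    return body + crc16(body).to_bytes(2, "little")
```

Usage would be something like `patched = with_updated_crc(modified_image)` before uploading, once the variant has been confirmed.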
Nice work! I'm kind of surprised that it's not signed. Is there any other protection against that or could just any random user with admin permissions on Windows upload some firmware? |
My guess is that anyone who can communicate with the SAM can flash new firmware. You might even be able to do it via the Surflink connection (ie. without being admin, and even when the device is turned off). There are some conditions that seem to relate to whether firmware updates are allowed that I haven't really figured out yet, but that's bypassed when the safe mode is disabled. |
If you manage to do that without any sort of authentication (I'd hope there is some), I think you should ask MS for a sizeable bug bounty xD I honestly have no idea how locked down the driver interface is on Windows, so chances are that that's limited to some kernel stuff and user-space might be blocked somehow... might be worth checking if the Windows API allows arbitrary commands. Some SAM stuff should definitely be available like DTX (on the SB2 and SB3) or I think the TCL stuff. But if an attacker has admin rights, you've probably lost anyways. Anyways... awesome work! |
I actually wonder if the Surflink UART connection from the SAM is directly exposed on the Surflink connector, or if there is another controller in between. Might be fun to try to probe the connector, send some commands to TID 4, and see if anything shows up. I'm not really sure what the worst thing is that you could actually do with modified SAM firmware though, in terms of security. Keylogging and then somehow exfiltrating via synthesized HID events maybe? |
Yeah, HID interface is probably the most dangerous thing. Keylogging, full keyboard and touchpad control... I'd guess you could find ways to use that as a sort of basic rootkit to pull more advanced stuff into the OS (like having it write commands into a terminal or something, use that to disable some protection stuff and download OS-level malware). Doesn't really have to be that dangerous by itself, the problem is you can't really detect it from the OS until it starts to act. |
I'm not sure if the information that I have is correct or how reliable it is, but there is a SAM_DEBUG_RX and a SAM_DEBUG_TX line connected directly from SAM to SurfLink. Now I'm not sure if that is only for pre-production / debug models or if that is also on the final ones. In fact, there even seems to be a debug mux that can multiplex those lines to lines of one of the USB-C ports on the SPX (again, no idea if that's still present on the final models). The USB-C mux apparently also allows access to the blade UART and the SAM-to-SoC/Host TX/RX lines. I would somewhat assume that the USB-C mux has been stripped from the final models but the SurfLink pins might still be present (since I don't think they'd need any additional hardware). |
Finally got around to unpacking the SPX firmware and loading it into Ghidra. Had to specify 0x67c as offset (not 0x66c), but with that, everything seems to work. |
Nice! The offset is 0x67c because the unpack script prepends the 0x10 byte setup data, since this is needed by the upload script. So the file consists of the 0x10 byte setup header + 0x66c bytes of headers (see parse_fw()) + the actual ARM binary. A couple pointers to get you started (assuming the firmware is very similar to the SP7 firmware):
|
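To make the offsets above concrete, here is a small Python sketch that splits an unpacked file into the 0x10-byte setup data, the 0x66c bytes of headers, and the raw ARM image, and then cuts the raw image into the two load segments suggested in the original post (first 0x30000 bytes at 0, the rest at 0x10034000 or 0x100B4000). The function name and return shape are hypothetical, and this has only been thought through for the SP7 layout described in this thread; other devices may differ.

```python
SETUP_LEN = 0x10        # setup data prepended by the unpack script
HEADER_LEN = 0x66C      # image headers preceding the raw ARM binary
SRAM_PART_LEN = 0x30000 # portion of the raw image to load at address 0


def split_unpacked(fw: bytes, flash_base: int = 0x10034000):
    """Split an unpacked firmware file into (setup, headers, segments).

    `segments` is a list of (load_address, data) pairs suitable for loading
    into a disassembler. Use flash_base=0x100B4000 for the other image slot.
    The trailing CRC16 is left in place at the end of the last segment.
    """
    raw_start = SETUP_LEN + HEADER_LEN  # 0x67C
    if len(fw) < raw_start:
        raise ValueError("file is shorter than the expected headers")
    setup = fw[:SETUP_LEN]
    headers = fw[SETUP_LEN:raw_start]
    raw = fw[raw_start:]
    segments = [
        (0x0, raw[:SRAM_PART_LEN]),
        (flash_base, raw[SRAM_PART_LEN:]),
    ]
    return setup, headers, segments
```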
Thanks! That is quite helpful! |
First of all, @qzed thanks for all the work you've already done reverse engineering and documenting a lot of the SAM stuff. It's been quite helpful!
I've been reverse engineering the SP7 SAM firmware (specifically SurfaceSAM_14.312.139.bin) in an attempt to debug something. Haven't found anything particularly useful so far, but figured I would share regardless. Let me know if you'd like me to look at anything in particular.
Firmware structure
As can be seen on SP7 teardown photos, the SAM microcontroller is an NXP LPC54S001J (Cortex-M4, 360KB RAM), with a separate Winbond 16MB flash chip.
The SVD is for a slightly different part number, but it does the job. There's a script to import SVDs into Ghidra, but it fails due to some overlapping address ranges. You can either remove these from the SVD or hack the script a bit.
The bin file consists of a signature and two firmware images. The images are encoded as arrays of { u32 offset, u8 len = 16, u8[16] data }, so can be extracted fairly easily. The two images are identical, except one is meant to be flashed at 0x10004000 and the other at 0x10084000 (standard A/B update handling), so some addresses differ by one bit. The images have a header and end with a CRC16, which are used by the SAM when flashing. The actual raw firmware image starts at 0x66C.
The raw firmware images start with a standard ARM vector table, and contain an NXP image header which tells the NXP bootloader to load the first 0x29484 bytes into SRAM at address 0. For reverse engineering, you can just split the raw image and load the first 0x30000 bytes at 0, and the remaining 0x50000 bytes at 0x10034000 or 0x100B4000.
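As a rough illustration of the record encoding, here is a minimal Python sketch that rebuilds a flat image from a run of { u32 offset, u8 len, u8[16] data } records. It assumes the fields are little-endian and packed back to back with no padding (21 bytes per record), and that `offset` is relative to the start of the image; none of that is spelled out above, so treat it as a guess. The function name is hypothetical, and locating where the record arrays start inside the bin file is left out.

```python
import struct

# u32 offset, u8 len, u8[16] data. Little-endian and unpadded are ASSUMPTIONS.
RECORD = struct.Struct("<IB16s")


def decode_records(blob: bytes) -> bytes:
    """Rebuild a flat image from back-to-back records in `blob`."""
    image = bytearray()
    for pos in range(0, len(blob) - RECORD.size + 1, RECORD.size):
        offset, length, data = RECORD.unpack_from(blob, pos)
        if length > 16:
            raise ValueError(f"record at {pos:#x} has invalid length {length}")
        end = offset + length
        if end > len(image):
            # Pad any gap with 0xFF (erased-flash value, chosen arbitrarily).
            image.extend(b"\xff" * (end - len(image)))
        image[offset:end] = data[:length]
    return bytes(image)
```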
The firmware contains an RTOS which is internally referred to as Kaos. I can't find anything about it on Google, so I assume it was created by MS. Kaos appears somewhat inspired by FreeRTOS and offers the same basic primitives: tasks, timers, events, semaphores, and message queues. There are dozens of tasks and timers, which communicate through dozens of message queues, which leads to a ton of indirection (and memory overhead) so it can be very difficult to follow what's going on. Some parts of the firmware use vtable-like constructions for some additional indirection. Lots of fun.
SAM protocol
(I'll try to use the terminology from https://github.com/linux-surface/surface-aggregator-module/tree/master/doc.)
The following target(/source) IDs are implemented:
I believe the SAM just forwards any messages with TID != 1 to different serial connections. (But I have not yet traced the entire message path.)
For TID == 1, only the following TCs are handled by the SAM on the SP7:
The firmware does not explicitly name the TCs in any way (not even using the abbreviated names), so the above names are based on names of related tasks, message queues, etc. Of course, most of these names were already known.
FWU/TCL/SRQ commands are handled together via the NVM message queue, so presumably they all use the flash storage in some way.
Debug mode
The SAM has a debug mode variable (default 0) and a "safe mode" flag (default 1).
The safe mode flag is read with TC 1, CID 0x27, and written by TC 7, CID 0x5F.
When the safe mode flag is true, all CIDs >= 0x80 are disabled (for all TCs), and various other functionality is disabled.
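The CID gating rule can be written down as a one-line predicate. This is only a Python sketch of the rule as stated here; the constant and function names are made up, and it does not model the "various other functionality" that safe mode also disables.

```python
# (TC, CID) pairs for the safe mode flag, as given in this section.
SAFE_MODE_READ = (0x01, 0x27)
SAFE_MODE_WRITE = (0x07, 0x5F)


def cid_allowed(cid: int, safe_mode: bool = True) -> bool:
    """Return True if a command ID passes the safe mode check.

    With safe mode on (the default), every CID >= 0x80 is rejected,
    regardless of the target category.
    """
    return not (safe_mode and cid >= 0x80)
```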
The debug mode is read with TC 1, CID 0x29, and written by TC 7, CID 0x4E.
The debug mode values are:
The debug mode can only be set to 0 or 2 normally. It is currently not known how to set it to other values.
There is a command to read arbitrary RAM in debug mode 2, which is very useful. However, there appears to be no command to write RAM, even in the higher debug modes. Everything seems to be locked down pretty well (range checks on all command arguments), so I've been unable to find a way to write arbitrary memory so far.
Commands
Here's a complete list of command IDs I've found (for TID 1), and descriptions for some of them. This list will probably contain mistakes! Will try to update this as I figure out more.
Format:
CID { command data } => { response data } description
"Handled separately" means the switch handling these commands is in a separate function, so presumably the commands have related functionality.
(Please excuse the poor formatting.)
TC 01: SAM
TC 02: Battery
Handled separately:
Handled separately:
TC 03: Thermal
Handled separately:
Handled separately:
TC 04: Power
TC 05: Fan
TC 07: Debug
Requires debug mode 2 or 4:
Requires debug mode 4:
When log target is set to Host, SAM will send log messages with TID=3. The request ID for these messages (except CID 49) is set to a hash of internal timers, so is effectively random.
When the amount of data in a log record exceeds 40 bytes, it is split over multiple messages with the same request ID.
The first 8 bytes of the log data are always { u32 timestamp_millis, u32 event_code }. For split records, only the first message will have this header.
There are four types of log record: u32 array, string array, error, and buffer. These use the following CIDs:
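The splitting and header rules for log records can be sketched in Python as follows. This assumes the header fields are little-endian, that messages are seen in arrival order, and that a request ID is not reused by another record within one capture (the IDs are described as effectively random, so collisions are possible in principle). The function name and the (request_id, payload) input shape are hypothetical, and the special case of CID 49 is not handled.

```python
import struct

# u32 timestamp_millis, u32 event_code. Little-endian is an ASSUMPTION.
LOG_HEADER = struct.Struct("<II")


def reassemble_log_records(messages):
    """Group log message payloads by request ID and parse the record header.

    `messages` is an iterable of (request_id, payload_bytes) in arrival order.
    Returns a list of (request_id, timestamp_millis, event_code, body).
    """
    records = {}  # dicts keep insertion order, so records stay in arrival order
    for request_id, payload in messages:
        records.setdefault(request_id, bytearray()).extend(payload)

    parsed = []
    for request_id, data in records.items():
        if len(data) < LOG_HEADER.size:
            continue  # too short to contain the 8-byte header; skip it
        timestamp_millis, event_code = LOG_HEADER.unpack_from(data)
        parsed.append((request_id, timestamp_millis, event_code,
                       bytes(data[LOG_HEADER.size:])))
    return parsed
```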
TC 09: Firmware update
Requires debug mode 3 or 4:
Firmware destinations:
TC 0C: TCL
Valid buffer/instance combinations:
Handled separately (data seems to be all zeroes?):
Handled separately (setters for above):
TC 0D: Surflink
(TODO The command decoding seems different here. Maybe not CIDs?)
TC 10: Surface Blades
Handled separately:
(TODO The command decoding seems different here?)
TC 12: Sensors
Instance ids:
TC 13: SRQ
TC 15: HID
TC 17: Backlight
TC 1B: USB-C
Handled separately: