Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discussion: extra copying justification #304

Open
xaionaro opened this issue Jan 31, 2020 · 15 comments
Open

Discussion: extra copying justification #304

xaionaro opened this issue Jan 31, 2020 · 15 comments

Comments

@xaionaro
Copy link
Member

xaionaro commented Jan 31, 2020

Hello.

I'm wondering if there's any specific reason of this copyings:

	newBuf := data[:fv.Length]
	fv.buf = make([]byte, fv.Length)
	copy(fv.buf, newBuf)

I've removed copying in my sandbox and everything still works. Also:

BenchmarkFianoUEFIParse-8      26365509 ns/op         730 B/op    10 allocs/op

vs

BenchmarkFianoUEFIParse-8      33305961 ns/op    33555673 B/op    13 allocs/op

So this copyings creates a lot of extra memory consumption for nothing.

Also in my case this copyings creates some obstacles, because I'm trying to calculate offsets by subtracting one pointer from another. (this point is not relevant anymore)

[xaionaro@void fiano]$ go test ./...
ok      github.com/linuxboot/fiano/cmds/fmap    0.097s
ok      github.com/linuxboot/fiano/cmds/fspinfo 0.024s
?       github.com/linuxboot/fiano/cmds/glzma   [no test files]
?       github.com/linuxboot/fiano/cmds/guid2english    [no test files]
?       github.com/linuxboot/fiano/cmds/utk     [no test files]
ok      github.com/linuxboot/fiano/integration  4.587s
ok      github.com/linuxboot/fiano/pkg/compression      6.119s
ok      github.com/linuxboot/fiano/pkg/fmap     0.030s
ok      github.com/linuxboot/fiano/pkg/fsp      0.012s
ok      github.com/linuxboot/fiano/pkg/guid     0.041s
ok      github.com/linuxboot/fiano/pkg/guid2english     0.020s
?       github.com/linuxboot/fiano/pkg/knownguids       [no test files]
ok      github.com/linuxboot/fiano/pkg/uefi     0.004s
?       github.com/linuxboot/fiano/pkg/unicode  [no test files]
?       github.com/linuxboot/fiano/pkg/utk      [no test files]
ok      github.com/linuxboot/fiano/pkg/visitors 6.907s
?       github.com/linuxboot/fiano/scripts/checklicenses        [no test files]
?       github.com/linuxboot/fiano/scripts/namecollect  [no test files]
[xaionaro@void fiano]$ go test ./... -race
ok      github.com/linuxboot/fiano/cmds/fmap    4.040s
ok      github.com/linuxboot/fiano/cmds/fspinfo 1.010s
?       github.com/linuxboot/fiano/cmds/glzma   [no test files]
?       github.com/linuxboot/fiano/cmds/guid2english    [no test files]
?       github.com/linuxboot/fiano/cmds/utk     [no test files]
ok      github.com/linuxboot/fiano/integration  20.109s
ok      github.com/linuxboot/fiano/pkg/compression      51.295s
ok      github.com/linuxboot/fiano/pkg/fmap     1.082s
ok      github.com/linuxboot/fiano/pkg/fsp      1.009s
ok      github.com/linuxboot/fiano/pkg/guid     1.026s
ok      github.com/linuxboot/fiano/pkg/guid2english     1.422s
?       github.com/linuxboot/fiano/pkg/knownguids       [no test files]
ok      github.com/linuxboot/fiano/pkg/uefi     1.030s
?       github.com/linuxboot/fiano/pkg/unicode  [no test files]
?       github.com/linuxboot/fiano/pkg/utk      [no test files]
ok      github.com/linuxboot/fiano/pkg/visitors 106.731s
?       github.com/linuxboot/fiano/scripts/checklicenses        [no test files]
?       github.com/linuxboot/fiano/scripts/namecollect  [no test files]

So the question is would you accept a PR where I just remove this copyings?

@hugelgupf
Copy link
Member

I'm not involved here or anything, but

I'm trying to calculate offsets by subtracting one pointer from another.

caught my attention in email. I would say pointer arithmetic like that is kind of frowned upon in Go. Is it necessary?

xaionaro added a commit to xaionaro-facebook/fiano that referenced this issue Jan 31, 2020
Removed extra copies. It reduces memory consumption,
improves performances and simplifies the result.

ITS: linuxboot#304

Signed-off-by: Dmitrii Okunev <xaionaro@fb.com>
xaionaro added a commit to xaionaro-facebook/fiano that referenced this issue Jan 31, 2020
Removed extra copies. It reduces memory consumption,
improves performances and simplifies the result.

ITS: linuxboot#304

Signed-off-by: Dmitrii Okunev <xaionaro@fb.com>
@xaionaro
Copy link
Member Author

xaionaro commented Jan 31, 2020

I would say pointer arithmetic like that is kind of frowned upon in Go. Is it necessary?

@hugelgupf : Yeah, it's a fair question. But I do so very rarely and in my current case I think (but not sure, yet:) it's justified :)

Anyway the library should try to create less constraints if possible ("require less, return more"). And anyway I was hoping to convince to remove copyings using other arguments: first of all, by reducing memory consumption multiple times.

@xaionaro xaionaro changed the title Extra copying Discussion: extra copying justification Jan 31, 2020
@GanShun
Copy link
Member

GanShun commented Jan 31, 2020

I did the copying in the past because once you start modifying the files underneath a firmwarewarevolume, during the reconstruction I was seeing some corruption because the firmwarevolume buffer is shared with the file buffer. If we really want to avoid this, I would change this so that the parent buffers terminate after their headers, and not contain the child buffers

@GanShun
Copy link
Member

GanShun commented Jan 31, 2020

I would +1 on not trying to calculate offsets by subtracting pointers. Lets discuss what you're trying to achieve and maybe we can find another way to do it

@rminnich
Copy link
Contributor

safety is the critical element of fiano, not speed. If there's even a question erring on the side of safety is the right move.

@xaionaro
Copy link
Member Author

xaionaro commented Jan 31, 2020

I did the copying in the past because once you start modifying the files underneath a firmwarewarevolume, during the reconstruction I was seeing some corruption because the firmwarevolume buffer is shared with the file buffer.

It seems to me a bad solution to just hide changes. If something was changed underneath it should be shown "over-neath" as well.

Is that possible to get an unit-test for the situation you faced then? May be I will find a way to fix the problem without copying. I would be happy to try to do so :)

I would +1 on not trying to calculate offsets by subtracting pointers. Lets discuss what you're trying to achieve and maybe we can find another way to do it

It's a too long and unrelated discussion, IMHO, because it would require to explain details of the program I work on (and discuss some other decisions). And anyway I doubt it's possible to find another elegant way to do what I need. So as I said:

Anyway the library should try to create less constraints if possible ("require less, return more"). And anyway I was hoping to convince to remove copyings using other arguments: first of all, by reducing memory consumption multiple times.

So I suggest to just forget about pointers for a second and make more focus to other arguments :)

safety is the critical element of fiano, not speed. If there's even a question erring on the side of safety is the right move.

As I said, my main argument was to reduce memory consumption. If it is desired to make this tools to be able to used in tiny embedded environments then it's important.

Also an important argument is: the behavior of pkg/-stuff should be predictable. But it appears that if you change a byte in a "File" then in the total firmware image nothing will be changed. Which is subjectively counter-intuitive and I'm not sure that it is documented somewhere. If you walk through a document/object/whatever, find something you wanna change and change it, you usually expect to the document/object/whatever be changed as well. So I believe we just should remove copyings and fix the corruptions. But, of course, it's just my subjective perception :)

@JulienVdG
Copy link
Collaborator

Hello,
Just a few thought:
To my knowledge, the copy is not really a problem, all you have to do is call assemble after you changed a file.
I'm not an expert of the go internal but how would you handle growing the file size ? I'm guessing that if you don't have a copy, growing the slice will overwrite the next file as it will still use the underling array.
Or we can simply ensure the size is not changed and remove/insert the larger file instead.

@JulienVdG
Copy link
Collaborator

I'm not against the idea, just that there is more to think about it... the copy makes it easy to change the content without breaking the structure until we use the assemble visitor to put everything back in place.

Also there are other similar copies in other part of the code than the 3 you address that probably should be addressed in the same issue. (meregion.go, section.go, rawregion.go, nvram.go and flash.go)

@xaionaro
Copy link
Member Author

xaionaro commented Feb 3, 2020

I'm not an expert of the go internal but how would you handle growing the file size ?

I suppose when (and only then) we need to grow, we can perform a copy.

@xaionaro
Copy link
Member Author

xaionaro commented Feb 4, 2020

Also, it seems, there's currently even actually no existing methods to extract offsets with a proper way :)
Related: #164

@JulienVdG
Copy link
Collaborator

Also, it seems, there's currently even actually no existing methods to extract offsets with a proper way :)
Related: #164

nice catch, I had already forgotten about that one, but true solving this before would certainly help

@GanShun
Copy link
Member

GanShun commented Feb 5, 2020

Also an important argument is: the behavior of pkg/-stuff should be predictable. But it appears that if you change a byte in a "File" then in the total firmware image nothing will be changed. Which is subjectively counter-intuitive and I'm not sure that it is documented somewhere. If you walk through a document/object/whatever, find something you wanna change and change it, you usually expect to the document/object/whatever be changed as well. So I believe we just should remove copyings and fix the corruptions. But, of course, it's just my subjective perception :)

I agree that it's counterintuitive that if you don't call assemble, changes in the underlying data structure will not be reflected up to the top. The issue with not copying, and copying when you grow, is actually deeper. If we don't force a reassemble when you both either shrink or grow an element even by a few bytes, using the top level buffer results in an invalid image.

Example:

Suppose the FV contains 3 files and you shrink the first file by 10 bytes:
Due to the way UEFI files are parsed, if you take the entire buffer as is, this means there are now 10 empty bytes after the first file. This results in the header of the second file being treated as invalid, and all subsequent files being treated as invalid. If we wanted to maintain that the child buffers are slices of the parent buffers, every size change to a file would result in the entire image after the file having to be moved each time you make a modification. This also doesn't take into account compressed firmware volumes, because you can't make a change in a compressed section or file, and expect the parent buffer to be correct. You need to reassemble and recompress everything.

I feel like adding support for not having to reassemble after every change would result in a lot of complications. We can get around the copy, if we strictly enforce that the parents buffers be truncated at the end of their headers, so that there's only one slice containing the data at one time, however you still have to call assemble/save after making your modifications, because this would then finally generate the top level binary you can reuse and flash.

@GanShun
Copy link
Member

GanShun commented Feb 5, 2020

Again I would like to ask you, even if it takes a long while to explain, what are you trying to achieve? maybe we can make a custom visitor for you to make these modifications easily without relying on not calling assemble.

@xaionaro
Copy link
Member Author

xaionaro commented Mar 24, 2020

I feel like adding support for not having to reassemble after every change would result in a lot of complications.

Yeah. I guess you right. The copy()-s are justified, indeed.

even if it takes a long while to explain, what are you trying to achieve?

Well. It appears the approach with pointers was wrong. So I was have to use another. But still I need offsets (which is blocked by #164 ). Roughly speaking, my initial task is to perform a smart comparison of two firmwares images (the original one, and the one was extracted from a server) to see what could lead to the boot failure and why could that (image corruption) happen. So when I analyze the original image I find byte ranges which are important to the boot process. And since I try to detect bit flips I would prefer to just compare using the same byte ranges instead of re-parsing the damaged image (if a bit flipped in some "length" of whatever I will receive a wrong report). Anyway thank you for explaining about copy()-s, I believe this task is not relevant anymore :)

@xaionaro
Copy link
Member Author

I have to return to this question. The service I implemented (based on fiano) consumes too much memory due to this copies. May be it worth to consider a special "read-only" mode to avoid this copy()-ing? The mode could be disabled by default, and be enable-able through a global variable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants