[RSDK-9132] Add (Get)Image to the camera interface #4487
base: main
Conversation
Warning: your change may break code samples. If your change modifies any of the following functions, please contact @viamrobotics/fleet-management. Thanks!
I'm gonna be out next week, but this is ready for a first-pass review. Also lmk if we should add Nick and/or Dan to this review, perhaps for future reviews that are a bit more invasive?
Since we're breaking go modules by adding another method to the interface, we should make that method follow the inputs and returns of the proto API 1:1.
This will mean bookkeeping the release of each image stream, should you choose to use ReadImage. This will definitely go away once we get the image from the underlying comms connection like we do in the intel realsense and oak-d cameras; I don't know how intense that is.
If changing the returns of GetImage to be 1:1 with the API type causes more problems in changing code outside of camera, we can focus on replacements in the camera packages and let the datamanager and vision packages use Stream and Next for this PR, but I think you're nearly there.
components/camera/client.go (Outdated)
@@ -228,6 +228,10 @@ func (c *client) Stream(
	return stream, nil
}

func (c *client) GetImage(ctx context.Context) (image.Image, func(), error) {
	return c.Read(ctx)
Could we Read and handle release internally, so we do not need to include the release func in the returns? (Perhaps that would make sense in the videosourcewrapper layer.)
As mentioned, this would be a larger departure from Stream.Next, which I assume will cause major problems.
I'm gonna try to get rid of release in camera components and see what happens
Looks like we just need to move the rimage.EncodeImage step one level in, i.e. outBytes should come out of the Image call instead of being handled in the data collector or camera server/client. Currently release is called in the collector/server/client, so we should just move it into the Image implementation. Same logic, just handled more nested in our abstraction.
I guess future module writers should get used to using rimage to encode their output, i.e. read from the source bytes and output a newly constructed []byte result? Does that sound okay to everyone?
> Looks like we just need to move the rimage.EncodeImage step one level in i.e. outBytes should come out of the Image call instead of being handled in the data collector or camera server/client.

In the client, I think it makes sense to return rimage to match the python sdk functionality -- avoiding unnecessary decodes if you just want bytes.

> I guess future module writers should get used to using rimage to encode their output i.e. read from the source bytes and output a newly constructed []byte result? Does that sound okay to everyone?

Would this be similar to viamimage in the python sdk? I think that works pretty well.
Looks like the server is already handling release and encoding for the caller. Are you also suggesting removing the encoding step in the server's GetImage?
This is my WIP in server.go
switch castedCam := cam.(type) {
case ReadImager:
// RSDK-8663: If available, call a method that reads exactly one image. The default
// `ReadImage` implementation will otherwise create a gostream `Stream`, call `Next` and
// `Close` the stream. However, between `Next` and `Close`, the stream may have pulled a
// second image from the underlying camera. This is particularly noticeable on camera
// clients, where a second `GetImage` request can be processed/returned over the
// network just to be discarded.
// RSDK-9132(sean yu): In addition to what Dan said above, ReadImager is important
// for camera components that rely on the `release` functionality provided by gostream's `Read`,
// such as viamrtsp.
// (check that this comment is 100% true before code review, then delete this parenthetical statement)
img, release, err := castedCam.Read(ctx)
if err != nil {
return nil, err
}
defer func() {
if release != nil {
release()
}
}()
actualMIME, _ := utils.CheckLazyMIMEType(req.MimeType)
resp.MimeType = actualMIME
outBytes, err := rimage.EncodeImage(ctx, img, req.MimeType)
if err != nil {
return nil, err
}
resp.Image = outBytes
default:
imgBytes, mimeType, err := cam.Image(ctx, req.MimeType, ext)
if err != nil {
return nil, err
}
actualMIME, _ := utils.CheckLazyMIMEType(mimeType)
resp.MimeType = actualMIME
resp.Image = imgBytes
}
So I think yes, in the default case we don't encode anymore, since the return type is now bytes.
Ok cool, I like that.
I am a little confused on why we still need the ReadImager path here. Shouldn't the camera interface now always have Image defined, so we can just use that?
My thinking here is: since viamrtsp uses VideoReaderFunc and Read's release functionality to keep track of pool frames in the frame allocation optimization flow, the server.go that serves viamrtsp needs to be able to call release().
I think we could refactor viamrtsp though to just copy out the bytes early on return and use the new .Image pathway... I'm down to remove ReadImager entirely, as long as we make it a high priority to refactor viamrtsp to use .Image and []byte.
I see, interesting. I think it makes sense to leave it in for now for viamrtsp compatibility.

> I think we could refactor viamrtsp though to just copy out the bytes on return early and use the new .Image pathway... I'm down to remove ReadImager entirely as long as we make it a high priority to refactor viamrtsp to use .Image and []byte

Yep, the whole point of having a memory manager was the issue with passing a pointer to the avframe in the VideoReaderFunc return, causing segfaults with the encoding in the server.
Since we are removing encoding from the server and relying on the user to handle this, we should eventually be able to use the Image pathway and simplify things quite a bit : )
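The copy-out-early pattern being proposed can be sketched like this (all names hypothetical; this is not viamrtsp code): the implementation copies the pooled frame into a freshly allocated `[]byte` before returning, so the frame pool can be reused without a release callback and without the caller ever holding a pointer into pooled memory:

```go
package main

import (
	"bytes"
	"fmt"
)

// frameSource models a camera whose backing buffer (e.g. an avframe pool) is
// overwritten on every new frame.
type frameSource struct {
	pool []byte // reused backing buffer
}

// Image returns a freshly allocated copy, so the pool can be safely reused
// and no release() bookkeeping is needed downstream.
func (f *frameSource) Image() []byte {
	out := make([]byte, len(f.pool))
	copy(out, f.pool)
	return out
}

func main() {
	src := &frameSource{pool: []byte("frame-1")}
	img := src.Image()
	copy(src.pool, []byte("frame-2")) // pool reuse does not corrupt img
	fmt.Println(string(img), !bytes.Equal(img, src.pool))
}
```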
All our go cameras will need a refactor for the full fix of this interface; I do not anticipate that we will have the best version of this code and go modules after one PR. Eventually we will get rid of Stream and Next completely. The problem is those videosourcewrapper helpers are all over the place in go modules, since they're exported convenience functions.
Always export code conscientiously. But we're breaking this interface anyway, so we'll deal with the resulting breaks.
Okay 👍 will remove ReadImager in this PR
I am wondering if it makes sense to do a full removal of Read from the client & server, ReadImage, and ReadImager here, instead of just aliasing in those cases.

Yes.
Should we replace the camera service GetImage call in RenderFrame with Image?
func (s *serviceServer) RenderFrame(
ctx context.Context,
req *pb.RenderFrameRequest,
) (*httpbody.HttpBody, error) {
ctx, span := trace.StartSpan(ctx, "camera::server::RenderFrame")
defer span.End()
if req.MimeType == "" {
req.MimeType = utils.MimeTypeJPEG // default rendering
}
resp, err := s.GetImage(ctx, (*pb.GetImageRequest)(req))
serviceServers use the gRPC named version, so GetImage is right.
}
resp.Image = outBytes
return &resp, nil
actualMIME, _ := utils.CheckLazyMIMEType(resMetadata.MimeType)
Note: This will change behavior.
Let's say you have a viam-server and a camera hosted in a module, and the camera in the module can only produce PNGs but the user asks for a JPEG.
Previously, the server.go in the module would convert the PNG into a JPEG before sending it to viam-server. Now, with this change, it will return the PNG to the caller.
I think this is still the right change, as implicitly converting between image formats can introduce image quality degradation as well as significantly worse performance (like 15x worse), but I want to confirm:
@bhaney & @randhid, are we confident no customer robots depend on this implicit conversion between image mime types?
The customers I am aware of rely on video storage and upload and rtsp passthrough where they don't really ask for a specific mime type anyway.
Still trying to track down where gostream actually sets a default for both stream types we use.
@@ -28,7 +27,7 @@ type classifierConfig struct {

// classifierSource takes an image from the camera, and overlays labels from the classifier.
type classifierSource struct {
	stream gostream.VideoStream
	src camera.VideoSource
why can't this take a camera?
@@ -19,13 +18,13 @@ import (
// depthToPretty takes a depth image and turns it into a colorful image, with blue being
// farther away, and red being closest. Actual depth information is lost in the transform.
type depthToPretty struct {
	originalStream gostream.VideoStream
	cameraModel *transform.PinholeCameraModel
	src camera.VideoSource
why can't this be a camera?
@@ -23,13 +22,13 @@ type depthEdgesConfig struct {

// depthEdgesSource applies a Canny Edge Detector to the depth map.
type depthEdgesSource struct {
	stream gostream.VideoStream
	src camera.VideoSource
Why can't this be a camera?
@@ -27,7 +26,7 @@ type detectorConfig struct {

// detectorSource takes an image from the camera, and overlays the detections from the detector.
type detectorSource struct {
	stream gostream.VideoStream
	src camera.VideoSource
why not camera?
	originalStream gostream.VideoStream
	stream camera.ImageType
	angle float64
	src camera.VideoSource
why not camera?
	return nil, ImageMetadata{}, err
}
defer release()
imgBytes, err := rimage.EncodeImage(ctx, img, mimeType)
Isn't this going to force us to do an extra Encode, which is expensive? Doesn't VideoSource already implement Image (which has the same signature as ReadImageBytes)?
Why do we need this helper?
This helper is necessary for cameras that used to adhere to the Read/Stream.Next signature to be able to serve Image. The encode here isn't really an extra encode; it would've happened anyway in server.go before being turned into a protobuf message.
robot/client/client_test.go (Outdated)
@@ -512,7 +506,7 @@ func TestStatusClient(t *testing.T) {

camera1, err := camera.FromRobot(client, "camera1")
test.That(t, err, test.ShouldBeNil)
_, _, err = camera.ReadImage(context.Background(), camera1)
_, _, err = camera1.Image(context.Background(), rutils.MimeTypeJPEG, nil)
shouldn't we test that this DOESN'T return JPEG bytes and still doesn't error?
I added a check to make sure that the bytes and metadata are nil and an empty struct, respectively.
robot/impl/local_robot_test.go (Outdated)
@@ -93,7 +93,7 @@ func TestConfig1(t *testing.T) {
c1, err := camera.FromRobot(r, "c1")
test.That(t, err, test.ShouldBeNil)
test.That(t, c1.Name(), test.ShouldResemble, camera.Named("c1"))
pic, _, err := camera.ReadImage(context.Background(), c1)
pic, err := camera.GoImageFromCamera(context.Background(), rutils.MimeTypeJPEG, nil, c1)
why are we specifying jpeg here, when previously the ReadImage implementation didn't?
Had to specify something, because otherwise we hit:
/Users/seanyu/Projects/sean-rdk/robot/impl/local_robot_test.go:98: Expected: nil
Actual: 'could not get image bytes from camera: do not know how to encode ""'
}

// one kvp created with camera.Extra
ext := camera.Extra{"hello": "world"}
ctx = camera.NewContext(ctx, ext)
Leaving this test deleted since we no longer have this helper function
Overview
https://viam.atlassian.net/browse/RSDK-9132
This PR does not remove Stream from the Camera interface or any gostream logic. It just wraps it all in better English that corresponds with the gRPC method GetImage. Mostly just adding syntactic sugar over stream.Next/ReadImage using a new Image method (for existing builtins that use gostream), and modifying the camera server and client appropriately to utilize Image.
My hope is that this encourages people to never be tempted to use Stream in modules, or even VideoReaderFunc. We will eventually remove Stream from the base Camera interface.
It will also provide a significant performance boost to the ReadImage data collector, since we no longer have to transcode the old image.Image output into bytes in collector.go.
Summary of changes
- Add Image to the camera interface
- Implement Image in camera/client.go, camera/server.go, and the data collector
- Use Image in builtins (webcam, replaypcd, ffmpeg, the entire transform pipeline, fake, etc.) instead of Stream wrappers, in preparation for removing Stream entirely from the camera interface
Manual Tests
Did this list of manual tests already twice at different points in development, but should re-test after these code reviews.
- viamrtsp h264 with and without passthrough
Also built the viamrtsp module with a local replacement off this branch, and the stream is working for both rtp passthrough true and false, indicating that videosourcewrapped modules shouldn't be affected by this change.
Profile for potential performance improvement in data collection
If the PDFs are insanely blurry, try downloading and opening with the system viewer.
Link
The big thing that Nick noted as alarming was the staggering memory used to alloc image.NewYCbCr. This is no longer the case on this feature branch.
Config used for pprof
Comparative performance from an operator's perspective
Config used
v0.50.0 branch (RPI4):
Feature branch (RPI4):
v0.50.0 (Orin Nano):
Feature branch (Orin Nano):
RPI4 memory improvement: I noticed that off the feature branch, memory stabilized around 1.9 GB, while on the main branch it stabilized around 2.5 GB.
RPI4 CPU improvement: I learned how to filter for processes in this comparison. It's quite clear that CPU usage is lower, as all figures are lower across the board for the feature branch.
Orin Nano CPU improvement: Though core usage was about the same between main and the feature branch, I noticed that on the latest release, the viam-server processes were always taking the most CPU. Off the feature branch, viamrtsp processes were always on top.
I did not notice memory improvements on the Orin Nano. May be missing something; operator skill issue perhaps.