Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RSDK-8719: - use maintenance config in RDK #4396

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
6614ddf
first pass
Kschappacher Sep 25, 2024
984b765
clean up errors
Kschappacher Sep 25, 2024
43e8cd9
lint
Kschappacher Sep 25, 2024
a2b0b48
fix tests and lint
Kschappacher Sep 27, 2024
8964051
lint
Kschappacher Sep 27, 2024
33aec5b
Merge remote-tracking branch 'upstream/main' into RSDK-8167-enable-ma…
Kschappacher Sep 27, 2024
ad3eff8
add timeout and proto conversions
Kschappacher Oct 9, 2024
3ac4051
fix timeout
Kschappacher Oct 9, 2024
7660bba
add remote testings
Kschappacher Oct 10, 2024
b20cfda
clean up code
Kschappacher Oct 11, 2024
c3b2410
Merge remote-tracking branch 'upstream/main' into RSDK-8167-enable-ma…
Kschappacher Oct 11, 2024
b11af36
lint and clean up code
Kschappacher Oct 11, 2024
20aae80
Merge remote-tracking branch 'upstream/main' into RSDK-8167-enable-ma…
Kschappacher Oct 11, 2024
10ebeeb
address pr comments
Kschappacher Oct 11, 2024
7653ecc
Merge remote-tracking branch 'upstream/main' into RSDK-8167-enable-ma…
Kschappacher Oct 14, 2024
acd77e0
address pr comments, move config revision to after reconfig blocker
Kschappacher Oct 14, 2024
f4effb4
update tests
Kschappacher Oct 14, 2024
7ae8e32
add nil check
Kschappacher Oct 14, 2024
98f69d3
Update robot/impl/local_robot.go
Kschappacher Oct 14, 2024
c7f5cfc
Update robot/impl/local_robot.go
Kschappacher Oct 14, 2024
443ca6d
Update robot/impl/local_robot.go
Kschappacher Oct 14, 2024
62e51aa
remove dup
Kschappacher Oct 14, 2024
15bcb78
Merge branch 'RSDK-8167-enable-maintenance-reconfige-sensor' of ssh:/…
Kschappacher Oct 14, 2024
f8a58d9
add test cases and ensure readings work after going to proto and back
Kschappacher Oct 14, 2024
36a788b
clean up dif
Kschappacher Oct 14, 2024
b70a0a7
fix log names
Kschappacher Oct 15, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
64 changes: 64 additions & 0 deletions robot/impl/local_robot.go
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ import (
"google.golang.org/grpc/status"

"go.viam.com/rdk/cloud"
"go.viam.com/rdk/components/sensor"
"go.viam.com/rdk/config"
icloud "go.viam.com/rdk/internal/cloud"
"go.viam.com/rdk/logging"
Expand Down Expand Up @@ -1174,6 +1175,44 @@ func (r *localRobot) applyLocalModuleVersions(cfg *config.Config) {
}

func (r *localRobot) reconfigure(ctx context.Context, newConfig *config.Config, forceSync bool) {
// Maintenance config can be configured to block reconfigure based off of a sensor reading
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for encoding rules here. Some questions:

  • Where is the logic for the priority described below?
  • Will this documentation be exposed to users in any other way besides this comment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. The logic is spread out a bit. Some is controlled by how we start up resources when there are conflicts. The other parts of the logic depends on how sensors are returned from ResourceFromRobot when there are conflicts
  2. This is a good point. I think we should expose this logic to users. I think the frontend component would be a good place to share this info

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I buy that they all just "work" without much changes. How does this one happen?

Main robot and fragment share sensor name -> main robot sensor is chosen

I wasn't aware that we had any special logic around fragments.

This is a good point. I think we should expose this logic to users. I think the frontend component would be a good place to share this info

SGTM; as long as that's captured in some eventual fleet ticket.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh forgot to mention the fragments. The logic for that is in the frontend when we apply a fragment to a config. so I guess the logic is spread out even more 😅

Also 👍 will add it to the front end ticket

// These sensors can be configured on the main robot, or a remote
// In situations where there are conflicting sensor names the following behavior happens
// Main robot and remote share sensor name -> main robot sensor is chosen
// Only remote has the sensor name -> remote sensor is read
// Multiple remotes share a senor name -> conflict error is returned and reconfigure happens
// To specify a specific remote sensor use the name format remoteName:sensorName to specify a remote sensor
if newConfig.MaintenanceConfig != nil {
name, err := resource.NewFromString(newConfig.MaintenanceConfig.SensorName)
if err != nil {
r.logger.Warnf("sensor_name %s in maintenance config is not in a supported format", newConfig.MaintenanceConfig.SensorName)
} else {
sensorComponent, err := robot.ResourceFromRobot[sensor.Sensor](r, name)
if err != nil {
r.logger.Warnf("%s, Starting reconfiguration", err.Error())
} else {
canReconfigure, err := r.checkMaintenanceSensorReadings(ctx, newConfig.MaintenanceConfig.MaintenanceAllowedKey, sensorComponent)
if !canReconfigure {
r.logger.Info("maintenance_allowed_key found from readings on maintenance sensor. Skipping reconfiguration.")
diff, err := config.DiffConfigs(*r.Config(), *newConfig, false)
if err != nil {
r.logger.CErrorw(ctx, "error diffing the configs", "error", err)
}
// NetworkEqual checks if Cloud/Auth/Network are equal between configs
if diff != nil && !diff.NetworkEqual {
r.logger.Info("Machine reconfiguration skipped but Cloud/Auth/Network config section contain changes and will be applied.")
}
return
}
if err != nil {
r.logger.Warn(err.Error() + ". Starting reconfiguration")
} else {
r.logger.Info("maintenance_allowed_key found from readings on maintenance sensor. Starting reconfiguration")
}
}
}
}

r.configRevisionMu.Lock()
r.configRevision = config.Revision{
Revision: newConfig.Revision,
Expand Down Expand Up @@ -1435,3 +1474,28 @@ func (r *localRobot) MachineStatus(ctx context.Context) (robot.MachineStatus, er
func (r *localRobot) Version(ctx context.Context) (robot.VersionResponse, error) {
return robot.Version()
}

// checkMaintenanceSensorReadings ensures that errors from reading a sensor are handled properly.
func (r *localRobot) checkMaintenanceSensorReadings(ctx context.Context,
maintenanceAllowedKey string, sensor resource.Sensor,
) (bool, error) {
timeout := 5 * time.Second
ctx, cancel := context.WithTimeout(ctx, timeout)
defer cancel()

// Context timeouts on this call should be handled by grpc
readings, err := sensor.Readings(ctx, map[string]interface{}{})
if err != nil {
// if the sensor errors or timeouts we return false to block reconfigure
return false, errors.Errorf("error reading maintenance sensor readings. %s", err.Error())
}
readingVal, ok := readings[maintenanceAllowedKey]
if !ok {
return true, errors.Errorf("error getting maintenance_allowed_key %s from sensor reading", maintenanceAllowedKey)
}
canReconfigure, ok := readingVal.(bool)
if !ok {
return true, errors.Errorf("maintenance_allowed_key %s is not a bool value", maintenanceAllowedKey)
}
return canReconfigure, nil
}
Loading
Loading