Handle stop of price submission in the oracle gracefully #851

nud3l · 2023-01-13T09:15:19Z

Describe the bug
When the oracle stops submitting prices for an asset, the parachain goes into an error state ORACLE_OFFLINE. However, there is a valid reason to stop submitting prices for an asset: The asset is delisted from the bridge or lending protocol, so the prices are no longer needed.

Also, it's possible that a single asset could just be removed from our oracle source. Delisting might happen due to issues with the underlying asset, lack of demand, or more.

Expected behavior

When a single oracle asset is not reported anymore, we should not stop the whole bridge and the whole lending market. Instead, the affected assets should be paused.
Delisting assets requires governance. Within the same governance to delist a market or a collateral asset from the bridge, governance should be able to delist the oracle price reporting of an asset without the parachain entering the ORACLE_OFFLINE state.

The text was updated successfully, but these errors were encountered:

sander2 · 2023-01-23T18:16:08Z

When a single oracle asset is not reported anymore, we should not stop the whole bridge and the whole lending market. Instead, the affected assets should be paused.

This can most be achieved by just removing the error state altogether. That would let everything continue as normal, except that any calls that do a currency conversion to that collateral will fail. We might also want disable rewards for vault with that collateral. However, what would we want to do when an asset is delisted? Would we liquidate all vaults and borrows? There wouldn't be a straightforward exit path

nud3l · 2023-02-02T02:58:32Z

If we remove the error state, how would the downstream components like the UI or squid know that we stopped submitting that asset? Possibly, we need still some state change but instead of Error it would be Error(Asset) so the UI could check that.

For delisting, what other protocols are doing is setting the ceiling to 0. I'm not sure if that would break our code though. In these other protocols, setting the ceiling to 0 allows vaults/borrowers to withdraw their assets but not deposit more collateral. Likewise, the capacity for borrows/bridging for that asset would be 0.

nud3l · 2023-02-03T09:57:55Z

Only handle oracle going offline for one or more assets in this issue.
Move the delisting of assets to separate issue.

For oracle going offline:

Assume asset price will be available again after timeout period.

Questions:

How would that affect liquidations? If price crashes when oracle is coming online again, vaults/borrowers will get liquidated. Liquidations are not possible in the bridge while oracle is offline. Check how lending protocol handles that, likely liquidations also would not work.
Vault stake: should the vault stake be set to 0 while the oracle is offline, i.e., should vaults receive rewards while users cannot bridge from/to that asset?
Lending: can users repay their loans?
Remove all security states? (Running, Error, Shutdown) -> replaced by tx-pause.
Active block number: this is an issue. Active block number is not "asset aware". Either oracle works or it does not. When one asset is not reported anymore, we should give users/vaults more time - not count active blocks - for that asset. But not for the other assets.

Plan of action:

Remove OracleOffline error state to not halt the entire bridge when one or more assets are not reported by oracle anymore.

sander2 · 2023-05-12T12:35:20Z

Active block number: this is an issue. Active block number is not "asset aware". Either oracle works or it does not. When one asset is not reported anymore, we should give users/vaults more time - not count active blocks - for that asset. But not for the other assets.

So let's just make ActiveBlockNumber a map of CurrencyId -> ActiveBlockNumber. In the on_initialize hook, we can just iterate over the currencies and increment those that are still active. The on_initialize hook can return a dynamic weight so that shouldn't be a problem. We could also optimize this in the future by only recording a (ActiveBlockNumber, BlockNumber, IsOnline) tuple when there is a state change between [online, offline]. The client can then add (currentBockNumber - recordedBlockNumber) to ActiveBlockNumber to determine the current value. But that's premature optimization, let's stick with the loop in on_initialize for now.

gregdhill · 2023-05-19T15:19:34Z

Seems like a substantial escalation of complexity but I haven't reviewed the problem in-depth to understand if there is an alternate solution. Which value would you propose we use for issue, redeem and replace requests?

sander2 · 2023-05-19T17:07:19Z

Which value would you propose we use for issue, redeem and replace requests?

I was thinking for issue/redeem, the active block number associated with the collateral currency. But actually, now that I think about it.. Do we need the active block number at all? Right now we need it because you can't execute when the oracle is offline. But if we remove that limitation, I think we may actually be able to just execute if the oracle is down. Probably not all codepaths would work - for issue overpayment we need to check vault minting capacity, so we need the oracle to be online.. But it may be acceptable to not mint extra tokens tokens in that case. I'll investigate this idea further

sander2 · 2023-05-22T08:09:59Z

Which value would you propose we use for issue, redeem and replace requests?

I was thinking for issue/redeem, the active block number associated with the collateral currency. But actually, now that I think about it.. Do we need the active block number at all? Right now we need it because you can't execute when the oracle is offline. But if we remove that limitation, I think we may actually be able to just execute if the oracle is down. Probably not all codepaths would work - for issue overpayment we need to check vault minting capacity, so we need the oracle to be online.. But it may be acceptable to not mint extra tokens tokens in that case. I'll investigate this idea further

What should work as normal:

execute_issue of pending issue with amount less than or equal to expected amount
execute_redeem (although the vault in question wouldn't get rewards since stake would be 0)
execute_replace of pending request
cancel_issue

What wouldn't work (or work differently):

execute_issue of pending issue with greater amount than expected: they wouldn't get more KBTC than they issued
execute_issue of cancelled issue would not work (can't check collateralization rate of vault)
cancel_redeem: requires price to determine compensation
request_replace
accept_replace
execute_replace of cancelled request: needs price for new_vault's collateral currency

All in all, I think this would be acceptable - there is no need to extend the time allocated for execution if the oracle goes down. All actors are able to process open requests as normal, as long as they act correctly. By that I mean that we can't provide some of the mitigations in place for user error (e.g. issue overpayment). But since these mitigations are "best effort" anyway, I think that's acceptable. The only problem is cancel_redeem, which requires the oracle to determine the compensation amount. I think it's acceptable to require users to wait with cancellation in the event of the oracle being down - that should be a very exception case anyway. If required, though, we can think about changes to the cancel_redeem s.t. it still works, but it comes at the cost of increased complexity and I'm not sure that's worth it

gregdhill · 2023-05-23T14:16:22Z

@sander2 so the new proposal is to remove the error status and active block handling? For the latter I would assume we would still want to increase the global counter for now pending some future migration to revert back to the system height. I can't recall if there was another reason to extend the time allocated for execution but I'm open to this approach.

sander2 · 2023-05-23T14:25:20Z

@sander2 so the new proposal is to remove the error status and active block handling? For the latter I would assume we would still want to increase the global counter for now pending some future migration to revert back to the system height. I can't recall if there was another reason to extend the time allocated for execution but I'm open to this approach.

Yea that's right. Basically, we'd allow every function that doesn't fail in the oracle conversion to be callable.

Thanks for your reply, I'll discuss with Dom as well when he's back

nud3l · 2023-05-29T06:32:48Z

I can't recall if there was another reason to extend the time allocated for execution but I'm open to this approach.

Yes, in case we would pause the parachain via SHUTDOWN state or the tx-pause module. In that case, users and vaults would not be able to execute requests.

However, this is not currency-specific. The tx-pause/SHUTDOWN would come into play when governance decide if there's emergency action to be taken.

So, I'd be in favor of the following change:

Remove the ORACLE_OFFLINE state
Stop counting the active blocks when the parachain is in SHUTDOWN state/tx-pause is active (not sure there's a nice way to combine the two?)
As Sander said, allow every function that doesn't fail in the oracle conversion to be callable
Handle oracle errors upstream: (1) The UI should show a validation error when users try to call functions that err when called. (2) The vault client should log that it can't call certain functions due to the oracle being offline.

nud3l added the bug Something isn't working label Jan 13, 2023

nud3l added this to the Security milestone Jan 13, 2023

gregdhill modified the milestones: Security, Kintsugi De-Fi Hub Feb 2, 2023

nud3l removed this from the Kintsugi De-Fi Hub milestone Feb 3, 2023

gregdhill mentioned this issue Jun 20, 2023

Offboarding collateral assets #1100

Open

sander2 self-assigned this Jul 3, 2023

sander2 mentioned this issue Jul 4, 2023

[BREAKING] Feat: handle stopping of oracle submissions better #1119

Merged

sander2 closed this as completed in #1119 Jul 14, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Handle stop of price submission in the oracle gracefully #851

Handle stop of price submission in the oracle gracefully #851

nud3l commented Jan 13, 2023 •

edited

Loading

sander2 commented Jan 23, 2023

nud3l commented Feb 2, 2023

nud3l commented Feb 3, 2023

sander2 commented May 12, 2023

gregdhill commented May 19, 2023

sander2 commented May 19, 2023

sander2 commented May 22, 2023

gregdhill commented May 23, 2023

sander2 commented May 23, 2023

nud3l commented May 29, 2023

Handle stop of price submission in the oracle gracefully #851

Handle stop of price submission in the oracle gracefully #851

Comments

nud3l commented Jan 13, 2023 • edited Loading

sander2 commented Jan 23, 2023

nud3l commented Feb 2, 2023

nud3l commented Feb 3, 2023

sander2 commented May 12, 2023

gregdhill commented May 19, 2023

sander2 commented May 19, 2023

sander2 commented May 22, 2023

gregdhill commented May 23, 2023

sander2 commented May 23, 2023

nud3l commented May 29, 2023

nud3l commented Jan 13, 2023 •

edited

Loading