Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ethclient: SubscribeFilterLogs() not respecting FilterQuery.FromBlock #15063

Open
mcdee opened this issue Aug 31, 2017 · 30 comments
Open

ethclient: SubscribeFilterLogs() not respecting FilterQuery.FromBlock #15063

mcdee opened this issue Aug 31, 2017 · 30 comments

Comments

@mcdee
Copy link
Contributor

mcdee commented Aug 31, 2017

System information

Geth
Version: 1.6.7-stable
Git Commit: ab5646c
Architecture: amd64
Protocol Versions: [63 62]
Network Id: 1
Go Version: go1.8.1
Operating System: linux

Expected behaviour

Setting up a simple log filter for a given contract address starting at block 1000000:

client, _ := ethclient.Dial("/home/foo/.ethereum/geth.ipc")

filter := ethereum.FilterQuery{}
filter.Addresses = make([]common.Address, 0)
filter.Addresses = append(filter.Addresses, common.HexToAddress("0x..."))
filter.FromBlock = big.NewInt(1000000)

Using this with filterLogs() provides a set of results:

logs, _ := client.FilterLogs(ctx, filter)
fmt.Println(len(logs)) // Ouptuts '143'

However using SubscribeFilterLogs() does not provide any immediate results:

ctx := context.Background()
ch := make(chan types.Log)
client.SubscribeFilterLogs(ctx, filter, ch)

for true {
    log := <-ch
    fmt.Println("Matching log encountered")
}

It is expected that SubscribeFilterLogs() start from the block as specified in the FilterQuery's FromBlock.

Actual behaviour

SubscribeFilterLogs() appears to start from the first new block received by the client regardless of the block as specified in the FilterQuery's FromBlock.

Steps to reproduce the behaviour

The above code provides the scenario to test (pick your favourite contract address and start block for testing purposes).

@rjl493456442
Copy link
Member

The behaviour is correct.

Reason:

client.filterLogs function invokes eth_getLogs function ultimately, which traverses the whole database to retrieve all matched logs.

client.SubscribeFilterLogs function creates a subscription with specific args. Once a new log arrives, if it matches the filter criteria,the log will been returned via subscription, if not, it will been discarded. It will not traverse the past log.

@mcdee
Copy link
Contributor Author

mcdee commented Sep 4, 2017

This is not at all obvious from the documentation.

It also begs the question: how would one obtain all log entries in the correct order? FilterLogs() followed by SubscribeFilterLogs() could miss log entries in new blocks between the calls.

The only thing that I can think of doing would be to SubscribeFilterLogs(), then FilterLogs(), work through the log entries returned by FilterLogs(), then start working through SubscribeFilterLogs() but that's hardly intuitive and requires handling duplicates. Is there a better way?

@nkbai
Copy link
Contributor

nkbai commented Sep 20, 2017

first call FilterLogs ,write result to the channel which is for SubscribeFilterLogs
for example:

logs, err := c.caller.FilterLogs(ctx, q)
	if err != nil {
		return nil, nil, err
	}
	var ch = make(chan types.Log, len(logs))
	//subcribe logs that will happen.
	sub, err := c.caller.SubscribeFilterLogs(ctx, q, ch)
	for _, log := range logs {
		ch <- *log
	}
	return ch, sub, err

@mcdee
Copy link
Contributor Author

mcdee commented Sep 20, 2017

@nkbai what happens if a block comes in between the calls to FilterLogs and SubscribeFilterLogs?

@nkbai
Copy link
Contributor

nkbai commented Sep 20, 2017

i don't know , you may lost this block. you can listen pending event SubscribePendingTransactions.

@nkbai
Copy link
Contributor

nkbai commented Sep 20, 2017

you can specify the query's toblock argument to rpc.PendingBlockNumber. if FilterLogs doesn't need too much time, it will get all the related blocks.

@mcdee
Copy link
Contributor Author

mcdee commented Sep 20, 2017

It doesn't sound like there is a working solution to this at current, which is a shame given that reading logs is a very common requirement.

@fjl
Copy link
Contributor

fjl commented Sep 21, 2017

This is a valid feature request, and we're aware of the limitation. It would also be nice to support ToBlock in SubscribeFilterLogs for streaming queries.

@yondonfu
Copy link
Contributor

Hi has there been any movement regarding implementing this feature? Support for fromBlock in SubscribeFilterLogs would be especially helpful for clients that rely on logs to update a local DB based on state changes in a contract such that if the client stops and restarts at a later point in time, the client can start receiving logs starting from the last block that it was online.

@kminevskiy
Copy link

Hello, any updates on this feature request?

@qustavo
Copy link

qustavo commented Apr 17, 2019

I struggled with the same issue, perhaps documentation could provide a good example on how to handle this

@adamschmideg
Copy link
Contributor

Waiting for #20012

hkalodner added a commit to OffchainLabs/arbitrum-classic that referenced this issue Jan 13, 2020
@stale
Copy link

stale bot commented Aug 29, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@pascaldekloe
Copy link

Here's a library for fetching a continuous view (to workaround this issue): https://github.com/pascaldekloe/etherstream

@JackG-eth
Copy link

Did this ever get solved? the above workaround would not solve the issue if you're doing a large query.

@fjl
Copy link
Contributor

fjl commented Jan 9, 2023

@s1na and I have re-evaluated this last week to see how much it would take to implement.

@fjl fjl reopened this Jan 9, 2023
@s1na s1na self-assigned this Jan 9, 2023
@JackG-eth
Copy link

@s1na Do you have any idea on when this might get implemented? I'm super super keen to use this functionality.

@s1na
Copy link
Contributor

s1na commented Jan 12, 2023

@JackG-eth Unfortunately I can't make an estimate. I'll update here when I get started on it.

@JackG-eth
Copy link

Thank you :)

@JackG-eth
Copy link

JackG-eth commented Feb 2, 2023

@s1na I could really use this methodology, could you push me in the direction of where i need to make the changes as I'd be happy to implement it myself (re your evaluation)

  • From my understanding the work flow for FilterLogs is:
  1. Arguments
  2. CallContext on the client passing in (eth_getLogs) as the method
  3. Create an RPC call with these arguments
  4. Umarshal results into struct and return
  • From my understanding the work flow for SubscribeFilterLogs is:
  1. Arguments
  2. EthSubscribeon the client passing in (Logs) as the method
  3. Create an RPC call with these arguments
  4. Return new events to a subscription
  5. Notifications are sent for current events and not for past events. For use cases that cannot afford to miss any notifications, subscriptions are probably not the best option.
  6. I'm assuming the change is in subscription.go somewhere
  7. I think I've narrowed it down to the forward() function which receieves RPC notifcations but im not sure

Any help/advice would be appreciated!

@chochihim
Copy link

chochihim commented Mar 2, 2023

For anyone interested, this is how web3.js does the trick: call eth_getLogs and then eth_subscribe.

But they also don't handle the edge case mentioned by @mcdee

@ryanc414
Copy link
Contributor

ryanc414 commented Mar 29, 2023

I hit this issue recently as well. It's not at all intuitive when the SubscribeFilterLogs takes the same ethereum.FilterQuery type as the FilterLogs call, in spite of the FromBlock being ignored. I assume that the BlockHash and ToBlock parameters are ignored as well though it seems that the Addresses and Topics filter parameters are respected.

I think at minimum there should be a change to the SubscribeFilterLogs method to take a different type as input (could be named e.g. ethereum.FilterSubscription or ethereum.SubscribeFilterQuery etc) which does not include the ignored fields. If that is not possible for compatibility reasons, then the docs should make this quirk ABUNTANTLY CLEAR!

For now my solution is to have two concurrent processes, one which calls subscribe and the other which slowly polls FilterLogs. Both processes write their results to a queue which is then deduplicated downstream. I think this should avoid dropping logs while providing a quick detection speed for the majority of the logs.

@ryanc414
Copy link
Contributor

Another thing I would be curious to understand is what is the behaviour of the SubscribeFilterLogs method during chain reorgs? Will we get the logs from the new canonical chain all at once, or could some logs be missed?

@pascaldekloe
Copy link

See the Removed field from https://pkg.go.dev/github.com/ethereum/go-ethereum/core/types#Log @ryanc414.

@ryanc414
Copy link
Contributor

See the Removed field from https://pkg.go.dev/github.com/ethereum/go-ethereum/core/types#Log @ryanc414.

OK great thanks. So if I only want events that are in the main canonical chain I should drop events with Removed is true?

@fjl
Copy link
Contributor

fjl commented Mar 30, 2023

Not really! Depending on your application, you may need to apply some logic to process removed logs.

For example, if you were tracking balances of a token by summing up transfer events, you'd need to subtract the amounts contained in removed logs.

@ryanc414
Copy link
Contributor

ryanc414 commented Apr 3, 2023

Not really! Depending on your application, you may need to apply some logic to process removed logs.

For example, if you were tracking balances of a token by summing up transfer events, you'd need to subtract the amounts contained in removed logs.

Got it thanks. In my case I am just using certain events as a trigger to invalidate cached values. I'm also assuming that we cannot rely on logs always being received strictly in order.

@s1na
Copy link
Contributor

s1na commented Apr 18, 2023

Posting some thoughts about this. We want to maintain order in the logs, so we should push all historical logs first then go into live-mode. In order to do this we should "inch closer" to the head:

  1. First do historical until current head block num n.
  2. Check if n has been reorged: issue rmLogs and new logs for reorged section
  3. By the time that's finished, head has moved n2.
  4. Do another historical from n to n2.
  5. Check if n2 has been reorged: issue rmLogs and new logs for reorged section
  6. repeat until nN and current head are same
  7. then toggle live mode

these first 4 steps need to be run in a separate routine than EventSystem.eventLoop otherwise other events would be blocked.

holiman added a commit that referenced this issue May 25, 2023
This change implements async log retrievals via feeding logs in channels, instead of returning slices. This is a first step to implement #15063.  

---------

Signed-off-by: jsvisa <delweng@gmail.com>
Co-authored-by: Sina Mahmoodi <itz.s1na@gmail.com>
Co-authored-by: Martin Holst Swende <martin@swende.se>
Co-authored-by: Sina Mahmoodi <1591639+s1na@users.noreply.github.com>
@Abso1ut3Zer0
Copy link

Hi, has work started on this feature?

I started running into this problem, but my use case relies that the events are ordered and deduplicated, which can make a workaround quite complex for a standard recovery scenario. I fear handling this via a workaround may cause unforeseen bugs given the problems I mentioned.

@s1na
Copy link
Contributor

s1na commented Jul 14, 2023

Hi, has work started on this feature?

I started running into this problem, but my use case relies that the events are ordered and deduplicated, which can make a workaround quite complex for a standard recovery scenario. I fear handling this via a workaround may cause unforeseen bugs given the problems I mentioned.

You can follow the progress at #27439

devopsbo3 pushed a commit to HorizenOfficial/go-ethereum that referenced this issue Nov 10, 2023
This change implements async log retrievals via feeding logs in channels, instead of returning slices. This is a first step to implement ethereum#15063.  

---------

Signed-off-by: jsvisa <delweng@gmail.com>
Co-authored-by: Sina Mahmoodi <itz.s1na@gmail.com>
Co-authored-by: Martin Holst Swende <martin@swende.se>
Co-authored-by: Sina Mahmoodi <1591639+s1na@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

15 participants