Rewrote existing parser with a new logic to detect tweets #229

Bhargav-Dave · 2023-02-05T22:59:04Z

The earlier parser detected tweets by running RegEx queries and matching the XPath of each tweet text element against a fixed set of given XPaths.

However, I discovered that every DIV in the Twitter DOM that contains tweet text has an attribute named 'data-testid' whose value is always 'tweetText'.

Hence, the new method uses the DOM's 'querySelectorAll()' method (ref: https://developer.mozilla.org/en-US/docs/Web/API/Document/querySelectorAll) to select all DIVs whose data-testid is set to tweetText. This returns a NodeList (an array-like collection) of all DIVs in the current DOM that contain the spans holding the tweet text.
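
A minimal sketch of this selection step, for illustration only. The function name `findTweetTextDivs` and the constant `TWEET_TEXT_SELECTOR` are hypothetical and not taken from parser-v2.js; the sketch assumes the attribute value is exactly 'tweetText' as described above.

```javascript
// CSS attribute selector for DIVs that hold tweet text.
// Assumption: Twitter keeps data-testid="tweetText" on these DIVs.
const TWEET_TEXT_SELECTOR = 'div[data-testid="tweetText"]';

// Hypothetical helper: returns every matching DIV under `root` as a real array.
// `root` is anything that exposes querySelectorAll (a Document or an Element).
function findTweetTextDivs(root) {
  // querySelectorAll returns an array-like NodeList; Array.from copies it
  // into an array so that array methods such as map and filter are available.
  return Array.from(root.querySelectorAll(TWEET_TEXT_SELECTOR));
}

// In the browser this would be called with the live document:
// const tweetDivs = findTweetTextDivs(document);
```

Passing the root in as a parameter, rather than reading the global `document` directly, is a design choice that makes the helper usable on a subtree and easy to exercise outside a browser.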

This collection is built in parser-v2.js and then passed to transform-v2.js, where a for loop runs through all the DIVs and applies the processing that was previously done in transform.js.
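
The hand-off described above could look roughly like the following sketch. Both function names are hypothetical, and `extractTweetText` is only a stand-in: it assumes the per-tweet work amounts to joining the text of the spans inside each DIV, which may differ from what transform.js actually does.

```javascript
// Hypothetical stand-in for the per-tweet processing.
// Assumption: the tweet text lives in <span> elements inside the DIV.
function extractTweetText(div) {
  const spans = Array.from(div.querySelectorAll("span"));
  // Concatenate the span texts in document order and trim the result.
  return spans.map((span) => span.textContent).join("").trim();
}

// Hypothetical transform step: loops over the DIVs handed over by the parser
// and processes each one, returning one result per tweet.
function transformTweetDivs(tweetDivs) {
  const results = [];
  for (const div of tweetDivs) {
    results.push(extractTweetText(div));
  }
  return results;
}
```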

Bhargav-Dave · 2023-02-05T23:06:27Z

Is related to: #179

dennyabrain · 2023-02-06T03:34:54Z

hey @Bhargav-Dave, there are two concerns with this approach:

1. data-testid is added to make testing easy for the Twitter developers. It's possible they will take it out someday.
2. This works for the tweet text, but can you extract the tweet URL, timestamp and author handle using this method?

Bhargav-Dave added 2 commits February 6, 2023 04:14

feat!: added parser update

0377297

chore: added comments to parser-v2

0ea3ed0

dennyabrain merged commit 0a49c89 into tattle-made:main Mar 8, 2023
