Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

主题帖回复帖回复帖楼中楼接口中的发帖人等级现在可能放在user_list[].level_id而非post_list[].level_id #216

Closed
n0099 opened this issue Jul 26, 2024 · 10 comments
Labels
discussion discussion

Comments

@n0099
Copy link

n0099 commented Jul 26, 2024

https://tieba.baidu.com/p/9105338452
https://n0099.net/tbm/v1/client_tester.php?type=replies&tid=9105338452&pn=1&client_version=12.62.1.0
https://web.archive.org/web/20240726153335/https://n0099.net/tbm/v1/client_tester.php?type=replies&tid=9105338452&pn=1&client_version=12.62.1.0

curl -s https://n0099.net/tbm/v1/client_tester.php\?type\=replies\&tid\=9105338452\&pn\=1\&client_version\=12.62.1.0 \
| jq -r '.user_list as $users | .post_list[] | .author_id as $author | select(.level_id < 1) | (.id | tostring) + " " + ($author | tostring) + " " + (.level_id | tostring) + " " + ($users[] | select(.id == $author) | .level_id | tostring)'
150661661802 4051368579 null 10
150661669563 4474588442 null 12
150661736182 1102805097 null 9
150661743361 1467788141 null 11
150661754146 400188218 null 14
150661756477 5655640935 null 1
150661762762 571941096 null 11
150661858924 2269845119 null 7
150661889082 1132231151 null 11
150661914497 4051368579 null 10
150661971779 762078351 null 1
150661984915 1073452138 null 13

2024-07-12开始 image

SELECT COUNT(*), "triggeredBy", to_timestamp("discoveredAt")::date
FROM "tbmcr_authorExpGrade"
WHERE "authorExpGrade" = 0
GROUP BY "triggeredBy", to_timestamp("discoveredAt")::date;
count triggeredBy to_timestamp
... ... ...
19 subReply 2024-05-22
1 subReply 2024-05-25
28 reply 2024-05-28
10 reply 2024-06-15
30 reply 2024-06-24
6 reply 2024-06-29
29 reply 2024-06-30
27 reply 2024-07-03
13 reply 2024-07-06
1279 subReply 2024-07-12
23017 reply 2024-07-12
51306 reply 2024-07-13
7457 subReply 2024-07-13
41789 reply 2024-07-14
7165 subReply 2024-07-14
63635 reply 2024-07-15
12960 subReply 2024-07-15
59924 reply 2024-07-16
23598 subReply 2024-07-16
82357 reply 2024-07-17
10421 subReply 2024-07-17
11716 subReply 2024-07-18
63931 reply 2024-07-18
58489 reply 2024-07-19
9233 subReply 2024-07-19
59450 reply 2024-07-20
7975 subReply 2024-07-20
6488 subReply 2024-07-21
60110 reply 2024-07-21
8494 subReply 2024-07-22
56564 reply 2024-07-22
9970 subReply 2024-07-23
70321 reply 2024-07-23
54702 reply 2024-07-24
7484 subReply 2024-07-24
6122 subReply 2024-07-25
46308 reply 2024-07-25
54335 reply 2024-07-26
6215 subReply 2024-07-26

似乎是特定于某些tid的(在某些tid上一直复现而某些一直不能)
https://github.com/n0099/tbclient.protobuf/blob/12.51.7.1/proto/User.proto#L72

@n0099
Copy link
Author

n0099 commented Jul 26, 2024

当我没说 https://github.com/n0099/tbclient.protobuf/blob/12.51.7.1/proto/Post.proto 本来就没有post_list[].level_id

@n0099 n0099 closed this as completed Jul 26, 2024
n0099 added a commit to n0099/open-tbm that referenced this issue Jul 27, 2024
…value of props with attribute `[NotMapped]` since b81b64a @ `ReplySaver.Save()`

* partial revert b688071 @ `AuthorRevisionSaver.SaveAuthorExpGrade()`

+ prop `Parsed` like `AllAfter` before b81b64a
* rename primary ctor param `existingAfterAndNewlyAdded` to `parsed`
@ SaverChangeSet.cs

* rename variable `existingPostsBeforeMerge` to `existingPostsBeforeSave` @ `PostSaver.Save()`
@ c#/crawler
n0099 added a commit to n0099/open-tbm that referenced this issue Jul 27, 2024
…value of props with attribute `[NotMapped]` since b81b64a @ `ReplySaver.Save()`

* partial revert b688071 @ `AuthorRevisionSaver.SaveAuthorExpGrade()`

+ prop `Parsed` like `AllAfter` before b81b64a
* rename primary ctor param `existingAfterAndNewlyAdded` to `parsed`
@ SaverChangeSet.cs

* rename variable `existingPostsBeforeMerge` to `existingPostsBeforeSave` @ `PostSaver.Save()`
@ c#/crawler
@lumina37
Copy link
Owner

蛤?

@n0099
Copy link
Author

n0099 commented Jul 28, 2024

我自己搞混了 https://en.wikipedia.org/wiki/Semipredicate_problem 从数据库读出来的实体类(.AuthorExpGrade == 0因为数据库表里没这个列)和parser生成的类(.AuthorExpGrade != 0因为我还拿实体类同时作为跨parser/saver传递不直接对应于该实体类在数据库表列的信息的载体aka Context
n0099/open-tbm@b81b64a
hotfix n0099/open-tbm@52bbdb2
按照 #207 (comment) 精神进一步区分实体类和跨parser/saver传递信息的载体避免以后也误用 n0099/open-tbm@274f15b n0099/open-tbm@50c22e7

@n0099
Copy link
Author

n0099 commented Aug 1, 2024

但在正义

DELETE FROM "tbmcr_authorExpGrade"
WHERE "authorExpGrade" = 0
  AND to_timestamp("discoveredAt")::date BETWEEN '2024-07-12' AND '2024-07-27';

删掉1m+行后在24-07-2908-01时针对某些(fid, uid)仍然出现了少量level_id=0

SELECT to_timestamp("discoveredAt") AT TIME ZONE 'UTC+8', *
FROM "tbmcr_authorExpGrade" WHERE fid = 6087183 AND uid = 169860828
ORDER BY "discoveredAt" DESC;

image

WITH a AS (
    SELECT COUNT(*) count, fid, uid, "triggeredBy",
           to_timestamp(min("discoveredAt"))::date AS min_day,
           to_timestamp(max("discoveredAt"))::date AS max_day
    FROM "tbmcr_authorExpGrade"
    WHERE "authorExpGrade" = 0
    GROUP BY fid, uid, "triggeredBy"
)
SELECT *, max_day - min_day duration FROM a ORDER BY duration DESC;

image

SELECT COUNT(*), fid, "triggeredBy", to_timestamp(min("discoveredAt"))::date AS day
FROM "tbmcr_authorExpGrade"
WHERE "authorExpGrade" = 0
GROUP BY fid, "triggeredBy"

image
image

count fid triggeredBy day
22 27497591 reply 2024-08-01
29 27497591 reply 2024-07-29
13 27497591 reply 2024-07-06
27 27497591 reply 2024-07-03
29 97650 reply 2024-06-30
6 97650 reply 2024-06-29
30 27497591 reply 2024-06-24
10 27497591 reply 2024-06-15
28 27497591 reply 2024-05-28
1 27497591 subReply 2024-05-25
19 27497591 subReply 2024-05-22
4 27497591 reply 2024-05-21
29 27497591 reply 2024-05-20
15 27497591 reply 2024-05-03
28 27497591 reply 2024-05-01
29 27497591 reply 2024-04-30
30 27497591 reply 2024-04-27
29 27497591 reply 2024-04-26
58 27497591 reply 2024-04-25
5 6087183 reply 2024-04-23
7 27497591 subReply 2024-04-19
23 97650 reply 2024-04-12
13 6087183 reply 2024-04-10
28 6087183 reply 2024-04-08
141 27497591 reply 2024-04-08
9 97650 reply 2024-04-07
60 27497591 reply 2024-04-07
2 27497591 subReply 2024-04-03
3 97650 subReply 2024-03-27
138 27497591 subReply 2024-03-27
30 97650 subReply 2024-03-22
179 97650 subReply 2024-03-20
19 898666 reply 2024-03-11
1 27497591 subReply 2024-03-08
14 27497591 subReply 2024-03-06
30 97650 subReply 2024-03-05
14 6087183 reply 2024-03-04
30 97650 subReply 2024-03-03
14 27497591 subReply 2024-03-01
30 97650 subReply 2024-02-04
1 27497591 subReply 2024-02-01
2 27497591 subReply 2024-01-31
25 27497591 subReply 2024-01-22
25 27497591 subReply 2024-01-17
1 97650 subReply 2024-01-07
6 6087183 reply 2024-01-05
10 27497591 subReply 2023-12-30
8 97650 subReply 2023-12-27
3 27497591 subReply 2023-12-27
4 27497591 subReply 2023-12-24
3 27497591 subReply 2023-12-23
6 6087183 reply 2023-12-21
7 27497591 subReply 2023-12-21
10 6087183 reply 2023-12-20
9 27497591 subReply 2023-12-20
1 97650 subReply 2023-12-17
46 27497591 subReply 2023-12-17
4 27497591 subReply 2023-12-14
13 6087183 reply 2023-12-13
1 27497591 subReply 2023-12-10
7 6087183 reply 2023-12-04
19 27497591 subReply 2023-12-01
7 27497591 subReply 2023-11-27
10 6087183 reply 2023-11-25
14 27497591 subReply 2023-11-24
18 27497591 subReply 2023-11-23
11 27497591 subReply 2023-11-09
7 6087183 reply 2023-11-08
30 27497591 reply 2023-11-08
89 27497591 subReply 2023-10-30
47 27497591 subReply 2023-10-27
7 6087183 reply 2023-10-25
7 6087183 reply 2023-10-24
2 27497591 subReply 2023-10-19
5 6087183 reply 2023-10-18
3 27497591 subReply 2023-10-17
1 27497591 subReply 2023-09-30
2 27497591 subReply 2023-09-29
1 27497591 subReply 2023-09-27
10 6087183 reply 2023-09-26
1 27497591 subReply 2023-09-26
5 6087183 reply 2023-09-24
28 898666 reply 2023-09-22
27 27497591 subReply 2023-09-21
179 97650 subReply 2023-09-20
2 6087183 subReply 2023-09-20
4 19871743 subReply 2023-09-20
623 27497591 subReply 2023-09-20
2 27497591 subReply 2023-09-19
70 27497591 subReply 2023-09-17
52 27497591 subReply 2023-09-16
38 27497591 subReply 2023-09-14
12 27497591 subReply 2023-09-12
23 27497591 subReply 2023-09-11
26 898666 reply 2023-09-03
6 6087183 reply 2023-08-22
6 6087183 reply 2023-08-12
15 27497591 reply 2023-08-12
9 6087183 reply 2023-08-08
4 6087183 reply 2023-07-25
10 6087183 reply 2023-07-23
12 6087183 reply 2023-07-18
26 97650 reply 2023-07-17
88 6087183 reply 2023-07-17
26 898666 reply 2023-07-16
52 6087183 reply 2023-07-16
24 27278534 reply 2023-07-16
71 898666 reply 2023-07-15
38 6087183 reply 2023-07-15
68 6087183 reply 2023-07-14
27 898666 reply 2023-07-13
67 6087183 reply 2023-07-13
29 97650 reply 2023-07-12
60 6087183 reply 2023-07-12
30 27497591 reply 2023-07-12
24 6087183 reply 2023-07-11
7 6087183 reply 2023-07-10
26 898666 reply 2023-07-09
11 6087183 reply 2023-07-09
22 6087183 reply 2023-07-08
1 4734432 reply 2023-07-07
64 6087183 reply 2023-07-07
68 898666 reply 2023-07-06
61 6087183 reply 2023-07-06
24 898666 reply 2023-07-05
36 6087183 reply 2023-07-05
27 898666 reply 2023-07-04
31 6087183 reply 2023-07-04
22 23546288 reply 2023-07-04
5 97650 reply 2023-07-03
44 6087183 reply 2023-07-03
28 27278534 reply 2023-07-03
28 898666 reply 2023-07-02
85 6087183 reply 2023-07-02
74 6087183 reply 2023-07-01
53 6087183 reply 2023-06-30
88 6087183 reply 2023-06-29
60 6087183 reply 2023-06-28
35 6087183 reply 2023-06-27
16 6087183 reply 2023-06-25
77 6087183 reply 2023-06-24
46 898666 reply 2023-06-23
90 6087183 reply 2023-06-23
46 6087183 reply 2023-06-22
6 6087183 reply 2023-06-21
66 6087183 reply 2023-06-20
21 898666 reply 2023-06-19
74 6087183 reply 2023-06-19
76 6087183 reply 2023-06-18
51 898666 reply 2023-06-17
18 4734432 reply 2023-06-17
96 6087183 reply 2023-06-17
25 19871743 reply 2023-06-17
9 27497591 reply 2023-06-17
25 898666 reply 2023-06-16
25 6087183 reply 2023-06-16
24 898666 reply 2023-06-15
79 6087183 reply 2023-06-15
55 6087183 reply 2023-06-14
26 97650 reply 2023-06-13
50 898666 reply 2023-06-13
73 6087183 reply 2023-06-13
71 898666 reply 2023-06-12
56 6087183 reply 2023-06-12
24 898666 reply 2023-06-11
86 6087183 reply 2023-06-11
42 898666 reply 2023-06-10
52 6087183 reply 2023-06-10
47 898666 reply 2023-06-09
29 3255599 reply 2023-06-09
53 6087183 reply 2023-06-09
3 19871743 reply 2023-06-09
63 6087183 reply 2023-06-08
27 898666 reply 2023-06-07
16 6087183 reply 2023-06-07
2 19871743 reply 2023-06-07
46 898666 reply 2023-06-06
17 6087183 reply 2023-06-06
121 898666 reply 2023-06-05
69 6087183 reply 2023-06-05
73 898666 reply 2023-06-04
1 4734432 reply 2023-06-04
23 27278534 reply 2023-06-04
71 6087183 reply 2023-06-03
23 898666 reply 2023-06-02
34 6087183 reply 2023-06-02
9 27278534 reply 2023-06-02
24 3255599 reply 2023-06-01
27 6087183 reply 2023-06-01
18 898666 reply 2023-05-31
4 4734432 reply 2023-05-31
63 6087183 reply 2023-05-31
13 6087183 reply 2023-05-30
53 6087183 reply 2023-05-29
25 898666 reply 2023-05-28
106 6087183 reply 2023-05-28
21 898666 reply 2023-05-27
35 6087183 reply 2023-05-27
29 27497591 reply 2023-05-27
8 6087183 reply 2023-05-26
11 898666 reply 2023-05-02
5 6087183 reply 2023-04-26
30 6087183 reply 2023-04-22
26 6087183 reply 2023-04-18
15 6087183 reply 2023-04-01
1 19871743 reply 2023-03-26
27 3255599 reply 2023-03-13
27 6087183 reply 2023-03-13
30 97650 reply 2023-03-07
10 6087183 reply 2023-03-05
9 6087183 reply 2023-03-03
1 228500 reply 2023-03-02
1 4734432 reply 2023-03-01
1 898666 reply 2023-02-26
24 898666 reply 2023-02-24
6 6087183 reply 2023-02-13
1 23546288 reply 2023-02-13
1 27278534 reply 2023-02-11
1 97650 reply 2023-02-09
1 3255599 reply 2023-02-09
1 6087183 reply 2023-02-07
48 6087183 reply 2023-02-04
25 6087183 reply 2023-02-03
25 97650 reply 2023-01-31
15 6087183 reply 2023-01-31
62 6087183 reply 2023-01-30
1 27497591 subReply 2023-01-29

23-01-29以来就一直都有
合理假设是贴吧分布式微服务后端体系架构 #64 (comment) 中负责缓存用户在某吧等级的repo层超时导致fallback到默认值0

@n0099 n0099 reopened this Aug 1, 2024
@lumina37
Copy link
Owner

lumina37 commented Aug 2, 2024

所以说这只是一种server glitch?

@n0099
Copy link
Author

n0099 commented Aug 2, 2024

在我找到新的证据证明是我写错之前是的

@lumina37 lumina37 added the discussion discussion label Aug 2, 2024
@lumina37 lumina37 closed this as completed Aug 7, 2024
@n0099
Copy link
Author

n0099 commented Aug 9, 2024

找到了是吧首页主题帖列表接口在_client_version<12.40时没有user_list[].level_id
https://n0099.net/tbm/v1/client_tester.php?type=posts&forum=v&pn=1&rn=30&client_version=12.39

curl -s https://n0099.net/tbm/v1/client_tester.php\?type\=posts\&forum\=v\&pn\=1\&rn\=30\&client_version\=12.39 \
| jq -c '.user_list[] | { (.id | tostring): .level_id }'
{"1447150457":null}
{"1402002589":null}
{"4543421180":null}
{"4468004397":null}
{"6675906847":null}
{"4875608415":null}
{"1733033498":null}
{"6328952342":null}
{"1520437512":null}
{"204338778":null}
{"614500188":null}
{"187991726":null}
{"6088061729":null}
{"5987746517":null}
{"23929304":null}
{"3819235464":null}
{"6683108031":null}
{"23313205":null}
{"2031944551":null}
{"5182522350":null}
{"2572256569":null}
{"563686252":null}
{"416729378":null}
{"4605592380":null}
{"3436707562":null}
{"6650816803":null}
{"774066515":null}

https://n0099.net/tbm/v1/client_tester.php?type=posts&forum=v&pn=1&rn=30&client_version=12.40

curl -s https://n0099.net/tbm/v1/client_tester.php\?type\=posts\&forum\=v\&pn\=1\&rn\=30\&client_version\=12.40 \
| jq -c '.user_list[] | { (.id | tostring): .level_id }'
{"1447150457":14}
{"1323283231":11}
{"6530439247":6}
{"5770533996":1}
{"84251189":5}
{"4543421180":11}
{"1512933272":11}
{"2651464622":14}
{"3031045165":9}
{"4501410938":5}
{"3511477459":10}
{"6647521408":1}
{"5819769370":11}
{"959590357":12}
{"5987746517":15}
{"4090680065":13}
{"6390588135":11}
{"1402002589":12}
{"6584763185":5}
{"1038770490":12}
{"5450747790":13}
{"5829220713":6}
{"771234994":10}
{"6328952342":14}
{"4474588442":12}
{"6411622419":7}
{"2888537466":1}
{"50860272":13}
{"1143534304":13}

@n0099
Copy link
Author

n0099 commented Aug 9, 2024

主题帖回复贴列表接口里还是有某些用户仍然level_id=0
image
https://tieba.baidu.com/p/9098899860 https://n0099.net/tbm/v1/client_tester.php?type=replies&tid=9098899860&pn=1&client_version=12.66.1.0

$ curl -s 'https://n0099.net/tbm/v1/client_tester.php?type=replies&tid=9098899860&pn=1&client_version=12.66.1.0' | jq -c '.user_list[] | { (.id | tostring): .level_id }'
{"243784075":15}
{"782010266":15}
{"20006581914068":null}
$ curl -s 'https://n0099.net/tbm/v1/client_tester.php?type=replies&tid=9098899860&pn=1&client_version=12.66.1.0' | jq -c '.user_list[] | { (.id | tostring): .level_id }'
{"243784075":15}
{"782010266":15}
{"6653138544":null}
$ curl -s 'https://n0099.net/tbm/v1/client_tester.php?type=replies&tid=9098899860&pn=1&client_version=12.66.1.0' | jq -c '.user_list[] | { (.id | tostring): .level_id }'
{"243784075":15}
{"782010266":15}
{"20006581941845":null}
$ curl -s 'https://n0099.net/tbm/v1/client_tester.php?type=replies&tid=9098899860&pn=1&client_version=12.66.1.0' | jq -c '.user_list[] | { (.id | tostring): .level_id }'
{"243784075":15}
{"782010266":15}
{"20006581932057":null}
$ curl -s 'https://n0099.net/tbm/v1/client_tester.php?type=replies&tid=9098899860&pn=1&client_version=12.66.1.0' | jq -c '.user_list[] | { (.id | tostring): .level_id }'
{"243784075":15}
{"782010266":15}
{"20006581941827":null}
$ curl -s 'https://n0099.net/tbm/v1/client_tester.php?type=replies&tid=9098899860&pn=1&client_version=12.66.1.0' | jq -c '.user_list[] | { (.id | tostring): .level_id }'
{"243784075":15}
{"782010266":15}
{"0":null}
$ curl -s 'https://n0099.net/tbm/v1/client_tester.php?type=replies&tid=9098899860&pn=1&client_version=12.66.1.0' | jq -c '.user_list[] | { (.id | tostring): .level_id }'
{"243784075":15}
{"782010266":15}
{"20002383684206":null}

这主题帖里就两个用户但却有3个且第三个是随机的(大多数的百度uid是在开始随机分配 #213 (comment) 后注册的)甚至百度uid0 #213

curl -s 'https://n0099.net/tbm/v1/client_tester.php?type=replies&tid=9098899860&pn=1&client_version=12.66.1.0' \
| jq '(.user_list[] | select(.level_id == null)) as $user | (.recom_ala_info | .user_info | .user_id) as $alauid | [(.recom_ala_info), ($user), ($alauid == $user.id)]'

实际上是前ala直播(疑似直接复用某早已倒闭的时任百度内部多个直播app竞品的遗留字段)主播广告位中的贴吧用户
百度uid6470214027https://tieba.baidu.com/home/main?id=tb.1.248440ba.t0aAXPcWotQjz1XxQtG7ZA
百度uid2496591178 https://tieba.baidu.com/home/main?id=tb.1.77d01d4c.kNkhL32l58EYLe6PTuxP7g 吧龄8.3年疑似挖号

[
  {
    "live_id": 9477424228,
    "description": "孤独的你和孤独的我刚刚好",
    "live_from": 0,
    "first_headline": "小姐姐萝莉音在线直播,精彩不容错过!",
    "room_id": 9477424228,
    "yy_ext": {
      "feed_id": "16795871389423559215",
      "sid": "1538441763",
      "ssid": "1538441763",
      "template_id": "33554534",
      "yy_uid": "2954945485",
      "is_yy_game": 0,
      "stream_info": "{\"appid\":1296924150,\"cid\":1538441763,\"sid\":1538441763,\"wh_ratio\":\"368:640\",\"stream\":[{\"a_encoderType\":40,\"a_rate\":128,\"a_stream_name\":\"xa_1538441763_1538441763_0_0_0\",\"v_encoderType\":100,\"v_rate\":2000,\"v_stream_name\":\"xv_1538441763_1538441763_0_0_0\",\"pair\":2,\"extend\":{\"gearInfo\":\"{\\\"name\\\":\\\"高清\\\",\\\"seq\\\":200,\\\"gear\\\":2}\",\"line_infos\":\"{\\\"line_addrs\\\":[{\\\"line_seq\\\":6,\\\"reason\\\":\\\"cache\\\",\\\"cdn_info\\\":{\\\"url_id\\\":404,\\\"url_type\\\":1,\\\"url\\\":\\\"http://bd-flv.baizhanlive.com/live/1296924150_xv_1538441763_1538441763_0_0_0-1296924150_xa_1538441763_1538441763_0_0_0-0-0-0-0-0-1723237292976433.flv?codec=orig\\\\u0026appid=1296924150\\\\u0026rts_tk=f6ffcbdc5a3007e405e49c5fba1f9f23\\\\u0026secret=5b5f95d70e2897b2f23be31ffe1ac12c\\\\u0026t=1723330920\\\\u0026stream_key=1296924150_xv_1538441763_1538441763_0_0_0\\\\u0026r_stream_key=1296924150_xa_1538441763_1538441763_0_0_0\\\\u0026channelid=1538441763\\\\u0026mtk=1\\\\u0026line_seq=6\\\\u0026cp_id=4\\\\u0026rts=1\\\\u0026ctx=eyJydHNfYXV0aCI6MiwiZGVmX3Bhc3MiOjF9\\\\u0026r=cache\\\",\\\"is_p2p\\\":1,\\\"json\\\":\\\"{\\\\\\\"audio_only_flag\\\\\\\":\\\\\\\"\\\\\\\\u0026only-audio=1\\\\\\\\u0026audio-only=1\\\\\\\"}\\\",\\\"report_json\\\":\\\"{\\\\\\\"reason\\\\\\\":\\\\\\\"cache\\\\\\\"}\\\"},\\\"weight\\\":100,\\\"stage\\\":\\\"1723237292976433\\\"}],\\\"line_list\\\":[{\\\"line_seq\\\":4,\\\"line_print_name\\\":\\\"线路4\\\",\\\"line_print_sort\\\":4},{\\\"line_seq\\\":6,\\\"line_type\\\":1,\\\"line_print_name\\\":\\\"线路6\\\",\\\"line_print_sort\\\":6}]}\",\"stage\":\"1723237292976433\",\"mix_token\":\"AraDw8g0Sff9zoAQaq40fg==\",\"mix_ver\":\"1723237302202606\"}},{\"a_encoderType\":40,\"a_rate\":128,\"a_stream_name\":\"xa_1538441763_1538441763_0_0_0\",\"v_encoderType\":100,\"v_rate\":600,\"v_stream_name\":\"xv_1538441763_1538441763_0_1_0\",\"pair\":3,\"extend\":{\"gearInfo\":\"{\\\"name\\\":\\\"流畅\\\",\\\"seq\\\":100,\\\"gear\\\":1}\",\"line_infos\":\"{\\\"line_addrs\\\":[{\\\"line_seq\\\":6,\\\"reason\\\":\\\"cache\\\",\\\"cdn_info\\\":{\\\"url_id\\\":404,\\\"url_type\\\":1,\\\"url\\\":\\\"http://bd-flv.baizhanlive.com/live/1296924150_xv_1538441763_1538441763_0_1_0-1296924150_xa_1538441763_1538441763_0_0_0-0-0-0-0-0-1723237292976433.flv?codec=orig\\\\u0026appid=1296924150\\\\u0026rts_tk=b044cadcfa167aca53ab798d02a79ed6\\\\u0026secret=14c32859e59a3226e70d8fd2807c737c\\\\u0026t=1723330920\\\\u0026stream_key=1296924150_xv_1538441763_1538441763_0_1_0\\\\u0026r_stream_key=1296924150_xa_1538441763_1538441763_0_0_0\\\\u0026channelid=1538441763\\\\u0026line_seq=6\\\\u0026cp_id=4\\\\u0026mtk=1\\\\u0026rts=1\\\\u0026ctx=eyJydHNfYXV0aCI6MiwiZGVmX3Bhc3MiOjF9\\\\u0026r=cache\\\",\\\"is_p2p\\\":1,\\\"json\\\":\\\"{\\\\\\\"audio_only_flag\\\\\\\":\\\\\\\"\\\\\\\\u0026only-audio=1\\\\\\\\u0026audio-only=1\\\\\\\"}\\\",\\\"report_json\\\":\\\"{\\\\\\\"reason\\\\\\\":\\\\\\\"cache\\\\\\\"}\\\"},\\\"weight\\\":100,\\\"stage\\\":\\\"1723237292976433\\\"}],\\\"line_list\\\":[{\\\"line_seq\\\":4,\\\"line_print_name\\\":\\\"线路4\\\",\\\"line_print_sort\\\":4},{\\\"line_seq\\\":6,\\\"line_type\\\":1,\\\"line_print_name\\\":\\\"线路6\\\",\\\"line_print_sort\\\":6}]}\",\"stage\":\"1723237292976433\",\"mix_token\":\"AraDw8g0Sff9zoAQaq40fg==\",\"mix_ver\":\"1723237302202606\"}}],\"extend\":{\"layout\":\"[{\\\"mic\\\":0,\\\"uid\\\":2954945485}]\",\"uid64\":\"2954945485\"},\"liveChannelInfo\":{\"isOfficial\":0,\"avatar\":\"http://downhdlogo.yy.com/hdlogo/640640/640/640/94/2954945485/u2954945485ZNYl1i-.jpg\",\"nick\":{\"2\":\"静静jjang\",\"13\":\"静静jjang\",\"3\":\"静静jjang\",\"4\":\"静静jjang\",\"5\":\"静静jjang\",\"6\":\"静静jjang\",\"10\":\"静静jjang\"}}}"
    },
    "cover": "https://lpfm2-cover-static.myzhiniu.com/cover/934576852/2e558cf4b1044966b4672ca18be95b54.png?x-bce-process=image/resize,m_fill,w_720,h_404/quality,q_80",
    "live_type": 3,
    "third_room_id": "",
    "third_live_type": "",
    "pb_display_type": 1,
    "dislike_info": [
      {
        "dislike_id": 401,
        "extra": "{\"show_text\":\"9124244603|6470214027|已经看过\"}",
        "dislike_reason": "已经看过"
      },
      {
        "dislike_reason": "低质广告",
        "dislike_id": 402,
        "extra": "{\"show_text\":\"9124244603|6470214027|低质广告\"}"
      },
      {
        "dislike_reason": "恐怖恶心",
        "dislike_id": 403,
        "extra": "{\"show_text\":\"9124244603|6470214027|恐怖恶心\"}"
      },
      {
        "extra": "{\"show_text\":\"9124244603|6470214027|屏蔽该主播\"}",
        "dislike_reason": "屏蔽该主播",
        "dislike_id": 404
      }
    ],
    "audience_count": 404,
    "live_status": 1,
    "router_type": "",
    "second_headline": "超萌主播人美歌甜,点我看看吧~",
    "user_info": {
      "portrait": "",
      "live_status": 1,
      "user_id": 6470214027,
      "user_name": "静静jjang"
    },
    "cover_wide": "https://lpfm2-cover-static.myzhiniu.com/cover/934576852/2e558cf4b1044966b4672ca18be95b54.png?x-bce-process=image/resize,m_fill,w_720,h_404/quality,q_80"
  },
  {
    "ala_info": {
      "live_status": 1
    },
    "ala_live_info": {
      "live_status": 1,
      "live_from": 0,
      "third_room_id": "",
      "router_type": "",
      "third_live_type": "",
      "room_id": 9477424228,
      "live_id": 9477424228,
      "live_type": 3
    },
    "id": 6470214027,
    "portrait": "tb.1.248440ba.t0aAXPcWotQjz1XxQtG7ZA",
    "has_concerned": 0
  },
  true
]

直接特征是user_list[].ala(_live)?_info.live_status == 1
但也可能误伤这些主播用户本人发帖如 https://tieba.baidu.com/p/9044981332 https://n0099.net/tbm/v1/client_tester.php?type=replies&tid=9044981332&pn=1&client_version=12.66.1.0

curl -s 'https://n0099.net/tbm/v1/client_tester.php?type=replies&tid=9044981332&pn=1&client_version=12.66.1.0' \
| jq '.user_list[] | [(.level_id), (.ala_info | .live_status), (.ala_live_info | .live_status)]'
[
  1,
  1,
  1
]

但在其本人发帖中level_id != 0所以&&两个特征应该足够了

@n0099
Copy link
Author

n0099 commented Aug 9, 2024

ala直播(疑似直接复用某早已倒闭的时任百度内部多个直播app竞品的遗留字段)

git clone --branch tbclient https://github.com/n0099/tbclient.protobuf
grep -r live_status tbclient.protobuf/proto
proto/AlaLiveInfo.proto:    int32 live_status = 18;
proto/AlaUserInfo.proto:    uint32 live_status = 8;

https://github.com/n0099/tbclient.protobuf/blame/12.51.7.1/proto/AlaLiveInfo.proto#L31
https://github.com/n0099/tbclient.protobuf/blame/12.51.7.1/proto/AlaUserInfo.proto#L13
17年的8.x时就有后来灭亡的 https://baike.baidu.com/item/ALa直播/19867707

@n0099
Copy link
Author

n0099 commented Aug 9, 2024

tl;dr: 这就像是不久前的 #210 (comment)
这应该也部分地解释了为什么我这收录了24429个不存在于任何帖子发帖人主题帖最后回复人 百度uid中的孤立用户

SELECT COUNT(*) FROM tbmc_user u
WHERE NOT EXISTS(SELECT 1 FROM "tbmc_latestReplier" WHERE uid = u.uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f17019292_reply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f17019292_subReply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f17019292_thread" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f19871743_reply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f19871743_subReply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f19871743_thread" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f2265748_reply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f2265748_subReply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f2265748_thread" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f228500_reply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f228500_subReply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f228500_thread" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f23546288_reply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f23546288_subReply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f23546288_thread" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f25459979_reply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f25459979_subReply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f25459979_thread" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f26066262_reply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f26066262_subReply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f26066262_thread" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f27278534_reply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f27278534_subReply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f27278534_thread" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f27497591_reply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f27497591_subReply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f27497591_thread" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f27546680_reply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f27546680_subReply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f27546680_thread" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f3255599_reply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f3255599_subReply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f3255599_thread" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f4734432_reply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f4734432_subReply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f4734432_thread" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f6087183_reply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f6087183_subReply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f6087183_thread" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f78579_reply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f78579_subReply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f78579_thread" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f898666_reply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f898666_subReply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f898666_thread" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f97650_reply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f97650_subReply" WHERE "authorUid" = uid)
    AND NOT EXISTS(SELECT 1 FROM "tbmc_f97650_thread" WHERE "authorUid" = uid);

n0099 added a commit to n0099/open-tbm that referenced this issue Aug 10, 2024
…` following b372f22 to fix lumina37/aiotieba#216 (comment) @ derived classes of `BaseCrawler<,>`

* fix lumina37/aiotieba#216 (comment) @ `ReplyCrawlFacade.OnPostParse()`
@ crawler

* update path of protoBuf codegen files @ TbClient.csproj
@ tbclient
@ c#

$ git submodule update --remote
n0099 added a commit to n0099/open-tbm that referenced this issue Aug 10, 2024
…` following b372f22 to fix lumina37/aiotieba#216 (comment) @ derived classes of `BaseCrawler<,>`

* fix lumina37/aiotieba#216 (comment) @ `ReplyCrawlFacade.OnPostParse()`
@ crawler

* update path of protoBuf codegen files @ TbClient.csproj
@ tbclient
@ c#

$ git submodule update --remote
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discussion discussion
Projects
None yet
Development

No branches or pull requests

2 participants