Feature: Can the word cloud exclude certain users? #77

Open
Ohdmire opened this issue Nov 29, 2022 · 17 comments
Labels
enhancement New feature or request

Comments

@Ohdmire
Ohdmire commented Nov 29, 2022

Several bots in the group send a lot of repetitive messages, and those messages make up a large share of the total.

@he0119
Owner

he0119 commented Nov 29, 2022

Not at the moment, but it looks like this feature is worth having. I'll find some time to add it later.

he0119 added the enhancement (New feature or request) label on Nov 29, 2022
@MSDNicrosoft
Contributor

MSDNicrosoft commented Jan 24, 2023

Surprise, it's me again (

As with the other issue, I've implemented this one too.

You can refer to my implementation below:

msg_records = await get_messages_plain_text(
    user_ids=[user_id] if user_id else None,
    group_ids=[group_id],
    exclude_user_ids=[bot_id, *[str(user) for user in config.wordcloud_ignore_users]],
    time_start=start_time.astimezone(ZoneInfo("UTC")),
    time_stop=stop_time.astimezone(ZoneInfo("UTC")),
)

The key part is the fourth line:

exclude_user_ids = [bot_id, *[str(user) for user in config.wordcloud_ignore_users]]

# Note: the variable bot_id is also of type str

config.wordcloud_ignore_users is a configuration option you can implement yourself.
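One way the option and the exclusion list might fit together is sketched below. `WordcloudConfig` and `build_exclude_user_ids` are hypothetical names used only for illustration, and a plain dataclass stands in for whatever config class the plugin actually uses.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class WordcloudConfig:
    """Hypothetical stand-in for the plugin config; not the plugin's real class."""

    # IDs may be written as int or str in a config file, so accept both.
    wordcloud_ignore_users: list[int | str] = field(default_factory=list)


def build_exclude_user_ids(bot_id: str, config: WordcloudConfig) -> list[str]:
    """Return the bot's own ID plus every ignored user, as strings, without duplicates."""
    ids = [bot_id, *(str(user) for user in config.wordcloud_ignore_users)]
    # dict.fromkeys keeps the first occurrence of each ID and preserves order.
    return list(dict.fromkeys(ids))


config = WordcloudConfig(wordcloud_ignore_users=[10001, "10002", 10001])
print(build_exclude_user_ids("12345", config))  # ['12345', '10001', '10002']
```

Converting every ID to `str` up front keeps the list comparable with `bot_id`, which is already a string.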

@he0119
Owner

he0119 commented Jan 24, 2023

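I'm wondering whether this needs a database so that it can be configured per group. I was busy writing ob12 support earlier, so I never got around to this. Now that datastore supports migration scripts, changing the database has become much easier.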

@MSDNicrosoft
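Contributor

MSDNicrosoft commented Jan 24, 2023

> I'm wondering whether this needs a database so that it can be configured per group. I was busy writing ob12 support earlier, so I never got around to this. Now that datastore supports migration scripts, changing the database has become much easier.

A database may make the configuration harder for users to edit. My suggestion is to store this part of the configuration in a serializable file format such as JSON.

A personal aside:

Files like this need to be read and written safely. I recommend using threading.Lock() for this, or alternatively a queue.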
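A minimal sketch of the lock-guarded approach, using only the standard library. `JsonConfigStore` is a hypothetical helper, not something the plugin provides; the temp-file-then-replace step is an extra assumption added so that a crash mid-write cannot leave a truncated file.

```python
import json
import os
import tempfile
import threading
from pathlib import Path


class JsonConfigStore:
    """Hypothetical helper: serialises access to one JSON file within a process."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self._lock = threading.Lock()

    def load(self) -> dict:
        with self._lock:
            if not self._path.exists():
                return {}
            return json.loads(self._path.read_text(encoding="utf-8"))

    def save(self, data: dict) -> None:
        with self._lock:
            # Write to a temp file in the same directory, then swap it in,
            # so readers never see a half-written config.
            fd, tmp_name = tempfile.mkstemp(dir=self._path.parent, suffix=".tmp")
            try:
                with os.fdopen(fd, "w", encoding="utf-8") as tmp:
                    json.dump(data, tmp, ensure_ascii=False, indent=2)
                os.replace(tmp_name, self._path)
            except BaseException:
                os.unlink(tmp_name)
                raise


store = JsonConfigStore(Path(tempfile.mkdtemp()) / "wordcloud.json")
store.save({"ignore_users": ["10001"]})
print(store.load())  # {'ignore_users': ['10001']}
```

Note that `threading.Lock` only coordinates threads inside one process. In coroutine-based code an `asyncio.Lock` would play the same role.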

@he0119
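Owner

he0119 commented Jan 24, 2023

> A database may make the configuration harder for users to edit. My suggestion is to store this part of the configuration in a serializable file format such as JSON.

Configuration would definitely be done through commands, just like the existing daily scheduled-send setting, so it shouldn't be much of a concern. Personally, I've grown to dislike using files; a database is more comfortable than files for all kinds of reads and writes.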

@MSDNicrosoft
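Contributor

> Configuration would definitely be done through commands, just like the existing daily scheduled-send setting, so it shouldn't be much of a concern. Personally, I've grown to dislike using files; a database is more comfortable than files for all kinds of reads and writes.

Even so, I'd still like to say this, though how it's ultimately implemented is entirely up to you, and you're free not to take my suggestion:

There are bound to be some users who prefer editing a configuration file by hand (me, for example).

Users may also want to export their configuration, and so on.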

@he0119
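Owner

he0119 commented Jan 24, 2023

> There are bound to be some users who prefer editing a configuration file by hand (me, for example).
>
> Users may also want to export their configuration, and so on.

Hahaha, I just thought of a way: the word cloud plugin could get an nb-cli script that supports importing and exporting the configuration. That feels like the perfect solution. It's worth thinking about which commands would be most useful.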

@MSDNicrosoft
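Contributor

MSDNicrosoft commented Jan 24, 2023

> Hahaha, I just thought of a way: the word cloud plugin could get an nb-cli script that supports importing and exporting the configuration. That feels like the perfect solution. It's worth thinking about which commands would be most useful.

Sounds good to me.

I also have a question, and I'm not sure whether it's a datastore or a chatrecorder problem:

When the protocol client (go-cqhttp, for example) keeps running while NoneBot is disconnected, and NoneBot is started again after a fairly long time, the protocol client reports a large backlog of chat records to NoneBot, which produces the following error: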

[2023-01-23 20:26:40] [ ERROR ]    nonebot     | Error when running EventPostProcessors
Traceback (most recent call last):
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\engine\base.py", line 1900, in _execute_context
    self.dialect.do_execute(
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\engine\default.py", line 736, in do_execute
    cursor.execute(statement, parameters)
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\dialects\sqlite\aiosqlite.py", line 100, in execute
    self._adapt_connection._handle_exception(error)
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\dialects\sqlite\aiosqlite.py", line 228, in _handle_exception
    raise error
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\dialects\sqlite\aiosqlite.py", line 82, in execute
    self.await_(_cursor.execute(operation, parameters))
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\util\_concurrency_py3k.py", line 68, in await_only
    return current.driver.switch(awaitable)
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\util\_concurrency_py3k.py", line 121, in greenlet_spawn
    value = await result
  File "D:\Path\Python311\Lib\site-packages\aiosqlite\cursor.py", line 37, in execute
    await self._execute(self._cursor.execute, sql, parameters)
  File "D:\Path\Python311\Lib\site-packages\aiosqlite\cursor.py", line 31, in _execute
    return await self._conn._execute(fn, *args, **kwargs)
  File "D:\Path\Python311\Lib\site-packages\aiosqlite\core.py", line 137, in _execute
    return await future
  File "D:\Path\Python311\Lib\site-packages\aiosqlite\core.py", line 110, in run
    result = function()
> sqlite3.OperationalError: database is locked

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "[redacted for privacy]\bot.py", line 59, in <module>
    nonebot.run(app="__mp_main__:app")
  File "D:\Path\Python311\Lib\site-packages\nonebot\__init__.py", line 273, in run
    get_driver().run(*args, **kwargs)
  File "D:\Path\Python311\Lib\site-packages\nonebot\drivers\fastapi.py", line 187, in run
    uvicorn.run(
  File "D:\Path\Python311\Lib\site-packages\uvicorn\main.py", line 569, in run
    server.run()
  File "D:\Path\Python311\Lib\site-packages\uvicorn\server.py", line 60, in run
    return asyncio.run(self.serve(sockets=sockets))
  File "D:\Path\Python311\Lib\asyncio\runners.py", line 190, in run
    return runner.run(main)
  File "D:\Path\Python311\Lib\asyncio\runners.py", line 118, in run
    return self._loop.run_until_complete(task)
  File "D:\Path\Python311\Lib\asyncio\base_events.py", line 640, in run_until_complete
    self.run_forever()
  File "D:\Path\Python311\Lib\asyncio\windows_events.py", line 321, in run_forever
    super().run_forever()
  File "D:\Path\Python311\Lib\asyncio\base_events.py", line 607, in run_forever
    self._run_once()
  File "D:\Path\Python311\Lib\asyncio\base_events.py", line 1919, in _run_once
    handle._run()
  File "D:\Path\Python311\Lib\asyncio\events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "D:\Path\Python311\Lib\site-packages\nonebot\adapters\onebot\v11\bot.py", line 194, in handle_event
    await handle_event(self, event)
>> File "D:\Path\Python311\Lib\site-packages\nonebot\message.py", line 333, in handle_event
    await asyncio.gather(*coros)
  File "D:\Path\Python311\Lib\site-packages\nonebot\utils.py", line 157, in run_coro_with_catch
    return await coro
  File "D:\Path\Python311\Lib\site-packages\nonebot\dependencies\__init__.py", line 108, in __call__
    return await cast(Callable[..., Awaitable[R]], self.call)(**values)
  File "D:\Path\Python311\Lib\site-packages\nonebot_plugin_chatrecorder\__init__.py", line 48, in record_recv_msg_v11
    await session.commit()
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\ext\asyncio\session.py", line 583, in commit
    return await greenlet_spawn(self.sync_session.commit)
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\util\_concurrency_py3k.py", line 128, in greenlet_spawn
    result = context.switch(value)
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\orm\session.py", line 1451, in commit
    self._transaction.commit(_to_root=self.future)
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\orm\session.py", line 829, in commit
    self._prepare_impl()
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\orm\session.py", line 808, in _prepare_impl
    self.session.flush()
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\orm\session.py", line 3386, in flush
    self._flush(objects)
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\orm\session.py", line 3525, in _flush
    with util.safe_reraise():
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\util\langhelpers.py", line 70, in __exit__
    compat.raise_(
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\util\compat.py", line 208, in raise_
    raise exception
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\orm\session.py", line 3486, in _flush
    flush_context.execute()
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\orm\unitofwork.py", line 456, in execute
    rec.execute(self)
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\orm\unitofwork.py", line 630, in execute
    util.preloaded.orm_persistence.save_obj(
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\orm\persistence.py", line 245, in save_obj
    _emit_insert_statements(
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\orm\persistence.py", line 1238, in _emit_insert_statements
    result = connection._execute_20(
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\engine\base.py", line 1705, in _execute_20
    return meth(self, args_10style, kwargs_10style, execution_options)
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\sql\elements.py", line 333, in _execute_on_connection
    return connection._execute_clauseelement(
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\engine\base.py", line 1572, in _execute_clauseelement
    ret = self._execute_context(
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\engine\base.py", line 1943, in _execute_context
    self._handle_dbapi_exception(
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\engine\base.py", line 2124, in _handle_dbapi_exception
    util.raise_(
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\util\compat.py", line 208, in raise_
    raise exception
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\engine\base.py", line 1900, in _execute_context
    self.dialect.do_execute(
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\engine\default.py", line 736, in do_execute
    cursor.execute(statement, parameters)
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\dialects\sqlite\aiosqlite.py", line 100, in execute
    self._adapt_connection._handle_exception(error)
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\dialects\sqlite\aiosqlite.py", line 228, in _handle_exception
    raise error
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\dialects\sqlite\aiosqlite.py", line 82, in execute
    self.await_(_cursor.execute(operation, parameters))
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\util\_concurrency_py3k.py", line 68, in await_only
    return current.driver.switch(awaitable)
  File "D:\Path\Python311\Lib\site-packages\sqlalchemy\util\_concurrency_py3k.py", line 121, in greenlet_spawn
    value = await result
  File "D:\Path\Python311\Lib\site-packages\aiosqlite\cursor.py", line 37, in execute
    await self._execute(self._cursor.execute, sql, parameters)
  File "D:\Path\Python311\Lib\site-packages\aiosqlite\cursor.py", line 31, in _execute
    return await self._conn._execute(fn, *args, **kwargs)
  File "D:\Path\Python311\Lib\site-packages\aiosqlite\core.py", line 137, in _execute
    return await future
  File "D:\Path\Python311\Lib\site-packages\aiosqlite\core.py", line 110, in run
    result = function()
> sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) database is locked
[SQL: INSERT INTO nonebot_plugin_chatrecorder_messagerecord (bot_type, bot_id, platform, time, type, detail_type, message_id, message, plain_text, user_id, group_id, guild_id, channel_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)]
[parameters: ('OneBot V11', '[redacted for privacy]', 'qq', '2023-01-23 08:07:22.000000', 'message', 'group', '[redacted for privacy]', '[{"type": "image", "data": {"file": "314db052c1847c0b51794ce3eff22482.image", "subType": "0", "url": "[redacted for privacy]"}}]', '', '[redacted for privacy]', '[redacted for privacy]', None, None)]
(Background on this error at: https://sqlalche.me/e/14/e3q8)

The lines I consider most important are highlighted in blue.

My guess is that the database queue filled up.
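For reference, SQLite reports `database is locked` when a statement cannot obtain the lock it needs before the connection's busy timeout runs out, for instance when another connection is holding the write lock. The sketch below reproduces that with the standard-library `sqlite3` module only. It is not chatrecorder's or datastore's code, and whether the same mechanism explains the traceback above is an assumption.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None lets us issue BEGIN/COMMIT ourselves.
writer = sqlite3.connect(db_path, timeout=0, isolation_level=None)
writer.execute("CREATE TABLE record (text TEXT)")

# timeout=0 means "do not wait for the lock at all".
other = sqlite3.connect(db_path, timeout=0)

writer.execute("BEGIN IMMEDIATE")  # take the write lock and hold it
writer.execute("INSERT INTO record VALUES ('first')")

try:
    other.execute("INSERT INTO record VALUES ('second')")
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)
    other.rollback()  # end the implicit transaction opened for the INSERT
print(error)  # database is locked

writer.execute("COMMIT")  # release the write lock
other.execute("INSERT INTO record VALUES ('second')")
other.commit()
count = other.execute("SELECT COUNT(*) FROM record").fetchone()[0]
print(count)  # 2
```

With a non-zero `timeout`, the second connection would wait for the lock instead of failing immediately, which is why a burst of concurrent writes can surface this error only under load.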

@he0119

This comment was marked as off-topic.

@MSDNicrosoft

This comment was marked as off-topic.

@he0119

This comment was marked as off-topic.

@wei-z-git
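Contributor

wei-z-git commented Mar 13, 2023

It's not only bot users that need to be excluded; some emoticons, such as /汪汪 (woof) and /斜眼 (side-eye), seem to need excluding as well.

That said, if exclusion is based on a bot's user_id, per-group settings seem unnecessary, since a bot should be excluded wherever it appears.

> I'm wondering whether this needs a database so that it can be configured per group. I was busy writing ob12 support earlier, so I never got around to this. Now that datastore supports migration scripts, changing the database has become much easier.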

@he0119
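Owner

he0119 commented Mar 13, 2023

> That said, if exclusion is based on a bot's user_id, per-group settings seem unnecessary, since a bot should be excluded wherever it appears.

That's easy: when group_id is empty, the user is excluded in all groups.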
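The rule could be sketched as follows. `ExcludeRule` and `excluded_users` are hypothetical names, and the real feature might store these records in a database table rather than an in-memory list.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ExcludeRule:
    """Hypothetical record: one excluded user, optionally limited to one group."""

    user_id: str
    group_id: str | None = None  # None means "exclude in every group"


def excluded_users(rules: list[ExcludeRule], group_id: str) -> set[str]:
    """Collect users excluded globally or specifically in `group_id`."""
    return {
        rule.user_id
        for rule in rules
        if rule.group_id is None or rule.group_id == group_id
    }


rules = [
    ExcludeRule(user_id="10001"),                  # a bot, excluded everywhere
    ExcludeRule(user_id="20002", group_id="555"),  # excluded only in group 555
]
print(sorted(excluded_users(rules, "555")))  # ['10001', '20002']
print(sorted(excluded_users(rules, "777")))  # ['10001']
```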

@wei-z-git
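Contributor

Also, I just looked at how jieba handles stopwords and roughly understand it now. Here are my half-baked thoughts on exclusion:

  • Bot exclusions can be written in the configuration file ahead of time, since bot IDs are few, predictable, and rarely change. (I don't think there is a scenario where a bot needs to be excluded in group A but included in group B.)
  • Keyword exclusions should be added via commands, since they are unpredictable and change often, so storing them in the database makes sense.

As for importing and exporting the configuration, I wonder whether this would work: 1. the user sends an export command (/导出配置); 2. the bot posts the JSON in the QQ chat; 3. the user copies it; 4. the user sends an import command (/导入配置) and pastes the JSON. (That way the configuration would be roughly stateless on the server side.)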
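The export and import steps could look roughly like this. The function names and the JSON fields (`version`, `ignore_users`, `stopwords`) are assumptions made for the sketch; the thread does not define a format.

```python
import json


def export_config(ignore_users: list[str], stopwords: list[str]) -> str:
    """Serialise settings to compact JSON that fits in a chat message."""
    payload = {"version": 1, "ignore_users": ignore_users, "stopwords": stopwords}
    return json.dumps(payload, ensure_ascii=False, separators=(",", ":"))


def import_config(text: str) -> dict[str, list[str]]:
    """Parse pasted JSON and reject anything that is not the expected shape."""
    payload = json.loads(text)  # raises a ValueError subclass on invalid JSON
    if not isinstance(payload, dict) or payload.get("version") != 1:
        raise ValueError("unsupported or malformed config")
    result = {}
    for key in ("ignore_users", "stopwords"):
        values = payload.get(key, [])
        if not isinstance(values, list) or not all(isinstance(v, str) for v in values):
            raise ValueError(f"{key} must be a list of strings")
        result[key] = values
    return result


text = export_config(["10001"], ["汪汪"])
print(import_config(text) == {"ignore_users": ["10001"], "stopwords": ["汪汪"]})  # True
```

Because the pasted text comes from a chat message, validating it before use matters more here than it would for a file the bot wrote itself.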

@he0119
Owner

he0119 commented Mar 19, 2023

0.4.8 adds a configuration option for excluding specified users. A more sophisticated version can wait until later (

@HuangArmagh

The scheduled word cloud still counts the excluded users. Is this a bug?

@he0119
Owner

he0119 commented May 7, 2023

> The scheduled word cloud still counts the excluded users. Is this a bug?

It really is. I forgot to apply the exclusion there (

he0119 changed the title from "Can the word cloud exclude certain users?" to "Feature: Can the word cloud exclude certain users?" on Aug 9, 2023