- About
- Features
- Installation
- Usage
- Data Structure
- Privacy and Security
- Troubleshooting
- Contributing
- Roadmap
- License
- Acknowledgments
AI Computer Interaction Logger captures and logs human-computer interactions to build datasets for training multimodal large language models (LLMs). It records screen content, mouse movements, keyboard input, and audio, so that AI systems trained on the data can learn to understand and replicate complex computer operations.
Our goal is to provide researchers and developers with high-quality, diverse datasets that can be used to train AI models for tasks such as:
- Automated software testing
- User experience analysis
- Assistive technologies for computer usage
- AI-driven task automation
- High-frequency screenshot capture
  - Configurable capture rate (default: 10 fps)
  - Supports multiple monitors
  - Images saved as PNG, a lossless compressed format
- Precise mouse movement and click logging
  - Tracks mouse coordinates (x, y)
  - Records left, right, and middle button clicks
  - Captures scroll wheel movements
- Keyboard input recording
  - Logs all key presses and releases
  - Supports special keys and modifiers (Ctrl, Alt, Shift, etc.)
  - Option to mask sensitive data (e.g., passwords)
- Audio environment capture
  - Records system audio and microphone input
  - Configurable sample rate and bit depth
- Efficient data storage and organization
  - Structured file hierarchy for easy navigation
  - Compressed storage formats to minimize disk usage
- Real-time processing and logging
  - Minimal impact on system performance
  - Live monitoring of recording status
- Structured output for easy ML model ingestion
  - CSV logs for events and metadata
  - Synchronized timestamps across all data types
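One way to picture synchronized timestamps is a single clock shared by every recorder, so that mouse, keyboard, and screenshot events can be ordered against each other. The sketch below is illustrative only: `SessionClock`, `write_events`, and the column names `timestamp`, `source`, and `detail` are assumptions made for this example, not the project's actual classes or CSV schema.

```python
import csv
import io
import time


class SessionClock:
    """Hypothetical single time source shared by all recorders in a session."""

    def __init__(self):
        # A monotonic clock never jumps backwards, unlike wall-clock time.
        self._start = time.monotonic()

    def now(self):
        """Return seconds elapsed since the session started."""
        return time.monotonic() - self._start


def write_events(rows, stream):
    """Write (timestamp, source, detail) tuples as CSV with a header row."""
    writer = csv.writer(stream)
    writer.writerow(["timestamp", "source", "detail"])  # assumed column names
    for timestamp, source, detail in rows:
        writer.writerow([f"{timestamp:.6f}", source, detail])


clock = SessionClock()
events = [
    (clock.now(), "mouse", "move x=120 y=340"),
    (clock.now(), "keyboard", "press shift"),
    (clock.now(), "screenshot", "frame captured"),
]

buffer = io.StringIO()
write_events(events, buffer)
print(buffer.getvalue())
```

Because every row is stamped by the same clock, a consumer can sort or join rows from different sources on `timestamp` without correcting for clock drift between recorders.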
- Python 3.8 or higher
- 4GB RAM (minimum)
- 1GB free disk space for the application
- Additional disk space for recorded data (varies based on recording duration and quality)
- Clone the repository:
    git clone https://github.com/yourusername/AIComputerInteractionLogger.git
- Navigate to the project directory:
    cd AIComputerInteractionLogger
- Install the required dependencies:
    pip install -r requirements.txt
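The Python version and disk space prerequisites can be checked from a short script. This is a minimal sketch using only the standard library. `check_prerequisites` is a hypothetical helper, not part of the project, and it does not check RAM because the standard library has no portable call for that.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 8), min_free_bytes=1 * 1024**3, path="."):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []

    # Compare only (major, minor) against the required version.
    if sys.version_info[:2] < tuple(min_python):
        required = ".".join(str(part) for part in min_python)
        problems.append(f"Python {required} or higher is required")

    # Free space on the drive that contains `path`.
    free_bytes = shutil.disk_usage(path).free
    if free_bytes < min_free_bytes:
        problems.append(
            f"Only {free_bytes} bytes free at {path!r}, need {min_free_bytes}"
        )

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

The default of 1 GiB covers only the application itself. Recorded data needs additional space on top of that.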
To start recording computer interactions:
from src.recorder import DatasetRecorder

# Create a recorder with a screenshot frequency of 5
recorder = DatasetRecorder(screenshot_freq=5)
recorder.start_recording(duration=60)  # Record for 60 seconds
You can customize various aspects of the recording process:
recorder = DatasetRecorder(
    base_output_dir="custom_dataset",
    screenshot_freq=10,
    audio_channels=2,
    audio_samplerate=48000,
)
recorder.start_recording(duration=300)  # Record for 5 minutes
For more detailed usage instructions, please refer to our Usage Guide.
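Before a long recording it can help to estimate how much disk space these settings will use. The sketch below is back-of-the-envelope arithmetic, not a project API. It assumes `screenshot_freq` means screenshots per second, that audio is stored as uncompressed 16-bit PCM (2 bytes per sample), and that the average screenshot size is a number you measure yourself. The 500,000 bytes used here is only a placeholder.

```python
def estimate_session_size(
    duration_s,
    screenshot_freq,
    avg_screenshot_bytes,
    audio_samplerate,
    audio_channels,
    bytes_per_sample=2,  # assumption: 16-bit PCM audio
):
    """Estimate screenshot count and on-disk size for one recording session."""
    screenshot_count = int(duration_s * screenshot_freq)
    screenshot_bytes = screenshot_count * avg_screenshot_bytes
    # Uncompressed audio: seconds * samples/second * channels * bytes/sample
    audio_bytes = int(duration_s * audio_samplerate * audio_channels * bytes_per_sample)
    return {
        "screenshot_count": screenshot_count,
        "screenshot_bytes": screenshot_bytes,
        "audio_bytes": audio_bytes,
        "total_bytes": screenshot_bytes + audio_bytes,
    }


# Settings from the 5-minute example, with a placeholder screenshot size.
estimate = estimate_session_size(
    duration_s=300,
    screenshot_freq=10,
    avg_screenshot_bytes=500_000,
    audio_samplerate=48000,
    audio_channels=2,
)
for name, value in estimate.items():
    print(f"{name}: {value:,}")
```

Under these assumptions the 5-minute session produces 3,000 screenshots and 57,600,000 bytes of audio, so screenshots dominate the total. Lowering the screenshot frequency is the most effective way to reduce disk usage.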
Recorded data is organized in the following structure:
dataset/
└── session_YYYYMMDD_HHMMSS/
    ├── events.csv
    ├── audio.wav
    └── screenshots/
        ├── screenshot_timestamp1.png
        ├── screenshot_timestamp2.png
        └── ...
- events.csv: Contains timestamped logs of mouse and keyboard events
- audio.wav: Audio recording of the session
- screenshots/: Directory containing all captured screenshots
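A session directory laid out this way can be read back with the standard library alone. The sketch below is a hypothetical loader, not part of the project. It assumes `events.csv` starts with a header row and is UTF-8 encoded, and it makes no assumption about the column names or about the timestamp format used in screenshot filenames.

```python
import csv
from datetime import datetime
from pathlib import Path


def parse_session_start(session_dir_name):
    """Turn a name like 'session_20240131_235959' into a datetime."""
    return datetime.strptime(session_dir_name, "session_%Y%m%d_%H%M%S")


def load_session(session_dir):
    """Collect the contents of one session directory into a dictionary."""
    session_dir = Path(session_dir)

    # Each CSV row becomes a dict keyed by whatever the header row contains.
    with open(session_dir / "events.csv", newline="", encoding="utf-8") as handle:
        events = list(csv.DictReader(handle))

    # Sorting by filename gives a stable order for the screenshot files.
    screenshots = sorted((session_dir / "screenshots").glob("*.png"))

    return {
        "started_at": parse_session_start(session_dir.name),
        "events": events,
        "screenshots": screenshots,
        "audio": session_dir / "audio.wav",
    }
```

For example, `load_session("dataset/session_20240131_235959")` would return the session start time, the event rows, the screenshot paths, and the path to the audio file, provided that directory exists.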
- All data is stored locally on your machine
- No data is transmitted over the network
- Consider implementing additional encryption for sensitive data
- Be cautious when recording in environments with confidential information
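One precaution before sharing a recorded session is to redact typed characters from the event log. The sketch below is a post-processing step written for illustration and is separate from the recorder's own masking option. The `key` field name is an assumption about how events are represented, so adjust it to match the real `events.csv` columns.

```python
def mask_typed_characters(events, key_field="key", mask="*"):
    """Return copies of the events with single typed characters replaced by `mask`.

    Named keys such as 'shift' or 'enter' are kept, so the structure of the
    interaction stays visible while the typed text itself is hidden.
    """
    masked = []
    for event in events:
        event = dict(event)  # copy, so the caller's data is not modified
        value = event.get(key_field)
        if isinstance(value, str) and len(value) == 1 and value.isprintable():
            event[key_field] = mask
        masked.append(event)
    return masked


raw_events = [
    {"key": "p"},
    {"key": "shift"},
    {"key": "1"},
    {"x": "10", "y": "20"},  # mouse event without a key field
]
print(mask_typed_characters(raw_events))
```

Masking the log does not remove sensitive content from screenshots or audio, which still need to be reviewed separately.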
Common issues and their solutions:
- Recording not starting: Ensure you have the necessary permissions for screen capture and audio recording.
- High CPU usage: Try lowering the screenshot frequency or reducing the number of monitored events.
- Missing events in CSV: Check if any antivirus software is blocking the event hooks.
For more troubleshooting tips, see our FAQ.
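When events appear to be missing from the CSV, a quick first check is to look for unusually long gaps between consecutive timestamps. The helper below is a hypothetical diagnostic, not part of the project, and it assumes timestamps are numbers, or numeric strings, expressed in seconds. A gap can also simply mean the user was idle, so treat the result as a hint about where to look.

```python
def find_gaps(timestamps, max_gap_s=1.0):
    """Return (start, end) pairs where consecutive timestamps are more than max_gap_s apart."""
    ordered = sorted(float(t) for t in timestamps)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier > max_gap_s
    ]


# Example with one 2.6-second hole between 0.4 s and 3.0 s.
print(find_gaps([0.0, 0.2, 0.4, 3.0, 3.1], max_gap_s=1.0))
```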
We welcome contributions to the AI Computer Interaction Logger! Please see our Contributing Guidelines for more details on how to get started.
To report bugs or request features, please open an issue on our GitHub Issues page.
Future development plans include:
- Support for video capture of specific screen regions
- Integration with popular machine learning frameworks
- Web browser extension for capturing in-browser events
- Multi-language support for broader accessibility
This project is licensed under the MIT License - see the LICENSE file for details.
- The open-source AI community, for inspiration on multimodal AI systems
- The open-source community, for the tools and libraries used in this project