Control Any Computer Using LLMs
Updated Dec 13, 2024 - Python
Convert different model APIs into the OpenAI API format out of the box.
"Improving Mathematical Reasoning with Process Supervision" by OpenAI
Visual Instruction-guided Explainable Metric. Code for "Towards Explainable Metrics for Conditional Image Synthesis Evaluation" (ACL 2024 main)
This is a tool that uses GPT-4 Vision to operate your computer
This repository offers a Python framework for a retrieval-augmented generation (RAG) pipeline using text and images from MHTML documents, leveraging Azure AI and OpenAI services. It includes ingestion and enrichment flows, a RAG with Vision pipeline, and evaluation tools.
Capture images with HoloLens and receive descriptive responses from OpenAI's GPT-4V(ision).
Your own personal Ruskin.
Digital Artificial Intelligence Agent
VisionQuery GPT-4v is a tool that combines screenshot-based queries with OpenAI's GPT-4V. It lets users capture their screen, ask questions about it, and receive answers from the model.
A web-based user interface for GPT4All, set up to be hosted on GitHub Pages so that users can interact with the model through a browser. It uses Flask for the backend and HTML/CSS/JavaScript for the frontend.
An IoT-based construction site inspector built on a Raspberry Pi 4 that autonomously navigates and inspects construction sites. The system uses two DC motors for line-following and a servo-mounted ultrasonic sensor for real-time obstacle detection.
Camera powered with AI on the web