A curated list of my GitHub stars! Generated by starred.
- C
- C#
- C++
- CSS
- Cython
- Dockerfile
- Go
- HCL
- HTML
- Haskell
- Java
- JavaScript
- Jupyter Notebook
- Kotlin
- Lua
- Others
- PHP
- Pascal
- Perl
- Python
- R
- Ruby
- Rust
- SCSS
- Scala
- Shell
- Swift
- Tcl
- TypeScript
- Vue
- XSLT
- JoeDog/siege - Siege is an http load tester and benchmarking utility
- mean00/avidemux2 - Avidemux2, simple video editor
- dericed/american-archive-kaldi - This repo houses open-source models for Kaldi speech-to-text software that have been trained on public media content.
- kaltura/nginx-vod-module - NGINX-based MP4 Repackager
- gpac/gpac - GPAC Ultramedia OSS for Video Streaming & Next-Gen Multimedia Transcoding, Packaging & Delivery
- x42/libtimecode - deal with A/V timecode and framerates
- x42/ltc-tools - tools to deal with linear-timecode (LTC)
- x42/libltc - Linear/Longitudinal Time Code (LTC) Library
- sandflow/ffmpeg-imf - Adds an IMF demuxer to FFMPEG (https://github.com/sandflow/ffmpeg-imf/blob/develop/README-IMF.md)
- ggreer/the_silver_searcher - A code-searching tool similar to ack, but faster.
- sebastiencs/ls-icons - ls command with file icons
- setmind/sacd-ripper - Improved sacd_extract
- FFmpeg/FFmpeg - Mirror of https://git.ffmpeg.org/ffmpeg.git
- SubtitleEdit/subtitleedit - the subtitle editor :)
- mlichtenberg/hocrimagemapper - Tool for visualizing hOCR output from Tesseract (or other OCR engines that support hOCR).
- microsoft/BitNet - Official inference framework for 1-bit LLMs
- intel/neural-speed - An innovative library for efficient LLM inference via low-bit quantization
- google/gemma.cpp - lightweight, standalone C++ inference engine for Google's Gemma models.
- RWKV/rwkv.cpp - INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
- aarnphm/whispercpp - Pybind11 bindings for Whisper.cpp
- ml-explore/mlx - MLX: An array framework for Apple silicon
- ggerganov/llama.cpp - LLM inference in C/C++
- ggerganov/whisper.cpp - Port of OpenAI's Whisper model in C/C++
- Mozilla-Ocho/llamafile - Distribute and run LLMs with a single file.
- shundhammer/qdirstat - QDirStat - Qt-based directory statistics (KDirStat without any KDE - from the original KDirStat author)
- 3ximus/md5-collisions - MD5 collision testing
- IMFTool/IMFTool - A tool for editing IMF CPLs and creating new versions of an existing IMF (Interoperable Master Format) package
- mkrufky/node-dvbtee - MPEG2 transport stream parser for Node.js with support for television broadcast PSIP tables and descriptors
- mkrufky/libdvbtee - dvbtee: a digital television streamer / parser / service information aggregator supporting various interfaces including telnet CLI & http control
- mipops/dvrescue - Archivist-made software that supports data migration from DV tapes into digital files suitable for long-term preservation. Snapshot daily builds are at https://mediaarea.net/download/snapshots/binary/
- MediaArea/RAWcooked - Encodes RAW audio-visual data into the Matroska container (MKV), using the video codec FFV1 for the image and audio codec FLAC for the sound.
- logankilpatrick/gemini-api-quickstart - Get up and running in under 5 minutes with the Google AI Gemini API (in Python)
- timpaul/form-extractor-prototype - A prototype of a tool that generates web forms from document forms
- aravindputrevu/app-search-flask-app - This is an example of a Python Flask app with Elasticsearch/ Elastic App Search with respective Python Clients
- IIIF/cookbook-recipes - For working on the recipes
- explosion/thinc-apple-ops - 🍏 Make Thinc faster on macOS by calling into Apple's native Accelerate library
- nytimes/nginx-vod-module-docker - Docker image for nginx with Kaltura's VoD module used by The New York Times
- MightyMoud/sidekick - Bare metal to production ready in mins; your own fly server on your VPS.
- gotenberg/gotenberg - A developer-friendly API for converting numerous document formats into PDF files, and more!
- ollama/ollama - Get up and running with Llama 3.3, Mistral, Gemma 2, and other large language models.
- mikefarah/yq - yq is a portable command-line YAML, JSON, XML, CSV, TOML and properties processor
- hillu/local-log4j-vuln-scanner - Simple local scanner for vulnerable log4j instances
- anchore/syft - CLI tool and library for generating a Software Bill of Materials from container images and filesystems
- anchore/grype - A vulnerability scanner for container images and filesystems
- johnkerl/miller - Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON
- eyelevelai/groundx-on-prem - A Kubernetes deployable instance of GroundX for document parsing, storage, and search.
- swyxio/ai-notes - notes for software engineers getting up to speed on new AI developments. Serves as datastore for https://latent.space writing, and product brainstorming, but has cleaned up canonical references under
- CatalogueLegacies/antconc.github.io - Computational Analysis of Catalogue Data
- internetarchive/Zeno - State-of-the-art web crawler 🔱
- simonw/tools - Assorted tools
- Unstructured-IO/unstructured - Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
- pdf2htmlEX/pdf2htmlEX - Convert PDF to HTML without losing text or format.
- coolwanglu/pdf2htmlEX - Convert PDF to HTML without losing text or format.
- kmurmur/embARC -
- tesseract-ocr/tessdoc - Tesseract documentation
- GLAM-Workbench/glam-workbench.github.io -
- LibraryOfCongress/embARC - embARC (“metadata embedded for archival content”) manages internal file metadata including embedding and validation. Created by FADGI (Federal Agencies Digital Guidelines Initiative), in conjunction w
- bitmovin/bitmovin-player-web-samples - Showcases built around the Bitmovin Adaptive Streaming Player, demonstrating usage and capabilities of the HTML5 based HLS and MPEG-DASH player, as well as the Flash based Fallback.
- ColorlabMD/DPX_Metadata_Editor - View, Edit and Modify DPX file headers
- bfidatadigipres/bfidatadigipres.github.io -
- bfi-prog-notes/bfi-prog-notes.github.io -
- KBNLresearch/iromlab - Loader software for automated imaging of optical media with Nimbie disc robot
- IIIF-Commons/biiif-cli - A CLI to Build Static IIIF Collections
- TheScienceMuseum/collection-chrome-extension - Museum in a Tab: A Chrome Browser extension showing objects from the Science Museum Group Collection
- krzemienski/awesome-video - A curated list of awesome streaming video tools, frameworks, libraries, and learning resources.
- kba/hocrjs - Working with hOCR in Javascript
- algorythmik/python-hocr - HOCR parsing
- archival-IIIF/archival-iiif.github.io - Website
- facebook/duckling - Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings.
- Stirling-Tools/Stirling-PDF - #1 Locally hosted web application that allows you to perform various operations on PDF files
- kestra-io/kestra - ⚡ Workflow Automation Platform. Orchestrate & Schedule code in any language, run anywhere, 500+ plugins. Alternative to Zapier, Rundeck, Camunda, Airflow...
- Netflix/maestro - Maestro: Netflix’s Workflow Orchestrator
- kermitt2/grisp - Knowledge Base stuff
- kermitt2/grobid - A machine learning software for extracting information from scholarly documents
- kermitt2/entity-fishing - A machine learning tool for fishing entities
- stanfordnlp/CoreNLP - CoreNLP: A Java suite of core NLP tools for tokenization, sentence segmentation, NER, parsing, coreference, sentiment analysis, etc.
- DSRCorporation/imf-conversion - NF IMF media conversion utility that handles flat file creation from a specified CPL within the IMF package
- Netflix/photon - Photon is a Java implementation of the Interoperable Master Format (IMF) standard. IMF is a SMPTE standard whose core constraints are defined in the specification st2067-2:2013
- DSpace/DSpace - (Official) The DSpace digital asset management system that powers your Institutional Repository
- LibraryOfCongress/bagger - The Bagger application packages data files according to the BagIt specification.
- usnationalarchives/File-Analyzer - NARA File Analyzer and Metadata Harvester
- Georgetown-University-Libraries/File-Analyzer - A Data Parsing/Data Manipulation Tool Supporting Digitization Projects and Other Data Analysis Projects
- atduskgreg/opencv-processing - OpenCV for Processing. A creative coding computer vision library based on the official OpenCV Java API
- archivist-liz/jhove - File validation and characterisation.
- apache/incubator-stormcrawler - A scalable, mature and versatile web crawler based on Apache Storm
- jhuckaby/performa-satellite - Remote data collector for Performa.
- edsu/whisper-transcript - A Lit web-component for viewing a Whisper JSON transcript file
- NginxProxyManager/nginx-proxy-manager - Docker container for managing Nginx proxy hosts with a simple, powerful interface
- RahulSChand/gpu_poor - Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
- gchq/CyberChef - The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis
- jhuckaby/performa - A multi-server monitoring system with a web based UI.
- alexpinel/Dot - Text-To-Speech, RAG, and LLMs. All local!
- Mintplex-Labs/anything-llm - The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.
- datasette/datasette-extract - Import unstructured data (text and images) into structured tables
- HumanSignal/label-studio - Label Studio is a multi-type data labeling and annotation tool with standardized output format
- fchollet/ARC-AGI - The Abstraction and Reasoning Corpus
- marco-bertelli/medium-rag-frontend - RAG chatbot boilerplate based on React and TypeScript
- open-webui/open-webui - User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
- louislam/uptime-kuma - A fancy self-hosted monitoring tool
- britishlibrary/peripleo-lanc -
- eslawski/react-iiif-viewer - A React component for displaying high resolution IIIF images with deep zooming capabilities on mobile and desktop.
- appbaseio/reactivesearch - Search UI components for React and Vue
- betagouv/react-elasticsearch - 🛁 React + Elasticsearch - UI components for building data-driven search experiences
- bradtraversy/feedback-app - React feedback app from React course
- varunshenoy/GraphGPT - Extrapolating knowledge graphs from unstructured text using GPT-3 🕵️‍♂️
- digipres/awesome-digital-preservation - Carefully curated list of awesome digital preservation resources.
- thiagopnts/clappr-video360 - 360 video plugin for Clappr
- tjenkinson/clappr-thumbnails-plugin - A plugin for clappr which will display thumbnails when hovering over the scrub bar. Thumbnails can either be individual images or a sprite sheet.
- clappr/clappr - 🎬 An extensible media player for the web.
- transitive-bullshit/ffmpeg-extract-frames - Extracts frames from a video using ffmpeg.
- transitive-bullshit/ffmpeg-generate-video-preview - Generates an attractive image strip or GIF preview from a video.
- cookpete/react-player - A React component for playing a variety of URLs, including file paths, YouTube, Facebook, Twitch, SoundCloud, Streamable, Vimeo, Wistia and DailyMotion
- samvera-labs/ramp - Interactive, IIIF powered audio/video media player React components library. Styleguidist Docs: https://samvera-labs.github.io/ramp/
- digirati-co-uk/canvas-panel - Prototype covering the specification of Canvas Panel, and supporting components for composing bespoke IIIF viewers and lightweight experiences, conforming to the IIIF Presentation 3 specification.
- digirati-co-uk/timeliner - IIIF Timeliner
- amnh-sciviz/collectionscope -
- elastic/app-search-reference-ui-react - A generic UI for use with any App Search Engine
- art-institute-of-chicago/aic-mirador-ui - A Mirador plugin for UI customizations
- glenrobson/SimpleAnnotationServer - A simple IIIF and Mirador compatible Annotation Server
- atomotic/iiif-annotation-studio - Mirador IIIF Viewer packaged as a desktop app with an embedded annotation endpoint
- ProjectMirador/mirador-desktop - A desktop wrapper for Mirador and its environment, allowing use of local images.
- o19s/pdf-discovery-demo - Demonstration of searching PDF document with Solr, Tika, and Tesseract
- mozilla/pdf.js - PDF Reader in JavaScript
- EIDR-ID/reshuffle-prod-runtime - Reshuffle Enterprise Production-Only (no studio sync) Runtime Environment
- mifi/editly - Slick, declarative command line video editing & API
- greenstick/interactor - Front-End Code for Tracking Interactions and Conversions on Websites.
- phivk/nonlinearvideo - Non-Linear Video in HTML5 Workshop
- cpietsch/vikus-viewer - Explore cultural collections along time, texture and themes
- rwhscott/uv-hello-world - Fork of UniversalViewer/uv-hello-world that incorporates the manifest selection functionality from UniversalViewer/examples.
- SatadruBhattacharjee/react-tv-epg - A HTML5 Canvas based EPG(TV Guide) React Component for TV and Set-top box
- TheScienceMuseum/entities-search-engine - Scripts and microservice to feed an ElasticSearch with Wikidata and Inventaire entities, and keep those up-to-date
- europeana/media-player - Media player developed under the Europeana Media Generic Services Project
- mejackreed/mirador-plugin-example -
- ProjectMirador/mirador-annotations - a Mirador 3 plugin that adds annotation creation tools to the user interface
- dbmdz/mirador-textoverlay - Text Overlay plugin for Mirador 3
- ProjectMirador/mirador - An open-source, web-based 'multi-up' viewer that supports zoom-pan-rotate functionality, ability to display/compare simple images, and images with annotations.
- UB-Mannheim/ocr-fileformat - Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)
- internetarchive/bookreader - The Internet Archive BookReader
- aeschylus/IIIFBookReader - A plugin for the Internet Archive BookReader that enables easy book viewing on top of a IIIF-compatible back end.
- meta-llama/llama-recipes - Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting a
- merveenoyan/siglip - Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers integration 🤗
- deepfates/memery - Search over large image datasets with natural language and computer vision!
- microsoft/generative-ai-for-beginners - 21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
- DataTalksClub/llm-zoomcamp - LLM Zoomcamp - a free online course about building a Q&A system
- iyaja/llama-fs - A self-organizing file system with llama 3
- alexfazio/crewAI-quickstart - A collection of notebooks, cookbooks, and recipes showcasing fun and effective ways to use CrewAI's agentic workflow implementations and tools.
- google-gemini/cookbook - Examples and guides for using the Gemini API
- WhisperSpeech/WhisperSpeech - An Open Source text-to-speech system built by inverting Whisper.
- vikhyat/moondream - tiny vision language model
- mlabonne/llm-course - Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
- mshumer/ai-journalist -
- MikeChan-HK/Gemini-agent-example - Example code for building LangChain agents without an OpenAI API key (using Google Gemini). Completely free, unlimited, and open source; run it yourself on a website. Ready to support Ollama.... (Update when i am f
- Jl16ExA/Surya-OCR-Hardware-Benchmarking - Surya-OCR-Hardware-Benchmarking is a repository dedicated to evaluating and analyzing the performance of the Surya OCR model across different hardware configurations. It provides tools and scripts for
- nateraw/openai-vision-api-for-videos - Extract information, summarize, ask questions, and search videos using OpenAI's Vision API 🚀🎦
- poloclub/unitable - UniTable: Towards a Unified Table Foundation Model
- google-research/vision_transformer -
- weaviate/recipes - This repository shares end-to-end notebooks on how to use various Weaviate features and integrations!
- anthropics/anthropic-cookbook - A collection of notebooks/recipes showcasing some fun and effective ways of using Claude.
- LearnToCode180/Entity-Fishing-Tutorial - Entity Linking of text mentions with Wikidata entries using a tool called Entity Fishing.
- yandexdataschool/nlp_course - YSDA course in Natural Language Processing
- NousResearch/Hermes-Function-Calling -
- MahmoudAshraf97/whisper-diarization - Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
- video-db/PromptClip - Instantly create video clips from LLM prompts
- brevdev/notebooks - Collection of notebook guides created by the Brev.dev team!
- aigeek0x0/rag-with-langchain-colbert-and-ragatouille - Build a Streamlit Chatbot using Langchain, ColBERT, Ragatouille, and ChromaDB
- rohan-paul/LLM-FineTuning-Large-Language-Models - LLM (Large Language Model) FineTuning
- snexus/llm-search - Querying local documents, powered by LLM
- distant-viewing/dvt - Distant Viewing Toolkit for the Analysis of Visual Culture
- Macuyiko/royal-navy-ship-identification - This repository contains the source code accompanying the paper "Explainable Deep Learning to Classify Royal Navy Ships"
- Vaibhavs10/how-to-whisper -
- philschmid/document-ai-transformers -
- sanchit-gandhi/whisper-jax - JAX implementation of OpenAI's Whisper model for up to 70x speed-up on TPU.
- NielsRogge/Transformers-Tutorials - This repository contains demos I made with the Transformers library by HuggingFace.
- salesforce/LAVIS - LAVIS - A One-stop Library for Language-Vision Intelligence
- facebookresearch/seamless_communication - Foundational Models for State-of-the-Art Speech and Text Translation
- run-llama/llama-hub - A library of data loaders for LLMs made by the community -- to be used with LlamaIndex and/or LangChain
- sroecker/LLM_AppDev-HandsOn - Repository and hands-on workshop on how to develop applications with local LLMs
- Vaibhavs10/insanely-fast-whisper -
- langchain-ai/langchain - 🦜🔗 Build context-aware reasoning applications
- leandromoreira/digital_video_introduction - A hands-on introduction to video technology: image, video, codec (av1, vp9, h265) and more (ffmpeg encoding). Translations: 🇺🇸 🇨🇳 🇯🇵 🇮🇹 🇰🇷 🇷🇺 🇧🇷 🇪🇸
- GLAM-Workbench/facial-detection -
- shawngraham/Identifying-Similar-Images-with-TensorFlow-notebooks -
- cpietsch/vikus-viewer-script - Scripts to generate sprite sheets and textures for VIKUS Viewer
- TheScienceMuseum/heritage-connector - Heritage Connector: Transforming text into data to extract meaning and make connections
- bfidatadigipres/bfi-iiif-logging - Solution for BFI National archive Universal Viewer deployment, to log users in, track their interactions with the IIIF resources in UV, and output to a log.
- digirati-co-uk/bfi-discovery - Prototyping, discovery and documentation for the BFI viewer project.
- Kong/kong - 🦍 The Cloud-Native API Gateway and AI Gateway.
- fr0gger/Awesome-GPT-Agents - A curated list of GPT agents for cybersecurity
- meta-llama/llama-stack-apps - Agentic components of the Llama Stack APIs
- arpitingle/gpu-alpha - High Quality Resources on GPU Programming/Architecture
- BradyFU/Video-MME - ✨✨Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
- RManLuo/Awesome-LLM-KG - Awesome papers about unifying LLMs and KGs
- hrishioa/tough-llm-tests - Some tough questions to test new models.
- watson/awesome-computer-history - An Awesome List of computer history videos, documentaries and related folklore
- archivematica/Issues - Issues repository for the Archivematica project
- LvanWissen/starred -
- linexjlin/GPTs - leaked prompts of GPTs
- 1mrat/gpt-stats - Stats for Custom Chat GPTs not created by OpenAI
- travistangvh/ChatGPT-Data-Science-Prompts - A repository of 60 useful data science prompts for ChatGPT
- brillout/awesome-react-components - Curated List of React Components & Libraries.
- kdeldycke/awesome-falsehood - 😱 Falsehoods Programmers Believe in
- transitive-bullshit/awesome-ffmpeg - 👻 A curated list of awesome FFmpeg resources.
- alebcay/awesome-shell - A curated list of awesome command-line frameworks, toolkits, guides and gizmos. Inspired by awesome-php.
- herrbischoff/awesome-command-line-apps - 🐚 Use your terminal shell to do awesome things.
- gchq/BoilingFrogs - GCHQ's internal Boiling Frogs research paper on software development and organisational change in the face of disruption #boilingfrogs
- nationalarchives/tdr-dev-documentation - Documentation for developers for the TDR project
- IIIF/iiif-av - The International Image Interoperability Framework (IIIF) Audio/Visual (A/V) Technical Specification Group aims to extend to A/V the benefits of interoperability and the growing ecosystem of clients a
- keshavbhatt/WonderWall-Packaging - Wonderwall Wallpaper manager, releases for Linux and Windows 10
- usnationalarchives/digital-preservation - NARA digital preservation file format risk analysis and preservation plans
- EIDR-ID/php - EIDR applications and source code examples written in PHP.
- EIDR-ID/python - EIDR applications and source code examples written in Python.
- ProjectMirador/mirador-awesome - An awesome list for Mirador's projects and plugins.
- IIIF/awesome-iiif - Awesome IIIF-related resources
- ncarboni/awesome-GLAM-semweb - A curated list of various semantic web and linked data resources for heritage, humanities and art history practitioners.
- kba/awesome-ocr - Links to awesome OCR projects
- MeMAD-project/mmca - MeMAD multimodal content analysis and machine translation: collection of tools and libraries
- MeMAD-project/interchange-formats - MeMAD Metadata Interchange Formats
- bnb/awesome-hyper - 🖥 Delightful Hyper plugins, themes, and resources
- exponential-decay/pronom-archive-and-skeleton-test-suite - Release repository for The Skeleton Test Suite. Contains an Archive of PRONOM, and skeleton files for testing DROID from The National Archives, UK.
- ross-spencer/brainscape-digital-preservation - An open source set of decks for learning about digital preservation.
- passbolt/passbolt_api - Passbolt Community Edition (CE) API. The JSON API for the open source password manager for teams!
- exponential-decay/the-format-registry - A mirror of the PRONOM file format registry in Linked Open Data format. The Format Registry is a linked (open) data file format repository. The work is the result of a four-day hack during November 20
- double-commander/doublecmd - Double commander, A twin panel (side by side) cross platform open source file manager
- get-iplayer/get_iplayer - A utility for downloading TV and radio programmes from BBC iPlayer and BBC Sounds
- Lightricks/ComfyUI-LTXVideo - LTX-Video Support for ComfyUI
- Lightricks/LTX-Video - Official repository for LTX-Video
- uscdr-mediapres/sous-chef - An easy-to-use application that encodes DPX sequences into MKV video streams, primarily for archival storage
- slhck/ffmpeg-normalize - Audio Normalization for Python/ffmpeg
- microsoft/markitdown - Python tool for converting files and office documents to Markdown.
- DS4SD/docling - Get your documents ready for gen AI
- katanaml/sparrow - Data processing with ML, LLM and Vision LLM
- DAI-Lab/RivaGAN - Robust video watermarking with non-differentiable adversaries.
- NatLibFi/Annif - Annif is a multi-algorithm automated subject indexing tool for libraries, archives and museums.
- chartbeat-labs/textacy - NLP, before and after spaCy
- boudinfl/pke - Python Keyphrase Extraction module
- LIAAD/yake - Single-document unsupervised keyword extraction
- Cinnamon/kotaemon - An open-source RAG-based tool for chatting with your documents.
- ServerlessLLM/ServerlessLLM - Serverless LLM Serving for Everyone.
- ucbepic/docetl - A system for agentic LLM-powered data processing and ETL
- usefulsensors/moonshine - Fast and accurate automatic speech recognition (ASR) for edge devices
- stanford-oval/storm - An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
- getomni-ai/zerox - PDF to Markdown with vision models
- gsu-library/whisper-scribe - An audio/video transcriber with diarization and transcription editing.
- JosefAlbers/whisper-turbo-mlx - Blazing fast whisper turbo for ASR (speech-to-text) tasks
- tenable/pyTenable - Python Library for interfacing into Tenable's platform APIs
- Shubhamsaboo/awesome-llm-apps - Collection of awesome LLM apps with RAG using OpenAI, Anthropic, Gemini and opensource models.
- revdotcom/reverb - Open source inference code for Rev's model
- microsoft/presidio - Context aware, pluggable and customizable data protection and de-identification SDK for text and images
- EdyVision/pii-codex - A research python package for detecting, categorizing, and assessing the severity of personal identifiable information (PII)
- LLaVA-VL/LLaVA-NeXT -
- vllm-project/vllm - A high-throughput and memory-efficient inference and serving engine for LLMs
- deekshaaneja/Qwen2-VL -
- exo-explore/exo - Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚
- OpenBMB/MiniCPM-V - MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
- aiola-lab/whisper-medusa - Whisper with Medusa heads
- black-forest-labs/flux - Official inference repo for FLUX.1 models
- ACMILabs/collection-chat - Uses LangChain and GPT-4 to chat with the ACMI Public API collection.
- X-PLUG/mPLUG-Owl - mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
- X-PLUG/mPLUG-DocOwl - mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding
- akashmjn/tinydiarize - Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens
- Avaiga/taipy - Turns Data and AI algorithms into production-ready web applications in no time.
- freedmand/semantra - Multi-tool for semantic search
- JSCU-NL/COATHANGER - IOCs and detection script for COATHANGER malware
- Doriandarko/gemini-ui-to-code - A Streamlit application to generate code from images
- microsoft/unilm - Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
- huggingface/datatrove - Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
- huggingface/optimum-nvidia -
- THUDM/CogVLM2 - GPT4V-level open-source multi-modal model based on Llama3-8B
- kadirnar/whisper-plus - WhisperPlus: Faster, Smarter, and More Capable 🚀
- m-bain/whisperX - WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
- JaesungHuh/SimpleDiarization - Simple Diarization model
- ScrapeGraphAI/Scrapegraph-ai - Python scraper based on AI
- HyperGAI/HPT - HPT - Open Multimodal LLMs from HyperGAI
- ollama/ollama-python - Ollama Python library
- Maximilian-Winter/llama-cpp-agent - The llama-cpp-agent framework is a tool designed for easy interaction with Large Language Models (LLMs). Allowing users to chat with LLM models, execute structured function calls and get structured ou
- stanfordnlp/dspy - DSPy: The framework for programming—not prompting—language models
- OpenGVLab/Ask-Anything - [CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
- InternLM/xtuner - An efficient, flexible and full-featured toolkit for fine-tuning LLM (InternLM2, Llama3, Phi3, Qwen, Mistral, ...)
- BAAI-DCAI/Bunny - A family of lightweight multimodal models.
- magic-research/PLLaVA - Official repository for the paper PLLaVA
- mem0ai/mem0 - The Memory layer for your AI apps
- artefactual-labs/amclient - Archivematica API client module
- microsoft/LLMLingua - [EMNLP'23, ACL'24] To speed up LLMs' inference and enhance LLMs' perception of key information, compress the prompt and KV-Cache, which achieves up to 20x compression with minimal performance loss.
- thunlp/LLaVA-UHD - LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer
- Blaizzy/mlx-vlm - MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
- meta-llama/llama3 - The official Meta Llama 3 GitHub site
- armbues/SiLLM - SiLLM simplifies the process of training and running Large Language Models (LLMs) on Apple Silicon by leveraging the MLX framework.
- stanford-futuredata/ColBERT - ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
- nateraw/audiocraft - Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable mu
- arun-art06/trocr-large - Learn how to effortlessly convert handwritten text into editable digital text using the power of the Microsoft/Trocr-Large-Handwritten model from Hugging Face. With the help of Gradio, a user-friendly
- VikParuchuri/marker - Convert PDF to markdown + JSON quickly with high accuracy
- jina-ai/serve - ☁️ Build multimodal AI applications with cloud-native stack
- apple/ml-mobileclip - This repository contains the official implementation of the research paper, "MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training" CVPR 2024
- simonw/click-app - Cookiecutter template for creating new Click command-line tools
- simonw/files-to-prompt - Concatenate a directory full of files into a single prompt for use with LLMs
- theirstory/gliner-spacy - A spaCy wrapper for GliNER
- urchade/GLiNER - Generalist and Lightweight Model for Named Entity Recognition (Extract any entity types from texts) @ NAACL 2024
- yoheinakajima/mindgraph - proof of concept prototype for generating and querying against an ever-expanding knowledge graph with ai
- yoheinakajima/instagraph - Converts text input or URL into knowledge graph and displays
- instructor-ai/instructor - structured outputs for llms
- mustafaaljadery/lightning-whisper-mlx - An extremely fast implementation of whisper optimized for Apple Silicon using MLX.
- huggingface/pytorch-image-models - The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT)
- lucidrains/vit-pytorch - Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch
- zhongyy/Face-Transformer - Face Transformer for Recognition
- anguyen8/face-vit -
- cohere-ai/BinaryVectorDB - Efficient vector database for hundreds of millions of embeddings.
- hirmeos/entity-fishing-client-python - Repository hosting the common code for the entity-fishing clients
- flairNLP/flair - A very simple framework for state-of-the-art Natural Language Processing (NLP)
- izuna385/Wikia-and-Wikipedia-EL-Dataset-Creator - You can create datasets from Wikia/Wikipedia that can be used for entity recognition and Entity Linking. Dumps for ja-wiki and VTuber-wiki are available!
- ml-explore/mlx-examples - Examples in the MLX framework
- mindee/doctr - docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.
- stitionai/devika - Devika is an Agentic AI Software Engineer that can understand high-level human instructions, break them down into steps, research relevant information, and write code to achieve the given objective. D
- facebookresearch/BELA - Bi-encoder entity linking architecture
- explosion/weasel - 🦦 weasel: A small and easy workflow system
- explosion/spacy-stanza - 💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy
- explosion/spacy-curated-transformers - spaCy entry points for Curated Transformers
- explosion/curated-transformers - 🤖 A PyTorch library of curated Transformer models and their composable components
- explosion/spacy-huggingface-pipelines - 💥 Use Hugging Face text and token classification pipelines directly in spaCy
- explosion/spacy-transformers - 🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
- mustafaaljadery/mlxserver - Start a server from the MLX library.
- explosion/spacy-llm - 🦙 Integrating LLMs into structured NLP pipelines
- IBM/zshot - Zero and Few shot named entity & relationships recognition
- haotian-liu/LLaVA - [NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
- HumanSignal/label-studio-ml-backend - Configs and boilerplates for Label Studio's Machine Learning backend
- diicellman/dspy-rag-fastapi - FastAPI wrapper around DSPy
- SYSTRAN/faster-whisper - Faster Whisper transcription with CTranslate2
- qnguyen3/chat-with-mlx - An all-in-one LLMs Chat UI for Apple Silicon Mac using MLX Framework.
- charlax/professional-programming - A collection of learning resources for curious software engineers
- keirf/greaseweazle - Tools for accessing a floppy drive at the raw flux level
- phidatahq/phidata - Build multi-modal Agents with memory, knowledge, tools and reasoning. Chat with them using a beautiful Agent UI.
- run-llama/llama_parse - Parse files for optimal RAG
- artefactual/automation-tools - Tools to aid automation of Archivematica and AtoM.
- allenai/OLMo - Modeling, training, eval, and inference code for OLMo
- argmaxinc/whisperkittools - Python tools for WhisperKit: Model conversion, optimization and evaluation
- instillai/extract-audio-from-video-gpu - Extracting audio from video using GPU-accelerated FFMPEG
- KarelDO/xmc.dspy - In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples.
- BlinkDL/RWKV-LM - RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference, sa
- Peter-obi/Video_summarization_mlx - Transcribe and summarize videos using whisper and llms on apple mlx framework
- AnswerDotAI/RAGatouille - Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.
- taketwo/llm-ollama - LLM plugin providing access to models running on an Ollama server
- vegaluisjose/mlx-rag - Explore a simple example of utilizing MLX for RAG application running locally on your Apple Silicon device.
- video-db/StreamRAG - Video Search and Streaming Agent 🕵️‍♂️
- letta-ai/letta - Letta (formerly MemGPT) is a framework for creating LLM services with memory.
- bertramlyons/DPXdpxDPX - DPX header editing gizmo
- da-z/mlx-ui - A simple UI / Web / Frontend for MLX mlx-lm using Streamlit.
- alphasecio/llama-index - A collection of apps powered by the LlamaIndex LLM framework.
- maguowei/starred - creating your own Awesome List by GitHub stars!
- bigscience-workshop/petals - 🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
- qurator-spk/dinglehopper - An OCR evaluation tool
- rsommerfeld/trocr - Powerful handwritten text recognition. A simple-to-use, unofficial implementation of the paper "TrOCR: Transformer-based Optical Character Recognition with Pre-trained Models".
- facebookresearch/ImageBind - ImageBind One Embedding Space to Bind Them All
- marimo-team/marimo - A reactive notebook for Python — run reproducible experiments, execute as a script, deploy as an app, and version with git.
- h2oai/h2ogpt - Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/
- VikParuchuri/surya - OCR, layout analysis, reading order, table recognition in 90+ languages
- lllyasviel/Fooocus - Focus on prompting and generating
- abetlen/llama-cpp-python - Python bindings for llama.cpp
- riccardomusmeci/mlx-llm - Large Language Models (LLMs) applications and tools running on Apple Silicon in real-time with Apple MLX.
- Vaibhavs10/on-device-llm-playground - A repo with scripts to test and play around with Facebook's recent llama models! 🤗
- NVIDIA/NeMo - A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
- impira/docquery - An easy way to extract information from documents
- oobabooga/text-generation-webui - A Gradio web UI for Large Language Models with support for multiple inference backends.
- gpt-engineer-org/gpt-engineer - Platform to experiment with the AI Software Engineer. Terminal based. NOTE: Very different from https://gptengineer.app
- mlc-ai/mlc-llm - Universal LLM Deployment Engine with ML Compilation
- simonw/llm-mistral - LLM plugin providing access to Mistral models using the Mistral API
- Vishnunkumar/craft_hw_ocr - Recognition of handwritten text using CRAFT text detection and TrOCR
- fcakyon/craft-text-detector - Packaged, Pytorch-based, easy to use, cross-platform version of the CRAFT text detector
- simonw/llm - Access large language models from the command-line
- Yuliang-Liu/MultimodalOCR - On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)
- SALT-NLP/LLaVAR - Code/Data for the paper: "LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding"
- mapluisch/LLaVA-CLI-with-multiple-images - LLaVA inference with multiple images at once for cross-image analysis.
- LLaVA-VL/LLaVA-Interactive-Demo - LLaVA-Interactive-Demo
- ise-uiuc/magicoder - [ICML'24] Magicoder: Empowering Code Generation with OSS-Instruct
- stanford-oval/WikiChat - WikiChat is an improved RAG. It stops the hallucination of large language models by retrieving data from a corpus.
- axolotl-ai-cloud/axolotl - Go ahead and axolotl questions
- simonw/llm-llama-cpp - LLM plugin for running models using llama.cpp
- ablwr/lc-sdf-data-exploration -
- zylon-ai/private-gpt - Interact with your documents using the power of GPT, 100% privately, no data leaks
- AudiovisualMetadataPlatform/whisper - Wrapper for the Whisper speech-to-text tool
- AudiovisualMetadataPlatform/amp_bootstrap - AMP system management
- run-llama/rags - Build ChatGPT over your data, all with natural language
- microsoft/autogen - A programming framework for agentic AI 🤖 PyPi: autogen-agentchat Discord: https://aka.ms/autogen-discord Office Hour: https://aka.ms/autogen-officehour
- ochen1/insanely-fast-whisper-cli - The fastest Whisper optimization for automatic speech recognition as a command-line interface ⚡️
- m-bain/CondensedMovies-chall - Condensed Movies Challenge 2021
- m-bain/frozen-in-time - Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval [ICCV'21]
- simonw/webvtt-to-json - Convert WebVTT to JSON, optionally removing duplicate lines
- glut23/webvtt-py - Read, write, convert and segment WebVTT caption files in Python.
- facebookresearch/nougat - Implementation of Nougat Neural Optical Understanding for Academic Documents
- clovaai/donut - Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
- jhodges10/fioctl -
- Frameio/python-frameio-client - Python SDK for interacting with the Frame.io API. Documentation here - https://frameio.github.io/python-frameio-client/
- pipinstallyp/minigpt4-batch - Use miniGPT-4 batch to generate captions for a lot of images! You should be able to create the best captions you always wanted!
- theovercomer8/captionr - GIT/BLIP/CLIP Caption tool
- simonw/blip-caption - Generate captions for images with Salesforce BLIP
- OpenInterpreter/open-interpreter - A natural language interface for computers
- adbar/trafilatura - Python & Command-line tool to gather text and metadata on the Web: Crawling, scraping, extraction, output as CSV, JSON, HTML, MD, TXT, XML
- facebookresearch/GENRE - Autoregressive Entity Retrieval
- facebookresearch/BLINK - Entity Linker solution
- kuo77122/deep-face-detector -
- yiminglin-ai/imdb-clean - A cleaned version of IMDB-WIKI dataset for facial age estimation.
- divya21raj/Actor-Recognition-In-Movies - Recognizing actors in a movie clip or image, using OpenCV, DeepLearning and Python.
- ageitgey/face_recognition - The world's simplest facial recognition api for Python and the command line
- kermitt2/delft - a Deep Learning Framework for Text https://delft.readthedocs.io/
- Lucaterre/spacyfishing - A spaCy wrapper of Entity-Fishing (component) for named entity disambiguation and linking on Wikidata
- davidberenstein1957/spacy-dbpedia-spotlight - A spaCy wrapper for DBpedia Spotlight
- davidberenstein1957/classy-classification - This repository contains an easy and intuitive approach to few-shot classification using sentence-transformers or spaCy models, or zero-shot classification with Huggingface.
- davidberenstein1957/concise-concepts - This repository contains an easy and intuitive approach to few-shot NER using most similar expansion over spaCy embeddings. Now with entity scoring.
- UB-Mannheim/spacyopentapioca - A spaCy wrapper of OpenTapioca for named entity linking on Wikidata
- SapienzaNLP/extend - Entity Disambiguation as text extraction (ACL 2022)
- egerber/spaCy-entity-linker - spaCy module for linking text to Wikidata items
- openeventdata/es-geonames - Create a Geonames gazetteer index in Elasticsearch
- openeventdata/mordecai - Full text geoparsing as a Python library
- ina-foss/inaSpeechSegmenter - CNN-based audio segmentation toolkit. Detects speech, music, noise and speaker gender. Designed for large-scale gender equality studies based on speech time per gender.
- lead-ratings/gender-guesser - Guess gender from first name in Python 2 and 3
- Alialmanea/age-gender-detection-using-opencv-with-python - age & gender detection-using-opencv-with-python
- torbjornbp/video-ocr2srt - A simple script to extract text elements from video files
- openai/openai-python - The official Python library for the OpenAI API
- unifiedstreaming/streaming-load-testing - Load generation tool for evaluation of DASH and HLS video streaming setups
- alexwlchan/library-lookup - Finding books that are available in nearby branches of my public lending library
- UAlbanyArchives/mailbagit - A tool for creating and managing Mailbags, a package for preserving email using multiple preservation formats
- flavioribeiro/video-thumbnail-generator - 📷 Generate thumbnail sprites from videos.
- bfidatadigipres/STORA - Off-air TV recording system. Open source Python3 and bash shell code
- ozmartian/vidcutter - A modern yet simple multi-platform video cutter and joiner.
- athento/hocr-parser - HOCR Specification Python Parser
- jlieth/hocr-parser - Python parser for hOCR files using lxml
- lucaswarwick02/HOCkeR - Python package for combining .hocr files and images into searchable PDFs
- imdeepmind/hocrox - Hocrox: An image preprocessing and augmentation library with Keras like interface.
- explosion/spaCy - 💫 Industrial-strength Natural Language Processing (NLP) in Python
- clamsproject/apps - A repository to keep record of CLAMS apps
- clamsproject/clams-python - CLAMS SDK for python
- keighrim/concatrim - Python program to trim-and-join A/V media files using ffmpeg
- clamsproject/app-barsdetection -
- KenjiTakahashi/mpdecimate_trim - trim video clips based on mpdecimate output, keep audio synced
- nielstenboom/recurring-content-detector - Unsupervised detection of opening / closing credits, recaps, and previews in video files 🎥🍿🎬
- openai/whisper - Robust Speech Recognition via Large-Scale Weak Supervision
- artefactual/archivematica - Free and open-source digital preservation system designed to maintain standards-based, long-term access to collections of digital objects.
- CarnegieHall/quality-control - Carnegie Hall Archives maintains a series of small, portable scripts to expedite batch processes for quality control on our Digital Collections.
- FilmColors/VIAN -
- simonw/s3-ocr - Tools for running OCR against files stored in S3
- pyscript/pyscript - PyScript is an open source platform for Python in the browser. Try PyScript: https://pyscript.com Examples: https://tinyurl.com/pyscript-examples Community: https://discord.gg/HxvBtukrg2
- bfidatadigipres/transcoding - Open source automated transcoding scripts used at the BFI National Archive
- iiif-prezi/iiif-prezi3 - IIIF Presentation API 3 Python Library
- carevealed/md5tool - Python script to generate or check md5 checksums recursively for files in a directory tree.
- bitmovin/bitmovin-api-sdk-python - Python API SDK which enables you to seamlessly integrate the Bitmovin API into your projects
- toddbirchard/flasklogin-tutorial - 👨‍💻 🔑 Build Flask apps with user creation and log-in functionality.
- pytube/pytube - A lightweight, dependency-free Python library (and command-line utility) for downloading YouTube Videos.
- sudowork/fix_m1_rgb - Script that attempts to force M1 macs into RGB mode when used with monitors that are defaulting to YPbPr.
- SpectraLogic/ds3_python_sdk -
- alexwlchan/concurrently - A snippet for running multiple, concurrent invocations of a Python function
- parallelencode/PyParallelEncode -
- BlinkenOSA/workflows - Blinken OSA AV Preservation workflows implemented with Airflow (https://airflow.apache.org)
- cs-afm/co-dot-py - A little cli tool for moving things around
- SpectraLogic/ds3_python3_sdk -
- NCSC-NL/log4shell - Operational information regarding the log4shell vulnerabilities in the Log4j logging library.
- fullhunt/log4j-scan - A fully automated, accurate, and extensive scanner for finding log4j RCE CVE-2021-44228
- giacomomarchioro/pyIIIFpres - Python module for easing the construction of JSON manifests compliant with IIIF API 3.0.
- KBNLresearch/tapeimgr - Simple tape imaging and extraction tool
- LibraryOfCongress/bagit-python - Work with BagIt packages from Python.
- boto/boto3 - AWS SDK for Python
- AdminTurnedDevOps/DevOps-The-Hard-Way-AWS - This repository contains free labs for setting up an entire workflow and DevOps environment from a real-world perspective in AWS
- mbennett-uoe/whiiif - Simple IIIF Search service for OCRed texts
- bfidatadigipres/dpx_encoding - BFI National Archive automated dpx preservation scripts written in BASH and Python for use with Media Area RAWcooked and other open source programmes.
- bfidatadigipres/title_article_split - Python script to split multiple language articles from full title.
- kfrn/ffmpeg-things - Scripts & notes about ffmpeg
- IIIF/presentation-validator - Validator for the Presentation API
- IIIF/prezi-2-to-3 - Libraries to upgrade Presentation API v2 to v3 automatically
- tomcrane/bbctextav -
- alexwlchan/lazyreader - Lazy reading of file objects for efficient batch processing
- alexwlchan/clipatron - A script to automate video clipping using ffmpeg ✂️ 📼 ✂️
- iiif-prezi/iiif-prezi - IIIF Presentation API implementation in Python
- corkami/collisions - Hash collisions and exploitations
- bfidatadigipres/checksum_scripts - Checksum speed test scripts using Python2 and Python3 MD5 and CRC32 algorithms.
- andersbll/neural_artistic_style - Neural Artistic Style in Python
- c0decracker/video-splitter - Simple Python script to split video into equal length chunks or chunks of equal size, duration, etc.
- nlnzcollservices/harvester_manager - Mostly Automated Social-media Harvester
- Digital-Preservation-Finland/fido - Format Identification for Digital Objects (FIDO) is a Python command-line tool to identify the file formats of digital objects. It is designed for simple integration into automated work-flows.
- TheScienceMuseum/elastic-wikidata - CLI for loading Wikidata subsets (or all of it) into Elasticsearch
- liiight/notifiers - The easy way to send notifications
- britishlibrary/mpt - A utility for staging files, calculating and validating file checksums, and comparing checksum values between storage locations.
- bodleian/iiif_manifest_server - Bodleian IIIF Manifest Microservice
- benjaminp/six - Python 2 and 3 compatibility library
- ryanfb/HocrConverter - Create PDFs and plain text from hOCR documents
- jbaiter/hocrviewer-mirador - View HOCR files with Mirador
- ocropus/hocr-tools - Tools for manipulating and evaluating the hOCR format for representing multi-lingual OCR results by embedding them into HTML.
- MeMAD-project/AudioTagger - Program for recognizing audio contents of sound files and videos.
- MeMAD-project/rdf-converter - MeMAD metadata converter that transforms legacy metadata from INA and Yle into RDF using the MeMAD and EBU Core ontologies
- pypa/pipenv - Python Development Workflow for Humans.
- ali1234/vhs-teletext - Software to recover teletext data from VHS recordings.
- MrS0m30n3/youtube-dl-gui - A cross platform front-end GUI of the popular youtube-dl written in wxPython.
- ytdl-org/youtube-dl - Command-line program to download videos from YouTube.com and other video sites
- Digital-Preservation-Finland/file-scraper - File detector, metadata collector and well-formedness checker tool
- Digital-Preservation-Finland/ffmpeg-python - Python bindings for FFmpeg - with complex filtering support
- Digital-Preservation-Finland/dpx-validator - DPX file format validator
- tw4l/brunnhilde - Siegfried-based characterization tool for directories and disk images
- Ymagis/ClairMeta - Clairmeta is a python package for Digital Cinema Package (DCP) probing and checking.
- kieranjol/IFIscripts - Detailed documentation is available here: http://ifiscripts.readthedocs.io/en/latest/index.html
- ahalterman/phoxy - R tools to download, ingest, and analyze the Phoenix dataset from the Open Event Data Alliance
- nypublicradio/transcript-editor - Web-based tool for correcting speech-to-text generated transcripts.
- WGBH-MLA/transcript-editor - Web-based tool for correcting speech-to-text generated transcripts.
- guerilla-di/depix - Read and write DPX file headers
- avalonmediasystem/avalon - Avalon Media System – Samvera Application
- athityakumar/colorls - A Ruby gem that beautifies the terminal's ls command, with color and font-awesome icons. 🎉
- BurntSushi/ripgrep - ripgrep recursively searches directories for a regex pattern while respecting your gitignore
- bionic-gpt/bionic-gpt - BionicGPT is an on-premise replacement for ChatGPT, offering the advantages of Generative AI while maintaining strict data confidentiality
- huggingface/candle - Minimalist ML framework for Rust
- awslabs/mountpoint-s3 - A simple, high-throughput file client for mounting an Amazon S3 bucket as a local file system.
- acdha/mountstatus - MountStatusMonitor: paranoid monitor for POSIX filesystem mounts (Linux, OS X, FreeBSD)
- ruffle-rs/ruffle - A Flash Player emulator written in Rust
- AI4LAM/awesome-ai4lam - A list of awesome AI in libraries, archives, and museum collections from around the world 🕶️
- JohnSnowLabs/spark-nlp - State of the Art Natural Language Processing
- healthyhost/audit-vps-script - Run a security scan on your server and identify common gaps. Get your VPS ready for production.
- Shuffle/Shuffle - Shuffle: A general purpose security automation platform. Our focus is on collaboration and resource sharing.
- mhasan49/package-manager - Installer script tailored for Debian/Ubuntu systems to install necessary packages.
- artefactual/archivematica-docs - Archivematica documentation
- agarrharr/awesome-cli-apps - 🖥 📊 🕹 🛠 A curated list of command line apps
- dericed/dpxderiver - shell script for converting DPX+wav input to specific outputs of DNxHD, lossless h264 at 4:2:2 YUV 10 bit, and a streamable h264
- adi1090x/dynamic-wallpaper - A simple bash script to set wallpapers according to current time, using cron job scheduler.
- kfrn/rainbow-video - A script that takes a video and creates a hue-ordered mosaic of frame captures.
- bfidatadigipres/bfi-iiif-load-balancer - BFI's IIIF NGINX based load balancer application, to proxy user facing requests to backend applications.
- danielgrant/server-scripts - A collection of scripts for server management, health checking and reporting
- dericed/framemd5cmp - Present a comparison between framemd5 output of two video files.
- eddycolloton/INPT - These shell scripts are intended to automate several steps frequently performed by media conservators at the Hirshhorn Museum and Sculpture Garden (HMSG).
- ohmyzsh/ohmyzsh - 🙃 A delightful community-driven (with 2,400+ contributors) framework for managing your zsh configuration. Includes 300+ optional plugins (rails, git, macOS, hub, docker, homebrew, node, php, python,
- antespi/s3md5 - Bash script to calculate Etag/S3 MD5 sum for very big files uploaded using multipart S3 API
- freedmand/textra - A command-line application to convert images, PDFs, and audio files to text using Apple's APIs
- argmaxinc/WhisperKit - On-device Speech Recognition for Apple Silicon
- AugustDev/enchanted - Enchanted is an iOS and macOS app for chatting with private self-hosted language models such as Llama2, Mistral or Vicuna using Ollama.
- preternatural-explore/mlx-swift-chat - A multi-platform SwiftUI frontend for running local LLMs with Apple's MLX framework.
- deployradiant/pajama - A UI for Ollama on Mac
- exelban/stats - macOS system monitor in your menu bar
- vincentneo/LosslessSwitcher - Automated Apple Music Lossless Sample Rate Switching for Audio Devices on Macs.
- cs-afm/Check-Sammy - Python GUI for calculating and monitoring md5 checksums
- gristlabs/grist-core - Grist is the evolution of spreadsheets.
- nocodb/nocodb - 🔥 🔥 🔥 Open Source Airtable Alternative
- twentyhq/twenty - Building a modern alternative to Salesforce, powered by the community.
- piotrkulpinski/openalternative - A community driven list of open source alternatives to proprietary software and applications.
- immich-app/immich - High performance self-hosted photo and video management solution.
- AykutSarac/jsoncrack.com - ✨ Innovative and open-source visualization application that transforms various data formats, such as JSON, YAML, XML, CSV and more, into interactive graphs.
- yamadashy/repomix - 📦 Repomix (formerly Repopack) is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) or o
- microsoft/data-formulator - 🪄 Create rich visualizations with AI
- n8n-io/n8n - Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
- Sh4yy/personal-ai -
- ax-llm/ax - The unofficial DSPy framework. Build LLM powered Agents and "Agentic workflows" based on the Stanford DSP paper.
- enricoros/big-AGI - AI suite powered by state-of-the-art models and providing advanced AI/AGI functions. It features AI personas, AGI functions, multi-model chats, text-to-image, voice, response streaming, code highlight
- run-llama/create-llama - The easiest way to get started with LlamaIndex
- n4ze3m/page-assist - Use your locally running AI models to assist you in your web browsing
- ai-ng/2txt - Image to text, fast.
- supermemoryai/supermemory - Build your own second brain with supermemory. It's a ChatGPT for your bookmarks. Import tweets or save websites and content using the chrome extension.
- jina-ai/reader - Convert any URL to an LLM-friendly input with a simple prefix https://r.jina.ai/
- huggingface/llm-vscode - LLM powered development for VSCode
- hrishioa/lumentis - AI powered one-click comprehensive docs from transcripts and text.
- da-z/llamazing - A simple Web / UI / App / Frontend to Ollama.
- run-llama/sec-insights - A real world full-stack application using LlamaIndex
- wsxiaoys/bobtail.dev - Poor man's phind.com/perplexity.ai
- leptonai/search_with_lepton - Building a quick conversation-based search demo with Lepton AI.
- Nutlope/pdftochat - Chat with your PDFs with AI
- osmoscraft/osmosmemo - Turn GitHub into a bookmark manager
- janhq/jan - Jan is an open source alternative to ChatGPT that runs 100% offline on your computer. Multiple engine support (llama.cpp, TensorRT-LLM)
- hrishioa/wishful-search - Natural language search for complex JSON arrays, with AI Quickstart.
- fromsmash/smash-downloader-js - Official JavaScript library to download transfers using the Smash API & SDK 🚀
- karolkozer/planby -
- elastic/kibana - Your window into the Elastic Stack
- mifi/ezshare - Easily share files, folders and clipboard over LAN - Like Google Drive but without internet
- mifi/lossless-cut - The swiss army knife of lossless video/audio editing
- directus/directus - The flexible backend for all your projects 🐰 Turn your DB into a headless CMS, admin panels, or apps with a custom UI, instant APIs, auth & more.
- digirati-co-uk/iiif-manifest-editor - Create new IIIF Manifests. Modify existing manifests. Tell stories with IIIF.
- IIIF-Commons/parser - IIIF Presentation 2 + 3 parser
- muxinc/media-chrome - Custom elements (web components) for making audio and video player controls that look great in your website or app.
- gTile/gTile - A window tiling extension for Gnome.
- archival-IIIF/test-server -
- archival-IIIF/demo -
- SocialGouv/archifiltre-docs - Visualize and improve your file trees!
- archival-IIIF/viewer - IIIF compatible viewer for digital born file storages
- UniversalViewer/universalviewer - A community-developed open source project on a mission to help you share your 📚📜📰📽️📻🗿 with the 🌎
- freeCodeCamp/freeCodeCamp - freeCodeCamp.org's open-source codebase and curriculum. Learn to code for free.
- beeldengeluid/open-images-browser - MediaScape project researching the utility of Generous Interfaces for audiovisual archives
- preservica/automated-preservation-recommendations - This repository contains a Wiki of information related to recommendation preservation actions in support of Preservica Automated-Preservation functionality, as well as some basic tools for working wit
- dericed/xsl4metadata - various xsl to do this or that
To the extent possible under law, stephenmcconnachie has waived all copyright and related or neighboring rights to this work.