A list of resources to help you become a better AI engineer.
- Microsoft's definition: "Artificial intelligence (AI) engineers are responsible for developing, programming and training the complex networks of algorithms that make up AI so that they can function like a human brain. This role requires combined expertise in software development, programming, data science and data engineering."
- Coursera's definition: "Artificial intelligence engineers are individuals who use AI and machine learning techniques to develop applications and systems that can help organizations increase efficiency, cut costs, increase profits, and make better business decisions."
- Tech Target: "AI engineers develop, program and train the complex networks of algorithms that encompass AI so those algorithms can work like a human brain. AI engineers must be experts in software development, data science, data engineering and programming."
- Swyx podcast (17 April 2024)
- UpWork: "AI engineers work on a broader set of tasks that encompass various forms of machine intelligence, like neural networks, to develop AI models for specific applications. In contrast, ML engineers focus more on ML algorithms and models that can self-tune to better learn and make predictions from large data sets."
- IEEE (ChatGPT's summary of that page): "AI engineers blend traditional software engineering skills with a deep understanding of machine learning and artificial intelligence to develop systems that enhance decision-making and automation within organizations. They are proficient in AI technologies and statistical analysis, focusing on building and integrating AI models into applications. On the other hand, software engineers focus broadly on designing, implementing, and maintaining software systems, with a comprehensive grasp of the software development lifecycle, from requirement analysis to deployment and maintenance. The distinction is further marked by the AI engineer's need to navigate emerging AI technologies, whereas software engineers adhere to established engineering principles and practices across various platforms and technologies."
This section covers tools and services you can use to become a better AI engineer.
- ChatGPT
- Claude.ai
- Phind (developer focus; GPT-4 plus its own model)
- Microsoft Copilot (GPT-4 plus its own model)
- Perplexity.ai
- You.com
- groq.com
Try out open source models instantly.
- Perplexity Labs side by side comparison
- Groq chat demo: a subset of models running on Groq's proprietary inference hardware (LPUs)
- Vercel AI Playground
- ollama (Go, open source)
- LocalAI (Go, open source)
- msty.app
- Nitro.jan.ai
- Paddler: scaling and load balancing of llama.cpp inference
- modal.com: on-demand serverless container and GPU execution runtime
- Predibase: LLM fine-tuning and hosting
- brev.dev:
- Replicate.com: models as a service
- Together.ai: Serverless LLM / multimodal inference
- Lambda Labs: Manual rental of GPUs / clusters
- Beam.cloud: Serverless generative AI fast standup
- Runpod
- Cloudflare Workers AI
- CoreWeave: autoscaling GPUs + serverless (Knative)
- MosaicML (acquired by Databricks)
- mixedbread.ai: retrieval as a service (search, reranking, embedding)
- lamini.ai: LLM inference
- Anyscale + Ray (ray.io): scaling
- HF inference API
- massedcompute.com
- Salad.com
- Openpipe.ai
- Unsloth.ai
- Crusoe.ai GPU rental
- Akash
- Groq: ultra fast LLM for selected models
- BoltAI
- Saturn Cloud
- Fireworks.ai
- Inferless.com
- Banana.dev (defunct)
- pipeline.ai
- hyperstack.cloud
- Alibaba Elastic GPU service
- Cloudalize GPU Kubernetes Service
- Tensordock.com
- Fly GPU: GPUs on demand
- Jarvis Labs: GPUs on demand
- SGLang
- outlines
- Instructor
- Marginalia
- promptfoo
- Ollama grid search
- Uptrain
- Google Cloud GCP AutoSxS
- Lmsys.org
- Paloma
- LightEval
- Bayesian Evaluation
- Mozilla's experience
- Ruler (long context evaluation)
- OpenAI Simple Evals
- RLHF
- DPO
- KTO
- LiPO
- DoRA
- SPO
- Longformer
- Reformer
- BigBird
- Attention Beacons
- RWKV
- Denseformer
- Microsoft SliceGPT: removes up to 25% of model parameters
- DCFormer
- Lazy Axolotl
- Lit-GPT
- Predibase
- Fine Tune Llama 2 Colab (by HF)
- Openpipe.ai
- LISA
- Torchtune
- LASER layer reduction
- lmstudio.ai
- Predibase LORALand
- RoPE
- ALiBi
- LongRoPE
- Unsloth+RoPE
- InfiniAttention: a pathway to ultra long context windows with manageable memory consumption
- pinecone
- weaviate
- chroma (open source)
- lancedb (open source)
- postgresql + pgvector (open source)
- sqlite + vss (open source)
- FAISS by Meta
- Vespa.ai + binary embeddings
- DigitalOcean / Paperspace for GPU
- AWS
- GCP
- Azure
- Hetzner GPU
- Cloudflare
- Lightning Studio
- Google Colab
- ChatGPT
- Julius.ai
- FlashAttention-2
- HippoAttention
- RingAttention
- PagedAttention
- Efficient Linear Model Merging for LLMs
- Automerge
- Sakana Evolutionary Model Merge
- Adam
- AdamW
- Prodigy
- Schedule-free optimizers (April 2024)
- SymPy
- torch.autograd
- Autograd
- tf.GradientTape
- gomlx
- mitmproxy (via Show Me The Prompt)
- CrewAI
- Autogen
- OpenDevin
- SWE-agent
- Leda
- Devon: open source pair programmer
- HuggingFace Agents
- Weaviate Verba: RAG solution using Weaviate
- Microsoft GitHub
- AWS Bedrock: embeddings, Streamlit, LangChain, Pinecone, Claude, etc.
- AWS Serverless
- GCP
- Gemini for document processing
- AWS Knowledge Bases for Bedrock
- FLARE: dynamically replaces low-probability tokens with RAG lookups
- Embedchain
- Llama Guard
- Llama Guard with streaming
- Guardrails for AWS Bedrock
- Amazon Titan Embeddings
- Huggingface
- Nomic + ollama
- Cohere multi-aspect embeddings
- LLM2Vec
- Amazon Kendra
- ColBERT
- Binary quantization (BitNet)
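
As a rough illustration of binary quantization in the embedding sense (the same idea as the "binary embeddings" entry under vector stores), the sketch below keeps only the sign of each vector component and ranks documents by Hamming distance. It is a minimal sketch, not a model of BitNet, which applies low-bit quantization to model weights. The vectors and the helper names `binarize` and `hamming` are made up for the example.

```python
def binarize(vec):
    """Pack the sign of each component into an int bitmask (bit = 1 if x > 0)."""
    bits = 0
    for x in vec:
        bits = (bits << 1) | (1 if x > 0 else 0)
    return bits


def hamming(a, b):
    """Number of bit positions where the two codes differ."""
    return bin(a ^ b).count("1")


# Hypothetical 4-dimensional embeddings; real ones have hundreds of dimensions.
docs = {
    "doc_a": [0.12, -0.40, 0.88, -0.05],
    "doc_b": [-0.30, 0.25, -0.10, 0.60],
    "doc_c": [0.09, -0.10, 0.70, 0.20],
}
query = [0.20, -0.35, 0.50, -0.15]

codes = {name: binarize(vec) for name, vec in docs.items()}
query_code = binarize(query)

# Smaller Hamming distance means the sign patterns agree in more dimensions.
ranked = sorted(codes, key=lambda name: hamming(query_code, codes[name]))
print(ranked)  # ['doc_a', 'doc_c', 'doc_b']
```

Each float collapses to one bit, so storage shrinks by roughly 32x for float32 vectors, at the cost of a coarser ranking; a common pattern is to rerank the top candidates with the full-precision vectors.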
For hosting multiple fine-tunes at once
- Punica
- Run:ai: service for bare-metal GPU cluster management, now owned by Nvidia
- SST-2 movie review sentiment (HF)
- 650,000 English books
- Openwebtext
- Fineweb
- generator9000
- Groq
- Truffle-1
- NeMo-Curator
- Tinybox / tinygrad
- WOPR (7 x 4090)
- Argilla Distilabel
- spaCy Prodigy
- Snorkel
- Refuel-AI autolabel
- DVC (dvc.org)
- WandB Weave