Apify

We're making the web more programmable.

Pinned

  1. crawlee-python Public

    Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Wo…

    Python · 4.8k stars · 326 forks

  2. crawlee Public

    Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, an…

    TypeScript · 16k stars · 696 forks

  3. proxy-chain Public

    Node.js implementation of a proxy server (think Squid) with support for SSL, authentication and upstream proxy chaining.

    JavaScript · 855 stars · 146 forks

  4. apify-sdk-js Public

    Apify SDK monorepo

    TypeScript · 127 stars · 39 forks

  5. got-scraping Public

    HTTP client made for scraping based on got.

    TypeScript · 561 stars · 45 forks

  6. fingerprint-suite Public

    Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.

    TypeScript · 1.1k stars · 111 forks

Repositories

Showing 10 of 133 repositories
  • crawlee Public

    Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

    TypeScript · 15,998 stars · Apache-2.0 · 696 forks · 121 open issues (1 issue needs help) · 15 pull requests · Updated Dec 12, 2024
  • apify-shared-js Public

    Utilities and constants shared across Apify projects.

    TypeScript · 12 stars · Apache-2.0 · 11 forks · 5 open issues · 0 pull requests · Updated Dec 12, 2024
  • apify-docs Public

    This project is the home of Apify's documentation.

    API Blueprint · 29 stars · Apache-2.0 · 79 forks · 76 open issues · 39 pull requests · Updated Dec 11, 2024
  • rag-web-browser Public

    RAG Web Browser is an Apify Actor to feed your LLM applications and RAG pipelines with up-to-date text content scraped from the web.

    TypeScript · 5 stars · Apache-2.0 · 0 forks · 3 open issues · 0 pull requests · Updated Dec 12, 2024
  • crawlee-python Public

    Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

    Python · 4,790 stars · Apache-2.0 · 326 forks · 86 open issues · 14 pull requests · Updated Dec 11, 2024
  • fingerprint-suite Public

    Browser fingerprinting tools for anonymizing your scrapers. Developed by Apify.

    TypeScript · 1,071 stars · Apache-2.0 · 111 forks · 19 open issues · 7 pull requests · Updated Dec 11, 2024
  • mcp-server-rag-web-browser Public

    An MCP Server for the RAG Web Browser Actor

    JavaScript · 1 star · Apache-2.0 · 0 forks · 0 open issues · 0 pull requests · Updated Dec 11, 2024
  • apify-sdk-python Public

    The Apify SDK for Python is the official library for creating Apify Actors in Python. It provides useful features like actor lifecycle management, local storage emulation, and actor event handling.

    Python · 120 stars · Apache-2.0 · 10 forks · 14 open issues · 2 pull requests · Updated Dec 11, 2024
  • .github Public

    Repository defining organization-wide (or team-wide) GitHub Actions workflows

    0 stars · 0 forks · 0 open issues · 1 pull request · Updated Dec 11, 2024
  • workflows Public

    Apify's reusable GitHub workflows

    Python · 7 stars · 4 forks · 4 open issues · 5 pull requests · Updated Dec 10, 2024