dell-research-harvard

Popular repositories

  1. AmericanStories Public

    The official GitHub repository for the American Stories dataset as in {link}

    Python, 108 stars, 8 forks

  2. linktransformer Public

    A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning

    Python, 105 stars, 10 forks

  3. effocr Public

    A model(ing framework) for sample-efficient OCR

    Python, 53 stars, 5 forks

  4. HJDataset Public

    A Large Dataset of Historical Japanese Documents with Complex Layouts

    Jupyter Notebook, 31 stars, 4 forks

  5. NEWS-COPY Public

    Noise-robust de-duplication at scale

    Python, 15 stars

  6. newswire Public

    Python, 7 stars

Repositories

Showing 10 of 29 repositories
  • newswire Public
    Python, 7 stars, 0 forks, 0 issues, 0 pull requests, updated Aug 15, 2024
  • linktransformer Public

    A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning

    Python, 105 stars, GPL-3.0 license, 10 forks, 4 issues, 1 pull request, updated Jun 12, 2024
  • efficient_ocr Public

    Efficient OCR for Building a Diverse Digital History

    Python, 5 stars, Apache-2.0 license, 0 forks, 0 issues, 0 pull requests, updated Apr 12, 2024
  • newsdejavu Public

    Python package for News Deja Vu

    Python, 4 stars, MIT license, 0 forks, 0 issues, 0 pull requests, updated Apr 9, 2024
  • AmericanStories Public

    The official GitHub repository for the American Stories dataset as in {link}

    Python, 108 stars, 8 forks, 7 issues, 0 pull requests, updated Mar 7, 2024
  • HomoglyphsCJKTraining Public

    Quantifying Character Similarity with Vision Transformers

    Python, 5 stars, 0 forks, 0 issues, 0 pull requests, updated Oct 27, 2023
  • HomoglyphsCJK Public

    An efficient and useful tool to fuzzy match Japanese, Korean, Simplified Chinese or Traditional Chinese words.

    Python, 2 stars, MIT license, 1 fork, 0 issues, 0 pull requests, updated Oct 13, 2023
  • Associating-Press Public

    Associating layout elements from newspapers into full articles

    1 star, 0 forks, 0 issues, 0 pull requests, updated Sep 15, 2023
  • DPR Public Forked from facebookresearch/DPR

    Dense Passage Retriever is a set of tools and models for open-domain Q&A tasks.

    Python, 1 star, 308 forks, 0 issues, 0 pull requests, updated Aug 15, 2023
  • linktransformer-readthedocs
    Python, 0 stars, 0 forks, 0 issues, 0 pull requests, updated Aug 6, 2023
