4 changes: 4 additions & 0 deletions
packages/docs/src/content/docs/components/loaders/overview.mdx
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
--- | ||
title: Loaders | ||
description: Components to load documents into Clusview. | ||
--- |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,110 @@ | ||
--- | ||
title: Welcome to Clusview! | ||
title: Getting Started | ||
description: Get started with Clusview. | ||
--- | ||
|
||
_Clusview_ is a [BERTopic](https://maartengr.github.io/BERTopic/index.html) based tool that aims to optimize and visualize document clustering leveraging a configurable cluster quality meter. Essentially, performing **_CLUS_**ter **_VIEW_**ing in a human-readable way. | ||
import { Steps } from '@astrojs/starlight/components'; | ||
import { Code } from '@astrojs/starlight/components'; | ||
import { FileTree } from '@astrojs/starlight/components'; | ||
import { Aside } from '@astrojs/starlight/components'; | ||
import { Tabs, TabItem } from '@astrojs/starlight/components'; | ||
|
||
## A peaceful morning | ||
Picture yourself as a curator in a museum. | ||
|
||
One peaceful morning, your museum receives a large crate of archives from Ancient Greece. | ||
|
||
![ancient_greece](../../../assets/images/ancient_greece.jpg) | ||
|
||
You are tasked with organizing and displaying these archives in a way that is both informative and coherent to visitors. | ||
|
||
How would you go about this? Grouping by style? By time period? By subject matter? And to make things more | ||
complicated, how many groups should you divide the archives into? Suddenly, the morning is not so peaceful anymore. | ||
|
||
As you might have noticed, there is no single right answer. And even if you find the optimal way to do this, | ||
would that approach work for another crate with a completely different set of archives? Probably not. | ||
|
||
### Topic Modeling | ||
|
||
This is where the concept of [Topic Modeling](https://en.wikipedia.org/wiki/Topic_model) comes in to restore the peacefulness of your morning. | ||
|
||
Suppose you could feed the archives into a machine, and it would automatically group them into coherent topics. | ||
Well, recent techniques in Natural Language Processing, | ||
such as [BERTopic](https://maartengr.github.io/BERTopic/index.html), have made this possible | ||
by leveraging the power of [transformers](https://en.wikipedia.org/wiki/Transformer_(deep_learning_architecture)). | ||
Problem solved then, right? | ||
|
||
Not so fast! There is no silver bullet for how to use these tools, my dear curator. | ||
No oracle can tell you the best way to group your archives. Natural language is fuzzy, and what constitutes a good topic | ||
for you might not be the same for someone else. | ||
|
||
### Clusview | ||
|
||
And so, we arrive at [Clusview](/), a tool designed to ease the process of exploring and understanding topics | ||
through an interactive approach, so that you can decide the best way to group your archives. | ||
|
||
Keep reading, your journey with Clusview starts here! | ||
|
||
## Installation | ||
Currently, Clusview has no hosted solution. The easiest way to use Clusview is to run it locally on your machine. | ||
|
||
<Steps> | ||
|
||
1. [Clone](https://git-scm.com/docs/git-clone/) the Clusview repository from [GitHub](https://github.com/gcalcedo/clusview). | ||
|
||
<Code code={"git clone https://github.com/gcalcedo/clusview.git"} lang="shell" /> | ||
|
||
2. Navigate to the core Clusview package. | ||
|
||
<Code code={"cd clusview/packages/clusview"} lang="shell" /> | ||
|
||
Clusview is organized into several packages under the **`packages`** directory. | ||
The core package (**`packages/clusview`**), implemented in Python, contains the main functionality. | ||
It is also a standalone module, so you can use it in your own projects without the UI. | ||
|
||
Here's a quick overview of the structure, with the core package highlighted: | ||
|
||
<FileTree> | ||
- ... | ||
- packages/ | ||
- **clusview/** # This is the core package. | ||
- docs/ # Documentation, this very site! | ||
- ui/ # UI to interact with the core package. | ||
- ... | ||
</FileTree> | ||
|
||
3. Create a [virtual environment](https://docs.python.org/3/library/venv.html). | ||
|
||
<Code code={"python -m venv venv"} lang="shell" /> | ||
|
||
The only prerequisite for using Clusview is a Python installation. | ||
|
||
<Aside type="caution" title="Python Version Compatibility"> | ||
Clusview has been developed with **`Python 3.12.3`**. | ||
Older versions may work as well, but they have not been tested yet. | ||
</Aside> | ||
|
||
4. Activate the [virtual environment](https://docs.python.org/3/library/venv.html). | ||
|
||
<Tabs> | ||
<TabItem label="Linux" icon="linux"> | ||
<Code code={"source venv/bin/activate"} lang="shell" /> | ||
</TabItem> | ||
<TabItem label="macOS" icon="apple"> | ||
<Code code={"source venv/bin/activate"} lang="shell" /> | ||
</TabItem> | ||
<TabItem label="Windows" icon="seti:windows"> | ||
<Code code={"venv\\Scripts\\activate"} lang="shell" /> | ||
</TabItem> | ||
</Tabs> | ||
|
||
5. Install dependencies in **`requirements.txt`** via **`pip`**. | ||
|
||
<Code code={"pip install -r requirements.txt"} lang="shell" /> | ||
|
||
6. Done! | ||
|
||
<Aside type="tip" title="3, 2, 1... Launch!"> | ||
Clusview is ready to work for you, start exploring your data! | ||
</Aside> | ||
</Steps> |
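
Before installing dependencies in step 5, it can help to confirm that the interpreter matches the version mentioned in the compatibility note and that the virtual environment from steps 3 and 4 is active. The following is a minimal sketch using only the Python standard library. The function names and the `DEVELOPED_WITH` constant are hypothetical helpers for illustration and are not part of Clusview.

```python
import sys

# Major and minor version Clusview was developed with (3.12.3 per the note above).
DEVELOPED_WITH = (3, 12)


def in_virtual_environment() -> bool:
    """Return True when the interpreter runs inside a venv-created environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def matches_developed_version(version=tuple(sys.version_info[:2])) -> bool:
    """Return True when major.minor equals the version Clusview was developed with."""
    return tuple(version[:2]) == DEVELOPED_WITH


if __name__ == "__main__":
    running = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Running Python {running}")
    if not matches_developed_version():
        # Other versions may still work; they are simply untested.
        print("Note: Clusview was developed with Python 3.12.3.")
    if not in_virtual_environment():
        print("Warning: no active virtual environment detected.")
```

Saving this as a script (for example `check_env.py`, a name chosen here only for illustration) and running it with `python check_env.py` after activating the environment prints the interpreter version and any warnings. It only reports; it does not block the installation.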