
Commit

update
Caixc97 committed Feb 2, 2024
1 parent 62ff20f commit 68bd0a1
Showing 3 changed files with 74 additions and 87 deletions.
49 changes: 0 additions & 49 deletions _posts/2020-02-26-flake-it-till-you-make-it.md

This file was deleted.

73 changes: 73 additions & 0 deletions _posts/Abstract.md
@@ -0,0 +1,73 @@
---
layout: page
title: Abstract
---
We introduce Uni-SMART (Universal Science Multimodal Analysis and Research Transformer), a pioneering scientific multimodal model by DP Technology, aimed at revolutionizing the efficiency of scientific document analysis.
In the realm of research, the exhaustive review of literature is a pivotal yet notably labor-intensive task.
In drug discovery, for instance, practitioners must delve into an expansive corpus of literature to identify key functional areas of targets and active molecules.
Traditional rule-based databases, like Sci-Finder, offer search capabilities but still necessitate manual filtration and perusal of a vast number of documents.
Uni-SMART bridges this gap by combining multimodal search features with natural language processing, enabling automated, precise extraction of relevant data from domain databases.
It surpasses conventional text-only models by integrating multimodal elements such as tables, equations, and chemical structures.
This innovation in multimodal alignment allows Uni-SMART to deliver a nuanced understanding of scientific literature, outperforming unimodal language models in comprehension and response accuracy.
Uni-SMART represents a significant leap forward in the domain of scientific literature review, providing a powerful tool for researchers to rapidly assimilate and interact with complex multimodal data.
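
As a purely illustrative sketch of the interaction described above, the snippet below shows how a client might pose a natural-language question against a multimodal document. The endpoint, request fields, and response shape are hypothetical assumptions for illustration, not a documented Uni-SMART API.

```javascript
// Purely illustrative: a hypothetical client for a multimodal document-analysis
// service of the kind described above. The endpoint, request fields, and
// response shape are assumptions, not a documented Uni-SMART API.
async function extractFromDocument(documentUrl, question) {
  const response = await fetch("https://example.com/api/analyze", { // hypothetical endpoint
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      document: documentUrl,                              // PDF containing text, tables, and structures
      query: question,                                    // natural-language question
      modalities: ["text", "table", "chemical_structure"] // assumed option names
    })
  });
  return response.json(); // e.g. { answer: "...", sources: [...] }
}

// Example: ask for tabulated activity data that a text-only model could miss.
extractFromDocument("https://example.com/paper.pdf", "List the IC50 values reported in Table 2.")
  .then(result => console.log(result.answer));
```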

[//]: # (## Introduction)

[//]: # ()
[//]: # (**Background.**)

[//]: # (Scientific literature, encompassing patents and academic papers, is a treasure trove of valuable data, including but not limited to drug properties and activities, reaction pathways, manufacturing processes, and omics relationships. )

[//]: # (The extraction of this data, however, is notoriously labor-intensive and time-consuming. )

[//]: # (It requires meticulous manual reading, analysis, and extraction, processes that are not only slow but also prone to human error.)

[//]: # (Existing non-heuristic databases such as Sci-Finder and Reaxys rely heavily on human experts to perform these extractions. )

[//]: # (While they are effective in supporting certain types of data retrieval, such as chemical reactions, they lack the capacity for automatic extraction from newly published documents. )

[//]: # (This limitation poses a significant bottleneck in the timely utilization of scientific data, impeding research progress and the rapid application of new discoveries.)

[//]: # (Thus, researchers and practitioners are in need of an intelligent navigator that can swiftly guide them through the complexities of the latest scientific data, identify relevant information with precision, and present it in a digestible format.)

[//]: # ()
[//]: # (**Advent of LLMs.**)

[//]: # (The advent of large language models (LLMs) such as ChatGPT has heralded a new era of natural language processing, demonstrating remarkable proficiency in a myriad of natural language tasks. )

[//]: # (There has been a proliferation of literature assistance tools based on such models, like ChatPDF, which facilitate the extraction of text from PDF documents and engage in natural language question-answering. )

[//]: # (However, these tools are tailored predominantly for text extraction, and while they excel in processing and generating human-like text, they falter when confronted with the multimodal nature of scientific literature.)

[//]: # (Scientific documents are replete with multimodal information that extends beyond text, including but not limited to statistical tables, molecule graphs, and chemical reactions. )

[//]: # (The extraction and interpretation of such multimodal data require an understanding that transcends textual information and delves into the realm of visual and structural data representation. )

[//]: # ()
[//]: # (**Briefs of Uni-SMART.**)

[//]: # (To address these challenges, we have developed Uni-SMART (Universal Science Multimodal Analysis and Research Transformer), which extends the capabilities of LLMs beyond text, allowing for the interpretation of the rich visual and structural information that is paramount in scientific documentation. )

[//]: # ([Add some details about multi-modal abilities if possible.])

[//]: # (This innovative approach not only augments the automated and precise data extraction process but also enriches the interaction between researchers and the vast expanse of scientific knowledge, paving the way for a more holistic and efficient research methodology.)

[//]: # ()
[//]: # (**Evaluation of Uni-SMART.**)

[//]: # (To rigorously assess the multimodal capabilities of Uni-SMART, a comparative analysis was conducted against existing heuristic literature analysis tools, which are mostly based on LLMs.)

[//]: # (The tools included for comparison were ChatPDF, Claude, and GPT-4. )

[//]: # (Our evaluation focused on a broad set of functionalities crucial to scientific research: table information extraction, molecular formula recognition, Markush structure recognition, synonyms/IUPAC understanding, chart understanding, reaction equation recognition, multimodal understanding, multimodal reasoning, text understanding, and textual reasoning.)

[//]: # ([Add summary of evaluation results.])

[//]: # ()
[//]: # (**Outlines of this report.**)

[//]: # (In the following sections, we first provide an overview of the model architecture.)

[//]: # (We then present detailed evaluations of Uni-SMART.)

[//]: # ([Briefs for more chapters if possible.])
39 changes: 1 addition & 38 deletions index.html
@@ -4,42 +4,5 @@
subtitle: Universal Science Multimodal Analysis and Research Transformer
---

<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Uni-SMART</title>
<!-- Include the marked library -->
<script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
</head>
<body>
<img src="assets/img/Uni-SMART-framework.png" alt="Descriptive Alt Text" style="width:100%;">

<!-- Container for the Markdown content -->
<div id="markdown-content"></div>

<!-- Markdown content -->
<script type="text/markdown" id="markdown">
# Page Title

## sub title

![Image](assets/img/Uni-SMART-framework.png)
Here is some content from `page.md`:

- List item 1
- List item 2

Replace this script content with your actual markdown content from `page.md`.
</script>

<!-- Convert and display the Markdown using JavaScript -->
<script>
document.addEventListener('DOMContentLoaded', (event) => {
const markdownText = document.getElementById('markdown').innerText;
const htmlContent = marked.parse(markdownText);
document.getElementById('markdown-content').innerHTML = htmlContent;
});
</script>
</body>
</html>
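
The inline `<script type="text/markdown">` block above is a placeholder meant to be replaced with the content of `page.md`. Below is a minimal sketch of the same rendering approach that loads the Markdown from a separate file instead; it assumes marked.js is already included and that `page.md` is served alongside `index.html`, both of which are assumptions rather than part of this commit.

```javascript
// A minimal sketch, assuming marked.js is already loaded and that a page.md
// file is served alongside index.html (both assumptions, not part of this commit).
document.addEventListener('DOMContentLoaded', async () => {
  const response = await fetch('page.md');  // hypothetical path to the Markdown source
  const markdownText = await response.text();
  // Render the fetched Markdown into the same container used above.
  document.getElementById('markdown-content').innerHTML = marked.parse(markdownText);
});
```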
