Merge dev into main #26

Merged: 2 commits, Dec 6, 2022
8 changes: 6 additions & 2 deletions README.md
@@ -22,7 +22,7 @@
</a>
<!-- Coveralls report -->
<a alt='Coveralls report' href='https://coveralls.io/github/WenjieDu/PyPOTS'>
-<img src='https://coveralls.io/repos/github/WenjieDu/PyPOTS/badge.svg'>
+<img src='https://img.shields.io/coverallsCoverage/github/WenjieDu/PyPOTS?branch=main&logo=coveralls&labelColor=3F5767'>
</a>
<!-- PyPI download number -->
<a alt='PyPI download number' href='https://pepy.tech/project/pypots'>
@@ -36,6 +36,10 @@
<a alt='CODE_OF_CONDUCT' href='CODE_OF_CONDUCT.md'>
<img src='https://img.shields.io/badge/Contributor%20Covenant-v2.1-4baaaa.svg'>
</a>
+<!-- Slack Workspace -->
+<a alt='Slack Workspace' href='https://join.slack.com/t/pypots-dev/shared_invite/zt-1gq6ufwsi-p0OZdW~e9UW_IA4_f1OfxA'>
+<img src='https://img.shields.io/badge/Slack-PyPOTS-grey?logo=slack&labelColor=4A154B&color=62BCE5'>
+</a>
</p>

⦿ `Motivation`: Due to all kinds of reasons like failure of collection sensors, communication error, and unexpected malfunction, missing values are common to see in time series from the real-world environment. This makes partially-observed time series (POTS) a pervasive problem in open-world modeling and prevents advanced data analysis. Although this problem is important, the area of data mining on POTS still lacks a dedicated toolkit. PyPOTS is created to fill in this blank.
@@ -85,7 +89,7 @@ or
## ❖ Attention 👀
The documentation and tutorials are under construction. And a short paper introducing PyPOTS is on the way! 🚀 Stay tuned please!

-‼️ PyPOTS is currently under developing. If you like it and look forward to its growth, <ins>please give PyPOTS a star and watch it to keep you posted on its progress and to let me know that its development is meaningful</ins>. If you have any feedback, or want to contribute ideas/suggestions or share time-series related algorithms/papers, please join PyPOTS community and <a alt='GitHub Discussions' href='https://github.com/WenjieDu/PyPOTS/discussions'><img align='center' src='https://img.shields.io/badge/Chat-in_Discussions-green?logo=github&color=60A98D'></a>, or create an issue.
+‼️ PyPOTS is currently under developing. If you like it and look forward to its growth, <ins>please give PyPOTS a star and watch it to keep you posted on its progress and to let me know that its development is meaningful</ins>. If you have any feedback, or want to contribute ideas/suggestions or share time-series related algorithms/papers, please join PyPOTS community and chat on <a alt='Slack Workspace' href='https://join.slack.com/t/pypots-dev/shared_invite/zt-1gq6ufwsi-p0OZdW~e9UW_IA4_f1OfxA'><img align='center' src='https://img.shields.io/badge/Slack-PyPOTS-grey?logo=slack&labelColor=4A154B&color=62BCE5'></a>, or create an issue. If you have any additional questions or have interests in collaboration, please take a look at [my GitHub profile](https://github.com/WenjieDu) and feel free to contact me 😃.

Thank you all for your attention! 😃

2 changes: 1 addition & 1 deletion pypots/data/load_specific_datasets.py
@@ -43,7 +43,7 @@ def preprocess_physionet2012(data):
     def apply_func(df_temp):  # pad and truncate to set the max length of samples as 48
         missing = list(set(range(0, 48)).difference(set(df_temp["Time"])))
         missing_part = pd.DataFrame({"Time": missing})
-        df_temp = df_temp.append(missing_part, ignore_index=False, sort=False)  # pad
+        df_temp = pd.concat([df_temp, missing_part], ignore_index=False, sort=False)  # pad
         df_temp = df_temp.set_index("Time").sort_index().reset_index()
         df_temp = df_temp.iloc[:48]  # truncate
         return df_temp
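
For reference, `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, which is presumably why this hunk switches to `pd.concat`. The following is a minimal standalone sketch of the same pad-and-truncate step, not the project's actual function: the name `pad_and_truncate`, the `max_len` parameter, and the toy `HR` column are assumptions made for illustration.

```python
import pandas as pd


def pad_and_truncate(df_temp, max_len=48):
    """Hypothetical standalone version of the padding step in this hunk."""
    # Time steps in [0, max_len) that have no row in this record
    missing = sorted(set(range(max_len)).difference(df_temp["Time"]))
    missing_part = pd.DataFrame({"Time": missing})
    # pd.concat replaces the removed DataFrame.append; columns absent from
    # missing_part (here "HR") are filled with NaN in the padded rows
    df_temp = pd.concat([df_temp, missing_part], ignore_index=False, sort=False)
    # Order rows by time step, then keep at most max_len rows
    df_temp = df_temp.set_index("Time").sort_index().reset_index()
    return df_temp.iloc[:max_len]


# Toy record observed only at time steps 0, 2 and 5
record = pd.DataFrame({"Time": [0, 2, 5], "HR": [80.0, 82.0, 79.0]})
padded = pad_and_truncate(record, max_len=6)
print(padded)
```

In this toy record the padded rows at time steps 1, 3 and 4 carry `NaN` in `HR`, which is the partially-observed structure the downstream models expect.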
16 changes: 0 additions & 16 deletions pypots/tests/unified_data_for_test.py
@@ -5,7 +5,6 @@
# Created by Wenjie Du <wenjay.du@gmail.com>
# License: GLP-v3

-import pandas as pd
import torch
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
@@ -75,21 +74,6 @@ def gene_physionet2012():
     # generate samples
     df = load_specific_dataset("physionet_2012")
     X = df["X"]
-    X = X.drop(df["static_features"], axis=1)
-
-    def apply_func(df_temp):
-        missing = list(set(range(0, 48)).difference(set(df_temp["Time"])))
-        missing_part = pd.DataFrame({"Time": missing})
-        df_temp = df_temp.append(missing_part, ignore_index=False, sort=False)
-        df_temp = df_temp.set_index("Time").sort_index().reset_index()
-        df_temp = df_temp.iloc[:48]
-        return df_temp
-
-    X = X.groupby("RecordID").apply(apply_func)
-    X = X.drop("RecordID", axis=1)
-    X = X.reset_index()
-    X = X.drop(["level_1", "Time"], axis=1)
-
     y = df["y"]
     all_recordID = X["RecordID"].unique()
     train_set_ids, test_set_ids = train_test_split(all_recordID, test_size=0.2)