This repository contains the Python implementation for the paper "Second-Order Unsupervised Feature Selection via Knowledge Contrastive Distillation".

brandeis-machine-learning/Second-Order-Unsupervised-Feature-Selection

SOFT

Paper Abstract

Unsupervised feature selection aims to select a subset from the original features that are most useful for the downstream tasks without external guidance information. While most unsupervised feature selection methods focus on ranking features based on the intrinsic properties of data, they do not pay much attention to the relationships between features, which often leads to redundancy among the selected features. In this paper, we propose a two-stage Second-Order unsupervised Feature selection via knowledge contrastive disTillation (SOFT) model that incorporates the second-order covariance matrix with the first-order data matrix for unsupervised feature selection. In the first stage, we learn a sparse attention matrix that can represent second-order relations between features. In the second stage, we build a relational graph based on the learned attention matrix and adopt graph segmentation for feature selection. Experimental results on 12 public datasets demonstrate the effectiveness of our proposed method.
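The two-stage idea in the abstract can be illustrated with a small toy sketch. This is not the SOFT model: the learned sparse attention matrix is replaced here by a thresholded absolute correlation matrix, and the graph segmentation step is replaced by plain connected components, with the highest-variance feature kept from each group. The function names and the `threshold` parameter are assumptions made for illustration only.

```python
import numpy as np


def relation_matrix(X, threshold=0.5):
    """Toy stand-in for the learned attention matrix: thresholded |correlation|."""
    corr = np.abs(np.corrcoef(X, rowvar=False))  # (d, d) feature-feature relations
    corr = np.nan_to_num(corr)                   # constant features give NaN
    np.fill_diagonal(corr, 0.0)                  # ignore self-relations
    corr[corr < threshold] = 0.0                 # crude sparsification
    return corr


def connected_groups(A):
    """Group features that are linked in the relation graph (union-find)."""
    d = A.shape[0]
    parent = list(range(d))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i, j in zip(*np.nonzero(A)):
        ri, rj = find(int(i)), find(int(j))
        if ri != rj:
            parent[rj] = ri

    groups = {}
    for i in range(d):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def select_features(X, threshold=0.5):
    """Keep one feature (the highest-variance one) from each related group."""
    A = relation_matrix(X, threshold)
    variances = X.var(axis=0)
    return sorted(max(g, key=lambda i: variances[i]) for g in connected_groups(A))
```

Calling `select_features(X)` on a data matrix `X` of shape (samples, features) returns the indices of the retained features. Strongly related features end up in the same group, so only one of them is kept, which is the redundancy-reduction effect the abstract describes.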

Requirements

  • faiss
  • h5py
  • networkx
  • numpy
  • nxmetis
  • pandas
  • pytorch
  • scipy
  • skfeature
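Before running either stage, it can help to confirm that the packages above are importable. The sketch below uses only the standard library. The import names are assumptions derived from the package names listed above; in particular, the "pytorch" package is imported as `torch`.

```python
import importlib.util

# Import names are assumed from the package names in the requirements list;
# "pytorch" is imported as "torch".
REQUIRED_MODULES = [
    "faiss", "h5py", "networkx", "numpy", "nxmetis",
    "pandas", "torch", "scipy", "skfeature",
]


def check_requirements(modules=REQUIRED_MODULES):
    """Return {module_name: True/False} depending on whether it can be found."""
    status = {}
    for name in modules:
        try:
            status[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{name:10s} {'ok' if ok else 'MISSING'}")
```

The check only locates each module without importing it, so it is fast and does not trigger GPU or native-library initialization.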

File Description

  • clustering.py: clustering method used to obtain pseudo-labels
  • evaluate.py: the second stage of SOFT, which selects features and evaluates them
  • model.py: PyTorch implementation of the SOFT model
  • selector.py: the first stage of SOFT, which learns the second-order feature relations
  • util.py: helper functions

Datasets

Methods for Comparison

How to Run

In the first stage, SOFT learns the second-order feature relation matrix by running:

python selector.py

In the second stage, SOFT selects features and evaluates the selection. Run:

python evaluate.py
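The two commands above can also be chained so that the second stage only starts if the first one succeeds. The following is a minimal sketch, assuming both scripts are run from the repository root and need no command-line arguments (none are shown above). The `run_stages` helper and its `run` parameter are hypothetical names introduced for this example.

```python
import subprocess
import sys

# Stage order as described above: relation learning first, then selection.
STAGES = ["selector.py", "evaluate.py"]


def run_stages(stages=STAGES, run=subprocess.run):
    """Run each stage with the current interpreter, stopping on the first failure."""
    for script in stages:
        cmd = [sys.executable, script]
        print("running:", " ".join(cmd))
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit code, so later stages are skipped.
        run(cmd, check=True)
```

Calling `run_stages()` from the repository root runs both stages in order. The `run` parameter exists only so the function can be exercised without launching real processes.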
