PREDICTING A PULSAR STAR

HTRU2 is a data set which describes a sample of pulsar candidates collected during the High Time Resolution Universe Survey .

Pulsars are a rare type of Neutron star that produce radio emission detectable here on Earth. They are of considerable scientific interest as probes of space-time, the inter-stellar medium, and states of matter .

As pulsars rotate, their emission beam sweeps across the sky, and when this crosses our line of sight, produces a detectable pattern of broadband radio emission. As pulsars rotate rapidly, this pattern repeats periodically. Thus pulsar search involves looking for periodic radio signals with large radio telescopes.

Each pulsar produces a slightly different emission pattern, which varies slightly with each rotation . Thus a potential signal detection known as a 'candidate', is averaged over many rotations of the pulsar, as determined by the length of an observation. In the absence of additional info, each candidate could potentially describe a real pulsar. However in practice almost all detections are caused by radio frequency interference (RFI) and noise, making legitimate signals hard to find.

Machine learning tools are now being used to automatically label pulsar candidates to facilitate rapid analysis. Classification systems in particular are being widely adopted, which treat the candidate data sets as binary classification problems. Here the legitimate pulsar examples are a minority positive class, and spurious examples the majority negative class.

The data set shared here contains 16,259 spurious examples caused by RFI/noise, and 1,639 real pulsar examples. These examples have all been checked by human annotators.

Each row lists the variables first, and the class label is the final entry. The class labels used are 0 (negative) and 1 (positive).

Attribute Information:

Each candidate is described by 8 continuous variables, and a single class variable. The first four are simple statistics obtained from the integrated pulse profile (folded profile). This is an array of continuous variables that describe a longitude-resolved version of the signal that has been averaged in both time and frequency . The remaining four variables are similarly obtained from the DM-SNR curve . These are summarised below:

Mean of the integrated profile. Standard deviation of the integrated profile. Excess kurtosis of the integrated profile. Skewness of the integrated profile. Mean of the DM-SNR curve. Standard deviation of the DM-SNR curve. Excess kurtosis of the DM-SNR curve. Skewness of the DM-SNR curve. Class HTRU 2 Summary 17,898 total examples. 1,639 positive examples. 16,259 negative examples.

One can found the original source of this dataset at https://www.kaggle.com/pavanraj159/predicting-a-pulsar-star/home.

Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
README.md		README.md
confusion_matrix_wcw.png		confusion_matrix_wcw.png
plot_wcw.png		plot_wcw.png
predicting_pulsar_stars.ipynb		predicting_pulsar_stars.ipynb
pulsar_stars.csv		pulsar_stars.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PREDICTING A PULSAR STAR

Attribute Information:

About

Releases

Packages

Languages

alexandrehsd/predicting-pulsar-stars

Folders and files

Latest commit

History

Repository files navigation

PREDICTING A PULSAR STAR

Attribute Information:

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages