Brute Force Plotter

[Work in progress] A tool to visualize data quickly, without having to think about which plots to create

Installation

The tool is not packaged yet; it will be packaged soon.

Example

It has been tested on Python 3 only.

$ git clone https://github.com/eyadsibai/brute_force_plotter.git
$ cd brute_force_plotter
$ pip3 install -r requirements.txt
$ python3 brute_force_plotter.py example/titanic.csv example/titanic_dtypes.json example/output
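  • a starting point for the data types file can be dumped from a pandas DataFrame df (the dumped values are pandas dtype names and still have to be replaced by c, n, or i): json.dump({k:v.name for k,v in df.dtypes.to_dict().items()},open('dtypes.json','w'))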
  • the first argument is the input file (a CSV file with the data), e.g. example/titanic.csv
  • the second argument is a JSON file with the data type of each column (c for category, n for numeric, i for ignore), e.g. example/titanic_dtypes.json
{
"Survived": "c",
"Pclass": "c",
"Sex": "c",
"Age": "n",
"SibSp": "n",
"Parch": "n",
"Fare": "n",
"Embarked": "c",
"PassengerId": "i",
"Ticket": "i",
"Cabin": "i",
"Name": "i"
}
  • the third argument is the output directory, e.g. example/output
  • c stands for category, i stands for ignore, n for numeric
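
The data types file can also be drafted programmatically. The following is a minimal sketch, not part of brute_force_plotter: the helper names suggest_code and convert_dtypes are hypothetical, and the mapping rules (integer and float dtypes become n, object/category/bool become c, everything else i) are an assumption about what a reasonable first guess looks like. It takes the dtype names written by the json.dump one-liner and converts them to the c/n/i codes described above.

```python
import json


def suggest_code(dtype_name):
    """Guess a c/n/i code from a pandas dtype name (hypothetical helper)."""
    # Assumption: integer and float columns are treated as numeric.
    if dtype_name.startswith(("int", "uint", "float")):
        return "n"
    # Assumption: strings, categoricals and booleans are treated as categories.
    if dtype_name in ("object", "category", "bool"):
        return "c"
    # Anything else (e.g. datetimes) is ignored until reviewed by hand.
    return "i"


def convert_dtypes(pandas_dtypes):
    """Map {column: pandas dtype name} to {column: 'c' | 'n' | 'i'}."""
    return {column: suggest_code(name) for column, name in pandas_dtypes.items()}


# Illustrative dtype names for three Titanic columns.
dumped = {"Survived": "int64", "Sex": "object", "Age": "float64"}
draft = convert_dtypes(dumped)

# Write the draft so it can be edited by hand before running the plotter.
with open("dtypes_draft.json", "w") as handle:
    json.dump(draft, handle, indent=2)
```

The draft still needs a manual pass. For example, Survived is stored as an integer and would be suggested as n here, while example/titanic_dtypes.json marks it as c; identifier-like columns such as PassengerId or Name would likewise need to be set to i by hand.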

Age Distribution (Histogram with Kernel Density Estimation, Violin Plot)

Heatmap for Sex and Pclass

Pclass vs Survived

Survived vs Age

Age vs Fare

TODO

  • Target variable support
  • Clean up part of the code
  • More documentation
  • Tests?
  • Support 3 variables (contour plots, etc.)
  • Fallback for large datasets
  • Figure out the data type or suggest some
  • Map visualization (if geocoordinates)
  • Minimize the number of plots
  • Support for Time Series