Merge pull request #780 from pandas-profiling/develop

v3.0.0
ydataai · May 11, 2021 · 22da35e
2 parents 02ed31a + 62f8e3f
commit 22da35e
Showing 156 changed files with 3,335 additions and 2,825 deletions.
diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md
@@ -1,69 +1,69 @@
----
-name: Bug report
-about: Create a report to help us improve
-title: ''
-labels: bug
-assignees: ''
-
----
-
-**Describe the bug**
-
-<!--
-A clear and concise description of what the bug is.
-If the description consists of multiple non-related bugs, you are encouraged to create separate issues.
--->
-
-**To Reproduce**
-
-<!--
-We would need to reproduce your scenario before being able to resolve it. 
-
-_Data:_
-Please share your dataframe. 
-If the data is confidential, for example when it contains company-sensitive information, provide us with a synthetic or open dataset that produces the same error. 
-You should provide the DataFrame structure, for example by reporting the output of `df.info()`. 
-You can anonymize the column names if necessary.
-
-_Code:_ Preferably, use this code format:
-```python
-"""
-Test for issue XXX:
-https://github.com/pandas-profiling/pandas-profiling/issues/XXX
-"""
-import pandas as pd
-import pandas_profiling
-
-
-def test_issueXXX():
-    df = pd.read_csv(r'<file>')
-
-    # Minimal reproducible code
-```
---> 
-
-**Version information:**
-
-<!--
-Version information is essential in reproducing and resolving bugs. Please report:
-
-* _Python version_: Your exact Python version.
-* _Environment_: Where do you run the code? Command line, IDE (PyCharm, Spyder, IDLE etc.), Jupyter Notebook (Colab or local)
-* _`pip`_: If you are using `pip`, run `pip freeze` in your environment and report the results. The list of packages can be rather long, you can use the snippet below to collapse the output.
-
-<details><summary>Click to expand <strong><em>Version information</em></strong></summary>
-<p>
-
-```
-<<< Put your version information here >>>
-```
-
-</p>
-</details>
--->
-
-**Additional context**
-
-<!--
-Add any other context about the problem here.
--->
+---
+name: Bug report
+about: Create a report to help us improve
+title: ''
+labels: bug
+assignees: ''
+
+---
+
+**Describe the bug**
+
+<!--
+A clear and concise description of what the bug is.
+If the description consists of multiple non-related bugs, you are encouraged to create separate issues.
+-->
+
+**To Reproduce**
+
+<!--
+We would need to reproduce your scenario before being able to resolve it. 
+
+_Data:_
+Please share your dataframe. 
+If the data is confidential, for example when it contains company-sensitive information, provide us with a synthetic or open dataset that produces the same error. 
+You should provide the DataFrame structure, for example by reporting the output of `df.info()`. 
+You can anonymize the column names if necessary.
+
+_Code:_ Preferably, use this code format:
+```python
+"""
+Test for issue XXX:
+https://github.com/pandas-profiling/pandas-profiling/issues/XXX
+"""
+import pandas as pd
+import pandas_profiling
+
+
+def test_issueXXX():
+    df = pd.read_csv(r"<file>")
+
+    # Minimal reproducible code
+```
+--> 
+
+**Version information:**
+
+<!--
+Version information is essential in reproducing and resolving bugs. Please report:
+
+* _Python version_: Your exact Python version.
+* _Environment_: Where do you run the code? Command line, IDE (PyCharm, Spyder, IDLE etc.), Jupyter Notebook (Colab or local)
+* _`pip`_: If you are using `pip`, run `pip freeze` in your environment and report the results. The list of packages can be rather long, you can use the snippet below to collapse the output.
+
+<details><summary>Click to expand <strong><em>Version information</em></strong></summary>
+<p>
+
+```
+<<< Put your version information here >>>
+```
+
+</p>
+</details>
+-->
+
+**Additional context**
+
+<!--
+Add any other context about the problem here.
+-->
diff --git a/.github/workflows/pypi.yml b/.github/workflows/pypi.yml
@@ -37,9 +37,11 @@ jobs:
     - name: Install
       run: make install
 
+    - name: Lint
+      run: make lint
+
     - name: Make distribution
       run: |
-        check-manifest
         python setup.py sdist bdist_wheel
         twine check dist/*
 

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -1,6 +1,6 @@
 repos:
 -   repo: https://github.com/psf/black
-    rev: 21.4b2
+    rev: 21.5b0
     hooks:
     - id: black
       language_version: python3.8
@@ -14,7 +14,7 @@ repos:
     - id: nbqa-pyupgrade
       args: [ --nbqa-mutate, --py36-plus ]
 -   repo: https://github.com/asottile/pyupgrade
-    rev: v2.14.0
+    rev: v2.15.0
     hooks:
     -   id: pyupgrade
         args: ['--py36-plus','--exit-zero-even-if-changed']
@@ -29,7 +29,7 @@ repos:
     hooks:
     -   id: check-manifest
 -   repo: https://github.com/PyCQA/flake8
-    rev: "3.9.1"
+    rev: "3.9.2"
     hooks:
     -   id: flake8
         args: [ "--ignore=E203,E501,W291,W503,SFS301,SIM106" ]
@@ -39,6 +39,36 @@ repos:
           - flake8-simplify
           - flake8-eradicate
           - flake8-print
+-   repo: https://github.com/PyCQA/flake8
+    rev: "3.9.2"
+    hooks:
+    -   id: flake8
+        name: flake8-annotations
+        args: [ "--select=ANN001,ANN201,ANN202,ANN205,ANN206,ANN301" ]
+        additional_dependencies:
+          - flake8-annotations
+#          - flake8-annotations-complexity
+#          - flake8-type-checking
+        exclude: |
+          (?x)(
+            ^tests/|
+            ^docsrc/|
+            ^src/pandas_profiling/utils/common.py|
+            ^src/pandas_profiling/model/imghdr_patch.py
+          )
+
+-   repo: https://github.com/asottile/blacken-docs
+    rev: v1.10.0
+    hooks:
+    -   id: blacken-docs
+-   repo: https://github.com/pre-commit/pygrep-hooks
+    rev: v1.8.0
+    hooks:
+    -   id: rst-backticks
+-   repo: https://github.com/pre-commit/mirrors-mypy
+    rev: 'v0.812'
+    hooks:
+    -   id: mypy
 
 ci:
   autoupdate_commit_msg: 'ci: pre-commit-config update'
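
Note on the `exclude` block added for the `flake8-annotations` hook above: the value is a verbose-mode regular expression (`(?x)`), so the line breaks and indentation inside it are ignored. A minimal sketch of how such a pattern behaves, assuming pre-commit applies it to repository-relative paths with Python's `re.search`; the helper name `is_excluded` is hypothetical and only for illustration:

```python
import re

# Pattern copied from the `exclude` key in the hunk above. The (?x) flag
# enables verbose mode, so whitespace and newlines in the pattern are ignored.
EXCLUDE = re.compile(
    r"""(?x)(
      ^tests/|
      ^docsrc/|
      ^src/pandas_profiling/utils/common.py|
      ^src/pandas_profiling/model/imghdr_patch.py
    )
    """
)


def is_excluded(path: str) -> bool:
    """Hypothetical helper: True if the path matches the exclude pattern."""
    return EXCLUDE.search(path) is not None


if __name__ == "__main__":
    for candidate in (
        "tests/unit/test_example.py",
        "docsrc/source/conf.py",
        "src/pandas_profiling/utils/common.py",
        "src/pandas_profiling/config.py",
    ):
        print(candidate, is_excluded(candidate))
```

Because every alternative is anchored with `^`, only paths that start with one of the listed prefixes are skipped; a nested path such as `src/tests/helper.py` would still be checked. The dots in the file names are unescaped, so they match any character, which is harmless here but slightly broader than a literal match.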
diff --git a/Makefile b/Makefile
@@ -15,36 +15,23 @@ test:
 	pytest tests/unit/
 	pytest tests/issues/
 	pytest --nbval tests/notebooks/
-	flake8 . --select=E9,F63,F7,F82 --show-source --statistics
 	pandas_profiling -h
-	make typing
 
 test_cov:
 	pytest --cov=. tests/unit/
 	pytest --cov=. --cov-append tests/issues/
 	pytest --cov=. --cov-append --nbval tests/notebooks/
 	pandas_profiling -h
-	make typing
 
 examples:
 	find ./examples -maxdepth 2 -type f -name "*.py" -execdir python {} \;
 
-pypi_package:
-	make install
-	check-manifest
-	python setup.py sdist bdist_wheel
-	twine check dist/*
-	twine upload --skip-existing dist/*
-
 install:
 	pip install -e .[notebook]
 
 lint:
 	pre-commit run --all-files
 
-typing:
-	pytest --mypy -m mypy .
-
 clean:
 	git rm --cached `git ls-files -i --exclude-from=.gitignore`
 
@@ -54,4 +41,3 @@ all:
 	make examples
 	make docs
 	make test
-	make typing
diff --git a/README.md b/README.md
@@ -37,10 +37,12 @@ For each column the following statistics - if relevant for the column type - are
 
 ## Announcements
 
-**Version v2.13.0 released** featuring an exciting integration with Great Expectations that many of you requested (see details below).
+**Version v3.0.0 released** in which the report configuration was completely overhauled, providing a more intuitive API and fixing issues inherent to the previous global config. 
+
+This is the first release to adhere to the [Semver](https://semver.org/) and [Conventional Commits](https://conventionalcommits.org/) specifications.
 
 **Spark backend in progress**: We can happily announce that we're nearing v1 for the Spark backend for generating profile reports.
-Stay tuned.
+Beta testers wanted! The Spark backend will be released as a pre-release for this package.
 
 ### Support `pandas-profiling`
 
@@ -51,10 +53,10 @@ It's extra exciting that GitHub **matches your contribution** for the first year
 
 Find more information here:
 
- - [Changelog v2.13.0](https://pandas-profiling.github.io/pandas-profiling/docs/master/rtd/pages/changelog.html#changelog)
+ - [Changelog v3.0.0](https://pandas-profiling.github.io/pandas-profiling/docs/master/rtd/pages/changelog.html#changelog)
  - [Sponsor the project on GitHub](https://github.com/sponsors/sbrugman)
 
-_May 8, 2021 💘_
+_May 9, 2021 💘_
 
 ---
 
@@ -149,10 +151,7 @@ import numpy as np
 import pandas as pd
 from pandas_profiling import ProfileReport
 
-df = pd.DataFrame(
-    np.random.rand(100, 5),
-    columns=["a", "b", "c", "d", "e"]
-)
+df = pd.DataFrame(np.random.rand(100, 5), columns=["a", "b", "c", "d", "e"])
 ```
 To generate the report, run:
 ```python
@@ -164,7 +163,7 @@ profile = ProfileReport(df, title="Pandas Profiling Report")
 You can configure the profile report in any way you like. The example code below loads the [explorative configuration file](https://github.com/pandas-profiling/pandas-profiling/blob/master/src/pandas_profiling/config_explorative.yaml), that includes many features for text (length distribution, unicode information), files (file size, creation time) and images (dimensions, exif information). If you are interested what exact settings were used, you can compare with the [default configuration file](https://github.com/pandas-profiling/pandas-profiling/blob/master/src/pandas_profiling/config_default.yaml).
 
 ```python
-profile = ProfileReport(df, title='Pandas Profiling Report', explorative=True)
+profile = ProfileReport(df, title="Pandas Profiling Report", explorative=True)
 ```
 
 Learn more about configuring `pandas-profiling` on the [Advanced usage](https://pandas-profiling.github.io/pandas-profiling/docs/master/rtd/pages/advanced_usage.html) page.
@@ -248,7 +247,9 @@ You find the configuration docs on the advanced usage page [here](https://pandas
 
 **Example**
 ```python
-profile = df.profile_report(title='Pandas Profiling Report', plot={'histogram': {'bins': 8}})
+profile = df.profile_report(
+    title="Pandas Profiling Report", plot={"histogram": {"bins": 8}}
+)
 profile.to_file("output.html")
 ```
 

diff --git a/docsrc/source/conf.py b/docsrc/source/conf.py
@@ -44,13 +44,13 @@ def _GetApiWrapperVersion():
 # ones.
 extensions = [
     "recommonmark",
-    # "sphinx_multiversion",
     "sphinx.ext.autodoc",
     "sphinx.ext.autosummary",
     "sphinx.ext.coverage",
     "sphinx.ext.napoleon",
     "sphinx_autodoc_typehints",
     "sphinx.ext.viewcode",
+    "sphinxcontrib.autodoc_pydantic",
 ]
 
 # Add any paths that contain templates here, relative to this directory.
@@ -82,3 +82,5 @@ def _GetApiWrapperVersion():
 autodoc_mock_imports = [""]
 autoclass_content = "both"
 autosummary_generate = True
+
+autodoc_pydantic_model_show_json = False