Merge remote-tracking branch 'upstream/master' into bug/categorical-indexing-1row-df

* upstream/master: (194 commits)
  DOC Remove Python 2 specific comments from documentation (pandas-dev#31198)
  Follow up PR: pandas-dev#28097 Simplify branch statement (pandas-dev#29243)
  BUG: DatetimeIndex.snap incorrectly setting freq (pandas-dev#31188)
  Move DataFrame.info() to live with similar functions (pandas-dev#31317)
  ENH: accept a dictionary in plot colors (pandas-dev#31071)
  PERF: add shortcut to Timestamp constructor (pandas-dev#30676)
  CLN/MAINT: Clean and annotate stata reader and writers (pandas-dev#31072)
  REF: define _get_slice_axis in correct classes (pandas-dev#31304)
  BUG: DataFrame.floordiv(ser, axis=0) not matching column-wise behavior (pandas-dev#31271)
  PERF: optimize is_scalar, is_iterator (pandas-dev#31294)
  BUG: Series rolling count ignores min_periods (pandas-dev#30923)
  xfail sparse warning; closes pandas-dev#31310 (pandas-dev#31311)
  REF: DatetimeIndex.get_value wrap DTI.get_loc (pandas-dev#31314)
  CLN: internals.managers (pandas-dev#31316)
  PERF: avoid copies if possible in fill_binop (pandas-dev#31300)
  Add test for multiindex json (pandas-dev#31307)
  BUG: passing TDA and wrong freq to TimedeltaIndex (pandas-dev#31268)
  BUG: inconsistency between PeriodIndex.get_value vs get_loc (pandas-dev#31172)
  CLN: remove _set_subtyp (pandas-dev#31301)
  CI: Updated version of macos image (pandas-dev#31292)
  ...
keechongtan committed Jan 27, 2020
2 parents 241bd7c + ca3bfcc commit 41e6ce4
Showing 403 changed files with 8,078 additions and 12,046 deletions.
28 changes: 28 additions & 0 deletions .devcontainer.json
@@ -0,0 +1,28 @@
// For format details, see https://aka.ms/vscode-remote/devcontainer.json or the definition README at
// https://github.com/microsoft/vscode-dev-containers/tree/master/containers/python-3-miniconda
{
"name": "pandas",
"context": ".",
"dockerFile": "Dockerfile",

// Use 'settings' to set *default* container specific settings.json values on container create.
// You can edit these settings after create using File > Preferences > Settings > Remote.
"settings": {
"terminal.integrated.shell.linux": "/bin/bash",
"python.condaPath": "/opt/conda/bin/conda",
"python.pythonPath": "/opt/conda/bin/python",
"python.formatting.provider": "black",
"python.linting.enabled": true,
"python.linting.flake8Enabled": true,
"python.linting.pylintEnabled": false,
"python.linting.mypyEnabled": true,
"python.testing.pytestEnabled": true,
"python.testing.cwd": "pandas/tests"
},

// Add the IDs of extensions you want installed when the container is created in the array below.
"extensions": [
"ms-python.python",
"ms-vscode.cpptools"
]
}
6 changes: 3 additions & 3 deletions .github/CODE_OF_CONDUCT.md
@@ -54,10 +54,10 @@ incident.

This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 1.3.0, available at
- [http://contributor-covenant.org/version/1/3/0/][version],
+ [https://www.contributor-covenant.org/version/1/3/0/][version],
and the [Swift Code of Conduct][swift].

- [homepage]: http://contributor-covenant.org
- [version]: http://contributor-covenant.org/version/1/3/0/
+ [homepage]: https://www.contributor-covenant.org
+ [version]: https://www.contributor-covenant.org/version/1/3/0/
[swift]: https://swift.org/community/#code-of-conduct

2 changes: 1 addition & 1 deletion .github/CONTRIBUTING.md
@@ -16,7 +16,7 @@ If you notice a bug in the code or documentation, or have suggestions for how we

## Contributing to the Codebase

- The code is hosted on [GitHub](https://www.github.com/pandas-dev/pandas), so you will need to use [Git](http://git-scm.com/) to clone the project and make changes to the codebase. Once you have obtained a copy of the code, you should create a development environment that is separate from your existing Python environment so that you can make and test changes without compromising your own work environment. For more information, please refer to the "[Working with the code](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#working-with-the-code)" section.
+ The code is hosted on [GitHub](https://www.github.com/pandas-dev/pandas), so you will need to use [Git](https://git-scm.com/) to clone the project and make changes to the codebase. Once you have obtained a copy of the code, you should create a development environment that is separate from your existing Python environment so that you can make and test changes without compromising your own work environment. For more information, please refer to the "[Working with the code](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#working-with-the-code)" section.

Before submitting your changes for review, make sure to check that your changes do not break any tests. You can find more information about our test suites in the "[Test-driven development/code writing](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#test-driven-development-code-writing)" section. We also have guidelines regarding coding style that will be enforced during testing, which can be found in the "[Code standards](https://github.com/pandas-dev/pandas/blob/master/doc/source/development/contributing.rst#code-standards)" section.

11 changes: 5 additions & 6 deletions .github/workflows/assign.yml
@@ -7,9 +7,8 @@ jobs:
one:
runs-on: ubuntu-latest
steps:
- - name:
- run: |
- if [[ "${{ github.event.comment.body }}" == "take" ]]; then
- echo "Assigning issue ${{ github.event.issue.number }} to ${{ github.event.comment.user.login }}"
- curl -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" -d '{"assignees": ["${{ github.event.comment.user.login }}"]}' https://api.github.com/repos/${{ github.repository }}/issues/${{ github.event.issue.number }}/assignees
- fi
+ - if: github.event.comment.body == 'take'
+ name:
+ run: |
+ echo "Assigning issue ${{ github.event.issue.number }} to ${{ github.event.comment.user.login }}"
+ curl -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" -d '{"assignees": ["${{ github.event.comment.user.login }}"]}' https://api.github.com/repos/${{ github.repository }}/issues/${{ github.event.issue.number }}/assignees
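
For readers unfamiliar with the `curl` call in this step, the following is a minimal Python sketch of the same request. It only builds the request object and never sends it. The helper name `build_assign_request` and the sample repository, issue number, login, and token values are hypothetical. It assumes the endpoint shown in the workflow and a POST request, which is what `curl -d` issues by default. The guard is a plain case-sensitive Python comparison, as in the bash test; the workflow `if:` expression may compare strings differently.

```python
import json
import urllib.request


def build_assign_request(repo, issue_number, login, token, comment_body):
    """Return a POST request assigning `login` to the issue, or None.

    Mirrors the intent of the workflow step: nothing is built unless the
    comment body is "take".
    """
    if comment_body != "take":
        return None
    url = f"https://api.github.com/repos/{repo}/issues/{issue_number}/assignees"
    payload = json.dumps({"assignees": [login]}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Authorization": f"token {token}"},
        method="POST",
    )


# Hypothetical values, for illustration only; the request is not sent.
req = build_assign_request("pandas-dev/pandas", 12345, "some-user", "<token>", "take")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Keeping the request construction separate from sending it makes the guard and payload easy to check without touching the network.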
14 changes: 7 additions & 7 deletions .pre-commit-config.yaml
@@ -20,11 +20,11 @@ repos:
  rev: v0.730
  hooks:
  - id: mypy
- # We run mypy over all files because of:
- # * changes in type definitions may affect non-touched files.
- # * Running it with `mypy pandas` and the filenames will lead to
- # spurious duplicate module errors,
- # see also https://github.com/pre-commit/mirrors-mypy/issues/5
- pass_filenames: false
  args:
- - pandas
+ # As long as some files are excluded from check-untyped-defs
+ # we have to exclude it from the pre-commit hook as the configuration
+ # is based on modules but the hook runs on files.
+ - --no-check-untyped-defs
+ - --follow-imports
+ - skip
+ files: pandas/
33 changes: 16 additions & 17 deletions .travis.yml
@@ -7,10 +7,10 @@ python: 3.7
  # travis cache --delete inside the project directory from the travis command line client
  # The cache directories will be deleted if anything in ci/ changes in a commit
  cache:
- ccache: true
- directories:
- - $HOME/.cache # cython cache
- - $HOME/.ccache # compiler cache
+ ccache: true
+ directories:
+ - $HOME/.cache # cython cache
+ - $HOME/.ccache # compiler cache

  env:
  global:
@@ -20,30 +20,30 @@ env:
  - secure: "EkWLZhbrp/mXJOx38CHjs7BnjXafsqHtwxPQrqWy457VDFWhIY1DMnIR/lOWG+a20Qv52sCsFtiZEmMfUjf0pLGXOqurdxbYBGJ7/ikFLk9yV2rDwiArUlVM9bWFnFxHvdz9zewBH55WurrY4ShZWyV+x2dWjjceWG5VpWeI6sA="

  git:
- # for cloning
- depth: false
+ # for cloning
+ depth: false

  matrix:
- fast_finish: true
- exclude:
- # Exclude the default Python 3.5 build
- - python: 3.5
+ fast_finish: true

- include:
+ include:
  - env:
- - JOB="3.8" ENV_FILE="ci/deps/travis-38.yaml" PATTERN="(not slow and not network)"
+ - JOB="3.8" ENV_FILE="ci/deps/travis-38.yaml" PATTERN="(not slow and not network and not clipboard)"

  - env:
- - JOB="3.7" ENV_FILE="ci/deps/travis-37.yaml" PATTERN="(not slow and not network)"
+ - JOB="3.7" ENV_FILE="ci/deps/travis-37.yaml" PATTERN="(not slow and not network and not clipboard)"

  - env:
- - JOB="3.6, locale" ENV_FILE="ci/deps/travis-36-locale.yaml" PATTERN="((not slow and not network) or (single and db))" LOCALE_OVERRIDE="zh_CN.UTF-8" SQL="1"
+ - JOB="3.6, locale" ENV_FILE="ci/deps/travis-36-locale.yaml" PATTERN="((not slow and not network and not clipboard) or (single and db))" LOCALE_OVERRIDE="zh_CN.UTF-8" SQL="1"
  services:
  - mysql
  - postgresql

  - env:
- - JOB="3.6, coverage" ENV_FILE="ci/deps/travis-36-cov.yaml" PATTERN="((not slow and not network) or (single and db))" PANDAS_TESTING_MODE="deprecate" COVERAGE=true SQL="1"
+ # Enabling Deprecations when running tests
+ # PANDAS_TESTING_MODE="deprecate" causes DeprecationWarning messages to be displayed in the logs
+ # See pandas/_testing.py for more details.
+ - JOB="3.6, coverage" ENV_FILE="ci/deps/travis-36-cov.yaml" PATTERN="((not slow and not network and not clipboard) or (single and db))" PANDAS_TESTING_MODE="deprecate" COVERAGE=true SQL="1"
  services:
  - mysql
  - postgresql
@@ -73,7 +73,6 @@ before_install:
  # This overrides travis and tells it to look nowhere.
  - export BOTO_CONFIG=/dev/null

-
  install:
  - echo "install start"
  - ci/prep_cython_cache.sh
@@ -90,5 +89,5 @@ script:
  after_script:
  - echo "after_script start"
  - source activate pandas-dev && pushd /tmp && python -c "import pandas; pandas.show_versions();" && popd
- - ci/print_skipped.py
+ - ci/print_skipped.py
  - echo "after_script done"
2 changes: 1 addition & 1 deletion AUTHORS.md
@@ -14,7 +14,7 @@ About the Copyright Holders
The PyData Development Team is the collection of developers of the PyData
project. This includes all of the PyData sub-projects, including pandas. The
core team that coordinates development on GitHub can be found here:
- http://github.com/pydata.
+ https://github.com/pydata.

Full credits for pandas contributors can be found in the documentation.

47 changes: 47 additions & 0 deletions Dockerfile
@@ -0,0 +1,47 @@
FROM continuumio/miniconda3

# if you forked pandas, you can pass in your own GitHub username to use your fork
# i.e. gh_username=myname
ARG gh_username=pandas-dev
ARG pandas_home="/home/pandas"

# Avoid warnings by switching to noninteractive
ENV DEBIAN_FRONTEND=noninteractive

# Configure apt and install packages
RUN apt-get update \
&& apt-get -y install --no-install-recommends apt-utils dialog 2>&1 \
#
# Verify git, process tools, lsb-release (common in install instructions for CLIs) installed
&& apt-get -y install git iproute2 procps lsb-release \
#
# Install C compilers (gcc not enough, so just went with build-essential which admittedly might be overkill),
# needed to build pandas C extensions
&& apt-get -y install build-essential \
#
# cleanup
&& apt-get autoremove -y \
&& apt-get clean -y \
&& rm -rf /var/lib/apt/lists/*

# Switch back to dialog for any ad-hoc use of apt-get
ENV DEBIAN_FRONTEND=dialog

# Clone pandas repo
RUN mkdir "$pandas_home" \
&& git clone "https://github.com/$gh_username/pandas.git" "$pandas_home" \
&& cd "$pandas_home" \
&& git remote add upstream "https://github.com/pandas-dev/pandas.git" \
&& git pull upstream master

# Because it is surprisingly difficult to activate a conda environment inside a DockerFile
# (from personal experience and per https://github.com/ContinuumIO/docker-images/issues/89),
# we just update the base/root one from the 'environment.yml' file instead of creating a new one.
#
# Set up environment
RUN conda env update -n base -f "$pandas_home/environment.yml"

# Build C extensions and pandas
RUN cd "$pandas_home" \
&& python setup.py build_ext --inplace -j 4 \
&& python -m pip install -e .
4 changes: 3 additions & 1 deletion LICENSE
@@ -1,8 +1,10 @@
BSD 3-Clause License

- Copyright (c) 2008-2012, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
+ Copyright (c) 2008-2011, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
All rights reserved.

+ Copyright (c) 2011-2020, Open source contributors.
+
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

2 changes: 1 addition & 1 deletion RELEASE.md
@@ -3,4 +3,4 @@ Release Notes

The list of changes to Pandas between each release can be found
[here](https://pandas.pydata.org/pandas-docs/stable/whatsnew/index.html). For full
- details, see the commit logs at http://github.com/pandas-dev/pandas.
+ details, see the commit logs at https://github.com/pandas-dev/pandas.
1 change: 1 addition & 0 deletions asv_bench/asv.conf.json
@@ -43,6 +43,7 @@
"matplotlib": [],
"sqlalchemy": [],
"scipy": [],
+ "numba": [],
"numexpr": [],
"pytables": [null, ""], // platform dependent, see excludes below
"tables": [null, ""],
33 changes: 33 additions & 0 deletions asv_bench/benchmarks/attrs_caching.py
@@ -1,12 +1,18 @@
  import numpy as np

+ import pandas as pd
  from pandas import DataFrame

  try:
      from pandas.util import cache_readonly
  except ImportError:
      from pandas.util.decorators import cache_readonly

+ try:
+     from pandas.core.construction import extract_array
+ except ImportError:
+     extract_array = None
+


  class DataFrameAttributes:
      def setup(self):
@@ -20,6 +26,33 @@ def time_set_index(self):
          self.df.index = self.cur_index


+ class SeriesArrayAttribute:
+
+     params = [["numeric", "object", "category", "datetime64", "datetime64tz"]]
+     param_names = ["dtype"]
+
+     def setup(self, dtype):
+         if dtype == "numeric":
+             self.series = pd.Series([1, 2, 3])
+         elif dtype == "object":
+             self.series = pd.Series(["a", "b", "c"], dtype=object)
+         elif dtype == "category":
+             self.series = pd.Series(["a", "b", "c"], dtype="category")
+         elif dtype == "datetime64":
+             self.series = pd.Series(pd.date_range("2013", periods=3))
+         elif dtype == "datetime64tz":
+             self.series = pd.Series(pd.date_range("2013", periods=3, tz="UTC"))
+
+     def time_array(self, dtype):
+         self.series.array
+
+     def time_extract_array(self, dtype):
+         extract_array(self.series)
+
+     def time_extract_array_numpy(self, dtype):
+         extract_array(self.series, extract_numpy=True)
+
+
  class CacheReadonly:
      def setup(self):
          class Foo:
2 changes: 1 addition & 1 deletion asv_bench/benchmarks/pandas_vb_common.py
@@ -56,7 +56,7 @@
  def setup(*args, **kwargs):
      # This function just needs to be imported into each benchmark file to
      # set up the random seed before each function.
-     # http://asv.readthedocs.io/en/latest/writing_benchmarks.html
+     # https://asv.readthedocs.io/en/latest/writing_benchmarks.html
      np.random.seed(1234)


3 changes: 3 additions & 0 deletions asv_bench/benchmarks/reshape.py
@@ -161,6 +161,9 @@ def time_pivot_table_categorical_observed(self):
              observed=True,
          )

+     def time_pivot_table_margins_only_column(self):
+         self.df.pivot_table(columns=["key2", "key3"], margins=True)
+


  class Crosstab:
      def setup(self):
21 changes: 21 additions & 0 deletions asv_bench/benchmarks/rolling.py
@@ -44,6 +44,27 @@ def time_rolling(self, constructor, window, dtype, function, raw):
          self.roll.apply(function, raw=raw)


+ class Engine:
+     params = (
+         ["DataFrame", "Series"],
+         ["int", "float"],
+         [np.sum, lambda x: np.sum(x) + 5],
+         ["cython", "numba"],
+     )
+     param_names = ["constructor", "dtype", "function", "engine"]
+
+     def setup(self, constructor, dtype, function, engine):
+         N = 10 ** 3
+         arr = (100 * np.random.random(N)).astype(dtype)
+         self.data = getattr(pd, constructor)(arr)
+
+     def time_rolling_apply(self, constructor, dtype, function, engine):
+         self.data.rolling(10).apply(function, raw=True, engine=engine)
+
+     def time_expanding_apply(self, constructor, dtype, function, engine):
+         self.data.expanding().apply(function, raw=True, engine=engine)
+
+
  class ExpandingMethods:

      params = (
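
The `params` and `param_names` attributes on the `Engine` benchmark follow asv's parameterization convention: when `params` holds several lists, each `time_*` method is run once per combination (the Cartesian product), and the values are passed to `setup` and to the method. The sketch below imitates that expansion to show how many cases the four lists produce. `expand_params` is a hypothetical helper, not part of asv, and the two functions are replaced by string labels.

```python
import itertools


def expand_params(params, param_names):
    """Yield one dict per benchmark case, keyed by parameter name."""
    for combo in itertools.product(*params):
        yield dict(zip(param_names, combo))


# Same shape as Engine.params; the functions are replaced by labels here.
params = (
    ["DataFrame", "Series"],
    ["int", "float"],
    ["np.sum", "np.sum(x) + 5"],
    ["cython", "numba"],
)
param_names = ["constructor", "dtype", "function", "engine"]

cases = list(expand_params(params, param_names))
print(len(cases))  # 2 * 2 * 2 * 2 = 16
print(cases[0])
```

Each of the two `time_*` methods therefore runs for 16 parameter combinations, half of them with `engine="numba"`.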
12 changes: 10 additions & 2 deletions asv_bench/benchmarks/tslibs/timedelta.py
@@ -10,6 +10,11 @@


  class TimedeltaConstructor:
+     def setup(self):
+         self.nptimedelta64 = np.timedelta64(3600)
+         self.dttimedelta = datetime.timedelta(seconds=3600)
+         self.td = Timedelta(3600, unit="s")
+
      def time_from_int(self):
          Timedelta(123456789)

@@ -28,10 +33,10 @@ def time_from_components(self):
          )

      def time_from_datetime_timedelta(self):
-         Timedelta(datetime.timedelta(days=1, seconds=1))
+         Timedelta(self.dttimedelta)

      def time_from_np_timedelta(self):
-         Timedelta(np.timedelta64(1, "ms"))
+         Timedelta(self.nptimedelta64)

      def time_from_string(self):
          Timedelta("1 days")
@@ -42,6 +47,9 @@ def time_from_iso_format(self):
      def time_from_missing(self):
          Timedelta("nat")

+     def time_from_pd_timedelta(self):
+         Timedelta(self.td)
+


  class TimedeltaProperties:
      def setup_cache(self):
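
Building the inputs in `setup` means the `time_*` methods measure only the `Timedelta(...)` call itself, because asv runs `setup` outside the timed region. A rough sketch of the same separation using only the standard library's `timeit` follows. `to_seconds` is a hypothetical stand-in for the constructor under test, and absolute timings will vary by machine.

```python
import datetime
import timeit


def to_seconds(td):
    # Hypothetical stand-in for the call being benchmarked.
    return td.total_seconds()


namespace = {"to_seconds": to_seconds, "datetime": datetime}

# Input built inside the timed statement: construction cost is included.
inline = timeit.timeit(
    "to_seconds(datetime.timedelta(seconds=3600))",
    globals=namespace,
    number=10_000,
)

# Input built once in `setup`: only the call itself is timed.
prebuilt = timeit.timeit(
    "to_seconds(td)",
    setup="td = datetime.timedelta(seconds=3600)",
    globals=namespace,
    number=10_000,
)

print(f"inline input:   {inline:.6f} s")
print(f"prebuilt input: {prebuilt:.6f} s")
```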