Commit

Merge branch 'main' into depr-index-insert
jbrockmendel committed Oct 10, 2023
2 parents 0561ae8 + 66a54a3 commit eddf3a6
Showing 326 changed files with 4,363 additions and 2,580 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/code-checks.yml
@@ -124,7 +124,7 @@ jobs:
run: |
cd asv_bench
asv machine --yes
asv run --quick --dry-run --durations=30 --python=same
asv run --quick --dry-run --durations=30 --python=same --show-stderr
build_docker_dev_environment:
name: Build Docker Dev Environment
2 changes: 1 addition & 1 deletion .github/workflows/docbuild-and-upload.yml
@@ -67,7 +67,7 @@ jobs:
run: cp doc/cheatsheet/Pandas_Cheat_Sheet* web/build/

- name: Upload web
run: rsync -az --delete --exclude='pandas-docs' --exclude='docs' web/build/ web@${{ secrets.server_ip }}:/var/www/html
run: rsync -az --delete --exclude='pandas-docs' --exclude='docs' --exclude='benchmarks' web/build/ web@${{ secrets.server_ip }}:/var/www/html
if: github.event_name == 'push' && github.ref == 'refs/heads/main'

- name: Upload dev docs
6 changes: 3 additions & 3 deletions .github/workflows/unit-tests.yml
@@ -236,7 +236,7 @@ jobs:
. ~/virtualenvs/pandas-dev/bin/activate
python -m pip install --no-cache-dir -U pip wheel setuptools meson[ninja]==1.2.1 meson-python==0.13.1
python -m pip install numpy --config-settings=setup-args="-Dallow-noblas=true"
python -m pip install --no-cache-dir versioneer[toml] cython python-dateutil pytz pytest>=7.3.2 pytest-xdist>=2.2.0 pytest-asyncio>=0.17 hypothesis>=6.46.1
python -m pip install --no-cache-dir versioneer[toml] "cython<3.0.3" python-dateutil pytz pytest>=7.3.2 pytest-xdist>=2.2.0 pytest-asyncio>=0.17 hypothesis>=6.46.1
python -m pip install --no-cache-dir --no-build-isolation -e .
python -m pip list --no-cache-dir
export PANDAS_CI=1
@@ -274,7 +274,7 @@ jobs:
/opt/python/cp311-cp311/bin/python -m venv ~/virtualenvs/pandas-dev
. ~/virtualenvs/pandas-dev/bin/activate
python -m pip install --no-cache-dir -U pip wheel setuptools meson-python==0.13.1 meson[ninja]==1.2.1
python -m pip install --no-cache-dir versioneer[toml] cython numpy python-dateutil pytz pytest>=7.3.2 pytest-xdist>=2.2.0 pytest-asyncio>=0.17 hypothesis>=6.46.1
python -m pip install --no-cache-dir versioneer[toml] "cython<3.0.3" numpy python-dateutil pytz pytest>=7.3.2 pytest-xdist>=2.2.0 pytest-asyncio>=0.17 hypothesis>=6.46.1
python -m pip install --no-cache-dir --no-build-isolation -e .
python -m pip list --no-cache-dir
@@ -347,7 +347,7 @@ jobs:
python -m pip install --upgrade pip setuptools wheel meson[ninja]==1.2.1 meson-python==0.13.1
python -m pip install --pre --extra-index-url https://pypi.anaconda.org/scientific-python-nightly-wheels/simple numpy
python -m pip install versioneer[toml]
python -m pip install python-dateutil pytz tzdata cython hypothesis>=6.46.1 pytest>=7.3.2 pytest-xdist>=2.2.0 pytest-cov pytest-asyncio>=0.17
python -m pip install python-dateutil pytz tzdata "cython<3.0.3" hypothesis>=6.46.1 pytest>=7.3.2 pytest-xdist>=2.2.0 pytest-cov pytest-asyncio>=0.17
python -m pip install -ve . --no-build-isolation --no-index
python -m pip list
2 changes: 1 addition & 1 deletion .github/workflows/wheels.yml
@@ -138,7 +138,7 @@ jobs:
run: echo "sdist_name=$(cd ./dist && ls -d */)" >> "$GITHUB_ENV"

- name: Build wheels
uses: pypa/cibuildwheel@v2.16.0
uses: pypa/cibuildwheel@v2.16.2
with:
package-dir: ./dist/${{ matrix.buildplat[1] == 'macosx_*' && env.sdist_name || needs.build_sdist.outputs.sdist_file }}
env:
2 changes: 1 addition & 1 deletion .gitignore
@@ -39,7 +39,7 @@
.mesonpy-native-file.ini
MANIFEST
compile_commands.json
debug
.debug

# Python files #
################
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -84,7 +84,7 @@ repos:
'--filter=-readability/casting,-runtime/int,-build/include_subdir,-readability/fn_size'
]
- repo: https://github.com/pylint-dev/pylint
rev: v3.0.0a7
rev: v3.0.0b0
hooks:
- id: pylint
stages: [manual]
18 changes: 17 additions & 1 deletion asv_bench/benchmarks/algorithms.py
@@ -1,6 +1,7 @@
from importlib import import_module

import numpy as np
import pyarrow as pa

import pandas as pd

@@ -72,7 +73,16 @@ class Duplicated:
    params = [
        [True, False],
        ["first", "last", False],
        ["int", "uint", "float", "string", "datetime64[ns]", "datetime64[ns, tz]"],
        [
            "int",
            "uint",
            "float",
            "string",
            "datetime64[ns]",
            "datetime64[ns, tz]",
            "timestamp[ms][pyarrow]",
            "duration[s][pyarrow]",
        ],
    ]
    param_names = ["unique", "keep", "dtype"]

@@ -87,6 +97,12 @@ def setup(self, unique, keep, dtype):
            "datetime64[ns, tz]": pd.date_range(
                "2011-01-01", freq="H", periods=N, tz="Asia/Tokyo"
            ),
            "timestamp[ms][pyarrow]": pd.Index(
                np.arange(N), dtype=pd.ArrowDtype(pa.timestamp("ms"))
            ),
            "duration[s][pyarrow]": pd.Index(
                np.arange(N), dtype=pd.ArrowDtype(pa.duration("s"))
            ),
        }[dtype]
        if not unique:
            data = data.repeat(5)
3 changes: 0 additions & 3 deletions asv_bench/benchmarks/series_methods.py
@@ -28,9 +28,6 @@ def time_constructor_dict(self):
    def time_constructor_no_data(self):
        Series(data=None, index=self.idx)

    def time_constructor_fastpath(self):
        Series(self.array, index=self.idx2, name="name", fastpath=True)


class ToFrame:
    params = [["int64", "datetime64[ns]", "category", "Int64"], [None, "foo"]]
2 changes: 1 addition & 1 deletion asv_bench/benchmarks/tslibs/period.py
@@ -72,7 +72,7 @@ def time_now(self, freq):
        self.per.now(freq)

    def time_asfreq(self, freq):
        self.per.asfreq("A")
        self.per.asfreq("Y")

    def time_str(self, freq):
        str(self.per)
10 changes: 0 additions & 10 deletions ci/code_checks.sh
@@ -63,16 +63,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then

MSG='Partially validate docstrings (EX03)' ; echo $MSG
$BASE_DIR/scripts/validate_docstrings.py --format=actions --errors=EX03 --ignore_functions \
pandas.Series.loc \
pandas.Series.iloc \
pandas.Series.pop \
pandas.Series.describe \
pandas.Series.skew \
pandas.Series.var \
pandas.Series.last \
pandas.Series.tz_convert \
pandas.Series.tz_localize \
pandas.Series.dt.month_name \
pandas.Series.dt.day_name \
pandas.Series.str.len \
pandas.Series.cat.set_categories \
2 changes: 1 addition & 1 deletion ci/deps/actions-310.yaml
@@ -6,7 +6,7 @@ dependencies:

# build dependencies
- versioneer[toml]
- cython>=0.29.33
- cython>=0.29.33, <3.0.3
- meson[ninja]=1.2.1
- meson-python=0.13.1

2 changes: 1 addition & 1 deletion ci/deps/actions-311-downstream_compat.yaml
@@ -7,7 +7,7 @@ dependencies:

# build dependencies
- versioneer[toml]
- cython>=0.29.33
- cython>=0.29.33, <3.0.3
- meson[ninja]=1.2.1
- meson-python=0.13.1

2 changes: 1 addition & 1 deletion ci/deps/actions-311-numpydev.yaml
@@ -8,7 +8,7 @@ dependencies:
- versioneer[toml]
- meson[ninja]=1.2.1
- meson-python=0.13.1
- cython>=0.29.33
- cython>=0.29.33, <3.0.3

# test dependencies
- pytest>=7.3.2
2 changes: 1 addition & 1 deletion ci/deps/actions-311-pyarrownightly.yaml
@@ -7,7 +7,7 @@ dependencies:
# build dependencies
- versioneer[toml]
- meson[ninja]=1.2.1
- cython>=0.29.33
- cython>=0.29.33, <3.0.3
- meson-python=0.13.1

# test dependencies
2 changes: 1 addition & 1 deletion ci/deps/actions-311.yaml
@@ -6,7 +6,7 @@ dependencies:

# build dependencies
- versioneer[toml]
- cython>=0.29.33
- cython>=0.29.33, <3.0.3
- meson[ninja]=1.2.1
- meson-python=0.13.1

2 changes: 1 addition & 1 deletion ci/deps/actions-39-minimum_versions.yaml
@@ -8,7 +8,7 @@ dependencies:

# build dependencies
- versioneer[toml]
- cython>=0.29.33
- cython>=0.29.33, <3.0.3
- meson[ninja]=1.2.1
- meson-python=0.13.1

2 changes: 1 addition & 1 deletion ci/deps/actions-39.yaml
@@ -6,7 +6,7 @@ dependencies:

# build dependencies
- versioneer[toml]
- cython>=0.29.33
- cython>=0.29.33, <3.0.3
- meson[ninja]=1.2.1
- meson-python=0.13.1

2 changes: 1 addition & 1 deletion ci/deps/actions-pypy-39.yaml
@@ -9,7 +9,7 @@ dependencies:

# build dependencies
- versioneer[toml]
- cython>=0.29.33
- cython>=0.29.33, <3.0.3
- meson[ninja]=1.2.1
- meson-python=0.13.1

2 changes: 1 addition & 1 deletion ci/deps/circle-310-arm64.yaml
@@ -6,7 +6,7 @@ dependencies:

# build dependencies
- versioneer[toml]
- cython>=0.29.33
- cython>=0.29.33, <3.0.3
- meson[ninja]=1.2.1
- meson-python=0.13.1

1 change: 0 additions & 1 deletion doc/redirects.csv
@@ -127,7 +127,6 @@ generated/pandas.api.types.is_number,../reference/api/pandas.api.types.is_number
generated/pandas.api.types.is_numeric_dtype,../reference/api/pandas.api.types.is_numeric_dtype
generated/pandas.api.types.is_object_dtype,../reference/api/pandas.api.types.is_object_dtype
generated/pandas.api.types.is_period_dtype,../reference/api/pandas.api.types.is_period_dtype
generated/pandas.api.types.is_period,../reference/api/pandas.api.types.is_period
generated/pandas.api.types.is_re_compilable,../reference/api/pandas.api.types.is_re_compilable
generated/pandas.api.types.is_re,../reference/api/pandas.api.types.is_re
generated/pandas.api.types.is_scalar,../reference/api/pandas.api.types.is_scalar
2 changes: 1 addition & 1 deletion doc/source/development/contributing_codebase.rst
@@ -528,7 +528,7 @@ If a test is known to fail but the manner in which it fails
is not meant to be captured, use ``pytest.mark.xfail``. It is common to use this method for a test that
exhibits buggy behavior or a non-implemented feature. If
the failing test has flaky behavior, use the argument ``strict=False``. This
will make it so pytest does not fail if the test happens to pass.
will make it so pytest does not fail if the test happens to pass. Using ``strict=False`` is highly undesirable; please use it only as a last resort.
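
As a minimal sketch of the guidance above, the two hypothetical tests below contrast the decorator form with ``strict=False`` (tolerated only for flaky failures) and an ``xfail`` mark attached through ``pytest.param``. The test names, reasons, and asserted values are assumptions made for illustration; they are not tests from the pandas suite.

```python
import pytest


# Flaky case: strict=False means an unexpected pass (XPASS) does not fail the run.
@pytest.mark.xfail(reason="hypothetical flaky failure", strict=False)
def test_flaky_behavior():
    assert 1 + 1 == 3  # stands in for intermittently buggy behavior


# pytest.param attaches the xfail mark to a single parametrized case,
# so the other cases are still collected and run as ordinary tests.
@pytest.mark.parametrize(
    "value",
    [
        1,
        pytest.param(-1, marks=pytest.mark.xfail(reason="hypothetical known bug")),
    ],
)
def test_value_is_positive(value):
    assert value > 0
```

Because both marks are applied at collection time, the expected failures are reported as such without the test body having to call ``pytest.xfail`` itself.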

Prefer the decorator ``@pytest.mark.xfail`` and the argument ``pytest.param``
over usage within a test so that the test is appropriately marked during the
30 changes: 28 additions & 2 deletions doc/source/development/debugging_extensions.rst
@@ -14,8 +14,8 @@ For Python developers with limited or no C/C++ experience this can seem a daunti
2. `Fundamental Python Debugging Part 2 - Python Extensions <https://willayd.com/fundamental-python-debugging-part-2-python-extensions.html>`_
3. `Fundamental Python Debugging Part 3 - Cython Extensions <https://willayd.com/fundamental-python-debugging-part-3-cython-extensions.html>`_

Generating debug builds
-----------------------
Debugging locally
-----------------

By default building pandas from source will generate a release build. To generate a development build you can type::

@@ -27,6 +27,32 @@ By default building pandas from source will generate a release build. To generat

By specifying ``builddir="debug"`` all of the targets will be built and placed in the debug directory relative to the project root. This helps to keep your debug and release artifacts separate; you are of course able to choose a different directory name or omit altogether if you do not care to separate build types.

Using Docker
------------

To simplify the debugging process, pandas has created a Docker image with a debug build of Python and the gdb/Cython debuggers pre-installed. You may either ``docker pull pandas/pandas-debug`` to get access to this image or build it from the ``tooling/debug`` folder locally.

You can then mount your pandas repository into this image via:

.. code-block:: sh

   docker run --rm -it -w /data -v ${PWD}:/data pandas/pandas-debug

Inside the image, you can use meson to build/install pandas and place the build artifacts into a ``debug`` folder using a command as follows:

.. code-block:: sh

   python -m pip install -ve . --no-build-isolation --config-settings=builddir="debug" --config-settings=setup-args="-Dbuildtype=debug"

If planning to use cygdb, the files required by that application are placed within the build folder. So you have to first ``cd`` to the build folder, then start that application.

.. code-block:: sh

   cd debug
   cygdb

Within the debugger you can use `cygdb commands <https://docs.cython.org/en/latest/src/userguide/debugging.html#using-the-debugger>`_ to navigate cython extensions.

Editor support
--------------

22 changes: 13 additions & 9 deletions doc/source/development/maintaining.rst
@@ -44,6 +44,9 @@ reading.
Issue triage
------------

Triage is an important first step in addressing issues reported by the community, and even
partial contributions are a great way to help maintain pandas. Only remove the "Needs Triage"
tag once all of the steps below have been completed.

Here's a typical workflow for triaging a newly opened issue.

@@ -67,9 +70,9 @@ Here's a typical workflow for triaging a newly opened issue.
3. **Is this a duplicate issue?**

We have many open issues. If a new issue is clearly a duplicate, label the
new issue as "Duplicate" assign the milestone "No Action", and close the issue
with a link to the original issue. Make sure to still thank the reporter, and
encourage them to chime in on the original issue, and perhaps try to fix it.
new issue as "Duplicate" and close the issue with a link to the original issue.
Make sure to still thank the reporter, and encourage them to chime in on the
original issue, and perhaps try to fix it.

If the new issue provides relevant information, such as a better or slightly
different example, add it to the original issue as a comment or an edit to
@@ -90,15 +93,20 @@ Here's a typical workflow for triaging a newly opened issue.
If a reproducible example is provided, but you see a simplification,
edit the original post with your simpler reproducible example.

Ensure the issue exists on the main branch and that it has the "Needs Triage" tag
until all steps have been completed. Add a comment to the issue once you have
verified it exists on the main branch, so others know it has been confirmed.

5. **Is this a clearly defined feature request?**

Generally, pandas prefers to discuss and design new features in issues, before
a pull request is made. Encourage the submitter to include a proposed API
for the new feature. Having them write a full docstring is a good way to
pin down specifics.

We'll need a discussion from several pandas maintainers before deciding whether
the proposal is in scope for pandas.
Tag new feature requests with "Needs Discussion", as we'll need a discussion
from several pandas maintainers before deciding whether the proposal is in
scope for pandas.

6. **Is this a usage question?**

@@ -117,10 +125,6 @@ Here's a typical workflow for triaging a newly opened issue.
If the issue is clearly defined and the fix seems relatively straightforward,
label the issue as "Good first issue".

Typically, new issues will be assigned the "Contributions welcome" milestone,
unless it's known that this issue should be addressed in a specific release (say
because it's a large regression).

Once you have completed the above, make sure to remove the "needs triage" label.

.. _maintaining.regressions:
1 change: 1 addition & 0 deletions doc/source/reference/extensions.rst
@@ -49,6 +49,7 @@ objects.
api.extensions.ExtensionArray.copy
api.extensions.ExtensionArray.view
api.extensions.ExtensionArray.dropna
api.extensions.ExtensionArray.duplicated
api.extensions.ExtensionArray.equals
api.extensions.ExtensionArray.factorize
api.extensions.ExtensionArray.fillna
2 changes: 1 addition & 1 deletion doc/source/user_guide/10min.rst
@@ -451,7 +451,7 @@ Merge
Concat
~~~~~~

pandas provides various facilities for easily combining together :class:`Series`` and
pandas provides various facilities for easily combining together :class:`Series` and
:class:`DataFrame` objects with various kinds of set logic for the indexes
and relational algebra functionality in the case of join / merge-type
operations.
2 changes: 1 addition & 1 deletion doc/source/user_guide/advanced.rst
@@ -976,7 +976,7 @@ of :ref:`frequency aliases <timeseries.offset_aliases>` with datetime-like inter
pd.interval_range(start=pd.Timestamp("2017-01-01"), periods=4, freq="W")
pd.interval_range(start=pd.Timedelta("0 days"), periods=3, freq="9H")
pd.interval_range(start=pd.Timedelta("0 days"), periods=3, freq="9h")
Additionally, the ``closed`` parameter can be used to specify which side(s) the intervals
are closed on. Intervals are closed on the right side by default.
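
A small sketch of the ``closed`` parameter described above. The integer endpoints and variable names are chosen purely for illustration.

```python
import pandas as pd

# Default: each interval includes its right endpoint, e.g. (0, 1].
right_closed = pd.interval_range(start=0, end=3)

# closed="left" includes the left endpoint instead, e.g. [0, 1).
left_closed = pd.interval_range(start=0, end=3, closed="left")

# closed="both" includes both endpoints, e.g. [0, 1].
both_closed = pd.interval_range(start=0, end=3, closed="both")

print(right_closed.closed, left_closed.closed, both_closed.closed)
```

The choice matters for membership checks at the boundaries: with the default, ``1`` belongs to the first interval and ``0`` does not, while ``closed="left"`` reverses that.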
14 changes: 0 additions & 14 deletions doc/source/user_guide/basics.rst
@@ -408,20 +408,6 @@ raise a ValueError:
pd.Series(['foo', 'bar', 'baz']) == pd.Series(['foo'])
Note that this is different from the NumPy behavior where a comparison can
be broadcast:

.. ipython:: python
np.array([1, 2, 3]) == np.array([2])
or it can return False if broadcasting can not be done:

.. ipython:: python
:okwarning:
np.array([1, 2, 3]) == np.array([1, 2])
Combining overlapping data sets
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
