diff --git a/doc/parameter.rst b/doc/parameter.rst
index 97f6232d8397..bb2666737c5b 100644
--- a/doc/parameter.rst
+++ b/doc/parameter.rst
@@ -16,10 +16,13 @@ Before running XGBoost, we must set three types of parameters: general parameter
   :backlinks: none
   :local:

+
+.. _global_config:
+
 ********************
 Global Configuration
 ********************
-The following parameters can be set in the global scope, using ``xgb.config_context()`` (Python) or ``xgb.set.config()`` (R).
+The following parameters can be set in the global scope, using :py:func:`xgboost.config_context()` (Python) or ``xgb.set.config()`` (R).

 * ``verbosity``: Verbosity of printing messages. Valid values of 0 (silent), 1 (warning), 2 (info), and 3 (debug).
 * ``use_rmm``: Whether to use RAPIDS Memory Manager (RMM) to allocate GPU memory. This option is only applicable when XGBoost is built (compiled) with the RMM plugin enabled. Valid values are ``true`` and ``false``.
diff --git a/doc/python/callbacks.rst b/doc/python/callbacks.rst
index b3302d7f7304..7cb257a819ed 100644
--- a/doc/python/callbacks.rst
+++ b/doc/python/callbacks.rst
@@ -2,10 +2,11 @@
 Callback Functions
 ##################

-This document gives a basic walkthrough of callback function used in XGBoost Python
-package.  In XGBoost 1.3, a new callback interface is designed for Python package, which
-provides the flexibility of designing various extension for training.  Also, XGBoost has a
-number of pre-defined callbacks for supporting early stopping, checkpoints etc.
+This document gives a basic walkthrough of the :ref:`callback API <callback_api>` used in
+the XGBoost Python package.  In XGBoost 1.3, a new callback interface was designed for
+the Python package, which provides the flexibility of designing various extensions for
+training.  XGBoost also has a number of pre-defined callbacks for supporting early
+stopping, checkpoints, etc.


 Using builtin callbacks
@@ -14,8 +15,8 @@ Using builtin callbacks

 By default, training methods in XGBoost have parameters like ``early_stopping_rounds``
 and ``verbose``/``verbose_eval``, when specified the training procedure will define the
 corresponding callbacks internally.  For example, when ``early_stopping_rounds`` is
-specified, ``EarlyStopping`` callback is invoked inside iteration loop.  You can also pass
-this callback function directly into XGBoost:
+specified, the :py:class:`EarlyStopping <xgboost.callback.EarlyStopping>` callback is invoked
+inside the iteration loop.  You can also pass this callback function directly into XGBoost:

 .. code-block:: python
@@ -54,6 +55,7 @@ this callback function directly into XGBoost:

 Defining your own callback
 --------------------------
-XGBoost provides an callback interface class: ``xgboost.callback.TrainingCallback``, user
-defined callbacks should inherit this class and override corresponding methods.  There's a
-working example in `demo/guide-python/callbacks.py `_
+XGBoost provides a callback interface class: :py:class:`TrainingCallback
+<xgboost.callback.TrainingCallback>`; user-defined callbacks should inherit this class and
+override the corresponding methods.  There's a working example in
+:ref:`sphx_glr_python_examples_callbacks.py`.
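The callback API documented above can be illustrated with a minimal sketch (not itself part of the patch): it passes the builtin ``EarlyStopping`` callback explicitly and defines a small custom ``TrainingCallback``. The synthetic data and the ``Monitor`` class are invented for illustration.

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    # Synthetic regression data, used here only for illustration.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(256, 8))
    y = X[:, 0] * 2.0 + rng.normal(scale=0.1, size=256)
    dtrain = xgb.DMatrix(X[:200], y[:200])
    dvalid = xgb.DMatrix(X[200:], y[200:])

    # The builtin early-stopping callback, passed explicitly instead of using
    # the ``early_stopping_rounds`` shortcut.
    early_stop = xgb.callback.EarlyStopping(rounds=5, save_best=True)

    class Monitor(xgb.callback.TrainingCallback):
        """A user-defined callback that prints the validation RMSE each round."""

        def after_iteration(self, model, epoch, evals_log):
            rmse = evals_log["valid"]["rmse"][-1]
            print(f"[{epoch}] valid-rmse={rmse:.5f}")
            return False  # returning True would stop training early

    booster = xgb.train(
        {"objective": "reg:squarederror"},
        dtrain,
        num_boost_round=100,
        evals=[(dvalid, "valid")],
        callbacks=[early_stop, Monitor()],
    )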
diff --git a/doc/python/python_api.rst b/doc/python/python_api.rst
index 2d0b1ed9f960..9f077edbc0df 100644
--- a/doc/python/python_api.rst
+++ b/doc/python/python_api.rst
@@ -77,15 +77,29 @@ Plotting API

 Callback API
 ------------
-.. autofunction:: xgboost.callback.TrainingCallback
+.. automodule:: xgboost.callback
+.. autoclass:: xgboost.callback.TrainingCallback
+   :members:

-.. autofunction:: xgboost.callback.EvaluationMonitor
+.. autoclass:: xgboost.callback.EvaluationMonitor
+   :members:
+   :inherited-members:
+   :show-inheritance:

-.. autofunction:: xgboost.callback.EarlyStopping
+.. autoclass:: xgboost.callback.EarlyStopping
+   :members:
+   :inherited-members:
+   :show-inheritance:

-.. autofunction:: xgboost.callback.LearningRateScheduler
+.. autoclass:: xgboost.callback.LearningRateScheduler
+   :members:
+   :inherited-members:
+   :show-inheritance:

-.. autofunction:: xgboost.callback.TrainingCheckPoint
+.. autoclass:: xgboost.callback.TrainingCheckPoint
+   :members:
+   :inherited-members:
+   :show-inheritance:


 .. _dask_api:
diff --git a/doc/treemethod.rst b/doc/treemethod.rst
index f47b8c027322..e22a8c7095bf 100644
--- a/doc/treemethod.rst
+++ b/doc/treemethod.rst
@@ -1,6 +1,6 @@
-####################
-XGBoost Tree Methods
-####################
+############
+Tree Methods
+############

 For training boosted tree models, there are 2 parameters used for choosing algorithms,
 namely ``updater`` and ``tree_method``.  XGBoost has 4 builtin tree methods, namely
diff --git a/doc/tutorials/custom_metric_obj.rst b/doc/tutorials/custom_metric_obj.rst
index b84229599364..c5bdb2d6f3c6 100644
--- a/doc/tutorials/custom_metric_obj.rst
+++ b/doc/tutorials/custom_metric_obj.rst
@@ -146,7 +146,8 @@ We will be able to see XGBoost printing something like:

 Notice that the parameter ``disable_default_eval_metric`` is used to suppress the default metric in XGBoost.

-For fully reproducible source code and comparison plots, see `custom_rmsle.py `_.
+For fully reproducible source code and comparison plots, see
+:ref:`sphx_glr_python_examples_custom_rmsle.py`.

 *********************
 Reverse Link Function
@@ -261,8 +262,7 @@ available in XGBoost:
 We use ``multi:softmax`` to illustrate the differences of transformed prediction.  With
 ``softprob`` the output prediction array has shape ``(n_samples, n_classes)`` while for
 ``softmax`` it's ``(n_samples, )``.  A demo for multi-class objective function is also
-available at `demo/guide-python/custom_softmax.py
-`_
+available at :ref:`sphx_glr_python_examples_custom_softmax.py`.


 **********************
diff --git a/python-package/xgboost/callback.py b/python-package/xgboost/callback.py
index c94f0930474a..901724a67d00 100644
--- a/python-package/xgboost/callback.py
+++ b/python-package/xgboost/callback.py
@@ -1,7 +1,11 @@
 # coding: utf-8
 # pylint: disable=invalid-name, too-many-statements, no-self-use
 # pylint: disable=too-many-arguments
-"""Training Library containing training routines."""
+"""Callback library containing training routines.  See :doc:`Callback Functions
+</python/callbacks>` for a quick introduction.
+
+"""
+
 from abc import ABC
 import collections
 import os
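Since both ``custom_metric_obj.rst`` above and the ``training.py`` docstrings later in this patch lean on the custom objective/metric interface, a compact sketch may help (illustrative only, not part of the patch). The ``squared_error_obj`` and ``rmse_metric`` names are invented for this example, and the ``custom_metric`` argument assumes XGBoost 1.6 or later:

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    def squared_error_obj(predt: np.ndarray, dtrain: xgb.DMatrix):
        """Custom objective: gradient and hessian of squared error."""
        y = dtrain.get_label()
        grad = predt - y
        hess = np.ones_like(predt)
        return grad, hess

    def rmse_metric(predt: np.ndarray, dtrain: xgb.DMatrix):
        """Custom evaluation metric returning a (name, value) pair."""
        y = dtrain.get_label()
        return "my-rmse", float(np.sqrt(np.mean((predt - y) ** 2)))

    rng = np.random.default_rng(0)
    X = rng.normal(size=(128, 4))
    y = X.sum(axis=1)
    dtrain = xgb.DMatrix(X, y)

    booster = xgb.train(
        # Suppress the default metric so only the custom one is reported.
        {"tree_method": "hist", "disable_default_eval_metric": 1},
        dtrain,
        num_boost_round=10,
        obj=squared_error_obj,
        custom_metric=rmse_metric,
        evals=[(dtrain, "train")],
    )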
diff --git a/python-package/xgboost/config.py b/python-package/xgboost/config.py
index 3ff9f97942a9..427ea4ea3915 100644
--- a/python-package/xgboost/config.py
+++ b/python-package/xgboost/config.py
@@ -30,8 +30,8 @@ def config_doc(*, header=None, extra_note=None, parameters=None, returns=None,
     {header}

     Global configuration consists of a collection of parameters that can be applied in the
-    global scope. See https://xgboost.readthedocs.io/en/stable/parameter.html for the full
-    list of parameters supported in the global configuration.
+    global scope. See :ref:`global_config` for the full list of parameters supported in
+    the global configuration.

     {extra_note}

diff --git a/python-package/xgboost/core.py b/python-package/xgboost/core.py
index e4187a7108ec..24ded0c614f2 100644
--- a/python-package/xgboost/core.py
+++ b/python-package/xgboost/core.py
@@ -1817,9 +1817,8 @@ def predict(

         .. note::

-            See `Prediction
-            `_
-            for issues like thread safety and a summary of outputs from this function.
+            See :doc:`Prediction </prediction>` for issues like thread safety and a
+            summary of outputs from this function.

         Parameters
         ----------
@@ -1945,8 +1944,8 @@ def inplace_predict(
         base_margin: Any = None,
         strict_shape: bool = False
     ):
-        """Run prediction in-place, Unlike ``predict`` method, inplace prediction does
-        not cache the prediction result.
+        """Run prediction in-place.  Unlike the :py:meth:`predict` method, inplace
+        prediction does not cache the prediction result.

         Calling only ``inplace_predict`` in multiple threads is safe and lock
         free.  But the safety does not hold when used in conjunction with other
@@ -1971,7 +1970,7 @@ def inplace_predict(
             ``predictor`` to ``gpu_predictor`` for running prediction on CuPy array
             or CuDF DataFrame.
         iteration_range :
-            See :py:meth:`xgboost.Booster.predict` for details.
+            See :py:meth:`predict` for details.
         predict_type :
             * `value` Output model prediction values.
             * `margin` Output the raw untransformed margin value.
@@ -2127,9 +2126,8 @@ def save_model(self, fname: Union[str, os.PathLike]) -> None:
         The model is saved in an XGBoost internal format which is universal among the
         various XGBoost interfaces. Auxiliary attributes of the Python Booster object
         (such as feature_names) will not be saved when using binary format.  To save those
-        attributes, use JSON instead. See: `Model IO
-        `_ for more
-        info.
+        attributes, use JSON instead.  See :doc:`Model IO </tutorials/saving_model>` for
+        more info.

         Parameters
         ----------
@@ -2165,9 +2163,8 @@ def load_model(self, fname: Union[str, bytearray, os.PathLike]) -> None:
         The model is loaded from XGBoost format which is universal among the various
         XGBoost interfaces. Auxiliary attributes of the Python Booster object (such as
         feature_names) will not be loaded when using binary format.  To save those
-        attributes, use JSON instead. See: `Model IO
-        `_ for more
-        info.
+        attributes, use JSON instead.  See :doc:`Model IO </tutorials/saving_model>` for
+        more info.

         Parameters
         ----------
@@ -2215,7 +2212,7 @@ def num_features(self) -> int:
         return features.value

     def dump_model(self, fout, fmap='', with_stats=False, dump_format="text"):
-        """Dump model into a text or JSON file.  Unlike `save_model`, the
+        """Dump model into a text or JSON file.  Unlike :py:meth:`save_model`, the
         output format is primarily used for visualization or interpretation,
         hence it's more human readable but cannot be loaded back to XGBoost.

@@ -2258,9 +2255,9 @@ def get_dump(
         with_stats: bool = False,
         dump_format: str = "text"
     ) -> List[str]:
-        """Returns the model dump as a list of strings.  Unlike `save_model`, the
-        output format is primarily used for visualization or interpretation,
-        hence it's more human readable but cannot be loaded back to XGBoost.
+        """Returns the model dump as a list of strings.  Unlike :py:meth:`save_model`,
+        the output format is primarily used for visualization or interpretation, hence
+        it's more human readable but cannot be loaded back to XGBoost.

         Parameters
         ----------
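The ``save_model``/``load_model`` and ``inplace_predict`` docstrings above can be tied together with a small round-trip sketch (illustrative only, not part of the patch); the ``model.json`` file name and the toy data are placeholders:

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    rng = np.random.default_rng(0)
    X = rng.normal(size=(64, 4))
    y = rng.normal(size=64)
    booster = xgb.train(
        {"objective": "reg:squarederror"}, xgb.DMatrix(X, y), num_boost_round=5
    )

    # JSON keeps Python-side attributes such as feature names; the binary format does not.
    booster.save_model("model.json")  # hypothetical file name
    restored = xgb.Booster()
    restored.load_model("model.json")

    # ``inplace_predict`` skips DMatrix construction and does not cache the result.
    preds = restored.inplace_predict(X)
    # Equivalent to the DMatrix-based path:
    preds_dmatrix = restored.predict(xgb.DMatrix(X))
    np.testing.assert_allclose(preds, preds_dmatrix, rtol=1e-5)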
diff --git a/python-package/xgboost/dask.py b/python-package/xgboost/dask.py
index cb972afa3e8c..4b70a4fbee09 100644
--- a/python-package/xgboost/dask.py
+++ b/python-package/xgboost/dask.py
@@ -3,9 +3,8 @@
 # pylint: disable=too-many-lines, fixme
 # pylint: disable=too-few-public-methods
 # pylint: disable=import-error
-"""Dask extensions for distributed training. See
-https://xgboost.readthedocs.io/en/latest/tutorials/dask.html for simple
-tutorial. Also xgboost/demo/dask for some examples.
+"""Dask extensions for distributed training.  See :doc:`Distributed XGBoost with Dask
+</tutorials/dask>` for a simple tutorial, and ``xgboost/demo/dask`` for more examples.

 There are two sets of APIs in this module, one is the functional API including
 ``train`` and ``predict`` methods.  Another is stateful Scikit-Learner wrapper
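The Dask docstring above pairs naturally with a short functional-API sketch (again illustrative rather than part of the patch). It assumes a local Dask cluster is acceptable for testing; the data, chunk sizes and parameters are placeholders:

.. code-block:: python

    from dask import array as da
    from dask.distributed import Client, LocalCluster

    import xgboost as xgb

    if __name__ == "__main__":
        with LocalCluster(n_workers=2, threads_per_worker=1) as cluster:
            with Client(cluster) as client:
                # Random dask arrays stand in for a real distributed dataset.
                X = da.random.random((1000, 10), chunks=(100, 10))
                y = da.random.random(1000, chunks=100)
                dtrain = xgb.dask.DaskDMatrix(client, X, y)

                output = xgb.dask.train(
                    client,
                    {"objective": "reg:squarederror", "tree_method": "hist"},
                    dtrain,
                    num_boost_round=10,
                    evals=[(dtrain, "train")],
                )
                booster = output["booster"]    # trained Booster
                history = output["history"]    # evaluation history
                predictions = xgb.dask.predict(client, booster, dtrain)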
diff --git a/python-package/xgboost/sklearn.py b/python-package/xgboost/sklearn.py
index 949dae7b46c7..a0f6b3f7340e 100644
--- a/python-package/xgboost/sklearn.py
+++ b/python-package/xgboost/sklearn.py
@@ -122,10 +122,10 @@ def inner(y_score: np.ndarray, dmatrix: DMatrix) -> Tuple[str, float]:
     booster: Optional[str]
         Specify which booster to use: gbtree, gblinear or dart.
     tree_method: Optional[str]
-        Specify which tree method to use. Default to auto. If this parameter
-        is set to default, XGBoost will choose the most conservative option
-        available. It's recommended to study this option from the parameters
-        document: https://xgboost.readthedocs.io/en/latest/treemethod.html.
+        Specify which tree method to use.  Defaults to ``auto``.  If this parameter is
+        set to default, XGBoost will choose the most conservative option available.
+        It's recommended to study this option in the :doc:`tree method </treemethod>`
+        document.
     n_jobs : Optional[int]
         Number of parallel threads used to run xgboost.  When used with other
         Scikit-Learn algorithms like grid search, you may choose which algorithm to parallelize and
@@ -167,14 +167,14 @@ def inner(y_score: np.ndarray, dmatrix: DMatrix) -> Tuple[str, float]:
     num_parallel_tree: Optional[int]
         Used for boosting random forest.
     monotone_constraints : Optional[Union[Dict[str, int], str]]
-        Constraint of variable monotonicity. See tutorial for more
-        information.
+        Constraint of variable monotonicity.  See :doc:`tutorial </tutorials/monotonic>`
+        for more information.
     interaction_constraints : Optional[Union[str, List[Tuple[str]]]]
         Constraints for interaction representing permitted interactions.  The
-        constraints must be specified in the form of a nest list, e.g. [[0, 1],
-        [2, 3, 4]], where each inner list is a group of indices of features
-        that are allowed to interact with each other. See tutorial for more
-        information
+        constraints must be specified in the form of a nested list, e.g. ``[[0, 1], [2,
+        3, 4]]``, where each inner list is a group of indices of features that are
+        allowed to interact with each other.  See :doc:`tutorial
+        </tutorials/feature_interaction_constraint>` for more information.
     importance_type: Optional[str]

         The feature importance type for the feature_importances\\_ property:
@@ -216,9 +216,8 @@ def inner(y_score: np.ndarray, dmatrix: DMatrix) -> Tuple[str, float]:
         For advanced usage on Early stopping like directly choosing to maximize instead of
         minimize, see :py:obj:`xgboost.callback.EarlyStopping`.

-        See `Custom Objective and Evaluation Metric
-        `_ for
-        more.
+        See :doc:`Custom Objective and Evaluation Metric </tutorials/custom_metric_obj>`
+        for more details.

         .. note::

@@ -243,7 +242,7 @@ def inner(y_score: np.ndarray, dmatrix: DMatrix) -> Tuple[str, float]:
         Activates early stopping. Validation metric needs to improve at least once in
         every **early_stopping_rounds** round(s) to continue training.  Requires at least
-        one item in **eval_set** in :py:meth:`xgboost.sklearn.XGBModel.fit`.
+        one item in **eval_set** in :py:meth:`fit`.

         The method returns the model from the last iteration (not the best one).  If
         there's more than one item in **eval_set**, the last entry will be used for early
@@ -251,7 +250,8 @@ def inner(y_score: np.ndarray, dmatrix: DMatrix) -> Tuple[str, float]:
         will be used for early stopping.

         If early stopping occurs, the model will have three additional fields:
-        ``clf.best_score``, ``clf.best_iteration`` and ``clf.best_ntree_limit``.
+        :py:attr:`best_score`, :py:attr:`best_iteration` and
+        :py:attr:`best_ntree_limit`.

         .. note::

@@ -268,9 +268,8 @@ def inner(y_score: np.ndarray, dmatrix: DMatrix) -> Tuple[str, float]:
                 save_best=True)]

     kwargs : dict, optional
-        Keyword arguments for XGBoost Booster object.  Full documentation of
-        parameters can be found here:
-        https://github.com/dmlc/xgboost/blob/master/doc/parameter.rst.
+        Keyword arguments for XGBoost Booster object.  Full documentation of parameters
+        can be found :doc:`here </parameter>`.
         Attempting to set a parameter via the constructor args and \\*\\*kwargs
         dict simultaneously will result in a TypeError.

@@ -1102,6 +1101,7 @@ def evals_result(self) -> Dict[str, Dict[str, List[float]]]:

     @property
     def n_features_in_(self) -> int:
+        """Number of features seen during :py:meth:`fit`."""
         booster = self.get_booster()
         return booster.num_features()

@@ -1116,10 +1116,15 @@ def _early_stopping_attr(self, attr: str) -> Union[float, int]:

     @property
     def best_score(self) -> float:
+        """The best score obtained by early stopping."""
         return float(self._early_stopping_attr('best_score'))

     @property
     def best_iteration(self) -> int:
+        """The best iteration obtained by early stopping.  This attribute is 0-based;
+        for instance, if the best iteration is the first round, then ``best_iteration`` is 0.
+
+        """
         return int(self._early_stopping_attr('best_iteration'))

     @property
diff --git a/python-package/xgboost/training.py b/python-package/xgboost/training.py
index 3cb6b2936c54..9066124a4cd6 100644
--- a/python-package/xgboost/training.py
+++ b/python-package/xgboost/training.py
@@ -76,9 +76,8 @@ def train(
         List of validation sets for which metrics will evaluated during training.
         Validation metrics will help us track the performance of the model.
     obj
-        Custom objective function. See `Custom Objective
-        `_ for
-        details.
+        Custom objective function.  See :doc:`Custom Objective
+        </tutorials/custom_metric_obj>` for details.
     feval :
         .. deprecated:: 1.6.0
             Use `custom_metric` instead.
@@ -134,9 +133,8 @@ def train(

         .. versionadded 1.6.0

-        Custom metric function. See `Custom Metric
-        `_ for
-        details.
+        Custom metric function.  See :doc:`Custom Metric </tutorials/custom_metric_obj>`
+        for details.

     Returns
     -------
@@ -387,9 +385,8 @@ def cv(params, dtrain, num_boost_round=10, nfold=3, stratified=False, folds=None

         Evaluation metrics to be watched in CV.
     obj :
-        Custom objective function. See `Custom Objective
-        `_ for
-        details.
+        Custom objective function.  See :doc:`Custom Objective
+        </tutorials/custom_metric_obj>` for details.
     feval : function
         .. deprecated:: 1.6.0

@@ -434,9 +431,8 @@ def cv(params, dtrain, num_boost_round=10, nfold=3, stratified=False, folds=None

         .. versionadded 1.6.0

-        Custom metric function. See `Custom Metric
-        `_ for
-        details.
+        Custom metric function.  See :doc:`Custom Metric </tutorials/custom_metric_obj>`
+        for details.

     Returns
     -------
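Finally, the ``best_score`` and ``best_iteration`` attributes documented in ``sklearn.py`` can be exercised with a small scikit-learn-style sketch (illustrative only, not part of the patch). It assumes XGBoost 1.6 or later, where ``eval_metric`` and ``early_stopping_rounds`` are accepted by the estimator constructor; the data is synthetic:

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 10))
    y = (X[:, 0] + rng.normal(scale=0.5, size=500) > 0).astype(int)
    X_train, X_valid = X[:400], X[400:]
    y_train, y_valid = y[:400], y[400:]

    # Early-stopping configuration lives on the estimator itself in XGBoost >= 1.6.
    clf = xgb.XGBClassifier(
        n_estimators=200,
        tree_method="hist",
        eval_metric="logloss",
        early_stopping_rounds=10,
    )
    clf.fit(X_train, y_train, eval_set=[(X_valid, y_valid)], verbose=False)

    print(clf.best_iteration)  # 0-based index of the best round
    print(clf.best_score)      # best validation logloss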