Simplify Python getting started example #8153

ravicodelabs · 2022-08-10T16:11:32Z

This PR aims to resolve issue #8146.

Current behavior:

The current Python getting started example requires the user to manually set the local file path to the Agaricus data set in libsvm format before being able to run the example code.

New behavior:

This PR uses sklearn to load the data set. As long as sklearn is installed, the user should be able to run the getting started example as is after a pip install xgboost, which reduces friction, especially for new users.

Additional Details:

The well-known Iris data set is used, since the Agaricus data set is not available in sklearn.datasets.
The xgboost.XGBClassifier class is used here rather than the xgboost.train function to train the model, as the latter would require an extra step of converting from numpy arrays to xgboost.DMatrix.

Load data set via `sklearn` rather than a local file path.

trivialfis

This looks great! Personally, I want to move XGBoost closer to sklearn and introduce the native interface only if necessary. ;-)

Simplify Python getting started example

46b65a6

Load data set via `sklearn` rather than a local file path.

trivialfis approved these changes Aug 11, 2022

trivialfis merged commit 20d1bba into dmlc:master Aug 11, 2022
