Incorrect prediction logic, allows buffer overflow #7903

Closed
drorspei opened this issue May 14, 2022 · 0 comments · Fixed by #7918
drorspei commented May 14, 2022

Hey hey, thanks for this amazing package.
There is a small issue arising from an inconsistency between the prediction logic and the model-loading logic:
When loading a model from JSON, we can specify the indices of the left and right children of each node in each tree however we want.
But the prediction logic in predictor/predict_fn.h:GetNextNode assumes that the right child of a node always sits exactly one index after its left child (given that the current split feature isn't categorical).
An example is given below.
Since the node one past the left child is accessed unconditionally, this becomes a buffer overflow when no such node exists, which can cause a segmentation fault.
The easiest solution would be to update a single line in predictor/predict_fn.h so that it does the correct thing without making assumptions about the tree structure, since those assumptions are not enforced anywhere else.
I've put up a pull request here: #7902
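
To make the mismatch concrete, here is a minimal standalone Python sketch (the real predictor is C++, so this only mirrors the indexing pattern described above; the function names are made up for illustration). It contrasts the "left child + 1" assumption with a traversal that reads the stored right child, using the child and split arrays from the example model further down.

left_children = [2, -1, -1]         # node 0's left child is node 2
right_children = [1, -1, -1]        # node 0's right child is node 1, not node 3
split_conditions = [0.0, 1.0, 2.0]  # for leaf nodes this holds the leaf value

def next_node_assumed(nid, fvalue):
    # Mirrors the pattern described above: the right child is assumed
    # to sit at left_child + 1.
    return left_children[nid] + (0 if fvalue < split_conditions[nid] else 1)

def next_node_correct(nid, fvalue):
    # Reads the stored right child instead of assuming adjacency.
    return left_children[nid] if fvalue < split_conditions[nid] else right_children[nid]

print(next_node_correct(0, 1.0))  # 1 -> a valid leaf
print(next_node_assumed(0, 1.0))  # 3 -> no such node; in C++ this read is out of bounds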

The following is an example in Python. We load a model with a single tree of 3 nodes, but order them as root, right child, left child.
The left child's leaf value is 2, but the result is either 1 or a segfault.

import xgboost
import json
import pandas as pd

# Build a Booster from raw JSON model bytes; the single tree stores its nodes
# in the order root, right child, left child.
print(xgboost.Booster(model_file=bytearray(json.dumps({
    'learner': {
        'attributes': {
            'best_iteration': '0',
            'best_ntree_limit': '1'
        },
        'feature_names': ['0'],
        'feature_types': ['float'],
        'gradient_booster': {
            'model': {
                'gbtree_model_param': {
                    'num_parallel_tree': '1',
                    'num_trees': '1',
                    'size_leaf_vector': '0'
                },
                'tree_info': [0],
                'trees': [
                    {
                        'categories': [],
                        'categories_nodes': [],
                        'categories_segments': [],
                        'categories_sizes': [],
                        'id': 0,
                        'tree_param': {
                            'num_deleted': '0',
                            'num_feature': '1',
                            'num_nodes': '3',
                            'size_leaf_vector': '0'
                        },
                        'split_type': [0, 0, 0],
                        'split_indices': [0, 0, 0],
                        'default_left': [0, 0, 0],
                        'split_conditions': [0., 1., 2.],
                        'loss_changes': [0., 0., 0.],
                        'parents': [2147483647, 0, 0],
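                        # The left child is node 2 and the right child is node 1,
                        # so the right child is NOT at left_child + 1.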
                        'left_children': [2, -1, -1],
                        'right_children': [1, -1, -1],
                        'sum_hessian': [0.0, 0.0, 0.0],
                        'base_weights': [0., 0., 0.]}
                ]
            },
            'name': 'gbtree'
        },
        'learner_model_param': {
            'base_score': '0',
            'num_class': '0',
            'num_feature': '1',
            'num_target': '1'
        },
        'objective': {
            'name': 'reg:squarederror',
            'reg_loss_param': {'scale_pos_weight': '1'}
        }
    },
    'version': [1, 6, 1]
}).encode())).predict(xgboost.DMatrix(pd.DataFrame([[1.]], columns=['0']))))
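# Observed: the result is either 1.0 or a segmentation fault, even though the
# stored left child (node 2) has leaf value 2.0.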