Merge pull request #19 from rapidsai/branch-0.17
sync with upstream
daxiongshu authored Nov 17, 2020
2 parents e3b7848 + 238a8de commit e6d8ec3
Showing 201 changed files with 6,489 additions and 3,112 deletions.
22 changes: 22 additions & 0 deletions CHANGELOG.md
@@ -3,13 +3,15 @@
## New Features

## Improvements
- PR #3077: Improve runtime for test_kmeans
- PR #3070: Speed up dask/test_datasets tests
- PR #3075: Speed up test_linear_model tests
- PR #3078: Speed up test_incremental_pca tests
- PR #2902: `matrix/matrix.cuh` in RAFT namespacing
- PR #2903: Moving linalg's gemm, gemv, transpose to RAFT namespaces
- PR #2905: `stats` prims `mean_center`, `sum` to RAFT namespaces
- PR #2904: Moving `linalg` basic math ops to RAFT namespaces
- PR #2956: Follow cuML array conventions in ARIMA and remove redundancy
- PR #3000: Pin cmake policies to cmake 3.17 version, bump project version to 0.17
- PR #3083: Improving test_make_blobs testing time
- PR #2906: Moving `linalg` decomp to RAFT namespaces
@@ -18,8 +20,20 @@
- PR #3044: Move leftover `linalg` and `stats` to RAFT namespaces
- PR #3067: Deleting prims moved to RAFT and updating header paths
- PR #3074: Reducing dask coordinate descent test runtime
- PR #3096: Avoid memory transfers in CSR WeakCC for DBSCAN
- PR #3088: More readable and robust FIL C++ test management
- PR #3052: Speeding up MNMG KNN Cl&Re testing
- PR #3115: Speeding up MNMG UMAP testing
- PR #3112: Speed test_array
- PR #3111: Adding Cython to Code Coverage
- PR #3129: Update notebooks README
- PR #3040: Improved Array Conversion with CumlArrayDescriptor and Decorators
- PR #3134: Improving the Deprecation Message Formatting in Documentation

## Bug Fixes
- PR #3069: Prevent conversion of DataFrames to Series in preprocessing
- PR #3065: Refactoring prims metrics function names from camelcase to underscore format
- PR #3033: Splitting ml metrics to individual files
- PR #3072: Fusing metrics and score directories in src_prims
- PR #3037: Avoid logging deadlock in multi-threaded C code
- PR #2983: Fix seeding of KISS99 RNG
@@ -28,10 +42,17 @@
- PR #3012: Increasing learning rate for SGD log loss and invscaling pytests
- PR #3021: Fix a hang in cuML RF experimental backend
- PR #3039: Update RF and decision tree parameter initializations in benchmark codes
- PR #3060: Speed up test suite `test_fil`
- PR #3061: Handle C++ exception thrown from FIL predict
- PR #3073: Update mathjax CDN URL for documentation
- PR #3062: Bumping xgboost version to match cuml version
- PR #3084: Fix artifacts in t-SNE results
- PR #3086: Reverting FIL Notebook Testing
- PR #3114: Fixed a typo in SVC's predict_proba AttributeError
- PR #3117: Fix two crashes in experimental RF backend
- PR #3119: Fix memset args for benchmark
- PR #3130: Return Python string from `dump_as_json()` of RF
- PR #3136: Fix stochastic gradient descent example

# cuML 0.16.0 (Date TBD)

@@ -90,6 +111,7 @@
- PR #2928: Updating Estimators Derived from Base for Consistency
- PR #2942: Adding `cuml.experimental` to the Docs
- PR #3010: Improve gpuCI Scripts
- PR #3141: Move DistanceType enum to RAFT

## Bug Fixes
- PR #2973: Allow data imputation for nan values
15 changes: 10 additions & 5 deletions build.sh
@@ -19,7 +19,7 @@ ARGS=$*
REPODIR=$(cd $(dirname $0); pwd)

VALIDTARGETS="clean libcuml cuml cpp-mgtests prims bench prims-bench cppdocs pydocs"
VALIDFLAGS="-v -g -n --allgpuarch --buildfaiss --buildgtest --singlegpu --nvtx --show_depr_warn -h --help "
VALIDFLAGS="-v -g -n --allgpuarch --buildfaiss --buildgtest --singlegpu --nvtx --show_depr_warn --codecov -h --help "
VALIDARGS="${VALIDTARGETS} ${VALIDFLAGS}"
HELP="$0 [<target> ...] [<flag> ...]
where <target> is:
@@ -43,6 +43,8 @@ HELP="$0 [<target> ...] [<flag> ...]
--singlegpu - Build libcuml and cuml without multigpu components
--nvtx - Enable nvtx for profiling support
--show_depr_warn - show cmake deprecation warnings
--codecov - Enable code coverage support by compiling with Cython linetracing
and profiling enabled (WARNING: Impacts performance)
-h - print this text
default action (no args) is to build and install 'libcuml', 'cuml', and 'prims' targets only for the detected GPU arch
@@ -58,7 +60,7 @@ BUILD_TYPE=Release
INSTALL_TARGET=install
BUILD_ALL_GPU_ARCH=0
SINGLEGPU_CPP_FLAG=""
SINGLEGPU_PYTHON_FLAG=""
BUILD_PYTHON_ARGS=${BUILD_PYTHON_ARGS:=""}
NVTX=OFF
CLEAN=0
BUILD_DISABLE_DEPRECATION_WARNING=ON
@@ -115,7 +117,7 @@ if hasArg --allgpuarch; then
BUILD_ALL_GPU_ARCH=1
fi
if hasArg --singlegpu; then
SINGLEGPU_PYTHON_FLAG="--singlegpu"
BUILD_PYTHON_ARGS="${BUILD_PYTHON_ARGS} --singlegpu"
SINGLEGPU_CPP_FLAG=ON
fi
if hasArg cpp-mgtests; then
@@ -133,6 +135,9 @@ fi
if hasArg --show_depr_warn; then
BUILD_DISABLE_DEPRECATION_WARNING=OFF
fi
if hasArg --codecov; then
BUILD_PYTHON_ARGS="${BUILD_PYTHON_ARGS} --linetrace=1 --profile"
fi
if hasArg clean; then
CLEAN=1
fi
@@ -224,9 +229,9 @@ fi
if completeBuild || hasArg cuml || hasArg pydocs; then
cd ${REPODIR}/python
if [[ ${INSTALL_TARGET} != "" ]]; then
python setup.py build_ext -j${PARALLEL_LEVEL:-1} ${SINGLEGPU_PYTHON_FLAG} --library-dir=${LIBCUML_BUILD_DIR} install --single-version-externally-managed --record=record.txt
python setup.py build_ext -j${PARALLEL_LEVEL:-1} ${BUILD_PYTHON_ARGS} --library-dir=${LIBCUML_BUILD_DIR} install --single-version-externally-managed --record=record.txt
else
python setup.py build_ext -j${PARALLEL_LEVEL:-1} --library-dir=${LIBCUML_BUILD_DIR} ${SINGLEGPU_PYTHON_FLAG}
python setup.py build_ext -j${PARALLEL_LEVEL:-1} ${BUILD_PYTHON_ARGS} --library-dir=${LIBCUML_BUILD_DIR}
fi

if hasArg pydocs; then
4 changes: 2 additions & 2 deletions ci/gpu/build.sh
@@ -98,7 +98,7 @@ if [[ -z "$PROJECT_FLASH" || "$PROJECT_FLASH" == "0" ]]; then
################################################################################

gpuci_logger "Build from source"
$WORKSPACE/build.sh clean libcuml cuml prims bench -v
$WORKSPACE/build.sh clean libcuml cuml prims bench -v --codecov

gpuci_logger "Resetting LD_LIBRARY_PATH"

@@ -190,7 +190,7 @@ else
conda install -c $WORKSPACE/ci/artifacts/cuml/cpu/conda-bld/ libcuml

gpuci_logger "Building cuml"
"$WORKSPACE/build.sh" -v cuml
"$WORKSPACE/build.sh" -v cuml --codecov

gpuci_logger "Python pytest for cuml"
cd $WORKSPACE/python
13 changes: 12 additions & 1 deletion cpp/CMakeLists.txt
@@ -401,8 +401,19 @@ if(BUILD_CUML_CPP_LIBRARY)
src/holtwinters/holtwinters.cu
src/kmeans/kmeans.cu
src/knn/knn.cu
src/metrics/metrics.cu
src/metrics/accuracy_score.cu
src/metrics/adjusted_rand_index.cu
src/metrics/completeness_score.cu
src/metrics/entropy.cu
src/metrics/homogeneity_score.cu
src/metrics/kl_divergence.cu
src/metrics/mutual_info_score.cu
src/metrics/pairwise_distance.cu
src/metrics/r2_score.cu
src/metrics/rand_index.cu
src/metrics/silhouette_score.cu
src/metrics/trustworthiness.cu
src/metrics/v_measure.cu
src/pca/pca.cu
src/randomforest/randomforest.cu
src/random_projection/rproj.cu
4 changes: 2 additions & 2 deletions cpp/bench/common/ml_benchmark.hpp
@@ -55,14 +55,14 @@ struct CudaEventTimer {
CUDA_CHECK(cudaEventCreate(&stop));
// flush L2?
if (ptr != nullptr && l2CacheSize > 0) {
CUDA_CHECK(cudaMemsetAsync(ptr, sizeof(char) * l2CacheSize, 0, s));
CUDA_CHECK(cudaMemsetAsync(ptr, 0, sizeof(char) * l2CacheSize, s));
CUDA_CHECK(cudaStreamSynchronize(stream));
}
CUDA_CHECK(cudaEventRecord(start, stream));
}
CudaEventTimer() = delete;

/**
/**
* @brief The dtor stops the timer and performs a synchronization. Time of
* the benchmark::State object provided to the ctor will be set to the
* value given by `cudaEventElapsedTime()`.
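
Note on the `cudaMemsetAsync` change above: the CUDA runtime call takes its arguments as (pointer, fill value, byte count, stream), so the pre-fix line passed the cache size as the fill value and 0 as the byte count, which means no bytes were written and the L2 flush was a no-op. The sketch below reproduces the same mix-up on the host with `std::memset`, which shares the (pointer, value, count) order. It is only an illustration; `flush_swapped`, `flush_fixed` and `count_zero_bytes` are hypothetical names that do not appear in the benchmark code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Host-side stand-in for the device call: std::memset takes
// (pointer, fill value, byte count), the same order cudaMemsetAsync
// uses for its first three arguments.
// All function names below are hypothetical, for illustration only.

// Pre-fix argument order: the size lands in the value slot and 0 in the
// count slot, so zero bytes are written.
inline void flush_swapped(std::vector<char>& buf) {
  std::size_t count = 0;
  int value = static_cast<int>(buf.size());
  std::memset(buf.data(), value, count);
}

// Post-fix argument order: fill value 0, count = buffer size in bytes.
inline void flush_fixed(std::vector<char>& buf) {
  std::memset(buf.data(), 0, sizeof(char) * buf.size());
}

// Counts how many bytes in the buffer are zero.
inline std::size_t count_zero_bytes(const std::vector<char>& buf) {
  std::size_t n = 0;
  for (char c : buf) n += (c == 0);
  return n;
}
```

Calling `flush_swapped` on a buffer filled with a non-zero byte leaves it untouched, while `flush_fixed` zeroes every byte. The swapped call compiles without error, which is probably why the bug went unnoticed.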
2 changes: 1 addition & 1 deletion cpp/bench/prims/distance_common.cuh
@@ -26,7 +26,7 @@ struct Params {
int m, n, k;
}; // struct Params

template <typename T, ML::Distance::DistanceType DType>
template <typename T, raft::distance::DistanceType DType>
struct Distance : public Fixture {
Distance(const std::string& name, const Params& p)
: Fixture(name, std::shared_ptr<deviceAllocator>(
2 changes: 1 addition & 1 deletion cpp/bench/prims/distance_cosine.cu
@@ -21,7 +21,7 @@ namespace Bench {
namespace Distance {

DIST_BENCH_REGISTER(DistanceCosine,
ML::Distance::DistanceType::EucExpandedCosine);
raft::distance::DistanceType::EucExpandedCosine);

} // namespace Distance
} // namespace Bench
4 changes: 2 additions & 2 deletions cpp/bench/prims/distance_exp_l2.cu
@@ -20,9 +20,9 @@ namespace MLCommon {
namespace Bench {
namespace Distance {

DIST_BENCH_REGISTER(DistanceL2Sq, ML::Distance::DistanceType::EucExpandedL2);
DIST_BENCH_REGISTER(DistanceL2Sq, raft::distance::DistanceType::EucExpandedL2);
DIST_BENCH_REGISTER(DistanceL2Sqrt,
ML::Distance::DistanceType::EucExpandedL2Sqrt);
raft::distance::DistanceType::EucExpandedL2Sqrt);

} // namespace Distance
} // namespace Bench
2 changes: 1 addition & 1 deletion cpp/bench/prims/distance_l1.cu
@@ -20,7 +20,7 @@ namespace MLCommon {
namespace Bench {
namespace Distance {

DIST_BENCH_REGISTER(DistanceL1, ML::Distance::DistanceType::EucUnexpandedL1);
DIST_BENCH_REGISTER(DistanceL1, raft::distance::DistanceType::EucUnexpandedL1);

} // namespace Distance
} // namespace Bench
4 changes: 2 additions & 2 deletions cpp/bench/prims/distance_unexp_l2.cu
@@ -21,9 +21,9 @@ namespace Bench {
namespace Distance {

DIST_BENCH_REGISTER(DistanceUnexpL2Sq,
ML::Distance::DistanceType::EucUnexpandedL2);
raft::distance::DistanceType::EucUnexpandedL2);
DIST_BENCH_REGISTER(DistanceUnexpL2Sqrt,
ML::Distance::DistanceType::EucUnexpandedL2Sqrt);
raft::distance::DistanceType::EucUnexpandedL2Sqrt);

} // namespace Distance
} // namespace Bench
4 changes: 2 additions & 2 deletions cpp/include/cuml/cluster/kmeans.hpp
@@ -53,7 +53,7 @@ struct KMeansParams {
int seed = 0;

// Metric to use for distance computation. Any metric from
// ML::Distance::DistanceType can be used
// raft::distance::DistanceType can be used
int metric = 0;

// Number of instance k-means algorithm will be run with different seeds.
@@ -184,7 +184,7 @@ void predict(const raft::handle_t &handle, const KMeansParams &params,
* sample in 'X' (it should be same as the dimension for each cluster centers in
* 'centroids').
* @param[in] metric Metric to use for distance computation. Any
* metric from ML::Distance::DistanceType can be used
* metric from raft::distance::DistanceType can be used
* @param[out] X_new X transformed in the new space..
*/
void transform(const raft::handle_t &handle, const KMeansParams &params,
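
Note on the `metric` field above: `KMeansParams` stores the metric as a plain `int`, and the comments now say it should be a value from `raft::distance::DistanceType`. A minimal sketch of how such an integer could be mapped back to the enum is shown below. The `sketch` namespace and the helper `metric_of` are hypothetical. The enumerator names are taken from this diff, but their numeric values are assumptions and should be checked against the RAFT header.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical stand-in for raft::distance::DistanceType. The enumerator
// names appear in the diff; the numeric values are assumed.
enum DistanceType : int {
  EucExpandedL2 = 0,
  EucExpandedL2Sqrt = 1,
  EucExpandedCosine = 2,
  EucUnexpandedL1 = 3,
  EucUnexpandedL2 = 4,
  EucUnexpandedL2Sqrt = 5,
};

// Reduced stand-in for the real parameter struct: only the metric field.
struct KMeansParams {
  int metric = 0;  // stored as a plain int, as in the header above
};

// Converts the stored integer back to the enum type (hypothetical helper).
inline DistanceType metric_of(const KMeansParams& p) {
  return static_cast<DistanceType>(p.metric);
}

}  // namespace sketch
```

Under the assumed numbering, the default `metric = 0` would select `EucExpandedL2`.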
23 changes: 0 additions & 23 deletions cpp/include/cuml/distance/distance_type.h

This file was deleted.

3 changes: 3 additions & 0 deletions cpp/include/cuml/fil/fil.h
@@ -71,6 +71,9 @@ enum output_t {
/** output class label: either apply threshold to the output of the previous stage (for binary classification),
or select the class with the most votes to get the class label (for multi-class classification). */
CLASS = 0x100,
SIGMOID_CLASS = SIGMOID | CLASS,
AVG_CLASS = AVG | CLASS,
AVG_SIGMOID_CLASS = AVG | SIGMOID | CLASS,
};

/** storage_type_t defines whether to import the forests as dense or sparse */
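
Note on the new `output_t` enumerators above: in C++, applying `|` to two unscoped enumerators yields an `int`, which does not convert back to the enum type implicitly. Naming the combinations inside the enum is one way to let callers pass a typed value without a cast, which may be the motivation here. The sketch below is a hypothetical mirror of the enum. Only `CLASS = 0x100` and the three combined names come from the diff. The values of `RAW`, `AVG` and `SIGMOID` are assumptions, and `has_flag` is an illustrative helper, not part of FIL.

```cpp
#include <cassert>

// Hypothetical mirror of the FIL output flags. Only CLASS = 0x100 and the
// three combined names are taken from the diff; the RAW, AVG and SIGMOID
// values are assumed for illustration.
enum output_t {
  RAW = 0x0,
  AVG = 0x1,
  SIGMOID = 0x10,
  CLASS = 0x100,
  SIGMOID_CLASS = SIGMOID | CLASS,
  AVG_CLASS = AVG | CLASS,
  AVG_SIGMOID_CLASS = AVG | SIGMOID | CLASS,
};

// True when every bit of `flag` is set in `value` (illustrative helper).
constexpr bool has_flag(output_t value, output_t flag) {
  return (static_cast<int>(value) & static_cast<int>(flag)) ==
         static_cast<int>(flag);
}
```

With these definitions, `has_flag(AVG_SIGMOID_CLASS, SIGMOID)` holds, while `has_flag(SIGMOID_CLASS, AVG)` does not.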