Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add ctr table depends #36465

Merged
merged 7 commits into from
Oct 21, 2021
Merged

Conversation

zhaocaibei123
Copy link
Contributor

@zhaocaibei123 zhaocaibei123 commented Oct 15, 2021

PR types

New features

PR changes

Others

Describe

add memory sparse table dependency:

  • kv data structure definition: feature_value.h
  • sparse_sgd_rule for param optimization

@CLAassistant
Copy link

CLAassistant commented Oct 15, 2021

CLA assistant check
All committers have signed the CLA.

@paddle-bot-old
Copy link

Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

// See the License for the specific language governing permissions and
// limitations under the License.

#ifndef PADDLE_FLUID_DISTRIBUTED_COMMON_LOCAL_RANDOM_H_
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recommend using #pragma once

namespace distributed {

// Get time in seconds.
inline double current_realtime() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

建议和编程风格保持一致,其他一致,采用驼峰式命名
image

size_t beta2_pow_index() { return beta1_pow_index() + 1; }

protected:
float _learning_rate;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这些也不太符合我们编程规范,建议修改
image

@@ -0,0 +1,167 @@
// Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2020 -> 2021


public:
map_type values_[CTR_SPARSE_SHARD_BUCKET_NUM];
std::hash<uint64_t> _hasher;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

同上

@@ -31,8 +31,9 @@ struct PullSparseValue {
feasigns_(nullptr),
frequencies_(nullptr) {}

explicit PullSparseValue(std::vector<uint64_t> feasigns,
std::vector<uint32_t> frequencies, int dim) {
explicit PullSparseValue(std::vector<uint64_t>& feasigns, // NOLINT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why use nolint? can we use const& here?

@@ -0,0 +1,55 @@
/* Copyright (c) 2020 PaddlePaddle Authors. All Rights Reserved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2020->2021


auto* feature_value = shard->Init(key);
feature_value->resize(vec.size());
memcpy(const_cast<float*>(feature_value->data()), vec.data(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

看起来好像feature_value->data()并不是const的?
image

@@ -0,0 +1,183 @@
/***************************************************************************
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

请修改为标准的license

optional uint32 fea_dim = 4 [ default = 11 ];
optional uint32 embedx_dim = 5 [ default = 8 ];
optional uint32 embedx_threshold = 6 [ default = 10 ];
optional DownpourTableAccessorParameter downpour_accessor_param = 7;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CtrAccessorParameter

repeated float weight_bounds = 4;
}

message SparseAdamSGDParameter {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGDRule

namespace paddle {
namespace distributed {

void CtrSparseNaiveSGDRule::load_config(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

去掉ctr,另外是和proto保持一致;proto里面添加注释

@@ -150,3 +181,39 @@ message TableAccessorSaveParameter {
optional string converter = 2;
optional string deconverter = 3;
}

message SparseSGDRuleParameter {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

加注释,对应c++类名

std::vector<float> data_;
};

class CtrValueBlock {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xxxshard

Copy link
Contributor

@yaoxuefeng6 yaoxuefeng6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yaoxuefeng6 yaoxuefeng6 merged commit d64f7b3 into PaddlePaddle:develop Oct 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants