This repository has been archived by the owner on Feb 7, 2023. It is now read-only.
Thanks - merging. I tend to still call it DeviceContext because it is not only used in Op - for example a network might have a device context protobuf as well. As the context mainly wraps over a specific device rather than an operator, I feel that DeviceContext might be more accurate.
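To make the naming argument concrete, here is a minimal sketch using hypothetical stand-in types (`DeviceOptionSketch`, `CPUContextSketch`, `OperatorSketch`, `NetSketch`); none of these are the real Caffe2 classes. It only illustrates that the context is parameterized by a device description, and that a net can carry that same description without any operator being involved.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a device description (a protobuf in the real code).
struct DeviceOptionSketch {
  std::string device_type;
};

// Hypothetical context: it wraps a device, not an operator.
class CPUContextSketch {
 public:
  explicit CPUContextSketch(const DeviceOptionSketch& opt) : opt_(opt) {
    assert(opt_.device_type == "CPU");  // this context only models a CPU device
  }
  const std::string& device_type() const { return opt_.device_type; }

 private:
  DeviceOptionSketch opt_;
};

// One user of the context: an operator templated on the context type.
template <class Context>
class OperatorSketch {
 public:
  explicit OperatorSketch(const DeviceOptionSketch& opt) : context_(opt) {}
  const Context& context() const { return context_; }

 private:
  Context context_;
};

// Another user: a net-level device description, with no operator in sight.
struct NetSketch {
  DeviceOptionSketch device_option;
};
```

Under these assumptions the same device description feeds both `OperatorSketch<CPUContextSketch>` and `NetSketch`, which is the sense in which the context belongs to the device rather than to the op.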
bwasti added a commit that referenced this pull request on Feb 16, 2017
lukeyeager pushed a commit to lukeyeager/caffe2 that referenced this pull request on Apr 12, 2017
…tal_opt Seq2seq experimental opt
facebook-github-bot pushed a commit that referenced this pull request on Aug 30, 2017
…porting

Summary: Code generator for high-performance embedding look-up kernels, supporting Sum, WeightedSum, and Mean reducers. Achieves at least 1.5x speedup on float and over 2x speedup on float16, compared to the existing code.

These are results on Broadwell, using the sparse_lengths_sum_benchmark.par benchmark.

Old
==============
[root@fblearner001.01.ftw1 /home/msmelyan]# numactl -m 0 -C 0 ./sparse_lengths_sum_benchmark.par --iteration 10000
Preparing lookup table. 2017-08-08 00:10:23.101848
Preparation finished. 2017-08-08 00:10:27.955680
I0808 00:10:27.955732 30700 net.cc:177] Starting benchmark.
I0808 00:10:27.955759 30700 net.cc:178] Running warmup runs.
I0808 00:10:27.956367 30700 net.cc:188] Main runs.
I0808 00:10:31.839035 30700 net.cc:199] Main run finished. Milliseconds per iter: 0.388264. Iters per second: 2575.56
I0808 00:10:35.704169 30700 net.cc:233] Operator #0 (indices, Python) 0.0583264 ms/iter
I0808 00:10:35.704210 30700 net.cc:233] Operator #1 (Y, SparseLengthsSum) 0.327694 ms/iter
I0808 00:10:35.704213 30700 net.cc:237] Time per operator type:
I0808 00:10:35.704217 30700 net.cc:246] 0.327694 SparseLengthsSum
I0808 00:10:35.704221 30700 net.cc:246] 0.0583264 Python

[root@fblearner001.01.ftw1 /home/msmelyan]# numactl -m 0 -C 0 ./sparse_lengths_sum_benchmark.par --iteration 10000 --dtype float16
Preparing lookup table. 2017-08-08 00:10:59.047159
Preparation finished. 2017-08-08 00:11:05.140565
I0808 00:11:05.140612 31725 net.cc:177] Starting benchmark.
I0808 00:11:05.140635 31725 net.cc:178] Running warmup runs.
I0808 00:11:05.141104 31725 net.cc:188] Main runs.
I0808 00:11:08.371510 31725 net.cc:199] Main run finished. Milliseconds per iter: 0.323039. Iters per second: 3095.6
I0808 00:11:11.671450 31725 net.cc:233] Operator #0 (indices, Python) 0.0609876 ms/iter
I0808 00:11:11.671489 31725 net.cc:233] Operator #1 (Y, SparseLengthsSum) 0.26856 ms/iter
I0808 00:11:11.671494 31725 net.cc:237] Time per operator type:
I0808 00:11:11.671497 31725 net.cc:246] 0.26856 SparseLengthsSum
I0808 00:11:11.671500 31725 net.cc:246] 0.0609876 Python

New (Misha's)
==============
[root@fblearner001.01.ftw1 /home/msmelyan]# numactl -m 0 -C 0 ./sparse_lengths_sum_benchmark.par --iteration 10000
Preparing lookup table. 2017-08-07 23:44:55.897748
Preparation finished. 2017-08-07 23:45:00.708896
I0807 23:45:00.708945 4178361 net.cc:177] Starting benchmark.
I0807 23:45:00.708971 4178361 net.cc:178] Running warmup runs.
I0807 23:45:00.709444 4178361 net.cc:188] Main runs.
I0807 23:45:03.608551 4178361 net.cc:199] Main run finished. Milliseconds per iter: 0.289909. Iters per second: 3449.36
I0807 23:45:06.536182 4178361 net.cc:233] Operator #0 (indices, Python) 0.0572399 ms/iter
I0807 23:45:06.536224 4178361 net.cc:233] Operator #1 (Y, SparseLengthsSum) 0.23512 ms/iter
I0807 23:45:06.536228 4178361 net.cc:237] Time per operator type:
I0807 23:45:06.536232 4178361 net.cc:246] 0.23512 SparseLengthsSum
I0807 23:45:06.536236 4178361 net.cc:246] 0.0572399 Python

[root@fblearner001.01.ftw1 /home/msmelyan]# numactl -m 0 -C 0 ./sparse_lengths_sum_benchmark.par --iteration 10000 --dtype float16
Preparing lookup table. 2017-08-07 23:45:17.191579
Preparation finished. 2017-08-07 23:45:23.173668
I0807 23:45:23.173715 4179316 net.cc:177] Starting benchmark.
I0807 23:45:23.173743 4179316 net.cc:178] Running warmup runs.
I0807 23:45:23.174090 4179316 net.cc:188] Main runs.
I0807 23:45:24.939749 4179316 net.cc:199] Main run finished. Milliseconds per iter: 0.176564. Iters per second: 5663.67
I0807 23:45:26.698885 4179316 net.cc:233] Operator #0 (indices, Python) 0.0557303 ms/iter
I0807 23:45:26.698923 4179316 net.cc:233] Operator #1 (Y, SparseLengthsSum) 0.119794 ms/iter
I0807 23:45:26.698927 4179316 net.cc:237] Time per operator type:
I0807 23:45:26.698931 4179316 net.cc:246] 0.119794 SparseLengthsSum
I0807 23:45:26.698935 4179316 net.cc:246] 0.0557303 Python

Reviewed By: salexspb
Differential Revision: D5582172
fbshipit-source-id: d71f5a55580b734a51b8f30852b75f379acfdaf2
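For anyone who wants to read speedups off the logs above, here is a small sketch of the arithmetic. The helper name `Speedup` is hypothetical and not part of the benchmark; the inputs are the SparseLengthsSum per-operator times (ms/iter) copied from the old and new runs.

```cpp
#include <cassert>

// Ratio of old to new time per iteration; values above 1.0 mean the new
// code is faster. Hypothetical helper, for illustration only.
inline double Speedup(double old_ms_per_iter, double new_ms_per_iter) {
  assert(old_ms_per_iter > 0.0 && new_ms_per_iter > 0.0);
  return old_ms_per_iter / new_ms_per_iter;
}

// SparseLengthsSum ms/iter from the logs: old run first, new run second.
const double kFloatSpeedup = Speedup(0.327694, 0.23512);
const double kFloat16Speedup = Speedup(0.26856, 0.119794);
```

On these two runs the per-operator ratio works out to roughly 1.39x for float and 2.24x for float16.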
facebook-github-bot pushed a commit that referenced this pull request on Sep 14, 2017
Summary: UBSan report:
```
UndefinedBehaviorSanitizer: dynamic-type-mismatch caffe2/caffe2/core/tensor.h:786:22 in
caffe2/caffe2/core/tensor.h:787:19: runtime error: member call on address 0x60c01f610440 which does not point to an object of type 'caffe2::Tensor<caffe2::Tensor<caffe2::CPUContext> >'
*** Aborted at 1505298367 (Unix time, try 'date -d 1505298367') ***
*** Signal 6 (SIGABRT) (0xf2) received by PID 242 (pthread TID 0x7fb376f06700) (linux TID 33215) (maybe from PID 242, UID 0), stack trace: ***
0x60c01f610440: note: object is of type 'N6caffe26TensorINS_10CPUContextEEE'
 07 5e 81 60 c8 47 13 35 00 00 00 00 90 f3 73 80 20 60 00 00 98 f3 73 80 20 60 00 00 a0 f3 73 80
 ^~~~~~~~~~~~~~~~~~~~~~~
 vptr for 'N6caffe26TensorINS_10CPUContextEEE'
#0 0x1f0d1c22 in std::vector<long, std::allocator<long> > caffe2::GetTensorInfo<caffe2::Tensor<caffe2::CPUContext> >(void const*, bool*, unsigned long*, caffe2::DeviceOption*) caffe2/caffe2/core/tensor.h:787:19
#1 0x9a5e0a1 in caffe2::FacebookOperatorObserver::log() caffe2/caffe2/fb/init/net_observer.cpp:300:15
#2 0x9a5b49d in caffe2::FacebookOperatorObserver::Stop() caffe2/caffe2/fb/init/net_observer.cpp:229:11
#3 0x447d046 in caffe2::Operator<caffe2::CPUContext>::Run(int) caffe2/caffe2/core/operator.h:308:20
#4 0x1ecedb2f in caffe2::SimpleNet::Run() caffe2/caffe2/core/net_simple.cc:51:14
#5 0x1f1ba169 in caffe2::Workspace::RunNet(std::basic_fbstring<char, std::char_traits<char>, std::allocator<char>, std::fbstring_core<char> > const&) caffe2/caffe2/core/workspace.cc:211:26
...
```
The bug is that `GetTensorType` and `GetTensorInfo` take the context as their template argument, not the tensor itself.

Reviewed By: bddppq
Differential Revision: D5826781
fbshipit-source-id: 9cfd2ca1aaef6f8ee8a556ce7b553c0a4f43a100
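A minimal sketch of how that mismatch arises, using simplified hypothetical stand-ins rather than the real Caffe2 declarations: `TensorInfoTraits` is invented for illustration, and `GetTensorInfo` here has a reduced signature. Because the function builds `Tensor<Context>` from its template argument, passing a tensor type as that argument yields `Tensor<Tensor<CPUContext>>`, which matches the type named in the report.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

struct CPUContext {};  // hypothetical stand-in for caffe2::CPUContext

// Simplified stand-in for caffe2::Tensor<Context>.
template <class Context>
struct Tensor {
  std::vector<long> dims;
};

// Illustrative helper: names the tensor type that a Context-templated
// function will cast to.
template <class Context>
struct TensorInfoTraits {
  using tensor_type = Tensor<Context>;
};

// Simplified signature: the template argument is the context, and the
// tensor type is derived from it inside the function.
template <class Context>
std::vector<long> GetTensorInfo(const void* c) {
  assert(c != nullptr);
  const auto* tc = static_cast<const Tensor<Context>*>(c);
  return tc->dims;
}

// Correct instantiation: context in, Tensor<CPUContext> inside.
static_assert(std::is_same<TensorInfoTraits<CPUContext>::tensor_type,
                           Tensor<CPUContext>>::value,
              "context argument gives the expected tensor type");

// Buggy instantiation: tensor in, Tensor<Tensor<CPUContext>> inside.
static_assert(std::is_same<TensorInfoTraits<Tensor<CPUContext>>::tensor_type,
                           Tensor<Tensor<CPUContext>>>::value,
              "tensor argument gives a doubly wrapped tensor type");
```

In this sketch `GetTensorInfo<CPUContext>(&t)` is the well-formed call for a `Tensor<CPUContext> t`; instantiating with `Tensor<CPUContext>` instead would cast the pointer to an unrelated type.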
facebook-github-bot pushed a commit that referenced this pull request on Sep 25, 2017
Summary: Exposed by UBSAN:
```lang=bash
caffe2/caffe2/core/qtensor.h:61:40: runtime error: load of value 190, which is not a valid value for type 'bool'
#0 0x7fb4fc09c289 in caffe2::QTensor<caffe2::CPUContext>::Resize(std::vector<int, std::allocator<int> >) caffe2/caffe2/core/qtensor.h:61
#1 0x7fb4fc090403 in caffe2::QuantizedFullyConnectedOp<float, caffe2::CPUContext, caffe2::DefaultEngine>::RunOnDevice() caffe2/caffe2/fb/operators/quantized_fully_connected_op.h:93
#2 0x7fb4fc08d5ee in caffe2::Operator<caffe2::CPUContext>::Run(int) caffe2/caffe2/core/operator.h:306
#3 0x426d8a in caffe2::QFCTest(float, float, float, int, int, int, int) caffe2/caffe2/fb/operators/quantized_fully_connected_op_test.cc:78
#4 0x4295f6 in caffe2::QuantizedFullyConnectedTest_Test_Test::TestBody() caffe2/caffe2/fb/operators/quantized_fully_connected_op_test.cc:110
#5 0x7fb4eee3b6a1 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) /home/engshare/third-party2/googletest/master/src/googletest/googletest/src/gtest.cc:2458
#6 0x7fb4eee2cbe1 in testing::Test::Run() /home/engshare/third-party2/googletest/master/src/googletest/googletest/src/gtest.cc:2475
#7 0x7fb4eee2cd27 in testing::TestInfo::Run() /home/engshare/third-party2/googletest/master/src/googletest/googletest/src/gtest.cc:2656
#8 0x7fb4eee2ce34 in testing::TestCase::Run() /home/engshare/third-party2/googletest/master/src/googletest/googletest/src/gtest.cc:2774
#9 0x7fb4eee2eb8b in testing::internal::UnitTestImpl::RunAllTests() /home/engshare/third-party2/googletest/master/src/googletest/googletest/src/gtest.cc:4649
#10 0x7fb4eee2ef3c in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) /home/engshare/third-party2/googletest/master/src/googletest/googletest/src/gtest.cc:2458
#11 0x7fb4eee2ef3c in testing::UnitTest::Run() /home/engshare/third-party2/googletest/master/src/googletest/googletest/src/gtest.cc:4257
#12 0x7fb4fbee2ed0 in RUN_ALL_TESTS() third-party-buck/gcc-5-glibc-2.23/build/googletest/include/gtest/gtest.h:2233
#13 0x7fb4fbee2d60 in main common/gtest/LightMain.cpp:12
#14 0x7fb4e0ef7857 in __libc_start_main /home/engshare/third-party2/glibc/2.23/src/glibc-2.23/csu/../csu/libc-start.c:289
#15 0x424e08 in _start /home/engshare/third-party2/glibc/2.23/src/glibc-2.23/csu/../sysdeps/x86_64/start.S:118
UndefinedBehaviorSanitizer: invalid-bool-load caffe2/caffe2/core/qtensor.h:61:40
```
Reviewed By: yfeldblum
Differential Revision: D5898877
fbshipit-source-id: e32b1732a1946fdafaec67b3fbc072dc93bcd917
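UBSan's invalid-bool-load fires when a `bool` is read whose stored byte is neither 0 nor 1 (190 in the report above). A member that is read before it was ever initialized is one common way to get there. Whether that is exactly what happened in qtensor.h is an assumption, since the summary does not describe the fix. Under that assumption, the sketch below shows the usual remedy on a hypothetical `QTensorSketch` class (not the real `caffe2::QTensor`, and the bit-width arithmetic is invented for illustration): give every member a default initializer so `Resize` never reads an indeterminate flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in, not the real caffe2::QTensor.
class QTensorSketch {
 public:
  void SetSigned(bool is_signed) { signed_ = is_signed; }

  // Resize reads signed_, so the flag must hold a valid bool before the
  // first call. The default member initializers below guarantee that.
  void Resize(const std::vector<int>& dims) {
    dims_ = dims;
    std::size_t n = 1;
    for (int d : dims_) {
      assert(d >= 0);
      n *= static_cast<std::size_t>(d);
    }
    size_ = n;
    // Illustrative rule only: one extra bit per element when signed.
    bits_per_element_ = precision_ + (signed_ ? 1u : 0u);
  }

  std::size_t size() const { return size_; }
  unsigned bits_per_element() const { return bits_per_element_; }

 private:
  std::vector<int> dims_;
  std::size_t size_ = 0;
  unsigned precision_ = 8;
  bool signed_ = false;  // without an initializer this read could be UB
  unsigned bits_per_element_ = 0;
};
```

With the initializers in place, a default-constructed `QTensorSketch` can be resized immediately and the flag always reads as a valid `bool`.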
orionr added a commit that referenced this pull request on Mar 29, 2018
Just a random thought:
Instead of DeviceContext, can we call it OpContext? In one of the headers, it seems the context is mainly for operators.
Also, under the operators directory everything has an extra "op" suffix, which is a bit redundant. My two cents.
Otherwise.
Xiaoyun