server: check open file ulimit for local backend #343

Merged
merged 7 commits into master from check-ulimit on Jul 14, 2020

Conversation

glorv
Contributor

@glorv glorv commented Jul 6, 2020

What problem does this PR solve?

Check process max open file limit for local backend

What is changed and how it works?

In local backend mode, pebble needs to write a lot of L0 SST files, so if the data source is fairly big, the max open files limit may not be enough. We therefore check RLIMIT_NOFILE based on the source data size and try to raise it if it is not big enough.
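For illustration, here is a minimal sketch of that check. Function and variable names are illustrative only, not the exact code added by this PR.

package lightning

import (
	"fmt"
	"syscall"
)

// checkAndAdjustNoFileLimit is a hypothetical sketch: estimate how many SST
// files the local backend may keep open from the total source size, then
// raise the RLIMIT_NOFILE soft limit if it is below that estimate.
func checkAndAdjustNoFileLimit(totalSourceSize uint64) error {
	// Coarse estimate: assume each generated SST file is roughly 50 MiB.
	estimateMaxFiles := totalSourceSize / (50 << 20)

	var rLimit syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rLimit); err != nil {
		return fmt.Errorf("get RLIMIT_NOFILE: %w", err)
	}
	if rLimit.Cur >= estimateMaxFiles {
		return nil // the current soft limit is already large enough
	}

	prevLimit := rLimit.Cur
	rLimit.Cur = estimateMaxFiles
	if rLimit.Max < rLimit.Cur {
		// Raising the hard limit usually requires root / CAP_SYS_RESOURCE.
		rLimit.Max = rLimit.Cur
	}
	if err := syscall.Setrlimit(syscall.RLIMIT_NOFILE, &rLimit); err != nil {
		return fmt.Errorf("the maximum number of open file descriptors is too small, "+
			"got %d, expected at least %d: %w", prevLimit, estimateMaxFiles, err)
	}
	return nil
}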

close #342

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

Related changes

@glorv glorv requested review from kennytm and 3pointer July 6, 2020 03:42
@@ -96,3 +105,47 @@ func main() {
os.Exit(1)
}
}

func checkSystemRequirement(cfg *config.GlobalConfig) error {
Collaborator

(1) This should be checked using the task config before every task starts
(2) This should be ignorable using --check-requirements=false
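One possible shape of that gating is sketched below. The config type and the CheckRequirements field name are assumptions, not necessarily the real Lightning config paths; it reuses the RLIMIT_NOFILE check sketched earlier.

// Hypothetical sketch: run the check per task and allow it to be skipped.
func checkTaskRequirements(taskCfg *config.Config, totalSourceSize uint64) error {
	// --check-requirements=false disables the check entirely (field name assumed).
	if !taskCfg.App.CheckRequirements {
		return nil
	}
	// Drive the RLIMIT_NOFILE check from this task's source size.
	return checkAndAdjustNoFileLimit(totalSourceSize)
}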

rLimit.Cur = estimateMaxFiles
err = syscall.Setrlimit(syscall.RLIMIT_NOFILE, &rLimit)
if err != nil {
return errors.Wrap(err, fmt.Sprintf("the maximum number of open file descriptors is too small, got %d, expect greater or equal to %d", prevLimit, estimateMaxFiles))
Collaborator

Suggested change
return errors.Wrap(err, fmt.Sprintf("the maximum number of open file descriptors is too small, got %d, expect greater or equal to %d", prevLimit, estimateMaxFiles))
return errors.Annotatef(err, "the maximum number of open file descriptors is too small, got %d, expect greater or equal to %d", prevLimit, estimateMaxFiles)


// we estimate each sst as 50MB; this is a fairly coarse prediction, the actual number of fds needed
// depends on the chunk & import concurrency and the size of each data & index key-value
estimateMaxFiles := totalSize / (50 * 1024 * 1024)
Collaborator

@kennytm kennytm Jul 6, 2020

how is this "50 MiB" determined?

also, for a 10 TB data set this means Pebble will open up to 200000 files simultaneously. This sounds more like a bug to me.

Contributor Author

@glorv glorv Jul 6, 2020

In the tpcc benchmark, with a memtable size of 512MB, the generated L0 SST file sizes range from 40-450MB, so I'm not sure how to estimate the average size. In the common case L0 SST files would be compacted into L1-Ln SSTs, so the total number of files would not stay very large, but we disable that compaction phase in favor of better performance, so the number of generated files can be quite big. Because the index engine can't be split, a huge table with many indices may generate hundreds of thousands of L0 SST files.

Collaborator

can we use the total size of the largest N tables (N = index concurrency) instead of the entire data source?

from the pebble source code it seems 50 MiB = 25 × 2 MiB, so maybe we could increase the TargetFileSize? 🤔 In TiKV Importer this is set to 1 GiB.
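For reference, a hedged sketch of what raising Pebble's target file size could look like. The option values are illustrative only; this is not the configuration used by this PR.

package local

import "github.com/cockroachdb/pebble"

// newPebbleOptions is a hypothetical sketch: raise the target SST file size
// to 1 GiB (as TiKV Importer does) so far fewer, larger files are produced.
func newPebbleOptions() *pebble.Options {
	return &pebble.Options{
		MemTableSize: 512 << 20, // memtable size mentioned in the benchmark above
		MaxOpenFiles: 10000,
		Levels: []pebble.LevelOptions{
			{TargetFileSize: 1 << 30}, // 1 GiB instead of the small default
		},
	}
}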

Contributor Author

@glorv glorv Jul 6, 2020

It seems the L0 SST file size depends on the memtable size, and the TargetFileSize option is only used for compaction. After explicitly setting TargetFileSize to 1GB, the order_line table's index engine still generates a lot of SST files of about 54MB each, and almost every engine generates SST files of a different size. Maybe we should use MemoryTableSize as each SST file's size, since the memtable accounts for the raw size of the kv-pairs and the SST file size depends on the flush compaction ratio.
So we can estimate the needed fds as: {top N table size} / {MemoryTableSize}
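A minimal sketch of that estimate, assuming per-table sizes and an index concurrency setting are available; all names here are placeholders rather than the real config fields.

package local

import "sort"

// estimateMaxOpenFiles is a hypothetical sketch of the formula above: take
// the total size of the N largest tables (N = index concurrency) and divide
// by the memtable size, since each flushed memtable yields roughly one L0 SST.
func estimateMaxOpenFiles(tableSizes []int64, indexConcurrency int, memTableSize int64) int64 {
	// Largest tables first.
	sort.Slice(tableSizes, func(i, j int) bool { return tableSizes[i] > tableSizes[j] })

	var topNTotal int64
	for i := 0; i < len(tableSizes) && i < indexConcurrency; i++ {
		topNTotal += tableSizes[i]
	}
	return topNTotal / memTableSize
}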

Collaborator

Okay.

Collaborator

@kennytm kennytm left a comment

rest LGTM

{
Tables: []*mydump.MDTableMeta{
{
TotalSize: 150_800 << 20, //150_288MB
Collaborator

not sure what these comments mean, but the numbers look way off.

@kennytm kennytm added the status/LGT1 One reviewer already commented LGTM (LGTM1) label Jul 8, 2020
Contributor

@3pointer 3pointer left a comment

LGTM

@glorv
Contributor Author

glorv commented Jul 14, 2020

/run-all-tests

@glorv
Contributor Author

glorv commented Jul 14, 2020

/run-all-tests

@kennytm
Collaborator

kennytm commented Jul 14, 2020

Blocked by #348.

@CLAassistant

CLAassistant commented Jul 14, 2020

CLA assistant check
All committers have signed the CLA.

@kennytm kennytm merged commit 73e48bb into master Jul 14, 2020
@kennytm kennytm deleted the check-ulimit branch July 14, 2020 08:09
Successfully merging this pull request may close these issues.

local backend error: too many open files