-
Notifications
You must be signed in to change notification settings - Fork 0
/
mysql.note
351 lines (326 loc) · 9.79 KB
/
mysql.note
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
1. Memory
a) buffer pool
* data page
* index page
* insert buffer
* lock info
* ...
b) redo log_buffer
* @@ innodb_log_buffer_size = 8M (default)
* flush to disk when:
. master thread flush per second
. transaction commit flush
. free size < 1/2 log_buffer flush
c) additional memory pool
* frame buffer
* buffer control block(LRU, lock, wait)
* when addition memory pool full, use buffer pool
2. Buffer Pool Management
default page size = 16KB
a) LRU List
|_______5/8________|____3/8___|
(head) (midpoint) (tail)
* @@ innodb_old_blocks_pct
* @@ innodb_old_blocks_time
* LRU: contains unzip_LRU pages
* unzip_LRU: buddy algorithm
b) Free List
c) Flush List
* dirty pages in both Flush List and LRU List
3. Checkpoint:
a) Sharp Checkpoint
b) Fuzzy Checkpoint
* Master Thread Checkpoint
* FLUSH_LRU_LIST Checkpoint
* Async/Sync Flush Checkpoint
* Dirty Page too much Checkpoint
4. Master Thread
a) before InnoDB 1.0.x
* do per second:
. [Always] log buffer flush to disk per second, no matter transaction commited or not
. [Maybe] merge insert buffer, if io frequence < 5 io/s
. [Maybe] flush 100 dirty pages to disk, if buf_get_modified_ratio_pct > innodb_max_dirty_pct(90% default)
. [Maybe] goto background loop, if no user active
* do per 10 seconds:
. [Maybe] flush 100 dirty pages to disk, if io frequence < 200 io/s
. [Always] merge insert buffer
. [Always] log buffer flush to disk
. [Always] delete useless undo log
. [Always] flush 100 dirty pages(if buf_get_modified_ratio_pct > 70%)
or 10 dirty pages(if buf_get_modfied_ratio_pct < 70%) to disk
* background loop
. [Always] delete useless undo log
. [Always] merge insert buffer
. [Always] goto main loop
. [Maybe] flush 100 pages
5. InnoDB
a) Insert Buffer
* secondary index
* not unique
* B+ tree(global unique, ibdata1 share space)
. inner nodes' search key: space(4 Byte), marker(1B), offset(4B)
. leaf nodes: search key(9B), metadata(4B), record
* Insert Buffer Bitmap
* Merge Insert Buffer happens on:
. secondary index page load to buffer pool
. Insert Buffer Bitmap detect secondary index page is full
. Master Thread
b) Double Write
* doublewrite buffer(2MB = 128 * 16K)
* [1st]doublewrite buffer ---memcpy---> share table space, once 1 MB
* [2nd]doublewrite buffer ---flush---> .idb(table space file)
* recover .idb from share table space
c) Adaptive Hash Index
* generated from buffer pool's B+ tree leaf
* build hash requirements:
. access page in continuous same mode(same query conditions)
. over 100 times in the same access mode
. over N times in the same access mode(N = records of page/16)
. query by equal condition instead of range conditions
d) Async IO
* Native AIO(libaio)
e) Flush Neighbor Page
6. Log
a) error log
@@ log_error
b) slow log
@@ log_slow_queries = OFF (default)
@@ long_query_time = 10 (default)
@@ log_queries_not_using_indexes = ON (default)
@@ long_query_io
@@ slow_query_type:
. 0 -- not use slow log
. 1 -- use slow log according to time
. 2 -- use slow log according to io
. 3 -- use slow log according to time & io
* mysqldumpslow slow.log
c) general log
d) binary log
update operation log
* functionality:
. recovery
. replication
. audit
@@ binlog_format:
. STATEMENT
. ROW
. MIXED
7. redo log
a) binlog VS redo log
* binlog contains all db update operations of all engines
redo log only belong to innodb's transaction
* binlog content is logic operation(transaction's detail)
redo log content is modifaction of every Page
* binlog write before transaction commit and write only once
redo log write several times during transaction lifetime
b) redo log format
* redo_log_type
* space
* page_no
* redo_log_body
c) @@ innodb_flush_log_at_trx_commit
* 0 -- do not write redo log to disk when transaction commit
* 1 -- (default) sync write redo log to disk when transaction commit
* 2 -- async write redo log to disk when transaction commit
8. Compact Record Format
a) length(1/2 Byte)
b) NULL flag(1B)
c) record header: 3B + 2B(next_recorder)
d) col1, col2, ...
e) TransactionID
f) Roll Pointer
9. varchar max size 65535 for all varchar column length sum
row flow data store in BLOB Page
if one Page cannot store two rows, using BLOB Page
10. Dynamic Record Format
20 Byte per row, containing offset to Off Page(real data)
11. char type lenght is character counts, not Byte counts
12. Page
a) File Header(38 Byte)
b) Page Header(56 Byte)
c) Infimun & Supremum Records
* virtual records for each page
* infimun < user records < supremum
d) User Records
e) Free Space
f) Page Directory
g) File Trailer(8 Byte)
* checksum(4 Byte)
* FILE_PAGE_LSN(4 Byte)
13. constraint
a) create
* > alter table table_name add unique key uk_column_name (column_name);
b) query
* > select constraint_name, constraint_type from information_schema.TABLE_CONSTRAINTS where table_schema='db name' and table_name='**';
* > select * from information_schema.REFERENTIAL_CONSTRAINTS where constraint_schema='db name';
c) not null | date type
* > set sql_mode = 'STRICT_TRANS_TABLES';
d) foreign key on update | delete
* CASCADE
update child table when parent table update
* SET NULL
set child table null when parent table update
* NO ACTION
raise error when parent table update, unless transaction meets constraints
* RESTRICT(default)
raise error when parent table update
14. partition
a) RANGE
successive range
b) LIST
discrete range
c) HASH
user defined hash function
d) KEY
mysql defined hash function
e) belong to unique key
f) partition conditions:
* interger
g) COLUMNS
* all interger
* date, datetime
* char, varchar, binary
* [not support] blob, text
h) exchange data between partition and non-partition table
15. DB application
a) OLTP
b) OLAP
16. InnoDB index
a) B+tree index
* clustered index
. primary key
. index with data
* secondary index
. Fast Index Create(FIC)
S Lock
. Online Schema Change(OSC)
. Online DDL
secondary index create/drop
modify auto_increase
foreign key create/drop
rename column
* split
. if insert randomly, split in the middle of page
. if insert in direction, split in current page tail
* union index
* cover index
. > select primary key keyname, union_key_name from table
* index choose
. > force index
. > use index
b) Full-text Search
* inverted index
. inverted file index
. full inverted index
* FTS Index Cache
. RB Tree
c) Hash index
* Key = space_id<<20 + space_id + offset
* hash link
17. Cardinality
a) update strategy
* 1/16 columns of table is updated
* stat_modified_counter> 2 000 000 000
b) update method
* A = all leaf counts
* randomly get 8 pages, count distinct keys in each page(P1, P2, ..., P8)
* Cardinality = (P1+P2+...+P8)*A/8
18. optimize index
a) Multi-Range Read
* secondary index query result(key-value) ---> cache
* sort by RowID
* fetch data file in RowID order
b) Index Condition Pushdown(ICP)
* index query with where condition in engine layer
* where condition partially overlap with index
19. LOCK
a) latch
* mutex
* rwlock
b) lock
* target is transaction
* table lock
. IS Lock
. IX Lock
. AUTO-INC Locking
. general table lock
* row lock
. Type
- S Lock
- X Lock
. algorithm
- Record Lock
- Gap Lock
- Next-Key Lock: Record Lock + Gap Lock
solve Phantom Problem
. turn off Gap Lock
- READ COMMITTED
- innodb_locks_unsafe_for_binlog = 1
. Next-Key Lock ---> Record Lock happens on
- query unique index
* choose table lock
. read
. read + write(unique index)
. more select + insert, less delete + udpate
. group by
* not choose row lock
. version
. copy on demand
. application level locks
20. Transaction Isolation Level
a) READ COMMITTED
* consistent nonlocking read
. MVCC
* newest data version
* locking read
. record lock
b) REPEATABLE READ(default)
* consistent nonlocking read
* version on transaction beginning
* locking read
. next-key lock
. > select for update (X Lock)
. > select lock in share mode (S Lock)
c) READ UNCOMMITTED
d) SERIALIZABLE
21. transaction block for lock and raise timeout exception, uncommitted update impact current version, so commit or rollback explicitly
* @@ innodb_lock_wait_timeout
innodb do not rollback major exception, except dead lock
22. avoid dead lock
* wait timeout
* wait-for graph
23. Three normal form for relation model[from wiki]
* A relation is in first normal form if and only if the domain of each attribute contains only atomic (indivisible) values, and the value of each attribute contains only a single value from that domain.
* A table is in 2NF if it is in 1NF and no non-prime attribute is dependent on any proper subset of any candidate key of the table. A non-prime attribute of a table is an attribute that is not a part of any candidate key of the table.
* Third normal form is a normal form that is used in normalizing a database design to reduce the duplication of data and ensure referential integrity by ensuring that (1) the entity is in second normal form, and (2) all the attributes in a table are determined only by the candidate keys of that table and not by any non-prime attributes.
24. Transaction
* type
. Flat Transactions
. Flat Transactions with Savepoints
. Chained Transactions
. Nested Transactions
. Distributed Transactions
* Mechanism
. redo log
. undo log
. binlog
* SQL
. > start transaction | begin
. > commit
. > rollback
. > ...
25. Optimize DB
* architecture
* hardware
* os
* DBMS
. thread count
. buffer pool
. flush disk
. LRU
. lock timeout
. compact
. connection pool
* app
. logic
. SQL(CRUD)