----------------------------- MODULE ghostferry -----------------------------
(***************************************************************************
This is the model of the primary Ghostferry algorithm of copying data. Note
that this model makes many simplifying assumptions and thus differs from the
actual implementation in many ways.
This comment describes the model and justifies how it is applicable to the
real world.
## Process Model ##
A standard, parallel Ghostferry run is something along the following lines:
1. Ghostferry begins to tail and apply the binlogs via the BinlogStreamer.
2. Ghostferry begins to iterate through the tables, applying the rows in
batches via the DataIterator. Ghostferry will find the current minimum
and maximum primary key for each source table and copies all rows within
the ranges.
3. The API consumer of Ghostferry (e.g. copydb) waits until the data copying
is completed.
4. The application/source database is locked/set as read only and copydb waits
until all writes are completed (cutover stage).
5. The binlog streaming is instructed to stop. A target binlog position is set
to be the current binlog position of the master. The binlog streamer will
catch up to that position and then quit.
6. At this point, the source and target database are expected to be identical.
Note that until the source database is set to read only, something
(the application/dba/whatever) will perform INSERT/UPDATE/DELETEs on the source
database concurrently with respect to Ghostferry.
This whole process is modeled as five separate processes:
1. TableIterator: Performs the job of the DataIterator, but only on one table
(see Data Model for justification).
2. BinlogStreamer: Tails and queues the binlog to be written by the
BinlogWriter.
3. BinlogWriter: Writes binlogs in binlog writer queue.
4. Application: Performs INSERT/UPDATE/DELETE on the source database and records
the action into the source database binlog. Essentially
combines both the database itself and the client of the
database (app/dba/whatever).
5. Ferry: Coordinates the entire run by waiting for the data copy to complete,
performing the cutover stage, and stopping the binlog streamer.
## Data Model ##
In the real world, each database contains many schemas, each with many tables.
Each table can have millions/billions of rows, each of which has many values
associated with many columns. This is very difficult to replicate in even a
simple mental model, never mind a formal, checkable model. The present data
model makes simplifying assumptions and reduces the source and target databases
down to their very essence:
1. Each database only contains a single table as opposed to many tables.
2. A table is modeled as a finite sequence of `PossibleRecords`. Each element
in the sequence is a row, and the index of that element is the primary key of
the row.
- Hence, primary keys are in the set of natural numbers.
3. `PossibleRecords` is a set of all possible records. There's a special
element called `NoRecordHere` that indicates a row does not exist. In
other words, if the i-th entry of the table sequence is `NoRecordHere`, it
is equivalent to saying that row i does not exist.
- An INSERT into the database is just an UPDATE of a row from `NoRecordHere`
to another record.
- A DELETE is just an UPDATE of a row from a record to `NoRecordHere`
(see the worked example following this list).
- Thus, the table sequence will always be filled to its maximum capacity.
- The initial table layout should be something like
<<record, record, ..., record, NoRecordHere, ... NoRecordHere>>
- The minimum primary key will always be 1.
- The maximum primary key will initially be the index of the last entry that
is not NoRecordHere. This number is specified manually because computing it
would complicate the model.
- The table length is defined by the index of the last record before the
trailing NoRecordHere entries. The table length is increased every time an
INSERT occurs at a primary key above the current length.
- The table capacity is defined by the size of the sequence, including the
NoRecordHere elements.
4. The Binlog is modeled as a sequence of log entries.
- The sequence starts empty and is appended to every time the source table
is changed.
- To ensure the binlog does not grow to infinity and checking with TLC stays
feasible, a maximum size constraint is imposed on it.
5. The BinlogWriter writes one binlog at a time as opposed to writing them in
batches atomically like it does in the implementation via transactions.
- The reasoning is that writing one binlog event at a time is a "worse"
case than writing more than one binlog event atomically at a time. As long
as they are ordered correctly, it should be okay.
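To make the data model above concrete, here is an illustrative initial state
(the specific values are an example, not mandated by the spec): with
Records = {r0, r1} and TableCapacity = 3, the table <<r0, r1, NoRecordHere>>
has a length (MaxPrimaryKey) of 2 and a capacity of 3. An INSERT at primary
key 3 is the UPDATE <<r0, r1, NoRecordHere>> -> <<r0, r1, r0>>, growing the
length to 3 while the capacity stays 3; a DELETE of row 1 is the UPDATE
<<r0, r1, r0>> -> <<NoRecordHere, r1, r0>>.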
Currently, the simplifications made here are not inductively proven to be
applicable to the real dataset. However, some prose arguments are made below
that should be convincing that the simplifications are applicable to the real,
larger (possibly infinite) dataset.
### Hand waving for: Finite table = Infinite tables ###
Note that this is not a proof in any sense of the word, merely a hand-wavy
justification of correctness.
Since TLC can only check a finite model (and only a small one is feasible),
the size of the source and target database must be kept as low as possible.
If we modeled a database with 100 rows, the possible states for TLC to verify
would explode to an astronomical number and checking would not complete in a
reasonable time. Thus, we have to keep the size of the initial databases
small, and this is justified via the concept of "super rows".
Super rows are simply the idea that we can combine multiple rows into a single
row and not change the semantics of the operation. For example, we can define
a super row to be a combination of 100 regular rows. If we have a table
with n regular rows, it would also thus have n/100 super rows.
Suppose that the super row j maps to regular rows [i, i+100) and suppose that
the DataIterator atomically copies 100 regular rows [i, i+100) in a batch.
This would be the equivalent of copying a single super row j:
ROW_COPY([i, i+100)) = SUPER_ROW_COPY(j)
As long as ROW_COPY is atomic, ROW_COPY is equivalent to SUPER_ROW_COPY.
A more complex case is if we copy 150 rows, which would imply it is copying
two super rows. Thus:
ROW_COPY([i, i+150)) = SUPER_ROW_COPY(j) /\ SUPER_ROW_COPY(j+1)
We can make similar arguments for the binlog streamer, where instead of row
copy, we are performing operations that mirror INSERT/UPDATE/DELETE.
Philosophically, we can also think of the entire table as one giant super row.
This would imply the table only needs a size of 1. However, without an actual
proof of correctness, the finite table size is chosen to be 2. To be more
cautious, the capacity of the table is 3 so we can INSERT a row.
An alternative way to think about this:
- At any moment, either the app updates a row, OR the row is copied, OR the
binlog applies an update to the row.
- For any particular row, the copy phase can only occur once. update and binlog
can occur any number of times in any order.
- The binlog respects the ordering of the updates.
- There are only three cases that we really care about:
- copy -> update -> binlog: copy before update and binlog
- update -> copy -> binlog: copy in between the update and binlog
- update -> binlog -> copy: copy after update and binlog.
- Logically, the copy operation can only happen at one of those points in time.
- This means each step is copy \/ update \/ binlog.
- Nowhere did we need to involve the size of the table, thus copy and update
can be their super-row equivalents, which means we only need 1 record to argue
that this works.
### Hand waving for: finite set of possible records to be inserted = infinite set of possible records ###
There is an infinite number of possible Records we can insert in real life.
The TLA+ spec reflects that by not assuming a size for the Record set. However,
since TLC can only check a small, finite model, the number of records in this
set must be restricted to a finite amount, and we would need to inductively
prove that the behaviours with this finite Record set are equivalent to those
with the infinite set. However, this proof is not yet available and a brief
"justification" is given below:
Since we only ever care about the transition of a row from containing one
Record to another, we only need to have two records in this set.
### Hand waving for: restricting maximum binlog size is ok ###
If we do not restrict the maximum binlog size, it will grow unbounded. We
need to limit it in TLC, otherwise the state checking will not complete.
An intuitive bound is the maximum of the capacity of the tables and the number
of different types of operations we model. The reason is that this allows at
least one branch of the state tree to update each of the entries in the table
with a different operation.
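As a concrete instance (an assumption for the checked model, not a derived
result): with a table capacity of 3 and three modeled operation types
(INSERT/UPDATE/DELETE), this reasoning bounds the binlog at max(3, 3) = 3,
which matches the MaxBinlogSize of 3 suggested near the end of this module.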
***************************************************************************)
EXTENDS Integers, Sequences, TLC
\* Helper Methods
\* ==============
SetMin(S) == CHOOSE i \in S: \A j \in S : i <= j
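\* For example, SetMin({3, 1, 2}) = 1. Note (an observation, not from the
\* original spec): CHOOSE is deterministic here because a finite, non-empty
\* set of integers has a unique minimum; SetMin({}) is undefined.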
\* Constant Declarations
\* =====================
(***************************************************************************
This defines the set of possible records that can be written to the
database.
Example: {r0, r1}
***************************************************************************)
CONSTANT Records
(***************************************************************************
These are defined as ModelValues that will serve as the identifier to the
different processes running.
***************************************************************************)
CONSTANTS TableIterator, BinlogStreamer, Application, Ferry, BinlogWriter
(***************************************************************************
This defines the capacity of the MySQL tables. The source and target databases
are modeled as a single table each, and each table is modeled as a sequence
of Records.
TableCapacity is the maximum size of the table as nothing in TLA+ that's
checkable can be infinite. Think of this as modeling for the finite disk
size of the MySQL hosts.
See more documentation in the PlusCal variable declarations for the variable
MaxPrimaryKey.
***************************************************************************)
CONSTANT TableCapacity
\* PrimaryKeys is the set of all possible primary keys, which is a set of
\* integers between 1 and TableCapacity, inclusive.
PrimaryKeys == 1..TableCapacity
\* This value cannot be computed by TLC. Use the Model to override it with
\* a model value instead.
NoRecordHere == CHOOSE r : r \notin Records
\* A set of possible records for TypeOK
PossibleRecords == Records \cup {NoRecordHere}
\* These are all the possible states of the table, but they may not be valid.
PossibleTableStates == [1..TableCapacity -> PossibleRecords]
\* This is a set of all possible binlog structs in a partial form: it is missing the pk.
PossibleOldNewRowStates == {s \in [{"oldr", "newr"} -> Records] : s.oldr # s.newr}
\* This operator adds the pk back into a partial binlog struct from PossibleOldNewRowStates.
ConstructBinlogFromOldNewRowStates(k, s) == [pk |-> k] @@ s
\* This generates a set of possible binlog entries with all possible PKs.
PossibleBinlogEntries == UNION {{ConstructBinlogFromOldNewRowStates(k, s) : k \in PrimaryKeys} : s \in PossibleOldNewRowStates}
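\* For example (illustrative, assuming Records = {r0, r1} as suggested above):
\*   ConstructBinlogFromOldNewRowStates(2, [oldr |-> r0, newr |-> r1])
\*     = [pk |-> 2, oldr |-> r0, newr |-> r1]
\* PossibleBinlogEntries contains one such record for every pk in PrimaryKeys
\* and every valid oldr/newr pair.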
\* This generates a set of possible SourceBinlog, which is a sequence.
\* For simplicity and TLC feasibility, we only generate a single binlog entry in the initial possible SourceBinlog.
\* See more comments near the SourceBinlog variable below.
PossibleSourceBinlogs == [{1} -> PossibleBinlogEntries]
(***************************************************************************
--algorithm ghostferry {
variables
\* MaxPrimaryKey is the initial length of the table (TableCapacity is not
\* that; TableCapacity is how we initialize the array). For example, suppose
\* you have an initial table <<r0, r1, NoRecordHere>>. The TableCapacity is
\* always 3. MaxPrimaryKey can be 2 or 3. If it is 2, it can "grow" to 3.
\* Think of TableCapacity as the total available disk storage space and
\* MaxPrimaryKey as the table size at the start of the run.
\*
\* CurrentMaxPrimaryKey indicates the length of the table currently.
\* This number can change as we insert entries.
\*
\* LastSuccessfulPK is the last successfully copied PK position at the start.
\* 0 means it's a fresh start, otherwise it's always a resume.
\*
\* ActualCopiedPK is the PK position that's actually copied to the target.
\* This number can be ahead of LastSuccessfulPK, which is how we test
\* idempotency.
MaxPrimaryKey \in 1..TableCapacity,
CurrentMaxPrimaryKey = MaxPrimaryKey,
LastSuccessfulPK \in 0..MaxPrimaryKey,
ActualCopiedPK \in LastSuccessfulPK..MaxPrimaryKey,
\* The source table is initialized from all the possible table states such
\* that rows at indices after MaxPrimaryKey have to be NoRecordHere.
\* Also reject the case where the entire table is empty as that is a trivial
\* case. This saves some time in TLC.
SourceTable \in {
table \in PossibleTableStates : /\ \A k \in MaxPrimaryKey+1..TableCapacity : table[k] = NoRecordHere
/\ ~(\A k \in 1..TableCapacity : table[k] = NoRecordHere)
},
\* The target table is initialized from all the possible table states such
\* that the part of the table that's behind LastSuccessfulPK is copied
\* correctly and the part of the table after LastSuccessfulPK could be
\* either copied or could be NoRecordHere.
\* Also we need to make sure we don't generate something like
\* LastSuccessfulPK: 1
\* MaxPrimaryKey: 3
\* SourceTable: <<r1, r0, r1>>
\* TargetTable: <<r1, NoRecordHere, r1>>
\*
\* To check that these initial conditions are correctly generated, add the line
\* /\ PrintT(<<LastSuccessfulPK, MaxPrimaryKey, SourceTable, TargetTable>>)
\* to the Init definition after generating the TLA+ from PlusCal and then
\* run the model (while ignoring warnings). This should print the initial
\* states in the output for examination. You can immediately cancel the run.
TargetTable \in {
table \in PossibleTableStates : /\ \A k \in 1..ActualCopiedPK: table[k] = SourceTable[k]
/\ \A k \in (ActualCopiedPK+1)..TableCapacity : table[k] = NoRecordHere
},
\* The binlogs are modeled as a list of binlog events.
\* The size of the binlog is constrained to MaxBinlogSize via
\* ActionConstraint. This serves to ensure that the binlog does not grow
\* infinitely. In essence, this assumes that Ghostferry is fast enough to
\* tail and apply binlogs.
\*
\* We generate a possible SourceBinlog before Ghostferry starts running from two possibilities:
\*
\* 1. There are no binlog entries yet. This is the "standard" Ghostferry start.
\* 2. There's a single binlog entry. This binlog entry has already been copied to the
\* target database during a previous Ghostferry run. This simulates a resume.
SourceBinlog \in (
{<<>>}
\cup
{binlogs \in PossibleSourceBinlogs : binlogs[1].newr = TargetTable[binlogs[1].pk]}
),
\* Set to TRUE when the cutover operation is started to prevent the
\* application from writing more data into the database. This is equivalent
\* to either setting the database to read only or using some method to set
\* the application to read only.
ApplicationReadonly = FALSE,
\* The binlog events are read by the BinlogStreamer and written by the
\* BinlogWriter. This is the message queue between them.
BinlogWriteQueue = <<>>,
\* This is the index of the binlog position we want to stream to when the
\* application is set to read only during the cutover.
TargetBinlogPos = 0,
\* This is set to TRUE to stop all components of Ghostferry and Ghostferry
\* should terminate after finishing streaming all the binlog events.
BinlogStreamingStopRequested = FALSE;
fair process (ProcTableIterator = TableIterator)
variables currentRow; \* The current row's data
{
\* Note that tblit_rw is an atomic step. If the read and write steps were
\* two distinct steps, a race condition could occur and cause
\* data corruption.
\* TODO: offer a way to see this data corruption with TLC.
\*
\* In order to do this in the real world, this could be done via something
\* like SELECT FOR UPDATE, which would block the Application from being
\* able to change currentRow while currentRow is being written to the
\* target database.
\*
\* It may be possible to perform some sort of locking between the
\* BinlogStreamer and the TableIterator.
\* TODO: model this with TLA+ and validate its correctness.
\*
\* It may be possible to perform some sort of locking via the Application,
\* but this seems cumbersome and prone to implementation level error.
\* TODO: model this with TLA+ and validate its correctness.
tblit_loop: while (LastSuccessfulPK < MaxPrimaryKey) {
tblit_rw: currentRow := SourceTable[LastSuccessfulPK + 1];
if (currentRow # NoRecordHere /\ TargetTable[LastSuccessfulPK + 1] = NoRecordHere) {
TargetTable[LastSuccessfulPK + 1] := currentRow;
};
tblit_upkey: LastSuccessfulPK := LastSuccessfulPK + 1;
};
}
fair process (ProcBinlogStreamer = BinlogStreamer)
variables
lastSuccessfulBinlogPos = 0, \* Last binlog pos successfully applied on the target db
currentBinlogEntry; \* The binlog event that is currently being read
{
binlog_loop: while (BinlogStreamingStopRequested = FALSE \/ (BinlogStreamingStopRequested = TRUE /\ lastSuccessfulBinlogPos < TargetBinlogPos)) {
\* We cannot use an await here as there could be a deadlock
\* when the application is set to read only and thus nothing
\* else writes to the database.
\*
\* This also means in the real implementation we need a
\* non-blocking read for the binlog.
binlog_read: if (lastSuccessfulBinlogPos < Len(SourceBinlog)) {
currentBinlogEntry := SourceBinlog[lastSuccessfulBinlogPos + 1];
binlog_queue: BinlogWriteQueue := Append(BinlogWriteQueue, currentBinlogEntry);
binlog_upkey: lastSuccessfulBinlogPos := lastSuccessfulBinlogPos + 1;
};
}
}
fair process (ProcBinlogWriter = BinlogWriter) {
binlog_writer_loop: while(pc[BinlogStreamer] # "Done" \/ Len(BinlogWriteQueue) > 0) {
binlog_write: while (Len(BinlogWriteQueue) > 0) {
with (currentBinlog = Head(BinlogWriteQueue)) {
if (TargetTable[currentBinlog.pk] = currentBinlog.oldr) {
TargetTable[currentBinlog.pk] := currentBinlog.newr;
};
};
\* Note that Tail returns everything in the sequence except the Head().
BinlogWriteQueue := Tail(BinlogWriteQueue);
};
};
}
(***********************************************************************
The application modeled here encompasses both the real application as
well as the database itself. It is modeled as a process that is always
issuing INSERT/UPDATE/DELETEs. The model also appends the record into
the binlogs.
This process also models setting the database/application as read only:
it stops writing to the database when ApplicationReadonly = TRUE.
The model combines the act of picking the row to update, writing to it,
and recording the action into the binlog as one big atomic step. This
ASSUMPTION relies upon the fact that MySQL atomically writes updates to
both the binlog and the actual data, as observed by Ghostferry.
***********************************************************************)
fair process (ProcApplication = Application)
variables
oldRecord,
newRecord,
chosenPK,
{
app_loop: while (ApplicationReadonly = FALSE) {
\* Choose a "random" PK to update.
app_write: with (pk \in 1..SetMin({TableCapacity, CurrentMaxPrimaryKey + 1})) {
chosenPK := pk;
};
oldRecord := SourceTable[chosenPK];
\* Choose a "random" record to update the chosen row, except
\* the current value of the chosen row (oldRecord), as that
\* would be a pointless update and we don't need to make TLC
\* check that.
with (r \in PossibleRecords \ {oldRecord}) {
newRecord := r;
};
SourceBinlog := Append(
SourceBinlog,
[
pk |-> chosenPK,
oldr |-> oldRecord,
newr |-> newRecord
]
);
SourceTable[chosenPK] := newRecord;
\* The following essentially implements auto_increment. We
\* might not necessarily need this, but there's no proof
\* saying that we can omit it, thus it is included.
if (oldRecord = NoRecordHere /\ chosenPK > CurrentMaxPrimaryKey) {
assert (chosenPK - 1 = CurrentMaxPrimaryKey);
CurrentMaxPrimaryKey := chosenPK;
}
}
}
(***********************************************************************
In the actual code, the Ferry class would have started all of the above.
This is unnecessary here as it is done via the Next definition in TLA+
(automatically generated from PlusCal). Thus, the Ferry here is really
the Ferry that each application must implement:
1. Waiting until the DataIterator is finished copying data.
2. Performing the cutover operation (setting the source to be read only).
3. Instructing the BinlogStreamer to quit after streaming.
Note that setting the target binlog position and requesting binlog
streaming to stop are two distinct steps. Making them one atomic step
is not realistic unless we implement a lock. With two distinct steps,
if the steps are reversed, a race condition will be present.
***********************************************************************)
fair process (ProcFerry = Ferry) {
ferry_setro: await pc[TableIterator] = "Done";
ApplicationReadonly := TRUE;
ferry_waitro: await pc[Application] = "Done";
ferry_binlogpos: TargetBinlogPos := Len(SourceBinlog);
ferry_binlogstop: BinlogStreamingStopRequested := TRUE;
}
}
***************************************************************************)
\* BEGIN TRANSLATION - the hash of the PCal code: PCal-88d6f264d1db8757d257b45fcc829f8a
CONSTANT defaultInitValue
VARIABLES MaxPrimaryKey, CurrentMaxPrimaryKey, LastSuccessfulPK,
ActualCopiedPK, SourceTable, TargetTable, SourceBinlog,
ApplicationReadonly, BinlogWriteQueue, TargetBinlogPos,
BinlogStreamingStopRequested, pc, currentRow,
lastSuccessfulBinlogPos, currentBinlogEntry, oldRecord, newRecord,
chosenPK
vars == << MaxPrimaryKey, CurrentMaxPrimaryKey, LastSuccessfulPK,
ActualCopiedPK, SourceTable, TargetTable, SourceBinlog,
ApplicationReadonly, BinlogWriteQueue, TargetBinlogPos,
BinlogStreamingStopRequested, pc, currentRow,
lastSuccessfulBinlogPos, currentBinlogEntry, oldRecord, newRecord,
chosenPK >>
ProcSet == {TableIterator} \cup {BinlogStreamer} \cup {BinlogWriter} \cup {Application} \cup {Ferry}
Init == (* Global variables *)
/\ MaxPrimaryKey \in 1..TableCapacity
/\ CurrentMaxPrimaryKey = MaxPrimaryKey
/\ LastSuccessfulPK \in 0..MaxPrimaryKey
/\ ActualCopiedPK \in LastSuccessfulPK..MaxPrimaryKey
/\ SourceTable \in {
table \in PossibleTableStates : /\ \A k \in MaxPrimaryKey+1..TableCapacity : table[k] = NoRecordHere
/\ ~(\A k \in 1..TableCapacity : table[k] = NoRecordHere)
}
/\ TargetTable \in {
table \in PossibleTableStates : /\ \A k \in 1..ActualCopiedPK: table[k] = SourceTable[k]
/\ \A k \in (ActualCopiedPK+1)..TableCapacity : table[k] = NoRecordHere
}
/\ SourceBinlog \in (
{<<>>}
\cup
{binlogs \in PossibleSourceBinlogs : binlogs[1].newr = TargetTable[binlogs[1].pk]}
)
/\ ApplicationReadonly = FALSE
/\ BinlogWriteQueue = <<>>
/\ TargetBinlogPos = 0
/\ BinlogStreamingStopRequested = FALSE
(* Process ProcTableIterator *)
/\ currentRow = defaultInitValue
(* Process ProcBinlogStreamer *)
/\ lastSuccessfulBinlogPos = 0
/\ currentBinlogEntry = defaultInitValue
(* Process ProcApplication *)
/\ oldRecord = defaultInitValue
/\ newRecord = defaultInitValue
/\ chosenPK = defaultInitValue
/\ pc = [self \in ProcSet |-> CASE self = TableIterator -> "tblit_loop"
[] self = BinlogStreamer -> "binlog_loop"
[] self = BinlogWriter -> "binlog_writer_loop"
[] self = Application -> "app_loop"
[] self = Ferry -> "ferry_setro"]
tblit_loop == /\ pc[TableIterator] = "tblit_loop"
/\ IF LastSuccessfulPK < MaxPrimaryKey
THEN /\ pc' = [pc EXCEPT ![TableIterator] = "tblit_rw"]
ELSE /\ pc' = [pc EXCEPT ![TableIterator] = "Done"]
/\ UNCHANGED << MaxPrimaryKey, CurrentMaxPrimaryKey,
LastSuccessfulPK, ActualCopiedPK, SourceTable,
TargetTable, SourceBinlog, ApplicationReadonly,
BinlogWriteQueue, TargetBinlogPos,
BinlogStreamingStopRequested, currentRow,
lastSuccessfulBinlogPos, currentBinlogEntry,
oldRecord, newRecord, chosenPK >>
tblit_rw == /\ pc[TableIterator] = "tblit_rw"
/\ currentRow' = SourceTable[LastSuccessfulPK + 1]
/\ IF currentRow' # NoRecordHere /\ TargetTable[LastSuccessfulPK + 1] = NoRecordHere
THEN /\ TargetTable' = [TargetTable EXCEPT ![LastSuccessfulPK + 1] = currentRow']
ELSE /\ TRUE
/\ UNCHANGED TargetTable
/\ pc' = [pc EXCEPT ![TableIterator] = "tblit_upkey"]
/\ UNCHANGED << MaxPrimaryKey, CurrentMaxPrimaryKey,
LastSuccessfulPK, ActualCopiedPK, SourceTable,
SourceBinlog, ApplicationReadonly,
BinlogWriteQueue, TargetBinlogPos,
BinlogStreamingStopRequested,
lastSuccessfulBinlogPos, currentBinlogEntry,
oldRecord, newRecord, chosenPK >>
tblit_upkey == /\ pc[TableIterator] = "tblit_upkey"
/\ LastSuccessfulPK' = LastSuccessfulPK + 1
/\ pc' = [pc EXCEPT ![TableIterator] = "tblit_loop"]
/\ UNCHANGED << MaxPrimaryKey, CurrentMaxPrimaryKey,
ActualCopiedPK, SourceTable, TargetTable,
SourceBinlog, ApplicationReadonly,
BinlogWriteQueue, TargetBinlogPos,
BinlogStreamingStopRequested, currentRow,
lastSuccessfulBinlogPos, currentBinlogEntry,
oldRecord, newRecord, chosenPK >>
ProcTableIterator == tblit_loop \/ tblit_rw \/ tblit_upkey
binlog_loop == /\ pc[BinlogStreamer] = "binlog_loop"
/\ IF BinlogStreamingStopRequested = FALSE \/ (BinlogStreamingStopRequested = TRUE /\ lastSuccessfulBinlogPos < TargetBinlogPos)
THEN /\ pc' = [pc EXCEPT ![BinlogStreamer] = "binlog_read"]
ELSE /\ pc' = [pc EXCEPT ![BinlogStreamer] = "Done"]
/\ UNCHANGED << MaxPrimaryKey, CurrentMaxPrimaryKey,
LastSuccessfulPK, ActualCopiedPK, SourceTable,
TargetTable, SourceBinlog, ApplicationReadonly,
BinlogWriteQueue, TargetBinlogPos,
BinlogStreamingStopRequested, currentRow,
lastSuccessfulBinlogPos, currentBinlogEntry,
oldRecord, newRecord, chosenPK >>
binlog_read == /\ pc[BinlogStreamer] = "binlog_read"
/\ IF lastSuccessfulBinlogPos < Len(SourceBinlog)
THEN /\ currentBinlogEntry' = SourceBinlog[lastSuccessfulBinlogPos + 1]
/\ pc' = [pc EXCEPT ![BinlogStreamer] = "binlog_queue"]
ELSE /\ pc' = [pc EXCEPT ![BinlogStreamer] = "binlog_loop"]
/\ UNCHANGED currentBinlogEntry
/\ UNCHANGED << MaxPrimaryKey, CurrentMaxPrimaryKey,
LastSuccessfulPK, ActualCopiedPK, SourceTable,
TargetTable, SourceBinlog, ApplicationReadonly,
BinlogWriteQueue, TargetBinlogPos,
BinlogStreamingStopRequested, currentRow,
lastSuccessfulBinlogPos, oldRecord, newRecord,
chosenPK >>
binlog_queue == /\ pc[BinlogStreamer] = "binlog_queue"
/\ BinlogWriteQueue' = Append(BinlogWriteQueue, currentBinlogEntry)
/\ pc' = [pc EXCEPT ![BinlogStreamer] = "binlog_upkey"]
/\ UNCHANGED << MaxPrimaryKey, CurrentMaxPrimaryKey,
LastSuccessfulPK, ActualCopiedPK, SourceTable,
TargetTable, SourceBinlog, ApplicationReadonly,
TargetBinlogPos, BinlogStreamingStopRequested,
currentRow, lastSuccessfulBinlogPos,
currentBinlogEntry, oldRecord, newRecord,
chosenPK >>
binlog_upkey == /\ pc[BinlogStreamer] = "binlog_upkey"
/\ lastSuccessfulBinlogPos' = lastSuccessfulBinlogPos + 1
/\ pc' = [pc EXCEPT ![BinlogStreamer] = "binlog_loop"]
/\ UNCHANGED << MaxPrimaryKey, CurrentMaxPrimaryKey,
LastSuccessfulPK, ActualCopiedPK, SourceTable,
TargetTable, SourceBinlog, ApplicationReadonly,
BinlogWriteQueue, TargetBinlogPos,
BinlogStreamingStopRequested, currentRow,
currentBinlogEntry, oldRecord, newRecord,
chosenPK >>
ProcBinlogStreamer == binlog_loop \/ binlog_read \/ binlog_queue
\/ binlog_upkey
binlog_writer_loop == /\ pc[BinlogWriter] = "binlog_writer_loop"
/\ IF pc[BinlogStreamer] # "Done" \/ Len(BinlogWriteQueue) > 0
THEN /\ pc' = [pc EXCEPT ![BinlogWriter] = "binlog_write"]
ELSE /\ pc' = [pc EXCEPT ![BinlogWriter] = "Done"]
/\ UNCHANGED << MaxPrimaryKey, CurrentMaxPrimaryKey,
LastSuccessfulPK, ActualCopiedPK,
SourceTable, TargetTable, SourceBinlog,
ApplicationReadonly, BinlogWriteQueue,
TargetBinlogPos,
BinlogStreamingStopRequested, currentRow,
lastSuccessfulBinlogPos,
currentBinlogEntry, oldRecord, newRecord,
chosenPK >>
binlog_write == /\ pc[BinlogWriter] = "binlog_write"
/\ IF Len(BinlogWriteQueue) > 0
THEN /\ LET currentBinlog == Head(BinlogWriteQueue) IN
IF TargetTable[currentBinlog.pk] = currentBinlog.oldr
THEN /\ TargetTable' = [TargetTable EXCEPT ![currentBinlog.pk] = currentBinlog.newr]
ELSE /\ TRUE
/\ UNCHANGED TargetTable
/\ BinlogWriteQueue' = Tail(BinlogWriteQueue)
/\ pc' = [pc EXCEPT ![BinlogWriter] = "binlog_write"]
ELSE /\ pc' = [pc EXCEPT ![BinlogWriter] = "binlog_writer_loop"]
/\ UNCHANGED << TargetTable, BinlogWriteQueue >>
/\ UNCHANGED << MaxPrimaryKey, CurrentMaxPrimaryKey,
LastSuccessfulPK, ActualCopiedPK, SourceTable,
SourceBinlog, ApplicationReadonly,
TargetBinlogPos, BinlogStreamingStopRequested,
currentRow, lastSuccessfulBinlogPos,
currentBinlogEntry, oldRecord, newRecord,
chosenPK >>
ProcBinlogWriter == binlog_writer_loop \/ binlog_write
app_loop == /\ pc[Application] = "app_loop"
/\ IF ApplicationReadonly = FALSE
THEN /\ pc' = [pc EXCEPT ![Application] = "app_write"]
ELSE /\ pc' = [pc EXCEPT ![Application] = "Done"]
/\ UNCHANGED << MaxPrimaryKey, CurrentMaxPrimaryKey,
LastSuccessfulPK, ActualCopiedPK, SourceTable,
TargetTable, SourceBinlog, ApplicationReadonly,
BinlogWriteQueue, TargetBinlogPos,
BinlogStreamingStopRequested, currentRow,
lastSuccessfulBinlogPos, currentBinlogEntry,
oldRecord, newRecord, chosenPK >>
app_write == /\ pc[Application] = "app_write"
/\ \E pk \in 1..SetMin({TableCapacity, CurrentMaxPrimaryKey + 1}):
chosenPK' = pk
/\ oldRecord' = SourceTable[chosenPK']
/\ \E r \in PossibleRecords \ {oldRecord'}:
newRecord' = r
/\ SourceBinlog' = Append(
SourceBinlog,
[
pk |-> chosenPK',
oldr |-> oldRecord',
newr |-> newRecord'
]
)
/\ SourceTable' = [SourceTable EXCEPT ![chosenPK'] = newRecord']
/\ IF oldRecord' = NoRecordHere /\ chosenPK' > CurrentMaxPrimaryKey
THEN /\ Assert((chosenPK' - 1 = CurrentMaxPrimaryKey),
"Failure of assertion at line 442, column 21.")
/\ CurrentMaxPrimaryKey' = chosenPK'
ELSE /\ TRUE
/\ UNCHANGED CurrentMaxPrimaryKey
/\ pc' = [pc EXCEPT ![Application] = "app_loop"]
/\ UNCHANGED << MaxPrimaryKey, LastSuccessfulPK, ActualCopiedPK,
TargetTable, ApplicationReadonly,
BinlogWriteQueue, TargetBinlogPos,
BinlogStreamingStopRequested, currentRow,
lastSuccessfulBinlogPos, currentBinlogEntry >>
ProcApplication == app_loop \/ app_write
ferry_setro == /\ pc[Ferry] = "ferry_setro"
/\ pc[TableIterator] = "Done"
/\ ApplicationReadonly' = TRUE
/\ pc' = [pc EXCEPT ![Ferry] = "ferry_waitro"]
/\ UNCHANGED << MaxPrimaryKey, CurrentMaxPrimaryKey,
LastSuccessfulPK, ActualCopiedPK, SourceTable,
TargetTable, SourceBinlog, BinlogWriteQueue,
TargetBinlogPos, BinlogStreamingStopRequested,
currentRow, lastSuccessfulBinlogPos,
currentBinlogEntry, oldRecord, newRecord,
chosenPK >>
ferry_waitro == /\ pc[Ferry] = "ferry_waitro"
/\ pc[Application] = "Done"
/\ pc' = [pc EXCEPT ![Ferry] = "ferry_binlogpos"]
/\ UNCHANGED << MaxPrimaryKey, CurrentMaxPrimaryKey,
LastSuccessfulPK, ActualCopiedPK, SourceTable,
TargetTable, SourceBinlog, ApplicationReadonly,
BinlogWriteQueue, TargetBinlogPos,
BinlogStreamingStopRequested, currentRow,
lastSuccessfulBinlogPos, currentBinlogEntry,
oldRecord, newRecord, chosenPK >>
ferry_binlogpos == /\ pc[Ferry] = "ferry_binlogpos"
/\ TargetBinlogPos' = Len(SourceBinlog)
/\ pc' = [pc EXCEPT ![Ferry] = "ferry_binlogstop"]
/\ UNCHANGED << MaxPrimaryKey, CurrentMaxPrimaryKey,
LastSuccessfulPK, ActualCopiedPK,
SourceTable, TargetTable, SourceBinlog,
ApplicationReadonly, BinlogWriteQueue,
BinlogStreamingStopRequested, currentRow,
lastSuccessfulBinlogPos, currentBinlogEntry,
oldRecord, newRecord, chosenPK >>
ferry_binlogstop == /\ pc[Ferry] = "ferry_binlogstop"
/\ BinlogStreamingStopRequested' = TRUE
/\ pc' = [pc EXCEPT ![Ferry] = "Done"]
/\ UNCHANGED << MaxPrimaryKey, CurrentMaxPrimaryKey,
LastSuccessfulPK, ActualCopiedPK,
SourceTable, TargetTable, SourceBinlog,
ApplicationReadonly, BinlogWriteQueue,
TargetBinlogPos, currentRow,
lastSuccessfulBinlogPos,
currentBinlogEntry, oldRecord, newRecord,
chosenPK >>
ProcFerry == ferry_setro \/ ferry_waitro \/ ferry_binlogpos
\/ ferry_binlogstop
(* Allow infinite stuttering to prevent deadlock on termination. *)
Terminating == /\ \A self \in ProcSet: pc[self] = "Done"
/\ UNCHANGED vars
Next == ProcTableIterator \/ ProcBinlogStreamer \/ ProcBinlogWriter
\/ ProcApplication \/ ProcFerry
\/ Terminating
Spec == /\ Init /\ [][Next]_vars
/\ WF_vars(ProcTableIterator)
/\ WF_vars(ProcBinlogStreamer)
/\ WF_vars(ProcBinlogWriter)
/\ WF_vars(ProcApplication)
/\ WF_vars(ProcFerry)
Termination == <>(\A self \in ProcSet: pc[self] = "Done")
\* END TRANSLATION - the hash of the generated TLA code (remove to silence divergence warnings): TLA-435b2ab0b73238c032317daecbca1f25
\* Safety Constraints
\* ==================
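\* SourceTargetEquality is the core correctness property: once every process
\* has reached "Done", the source and target tables must be identical (cf.
\* step 6 of the process model in the header comment).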
SourceTargetEquality == (\A self \in ProcSet: pc[self] = "Done") => (SourceTable = TargetTable)
\* /\ PrintT(<<"Source", SourceTable, "Target", TargetTable>>)
\* Action Constraints
\* ==================
\* It is possible for the binlog to grow infinitely. If it becomes too big, it
\* becomes infeasible to check the behaviour of the Ghostferry algorithm, thus
\* we limit it.
\*
\* Usually we can pick a MaxBinlogSize == 3.
CONSTANT MaxBinlogSize
BinlogSizeActionConstraint == Len(SourceBinlog) <= MaxBinlogSize
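\* A typical TLC model configuration might look like the following (an
\* illustrative sketch; the concrete values are assumptions based on the
\* comments above, not mandated by this spec):
\*
\*   CONSTANTS
\*     Records = {r0, r1}          \* two records suffice (see header comment)
\*     TableCapacity = 3           \* capacity 3 so an INSERT can grow the table
\*     MaxBinlogSize = 3           \* see the binlog bound discussion above
\*     NoRecordHere = NoRecordHere \* override with a model value
\*
\* with BinlogSizeActionConstraint registered as an action constraint and
\* SourceTargetEquality checked as an invariant.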
=============================================================================
\* Modification History
\* Last modified Wed Sep 16 16:07:45 EDT 2020 by shuhao
\* Created Thu Jan 18 11:35:40 EST 2018 by shuhao