NB.*analystgraphs s-- analystgraphs class - analyzes analyst csv
NB. and prepares dot digraph code.
NB.
NB. This class analyzes metadata in analyst *.csv and generates
NB. dot digraph code that can be used by the graphviz addon to
NB. produce structure diagrams for analyst models.
NB.
NB. verbatim:
NB.
NB. http://www.graphviz.org/
NB. http://www.jsoftware.com/jwiki/Addons/graphics/graphviz
NB. http://bakerjd99.wordpress.com/2009/09/09/fake-progamming/
NB.
NB. created: 2008apr23
NB. author: bakerjd99@gmail.com
NB. ---------------------------------------------------------
NB. 09aug19 fixed bug to allow models with no links
NB. 09aug26 graphviz loaded
NB. 13dec21 saved in (jacks) GitHub repo
require 'graphics/graphviz'
coclass 'analystgraphs'
NB.*end-header
NB. path to analyst csv print files
AnalystCsvPath=:'\\fe10data\user$\bakej01\modelcsv'
NB. carriage return character
CR=:13{a.
NB. carriage return line feed character pair
CRLF=:13 10{a.
NB. csv file extension
CSVEXT=:'.csv'
NB. header text that delimits objects in analyst print csvs
CSVHEADER=:61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 61 13 10 32 13 10{a.
NB. color of nodes with 0 in and out degree
DISCONNECTCOLOR=:'lightblue'
NB. analyst "." character
DOT=:'.'
NB. D-Cube, D-List or D-Link prefix
DTYPES=:2 3$<;._1 ' D-Cube D-List D-Link cube list link'
NB. dot main graph name
DotGraphName=:'franklin'
NB. color of box edges enclosing analystgraph
EDGECOLOR=:'blue'
NB. elist name in standard nld format - if nonnull length is forced to 1
ElistNameNLD=:''
NB. interface words for class group
IFACEWORDSanalystgraphs=:<;._1 ' loadanalystmodel loadcontributormodel'
NB. indent spaces in DOT generated code
INDENT=:' '
NB. input node color
INPUTCOLOR=:'green'
NB. color of nodes that have in and out degree
INTERIORCOLOR=:'lightgoldenrod'
NB. DOT label word wrap limit
LABELWIDTH=:15
NB. line feed character
LF=:10{a.
NB. used to filter possible links - MUST BE COMPLETE
LINKFILTERTYPES=:<;._1 '|Name:,|Library:,|Source:,|ODBC Datasource:,|Target:,|Table:,|SQL Statement:,'
NB. name library datatype delimiter character
NLDSEP=:'|'
NB. basic node style of analystgraph - usually 'filled'
NODESTYLE=:'filled'
NB. output node color
OUTPUTCOLOR=:'yellow'
NB. root words for class group
ROOTWORDSanalystgraphs=:<;._1 ' ROOTWORDSanalystgraphs appendCFlts atfrdlist dlistnldobjs epsfrps filterlinks loadanalystmodel loadcontributormodel loadhrd pcdigraph uniquesql'
NB. link source text
SOURCE=:'Source'
NB. all supported sources/targets - MUST BE COMPLETE
SOURCETARGETTYPES=:'/D-Cube /.cube./File Map /.fmap/Contributor Cube: /.contributor. /ODBC-Input /.ODBC '
NB. tab character
TAB=:9{a.
NB. link target text
TARGET=:'Target'
NB. maximum line length of dlink plaintext nodes
TEXTDLINKLABELWIDTH=:100
NB. maximum line length of dlist plaintext nodes
TEXTDLISTLABELWIDTH=:80
NB. color of arcs connecting plaintext nodes in analystgraph - white works best
TEXTNODEARROWCOLOR=:'white'
NB. color of plaintext nodes
TEXTNODECOLOR=:'lightgray'
NB. font size in points of analystgraph main title text
TITLEFONTSIZE=:24
NB. length of leading type prefix text in analyst csv
TYPEWIDTH=:6
NB. text used for links with unknown sources
UNKNOWNSOURCECUBES=:'UNKNOWN LINK SOURCE(S)'
NB. text used for links with unknown targets
UNKNOWNTARGETCUBES=:'UNKNOWN LINK TARGET(S)'
NB. adjacency matrix from source target connection table
adjfrct=:([: { 2 # [: < [: ~. ,) e. <"1
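NB. A hedged illustration of (adjfrct), assuming a small integer
NB. connection table instead of the symbol tables used elsewhere in
NB. this class. Each row (i,j) is read as: node i connects to node j.
NB.
NB.   adjfrct 3 2$0 1 1 2 0 2
NB. 0 1 1
NB. 0 0 1
NB. 0 0 0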
NB. retains string (y) after last occurrence of (x)
afterlaststr=:] }.~ #@[ + 1&(i:~)@([ E. ])
NB. retains string after first occurrence of (x)
afterstr=:] }.~ #@[ + 1&(i.~)@([ E. ])
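NB. Illustrative only: how the (afterstr) and (afterlaststr) cutters
NB. behave on a made-up name|library|type string. The sample text is
NB. an assumption and is not taken from any analyst model.
NB.
NB.   '|' afterstr 'Sales|Common|cube'       NB. Common|cube
NB.   '|' afterlaststr 'Sales|Common|cube'   NB. cube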
NB. trims all leading and trailing blanks
alltrim=:] #~ [: -. [: (*./\. +. *./\) ' '&=
NB. trims all leading and trailing white space
allwhitetrim=:] #~ [: -. [: (*./\. +. *./\) ] e. (9 10 13 32{a.)"_
antimode=:3 : 0
NB.*antimode v-- finds the least frequently occurring item(s) in
NB. a list.
NB.
NB. monad: ul =. antimode ul
NB.
NB. antimode ?.500#100
NB. antimode ;:'blah blah blah yada yada wisdom'
if. 0 < # y =. ,y do. NB. no antimodes for null lists
f =. #/.~ y NB. nub frequency
(~. y) #~ f e. <./ f NB. lowest frequency items
else. y
end.
)
appendCFlts=:4 : 0
NB.*appendCFlts v-- kludge to work around analyst CF link
NB. printing bug.
NB.
NB. Printing an analyst link that references a CF source/target
NB. crashes analyst 8.3 (08sep15). This verb appends in a list of
NB. known links to the (lts) table so these sources and sinks can
NB. be included in digraphs.
NB.
NB. dyad: clCFlinks appendCFlts (stSrm;<stLts)
NB.
NB. StagingCFLinks appendCFlts (srm;<lts)
'srm lts'=. y
if. 0=#x do. lts return. end.
NB. known CF links
cfl=. s: x
NB. rumsfeldian known unknowns
unsource=. s:<UNKNOWNSOURCECUBES
untarget=. s:<UNKNOWNTARGETCUBES
lc=. {:"1 >0{srm
cb=. {."1 >0{srm
NB. test link names - a link does not have to
NB. reference known sources and targets
b=. cfl e.{."1;lc
if. 0 e. b do.
NB. pass through links - unknown sources and targets
ptl=. (-.b)#cfl
lts=. lts , (ptl ,. unsource) ,. untarget
end.
if. -.+./b do. lts return. end.
cfl=. b#cfl
NB. blpl of masks of cubes in library that reference known CF links
m=. ({."1&.> lc) e.&.> <cfl
b=. +./&> m
NB. cubes with CF links
lc=. b#m #&.> lc
cb=. (#&> lc)#b#;cb
cb=. cb ,. ;lc
NB. links to first column
cb=. 1 2 0 {"1 cb
NB. we cannot see CF sources/targets - replace any target symbols with UNKNOWNS
pt=. I. (1 {"1 cb) = s:<TARGET
cb=. unsource (<pt;1)} cb
NB. rotate any remaining sources to proper column
ps=. (1 {"1 cb) = s:<SOURCE
cb=. (0 {"1 cb) ,. ps (|."0 1) 1 2 {"1 cb
cb=. untarget (<(I. ps);2)} cb
NB. append to link source target table
lts , cb
)
NB. signal with optional message
assert=:0 0"_ $ 13!:8^:((0: e. ])`(12"_))
atfrdlist=:3 : 0
NB.*atfrdlist v-- generate TAB delimited access table text from dlist csv text.
NB.
NB. monad: atfrdlist clDlistCsv
NB.
NB. di=. dlistnldobjs read '\\fe10data\user$\bakej01\modelcsv\dev common dlists.csv'
NB. pos=. (;0{di) i. s: <'2 - Products SRP2 and SRP3|Common|list'
NB. atfrdlist ;pos{1{di
NB.
NB. dyad: clAccess atfrdlist clDlistCsv
NB.
NB. 'HIDDEN' atfrdlist dltxt
'WRITE' atfrdlist y
:
'dlist items'=. 1 dlistitems y
head=. (NLDSEP&beforestr dlist),TAB,'AccessLevel',CRLF
head, ;items ,&.> <TAB,(alltrim x),CRLF
)
NB. retains string (y) before last occurrence of (x)
beforelaststr=:] {.~ 1&(i:~)@([ E. ])
NB. retains string before first occurrence of (x)
beforestr=:] {.~ 1&(i.~)@([ E. ])
betweenstrs=:4 : 0
NB.*betweenstrs v-- select sublists between nonnested delimiters
NB. discarding delimiters.
NB.
NB. dyad: blcl =. (clStart;clEnd) betweenstrs cl
NB. blnl =. (nlStart;nlEnd) betweenstrs nl
NB.
NB. ('start';'end') betweenstrs 'start yada yada end boo hoo start ahh end'
NB.
NB. NB. also applies to numeric delimiters
NB. (1 1;2 2) betweenstrs 1 1 66 666 2 2 7 87 1 1 0 2 2
's e'=. x
llst=. ((-#s) (|.!.0) s E. y) +. e E. y
mask=. ~:/\ llst
(mask#llst) <;.1 mask#y
)
NB. boxes open nouns
boxopen=:<^:(L. = 0:)
changestr=:4 : 0
NB.*changestr v-- replaces substrings - see long documentation.
NB.
NB. dyad: clReps changestr cl
NB.
NB. NB. first character delimits replacements
NB. '/change/becomes/me/ehh' changestr 'blah blah ...'
pairs=. 2 {."(1) _2 [\ <;._1 x NB. change table
cnt=._1 [ lim=. # pairs
while. lim > cnt=.>:cnt do. NB. process each change pair
't c'=. cnt { pairs NB. /target/change
if. +./b=. t E. y do. NB. next if no target
r=. I. b NB. target starts
'l q'=. #&> cnt { pairs NB. lengths
p=. r + 0,+/\(<:# r)$ d=. q - l NB. change starts
s=. * d NB. reduce < and > to =
if. s = _1 do.
b=. 1 #~ # b
b=. ((l * # r)$ 1 0 #~ q,l-q) (,r +/ i. l)} b
y=. b # y
if. q = 0 do. continue. end. NB. next for deletions
elseif. s = 1 do.
y=. y #~ >: d r} b NB. first target char replicated
end.
y=.(c $~ q *# r) (,p +/i. q)} y NB. insert replacements
end.
end. y NB. altered string
)
charsub=:4 : 0
NB.*charsub v-- single character pair replacements.
NB.
NB. dyad: clPairs charsub cu
NB.
NB. '-_$ ' charsub '$123 -456 -789'
'f t'=. ((#x)$0 1)<@,&a./.x
t {~ f i. y
)
classifyobjs=:3 : 0
NB.*classifyobjs v-- collects cubes, lists and links from general
NB. analyst model csv.
NB.
NB. monad: blcl =. classifyobjs clCsv
NB.
NB. csv=. read 'g:\modelcsv\dev sales forecast model.csv'
NB. classifyobjs csv
NB.
NB. dyad: blcl=. blcl classifyobjs clCsv
NB.
NB. NB. get only dlists
NB. common=. read 'g:\modelcsv\dev common.csv'
NB. (1{0{DTYPES) classifyobjs common
NB.
NB. NB. dcubes and dlinks
NB. (0 2{0{DTYPES) classifyobjs common
(0{DTYPES) classifyobjs y
:
NB. check types
dtypes=. ~. x
'invalid object types' assert dtypes e. 0{DTYPES
csv=. labelfirstcsvobj y
NB. cut on header
'invalid first object header' assert 0{mask=. CSVHEADER E. csv
csv=. mask <;.1 csv
NB. collect objects
otype=. allwhitetrim&.> 'Name:,'&beforestr&.> 'Type of object:,'&afterstr&.> csv
order=. /: otype=. dtypes i. otype
;&.> (#dtypes){.(order{otype) </. order{csv
)
colorinputs=:4 : 0
NB.*colorinputs v-- generate DOT node coloring code.
NB.
NB. dyad: (stShortlongxref;clNodeattr) colorinputs slNodes
'st na'=. x
ctl INDENT ,"1 ;"1 [ 5 s: (((1{"1 st) i. y) { 0 {"1 st) ,. s:<' [',na,'];'
)
coloroutputs=:4 : 0
NB.*coloroutputs v-- generate DOT output node coloring code.
NB.
NB. dyad: (stShortlongxref;clNodeattr) coloroutputs slNodes
'st na'=. x
ctl INDENT ,"1 ;"1 [ 5 s: (((1 {"1 st) i. y) { 0 {"1 st) ,. s:<' [',na,'];'
)
countsql=:3 : 0
NB.*countsql v-- sql syntax match test.
NB.
NB. Given a list of SQL statements this verb attempts to count
NB. the number of distinct statements. The test is based on SQL
NB. syntax and should ignore all white space and identifier case
NB. differences.
NB.
NB. The best algorithm uses J's word parsing to cut away all the
NB. white space and to isolate single quote delimited strings.
NB. Double quote delimited strings are not handled. J word
NB. formation is not ideal for SQL and a better algorithm would
NB. use a finite state machine to tokenize SQL.
NB.
NB. monad: btcl =. countsql blclSql
NB. attempt to tokenize sql
tsql=. (;: :: _1:) &.> y
if. +./_1 e.&> tsql do.
NB. sql will not cut with J word formation use naive algorithm
tsql=. >y ,&.> y
tsql=. toupper reb"1 ljust tsql
usql=. ~. tsql
usql=. usql #~ -. *./"1 ' ' = usql
else.
tsql=. > tsql
qp=. I. ,''''&e.&> tsql
str=. qp { ,tsql
NB. upper case for all non-empty tokens - the empty
NB. test is marginally faster than: toupper L: 0 tsql
tsql=. (]`toupper@.(0 < #)) L: 0 tsql
NB. reinsert quoted text
tsql=. ($tsql) $ str qp} ,tsql
NB. unique nonempty rows
usql=. ~.tsql
usql=. usql #~ -. *./"1 ] 0=#&> usql
end.
NB. label unique sql
selst=. (<"1 'select #' ,"1 ljust ": ,.>:i.#usql),<''
usql=. ((usql i. tsql) {selst) ,. y
)
NB. character table to newline delimited list
ctl=:}.@(,@(1&(,"1)@(-.@(*./\."1@(=&' '@])))) # ,@((10{a.)&(,"1)@]))
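NB. A small sketch of (ctl) on a made-up two row character table:
NB. trailing blanks are dropped from each row and the rows are joined
NB. with LF characters.
NB.
NB.   ctl 2 3$'ab cd '
NB. ab
NB. cd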
NB. enclose all character lists in blcl in " quotes
dblquote=:'"'&,@:(,&'"')&.>
dcubelistsizes=:3 : 0
NB.*dcubelistsizes v-- computes cube list sizes table
NB.
NB. monad: bt =. dcubelistsizes stSrm
NB.
NB. srm=: modelstruc hrm
NB. dcubelistsizes srm
cubes=. ;{."1 >0{ y
lists=. 1 {"1 >1{ y
ld=. lists #&.>~ ({."1&.> lists) e.&.> <cubes
ld=. (0 2 {"1 >1{y) ,. ld
ld=. ld #~ 0 < #&> {:"1 ld
ld=. (( #&.> {:"1 ld) ,.@#&.> 2 {."1 ld) ,. {:"1 ld
ld=. (; 0{"1 ld);(; 1 {"1 ld); ;2 {"1 ld
ld=. (</: >{:"1 ld) {&.> ld
NB. if a nonempty elist name has been set - replace
NB. whatever length the elist has with 1 - this better
NB. reflects actual node size in contributor models
if. #ElistNameNLD do.
el=. s:<ElistNameNLD
ld=. (<1(I.,el=;0{ld)};1{ld) (1)}"1 ld
end.
ld=. |:(<~: >{:"1 ld) <;.1&> ld
((1&{.)@:,&.> ~.&.> 2 {"1 ld) ,. 2 {."1 ld
)
dcubesizes=:3 : 0
NB.*dcubesizes v-- computes cube sizes from cube list sizes
NB.
NB. monad: bt =. dcubesizes btDcubelistsizes
NB.
NB. dcs=. dcubelistsizes srm
NB. dcubesizes dcs
(5 s: ;0 {"1 y) ,. <"1 (#&> 1 {"1 y) ,. */&> 2 {"1 y
)
dcubestat=:3 : 0
NB.*dcubestat v-- descriptive statistics for dcubes
NB.
NB. monad: ct =. dcubestat nl
NB.
NB. dcubestat ?.1000#100
NB.
NB. dyad: ct =. faRound dcubestat nl
NB.
NB. 0.1 dcubestat ?.1000#100
0.0001 dcubestat y
:
t=. '/sample size/minimum/maximum/1st quartile/2nd quartile/3rd quartile/first mode'
t=. t , '/first antimode/mean/std devn'
min=. <./
max=. >./
t=. ,&': ' ;._1 t
v=. $,min,max,q1,median,q3,({.@mode2),({.@antimode),mean,stddev
s=. x round ,. v , y
NB. split to properly format nondecimals
t ,. rjust ('c' (8!:2) <. 0 1 2 { s) , ('c' (8!:2) (3 4 5 6 7){s) , ('c' (8!:2) 8 9{s)
)
NB. deviation about mean
dev=:-"_1 _ mean
dimxref=:3 : 0
NB.*dimxref v-- dlist/dimension xref
NB.
NB. monad: dimxref stDlist
NB.
NB. dls=. dcubelistsizes srm
NB. dimxref dls
un=. /:~ ~. ; ,&.> 1 {"1 y NB. unique dlists
NB. short name full xref
(s: 'd' ,"1 ljust ": ,. >:i. #un) ,. un
)
dlistitems=:3 : 0
NB.*dlistitems v-- extract dlist item table from dlist object.
NB.
NB. monad: dlistitems clDlistCsv
NB.
NB. do=. objlist read '\\fe10data\user$\bakej01\modelcsv\dev common dlists.csv'
NB. dlistitems 0 pick do
0 dlistitems y
:
th=. 'No.,IID,Item name,Format,Calc,Calc Options'
NB. following code assumes current *.csv dlist format
NB. will break if ',' character appears after dlist item section
items=. ',' ,~ ','&beforelaststr (th,LF)&afterstr y -. CR
NB. replace quoted ,'s and parse
rep=. 1{a.
'replacement character in dlist text' assert -. rep e. items
items=. ('",',rep) requoted items
items=. items -. '"'
items=. (<;._1 ',', th) , <;._1&> ',' ,&.> <;._2 tlf items
NB. restore ,'s in parsed text
if. #rows=. I. +./"1 rep&e.&> items do.
items=. ((rep,',')&charsub&.> rows{items) rows} items
end.
NB. remove items without item numbers & iids
if. x-:1 do.
itiid=. _1&".&.> (0 1 {"1 items) -.&.> ' '
items=. items #~ (*./"1 ] 0 < ;"1 itiid) *. *./"1 #&> itiid
items=. 2 {"1 items
end.
NB. extract dlist name and library
name=. allwhitetrim ;('Name:,';'Library:,') betweenstrs y
lib=. allwhitetrim ;('Library:,';'Created:,') betweenstrs y
(name,NLDSEP,lib);<items
)
dlistnldobjs=:3 : 0
NB.*dlistnldobjs v-- returns a bt of dlist nlds and cut object text.
NB.
NB. monad: dlistnldobjs clDlistCsv
NB.
NB. dlistnldobjs read '\\Fe10sql-cp-dev\f$\Planning Data Loads\Expenses Support Files\prd common dlists.csv'
do=. objlist y
('list'&nld&.> do) ,: do
)
NB. extracts length of dlist from analyst csv
dlistsize=:[: _1&".&> (<13 10 9{a.) -.&.>~ [: 'Timescale:,'&beforestr&.> 'Number of items:,'&afterstr&.>
NB. bit mask of ct rows starting with 'D-'
dlmask=:[: , 1 {."1 (,:'D-') E. ]
dotdigraph=:4 : 0
NB.*dotdigraph v-- generates dot digraph code.
NB.
NB. This verb takes a directed graph represented as a three
NB. column table of name source and target (parent child) symbols
NB. and generates DOT code that can be used by the graphviz addon
NB. to draw the graph.
NB.
NB. Long labels are mapped to short forms and formatted so they
NB. will wrap within the graph nodes. This handles long analyst
NB. object names.
NB.
NB. dyad: clDot =. btHrm dotdigraph stNameSourcetarget
NB.
NB. lts=. hrm linktargets hrd NB. see: linktargets
NB. dot=. (hrm;'dotname') dotdigraph lts
NB. dotflt=. ((<hrm),'dotname';'\n(filtered)') dotdigraph lts
'hrm dotname filter'=. 3{.x
NB. parsed model to symbol
srm=. modelstruc hrm
NB. formatted dcube sizes
dls=. dcubelistsizes srm
dcs=. dcubesizes dls
NB. extract library model name
lb=. NLDSEP&beforestr NLDSEP&afterstr ,>0{0{dcs
dcsfmt=. (<"1 'lp<\n>q<d>6.0,cq<c>' 8!:2 >{:"1 dcs) (<a:;1)} dcs
dcsfmt=. dcsfmt fmtdimlists dls
NB. topological link source target
lts=. sortlts y
NB. source/target cubes
st=. 1 2 {"1 lts
NB. link names
ln=. 0 {"1 lts
NB. short labels long name xref
nxf=. nodexref st
NB. cubes without any links
ncn=. ~.(;{."1 >0{srm) -. {:"1 nxf
ncxf=. (#nxf) nodexref ncn
NB. map nodes to short forms
dg=. ((1 {"1 nxf) i.st) { 0 {"1 nxf
NB. link labels
ll=. s: (<' "]') ,&.>~ (<' [label=" ') ,&.> ":&.> <"0 >: i. # dg
NB. dot node connection syntax
dg=. (0 3 1 2 4) {"1 (dg ,. ll) ,"1 s: '->';'; //L: '
dg=. dg ,. ln
dg=. ctl ;"1 (<' ') ,. (5 s: dg) ,&.> ' '
NB. graph header and trailer - cluster 1
hdr=. 'digraph ',dotname,' {',LF,'subgraph cluster1 {',LF,' ordering=out;',LF
if. #NODESTYLE do.
hdr=. hdr,' node [style=',NODESTYLE,', color=',INTERIORCOLOR,'];',LF
dccolor=. ', color=',DISCONNECTCOLOR
edcolor=. ' color=',EDGECOLOR
incolor=. ' color=',INPUTCOLOR
outcolor=. ' color=',OUTPUTCOLOR
else.
NB. with no nodestyle a plain uncolored graph is generated
NB. this form is more readable when printed on bw printers
dccolor=. edcolor=. ''
NB. replace colors with peripheries
incolor=. outcolor=. 'peripheries=3'
end.
tr=. LF,'}'
NB. node labels
ndl=. (dcsfmt;'') labelsfrnld nxf
NB. disconnected labels
dcnodes=. (dcsfmt;dccolor) labelsfrnld ncxf
NB. color input and output nodes
innodes=. (nxf;incolor) colorinputs ~.(1 {"1 st) -. 0 {"1 st
outnodes=. (nxf;outcolor) coloroutputs ~.(0 {"1 st) -. 1 {"1 st
NB. statistic text nodes
lsg=.LF,(dcs;<dls) fmtlinklabels ln
tail=. LF,INDENT,'label="',lb,'\n',(timestamp ''),filter,'" fontsize=',(":TITLEFONTSIZE),';',LF,edcolor,LF,'}'
lsg=. lsg,tail
NB. dot code
hdr,dg,LF,ndl,LF,innodes,LF,outnodes,LF,dcnodes,lsg,tr
)
epsfrps=:3 : 0
NB.*epsfrps v-- convert postscript (ps) to encapsulated
NB. postscript (eps).
NB.
NB. Many simple postscript files can be converted to encapsulated
NB. postscript with this simple hack. The postscript generated by
NB. the graphviz addon can be converted with this verb. WARNING:
NB. this type of hack will not work for most postscript files and
NB. may stop working for graphviz outputs in the future.
NB.
NB. monad: clPathfileEPS =. epsfrps clPathfilePs
'missing .ps file extension' assert 1 e. '.ps' E. y
ps=. read y NB. get postscript PS
eps=. '%!PS-Adobe-3.0 EPSF-3.0',CRLF,ps NB. instant EPS
epsfile=. ('.eps' ,~ '.'&beforelaststr) y NB. eps file
eps write epsfile NB. save eps data
epsfile NB. file name result
)
filterlinks=:4 : 0
NB.*filterlinks v-- filter analyst model link table.
NB.
NB. The internal links within a contributor model must be
NB. confined to a single analyst library. Links that target items
NB. outside a library are seldom included in a contributor model.
NB. This verb removes all such links from a source target table.
NB.
NB. dyad: st =. clLibName filterlinks stSourcetarget
NB. (n*2) source target st to (n*6) btcl
elts=. ;"1 (<;._1)&.> NLDSEP ,&.> 5 s: 1 2 {"1 y
NB. mask matching (x) library name
mask=. *./"1 (<allwhitetrim x) -:&> allwhitetrim&.> 1 4 {"1 elts
NB. contributor links
mask # y
)
fmtdimlists=:4 : 0
NB.*fmtdimlists v-- format dlist label xref lists.
NB.
NB. Appends a column of xrefed dlists to cube sizes table (x).
NB.
NB. dyad: bt =. dcs fmtdimlists dls
NB. dlist short name long name xref
s=. dimxref y
NB. short names replacing long dlist names
d=. ((<1 {"1 s) i.&.> 1 {"1 y) {&.> <0 {"1 s
NB. short names to char lists
d=. ,&.> (rjust&.> 4&s:@,&.> d) ,"1&.> '-'
d=. }:&.> d -.&.> ' '
NB. we need only one d
d=. (<'d= ') ,&.> d -.&.> 'd'
NB. append to cube sizes
x ,. (<'\n') ,&.> d
)
fmtdlistsizesxref=:3 : 0
NB.*fmtdlistsizesxref v-- formats dlist lengths xref.
NB.
NB. monad: fmtdlistsizesxref btDls
dxf=. dimxref y
udls=. uniquedlistsizes y
NB. attach dlist lengths to xref symbol
dlens=. <"1 'p<[>q<]>' (8!:2) > ((; 0 {"1 udls) i. 1 {"1 dxf) { 1 {"1 udls
dsx=. s: (5 s: 0 {"1 dxf) ,&.> ' ' ,&.> dlens
NB. return xref bt
(dsx) (<a:;0)} dxf
)
fmtlinklabels=:4 : 0
NB.*fmtlinklabels v-- formats a single node subgraph that contains left justified link names.
NB.
NB. dyad: clNode =. (btDcs;<btDls) fmtlinklabels slLinknames
NB. dcube and dlist size tables & contributor flag
'dcs dls'=. x
ctr=. 1
NB. dlist names with index prefix
dln=. allwhitetrim&.> <"1 ] 2 }."1 ;"1 (<': ') ,&.> 5 s: fmtdlistsizesxref dls
ll=. 5 s: y NB. link names
NB. extract library model name
lb=. NLDSEP&beforestr NLDSEP&afterstr ,>0{0{dcs
NB. node suffix
sfx=. ',shape=plaintext, color=',TEXTNODECOLOR,', fontname=courier];'
NB. model summary
ms=. INDENT,'l0 [label="',lb,(modelsummary dcs;<dls),'"',sfx,LF
NB. dcube size statistics
dcs=. INDENT,'l1 [label="',lb,' DCube Sizes',(,'\n' ,"1 ] 0.1 dcubestat {:&> {:"1 dcs),'"',sfx,LF
NB. dlist length statistics
dls=. ;&.> <"1 |: 1 2 {"1 dls
dls=. ,(~:>{."1 dls) # >{:"1 dls
dls=. INDENT,'l2 [label="',lb,' DList Lengths',(,'\n' ,"1 ] 0.1 dcubestat dls),'"',sfx,LF
NB. head=. 'subgraph cluster2 {',LF,' node [style=filled, color=white];',LF
head=.''
NB. tail=. LF,INDENT,'label="',lb,' Model\n',(timestamp ''),'" fontsize=24;',LF,' color=',EDGECOLOR,LF,'}'
NB. dlink names
ll=. 5 s: y
if. ctr do.
NB. all contributor model links should exist in one library
NB. if this is not the case do not peel library names
lbcn=. NLDSEP&afterlaststr&.> NLDSEP&beforelaststr&.> ll
lbcn=. ~.lbcn -. <lb
ctr=. ctr - 0<#lbcn
end.
NB. standard utility !(*)=. list
fl=. (,'\n' ,"1 ,. TEXTDLINKLABELWIDTH list (ljust ': ' ,"1~ ": ,. >:i.#ll) ,. >(NLDSEP&beforelaststr^:(1+ctr))&.> ll) ,'"'
fl=. INDENT,'l3 [label="',lb,' DLink Names',fl,sfx
NB. dlist names
if. ctr do.
NB. determine common library name if it exists
lbcn=. NLDSEP&afterlaststr&.> NLDSEP&beforelaststr&.> ,dln
lbcn=. lbcn -. <lb
if. 1=#~. lbcn do.
lbc=. ;{.~.lbcn -. <lb
lbc=. ''"_ `('\n Common Library: '&,)@.(0<#lbc) lbc
else.
NB. proper contributor models have lists from at most two libraries
NB. if more are detected do not peel library names on dlists
lbc=. '' [ ctr=. 0
end.
dl=. (lbc,,'\n' ,"1 TEXTDLISTLABELWIDTH list > (NLDSEP&beforelaststr^:(1+ctr))&.> ,dln) , '"'
else.
dl=. (,'\n' ,"1 TEXTDLISTLABELWIDTH list > NLDSEP&beforelaststr&.> ,dln) , '"'
end.
dl=. LF,INDENT,'l4 [label="',lb,' Dlist Names',dl,sfx
rn=. LF,INDENT,'{rank=same; l1; l2}'
head,ms,dcs,dls,fl,dl,rn,LF,INDENT,'l0 -> l1 -> l2 -> l3 -> l4 [color=',TEXTNODEARROWCOLOR,'];' NB. ,tail
)
labelfirstcsvobj=:3 : 0
NB.*labelfirstcsvobj v-- labels the first dcube, dlist or
NB. dlink in model csv.
NB.
NB. Analyst 8.4 print CSVs have an annoying feature. The first
NB. object is not labeled with a header like all subsequent
NB. objects. I am sure there is some lame ass reason for this but
NB. it complicates parsing. This verb inspects the first object
NB. and labels it. The only valid labels applied are dcube, dlist
NB. and dlink. Other types are not labeled but a header break is
NB. inserted so whatever it is can be parsed.
NB.
NB. monad: cl =. labelfirstcsvobj clCsv
NB. extract any nonblank text before the first header
if. #fo=. alltrim CSVHEADER&beforestr y do.
NB. inspect file type to determine what the object is
fo=. '.'&afterlaststr CRLF&beforestr 'Name of DOS file:,'&afterstr fo
select. fo
case. 'H1' do. CSVHEADER,'Type of object:,D-List',CRLF,y
case. 'H2' do. CSVHEADER,'Type of object:,D-Cube',CRLF,y
case. 'H8' do. CSVHEADER,'Type of object:,D-Link',CRLF,y
case. do. CSVHEADER,'Type of object:,NOTLABLED',CRLF,y
end.
else.
y
end.
)
labelsfrnld=:3 : 0
NB.*labelsfrnld v-- format DOT labels from nld symbols.
NB.
NB. monad: cl =. labelsfrnld stNxf
NB. dyad: cl =. (btCubesizes;clLabelmodifiers) labelsfrnld stNxf
NB.
NB.
NB. NB. dcube size table
NB. dcs=. dcubesizes dcubelistsizes srm
NB.
NB. NB. append node modifier string to formatted labels
NB. (dcs;' ,color=paleblue') labelsfrnld ncxf NB. see (dotdigraph)
('';'') labelsfrnld y
:
NB. cube size table and label modifier string
'dcs lbm'=. x
NB. extract only names from nld
names=. {."1 <;._1"1 NLDSEP ,. 4 s: 1 {"1 y
NB. full names more useful for pure analyst data flows
NB. names=. 5 s: 1 {"1 y
NB. NIMP check no quotes no \ chars
NB. analyst object names are often long - insert breaks
wrp=. LABELWIDTH&wrapwords
names=. wrp&.> names
names=. (TAB,LF,TAB,'\n')&changestr&.> names
NB. append any matching cube dimension/sizes
if. #dcs do.
NB. match cubes and reorder
dcs=. ((s: 0 {"1 dcs) i. 1 {"1 y) { dcs , '';''
NB. append dimension cell counts and dim lists
dims=. allwhitetrim&.> <"1 ;"1 ] allwhitetrim &.> 1 2{"1 dcs
names=. allwhitetrim&.> names ,&.> dims
end.
NB. label syntax with short xref node names
names=. dblquote names
labels=. (<' [label=') ,&.> names ,&.> <lbm,'];'
ctl >(<' ') ,&.> (5 s: 0 {"1 y) ,&.> labels
)
linktargets=:4 : 0
NB.*linktargets v-- link source and target cubes.
NB.
NB. dyad: bt linktargets blclCsv
NB.
NB. lts=. hrm linktargets hrd NB. see: parsemodelstruc
NB. link table
if. 0 e.$lnks=. 2 pick y do.
0 3$s:<'' NB. no dlinks
else.
tab=. ];._1 LF,lnks -. CR
NB. filter table
mask=. +./ ,&> (,:&.> LINKFILTERTYPES) (1&{."1)@:E.&.> <tab
'invalid model data' assert +./mask
tab=. rebtbcol mask#tab
NB. link symbol name
names=. (allwhitetrim&.> <"1 'Name:,'&afterstr"1 tab) -. a:
librs=. (allwhitetrim&.> <"1 'Library:,'&afterstr"1 tab) -. a:
names=. s: }."1 ;"1 NLDSEP ,&.> names ,. librs ,. <'link'
NB. check link names
modellinks=. ; 0 {"1 [ 2 pick x
'link name references invalid' assert names e. modellinks
NB. targets and sources
targets=. (allwhitetrim&.> <"1 'Target:,'&afterstr"1 tab) -. a:
NB. relabel any ODBC sources - ODBC sources
NB. do not follow analyst naming conventions
if. +./odbcmask=. 0 {"1 (,:'ODBC Datasource:,') E. tab do.
NB. DANGER WILL ROBINSON - using an offset to pull dsn,sql
dsn=. rebtbcol tab #~ _1 |.!.0 odbcmask
NB. Count distinct sql statements - this count is only
NB. an estimate of how many "different" sources as
NB. different sql could return the same result and
NB. my test for sql syntax equivalence is naive.
NB. A GOOD ANALYST MODEL SHOULD HAVE STANDARDIZED SQL
NB. STATEMENTS IN LINKS - people should give up
NB. smoking as well.
NB. WARNING: Cognos Analyst mangles the SQL in *.csv
NB. files so even if this was a perfect sql equivalence
NB. it will still be misleading. (Feb 12, 2009)
sql=. countsql <"1 (#'SQL Statement:,') }."1 tab #~ _2 |.!.0 odbcmask
odbctab=. '/Table:,/ODBC Datasource:,ODBC-Input 'changestr ctl dsn,.' ',. >0{"1 sql
odbctab=. ('/',DOT,'/-/',NLDSEP,'/ ') changestr odbctab
odbctab=. <;._2 tlf odbctab
tab=. >odbctab (I. odbcmask) } <"1 tab
end.