Ideas for refining the structure of FFTrees objects #226

ndphillips · 2024-05-29T12:32:41Z

@hneth In the context of #224 I realized that the structure of FFTrees objects is making development challenging, for example when deciding how to modularize the plot.FFTrees() function.

The current code to create FFTrees objects is contained here:

FFTrees/R/fftrees_create.R

Lines 765 to 852 in 71b9da0

    
    # Create x (as list):
    x <- list(

      # Names of criterion vs. cues:
      criterion_name = criterion_name,
      cue_names = cue_names,

      # Formula:
      formula = formula, # original formula

      # Tree info:
      trees = list(
        n = NULL,
        best = NULL,
        definitions = NULL,
        inwords = NULL,
        stats = NULL,
        level_stats = NULL,
        decisions = list(
          train = list(),
          test = list()
        )
      ),

      # Raw training data:
      data = list(
        train = data,
        test = data.test
      ),

      # Store parameters (as list):
      params = list(
        algorithm = algorithm,
        #
        goal = goal,
        goal.chase = goal.chase,
        goal.threshold = goal.threshold,
        #
        max.levels = max.levels,
        numthresh.method = numthresh.method,
        numthresh.n = numthresh.n,
        repeat.cues = repeat.cues,
        stopping.rule = stopping.rule,
        stopping.par = stopping.par,
        #
        sens.w = sens.w,
        #
        cost.outcomes = cost.outcomes,
        cost.cues = cost.cues,
        #
        main = main,
        decision.labels = decision.labels,
        #
        my.goal = my.goal,
        my.goal.fun = my.goal.fun,
        my.tree = my.tree,
        #
        quiet = quiet
      ),

      # One row per algorithm competition:
      competition = list(
        train = data.frame(
          algorithm = NA,
          n = NA,
          hi = NA, fa = NA, mi = NA, cr = NA,
          sens = NA, spec = NA, far = NA,
          ppv = NA, npv = NA,
          acc = NA, bacc = NA,
          cost = NA, cost_dec = NA, cost_cue = NA
        ),
        test = data.frame(
          algorithm = NA,
          n = NA,
          hi = NA, fa = NA, mi = NA, cr = NA,
          sens = NA, spec = NA, far = NA,
          ppv = NA, npv = NA,
          acc = NA, bacc = NA,
          cost = NA, cost_dec = NA, cost_cue = NA
        ),
        models = list(lr = NULL, cart = NULL, rf = NULL, svm = NULL)
      ) # competition.

    ) # x.

Here are the core issues I see:

- Inconsistent and confusing naming
  - Ex) How do the definitions, inwords, stats, level_stats, and decisions elements of an FFTrees object relate to each other?
- Information at different levels of abstraction isn't consistently stored
  - Ex) Why are tree definitions stored at the tree level but not the node level? Why aren't overall tree accuracy stats located close to the tree level accuracy stats?
- Inconsistent storage locations
  - Ex) Why are criterion_name, cue_names and formula stored at the same level as trees, data and params? Could these be stored in a list such as metadata?
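
To make the last point concrete, here is a minimal sketch of how the three top-level descriptors could be grouped under one metadata element. It is only an illustration: the helper regroup_metadata() and the toy values are hypothetical and not part of FFTrees, and the element name metadata is just the one floated above.

```r
# Hypothetical helper (not part of FFTrees): move the top-level descriptors
# of an FFTrees-like list into a single `metadata` sub-list.
regroup_metadata <- function(x,
                             fields = c("criterion_name", "cue_names", "formula")) {
  present  <- intersect(fields, names(x))   # only move fields that exist
  metadata <- x[present]                    # collect them in one sub-list
  rest     <- x[setdiff(names(x), present)] # all other elements stay as they are
  c(list(metadata = metadata), rest)
}

# Toy stand-in for the list built in fftrees_create.R (all values are made up):
x <- list(
  criterion_name = "diagnosis",
  cue_names      = c("age", "thal"),
  formula        = diagnosis ~ age + thal,
  trees          = list(n = NULL, best = NULL),
  data           = list(train = NULL, test = NULL),
  params         = list(algorithm = "ifan")
)

x_new <- regroup_metadata(x)
names(x_new)                        # top-level element names after regrouping
str(x_new$metadata, max.level = 1)  # contents of the new metadata element
```

Note that c() on lists drops attributes other than names, so a real implementation would need to restore the object's class afterwards, and every accessor that reads x$criterion_name, x$cue_names or x$formula would have to be updated.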

To solve these issues, I've drafted an object design doc at https://github.com/ndphillips/FFTrees/wiki/%5B80%25%5D-FFTrees-Object-Design. I'm eager for feedback.

hneth · 2024-05-30T08:07:56Z

Just realized that my comment on #224 should better have been posted here — sorry!

ndphillips · 2024-06-01T11:12:02Z

No worries, let's carry on the discussion here.

ndphillips added the question label May 29, 2024

hneth mentioned this issue May 30, 2024

[DRAFT] Issue 221/plot refactoring #224

