Ideas for refining the structure of FFTrees objects #226

Open
ndphillips opened this issue May 29, 2024 · 2 comments

@ndphillips
Owner

ndphillips commented May 29, 2024

@hneth In the context of #224, I realized that the structure of FFTrees objects is making development challenging, for example when working out how to modularize the plot.FFTrees() function.

The current code to create FFTrees objects is contained here:

FFTrees/R/fftrees_create.R

Lines 765 to 852 in 71b9da0

# Create x (as list):
x <- list(
  # Names of criterion vs. cues:
  criterion_name = criterion_name,
  cue_names = cue_names,
  # Formula:
  formula = formula, # original formula
  # Tree info:
  trees = list(
    n = NULL,
    best = NULL,
    definitions = NULL,
    inwords = NULL,
    stats = NULL,
    level_stats = NULL,
    decisions = list(
      train = list(),
      test = list()
    )
  ),
  # Raw training data:
  data = list(
    train = data,
    test = data.test
  ),
  # Store parameters (as list):
  params = list(
    algorithm = algorithm,
    #
    goal = goal,
    goal.chase = goal.chase,
    goal.threshold = goal.threshold,
    #
    max.levels = max.levels,
    numthresh.method = numthresh.method,
    numthresh.n = numthresh.n,
    repeat.cues = repeat.cues,
    stopping.rule = stopping.rule,
    stopping.par = stopping.par,
    #
    sens.w = sens.w,
    #
    cost.outcomes = cost.outcomes,
    cost.cues = cost.cues,
    #
    main = main,
    decision.labels = decision.labels,
    #
    my.goal = my.goal,
    my.goal.fun = my.goal.fun,
    my.tree = my.tree,
    #
    quiet = quiet
  ),
  # One row per algorithm competition:
  competition = list(
    train = data.frame(
      algorithm = NA,
      n = NA,
      hi = NA, fa = NA, mi = NA, cr = NA,
      sens = NA, spec = NA, far = NA,
      ppv = NA, npv = NA,
      acc = NA, bacc = NA,
      cost = NA, cost_dec = NA, cost_cue = NA
    ),
    test = data.frame(
      algorithm = NA,
      n = NA,
      hi = NA, fa = NA, mi = NA, cr = NA,
      sens = NA, spec = NA, far = NA,
      ppv = NA, npv = NA,
      acc = NA, bacc = NA,
      cost = NA, cost_dec = NA, cost_cue = NA
    ),
    models = list(lr = NULL, cart = NULL, rf = NULL, svm = NULL)
  ) # competition.
) # x.

Here are the core issues I see:

  • Inconsistent and confusing naming
    • Ex) How do the definitions, inwords, stats, level_stats, and decisions objects in FFTrees relate to each other?
  • Information at different levels of abstraction isn't stored consistently
    • Ex) Why are tree definitions stored at the tree level but not the node level? Why aren't overall tree accuracy stats located close to the tree-level accuracy stats?
  • Inconsistent storage locations
    • Ex) Why are criterion_name, cue_names and formula stored at the same level as trees, data and params? Could these be stored in a list such as metadata?
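
As a rough illustration of the last point, here is a minimal sketch of what grouping the identifying fields under a metadata element might look like. The helper name group_metadata() and the mock values are assumptions made for this example only; neither is part of FFTrees, and the mock list mirrors just a few of the top-level elements shown above.

```r
# Hypothetical helper (NOT part of FFTrees): move the identifying fields
# of an FFTrees-like list into a single `metadata` sub-list.
group_metadata <- function(x) {
  meta_names <- c("criterion_name", "cue_names", "formula")
  x$metadata <- x[meta_names]  # copy the three elements into one sub-list
  x[meta_names] <- NULL        # drop them from the top level
  x
}

# Mock object with illustrative values (assumed, not taken from real output):
x_old <- list(
  criterion_name = "diagnosis",
  cue_names = c("age", "thal"),
  formula = diagnosis ~ age + thal,
  trees = list(n = NULL, best = NULL),
  data = list(train = NULL, test = NULL),
  params = list(algorithm = "ifan")
)

x_new <- group_metadata(x_old)
names(x_new)           # top level now holds trees, data, params, metadata
names(x_new$metadata)  # the three identifying fields live together
```

Any code that currently reads x$criterion_name would have to switch to x$metadata$criterion_name, so a change like this would probably need accessor functions or a transition period.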

To solve these issues, I've drafted an object design doc at https://github.com/ndphillips/FFTrees/wiki/%5B80%25%5D-FFTrees-Object-Design. I'm eager for feedback.

@hneth
Collaborator

hneth commented May 30, 2024

Just realized that my comment on #224 should have been posted here. Sorry!

@ndphillips
Owner Author

No worries, let's carry on the discussion here
