
Workshop March 2022 Examples #139

Merged: 14 commits into idaholab:devel on Mar 22, 2022
Conversation

PaulTalbot-INL (Collaborator) commented Mar 10, 2022:


Pull Request Description

What issue does this change request address?

Initial steps towards #122

What are the significant changes in functionality due to this change request?

Adds example case(s) for users to try out while learning the code

  • "simple" case (1 source, 1 sink, 1 fixer source)

Other modifications:

  • Adds nominal parallel controls from HERON input
    • not HPC, just local parallel
  • Modifies default Optimizer inputs to more generically useful values
    • shrinkFactor 2 -> 1.5
    • initialStepSize default -> 0.2
    • gradient convergence target 1e-8 -> 1e-4
  • Bugfix for running Debug in Opt mode
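Based on the `parallel` input block added to src/Cases.py in this PR, a HERON case can request local parallelism roughly like this (the placement and values here are illustrative, not taken from the PR's test files; see the input spec in src/Cases.py for the authoritative syntax):

```xml
<!-- hypothetical fragment of a HERON case input -->
<parallel>
  <outer>2</outer> <!-- parallel runs for the outer optimization -->
  <inner>3</inner> <!-- parallel runs for the inner sampling -->
</parallel>
```

Per the docstring in src/Cases.py, the product of `outer` and `inner` should be at most the number of parallel processes available on the machine.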

For Change Control Board: Change Request Review

The following review must be completed by an authorized member of the Change Control Board.

  • 1. Review all computer code.
  • 2. If any changes occur to the input syntax, there must be an accompanying change to the user manual and xsd schema. If the input syntax change deprecates existing input files, a conversion script needs to be added (see Conversion Scripts).
  • 3. Make sure the Python code and commenting standards are respected (camelBack, etc.); see the wiki for details.
  • 4. Automated Tests should pass.
  • 5. If significant functionality is added, there must be tests added to check it. Tests should cover all possible options. Multiple short tests are preferred over one large test.
  • 6. If the change modifies or adds a requirement or a requirement based test case, the Change Control Board's Chair or designee also needs to approve the change. The requirements and the requirements test shall be in sync.
  • 7. The merge request must reference an issue. If the issue is closed, the issue close checklist shall be done.
  • 8. If an analytic test is changed/added, the analytic documentation must be updated/added.
  • 9. If any test used as a basis for documentation examples has been changed, the associated documentation must be reviewed to ensure the text matches the example.

@PaulTalbot-INL PaulTalbot-INL marked this pull request as draft March 10, 2022 17:36
@PaulTalbot-INL PaulTalbot-INL marked this pull request as ready for review March 17, 2022 21:08
dgarrett622 (Collaborator) left a comment:

I pulled your branch locally and tried to run the test to make sure it runs on a Windows machine. It failed for two reasons:

  1. GLPK can only be run serially; there is no "threads" option.
  2. Specifying 10 for `<parallel>` is too many cores for my machine (max 8).

Commenting out the "threads" option and specifying 6 for `<parallel>` allowed the test to run successfully on my setup. I think the fixes for these two issues are pretty simple to implement and will ensure the test runs on Windows.

@@ -303,7 +303,8 @@ def dispatch_window(self, time, time_offset,
attempts += 1
print(f'DEBUGG solve attempt {attempts} ...:')
# solve
soln = pyo.SolverFactory(self._solver).solve(m)
solve_options = {'threads': 2}
dgarrett622 (Collaborator):

Setting "threads" does not work for GLPK. The options passed into "solve" appear to be specific to the solver used. I'm sure this is an option for CBC, but it throws an error when using GLPK. I don't know what the behavior is if other solvers are used.

PaulTalbot-INL (Author):

Interesting; I don't recall adding this line; I must have been experimenting with something I didn't finalize. I can remove this. Philosophically I think it should only be set based on user requests specific to individual solvers.
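The gating described here could be sketched as follows. The solver-capability table is an illustrative assumption: GLPK lacking a "threads" option comes from the review comment above, while the CBC entry is a guess, and `filter_solver_options` is a hypothetical helper, not code from this PR:

```python
# Map each solver to the options it is known to accept.
# Illustrative assumption: GLPK is serial-only (per the review comment);
# CBC accepting 'threads' is assumed, not verified here.
SUPPORTED_OPTIONS = {
    'glpk': set(),
    'cbc': {'threads'},
}

def filter_solver_options(solver_name, requested):
    """Keep only the user-requested options the chosen solver supports."""
    supported = SUPPORTED_OPTIONS.get(solver_name, set())
    return {k: v for k, v in requested.items() if k in supported}
```

A call site could then stay solver-agnostic, e.g. `SolverFactory(solver).solve(m, options=filter_solver_options(solver, user_opts))`, so per-solver options are only ever passed when the user's chosen solver supports them.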

PaulTalbot-INL (Author) commented Mar 18, 2022:

I think there are pieces of your top comment above that aren't showing. You have to escape XML nodes such as `<example>` with backticks; otherwise GitHub treats them as formatting tags and they disappear (which is what happened above).

dgarrett622 (Collaborator):

Yeah, forgot the backticks on that one. Preview function is your friend.

# for now, outer does not use InternalParallel
batchSize = run_info.find('batchSize')
batchSize.text = f'{case.outerParallel}'
if case.innerParallel:
dgarrett622 (Collaborator) commented Mar 18, 2022:

See the comment on os.cpu_count()

PaulTalbot-INL (Author):

hopefully the check I added will catch this, let me know if you think anything further should be done

sampler = 'grid'
else:
sampler = 'opt'
print()
dgarrett622 (Collaborator):

Empty print statement can be deleted.

# parallel
if case.innerParallel:
run_info.append(xmlUtils.newNode('internalParallel', text='True'))
run_info.find('batchSize').text = f'{case.innerParallel}'
dgarrett622 (Collaborator) commented Mar 18, 2022:

See the comment on os.cpu_count().

PaulTalbot-INL (Author):

same, let me know if you think anything more needs doing

src/Cases.py (outdated)
parallel = InputData.parameterInputFactory('parallel', descr=r"""Describes how to parallelize this run.""")
parallel.addSub(InputData.parameterInputFactory('outer', contentType=InputTypes.IntegerType,
descr=r"""the number of parallel runs to use for the outer optimization run. The product of this
number and \xmlNode{inner} should be at most the number of parallel process availabe on
dgarrett622 (Collaborator):

"available"

@@ -68,7 +68,7 @@ class Template(TemplateBase, Base):

var_template = xmlUtils.newNode('variable')
var_template.append(xmlUtils.newNode('distribution'))
var_template.append(xmlUtils.newNode('grid', attrib={'type':'value', 'construction':'custom'}))
#var_template.append(xmlUtils.newNode('grid', attrib={'type':'value', 'construction':'custom'}))
dgarrett622 (Collaborator):

This appears to be commented out, does it need to stay or can it be deleted?

PaulTalbot-INL (Author):

removed

if sub.getName() == 'outer':
self.outerParallel = sub.value
elif sub.getName() == 'inner':
self.innerParallel = sub.value
dgarrett622 (Collaborator):

I tried running the htse test case and an error was thrown because more cores were requested than available (my Windows machine has 8). I think it would be a good idea to check the number of available cores using `os.cpu_count()` and make sure the user-specified number does not exceed the maximum available (or some threshold). Generally I only use at most `os.cpu_count() - 1` cores so my machine doesn't lock up. We should probably also emit a debug or warning message if the user specifies too many cores and the number is adjusted downward to allow the case to run.
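The suggested check could look something like this sketch (the function name, `reserve` parameter, and message wording are hypothetical; the actual check added in this PR reports through `self.raiseAWarning` instead of returning a message):

```python
import os

def cap_requested_cores(requested, reserve=1):
    """Clamp a requested core count to what the machine can provide.

    Keeps `reserve` cores free so the OS stays responsive. Returns the
    (possibly reduced) count plus a warning message when adjusted.
    """
    available = max(1, (os.cpu_count() or 1) - reserve)
    if requested > available:
        msg = (f'Requested {requested} cores but only {available} are usable; '
               f'reducing to {available}.')
        return available, msg
    return requested, None
```

Note that, as discussed below, a check based on `os.cpu_count()` only makes sense for local runs; on HPC the scheduler allocation, not the node's core count, is the relevant limit.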

PaulTalbot-INL (Author):

That's a good thought. That check won't work well when we're running on HPC, but we can follow up with that in a future PR where we auto-handle HPC stuff.

dgarrett622 (Collaborator) left a comment:

Looks like the automated tests are failing and I think I know why. A quick fix and a comment to consider.

# check to see if the number of processes available can meet the request
detected = os.cpu_count() - 1 # -1 to prevent machine OS locking up
if detected < cores_requested:
self.raiseAWarning('System may be overloaded and greatly increase run time! ' +
dgarrett622 (Collaborator):

Just a comment for consideration:

On my Windows machine, requesting more cores than are available results in an error and RAVEN will not run; I'm not sure what the behavior is on Mac or Linux. We could raise an error here, or raise a warning and let RAVEN throw an error later; either way is probably fine. The important thing is to inform the user that they requested more cores than the machine has, so that if the error is thrown later they will have an idea why the simulations were not run.

PaulTalbot-INL (Author):

Yeah, I don't love how this is handled generally speaking; I think handling it better in RAVEN may lead to sensible error messages so we don't duplicate that effort in HERON.

@@ -303,7 +303,6 @@ def dispatch_window(self, time, time_offset,
attempts += 1
print(f'DEBUGG solve attempt {attempts} ...:')
# solve
solve_options = {'threads': 2}
soln = pyo.SolverFactory(self._solver).solve(m, options=solve_options)
dgarrett622 (Collaborator):

I'm pretty sure this is what is causing the automated tests to fail. Removing `options=solve_options` should fix it.

PaulTalbot-INL (Author):

whoops! fixed.

@dgarrett622 dgarrett622 merged commit 00481ae into idaholab:devel Mar 22, 2022