The gfsgenesis task failed on Jet #2628

Open
guoqing-noaa opened this issue May 25, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@guoqing-noaa
Contributor

What is wrong?

The error message:


+ Unknown[43]: [[ -d /gpfs/hps ]]
+ Unknown[48]: [[ -L /usrx ]]
+ Unknown[53]: [[ -d /scratch2 ]]
+ Unknown[58]: [[ -d /work ]]
+ Unknown[63]: [[ -d /lfs3 ]]
+ Unknown[68]: [[ -d /lfs/h1 ]]
+ Unknown[74]: machine=unknown
+ Unknown[74]: export machine
+ Unknown[75]: echo Job failed: unknown platform
+ Unknown[75]: 1>& 2
Job failed: unknown platform
+ Unknown[76]: err_exit 'FAILED genesis.1116443 - ERROR IN unknown platform - ABNORMAL EXIT'

-------------------------------------------------------------
-- FATAL ERROR: FAILED genesis.1116443 - ERROR IN unknown platform - ABNORMAL EXIT
-- ABNORMAL EXIT at Sat May 25 04:16:36 UTC 2024 on k3
-------------------------------------------------------------

What should have happened?

The gfsgenesis task should complete successfully

What machines are impacted?

Jet

Steps to reproduce

Running a forecast-only experiment on Jet reproduces the error.

Additional information

The error happens in
/lfs4/HFIP/hfv3gfs/glopara/git/TC_tracker/feature-GFSv17_com_reorg/scripts/exgfs_tc_genesis.sh

where lines 128-131 read as follows:

elif [[ -d /lfs3 ]] ; then
  # We are on NOAA Jet
  machine=jet
  ${USHens_tracker}/extrkr_gen_gfs.sh ${loopnum} ${cmodel} ${pert} ${pertdir} #2>&1 >${outfile}

Jet no longer has the /lfs3 directory

Do you have a proposed solution?

Update TC_tracker to a more recent version so that it can detect Jet correctly.
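
The failure above comes from a single hard-coded directory test. As a minimal sketch of a more tolerant check, the function below treats any one of several candidate mount points as evidence of Jet. The function name `detect_machine`, the optional root-prefix argument (a testing hook), and the candidate list are illustrative assumptions, not TC_tracker code. Only /lfs3, /lfs4 and /lfs5 are mentioned in this thread, and which of them is actually visible on Jet compute nodes would need to be confirmed.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of TC_tracker.
# Usage: detect_machine [root_prefix]
#   root_prefix is an assumed testing hook; leave it empty on a real system.
detect_machine() {
  local root="${1:-}"
  local dir
  # Assumed candidate list: newest filesystem first, legacy paths last.
  for dir in /lfs5 /lfs4 /lfs3; do
    if [[ -d "${root}${dir}" ]]; then
      echo "jet"
      return 0
    fi
  done
  # No candidate directory found: report unknown and signal failure.
  echo "unknown"
  return 1
}
```

Checking a short list instead of one path means the retirement of a single filesystem does not by itself turn a Jet node into an "unknown platform".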

@guoqing-noaa guoqing-noaa added bug Something isn't working triage Issues that are triage labels May 25, 2024
@guoqing-noaa guoqing-noaa changed the title The gfsgenesis task failed o Jet The gfsgenesis task failed on Jet May 25, 2024
@guoqing-noaa
Contributor Author

Or can we update ens_tracker_ver in run.spack.ver
from
export ens_tracker_ver=feature-GFSv17_com_reorg
to
export ens_tracker_ver=v1.1.15.6
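
The version switch suggested above could be applied to a local checkout with a small helper like the following sketch. The function name `set_tracker_version` is hypothetical, the path to run.spack.ver is passed in rather than assumed, and whether v1.1.15.6 is installed on Jet is not confirmed in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite the ens_tracker_ver export line in a version file.
# Usage: set_tracker_version <path/to/run.spack.ver> <new_version>
set_tracker_version() {
  local ver_file="$1"
  local new_ver="$2"
  local tmp_file
  tmp_file=$(mktemp) || return 1
  # Replace only the line that exports ens_tracker_ver; other lines pass through.
  sed "s|^export ens_tracker_ver=.*|export ens_tracker_ver=${new_ver}|" \
    "${ver_file}" > "${tmp_file}" || return 1
  # Overwrite the file contents in place so its permissions are preserved.
  cat "${tmp_file}" > "${ver_file}"
  rm -f "${tmp_file}"
}
```

For example, `set_tracker_version run.spack.ver v1.1.15.6` run from the directory that holds the file.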

@WalterKolczynski-NOAA WalterKolczynski-NOAA removed the triage Issues that are triage label May 28, 2024
@WalterKolczynski-NOAA
Contributor

@InnocentSouopgui-NOAA
Contributor

Please see the following issue: #2841
After the failure of /lfs4, libraries on S4 are moving to /lfs5

Pull request #2878 has the fix for pretty much everything. Unfortunately, we cannot push it through yet because of a bug in one component that would affect other systems.

Also the TC_Tracker component has not moved yet, but will be moving soon.

If you want to try the version with the fix, let me know.

You can also mirror the fix in your local working copy.

For instance, you can replace /lfs3 or /lfs1 with /lfs5 in /lfs4/HFIP/hfv3gfs/glopara/git/TC_tracker/feature-GFSv17_com_reorg/scripts/exgfs_tc_genesis.sh

Proceed with caution if you want to fix your local copy, as there is no guarantee of /lfs4 being mounted on the compute nodes.
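
A minimal sketch of that local workaround, assuming the intended replacement target is /lfs5: the helper below writes a patched copy of a script and leaves the original untouched. The function name `patch_jet_paths` and the choice to write to a separate destination file are illustrative assumptions. Review the patched copy before using it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a script, rewriting /lfs1 and /lfs3 to /lfs5.
# Usage: patch_jet_paths <source_script> <patched_copy>
patch_jet_paths() {
  local src="$1"
  local dest="$2"
  # Match /lfs1 or /lfs3 only when followed by a non-word character or end of
  # line, so longer names such as /lfs10 are left alone.
  sed -E 's#/lfs[13]([^0-9A-Za-z_]|$)#/lfs5\1#g' "${src}" > "${dest}"
}
```

For example, `patch_jet_paths exgfs_tc_genesis.sh "${HOME}/exgfs_tc_genesis.patched.sh"` followed by a `diff` of the two files shows exactly which lines changed.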

@HananehJafary-NOAA

I have migrated all the /lfs4 filesystem to /lfs5 for TC_Tracker on Jet. You can clone the updated package under my directory: https://github.com/HananehJafary-NOAA/tracker_package
