Merge pull request #85 from jtwhite79/feat_touchup
Feat touchup
rhugman authored Aug 21, 2023
2 parents d20de90 + e1860f4 commit 4a4a585
Showing 9 changed files with 124 additions and 40 deletions.
4 changes: 2 additions & 2 deletions environment.yml
@@ -4,8 +4,8 @@ channels:
dependencies:
- python=3.10
- pip
- pandas
- numpy
- pandas>=2.0
- numpy>=1.25
- matplotlib
- jupyter
- notebook
4 changes: 4 additions & 0 deletions etc/test_my_install.ipynb
@@ -23,8 +23,12 @@
"import matplotlib.pyplot as plt\n",
"\n",
"print(\"python\",sys.version)\n",
"assert sys.version.startswith(\"3.10\")\n",
"print(\"pandas\",pd.__version__)\n",
"assert pd.__version__.startswith(\"2\")\n",
"print(\"numpy\",np.version.version)\n",
"assert np.version.version.startswith(\"1.2\")\n",
"assert int(np.version.version.split(\".\")[1]) >= 25\n",
"#todo: add some asserts for min versions...\n",
"\n",
"import flopy\n",
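The `startswith`-style checks above work for the versions pinned in this commit, but string-prefix tests are brittle (they can reject a perfectly valid newer release). A minimal sketch of a more robust alternative, assuming the `packaging` library (installed alongside pip) is available:

```python
# Hypothetical alternative to the startswith()-based asserts above:
# compare parsed versions rather than string prefixes.
from packaging.version import Version

import numpy as np
import pandas as pd

assert Version(pd.__version__) >= Version("2.0"), "pandas >= 2.0 required"
assert Version(np.__version__) >= Version("1.25"), "numpy >= 1.25 required"
```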
10 changes: 3 additions & 7 deletions tutorials/clear_output_all.py
@@ -1,15 +1,11 @@
import os
dirs = [d for d in os.listdir(".") if os.path.isdir(d)]
notebook_count = 0
dirs.append(os.path.join("..","etc"))
for d in dirs:
    nb_files = [os.path.join(d,f) for f in os.listdir(d) if f.lower().endswith(".ipynb")]
    for nb_file in nb_files:
        print("clearing",nb_file)
        os.system("jupyter nbconvert --ClearOutputPreprocessor.enabled=True --ClearMetadataPreprocessor.enabled=True --inplace {0}".format(nb_file))
        notebook_count += 1
print(notebook_count," notebooks cleared")

# d = os.path.join("..","etc")
# if os.path.exists(d):
#     nb_files = [os.path.join(d,f) for f in os.listdir(d) if f.lower().endswith(".ipynb")]
#     for nb_file in nb_files:
#         print("clearing",nb_file)
#         os.system("jupyter nbconvert --ClearOutputPreprocessor.enabled=True --ClearMetadataPreprocessor.enabled=True --inplace {0}".format(nb_file))
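As an aside, the `os.system` call above builds a shell command via string formatting. A sketch of the same invocation using `subprocess.run` (functionally equivalent, but immune to shell-quoting surprises in file names; `nb_file` here is a hypothetical stand-in for the loop variable):

```python
# Same nbconvert invocation without going through a shell.
import subprocess

nb_file = "example.ipynb"  # stand-in for the loop variable above
subprocess.run(["jupyter", "nbconvert",
                "--ClearOutputPreprocessor.enabled=True",
                "--ClearMetadataPreprocessor.enabled=True",
                "--inplace", nb_file],
               check=True)
```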
@@ -68,7 +68,7 @@
"# a dir to hold a copy of the org model files\n",
"tmp_d = os.path.join('freyberg_mf6')\n",
"\n",
"runflag= False\n",
"runflag= True\n",
"\n",
"if runflag==False:\n",
" print('Assuming PEST++SWP has bene run already and the folder with files is available')\n",
@@ -201,7 +201,7 @@
"source": [
"Make a plot of the response surface for `hk1` (x-axis) and `rch0` (y-axis). The colored contours indicate the objective function value for each combination of these two parameters. \n",
"\n",
"As you can see, a long eliptical \"trough\" of optimal values is formed (grey zone). Parameter combinations in this zone all result in equivalent levels of \"good fit\"."
"As you can see, a long eliptical \"trough of dispair\" of optimal values is formed. Parameter combinations in this zone all result in equivalent levels of \"good fit\" to the observation dataset. The trough of dispair is an example of non-uniqueness in graphically form. A problem that arises is while many combinations of recharge and HK can fit the observation dataset, the forecasts of interest model with \"calibrated\" model could be highly sensitive to the value of HK and/or recharge and single \"calibrated\" model cant represent this non-uniqueness under forecasting conditions. #uncertaintyanalysis"
]
},
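For intuition, here is a self-contained sketch of such a trough, using a made-up objective function with a strong multiplicative `hk1`/`rch0` trade-off; this is illustrative only, not the tutorial's PEST++SWP-derived surface:

```python
# Any combination with hk1 * rch0 == 2.5 "fits" equally well, producing
# an elongated trough of near-identical objective function values.
import numpy as np
import matplotlib.pyplot as plt

hk1 = np.linspace(0.5, 5.0, 200)
rch0 = np.linspace(0.5, 2.0, 200)
HK, RCH = np.meshgrid(hk1, rch0)
phi = (HK * RCH - 2.5) ** 2

fig, ax = plt.subplots(figsize=(5, 4))
cs = ax.contourf(HK, RCH, phi, levels=20)
fig.colorbar(cs, ax=ax, label="objective function")
ax.set_xlabel("hk1")
ax.set_ylabel("rch0")
plt.show()
```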
{
@@ -283,7 +283,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"And plot it up again. Now we see the objective function surface funneling down to a single point. We have achieved a unique solution."
"And plot it up again. Now we see the objective function surface funneling down to a single point. We have achieved a unique solution. The \"trough of dispair\" has been the \"bowl of uniqueness\"! A clear demonstration of the value of unique and diverse data..."
]
},
{
@@ -301,7 +301,7 @@
"source": [
"# Understanding Lambdas\n",
"\n",
"When used to undertake highly parameterized inversion, PESTPP-GLM implements theory and methodologies that are programmed into PEST. Some theory, employing matrices and vectors, is used to describe the linearized inverse problem on which so-called “gradient methods” are based. Through repeated linearization of the inverse problem over successive iterations, these methods achieve their purpose of model calibration, notwithstanding the nonlinear relationship that exists between model outputs and model parameters. \n",
"When used to undertake highly parameterized inversion, PESTPP-GLM implements theory and methodologies that are programmed into PEST. Some theory, employing matrices and vectors, is used to describe the linearized inverse problem on which so-called “gradient methods” are based. Through repeated linearization of the inverse problem over successive iterations, these methods achieve their purpose of model calibration, notwithstanding the nonlinear relationship that exists between model outputs and model parameters. It should also be noted that PESTPP-IES also implements the GLM solution, but an ensemble of parameter vectors. So the single trajectory below can be thought of as one of the many tradjectories that ensemble of parameter vectors take.\n",
"\n",
"Nonlinear model behaviour is also accommodated by introducing a so-called \"Marquardt lambda\" to these equations. Employing a nonzero lambda tweaks the direction of parameter improvement so that it is more aligned with the objective function gradient. This increases the efficiency of early iterations of the inversion process when implemented in conjunction with a nonlinear model.\n",
"\n",
@@ -311,7 +311,9 @@
"\n",
"However, as the objective function minimum is approached, the process becomes more eficient if smaller lambdas are used. This avoids the phenomenon known as \"hemstitching\", in which parameter upgrades jump-across small, thin \"troughs\" in the objective function surface. \n",
"\n",
"See the PEST Book (Doherty, 2015) for more details.\n"
"Note again, the effect of lambda on the parameter upgrade is the same in PESTPP-IES.\n",
"\n",
"See the PEST Book (Doherty, 2015) and the PEST++ users manual for more details.\n"
]
},
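A toy numpy sketch of the lambda behaviour described above (the Jacobian, weights, and residuals are invented for illustration): as lambda grows, the Levenberg-Marquardt upgrade `dp = (J^T Q J + lambda*I)^-1 J^T Q r` rotates away from the Gauss-Newton solution and toward the objective function gradient `J^T Q r`.

```python
# Illustrative Levenberg-Marquardt upgrade for a 2-parameter problem.
import numpy as np

J = np.array([[1.0, 0.9],
              [0.9, 1.0],
              [0.2, 0.1]])        # hypothetical Jacobian (3 obs x 2 pars)
Q = np.eye(3)                     # observation weight matrix
r = np.array([0.5, -0.2, 0.1])    # residuals

g = J.T @ Q @ r                   # gradient of the objective function
for lam in [0.0, 1.0, 100.0]:
    dp = np.linalg.solve(J.T @ Q @ J + lam * np.eye(2), g)
    cos = dp @ g / (np.linalg.norm(dp) * np.linalg.norm(g))
    print(f"lambda={lam:6.1f}  alignment with gradient: cos={cos:.3f}")
```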
{
@@ -248,8 +248,26 @@
"metadata": {},
"outputs": [],
"source": [
"df = df.loc[df.sen_mean_abs>1e-6,:]\n",
"df.loc[:,[\"sen_mean_abs\",\"sen_std_dev\"]].plot(kind=\"bar\", figsize=(13,4))\n",
"plt.yscale('log');"
"#ax = plt.gca()\n",
"#ax.set_ylim(1,ax.get_ylim()[1]*1.1)\n",
"plt.yscale('log');\n",
"fig,ax = plt.subplots(1,1,figsize=(13,8))\n",
"tmp_df = df\n",
"ax.scatter(tmp_df.sen_mean_abs,tmp_df.sen_std_dev,marker=\"^\",s=20,c=\"r\")\n",
"tmp_df = tmp_df.iloc[:8]\n",
"for x,y,n in zip(tmp_df.sen_mean_abs,tmp_df.sen_std_dev,tmp_df.index):\n",
" ax.text(x,y,n)\n",
"mx = max(ax.get_xlim()[1],ax.get_ylim()[1])\n",
"mn = min(ax.get_xlim()[0],ax.get_ylim()[0])\n",
"ax.plot([mn,mx],[mn,mx],\"k--\")\n",
"ax.set_ylim(mn,mx)\n",
"ax.set_xlim(mn,mx)\n",
"ax.grid()\n",
"ax.set_ylabel(\"$\\\\sigma$\")\n",
"ax.set_xlabel(\"$\\\\mu^*$\")\n",
"plt.tight_layout()\n"
]
},
{
@@ -307,7 +325,21 @@
" tmp_df = df_pred_sen.loc[df_pred_sen.observation_name==forecast].sort_values(by='sen_mean_abs', ascending=False)\n",
" tmp_df.plot(x=\"parameter_name\",y=[\"sen_mean_abs\",\"sen_std_dev\"],kind=\"bar\", figsize=(13,2.5))\n",
" plt.title(forecast)\n",
" plt.yscale('log');"
" plt.yscale('log');\n",
" fig,ax = plt.subplots(1,1,figsize=(13,8))\n",
" ax.scatter(tmp_df.sen_mean_abs,tmp_df.sen_std_dev,marker=\"^\",s=20,c=\"r\")\n",
" tmp_df = tmp_df.iloc[:8]\n",
" for x,y,n in zip(tmp_df.sen_mean_abs,tmp_df.sen_std_dev,tmp_df.parameter_name):\n",
" ax.text(x,y,n)\n",
" mx = max(ax.get_xlim()[1],ax.get_ylim()[1])\n",
" mn = min(ax.get_xlim()[0],ax.get_ylim()[0])\n",
" ax.plot([mn,mx],[mn,mx],\"k--\")\n",
" ax.set_ylim(mn,mx)\n",
" ax.set_xlim(mn,mx)\n",
" ax.grid()\n",
" ax.set_ylabel(\"$\\\\sigma$\")\n",
" ax.set_xlabel(\"$\\\\mu^*$\")\n",
" plt.tight_layout()\n"
]
},
{
@@ -318,6 +350,13 @@
"\n",
"As we saw above for parameters, once again σ is very high (for almost all parameters...). This suggests either non-linearity and/or parameter interactions. Relying on linear methods for uncertainty analysis is therefore compromised. Ideally we should employ non-linear methods, as will be discussed in the subsequent tutorial."
]
},
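To make the μ*/σ interpretation concrete, here is a small synthetic sketch of how the Morris statistics are formed from elementary effects (the parameter names and values are hypothetical, not the PESTPP-SEN output used above):

```python
# Morris summary statistics from elementary effects EE[i, j]
# (trajectory i, parameter j); synthetic values for illustration.
import numpy as np

rng = np.random.default_rng(0)
EE = rng.normal(loc=[1.0, 0.1, 0.5], scale=[0.8, 0.05, 0.5], size=(20, 3))

mu_star = np.abs(EE).mean(axis=0)  # overall importance
sigma = EE.std(axis=0, ddof=1)     # nonlinearity and/or interactions
for name, m, s in zip(["hk1", "rch0", "sy1"], mu_star, sigma):
    print(f"{name:5s} mu*={m:.2f} sigma={s:.2f}")
```

A parameter plotting above the 1:1 line in the scatter plots (σ > μ*) is exactly the non-linear/interacting case flagged in the text.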
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
25 changes: 16 additions & 9 deletions tutorials/part2_06_ies/freyberg_ies_1_basics.ipynb
@@ -737,9 +737,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Ruh roh! The posterior isn't covering the correct values for some of forecasts. Major bad times. \n",
"Ruh roh! The posterior isn't covering the correct values (with lots of realizations) for some of forecasts. #badtimes. \n",
"\n",
"But hold on! The prior does (nearly) cover the true values for all forecasts. So that implies there is somewhere between the prior and posterior we have now, which is optimal with respect to all forecasts. Hmmm...so this means that history matching made our prediction worse. We have incurred forecast-sensitive bias through the parameter adjustment process. How can we fit historical data so well but get the \"wrong\" answer for some of the forecasts?\n",
"But hold on! The prior does (nearly) cover the true values for all forecasts. So that implies there is somewhere between the prior and posterior we have now, which is optimal with respect to all forecasts. Hmmm...so this means that history matching to a high level of fit made our prediction worse. We have incurred forecast-sensitive bias through the parameter adjustment process. How can we fit historical data so well but get the \"wrong\" answer for some of the forecasts?\n",
"\n",
"> **Important Aside**! When you are using an imperfect model (compared to the truth), the link between a \"good fit\" and robust forecast is broken. A good fit does not mean a good forecaster! This is particularily the case for forecasts that are sensitive to combinations of parameters that occupy the history-matching null space (see [Doherty and Moore (2020)](https://s3.amazonaws.com/docs.pesthomepage.org/documents/model_complexity_monograph.pdf) for a discussion of these concepts). In other words, forecasts which rely on (combinations of) parameters that are not informed by available observation data. (In our case, an example is the \"headwater\" forecast.)\n",
"\n",
@@ -826,7 +826,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Interesting! That's a bit better. What have we done? We've accepted \"more uncertainty\" for a reduced propensity of inducing forecast bias. Perhaps if we had more realizations we would have gotten a wider sample of the posterior? But even if we hadn't failed to capture the truth, in the real-world how would we know? So...should we just stick with prior? (Assuming the prior is adequately described...) Feeling depressed yet? Worry not, in our next tutorial we will introduce some coping strategies. \n",
"Interesting! That's better for some forecasts, not for others. What have we done? We've accepted \"more uncertainty\" for a reduced propensity of inducing forecast bias. Perhaps if we had more realizations we would have gotten a wider sample of the posterior? But even if we hadn't failed to capture the truth, in the real-world how would we know? So...should we just stick with prior? (Assuming the prior is adequately described...) Feeling depressed yet? Worry not, in our next tutorial we will introduce some coping strategies. \n",
"\n",
"In summary, we have learnt:\n",
" - How to configure and run PESTPP-IES\n",
@@ -904,9 +904,9 @@
"metadata": {},
"outputs": [],
"source": [
"pr_oe = pyemu.ObservationEnsemble.from_csv(pst=pst,filename=os.path.join(m_d,\"freyberg_mf6.0.obs.csv\"))\n",
"pt_oe = pyemu.ObservationEnsemble.from_csv(pst=pst,filename=os.path.join(m_d,\"freyberg_mf6.{0}.obs.csv\".format(pst.control_data.noptmax)))\n",
"noise = pyemu.ObservationEnsemble.from_csv(pst=pst,filename=os.path.join(m_d,\"freyberg_mf6.obs+noise.csv\"))"
"pr_oe_more = pyemu.ObservationEnsemble.from_csv(pst=pst,filename=os.path.join(m_d,\"freyberg_mf6.0.obs.csv\"))\n",
"pt_oe_more = pyemu.ObservationEnsemble.from_csv(pst=pst,filename=os.path.join(m_d,\"freyberg_mf6.{0}.obs.csv\".format(pst.control_data.noptmax)))\n",
"noise_more = pyemu.ObservationEnsemble.from_csv(pst=pst,filename=os.path.join(m_d,\"freyberg_mf6.obs+noise.csv\"))"
]
},
{
@@ -915,17 +915,24 @@
"metadata": {},
"outputs": [],
"source": [
"fig = plot_forecast_hist_compare(pt_oe=pt_oe_iter,pr_oe=pr_oe,\n",
" last_pt_oe=pt_oe,last_prior=pr_oe\n",
"fig = plot_forecast_hist_compare(pt_oe=pt_oe_more,pr_oe=pr_oe_more,\n",
" last_pt_oe=pt_oe_iter,last_prior=pr_oe\n",
" )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"having more realizations has two benefits: more samples for uncertainty analysis and better resolution of the empirical first-order relations between parameters and observations. In this case, these two combined effects have helped us (nearly) bracket the true value for each forecast - yeh! So always use a many realizations as you can tolerate!"
"having more realizations has two benefits: more samples for uncertainty analysis and better resolution of the empirical first-order relations between parameters and observations. In this case, these two combined effects have helped us better bracket the true value for each forecast - yeh! So always use a many realizations as you can tolerate!"
]
},
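A quick sketch of the realizations point (a synthetic Gaussian standing in for the posterior, illustrative only): the empirical tails, and hence the apparent uncertainty bounds, stabilize as the ensemble grows.

```python
# More realizations -> more stable empirical percentiles.
import numpy as np

rng = np.random.default_rng(42)
for n in [50, 500, 5000]:
    sample = rng.normal(size=n)
    lo, hi = np.percentile(sample, [2.5, 97.5])
    print(f"{n:5d} realizations: 95% interval = [{lo:5.2f}, {hi:5.2f}]")
```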
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {