This repository has been archived by the owner on Apr 11, 2019. It is now read-only.

Theme 2 Scenario 2A, Update usgs notebook #177

Merged
merged 11 commits into from Dec 8, 2014

Conversation

birdage
Contributor

@birdage birdage commented Sep 12, 2014

@Bobfrat let's keep this open, and I will be pushing changes to it. Will note when complete.

@birdage
Contributor Author

birdage commented Dec 1, 2014

@Bobfrat @ocefpaf can someone merge this please?

#file_name_list.append(processTextFile(url,html_link,'wl'))
wl_count+=1
print "num water level:",wl_count
if (os.path.isdir("./data_files/")):
Member

@birdage This will break on windows machines. Use os.path.join.
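To illustrate the suggestion, a minimal sketch (the `data_files` directory name is from the snippet above; the filename is hypothetical):

```python
import os

# Hardcoded POSIX-style separators only work on Unix-like systems:
posix_only = "./data_files/" + "station.txt"  # breaks on Windows

# os.path.join inserts the platform's own separator, so the same code
# runs on Windows (backslash) and Unix (forward slash):
portable = os.path.join("data_files", "station.txt")

print(portable)
```

On Unix this prints `data_files/station.txt`; on Windows, `data_files\station.txt`.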

Contributor Author

@ocefpaf want me to correct it before merge?

@ocefpaf
Member

ocefpaf commented Dec 1, 2014

IMHO it is OK to merge, @birdage, but I do not use Windows 😁 Maybe we can open an issue after the merge as a reminder that the paths need to be fixed.

@birdage
Contributor Author

birdage commented Dec 1, 2014

@ocefpaf sounds like a good idea, thanks!

@ocefpaf
Member

ocefpaf commented Dec 1, 2014

Almost done reviewing. I will merge as soon as I finish running it here.

for file_name in files:
print count,file_name
display(pb)
for fi,file_name in enumerate(files):
Member

This loop is too complex and easily breakable (in fact I am having trouble running it right now). Maybe an abstraction to read the metadata is in order.

Also the actual data and dates can be easily read with pandas and stored in the dictionary instead of the numpy array in the next cell.

@ocefpaf
Member

ocefpaf commented Dec 1, 2014

  • Some suggestions (I am OK to merge without these):
    • Add a note mentioning what the files sample_data_files.zip and track.csv are.
    • Add %matplotlib inline at the beginning of the notebook. (Newer versions of IPython notebook need that to display the figures.)
    • Always reset the kernel and re-run the notebook before sending a PR. That helps to debug (with a proper cell count in the notebook), asserts that things ran in order, and avoids namespace contamination.
  • One fix (this is necessary for the merge):
    • I cannot run the notebook locally. I get zero-length data after running the main files loop. After I refactored the loop, things worked fine. Can you run it there the way it is? Please check that I did not break anything:
import os
import re

from pandas import read_csv

# NB: `non_decimal` was not defined in the snippet as posted; the pattern
# below is an assumed definition that strips everything except digits,
# signs, and decimal points.
non_decimal = re.compile(r'[^\d.+-]')

full_data = {}

def parse_metadata(fname):
    meta_data = {}
    fields = {'Sensor location latitude': 'lat',
              'Sensor location longitude': 'lon',
              'Site id =': 'name',
              'Sensor elevation above NAVD 88 =': 'elevation',
              'Barometric sensor site (source of bp) =': 'bp_source',
              'Lowest recordable water elevation is': 'lowest_wl'}
    with open(os.path.join('data_files', fname)) as f:
        content = f.readlines()
        for k, ln in enumerate(content):
            content[k] = ln.strip()
            if content[k].startswith('#'):
                for field in fields:
                    if field in content[k]:
                        if fields[field] == 'name':
                            meta_data[fields[field]] = content[k].split(field)[-1]
                        else:
                            val = (content[k].split(field)[-1])
                            meta_data[fields[field]] = float(non_decimal.sub('', val))
                        if fields[field] == 'lon':
                            meta_data[fields[field]] = -meta_data[fields[field]]
    return meta_data

for count, fname in enumerate(files):
    print('{}: {} of {} '.format(fname, count+1, len(files)))
    meta_data = parse_metadata(fname)
    kw = dict(parse_dates=True, sep='\t', skiprows=29, index_col=0)
    actual_data = read_csv(os.path.join('data_files', fname), **kw)
    full_data[fname] = {'meta': meta_data,
                        'data': actual_data}
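For anyone trying this outside the notebook, a minimal, self-contained check of the metadata-parsing approach above (the sample header lines, the site id, and the `non_decimal` pattern are assumptions for illustration, not actual USGS file contents):

```python
import re

# Assumed pattern: strip everything except digits, signs, and decimal points.
non_decimal = re.compile(r'[^\d.+-]')

fields = {'Sensor location latitude': 'lat',
          'Sensor location longitude': 'lon',
          'Site id =': 'name'}

# Hypothetical header lines mimicking the commented metadata block.
sample_lines = [
    '# Sensor location latitude 29.713',
    '# Sensor location longitude 84.981',
    '# Site id = SSS-FL-EXAMPLE-001WL',
]

meta_data = {}
for line in sample_lines:
    line = line.strip()
    if not line.startswith('#'):
        continue
    for field, key in fields.items():
        if field in line:
            raw = line.split(field)[-1]
            if key == 'name':
                meta_data[key] = raw.strip()
            else:
                meta_data[key] = float(non_decimal.sub('', raw))
            if key == 'lon':
                # Longitudes are stored unsigned but lie in the western hemisphere.
                meta_data[key] = -meta_data[key]

print(meta_data)
```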

Done reviewing. Over to you @birdage.

@birdage
Contributor Author

birdage commented Dec 1, 2014

@ocefpaf regarding comments

  • I thought we moved away from %matplotlib inline in the code, and that you had to adopt the ipython notebook --pylab=inline method of starting the notebook?
  • I don't think the user needs to know too much about the data files and track lines, though I do mention in the header that the data files are zipped etc.
  • I'd suggest committing your changes to the loop and I'll pull and test.

@ocefpaf
Member

ocefpaf commented Dec 1, 2014

> I thought we moved away from %matplotlib inline in the code, and that you had to adopt the ipython notebook --pylab=inline method of starting the notebook?

I would go away from %pylab, but that is just me...

> I don't think the user needs to know too much about the data files and track lines, though I do mention in the header that the data files are zipped etc.

OK. I missed the header info. That is what I would like to see, a link to the original data. Thanks!

> I'd suggest committing your changes to the loop and I'll pull and test.

I will prepare a PR to your branch tomorrow and lets take a look at the results then.

@birdage
Contributor Author

birdage commented Dec 1, 2014

@ocefpaf Thanks for the comments, we'll sort it tomorrow.

@ocefpaf
Member

ocefpaf commented Dec 2, 2014

@ocefpaf ocefpaf mentioned this pull request Dec 4, 2014
@jkupiec

jkupiec commented Dec 5, 2014

@ocefpaf, since #204 is closed, is this PR good to merge? Also, if and when the merge is complete, does that make #203 ready to close?

@ocefpaf
Member

ocefpaf commented Dec 5, 2014

Hi John,
The PR is not pending due to #204. I sent the fix to @birdage's branch so he can modify the PR before we merge.

@birdage
Contributor Author

birdage commented Dec 5, 2014

@ocefpaf @jkupiec this is on my plate now to finish up.

@birdage
Contributor Author

birdage commented Dec 8, 2014

@ocefpaf thanks for the updates. The processing code runs a bit slower, but I think the refactor makes it easier to read. I've made some simple changes; will rebase and commit.

@birdage
Contributor Author

birdage commented Dec 8, 2014

@ocefpaf looks like we're ready to merge. Could you do it if possible, as I don't want to self-merge?

@ocefpaf
Member

ocefpaf commented Dec 8, 2014

May I ask one last thing? Can you restart the kernel and re-run the notebook so we have a meaningful view at nbviewer:

http://nbviewer.ipython.org/github/birdage/system-test/blob/update_usgs/Theme_2_Extreme_Events/Scenario_2A/USGS_Gauges/Rapid_Deploy_Gauges.ipynb

@birdage
Contributor Author

birdage commented Dec 8, 2014

@ocefpaf once Travis has run, I think we're good.

ocefpaf added a commit that referenced this pull request Dec 8, 2014
Theme 2 Scenario 2A, Update usgs notebook
@ocefpaf ocefpaf merged commit 08074ee into ioos:master Dec 8, 2014
@ocefpaf
Member

ocefpaf commented Dec 8, 2014

Nice!

@birdage birdage deleted the update_usgs branch December 8, 2014 18:45