
Merge changes to figure out what's going on with overpass test #3

Merged
merged 40 commits into nataliejschultz:overpass on Jul 11, 2024

Conversation

nataliejschultz
Owner

No description provided.

Mahadik, Mukul Chandrakant and others added 30 commits January 26, 2024 13:51
Changed print statements to logging.debug() to ensure these are easily visible and in sequence with other logging statements in the CloudWatch logs.

Added more information including:
- length of predictions
- failed assertion message in case of multiple users for the same trip list (see the sketch below)
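
As a rough illustration of the change (the variable names here are assumptions, not the actual code):

```python
import logging

# Hypothetical inputs, for illustration only
trip_list = [{"user_id": "user-1"}, {"user_id": "user-1"}]
predictions = [["walk"], ["bike"]]

# Before: print(...) output was easy to lose; logging.debug keeps it in
# sequence with the other pipeline log statements in CloudWatch
distinct_users = set(t["user_id"] for t in trip_list)
logging.debug("%d distinct users, %d predictions for a trip list of length %d",
              len(distinct_users), len(predictions), len(trip_list))
assert len(distinct_users) == 1, \
    "trip list has multiple users: %s" % distinct_users
```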
Cleanup fixes including log statements for PR #944
* Added details in logging statements

Completed marked future fixes for PR #944
Provided more detailed info regarding the number of distinct users, as well as the number of predictions made for the trip_list.

* Implemented logic to clear models

Clearing or trimming the number of models stored for a particular user in the Stage_updateable_models collection.

Constant K_MODELS_COUNT defines the number of models to keep.

Currently tested with model data available in the stage dataset snapshot.

Pending: complete testing, with assertions still to be added.

* Added function call

Called the trim_model_entries function in upsert_model to delete models before inserting new ones.
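
A minimal sketch of what this trimming could look like with pymongo (the field names and collection handle are assumptions; the constant is later renamed and made configurable, as described further down):

```python
import logging
from pymongo import MongoClient

K_MODELS_COUNT = 3  # number of models to keep per user

model_db = MongoClient().Stage_database.Stage_updateable_models

def trim_model_entries(user_id):
    # Nothing to do until the user exceeds the keep-count
    if model_db.count_documents({"user_id": user_id}) < K_MODELS_COUNT:
        return
    # Find the write_ts of the K-th most recent model, then delete
    # everything strictly older than it
    kth = list(model_db.find({"user_id": user_id})
                       .sort("metadata.write_ts", -1)
                       .skip(K_MODELS_COUNT - 1).limit(1))
    if kth:
        result = model_db.delete_many({
            "user_id": user_id,
            "metadata.write_ts": {"$lt": kth[0]["metadata"]["write_ts"]},
        })
        logging.debug("trimmed %d old models for user %s",
                      result.deleted_count, user_id)
```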

* Squashed commit of the following:

commit 54659fb
Merge: cf0c9e2 1159eac
Author: K. Shankari <shankari@eecs.berkeley.edu>
Date:   Thu Dec 21 20:17:15 2023 -0800

    Merge pull request #951 from MukuFlash03/fix-vuln

    Bulk deletion of site-package/tests

commit 1159eac
Author: Mahadik, Mukul Chandrakant <MukulChandrakant.Mahadik@nrel.gov>
Date:   Thu Dec 21 20:43:39 2023 -0700

    Bulk deletion of site-package/tests

    Added a one-line pipe command to remove all occurrences of site-packages/tests in the miniconda main directory.

commit cf0c9e2
Merge: d2f38bc 3be2757
Author: K. Shankari <shankari@eecs.berkeley.edu>
Date:   Thu Dec 21 17:47:27 2023 -0800

    Merge pull request #950 from MukuFlash03/fix-vuln

    Remove obsolete package versions

commit 3be2757
Author: Mahadik, Mukul Chandrakant <MukulChandrakant.Mahadik@nrel.gov>
Date:   Thu Dec 21 18:05:23 2023 -0700

    Remove obsolete package versions

    Cleaned up older versions for two packages:
    urllib3 - deleted stale version folders
    python - deleted tests folder

commit d2f38bc
Merge: 978a719 c1b0889
Author: K. Shankari <shankari@eecs.berkeley.edu>
Date:   Wed Dec 20 14:31:09 2023 -0800

    Merge pull request #949 from MukuFlash03/fix-vuln

    Fixing latest Docker image vulnerabilities

commit c1b0889
Author: Mahadik, Mukul Chandrakant <MukulChandrakant.Mahadik@nrel.gov>
Date:   Mon Dec 18 11:04:25 2023 -0700

    Upgraded Ubuntu base image

    The latest Ubuntu base image, which contains updated versions of libc6 and libc-bin, was just released officially by Docker.

commit 07747d0
Author: Mahadik, Mukul Chandrakant <MukulChandrakant.Mahadik@nrel.gov>
Date:   Fri Dec 15 18:38:12 2023 -0700

    Fixing latest Docker image vulnerabilities

    AWS Inspector found the following vulnerable packages:

    CRITICAL
    perl

    HIGH
    nghttp2, libnghttp2-14
    cryptography, libssl3
    cryptography
    libc6, libc-bin

    Upgraded the perl, libssl3, and nghttp2 packages by upgrading the base Ubuntu image to the latest release of the same LTS version, jammy (22.04).

    The cryptography package was fixed by specifying the required version to be installed using conda.

    libc6 and libc-bin could be fixed with apt-get upgrade, but that upgrades all packages, which is not recommended as a blanket fix.

* Updated Test file with dummy models

Was using pre-built models for testing that were present in the stage_snapshot dataset, but these won't be available in the normal dataset.

Hence, took reference from emission.tests.modellingTests.TestRunGreedyModel.py for creating test models using dummy trip data.

Creating multiple such models and then checking that the trimming operation succeeds.

* Reverted commit 31626ba + Removed log statements

Commit 31626ba included changes related to vulnerability fixes, hence reverted them.

Had added some log statements related to scalability fixes for model loading.
Removed them; they will be added in a separate PR.

* Changed imports to access K_MODEL_COUNT

Removed a redundant "esda" import, possibly added during local testing.

Removed get_model_limit(), as K_MODEL_COUNT is a class variable and can be accessed directly using the class name.

* Revert "Squashed commit of the following:"

This reverts commit 31626ba.

The earlier commit f54e85f missed some of the reverted changes relating to the vulnerability fixes, so they are removed here.

* Included maximum_stored_model_limit parameter in trip_model config file

Decided on a threshold value for the model count, above which redundant models will be deleted.
This was agreed to be 3 and is defined in the trip_model.config.json.sample file.

Necessary changes have been made in the related files to use this config value.
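
For illustration, reading the new value might look roughly like this (the exact path and loading code are assumptions; a deployed config would drop the .sample suffix):

```python
import json

with open("conf/analysis/trip_model.config.json") as f:
    trip_model_config = json.load(f)

# Threshold above which redundant models are deleted; the agreed value is 3
max_models = trip_model_config["maximum_stored_model_limit"]
```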

* Addressed latest review comments

- Improved logging statements to indicate whether the model count has changed.
- Skipped the check for saved test data in the test file, since we clear the related collections in teardown().

* Restored inferrers.py

Did this to revert a whitespace change, which was probably due to removing a line at the end of the file.

We don't want it modified, as it is unrelated to this set of PR changes.

Restored using the following command, pointing at the commit just before the first one where the file was modified:

```
git restore --source=79ad1cc~1 emission/analysis/classification/inference/labels/inferrers.py
```

* Updated Test for Storing limited models

Added a test to ensure that only the latest models are stored, by comparing the write_ts times.

* Revert whitespace changes

* Revert whitespace changes

---------

Co-authored-by: Mahadik, Mukul Chandrakant <MukulChandrakant.Mahadik@nrel.gov>
Co-authored-by: K. Shankari <shankari@eecs.berkeley.edu>
…tionary (#954)

* Filtered out rows with dictionary in user_label_df

The idea is to check the data type using isinstance() and then apply this check to the entire dataframe at once, instead of iterating over each row, which is much slower.

These rows are then filtered out of the original dataframe, leaving behind only the non-dict rows.
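
A sketch of that vectorized filtering (the dataframe contents here are made up; the column name comes from the commit below):

```python
import pandas as pd

user_label_df = pd.DataFrame({
    "trip_user_input": [{"survey": "..."}, "walk", "bike"],  # mixed dicts and labels
})

# Flag dict-valued rows in one pass over the column, then drop them,
# instead of checking each row in a Python-level loop
is_dict_row = user_label_df["trip_user_input"].apply(lambda v: isinstance(v, dict))
user_label_df = user_label_df[~is_dict_row]
```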

* Added log statements to indicate dataframe filtering done

Added a log statement to greedy_similarity_binning to indicate that filtering is being done for the dictionary elements in the dataframe when the column is 'trip_user_input'.

* Modified filtering of survey inputs

Now filtering survey inputs before the dataframe itself is created, by checking whether each dictionary value is a plain value or a nested dictionary.

* Add TODO so I can merge this

---------

Co-authored-by: Mahadik, Mukul Chandrakant <MukulChandrakant.Mahadik@nrel.gov>
Co-authored-by: K. Shankari <shankari@eecs.berkeley.edu>
1. Updated the Ubuntu base image to the latest version.
This fixes the libgnutls30 package.

2. Updated the cryptography package to the suggested version.

3. Updated the bash package to the latest version manually.
The latest Ubuntu version still contains the vulnerable package, so it had to be upgraded manually.
These are the server-side changes related to
e-mission/e-mission-docs#1062

The changes are fairly straightforward, and consistent with
https://github.com/e-mission/e-mission-docs/blob/2665b39e1335ea04896b6944a4a065e7887b6cdc/docs/dev/back/adding_a_new_data_type.md?plain=1#L4

Concretely:
- we add a new `bluetoothble` wrapper
- we add references to it to `entry.py` and `builtin_timeseries.py`
- we add formatters for both android and iOS

A new wrinkle this time is that we are modifying the FSM, so there are also
new transitions. Those needed to be added to the enums in the transition
wrapper, and to the maps in the formatters so that the enums could be created
properly.

Bonus fix: check for the `None` transition properly on android and iOS to
avoid spurious errors

```
>>> broken_transition_example = {'_id': ObjectId('661b129fc271a44bb612b464'), 'metadata': {'time_zone': 'America/Los_Angeles', 'plugin': 'none', 'write_ts': 1713050268.574551, 'platform': 'ios', 'read_ts': 0, 'key': 'statemachine/transition', 'type': 'message'}, 'user_id': UUID('f1aaae55-fc42-4527-bf7f-33f84d7c8c2f'), 'data': {'currState': 'STATE_ONGOING_TRIP', 'transition': None, 'ts': 1713050268.574418}}
>>> broken_transition_example_entry = ad.AttrDict(broken_transition_example)
>>> enufit.format(broken_transition_example_entry)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/kshankar/Desktop/data/e-mission/gis_branch_tests/emission/net/usercache/formatters/ios/transition.py", line 64, in format
    data.transition = transition_map[entry.data.transition].value
KeyError: None

----- fixed code ------

>>> importlib.reload(enufit)
<module 'emission.net.usercache.formatters.ios.transition' from '/Users/kshankar/Desktop/data/e-mission/gis_branch_tests/emission/net/usercache/formatters/ios/transition.py'>
>>> enufit.format(broken_transition_example_entry)
AttrDict({'_id': ObjectId('661b129fc271a44bb612b464'), 'user_id': UUID('f1aaae55-fc42-4527-bf7f-33f84d7c8c2f'), 'metadata': AttrDict({'time_zone': 'America/Los_Angeles', 'plugin': 'none', 'write_ts': 1713050268.574551, 'platform': 'ios', 'read_ts': 0, 'key': 'statemachine/transition', 'type': 'message', 'write_local_dt': LocalDate({'year': 2024, 'month': 4, 'day': 13, 'hour': 16, 'minute': 17, 'second': 48, 'weekday': 5, 'timezone': 'America/Los_Angeles'}), 'write_fmt_time': '2024-04-13T16:17:48.574551-07:00'}), 'data': AttrDict({'curr_state': 2, 'transition': None, 'ts': 1713050268.574551, 'local_dt': AttrDict({'year': 2024, 'month': 4, 'day': 13, 'hour': 16, 'minute': 17, 'second': 48, 'weekday': 5, 'timezone': 'America/Los_Angeles'}), 'fmt_time': '2024-04-13T16:17:48.574551-07:00'})})
```
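
The fix implied by the before/after output is presumably a guard along these lines (a sketch, not the actual diff in emission/net/usercache/formatters/ios/transition.py):

```python
def map_transition(raw_transition, transition_map):
    # Pass a None transition through instead of looking it up in the
    # enum map, which previously raised KeyError: None
    if raw_transition is None:
        return None
    return transition_map[raw_transition].value
```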

Testing done:
Used the corresponding changes in
e-mission/e-mission-phone#1144
to simulate BLE as follows:
- Region exit
- A few range updates until the `ble_beacon_found` transition was generated
- Turned on location mocking from the android and iOS simulators, and manually
  generated the start trip transition on android
- Clicked "range update" at random times during the simulated trip
- BLE beacon lost from the UI
- Turn off location mocking
- Force end trip transition

Testing Results:

Android:

```
START 2024-04-13 17:36:02.096342 POST /usercache/put
END 2024-04-13 17:36:02.313529 POST /usercache/put ebc13f1b-671b-4094-bce6-fed342da7e9c 0.2171182632446289
START 2024-04-13 17:36:02.583812 POST /usercache/get
END 2024-04-13 17:36:02.591868 POST /usercache/get ebc13f1b-671b-4094-bce6-fed342da7e9c 0.007989168167114258
```

```
>>> edb.get_usercache_db().count_documents({"metadata.key": "background/bluetooth_ble"})
57
>>> edb.get_timeseries_db().count_documents({"metadata.key": "background/bluetooth_ble"})
0
```

```
2024-04-13 17:37:57,635:DEBUG:140704655566784:write_ts = 1713054811.255
2024-04-13 17:37:57,635:DEBUG:140704655566784:module_name = emission.net.usercache.formatters.android.bluetooth_ble
2024-04-13 17:37:57,636:DEBUG:140704655566784:write_ts = 1713054811.294
2024-04-13 17:37:57,636:DEBUG:140704655566784:module_name = emission.net.usercache.formatters.android.bluetooth_ble
2024-04-13 17:37:57,636:DEBUG:140704655566784:write_ts = 1713054811.316
2024-04-13 17:37:57,636:DEBUG:140704655566784:module_name = emission.net.usercache.formatters.android.bluetooth_ble
2024-04-13 17:37:57,637:DEBUG:140704655566784:write_ts = 1713054811.339
2024-04-13 17:37:57,637:DEBUG:140704655566784:module_name = emission.net.usercache.formatters.android.bluetooth_ble
2024-04-13 17:37:57,637:DEBUG:140704655566784:write_ts = 1713054811.369

```
>>> edb.get_usercache_db().count_documents({"metadata.key": "background/bluetooth_ble"})
0
>>> edb.get_timeseries_db().count_documents({"metadata.key": "background/bluetooth_ble"})
57
```

iOS

```
START 2024-04-13 16:17:50.707151 POST /usercache/put
START 2024-04-13 16:17:50.737703 POST /usercache/get
END 2024-04-13 16:17:50.763880 POST /usercache/get f1aaae55-fc42-4527-bf7f-33f84d7c8c2f 0.026064157485961914
END 2024-04-13 16:17:51.340867 POST /usercache/put f1aaae55-fc42-4527-bf7f-33f84d7c8c2f 0.6329052448272705
```

```
>>> edb.get_usercache_db().count_documents({"metadata.key": "background/bluetooth_ble"})
74
```

```
DEBUG:root:module_name = emission.net.usercache.formatters.ios.bluetooth_ble
DEBUG:root:write_ts = 1713050204.075974
DEBUG:root:module_name = emission.net.usercache.formatters.ios.bluetooth_ble
DEBUG:root:write_ts = 1713050204.077563
DEBUG:root:module_name = emission.net.usercache.formatters.ios.bluetooth_ble
DEBUG:root:write_ts = 1713050204.078974
DEBUG:root:module_name = emission.net.usercache.formatters.ios.bluetooth_ble
DEBUG:root:write_ts = 1713050207.32417
DEBUG:root:module_name = emission.net.usercache.formatters.ios.bluetooth_ble
DEBUG:root:write_ts = 1713050207.326033
```

```
>>> edb.get_usercache_db().count_documents({"metadata.key": "background/bluetooth_ble"})
0
>>> edb.get_timeseries_db().count_documents({"metadata.key": "background/bluetooth_ble"})
74
>>> edb.get_timeseries_error_db().count_documents({"metadata.key": "background/bluetooth_ble"})
0
```
Our negative test in this test suite used `'transition': none`
as an invalid transition. However, in
c86fcf5
we started handling `None` transitions properly.

So we need to change it to a transition that is actually invalid.

Testing done:
Before this, `testMoveWhenEmpty` failed.
Now, `testMoveWhenEmpty` passes.
`save` -> `replace_one` with upsert
`remove` -> `delete_one`
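
These are the standard pymongo replacements for the deprecated calls; roughly:

```python
from pymongo import MongoClient

collection = MongoClient().Stage_database.Stage_usercache  # example collection
doc = {"_id": "example", "data": {}}

# Before (deprecated in pymongo 3, removed in pymongo 4):
#   collection.save(doc)
#   collection.remove({"_id": doc["_id"]})

# After:
collection.replace_one({"_id": doc["_id"]}, doc, upsert=True)
collection.delete_one({"_id": doc["_id"]})
```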
🗃️ Plumb through the BLE objects to the server
This will provide a way for the pipeline to access the dynamic config.
The first time `get_dynamic_config` is called, it fetches the config associated with the study from GitHub. The study is defined by the environment variable `STUDY_CONFIG`.
The downloaded config is cached within one session (each pipeline run is a different session), so subsequent calls to `get_dynamic_config` will not need to re-fetch.
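A minimal sketch of that fetch-once-then-cache behavior (the config URL pattern here is an assumption, not necessarily the real one):

```python
import os
import requests

_dynamic_config = None  # cached for the duration of one session / pipeline run

def get_dynamic_config():
    global _dynamic_config
    if _dynamic_config is None:
        study = os.environ["STUDY_CONFIG"]  # e.g. "dfc-fermata"
        url = ("https://raw.githubusercontent.com/e-mission/"
               f"nrel-openpath-deploy-configs/main/configs/{study}.nrel-op.json")
        _dynamic_config = requests.get(url).json()  # fetched only once
    return _dynamic_config
```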
Testing done:
- loaded a dump of dfc-fermata from 24 April into my local database
- ran `. setup/setup.sh` to update dependencies (e-mission-common is a new dependency)
- located one user (the opcode I've been using the most for my own testing); cleared the pipeline for that user via `./e-mission-py.bash bin/reset_pipeline.py -e nrelop_dfc-fermata_test_*`
- ran the pipeline for that user, supplying STUDY_CONFIG as an environment variable: `STUDY_CONFIG=dfc-fermata ./e-mission-py.bash bin/debug/intake_single_user.py -e nrelop_dfc-fermata_test_*`
- observed the log output; BLE modes were detected for at least some of the sections
```
2024-05-02 17:11:52,566:DEBUG:140704482301888:Returning cached dynamic config for dfc-fermata at version 5
2024-05-02 17:11:52,566:DEBUG:140704482301888:getting BLE sensed vehicle for section from 1713192674 to 1713193146
2024-05-02 17:11:52,593:DEBUG:140704482301888:After filtering, 471 BLE ranging entries during the section
2024-05-02 17:11:52,595:DEBUG:140704482301888:after counting, ble_beacon_counts = {'dfc0:fff0': 471}
2024-05-02 17:11:52,595:DEBUG:140704482301888:found vehicle Jack's Mazda 3 with BLE beacon dfc0:fff0
```
Some sections still have a ble_sensed_mode of None, but this is plausible because we hadn't integrated the BLE sensing until a couple of weeks ago, and I've been using this opcode for longer than that.
- observed that the sections have `ble_sensed_mode`:
```
ts = esta.TimeSeries.get_time_series(user_id)
entries = ts.find_entries(["analysis/cleaned_section"])
print([e['data']['ble_sensed_mode'] for e in entries])
```
Output shows 31 sections with 'None' and 9 sections with the following:
```
{'value': 'car_jacks_mazda3', 'bluetooth_major_minor': ['dfc0:fff0'], 'text': "Jack's Mazda 3", 'baseMode': 'CAR', 'met_equivalent': 'IN_VEHICLE', 'kgCo2PerKm': 0.16777, 'vehicle_info': {'type': 'car', 'license': 'JHK ****', 'make': 'Mazda', 'model': '3', 'year': 2014, 'color': 'red', 'engine': 'ICE', 'mpg': 33}}
```
Since sections now have a `ble_sensed_mode` property in addition to `sensed_mode`, we can use the `ble_sensed_mode` as the basis for a new kind of section summary.

To enable this, the `get_section_summary` function was refactored to accept an additional argument, which can be either `sensed_mode` or `ble_sensed_mode`. If it's `ble_sensed_mode`, the summary values will be summed and grouped by the `baseMode` of the vehicle that was sensed by BLE. If the additional argument is not given, it just defaults to `sensed_mode` (same behavior as before). A sketch follows below.
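
A hedged sketch of the refactor (the real function lives in e-mission-server; the grouping details and column names here are assumptions):

```python
import pandas as pd

def get_section_summary(sections_df: pd.DataFrame,
                        mode_prop: str = "sensed_mode") -> dict:
    if mode_prop == "ble_sensed_mode":
        # Group by the baseMode of the BLE-sensed vehicle; sections where
        # nothing was sensed fall into UNKNOWN
        keys = sections_df[mode_prop].apply(
            lambda m: m["baseMode"] if isinstance(m, dict) else "UNKNOWN")
    else:
        # Default: same behavior as before, grouped by the sensed mode value
        keys = sections_df[mode_prop]
    grouped = sections_df.groupby(keys)
    return {
        "distance": grouped["distance"].sum().to_dict(),
        "duration": grouped["duration"].sum().to_dict(),
        "count": grouped["distance"].count().to_dict(),
    }
```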

Testing done:
- using a dump of dfc-fermata from 24 April loaded into my local database
- reset and re-ran the pipeline for a user
- inspected the `ble_sensed_summary`s of the resulting confirmed trips:

```
ts = esta.TimeSeries.get_time_series(user_id)
entries = ts.find_entries(["analysis/confirmed_trip"])
for e in entries:
    print(e['data']['ble_sensed_summary'])
```

Output:
```
{'distance': {'UNKNOWN': 4294.125850161847}, 'duration': {'UNKNOWN': 10379.947390556335}, 'count': {'UNKNOWN': 2}}
{'distance': {'UNKNOWN': 1511402.9404509964}, 'duration': {'UNKNOWN': 131559.69582343102}, 'count': {'UNKNOWN': 4}}
{'distance': {'UNKNOWN': 34462.99760301494}, 'duration': {'UNKNOWN': 977.1180453300476}, 'count': {'UNKNOWN': 3}}
{'distance': {'UNKNOWN': 19485.795458445176}, 'duration': {'UNKNOWN': 3718.8894975185394}, 'count': {'UNKNOWN': 2}}
{'distance': {'UNKNOWN': 850.4852679201318}, 'duration': {'UNKNOWN': 15861.93504524231}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 47455.78820030493}, 'duration': {'UNKNOWN': 7241.158731937408}, 'count': {'UNKNOWN': 3}}
{'distance': {'UNKNOWN': 1551245.9810044689}, 'duration': {'UNKNOWN': 13537.040816783905}, 'count': {'UNKNOWN': 5}}
{'distance': {'UNKNOWN': 713.4707286619786}, 'duration': {'UNKNOWN': 1448.6543893814087}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 3981236.406827432}, 'duration': {'UNKNOWN': 26685.684997320175}, 'count': {'UNKNOWN': 2}}
{'distance': {'UNKNOWN': 333.2678850005484}, 'duration': {'UNKNOWN': 21.80204725265503}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 2866.186136018516}, 'duration': {'UNKNOWN': 3249.8193860054016}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 2645.5964919933926}, 'duration': {'UNKNOWN': 750.7993273735046}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 2442.091053542208}, 'duration': {'UNKNOWN': 489.8975965976715}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 4449.832978365483}, 'duration': {'UNKNOWN': 2710.8742623329163}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 6409.15214382765}, 'duration': {'UNKNOWN': 745.2548396587372}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 163.76752941202727}, 'duration': {'UNKNOWN': 12.731597900390625}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 4440.681532305813}, 'duration': {'UNKNOWN': 93.64632654190063}, 'count': {'UNKNOWN': 3}}
{'distance': {'UNKNOWN': 366.71884583356365}, 'duration': {'UNKNOWN': 1299.6981484889984}, 'count': {'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 2078.9642225193907}, 'duration': {'UNKNOWN': 11704.498789787292}, 'count': {'UNKNOWN': 1}}
{'distance': {'E_CAR': 2328.846316296626, 'UNKNOWN': 6971.404103278014}, 'duration': {'E_CAR': 3389.3658237457275, 'UNKNOWN': 945.0989944934845}, 'count': {'E_CAR': 1, 'UNKNOWN': 2}}
{'distance': {'UNKNOWN': 155.63066751202888}, 'duration': {'UNKNOWN': 116.02340650558472}, 'count': {'UNKNOWN': 1}}
{'distance': {'E_CAR': 7177.555954021382, 'UNKNOWN': 43.50518475446066}, 'duration': {'E_CAR': 1140.2288630008698, 'UNKNOWN': 175.48782801628113}, 'count': {'E_CAR': 1, 'UNKNOWN': 1}}
{'distance': {'UNKNOWN': 846.2578386118126}, 'duration': {'UNKNOWN': 766.3598084449768}, 'count': {'UNKNOWN': 1}}
{'distance': {'E_CAR': 104.02576498063684}, 'duration': {'E_CAR': 38.734825134277344}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 8250.614829636044}, 'duration': {'E_CAR': 825.4048693180084}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 6171.384206967509}, 'duration': {'E_CAR': 781.507465839386}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 1065.2594223724116, 'UNKNOWN': 73.02336166659416}, 'duration': {'E_CAR': 491.0299062728882, 'UNKNOWN': 105.17022156715393}, 'count': {'E_CAR': 1, 'UNKNOWN': 1}}
{'distance': {'E_CAR': 733.2246673403289}, 'duration': {'E_CAR': 845.3612954616547}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 4275.13516142465}, 'duration': {'E_CAR': 541.7157530784607}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 7391.707788417107}, 'duration': {'E_CAR': 611.8730540275574}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 10035.89912694906}, 'duration': {'E_CAR': 1205.663957118988}, 'count': {'E_CAR': 1}}
{'distance': {'E_CAR': 9085.234840541858}, 'duration': {'E_CAR': 3597.4890780448914}, 'count': {'E_CAR': 1}}
```
When conda installs dependencies, it will attempt to use git for any dependencies that reference a git repository (e.g. `git+https://github.com/my/repo`).
So git must be installed on the container before dependencies are installed, or installation will fail.
`ble_sensed_summary` is a new property for confirmed trips, so tests fail on the old snapshots. Newly generated snapshots are the same, but with different uuids and oids, and with empty ble_sensed_summary values in each confirmed trip.
Currently we don't have any tests with BLE sensor data, so there are no tests that have a `ble_sensed_summary` that is filled in. Coming later...
This test is failing because it expects shankari_2016-06-20.expected_confirmed_trips to have a specific UUID, which is hardcoded in the test.

This came up the last time these "expected" snapshots were re-generated: 3c52375

I think it will be annoying to keep replacing the user_id every time the expected outputs change. So I just reworked the test to use whatever user_id is in the file rather than some arbitrary hardcoded one.
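
The rework presumably boils down to something like this (the file path and layout are assumptions):

```python
import json

with open("emission/tests/data/expected_confirmed_trips/"
          "shankari_2016-06-20.expected_confirmed_trips") as f:
    expected_trips = json.load(f)

# Use whatever user_id the snapshot carries instead of a hardcoded UUID
test_user_id = expected_trips[0]["user_id"]
```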
🎇 BLE matching in the pipeline
We use a `git`-based reference to the `em-common` repo.
The related PR (#965) added git to the test dockerfile to get the tests to pass.

We need to add a similar command to the production Dockerfile.
* Trying to fix regression

Attempting to get the nominatim tests to pass after a regression was identified.

* Getting the test to run

Have to get the test to run to see if it fixes it!

* Push test

Re-enabled nominatim test on my branch, so hopefully it will run on this push

* Reverting nominatim-docker-test.yml
The existing implementation here addressed the localdate fields separately, treating them like a set of filters rather than one date range from start to end.
The "set of filters" approach was good enough for some things, like filtering to a specific month, but not for canonical datetime range filtering, where you want everything between an arbitrary start and end datetime.
The complexity of the logic needed to form the NoSQL queries is more than expected, but after the dust settles it's about the same amount of code as the old implementation.

In short, this implementation finds which units are used by both the start and end localdates, walks down them (from largest unit to smallest), and combines $and and $or blocks to determine the upper and lower bounds (a sketch follows the commit title below).
make localdate queries behave like canonical datetime range filters
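
For illustration, a lower bound like "on or after 2024-06-01" expressed against separate local_dt fields ends up as nested blocks roughly like this (a sketch of the generated query shape, not the exact code):

```python
# Matches entries with (year, month, day) >= (2024, 6, 1), walking from the
# largest unit down; this dict would be passed to a pymongo find()
lower_bound = {"$or": [
    {"data.local_dt.year": {"$gt": 2024}},
    {"$and": [
        {"data.local_dt.year": 2024},
        {"$or": [
            {"data.local_dt.month": {"$gt": 6}},
            {"$and": [
                {"data.local_dt.month": 6},
                {"data.local_dt.day": {"$gte": 1}},
            ]},
        ]},
    ]},
]}
# The upper bound is built the same way with $lt/$lte, and the two bounds
# are combined with a top-level $and
```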
The updated in-app dashboard will use this new time type via `/result/metrics/yyyy_mm_dd`.
When using this time type, `e-mission-common` is used for the metrics calculations, which has a new structure (#966 (comment)).
`start_time` and `end_time` are given as ISO dates.
Make the text for push notifications configurable, translatable, and filterable, so that notifications go only to users with at least one recent trip that has no user input.
The new config field 'push_notifications' has 'title' and 'message' (the text, per language) and 'recent_user_input_threshold' (the number of days to consider when determining 'recent user input'); a sketch follows below.
Note the caveat about partially-labeled trips.
make push notifs configurable, i18n, & only for users missing labels
Co-authored-by: K. Shankari <shankari@eecs.berkeley.edu>
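
For illustration, the new field might look roughly like this (the key names follow the description above; the exact schema and sample text are assumptions):

```python
push_notifications = {
    # Per-language notification text
    "title": {"en": "Trip labels needed"},
    "message": {"en": "You have recent trips that are missing labels."},
    # Days to look back when deciding whether a user has a recent
    # unlabeled trip
    "recent_user_input_threshold": 7,
}
```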
JGreenlee and others added 10 commits May 29, 2024 00:30
I recently rewrote the local dt queries used by TimeComponentQuery (#968) to make them behave as datetime range queries. But this results in some pretty complex queries which could have nested $and and $or conditions. It felt overengineered, but that was the logic required if we were to query date ranges by looking at local dt objects where year, month, day, etc. are all recorded separately.
Unfortunately, those queries run super slowly on production. I think this is because the $and / $or conditions prevented MongoDB from optimizing efficiently via indexing.

Looking for a different solution, I found something better and simpler.
At first I didn't think using fmt_time fields with $lt and $gt comparisons would work, because they're ISO strings, not numbers.
But luckily MongoDB can compare strings this way: it performs a lexicographical comparison where 0-9 < A-Z < a-z (like ASCII).
ISO timestamps have the standard format 0000-00-00T00:00:00.
The dashes and the T will always be in the same places, so effectively only the numbers are compared. And the timezone info is at the end, so it doesn't get considered as long as we don't include it in the start and end inputs.
The only thing that specifically needs to be handled is making the end range inclusive. If I query from "2024-06-01" to "2024-06-03", an entry like "2024-06-03T23:59:59-04:00" should match. But lexicographically, that comes after "2024-06-03".
Appending a "Z" to the end range solves this (see the sketch below).

The result of all this is that the TimeQuery class (timequery.py) is now able to handle either timestamps or ISO dates, and this is what metrics.py will use for summarize_by_yyyy_mm_dd.
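
A minimal sketch of such a range query (the field names are assumptions):

```python
start_iso, end_iso = "2024-06-01", "2024-06-03"

# Lexicographic comparison on ISO strings: the digits dominate because the
# dashes and the "T" are always in the same positions
query = {
    "metadata.key": "analysis/confirmed_trip",
    "data.start_fmt_time": {
        "$gte": start_iso,
        # "Z" sorts after "T", so "2024-06-03T23:59:59-04:00" still
        # matches an end bound of "2024-06-03Z"
        "$lte": end_iso + "Z",
    },
}
```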
#970 (comment)
> I realized that the TimeQuery on `e-mission-server` (which works w/ MongoDB) mirrors TimeQuery in the native code plugins (which work with SQL dbs).
>
> For that reason, I don't want to rename anything in `TimeQuery` right now. So instead of extending the functionality of `TimeQuery` to work with either ts or fmt_time, I'm going to make a new class called `FmtTimeQuery` (or similar)
>
> I briefly read up on SQL queries to make sure they work the same way when comparing strings. They do – so later on, we could implement `FmtTimeQuery` on the native code UserCache plugin too
#970 (comment)

Reverts #968 back to the original implementation, since we now have a faster way to do "range" queries via FmtTimeQuery.

To make it clearer that FmtTimeQuery is for ranges and TimeComponentQuery is for filters, I renamed some functions and added more descriptive comments/docstrings.
for yyyy_mm_dd metrics, query by fmt_time instead of local_dt
Integration tests for Geofabrik Overpass API
@nataliejschultz nataliejschultz merged commit d9b1ea8 into nataliejschultz:overpass Jul 11, 2024