Rotate bike_id on free_bike_status #147
Regardless of where we end up on the issue of |
If the primary purpose of having a bike_id is for booking handoff, wouldn't it be clearer to rename this field to |
This was one issue discussed during an in-person GBFS developers workshop held before the NABSA 2019 conference (workshop agenda and notes).

During the workshop, we discussed GBFS dataset producer and consumer uses of bike_id and if/how bike_id is currently rotated. We learned that some producers currently rotate bike_id; in one example, a city required the rotation of bike_id as a condition of permitted operations. We also heard about GBFS producers removing bike_id from their GBFS datasets and being required by cities to add it back in as a requirement of their permits. There was speculation about why producers may have done this, and privacy was discussed as a likely factor.

Identified uses of bike_id include:
Booking - bike_id is used for in-app booking
Operations - bike_id can be used to identify stale bikes
Regulatory compliance - bike_id may be used for counting bikes in support of regulatory enforcement

Because GBFS is concerned with traveler-facing information, we are primarily concerned with the booking use case, though participants did discuss how to address the other two use cases assuming that bike_id would be rotated regularly in the future. Ultimately, the inconvenience of changing current use cases needs to be weighed against current privacy concerns, and participants generally agreed that the rotation of bike_id was the simplest solution for now.

For booking, the main purpose of bike_id is as a hand-off token. With the deeplinks PR #25 assumed to be moving forward in the near future, it’s possible that bike_id may not even serve that purpose in the future. Participants discussed making bike_id an optional field, but ultimately, to reduce the possibility of this being a breaking change, the consensus was to rotate bike_id at least after every trip (some operators suggested they’d prefer to rotate more often for operational reasons) and to provide best practice guidance suggesting methods for rotation and for reversibility for operators.

This change is intended to reduce the exposure of private data and does not guarantee anonymization.
This proposal is part of the series of the minimum viable proposals we’re hoping to move forward in the coming weeks to get the spec better in line with industry practices and concerns, but we anticipate the community continuing to examine this issue in more depth on an on-going basis. |
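As an illustration of "rotate at least after every trip, with reversibility for operators", here is a minimal sketch, not a method agreed in the workshop. It assumes the operator keeps a private, stable internal ID per vehicle and issues a random public bike_id that is replaced when a trip ends. The class name `BikeIdRotator`, its methods, and the in-memory dictionaries are all hypothetical; a real operator would presumably persist the mapping somewhere private.

```python
import secrets


class BikeIdRotator:
    """Hypothetical operator-side helper: stable internal IDs -> rotating public bike_ids."""

    def __init__(self):
        # internal id -> public bike_id currently shown in the feed
        self._public_by_internal = {}
        # every public bike_id ever issued -> internal id (kept private, gives reversibility)
        self._internal_by_public = {}

    def _issue(self, internal_id):
        # Random token, so the public id cannot be derived from the internal id.
        public_id = secrets.token_hex(16)
        self._public_by_internal[internal_id] = public_id
        self._internal_by_public[public_id] = internal_id
        return public_id

    def public_id(self, internal_id):
        """Return the bike_id to publish now, issuing one on first use."""
        existing = self._public_by_internal.get(internal_id)
        return existing if existing is not None else self._issue(internal_id)

    def end_trip(self, internal_id):
        """Rotate the public bike_id; call at minimum after every trip."""
        return self._issue(internal_id)

    def resolve(self, public_id):
        """Operator-only lookup from any issued bike_id back to the internal id."""
        return self._internal_by_public.get(public_id)
```

Calling `end_trip` more often (for example after a rebalancing move) stays within "at least after every trip"; only the operator, who holds the mapping, can link two public IDs to the same vehicle.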
One of the use cases of GBFS is the monitoring of bike-sharing operators by
municipalities (for example, the MDS specification requires providers to
publish a GBFS feed:
https://github.com/CityOfLosAngeles/mobility-data-specification/tree/dev/provider
).
I respect the wish of bike-sharing operators to not communicate their
bike_ids in public feeds. But it would be a good idea to have the
possibility to communicate a static bike_id in a GBFS feed that is private
for municipalities / governments to not break with current implementations
of monitoring software.
On Tue, 8 Oct 2019 at 00:33, Heidi G <notifications@github.com> wrote:
… This was one issue discussed during an in-person GBFS developers workshop
held before the NABSA 2019 conference (workshop agenda
<https://docs.google.com/document/d/13uXgMF5LWYvKX7vw5Y6_SZsBmtd32VS0opEfDJFjJYo/edit?usp=sharing>
and notes
<https://docs.google.com/document/d/1TXf3U7gbrJrS_r2PmTrfbJspNydAyd42xU3ctNgzNLM/edit?usp=sharing>
).
During the workshop, we discussed GBFS dataset producer and consumer uses
of bike_id and if/how bike_id is currently rotated. We learned that some
producers currently rotate bike_id; we learned of one example where the
rotation of bike_id was required by a city as a requirement of permitted
operations. We also heard about GBFS producers removing bike_id from their
GBFS datasets and being required by cities to add them back in as a
requirement of their permits. There was speculation about why producers may
have done this, and privacy was discussed as a likely factor.
Identified uses of bike_id include:
Booking - bike_id is used for in-app booking
Operations - bike_id can be used to identify stale bikes
Regulatory compliance - bike_id may be used for counting bikes in support
of regulatory enforcement
Because GBFS is concerned with traveler-facing information, we are
primarily concerned with the booking use case, though participants did
discuss how to address the other two use cases assuming that bike_id would
be rotated regularly in the future. Ultimately, the inconvenience of
changing current use cases needs to be weighed against current privacy
concerns, and participants generally agreed that the rotation of bike_id
was the simplest solution for now.
For booking, the main purpose of bike_id is as a hand-off token. With the
deeplinks PR assumed to be moving forward in the near future, it’s possible
that bike_id may not even serve that purpose in the future. Participants
discussed making bike_id an optional field, but ultimately to reduce the
possibility of this being a breaking change, the consensus was to rotate
bike_id *at least* after every trip (some operators suggested they’d
prefer to rotate more often for operational reasons) and to provide best
practice guidance suggesting methods for rotation and for reversibility for
operators.
This change is intended to reduce the exposure of private data & does not
guarantee anonymization. This proposal is part of the series of the minimum
viable proposals we’re hoping to move forward in the coming weeks to get
the spec better in line with industry practices and concerns, but we
anticipate the community continuing to examine this issue in more depth on
an on-going basis.
|
@sven4all In the workshop we reinforced the division of responsibilities between GBFS and MDS. GBFS is intended to be used for traveler-facing information, and MDS is intended to be used for regulatory monitoring and enforcement. Some may have used GBFS for regulatory purposes, but going forward, privacy concerns outweigh alternate use cases like this that fall outside GBFS's primary purpose. Hence the need to rotate bike_ids. If MDS doesn't fit the exact use case you're looking at for monitoring, I'd suggest bringing that up on the MDS repository as a new issue - https://github.com/CityOfLosAngeles/mobility-data-specification. In the workshop we did discuss fleshing out this PR with the exact expected steps for rotating IDs to make expectations clearer to producers and consumers, and that work remains to be done. |
@HeidiMG thanks for the informative summary for those of us following along online. I have committed the change in language to specify that bike_id needs to be rotated after every trip, "at minimum". |
@morganherlocker Want to go ahead and call a vote on this now that versioning has passed? |
IIUC, @sven4all could be referring to authenticated (and encrypted?) access to In the case of the City of Lisbon, most Regarding privacy, and the particularly strict requirements of GDPR in Europe, there should be no limitations given that the City itself is GDPR compliant in the ways it processes and stores information. My 2c. |
@tmontes that is exactly what I mean. I think it would be helpful if GBFS supported and standardised this use case as well (including authentication and encryption). We really use GBFS as an MDS-light variant (it's easier for operators to implement and less data has to be handed over). If it's decided that free_bike_status should have rotating bike IDs, it will lead to a lot of confusion for many operators when governments are still requiring static bike_ids in the free_bike_status.json endpoint. Would it be an idea to introduce a monitor_bike_status.json exclusively for government access, inspired by free_bike_status.json? |
In the interest of not duplicating too much data between MDS and GBFS, while respecting the use cases of both (and noting that MDS currently requires providers to produce a GBFS feed), I wonder if a solution might be to leave I would only propose that something like that even be considered if a producer was willing to actually test it out, but it could serve both needs in a relatively lightweight way. |
We need to have a broader community conversation about how to avoid breaking the existing use cases of bike_id and what guidance to provide around when feed authorization is appropriate and how fields might be different in that case. But we also need updated spec documentation as soon as possible to ensure that we create a new understanding for operators and regulators that publishing stable bike_id in a publicly available GBFS dataset is not GBFS compliant. Everything that we have heard from GBFS consumers and producers indicates that there is a shared understanding about the risks of publicly publishing stable bike_id, and that we just need to come up with the right solutions for how to address it and still meet industry needs. With other GBFS enhancements we have started with minimum viable proposals, and we should do the same here. What do others think of this plan forward?
|
@heidiguenin I think that path forward sounds reasonable. I agree with @jcn that a new field for an immutable I agree that we need better language around what is allowed in a public feed for |
@heidiguenin Well, thinking more, I'm going to walk back a thumbs up for:
I think the above text should be part of the broader issue of private feeds and immutable bike IDs in these feeds. The GBFS Guiding Principles say (emphasis mine):
IMHO adding explicit text saying "for publicly published GBFS data sets" implies that privately published GBFS data sets are also encouraged/supported, which seems to go directly against the above guiding principles. So I would propose in this issue we focus only on providing guidance for handling bike IDs in public GBFS feeds, and leave all implicit and explicit modifications related to private feeds for another issue. |
Added language to provide some context for the change, as it will break certain use cases.
Co-Authored-By: Heidi G <38441752+heidiguenin@users.noreply.github.com>
I talked to @heidiguenin further about this issue offline. Given the critical privacy issues at stake, and that we want to follow the community process for changes to GBFS that meet the needs of the entire community, I'd like to propose that we try to address both issues (privacy and IDs in private feeds) in this proposal in order to try and move forwards with a viable solution quickly. I think the simplest path forward is adding the immutable_bike_id that @jcn proposed earlier that is used only in feeds with authentication (i.e., not open publicly available feeds). So, we would redefine
and we would also add:
@morganherlocker @tmontes @sven4all @jcn @heidiguenin What do you all think of this? |
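To make the proposal above concrete, here is a rough sketch of how a producer might serve the same vehicle record to a public feed and to an authenticated feed. It is only an assumption about how this could look, not agreed spec text: the function `to_feed_entry`, the internal record keys (`public_id`, `internal_id`, `reserved`, `disabled`), and the decision to gate on a simple `authenticated` flag are all hypothetical. Only `immutable_bike_id` and the free_bike_status field names come from the discussion and the spec.

```python
def to_feed_entry(bike, authenticated=False):
    """Build one free_bike_status entry from a hypothetical internal record."""
    entry = {
        "bike_id": bike["public_id"],  # rotating id, safe for public feeds
        "lat": bike["lat"],
        "lon": bike["lon"],
        "is_reserved": 1 if bike["reserved"] else 0,
        "is_disabled": 1 if bike["disabled"] else 0,
    }
    if authenticated:
        # Assumption: the stable id is exposed only behind authentication.
        entry["immutable_bike_id"] = bike["internal_id"]
    return entry


record = {
    "public_id": "a3f9c2",      # illustrative values only
    "internal_id": "bike-0042",
    "lat": 38.7223,
    "lon": -9.1393,
    "reserved": False,
    "disabled": False,
}
public_entry = to_feed_entry(record)
private_entry = to_feed_entry(record, authenticated=True)
```

In this sketch the public entry never contains the stable identifier, while an authenticated consumer such as a city sees both IDs and can keep existing monitoring keyed on `immutable_bike_id`.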
@barbeau what do you think of the @sven4all idea of a separate file |
That's why
An application that uses public Scooter GBFS feed to display Mobility Locations on a map and displays Scooter ID. For your understanding, I attached a screenshot from an application running with one of our scooter feeds: |
Hey everyone, thank you for the excellent discussion. I really appreciate all the wonderful feedback and perspectives from you all. 🙇 It sounds like we are at a consensus point where:
Since this is an active, widespread vulnerability, I would like to see us move to a vote and get this merged as quickly as possible. I'm supportive of the emerging regulatory use cases around authenticated feeds that could contain new metadata, and believe we should follow this up with separate formal proposals to shore up these ideas in the specification. These use cases are important and deserve a dedicated discussion and design process. With expediency in mind, I voice my support for merging bike_id rotation as the PR currently describes, since this position seems to have the fewest operational concerns and the broadest level of consensus, while addressing the user safety issue. I'll follow up this comment with a call for a formal vote, following the GBFS governance guidelines. Thank you for taking the time to follow along and contribute to this process! |
I hereby call a vote on this proposal. Voting will be open for 7 full days, until 11:59PM UTC on January 16. Please vote for or against the proposal, and include the organization for which you are voting in your comment. Please note if you can commit to implementing the proposal. |
Clarification:
I know a call for vote is ongoing and I'll review this with the City of Lisbon officials before casting any vote. Nevertheless, I suppose a better PR would say:
My 2c. |
Hi Tiago - Thanks for the thorough comments. Here are some notes for your and City of Lisbon's consideration. (Responses inline)
I believe some operators intend to rotate bike_id after a bike is moved or after some other "non-trip" event. "…At a minimum" would explicitly allow that, and would not diminish functionality in traveler-facing applications.
The stated purpose of GBFS is for traveler-facing applications. GBFS has also been useful for the purposes of monitoring and oversight by cities. I believe there needs to be a more thorough process to determine best practices and standards for that purpose -- beyond this immediate privacy patch. Long-term, I think there is a broadly held intention that GBFS and associated best practices should offer sound guidance to cities looking to set their own policies and requirements related to data for traveler-facing and monitoring applications.
Some consumers and producers have committed to implementing the deeplinks proposal (#25 (comment)) but to my knowledge it hasn't yet been implemented. It takes time for new features to be implemented. The purpose of specifying bike_id rotation is that this is already current GBFS practice in many cases and easily implementable. It is an expedient way of solving a privacy issue that otherwise remains open. In future versions, the community will have the opportunity to consider some of the ideas you've put forward. |
Ride Report votes in favor of this change. |
Google Maps votes in favor of the current proposal. |
Uber/JUMP votes in favor of this change. |
Ito World votes in favor of this change. |
Bird votes in favor of this change. |
Stae votes in favor of this change. |
Transit votes in favor of this change. |
We'd love to make this an official part of the spec, but first we need to see this change being implemented. Could you comment here if your organization has implemented this? @MuteQ @contra @bhandzo @rickbruce @charlesjump @kanagy @quicklywilliam Others? |
@heidiguenin Bird has implemented rotating IDs |
I opened a new issue #237 to discuss options for maintaining the existing GBFS use cases that may have been broken by bike_id rotation. |
GBFS now specifies that IDs should change after each trip, for privacy reasons. See here: MobilityData/gbfs#147
This PR changes the description of bike_id to a random ID, rotated after each trip. This is a non-breaking change, as some providers already rotate to a new random ID daily.