The basic flow of the program is as follows:

1. The user types an address and selects a match from the autocomplete list.
2. The selected address is converted to latitude/longitude, which is used to fetch Google Street View static images.
3. The images are analyzed by the Google Vision API, which returns feature labels and corresponding confidence scores.
4. The labels and scores are parsed and compiled into a single string.
5. That string is passed to GPT, which returns a description of the area's scenery based on the labels and scores.
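The image-fetching step above can be sketched as follows. The Street View Static API endpoint and its `size`/`location`/`heading`/`key` parameters are real; the function name, the four-heading loop, and the placeholder key are illustrative assumptions.

```python
from urllib.parse import urlencode

STREETVIEW_ENDPOINT = "https://maps.googleapis.com/maps/api/streetview"

def build_streetview_url(lat: float, lng: float, api_key: str,
                         size: str = "640x640", heading: int = 0) -> str:
    """Build a Street View Static API request URL for one camera heading."""
    params = {
        "size": size,                # requested image dimensions
        "location": f"{lat},{lng}",  # lat/long geocoded from the selected address
        "heading": heading,          # compass direction of the camera
        "key": api_key,              # placeholder; supply a real API key
    }
    return f"{STREETVIEW_ENDPOINT}?{urlencode(params)}"

# Fetching four headings gives a rough panorama of the surroundings:
urls = [build_streetview_url(40.748817, -73.985428, "YOUR_API_KEY", heading=h)
        for h in (0, 90, 180, 270)]
```

Each URL can then be fetched with any HTTP client, and the image bytes forwarded to the Vision API.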
This program is meant to be paired with an accessibility tool such as VoiceOver for Mac. VoiceOver reads on-screen text aloud based on the user's current focus. In this program, its main role is to read the description generated by GPT, bringing a partial Google Street View experience to visually impaired users.
- Type an address into the search bar, or use the mic to enter it via speech-to-text.
- Select the address from the autocomplete suggestions.
- Read the GPT-generated description printed in the text box.
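The autocomplete step can be sketched as below. The Places Autocomplete endpoint and its `input`/`key` parameters are real; the function names and the response-parsing helper are assumptions based on the documented `predictions`/`description` response shape.

```python
from urllib.parse import urlencode

AUTOCOMPLETE_ENDPOINT = "https://maps.googleapis.com/maps/api/place/autocomplete/json"

def build_autocomplete_url(text: str, api_key: str) -> str:
    """Build a Places Autocomplete request for the text typed into the search bar."""
    return f"{AUTOCOMPLETE_ENDPOINT}?{urlencode({'input': text, 'key': api_key})}"

def suggestion_texts(response_json: dict) -> list[str]:
    """Pull the human-readable address strings out of an Autocomplete response."""
    return [p["description"] for p in response_json.get("predictions", [])]
```

The returned descriptions are what the user picks from; the chosen suggestion is then geocoded to lat/long for the Street View request.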
- Google Street View Static API
- Google Places API
- Google Cloud Vision API
- OpenAI Completions API
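The label-compilation step (Vision output to GPT input) can be sketched as follows. The `description`/`score` fields match the Cloud Vision `labelAnnotations` response shape; the exact prompt wording and function names are illustrative assumptions.

```python
def compile_labels(label_annotations: list[dict]) -> str:
    """Flatten Vision label results into the single string passed to GPT.

    Each annotation is expected to carry a 'description' and a 0-1 'score',
    matching the Cloud Vision labelAnnotations shape.
    """
    return ", ".join(f"{a['description']} ({a['score']:.2f})"
                     for a in label_annotations)

def build_prompt(labels_string: str) -> str:
    """Wrap the compiled label string in an instruction for the Completions API."""
    return ("Describe the scenery of this location for a visually impaired "
            f"listener, based on these image labels and confidence scores: "
            f"{labels_string}")

labels = [{"description": "Tree", "score": 0.97},
          {"description": "Sidewalk", "score": 0.88}]
prompt = build_prompt(compile_labels(labels))
```

The resulting prompt is sent to the OpenAI Completions API, and the response text is printed in the text box for VoiceOver to read aloud.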