autocomplete attribute for street-address details #4986

Open
battre opened this issue Oct 8, 2019 · 50 comments
Labels
accessibility (Affects accessibility); addition/proposal (New features or enhancements); i18n-tracker (Group bringing to attention of Internationalization, or tracked by i18n but not needing response); needs implementer interest (Moving the issue forward requires implementers to express interest); topic: forms

Comments

@battre

battre commented Oct 8, 2019

Currently the HTML Standard seems a bit US centric:

The autocomplete attribute supports the street-address type for an unstructured street address or address-line1, address-line2 and address-line3 for the respective lines.

In Germany and several other countries, websites typically ask for the street name and a house number. In Spain, websites ask for the floor number and door number (I am not a Spanish speaker, I have seen "Num.", "Piso", "Letra", "Esc." on simyo.es for example). Some Brazilian sites ask for a neighborhood (which we might express with an address-levelX but that might be a bit underspecified because I am not aware of a canonical mapping of entities in specific countries).

Should we start a discussion on defining more entities around addresses? I think that just by supporting a street-name, house-number and apartment-number, we could cover a lot of ground.

@annevk annevk added addition/proposal New features or enhancements topic: forms labels Oct 8, 2019
@annevk
Member

annevk commented Oct 8, 2019

cc @mnoorenberghe

@carstenhag

carstenhag commented Oct 13, 2019

Chrome seems to somehow fill it correctly sometimes, see https://i.imgur.com/QRNxfXL.png (segmueller.de).

This is very rare though, and probably depends on the IDs given. Usually (I'd say with 90% probability), the "street" field of a form gets filled with both the street name and the house number.
The house number field is then empty and I have to edit both fields manually, which is unfortunate and could be fixed via the spec.

@annevk / maintainers: Do you need any example websites to show/clarify the issue?

@sideshowbarker
Contributor

sideshowbarker commented Oct 14, 2019

In Japan, we don’t even have street addresses. Instead, the equivalent is a series of numbers that starts with a district number, and then a block number, and then what’s nominally a specific building number.

I say “nominally” because it’s common for several buildings to actually have the same “building number” (due, e.g., to cases where an older, larger building on a piece of land gets demolished and replaced with two or more smaller buildings on the same land).

So in Japan, in addition to a “building number”, it’s often also necessary to specify a building name.

When the autocomplete attribute was being defined, I remember we specifically discussed the Japanese-addressing case — as well as other locales/cases with addressing schemes that don’t use street names or that need something more specific than a street address — and (as far as I recall), the relevant set of tokens now in the spec were decided on because trying to define a richer set of tokens to reflect all the possible cases/locales would have ended up with a set that was just too unwieldy in practice.

@annevk annevk added the needs implementer interest Moving the issue forward requires implementers to express interest label Oct 14, 2019
@annevk
Member

annevk commented Oct 14, 2019

@carstenhag I think we mainly need to hear from implementers to what extent they are interested in putting work into this as this would also require UI that's aware that if you change the country the address format changes. (E.g., Firefox only supports a single address field at the moment as far as I can tell.)

@battre
Author

battre commented Oct 14, 2019

Thank you for sharing the historic context. I appreciate that there is a lot of complexity in international addresses that I am not aware of.

I wonder whether we can still come up with a practical solution that addresses the problem that websites put street and house number into separate fields. We have looked at a few hundred websites in Brazil, Spain, France, Germany and India (focused on non-US websites; acknowledging that this is by no means a representative sample) and noticed this challenge quite frequently in all of these countries but India.

Assuming that websites do not change their UI to support the autocomplete attribute, I wonder how we feel about adding special cases for "a lot but not all countries". I would propose that addresses are too complex to cover all countries with all nuances so we can only get closer to "correct" but never reach it 100%.

Another syntactic idea I had was to go with sub-attributes. Something like
address-line1[street-name] and address-line1[house-number]. Would you feel different about this?
This could also be used for #4987 as a format specifier cc-exp[mm/yy] to express an expiration in format "MM/YY".
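
To make the bracket idea concrete, here is a minimal sketch of how such a token could be parsed. The syntax and the qualifier names (`street-name`, `house-number`, `mm/yy`) are only the proposal floated above, not part of the HTML Standard, and `parseAutocompleteToken` is a hypothetical helper.

```typescript
// Hypothetical: parses the bracket syntax floated in this comment.
// Neither the syntax nor the qualifier names exist in the HTML Standard.
interface ParsedToken {
  base: string;             // e.g. "address-line1" or "cc-exp"
  qualifier: string | null; // e.g. "street-name", "mm/yy", or null if absent
}

function parseAutocompleteToken(token: string): ParsedToken | null {
  // Base: lowercase letters, digits and hyphens, optionally followed by
  // exactly one "[qualifier]" suffix.
  const match = /^([a-z0-9-]+)(?:\[([^\[\]]+)\])?$/.exec(token.trim().toLowerCase());
  if (match === null) return null; // malformed, e.g. an unclosed bracket
  return { base: match[1], qualifier: match[2] ?? null };
}
```

A token without brackets, such as `street-address`, parses with a `null` qualifier, so existing values would keep working unchanged under this sketch.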

@annevk Regarding implementer interest: I am working for Google on Chrome.

@battre
Author

battre commented Oct 14, 2019

@rmondello Do you know anybody who works in this space on WebKit?

@annevk annevk added the i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. label Oct 14, 2019
@annevk
Member

annevk commented Oct 14, 2019

cc @whatwg/i18n

@xfq
Contributor

xfq commented Oct 14, 2019

FWIW, this may also affect the purpose attribute in APA's Personalization work.

@annevk annevk added the accessibility Affects accessibility label Oct 14, 2019
@annevk
Member

annevk commented Oct 14, 2019

cc @whatwg/a11y per above comment. Does make me wonder how many attributes we need to state the same thing.

@littledan
Contributor

@rxaviers and I were chatting about internationalized address formatting. I wonder if the JavaScript Intl API could help by giving data about the required fields for a locale. @sffc

@aardrian

Given the role autocomplete has in WCAG SC 1.3.5: Identify Input Purpose, I am a fan of any new values that are more internationalized and can benefit more users.

From the user perspective, as long as browsers support it with auto-fill values, then all good.

From the dev perspective, to @sideshowbarker's comment above, too many options and we can expect many devs to use the wrong ones, blunting the accessibility benefit.

@aphillips
Contributor

Postal addresses are famously complicated to internationalize, due to the wide variation in formats between (and sometimes within) countries. In addition, there can be differences in what application authors prefer (in terms of the level of granularity they wish to store/process/validate).

There are several ways this might be addressed. On the one hand, we could try to enumerate all postal address components globally such that page authors can always specify the component they mean--door number, prefecture, administrative unit, floor number, etc. As noted by @xfq and @aardrian this could be of use to assistive technologies. On the other hand, we might try to address only specific problems with additions (at a disadvantage to users in countries that need something different).

A key thing to notice is that the language/locale of the page is not the same thing as the country of an address and the form used to collect a specific address needs to be tied to the country it ships to.

@littledan This has less to do with locale than it does with country/region (although there is some locale influence). I would suggest that postal address parts and their regional association be suggested to CLDR as an addition. Intl could then leverage this.

I'll bring this up at W3C I18N WG's next telecon, but we're off this week due to IUC43.

@patrickhlauke
Member

speaking of WCAG's repurposing (perversion) of the attribute...there are certain situations where an input accepts two different types of information (such as "Username or email address") that currently can't be expressed I think? would it be a massive complication allowing these sorts of multiple values to be used (in order of preference, perhaps)?

@annevk
Member

annevk commented Oct 15, 2019

@patrickhlauke see #4445.

@domenic
Member

domenic commented Oct 15, 2019

Some quick comments here, although it looks like there's a lot of good discussion on the key points already...

> Another syntactic idea I had was to go with sub-attributes. Something like
> address-line1[street-name] and address-line1[house-number]. Would you feel different about this?

FWIW I would prefer address-line1-street-name and address-line1-house-number.

> From the user perspective, as long as browsers support it with auto-fill values, then all good.

Indeed. I think that's the key point, is whether there are actors in the ecosystem who would actually leverage this. It sounds like Chrome might take the steps @annevk describes (i.e., change their "user information" UI to have separate street-name/house-number fields when you're in the given country, so that it can successfully autofill later).

Given the precedent in #3745 (comment), we allow fairly liberal implementer support signals for expanding the autofill vocabulary. So I think the main goal of the discussion here is to gather input from everyone (as this thread has been doing) to make sure the design is reasonable, even if only one browser has immediate plans to do the UI work necessary.

I think this also speaks to the question of how much work we want to do in creating more autofill tokens for more classes of addresses, e.g. Japanese addresses. The answer, IMO, is as much work as the ecosystem wants to put in. Not just browsers, either. E.g. if extensions, or AT vendors, or similar communicated that they could serve a good number of users by introducing such new autofill tokens, then I think the spec should be a reasonable clearinghouse for coordinating those efforts, and for having design discussions among multiple parties to shake out any cross-cutting issues.

@mnoorenberghe

Please excuse my rambling…

Do we have any evidence that the sites that currently use separate fields for these values actually require them to be separate fields? Or is it just a local preference?

> @carstenhag I think we mainly need to hear from implementers to what extent they are interested in putting work into this as this would also require UI that's aware that if you change the country the address format changes. (E.g., Firefox only supports a single address field at the moment as far as I can tell.)

We show a single address-line textarea in preferences but can split it into up to 3 lines when filling. Likewise for the phone number field, we show one box but can split into many different combinations of <input> when necessary due to all the different tel-* tokens (we don't support all of them).
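
As a rough illustration of that kind of split, here is a minimal sketch that turns one multi-line `street-address` value into at most three `address-lineN` values. It is not Firefox's actual algorithm; `splitStreetAddress` and the overflow rule (join the extra lines with a comma) are assumptions made for the example.

```typescript
// Sketch only: splits a stored multi-line street address into at most
// `maxLines` lines (address-line1..address-line3 by default).
function splitStreetAddress(streetAddress: string, maxLines = 3): string[] {
  const lines = streetAddress
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  if (lines.length <= maxLines) return lines;
  // Assumption: fold overflow into the last available line instead of dropping it.
  const head = lines.slice(0, maxLines - 1);
  const tail = lines.slice(maxLines - 1).join(", ");
  return [...head, tail];
}
```

Going from one textarea to several lines is the easy direction, because the line breaks are already stored with the address.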

> Another syntactic idea I had was to go with sub-attributes. Something like
> address-line1[street-name] and address-line1[house-number]. Would you feel different about this?

> FWIW I would prefer address-line1-street-name and address-line1-house-number.

I'm guessing the reason for the square bracket syntax was to allow it to work with address-lineN with N from 1 to 3. Would you want to instead add all 6 tokens since I don't think these components are always on line 1?

> Indeed. I think that's the key point, is whether there are actors in the ecosystem who would actually leverage this. It sounds like Chrome might take the steps @annevk describes (i.e., change their "user information" UI to have separate street-name/house-number fields when you're in the given country, so that it can successfully autofill later).

The UI change is the easiest part to deal with… the harder part is being able to convert in both directions between address-lineN and the subdivisions being proposed. We dealt with a similar problem with less complexity for the different tel-* tokens and it wasn't nice… I think we still don't support all of the tel-* tokens as a result of the complexity. The reason you need to convert in both directions is that the user could have first saved that address-line as one field but needs to autofill in the separate components (and vice versa). I think it will be very hard for UAs to handle that without creating duplicates but other than somehow pushing sites to move away from these fields (unless of course they need them for shipping calculations or something like that), I don't see how we can nicely solve this problem. I'm skeptical that Intl APIs could even be defined to do the transformations in both directions.
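
To show why the two directions are not symmetric, here is a minimal sketch of both conversions for the street-name/house-number case. All names (`StreetParts`, `joinStreet`, `splitStreet`) and the house-number heuristic (digits plus an optional single-letter suffix) are assumptions for illustration, not any UA's implementation.

```typescript
interface StreetParts { streetName: string; houseNumber: string }
type Order = "street-first" | "number-first";

// Components -> single line: always possible once the order is known.
function joinStreet(parts: StreetParts, order: Order): string {
  return order === "street-first"
    ? `${parts.streetName} ${parts.houseNumber}`
    : `${parts.houseNumber} ${parts.streetName}`;
}

// Single line -> components: a heuristic that can fail and returns null then.
function splitStreet(line: string, order: Order): StreetParts | null {
  const text = line.trim();
  // Assumed house-number shape: digits plus optional letter, e.g. "12" or "12a".
  const pattern = order === "street-first"
    ? /^(.+?)\s+(\d+\s?[a-zA-Z]?)$/
    : /^(\d+\s?[a-zA-Z]?)\s+(.+)$/;
  const m = pattern.exec(text);
  if (m === null) return null;
  return order === "street-first"
    ? { streetName: m[1], houseNumber: m[2] }
    : { streetName: m[2], houseNumber: m[1] };
}
```

With this heuristic, "Hauptstraße 12a" splits and joins back to the same string, but a range such as "Hauptstr. 12-14" is not recognized at all. A UA would then have to keep the unsplit value, which is where duplicates can creep in.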

My main concern with this proposal is that it could make the use of these narrower fields more popular in the future even if not all UAs can handle them properly. Can we add them to the spec but mark it deprecated from the beginning? :P That would be similar to how we have https://compat.spec.whatwg.org/ where we standardize the web as it is, not as we want it (I realize this isn't a perfect analogy since sites probably aren't using these specific tokens now).

The spec already has the following text which is relevant but I'm not sure most people would notice it:

> Generally, authors are encouraged to use the broader fields rather than the narrower fields, as the narrower fields tend to expose Western biases. For example, while it is common in some Western cultures to have a given name and a family name, in that order (and thus often referred to as a first name and a surname), many cultures put the family name first and the given name second, and many others simply have one name (a mononym). Having a single field is therefore more flexible.

Maybe we should expand on that and explicitly annotate which autocomplete tokens should be avoided in favour of broader ones? Also, in this case I don't think "Western biases" is applicable.

I'm very interested in hearing how other UAs plan to handle the bidirectional data transformation issue… I'm also interested in hearing arguments for why we should codify this pattern rather than leave it up to UA heuristics to figure out (which is the status quo). Why do we want to pave this cow path rather than encourage change?

P.S. I wonder if authors ever handle this by listening for insertReplacementText input events on the address-lineN field and moving the appropriate sub-components to their own fields…
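
For what it's worth, such a handler could look roughly like the sketch below. It assumes the browser reports autofill as an input event with inputType "insertReplacementText", as suggested above; the types `FieldLike` and `InputEventLike` are minimal stand-ins for the DOM types, and `redistributeOnAutofill` and the `split` callback are hypothetical.

```typescript
// Minimal structural types so the sketch does not depend on DOM typings.
interface FieldLike { value: string }
interface InputEventLike { inputType: string; target: FieldLike }
type Splitter = (line: string) => { streetName: string; houseNumber: string } | null;

// Moves autofilled address-line content into separate fields.
// Returns true only if the fields were updated.
function redistributeOnAutofill(
  event: InputEventLike,
  streetField: FieldLike,
  houseNumberField: FieldLike,
  split: Splitter,
): boolean {
  // Assumption: autofill is reported as "insertReplacementText"; typing is not.
  if (event.inputType !== "insertReplacementText") return false;
  const parts = split(event.target.value);
  if (parts === null) return false; // could not split, leave the data untouched
  streetField.value = parts.streetName;
  houseNumberField.value = parts.houseNumber;
  return true;
}
```

In a page this would be called from an "input" listener on the address-line field, passing the two narrower inputs and whatever splitting heuristic the site trusts.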

@JohJohan

JohJohan commented Dec 24, 2020

What is the status of this ticket?

We as a company use autofill options for our forms. We use the postcode combined with the house number (and an optional house number extension) to autocomplete the street and city data. This is for Dutch websites. The service we use for autocompleting is https://pro6pp.nl/en.

I would like separate attributes for house number and house number extension.

@nablex

nablex commented Aug 3, 2021

It's an old and closed thread at this point but I was researching specifically address autocomplete options and came across this thread. I've also checked the linked thread and googled some more, but can't really find a better place to add my 2 cents, so I decided to add it here.

International addresses are complex in that each country or region requires their own details. However, the people living in these countries and regions generally know and expect these details.

By keeping the autocomplete options "generic", you do support the use-case where a single form has to serve a wide array of people, but you don't support the use-case where you specifically want your forms to match the expected input values for a region.

This could be to localize your form and thus appeal to the target population, it could be because your employer only works in that region and expects it to work that way or it could be because you want to validate the address according to region-specific business rules and you need structured input (as defined by that region) to do that.

We mustn't forget that forms have existed for many years and region-specific details have always existed and were never an issue, but now you are basically giving developers a choice:

  • adapt your form to a more generic version
  • skip autocomplete altogether

Neither is really the point of a specification like this.
Because I can't express what I need with the specification, I'm relegated to trying to "game the system" and check browser specific implementations.

In our neck of the woods we generally ask for the street name and the street number in separate input fields. In the case of an apartment building or the like, we also need a further specification as the street number only indicates the building as a whole. The additional bit (like the floor number) can sometimes be added to the number field or it can be asked in a separate third field.

@battre
Author

battre commented Jan 28, 2022

We (Chrome) have reached a point where we would love to invest in extending the autocomplete attribute to better reflect the regional ways of requesting address information.

I have written up an explainer document in a personal github repository: https://github.com/battre/autocomplete-attribute-explainer/tree/main

The core points are:

  • We would like to invest in a more inclusive way of specifying address structures that meets the requirements in countries that are currently not well served by the specification.
  • We now have some data that indicates the frequency of more structured address representations. We looked into street name and house number fields (so this is a non-exhaustive analysis) and found that these are actually quite common in many countries (e.g. in Germany we estimate that roughly 1/3 of address forms contain a street name field, in Brazil we estimate that roughly half of address forms contain a house number field). Overall, the data looks compelling to us to implement better regional support.
  • I have made a proposal for integrating more structured addresses into the autocomplete architecture. The proposal would be to ask the site author to either rely on street-address/address-lineX or on the new more fine-grained field types.
  • I have made a couple of proposals for which field types to add, based on what we have observed in the wild.
  • I believe that there are reasonable ways for browsers to ask the user for structured and unstructured representations for their addresses.

I'd love to hear your feedback:

  • Is this the right thing to build?
  • Is the presentation reasonable?
  • Is this now sufficiently inclusive? What are the gaps? I expect that my view is still biased even though we have made an effort to look into several countries.

@battre battre closed this as completed Jan 28, 2022
@battre battre reopened this Jan 28, 2022
@mfreed7 mfreed7 added the agenda+ To be discussed at a triage meeting label Jan 31, 2022
@zcorpan
Member

zcorpan commented Aug 11, 2022

@mfreed7 any update here? :)

> Mason will ensure the updated proposal is posted in the issue.

#7981 (comment)

@battre
Author

battre commented Aug 11, 2022

I can share a brief update: This is not forgotten. I am currently taking another pass on top ecommerce sites (as per ecommercedb.com) for a number of countries. After that, I will update the explainer doc with the results and draft a pull request to the spec.

@battre
Author

battre commented Sep 28, 2022

Here is another short update: I have looked at the address form structures of real world websites in 23 countries:
https://github.com/battre/autocomplete-attribute-explainer/blob/main/country_analysis.md

I found this very enlightening but plan to re-do some of the work to convert this into a more machine processable format. I need to wrap my mind more around the question whether we can build an all encompassing solution or whether we need to build country specific solutions.

One observation/confirmation is that the autocomplete attribute does not work well for many countries.

Another observation is that there are a lot of snowflakes. E.g. in Hungary many websites split the name of a street (e.g. "Amphitheatre") from the type of the street ("Parkway", "Avenue", ...) into separate fields. If this is what delivery services ask for, I can see why websites implement it. But in this case today's autocomplete won't work. I am also convinced that this is a feature that we don't want to offer users in all countries.

I am currently thinking of having a machine readable catalog of address structures in different countries that could be shared across browsers. This catalog could contain a list of fields that are commonly used in different countries (this allows browsers to render address forms that meet the structures of countries and are more fine grained than what libaddressinput offers). The catalog could also contain rules for combining fields (in some countries you write "${street name} ${house number}" and in others "${house number} ${street name}"). If browsers agree on a set of rules, we could tell website authors which combinations can be filled on their site and which ones cannot.
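
A minimal sketch of what one slice of such a catalog could look like, limited to the street-line combination rule mentioned above. The two entries, the `TEMPLATES` map and `formatStreetLine` are illustrative assumptions; nothing here is taken from libaddressinput or from any browser.

```typescript
type FieldName = "streetName" | "houseNumber";

// Illustrative catalog entries only: country code -> street line template.
const TEMPLATES = new Map<string, string>([
  ["DE", "${streetName} ${houseNumber}"],
  ["US", "${houseNumber} ${streetName}"],
]);

// Combines the fine-grained fields into one street line for the given country.
// Returns null when the catalog has no rule for that country.
function formatStreetLine(country: string, values: Record<FieldName, string>): string | null {
  const template = TEMPLATES.get(country);
  if (template === undefined) return null;
  return template.replace(/\$\{(\w+)\}/g, (_match: string, name: string) =>
    values[name as FieldName] ?? "");
}
```

The same table could in principle also drive the preferences UI and tell authors which field combinations are fillable, but that part is not sketched here.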

Anyways... I am not dead and still exploring....

@mfreed7 mfreed7 added the agenda+ To be discussed at a triage meeting label Oct 3, 2022
@mfreed7
Contributor

mfreed7 commented Oct 3, 2022

I've added Agenda+ for this - we'd like to bring it back for a bit more discussion. @past, could we please get this on the agenda for the Oct 27 meeting (skipping the Oct 13 meeting)?

@past past removed the agenda+ To be discussed at a triage meeting label Oct 27, 2022
@mfreed7 mfreed7 added the agenda+ To be discussed at a triage meeting label Nov 9, 2022
@mfreed7
Contributor

mfreed7 commented Nov 9, 2022

Agenda+ to just gently ping the request for implementer points of contact to participate in a half-day autocomplete attribute workshop.

@smaug----

@galich

@past past removed the agenda+ To be discussed at a triage meeting label Nov 11, 2022
@annevk
Member

annevk commented Nov 14, 2022

In principle we're interested in this. Please keep me in the loop. I'll also file some issues on the repository.

@battre
Author

battre commented Nov 14, 2022

Great! For planning purposes: Would you be interested in joining a 1/2 day workshop to discuss this in person and go a bit deeper?

@annevk
Member

annevk commented Nov 15, 2022

I may or may not be the person that participates, but I'm the person you want to keep in the loop as you go about organizing it. 😊

@mathias22osterhagen22

mathias22osterhagen22 commented Nov 28, 2022

@mnoorenberghe

> Do we have any evidence that the sites that currently use separate fields for these values actually require them to be separate fields? Or is it just a local preference?

Yes. For example, some third-party services (like sumsub) require the street number to be sent separately. For banking purposes as well, it is better to keep each part of the address distinct to make screening processes easier.

@romane-echino

If I can add something: one reason why street-name and house-number should be separated, and why end users should slowly get into that habit, is the new international payment standard ISO 20022.

That standard has been approved in Europe and is currently in production in Switzerland. One part of the new standard is the QR-Invoice (only in Switzerland at the moment), which contains all payment information inside a 2D QR code. In the documentation, the house-number is on a separate line from the street-name. It was a lot of pain for invoicing software vendors (like us) to translate every address field into two fields (regexp doesn't cover 100% of the cases, since people write addresses in so many ways...).

@battre
Author

battre commented Dec 21, 2022

Thanks for sharing! Are there any other fields that ISO 20022 wants to treat specially? Apartment/Unit numbers? Floors? Staircases? Entrances? How do they deal with countries that rely on building names?

@battre
Author

battre commented Dec 21, 2022

Is it this?

    <xs:complexType name="PostalAddress24">
        <xs:sequence>
            <xs:element maxOccurs="1" minOccurs="0" name="AdrTp" type="AddressType3Choice"/>
            <xs:element maxOccurs="1" minOccurs="0" name="Dept" type="Max70Text"/>
            <xs:element maxOccurs="1" minOccurs="0" name="SubDept" type="Max70Text"/>
            <xs:element maxOccurs="1" minOccurs="0" name="StrtNm" type="Max70Text"/>
            <xs:element maxOccurs="1" minOccurs="0" name="BldgNb" type="Max16Text"/>
            <xs:element maxOccurs="1" minOccurs="0" name="BldgNm" type="Max35Text"/>
            <xs:element maxOccurs="1" minOccurs="0" name="Flr" type="Max70Text"/>
            <xs:element maxOccurs="1" minOccurs="0" name="PstBx" type="Max16Text"/>
            <xs:element maxOccurs="1" minOccurs="0" name="Room" type="Max70Text"/>
            <xs:element maxOccurs="1" minOccurs="0" name="PstCd" type="Max16Text"/>
            <xs:element maxOccurs="1" minOccurs="0" name="TwnNm" type="Max35Text"/>
            <xs:element maxOccurs="1" minOccurs="0" name="TwnLctnNm" type="Max35Text"/>
            <xs:element maxOccurs="1" minOccurs="0" name="DstrctNm" type="Max35Text"/>
            <xs:element maxOccurs="1" minOccurs="0" name="CtrySubDvsn" type="Max35Text"/>
            <xs:element maxOccurs="1" minOccurs="0" name="Ctry" type="CountryCode"/>
            <xs:element maxOccurs="7" minOccurs="0" name="AdrLine" type="Max70Text"/>
        </xs:sequence>
    </xs:complexType>

(Found on https://www.iso20022.org/iso-20022-message-definitions?business-domain=1 in "Payments Mandates")
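
For readers who don't work with XSD, here is a minimal sketch of the same structure as a TypeScript type with a length check. It assumes that `MaxNText` means "at most N characters", which is how the type names read; `AdrTp` and the `CountryCode` pattern are not validated, and `validatePostalAddress` is a hypothetical helper.

```typescript
// Maximum lengths copied from the PostalAddress24 excerpt above
// (assumption: MaxNText means a string of at most N characters).
const MAX_LENGTHS = {
  Dept: 70, SubDept: 70, StrtNm: 70, BldgNb: 16, BldgNm: 35, Flr: 70, PstBx: 16,
  Room: 70, PstCd: 16, TwnNm: 35, TwnLctnNm: 35, DstrctNm: 35, CtrySubDvsn: 35,
} as const;

type TextField = keyof typeof MAX_LENGTHS;
// Every element is optional (minOccurs="0"); AdrLine may occur up to 7 times.
type PostalAddress = Partial<Record<TextField, string>> & { Ctry?: string; AdrLine?: string[] };

// Returns a list of human-readable problems; an empty list means no length issue.
function validatePostalAddress(addr: PostalAddress): string[] {
  const errors: string[] = [];
  for (const field of Object.keys(MAX_LENGTHS) as TextField[]) {
    const value = addr[field];
    // Note: .length counts UTF-16 code units, which is only an approximation.
    if (value !== undefined && value.length > MAX_LENGTHS[field]) {
      errors.push(`${field} exceeds ${MAX_LENGTHS[field]} characters`);
    }
  }
  const lines = addr.AdrLine ?? [];
  if (lines.length > 7) errors.push("AdrLine occurs more than 7 times");
  lines.forEach((line, i) => {
    if (line.length > 70) errors.push(`AdrLine[${i}] exceeds 70 characters`);
  });
  return errors;
}
```

Note that the schema offers both the structured elements (`StrtNm`, `BldgNb`, `Flr`, `Room`, ...) and free-form `AdrLine` entries, which is the same structured/unstructured split discussed in this issue.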

@romane-echino

@battre Yep that's exactly it! How fast 😄
I'm not an expert in that new standard so... for now... I don't see any other major problem. But as usual the future is gonna prove me wrong 😄

Here is some documentation derived from the ISO 20022 standard for Switzerland:

| Element | Example: Structured | Example: Combined | Remarks |
|---|---|---|---|
| Address type | “S” | “K” | “S” - Structured address; “K” - Combined address |
| Name | Pia-Maria RutschmannSchnyder | Pia-Maria RutschmannSchnyder | |
| Street or address line 1 | Grosse Marktgasse | Grosse Marktgasse 28 | “S” - Street/P.O. Box; “K” - Street and building number of P.O. Box |
| Building number or address line 2 | 28 | 9400 Rorschach | “S” - Building number; “K” - Postal code and town |
| Postal code | 9400 | | “S” - Postal code; “K” - Do not fill in |
| Town | Rorschach | | “S” - Town; “K” - Do not fill in |
| Country | CH | CH | |

Found on https://www.paymentstandards.ch/dam/downloads/ig-qr-bill-en.pdf in "4.3.1 Use of address information"

Some of our banks only support the structured address standard, and because of that the address has to be separated.
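
The easy direction of the table above, from a structured ("S") address to a combined ("K") one, can be sketched as follows. The interface and function names are assumptions made for this example; only the field mapping comes from the table.

```typescript
interface StructuredAddress {
  type: "S";
  name: string;
  street: string;         // "Street or address line 1"
  buildingNumber: string; // "Building number or address line 2"
  postalCode: string;
  town: string;
  country: string;
}

interface CombinedAddress {
  type: "K";
  name: string;
  addressLine1: string; // street and building number
  addressLine2: string; // postal code and town
  country: string;
}

// "S" -> "K": concatenate fields; postal code and town stay empty in "K".
function toCombined(a: StructuredAddress): CombinedAddress {
  return {
    type: "K",
    name: a.name,
    addressLine1: [a.street, a.buildingNumber].filter((p) => p !== "").join(" "),
    addressLine2: `${a.postalCode} ${a.town}`,
    country: a.country,
  };
}
```

The reverse direction ("K" to "S") is the painful one, because it has to guess where the street ends and the building number begins.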

@battre
Author

battre commented Dec 22, 2022

I am organizing a virtual workshop on the topic of improving the autocomplete attribute.

Date: January 17, 2023
Time: 7 AM - 9 AM PT (US West Coast), 4 PM - 6 PM GMT+1 (Germany), 9 PM - 11 PM GMT+5:30 (India)

The purpose of this workshop is to

  • build a mutual understanding of the problem space
  • align on a strategy and determine stakeholders who are interested in contributing to the execution or serve as advisors.

It should serve as a foundation for creating a workstream on this topic afterwards.

If anybody would like to be invited, please email me at battre@google.com with a one sentence summary of your affiliation and why you are interested in joining.

@battre
Author

battre commented Feb 2, 2023

A big thank you to all the participants.

The meeting notes can be found here.

If you would like to contribute to this topic in the next weeks, please register here. I would like to set up a bi-weekly meeting to make progress in this space.

@r12a

r12a commented Feb 3, 2023

FWIW [ In the workshop report international name changes were mentioned a few times. I got the impression that that is not a focus of the work here, so i mention this only in passing – but since it came up a few times i thought i should note that the issues with capturing, storing, and presenting personal names are much bigger than the problem with Hispanic family names. If you want a glimpse of some additional aspects to consider see Personal names around the world. ]

@momdo
Contributor

momdo commented Feb 4, 2023

FYI:

While @himorin refers to Japanese government documents (#4986 (comment)), GIF DS-442 (Government Interoperability Framework - 442 "コアデータパーツ 住所(アドレス)") is available on GitHub from the Digital Agency.
https://github.com/JDA-DM/GIF/blob/main/440_%E3%82%B3%E3%82%A2%E3%83%87%E3%83%BC%E3%82%BF%E3%83%91%E3%83%BC%E3%83%84/md/442_core_dataparts_address.md

This diagram provides an overview of the Japanese address structure:
https://github.com/JDA-DM/GIF/blob/main/440_%E3%82%B3%E3%82%A2%E3%83%87%E3%83%BC%E3%82%BF%E3%83%91%E3%83%BC%E3%83%84/md/442/media/image1.png

@battre
Author

battre commented May 11, 2023

I have proposed some fundamental principles for how we could move forward for this here: https://github.com/battre/autocomplete-attribute-explainer/

I would appreciate feedback (either here or in the issues I have created for the individual sections).
