The trickiest changes are in language/display where the dropping
of nl-BE as Vlaams in Dutch caused some issues. This will
probably be changed back as I think this change is erroneous, but
as we’re not into the politics of it, we adopt the CLDR changes.
Change-Id: I6c593960778cd4c5b9f0553340b72a1410166506
Reviewed-on: https://go-review.googlesource.com/30910
Run-TryBot: Marcel van Lohuizen <mpvl@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
- Match should not allocate
- Prevent allocation by not including a string in an invalid Extension.
Change-Id: Icfe091121d99945b70084214ae9e76c79d6ec5b8
Reviewed-on: https://go-review.googlesource.com/30271
Run-TryBot: Marcel van Lohuizen <mpvl@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
At least until escape analysis is improved for gccgo.
Fixes golang/go#16135
Change-Id: I6c98d6650ca4be6d8d5eb704764df297b3f4dace
Reviewed-on: https://go-review.googlesource.com/24495
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Run-TryBot: Marcel van Lohuizen <mpvl@golang.org>
Example is now closer to actual usage.
Change-Id: Ib0d4c3991c78f9a270a5b2916414b78a884b465d
Reviewed-on: https://go-review.googlesource.com/19954
Reviewed-by: Nigel Tao <nigeltao@golang.org>
The CLDR plural data uses a different structure for marking locales
which was not picked up by the previous code.
More specifically, for most types of data, data is split in a
different file per locale. This gets picked up by the cldr package.
The plural data, instead, has collections of rules and then lists the
locales for which the rules apply. This is a special case within
CLDR (albeit a very handy one) and thus needs to be handled separately.
This CL adds these locales to compact base languages and compact
language tags.
Change-Id: I1654501024f4d978361431594c72ee32d8c3ab01
Reviewed-on: https://go-review.googlesource.com/19337
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Allow code handler to generate structs as well. Also now include
positional indexes for easier debugging and compacting output
for sparse slices.
Change-Id: I54ff6b64931d1fdfdd1cfc9ac520c2204e4ffd4d
Reviewed-on: https://go-review.googlesource.com/19191
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Apparently the sorting algorithm changed, causing the matchLang entries to be sorted differently. This doesn't change the validity, but is annoying for diffs.
Changed to stable sort.
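A sketch of why a stable sort helps here, assuming entries are ordered by a
key that can tie. The matchEntry type and the sort key are hypothetical; the
real generator may sort on something else.

```go
package main

import (
	"fmt"
	"sort"
)

// matchEntry is a hypothetical stand-in for a generated match table row.
type matchEntry struct {
	want, have string
	distance   int
}

// sortByDistance orders entries by distance only. Because the sort is
// stable, entries with equal distance keep their input order, so the
// generated table does not depend on the sort implementation.
func sortByDistance(entries []matchEntry) {
	sort.SliceStable(entries, func(i, j int) bool {
		return entries[i].distance < entries[j].distance
	})
}

func main() {
	entries := []matchEntry{
		{"no", "nb", 1},
		{"hr", "bs", 0},
		{"id", "ms", 1},
		{"sr", "hr", 0},
	}
	sortByDistance(entries)
	for _, e := range entries {
		fmt.Println(e.want, e.have, e.distance)
	}
}
```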
Change-Id: I001a36a86897ef845d6c8db6403e39c92c06fbb5
Reviewed-on: https://go-review.googlesource.com/19190
Reviewed-by: Nigel Tao <nigeltao@golang.org>
cldr:
- needed to exclude "validity" directory as it contains xml data
not specific to locales.
language and internal:
- CompactIndex now supports *all* locales for which there are data
files and not just the ones which have data. This is necessary to
not drop en-US. Seems better anyway. This increases table size, but
added comment about a trick that could be used to reduce it again in
the future.
- Skipping more match data entries. My suspicion is that this is
supported anyway, but needs to be verified (TODO added).
language/display:
- manually removed support for "private use" script names as their
coverage is spotty and questionable.
- manually override coverage test for az_Arab: not supported by
English, but seems to have good data for languages for which this
may be relevant.
currency:
- minor changes in test code and generation code to adapt to
changed data.
- manually added MVP as currency as it doesn't appear to be added
otherwise.
Note that search and collate are still stuck on old revisions.
Change-Id: I2b2b2cba66caaff7c027fcb421f99595430e31dd
Reviewed-on: https://go-review.googlesource.com/17802
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
broke during crossing merges.
Change-Id: I5973f0b116decbcd3e98c19f2555efbf2b0f5be8
Reviewed-on: https://go-review.googlesource.com/17496
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
The cldr package is quite low-level and would probably be only for
internal use if it hadn't predated the internal directory convention.
Added doc.go in unicode directory to clarify nature of packages in
this subdirectory.
After this, the package that remains to be moved to unicode is bidi.
Change-Id: Ibff3285ca8cc00980e48fd30754c3f7abc87b2a0
Reviewed-on: https://go-review.googlesource.com/17491
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
The display package is now limited to contain only names for values
related to the language package. Moving it as a subpackage of
language makes both its purpose and discoverability clearer.
Sorry for the inconveniences.
Change-Id: I5321e7b81f0837f25f2523f6eb4811f82d0455b8
Reviewed-on: https://go-review.googlesource.com/17490
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Factored out often used functionality to rewrite files in a package that
are used both for generation and the package itself.
Change-Id: I94da2afb701ffe9ebef333ac0ef504585f12af88
Reviewed-on: https://go-review.googlesource.com/17189
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Package should be pretty stable now.
No backwards incompatible changes expected after the currently
deprecated functionality has been removed.
Change-Id: I02e91082a9d90c0ed527d67ba00b36226dfe265e
Reviewed-on: https://go-review.googlesource.com/16869
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Users should use Currency type defined in currency package instead.
Also removed Currency from Coverage type. Note that Currency coverage
had not been implemented yet, so we're unlikely to break anyone here.
Change-Id: I094d3bbcec3a05481d627e497daf428191ad2eea
Reviewed-on: https://go-review.googlesource.com/16868
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
One of two CLs to remove deprecated functionality.
Final step is to remove the "this package may change" notice.
Change-Id: I4e50904c0be6a8ef1413202e433636910f799ab9
Reviewed-on: https://go-review.googlesource.com/16867
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Change-Id: Ibdfebe8d3bf2b028444c5e5cc4b48c0ee4cfbaba
Reviewed-on: https://go-review.googlesource.com/16863
Reviewed-by: Ralph Corderoy <ralph@inputplus.co.uk>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Considerably reduces binary size contribution of this package.
Change-Id: I97e66cf0cf73471c55d26dd6accdcec95f086661
Reviewed-on: https://go-review.googlesource.com/16694
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
This pulls in a lot of tables and code for Tag that may be unnecessary.
Change-Id: I5df11b6a2da95d1022dda32349ec50dbb35ada95
Reviewed-on: https://go-review.googlesource.com/16696
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Simpler map shaves off about 40k of the binary size even though
table entry size is reduced by only 1k.
This is about 10% of what package language adds to the imported
packages.
Change-Id: I479a7de13587b77cce5e5db326c469834de071cd
Reviewed-on: https://go-review.googlesource.com/16626
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
This generates much smaller code. See golang.org/issue/13145.
Update golang/go#13145
Change-Id: Iad06f6bd3a59cb8257bb8f9e331b7e6dc1d28efa
Reviewed-on: https://go-review.googlesource.com/16625
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
maketables assigns a low code to languages that are frequent.
Include all languages for which CLDR defines data in this set.
This guarantees there is a compact index for all languages
referenced in gen_index.go. This, in turn, can be used to
optimize CompactIndex, which has some issues (explained in
later CL).
This CL also includes the addition of CLDRVersion and
deprecation of Version. This is to make it consistent
with other packages.
(Sorry, thought I had excluded it from staging)
Change-Id: Ia92e8c663fd4641976268c763645bcea2d1679a6
Reviewed-on: https://go-review.googlesource.com/16624
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Applications should always use a single matched Tag per user session
for selecting the display language and language-specific services.
So far, the tag returned from a match was the original supported tag
and as such did not include any user options specified in the -u
extension. It was hard for a user to add this as well, as the Match
method does not return the preferred tag that resulted in the match.
The Match method now copies in the -u extension of the user's tag.
It is assumed that the set of supported tags do not specify a -u
section, which is fair to assume as these are typically needed only
in the next phase of a match.
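A simplified, string-based sketch of the behavior described above. It is not
the package's implementation: the helper names are hypothetical and the
parsing ignores most of BCP 47. It relies on the same assumption, namely that
the supported tag carries no -u section of its own.

```go
package main

import (
	"fmt"
	"strings"
)

// userExtension returns the "-u-..." part of a tag, or "" if there is none.
// The extension runs from the "u" singleton up to the next single-letter
// subtag. Anything after a private-use "x" singleton is ignored.
func userExtension(tag string) string {
	subtags := strings.Split(tag, "-")
	start := -1
	for i := 1; i < len(subtags); i++ {
		s := strings.ToLower(subtags[i])
		if start < 0 {
			if s == "x" {
				break // private use: no extensions beyond this point
			}
			if s == "u" {
				start = i
			}
			continue
		}
		if len(s) == 1 { // the next singleton ends the -u extension
			return "-" + strings.Join(subtags[start:i], "-")
		}
	}
	if start < 0 {
		return ""
	}
	return "-" + strings.Join(subtags[start:], "-")
}

// withUserOptions copies the user's -u extension onto the matched supported
// tag, which is assumed not to have a -u extension itself.
func withUserOptions(supported, user string) string {
	return supported + userExtension(user)
}

func main() {
	fmt.Println(withUserOptions("de", "de-AT-u-co-phonebk")) // prints de-u-co-phonebk
}
```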
Also:
- updated documentation
- changed interpretation of grandfathered tag en-GB-oed.
Change-Id: Icb6be5735fb8d7bbdee3469b6a691c7ee74eb16a
Reviewed-on: https://go-review.googlesource.com/16560
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Clarify that ccTLDs are handled even if not equivalent to ISO3166.
Change-Id: If77bfb1d4a0f2bee11f3b76f3c93c1120c8b285a
Reviewed-on: https://go-review.googlesource.com/16471
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Missed some adjustments in the move from en-US-x-posix to
en-US-u-va-posix.
Fixes #12987.
Change-Id: Ie422598c955359556b4d0d244062b0bf267c022c
Reviewed-on: https://go-review.googlesource.com/16011
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
Use CLDR variant. Collate already used this. This provides
a little bit more structure and plays nice with collation.
Note that maketables.go contained a bug that happened to work
given the current data in CLDR.
Change-Id: I40d07d7cd2a8615bbe0e223a074df5b701b7b833
Reviewed-on: https://go-review.googlesource.com/14805
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Compensate for the fact that "und" was already taken out.
Also changed the checksum computation to reflect this change.
Change-Id: I2b29df2a41cd186d886628390a55bd19d0175075
Reviewed-on: https://go-review.googlesource.com/14750
Reviewed-by: Nigel Tao <nigeltao@golang.org>
This function provides an alternative to complex indexes as used in
package display, for example. Many of the upcoming packages will need
to use something like this, so this will be useful and simplify
implementations.
The upcoming localization code will also likely need to do a lot of
table lookups based on tags. CompactIndex allows implementations to
easily associate data based on a simple table lookup. Standardizing
on the index among the packages also allows caching it.
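A sketch of the table-lookup pattern this enables, using a hand-written index
in place of the real one. The index values, the fallback strategy, and the
decimalSep table are all hypothetical; the actual CompactIndex is generated
and works on Tag values, not strings.

```go
package main

import (
	"fmt"
	"strings"
)

// index assigns a small dense integer to each known tag (hypothetical data).
var index = map[string]int{"und": 0, "en": 1, "en-GB": 2, "nl": 3}

// decimalSep is per-locale data stored in a plain array keyed by that index.
var decimalSep = [...]string{".", ".", ".", ","}

// compactIndex returns the index for tag, falling back to a parent by
// dropping trailing subtags. exact reports whether tag itself was found.
func compactIndex(tag string) (i int, exact bool) {
	for t := tag; ; {
		if i, ok := index[t]; ok {
			return i, t == tag
		}
		j := strings.LastIndexByte(t, '-')
		if j < 0 {
			return 0, false // fall back to "und"
		}
		t = t[:j]
	}
}

// decimalSeparator shows the payoff: one index computation, one array lookup.
func decimalSeparator(tag string) string {
	i, _ := compactIndex(tag)
	return decimalSep[i]
}

func main() {
	fmt.Println(decimalSeparator("nl-BE")) // prints ,
}
```

Since every package can share the same index, a caller could compute it once
per tag and reuse it across lookups.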
Maybe this should have been internal, but as the code and data are
very interwoven with the types in the language package, I don't see
a trivial way of doing this. The construct may be useful to users
that want to implement their own formatters.
Change-Id: I0c62e825be09d19b85e684fa114aa2cb6bec5003
Reviewed-on: https://go-review.googlesource.com/14438
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Also make grandfathered tags case-insensitive, as they should be and
support the legacy en_US_POSIX. The latter is widely used in CLDR
so not supporting it was kind of a pain.
Note that checking for grandfathered tags is in the critical path.
Change-Id: Ie635ecdf56e5e9de95c9c451bc7afb6985587d66
Reviewed-on: https://go-review.googlesource.com/14555
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Moved currency-related code from language to a separate package.
This will fit better in the grand design for localization support.
Also exposed rounding modes in new API.
Change-Id: Iad8e0988241b81312c4c4458db0ee5f16c71f6a9
Reviewed-on: https://go-review.googlesource.com/13855
Reviewed-by: Nigel Tao <nigeltao@golang.org>
tag.Index will also be used by currency and possibly other packages.
gen now also writes types (if there is no stutter) for vars and const.
This allows strings to be of a specific type.
index data in language is now written as consts. No performance impact
was measured.
Change-Id: I1b63a5bc5e54264acd825000df5af67f8ae759a6
Reviewed-on: https://go-review.googlesource.com/13922
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Semantics of when to use arrays or slices is preserved.
Changed type of M49 codes to int to ensure they will be printed
in decimal format, instead of hex.
Change-Id: Ic4f3b5b4a9a89eb3c3730a65ee4bac9810d0c3ae
Reviewed-on: https://go-review.googlesource.com/13856
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Created new CodeWriter type that wraps a buffer and keeps track of some
whitespacing, taking some of the tediousness away.
Goal:
- simplify writing upcoming localization packages
- take out some common code
- make output look consistent
Change-Id: I46dc25bdb48d0be52ff84f80f2e55b019ed53cd3
Reviewed-on: https://go-review.googlesource.com/13668
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Latest IANA language repository file introduced a variant that has
a prefix of the form language-region. This CL adds support for this.
Updated tables will be included in a separate CL.
Change-Id: Id7e85ca3e76d466ec89f0b8e4ce9a98871f2d5bf
Reviewed-on: https://go-review.googlesource.com/13452
Reviewed-by: Nigel Tao <nigeltao@golang.org>
This currently only happens for
Darkhat (drh) -deprecated-> Halh Mongolian (khk) -macro-> Mongolian (mn)
Added test to verify this property for all languages.
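A minimal sketch of the two-step chain named above, with tables holding only
that one case. The table and function names are hypothetical and the real
tables are generated.

```go
package main

import "fmt"

// Hypothetical replacement tables containing only the chain from the text.
var deprecated = map[string]string{"drh": "khk"} // Darkhat -> Halh Mongolian
var macro = map[string]string{"khk": "mn"}       // Halh Mongolian -> Mongolian

// canonicalize applies the deprecated mapping first and then the macro
// mapping to its result, so a deprecated code can still reach its
// macrolanguage.
func canonicalize(lang string) string {
	if r, ok := deprecated[lang]; ok {
		lang = r
	}
	if r, ok := macro[lang]; ok {
		lang = r
	}
	return lang
}

func main() {
	fmt.Println(canonicalize("drh")) // prints mn
}
```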
Change-Id: I4a2fb98d436fa88865872b9c62ed5f3114d15de5
Reviewed-on: https://go-review.googlesource.com/13072
Reviewed-by: Nigel Tao <nigeltao@golang.org>
This is the first in a series of currency-related CLs.
Currencies have not been fully supported until now as I wasn't sure
whether to put them in a separate package or include them here.
There are several reasons to take the approach as originally
envisioned:
1) Currencies are somewhat looser, but still quite coupled with the
other types in Region.
2) It would be quite a small package on its own.
3) It allows piggybacking on existing types like Coverage.
4) There are not many other types that we can expect to add
(the set of locale-related types has been quite steady in
other systems).
5) Currencies are usually included in locale-related functionality
in other libraries.
6) We would have to deal with some circular dependencies.
7) Moving Currency out is not backward compatible.
The main problem with keeping Currency is that language.Currency
is somewhat ugly. In hindsight, locale might still have been a
better name for this package. It is consistent with language.Region
and language.Script, though.
Change-Id: If7c4ebf5ca014881fe5866eaa9e132babdf0c2f6
Reviewed-on: https://go-review.googlesource.com/11637
Reviewed-by: Nigel Tao <nigeltao@golang.org>
- CLDR now sports (optional) subversions.
- Added var for the common SerbianLatin locale.
- Allow for UNICODE_VERSION and CLDR_VERSION in internal/gen, as these
values cannot be passed on the command line to go generate.
Note that the display package was simplified and reduced in size. This is
a consequence of the latest version of CLDR reducing the complexity
of the language hierarchy and streamlining some of the differences
between languages.
Change-Id: I086679a73815a7bb0aca099ad73ad43994b57633
Reviewed-on: https://go-review.googlesource.com/9625
Reviewed-by: Nigel Tao <nigeltao@golang.org>
This is really just a one liner, but with lots of tables changing.
Change-Id: I1c57a013a8961e7e0d09ac0fb00d0678f79794e2
Reviewed-on: https://go-review.googlesource.com/7929
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Added cldr package in the mix.
More usages of go/format converted.
Implemented variant of Nigel's suggestion to make a WriteFile func.
Change-Id: I73113bc0a9d6e350a86f20435215a054c772f9ed
Reviewed-on: https://go-review.googlesource.com/6923
Reviewed-by: Nigel Tao <nigeltao@golang.org>
I love go generate, but as it doesn't take command line flags, it is now
hard to select a local mirror without manually fiddling with commands.
Also, over time, all gen and maketable commands have gotten slightly
different command lines and local mirror structures. This change cleans
this up and makes it much easier to build tables.
Change-Id: I9d61f72447f43c45d52e1b5c2e3a6de7735687d4
Reviewed-on: https://go-review.googlesource.com/6591
Reviewed-by: Nigel Tao <nigeltao@golang.org>
- CLDR 26 introduced a new type of legacy language alias from ISO-639.
Instead of having three different tables of language replacements, there
is now just one where each entry is marked with a replacement type.
- Moved to using go generate.
- Introduced code file shared by maketables and package to make the use
of constants a bit more manageable. Expect more to move there over time.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/166640043
The mapping is now hand-coded. This is fine as the table is small and
not likely to change.
ICU defines mappings for all grandfathered tags, whereas CLDR does not.
Furthermore, ICU maps zh-guoyu to cmn, instead of zh. cmn is slightly more
informative and arguably preferable. Note that the Matcher will map
cmn to zh without issue when appropriate. Also, canonicalizing with the
Macro option will map cmn to zh as well.
LGTM=roubert, r
R=roubert, r
CC=golang-codereviews
https://golang.org/cl/160790043
- Added Region.Canonicalize to make its use more practical in some cases.
This is generally a useful operation and is warranted as a shortcut for
converting to a Tag, canonicalizing and converting back.
- Added regionTypes lookup table to quickly determine if a region is a
valid ccTLD. This table takes less space than the equivalent code needed
to compute this from existing data. It also allows for other space and
time optimizations for other routines.
- Added Region.IsGroup method as a counterpart to IsCountry.
- Cleaned up meaning of IsPrivateUse. Now is strictly private use as defined
by BCP 47 and ISO. Internal mappings are ignored but can be detected using
IsCountry or IsGroup.
- Moved splitting of IANA registry ranges to registry parsing time.
- For maketables, always print source URL in comment, not the local copy.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/144820043
collation is tricky and will be handled in a different CL.
display:
- Minor naming updates.
language:
- Added support for the provisional tag for Kosovo (XK).
- Hani script is no longer a special case. Removed respective code.
- Likely tags now specialize a group region to a single country.
This requires a new table for group regions.
- addTags now specializes region groups whenever it can make an
unambiguous choice.
- Matching changed format. maketables now supports both.
- Added Contains method to Region for determining whether a region is
contained by another.
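A sketch of what a containment check over group regions can look like. It
uses a hypothetical single-parent table with a few UN M.49 groups; the real
data allows a region to belong to several groups and is generated from CLDR.
Treating a region as containing itself is a choice made for this sketch.

```go
package main

import "fmt"

// parent maps a region to the group that directly contains it
// (hypothetical, simplified data: 001 World, 150 Europe, 155 Western Europe).
var parent = map[string]string{
	"150": "001",
	"155": "150",
	"NL":  "155",
	"BE":  "155",
}

// contains reports whether region r contains region c, directly or through
// intermediate groups. A region is considered to contain itself.
func contains(r, c string) bool {
	for {
		if c == r {
			return true
		}
		p, ok := parent[c]
		if !ok {
			return false
		}
		c = p
	}
}

func main() {
	fmt.Println(contains("150", "NL")) // prints true
	fmt.Println(contains("NL", "150")) // prints false
}
```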
Do later:
- Cash rounding and decimals for currencies.
LGTM=r
R=r
CC=golang-codereviews, nigeltao
https://golang.org/cl/102120044
instead of a variadic tag.
The argument will almost always be a list. More importantly, though, it
allows for options to the Matcher. We would like to reserve the possibility
to have a written versus spoken variant, for example, instead of requiring
different versions.
LGTM=r
R=r
CC=golang-codereviews
https://golang.org/cl/90280044