Commit graph

9 commits

Author SHA1 Message Date
Nigel Tao 2d7673cde2 go.net/publicsuffix: update table to latest list from publicsuffix.org.
LGTM=dr.volker.dobler
R=dr.volker.dobler
CC=golang-codereviews
https://golang.org/cl/116080045
2014-07-23 16:43:57 +10:00
Volker Dobler 68f5bd3ad0 go.net/publicsuffix: add new eTLD test for IDNs.
LGTM=nigeltao
R=nigeltao
CC=bradfitz, golang-codereviews
https://golang.org/cl/33810044
2014-02-18 16:43:49 +11:00
Volker Dobler a62ee0556d go.net/publicsuffix: update table to latest list from publicsuffix.org
Update the public suffix list to the latest (October 15, 2013)
data from publicsuffix.org's list, which adds around 60 new gTLDs.

The .ar rules changed; the corresponding tests are modified to
reflect this change in the list.

R=nigeltao
CC=golang-dev
https://golang.org/cl/14930048
2013-10-24 13:49:04 +11:00
Volker Dobler a9bbf44c11 publicsuffix: update tests
Update the test cases to reflect the updated
test_psl.txt found on http://publicsuffix.org.

R=nigeltao
CC=golang-dev
https://golang.org/cl/15720044
2013-10-23 08:19:41 +11:00
Nigel Tao 617911444d go.net/publicsuffix: add an EffectiveTLDPlus1 function.
Also expand the gen.go subset to cover the test cases from
http://mxr.mozilla.org/mozilla-central/source/netwerk/test/unit/data/test_psl.txt

R=dr.volker.dobler, patrick
CC=golang-dev
https://golang.org/cl/7124044
2013-01-22 21:23:30 +11:00
Nigel Tao b8ab510da6 go.net/publicsuffix: distinguish ICANN domains from private domains;
add a publicsuffix.PublicSuffix function.

This required moving the encoded node type bits from the nodes array
to the children array.

R=dr.volker.dobler, rsc
CC=golang-dev, rsleevi
https://golang.org/cl/7060046
2013-01-09 22:10:50 +11:00
Nigel Tao 0f34b77681 go.net/publicsuffix: tighten the encoding from 8 bytes per node to 4.
On the full list (running gen.go with -subset=false):

Before, there were 6086 nodes (at 8 bytes per node). After,
there were 6086 nodes (at 4 bytes per node) plus 354 children entries
(at 4 bytes per entry). The difference is 22928 bytes.
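
The size difference quoted above can be re-derived from the node and children counts. The following is only an arithmetic check; the helper name `sizes` is made up for this illustration and is not part of the package.

```go
package main

import "fmt"

// sizes returns the table size in bytes before and after the change,
// given the node and children-entry counts from the commit message.
// The name "sizes" is hypothetical and used only for this check.
func sizes(nodes, children int) (before, after int) {
	before = nodes * 8              // old encoding: 8 bytes per node
	after = nodes*4 + children*4    // new encoding: 4 bytes per node plus 4 per children entry
	return before, after
}

func main() {
	before, after := sizes(6086, 354)
	fmt.Println(before, after, before-after) // prints "48688 25760 22928"
}
```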

In comparison, the (crushed) text is 21082 bytes, and for the curious,
the longest label is 36 bytes: "xn--correios-e-telecomunicaes-ghc29a".

All 32 bits in the nodes table are used, but there's wiggle room to
accommodate future changes to effective_tld_names.dat:

The largest children index is 353 (in 9 bits, so max is 511).
The largest node type is 2 (in 2 bits, so max is 3).
The largest text offset is 21080 (in 15 bits, so max is 32767).
The largest text length is 36 (in 6 bits, so max is 63).
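
The four field widths listed above add up to 32 bits (9 + 2 + 15 + 6). A minimal sketch of packing such fields into one uint32 follows. The field order, the `node` type, and the `pack`/`unpack` names are assumptions made for illustration; the commit message gives only the widths, not the actual layout used in table.go.

```go
package main

import (
	"errors"
	"fmt"
)

// Field widths as stated in the commit message. The order in which the
// fields are laid out within the 32-bit word is an assumption.
const (
	childrenBits = 9
	typeBits     = 2
	offsetBits   = 15
	lengthBits   = 6
)

// node is a hypothetical unpacked view of one table entry.
type node struct {
	children, nodeType, textOffset, textLength uint32
}

// pack encodes n into 32 bits, rejecting values that overflow a field.
func pack(n node) (uint32, error) {
	if n.children >= 1<<childrenBits || n.nodeType >= 1<<typeBits ||
		n.textOffset >= 1<<offsetBits || n.textLength >= 1<<lengthBits {
		return 0, errors.New("field value does not fit in its bit width")
	}
	return n.children<<(typeBits+offsetBits+lengthBits) |
		n.nodeType<<(offsetBits+lengthBits) |
		n.textOffset<<lengthBits |
		n.textLength, nil
}

// unpack is the inverse of pack.
func unpack(x uint32) node {
	return node{
		children:   x >> (typeBits + offsetBits + lengthBits),
		nodeType:   x >> (offsetBits + lengthBits) & (1<<typeBits - 1),
		textOffset: x >> lengthBits & (1<<offsetBits - 1),
		textLength: x & (1<<lengthBits - 1),
	}
}

func main() {
	// The largest values reported in the commit message.
	n := node{children: 353, nodeType: 2, textOffset: 21080, textLength: 36}
	x, err := pack(n)
	if err != nil {
		panic(err)
	}
	fmt.Println(unpack(x) == n) // round trip preserves all four fields
}
```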

benchmark                old ns/op    new ns/op    delta
BenchmarkPublicSuffix        19948        19744   -1.02%

R=dr.volker.dobler
CC=golang-dev
https://golang.org/cl/6999045
2012-12-22 12:09:13 +11:00
Nigel Tao cbecf2f725 go.net/publicsuffix: use IDNA.
R=dr.volker.dobler
CC=golang-dev
https://golang.org/cl/6930054
2012-12-20 19:36:00 +11:00
Nigel Tao 67a3048087 go.net/publicsuffix: new package.
The tables were generated by:

go run gen.go -subset -version "subset of publicsuffix.org's effective_tld_names.dat, hg revision 05b11a8d1ace (2012-11-09)"       >table.go

go run gen.go -subset -version "subset of publicsuffix.org's effective_tld_names.dat, hg revision 05b11a8d1ace (2012-11-09)" -test >table_test.go

The input data is subsetted so that code review is easier while still
covering the interesting * and ! rules. A follow-up changelist will
check in the unfiltered public suffix list.
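
For readers unfamiliar with the * and ! rules mentioned above, here is a toy sketch of how wildcard and exception rules interact when looking up a public suffix. It is a plain linear matcher over a tiny made-up rule set, following the matching algorithm published by publicsuffix.org; it is not how this package works internally (the package uses generated compact tables), and the names `publicSuffix` and `matchLabels` are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// matchLabels reports whether rule matches the rightmost labels of domain.
// A "*" rule label matches any single domain label.
func matchLabels(rule, domain []string) bool {
	for i := 1; i <= len(rule); i++ {
		r, d := rule[len(rule)-i], domain[len(domain)-i]
		if r != "*" && r != d {
			return false
		}
	}
	return true
}

// publicSuffix returns the public suffix of domain under the given rules.
// An exception rule ("!...") wins over all others; otherwise the longest
// matching rule wins; if nothing matches, the implicit "*" rule applies.
func publicSuffix(domain string, rules []string) string {
	labels := strings.Split(domain, ".")
	best := 1 // implicit default rule "*": the last label alone
	for _, r := range rules {
		exception := strings.HasPrefix(r, "!")
		rl := strings.Split(strings.TrimPrefix(r, "!"), ".")
		if len(rl) > len(labels) || !matchLabels(rl, labels) {
			continue
		}
		if exception {
			// The suffix is the exception rule minus its leftmost label.
			return strings.Join(labels[len(labels)-(len(rl)-1):], ".")
		}
		if len(rl) > best {
			best = len(rl)
		}
	}
	return strings.Join(labels[len(labels)-best:], ".")
}

func main() {
	// A tiny illustrative rule set, not the real list.
	rules := []string{"com", "uk", "co.uk", "*.ck", "!www.ck"}
	for _, d := range []string{"example.com", "example.co.uk", "foo.bar.ck", "www.ck"} {
		fmt.Printf("%s -> %s\n", d, publicSuffix(d, rules))
	}
}
```

With this rule set, the wildcard makes "bar.ck" the suffix of "foo.bar.ck", while the exception rule makes "ck" the suffix of "www.ck".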

Update golang/go#1960.

R=rsc, dr.volker.dobler
CC=golang-dev
https://golang.org/cl/6912045
2012-12-12 15:58:52 +11:00