Apply the correct normalization when an equals sign appears before an
attribute name (e.g. '<tag =>' -> '<tag =="">'), per WHATWG 13.2.5.32.
Change-Id: Id21b428bd86117dd073c502767386bc718a3fb7b
Reviewed-on: https://go-review.googlesource.com/c/net/+/488695
Auto-Submit: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Run-TryBot: Roland Shoemaker <roland@golang.org>
Reviewed-by: Nigel Tao (INACTIVE; USE @golang.org INSTEAD) <nigeltao@google.com>
Be clearer about the operation of the tokenizer and the parser (and
their differences), and be explicit about the need for re-serialization
when they are being used in security contexts.
Change-Id: Ieb8f2a9d4806fb7a8849a15671667396e81c53b9
Reviewed-on: https://go-review.googlesource.com/c/net/+/484795
Auto-Submit: Roland Shoemaker <roland@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Run-TryBot: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Adds a short security considerations paragraph to the package comment
detailing the differences between the parser and tokenizer.
Change-Id: I9e6840b20f82ffc6bc4088fffd6b4eda97550c0a
Reviewed-on: https://go-review.googlesource.com/c/net/+/459676
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Run-TryBot: Roland Shoemaker <roland@golang.org>
Reviewed-by: Rob Pike <r@golang.org>
Change-Id: Iee11c27052222f017b672c06ced9e129ee51619c
Reviewed-on: https://go-review.googlesource.com/c/net/+/465996
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Reviewed-by: David Chase <drchase@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Properly handle the case where HTML comments begin with exclamation
marks and have no other content, i.e. "<!--!-->". Previously these
comments would cause the tokenizer to consider everything following to
also be considered part of the comment.
Fixesgolang/go#37771
Change-Id: I78ea310debc3846f145d62cba017055abc7fa4e0
Reviewed-on: https://go-review.googlesource.com/c/net/+/442496
Run-TryBot: Roland Shoemaker <roland@golang.org>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Damien Neil <dneil@google.com>
Change-Id: I6c853dd402d296701e38289bbc418730b068dde8
Reviewed-on: https://go-review.googlesource.com/c/net/+/441716
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Joedian Reid <joedian@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Make all our package sources use Go 1.17 gofmt format
(adding //go:build lines).
Not strictly necessary but will avoid spurious changes
as files are edited.
Part of //go:build change (#41184).
See https://golang.org/design/draft-gobuild
Change-Id: I5b2b7d93424e828a3c5f76ae3f30ab825aca388e
Reviewed-on: https://go-review.googlesource.com/c/net/+/294371
Trust: Russ Cox <rsc@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Assuming "in head noscript" insertion mode, the scripting flag will be disabled.
Thus, even if nested noscript tags exist,
the tokenizer should not go into the raw text mode.
This change makes the following test happy:
<head><noscript><noscript class="foo"><!--foo--></noscript>
Change-Id: I2620e751d8be3d313c3a2e2f992b1e21ce2dc2ee
Reviewed-on: https://go-review.googlesource.com/c/net/+/263878
Trust: Kunpei Sakai <namusyaka@gmail.com>
Trust: Nigel Tao <nigeltao@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
This follows up on https://golang.org/cl/264977
Change-Id: I5d0e2f39173a8bbd07ca53de4df2a7e8772d4197
Reviewed-on: https://go-review.googlesource.com/c/net/+/265960
Trust: Kunpei Sakai <namusyaka@gmail.com>
Trust: Nigel Tao <nigeltao@golang.org>
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
This also updates webkit/tests18.dat to latest.
Change-Id: I4ed37e918a7db63afd8d515dd3a2494699cc5b74
Reviewed-on: https://go-review.googlesource.com/c/net/+/264977
Trust: Kunpei Sakai <namusyaka@gmail.com>
Trust: Nigel Tao <nigeltao@golang.org>
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
As per the current spec, "contentScriptType", "contentStyleType" and
"externalResourcesRequired" have been removed from the svg attribute adjustments.
See: https://html.spec.whatwg.org/multipage/parsing.html#adjust-svg-attributes
Change-Id: I904914691c3a3c3958868f7e49ba10f7d6f9ec09
Reviewed-on: https://go-review.googlesource.com/c/net/+/263398
Trust: Kunpei Sakai <namusyaka@gmail.com>
Trust: Nigel Tao <nigeltao@golang.org>
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Go Bot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
This commit also adds remaining tests to follow up on golang.org/cl/205617
Change-Id: I8b155f9f605c6a0eb8745c32f5e785f5b4bc1c7e
Reviewed-on: https://go-review.googlesource.com/c/net/+/208937
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
This follows up on golang.org/cl/205617
Ref: c0ffd43f89
Change-Id: I0a7399368bb8c28c5bf65adf3614a84ffeb82b8c
Reviewed-on: https://go-review.googlesource.com/c/net/+/206120
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Those directives are now supported by html5lib-tests.
See: e52ff68cc7/tree-construction/README.md
Also, this fixes missing opts on parsing for identical check
Change-Id: I92f2398ebda0477fd7f6bb438c54f3948063c08d
Reviewed-on: https://go-review.googlesource.com/c/net/+/206118
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Trailing '<' entities in the text token make the tokenizer fail
for escapable raw text elements like title and textarea
Fixesgolang/go#34281
Change-Id: I6fe8f2229b5fd639cf5a02ab1db31f18ea034c8b
GitHub-Last-Rev: 4a9da03177
GitHub-Pull-Request: golang/net#53
Reviewed-on: https://go-review.googlesource.com/c/net/+/196620
Run-TryBot: Kunpei Sakai <kunpei@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
This commit newly introduces a type for configuring a parser
called ParseOption, and implements two functions depending on it.
Along with that, this introduces ParseOptionEnableScripting to
enable setting of the scripting flag.
Fixesgolang/go#16318
Change-Id: Ie7fd7d8ce286e22e7f57182fc2ce353bce578db6
Reviewed-on: https://go-review.googlesource.com/c/net/+/174157
Reviewed-by: Nigel Tao <nigeltao@golang.org>
In section 12.2.6.4.15 of the spec, there is a condition that the current node is a td or th element, which is not implemented. This can lead to a panic when the open elements stack is popped whilst empty, as outlined in golang/go#30600. This commit implements that check.
Fixesgolang/go#30600
Change-Id: I4837815e2edce21b58a985a100d93d146bf71e24
GitHub-Last-Rev: 79084c5a84
GitHub-Pull-Request: golang/net#41
Reviewed-on: https://go-review.googlesource.com/c/net/+/172377
Reviewed-by: Kunpei Sakai <namusyaka@gmail.com>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
In the spec 12.2.6.4.5, the "in head noscript" insertion mode is defined.
However, this package and its parser doesn't have the insertion mode,
because the scripting=false case is not considered currently.
This commit adds a test and a support for the "in head noscript"
insertion mode. This change has no effect on the actual behavior.
Updates golang/go#16318
Change-Id: I9314c3342bea27fa2acf2fa7d980a127ee0fbf91
Reviewed-on: https://go-review.googlesource.com/c/net/+/172557
Reviewed-by: Nigel Tao <nigeltao@golang.org>
The existing implementation behaves differently to all major browsers, for the instance where a self-closing element of an unknown tag type is the child of another element of an unknown tag type. The issue appears to be that nested tags of an differing unknown types will all have an atom value of 0 and `inBodyEndTagOther` will incorrectly match them to one another.
Fixesgolang/go#30961
Change-Id: I62b0aa49c027c8432df7d077ffba135201b3b786
GitHub-Last-Rev: fb25181f9a
GitHub-Pull-Request: golang/net#37
Reviewed-on: https://go-review.googlesource.com/c/net/+/168638
Reviewed-by: Nigel Tao <nigeltao@golang.org>
A bot inside Google was nagging me about this so change plusone to
minusone to hide it from the bot which doesn't understand that it's
just test data.
Change-Id: I2300a3e8c2ca969d61cae2d6bdc0683743b534e1
Reviewed-on: https://go-review.googlesource.com/c/162888
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
The ancestor doesn't always match with the first.
Change-Id: I0edcbffab7e19ba1731e849021ffbb7428ec48d7
Reviewed-on: https://go-review.googlesource.com/c/161857
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
By proceeding without distinguishing namespace, inconsistency will
occur.
This commit makes the method distinguish the HTML namespace.
Fixesgolang/go#27846
Change-Id: I8269f670240c0fe31162a16fbe1ac23acacec00f
Reviewed-on: https://go-review.googlesource.com/c/159397
Run-TryBot: Kunpei Sakai <namusyaka@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Nigel Tao <nigeltao@golang.org>
If there are nested <template> elements and the <template> node isn't in HTML namespace,
couldn't continue to parse documents correctly.
By this patch, it makes the <template> which is in math namespace be skipped on
resetting insertion mode.
Fixesgolang/go#27702
Change-Id: I6eacdb98fe29eb3c61781afca5bc4d83e72ba4ed
Reviewed-on: https://go-review.googlesource.com/136875
Reviewed-by: Nigel Tao <nigeltao@golang.org>
The <isindex> element has been removed from the spec so that the
<template> element doesn't cover it.
To avoid panic, this commit adds ignoring code as a workaround.
Fixesgolang/go#27704
Change-Id: I847391389285df2fc0eb6a795f8c93b481cdebac
Reviewed-on: https://go-review.googlesource.com/136575
Reviewed-by: Nigel Tao <nigeltao@golang.org>
They implement the HTML5 parsing algorithm, which is very complicated.
Fixesgolang/go#26973
Change-Id: I83a5753ab00fe84f73797fcecd309306d9f24819
Reviewed-on: https://go-review.googlesource.com/133695
Reviewed-by: Kunpei Sakai <namusyaka@gmail.com>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>