Commit graph

1339 commits

Author SHA1 Message Date
Chris Peterson 2ae60bead7 Bug 1744425 - Replace nsContentUtils::GenerateUUID() to nsID::GenerateUUID(). r=nika
Bug 1723674 added a new nsID::GenerateUUID() static factory function to generate UUIDs without the overhead of querying and instantiating an nsIUUIDGenerator object. nsContentUtils::GenerateUUID() is a utility function that amortizes that overhead by holding an nsIUUIDGenerator singleton. That's no longer necessary because code that calls nsContentUtils::GenerateUUID() can now just call nsID::GenerateUUID(). No nsIUUIDGenerator is needed.

Differential Revision: https://phabricator.services.mozilla.com/D132866
2022-02-03 04:39:34 +00:00
Henri Sivonen 2c95bf7be8 Bug 1749522 - When plain text encoding speculation fails, restart the plaintext mode of the tokenizer. r=smaug
Differential Revision: https://phabricator.services.mozilla.com/D135830
2022-01-17 09:16:10 +00:00
Sandor Molnar 4d0c0a33eb Backed out changeset a2d8eae8d006 (bug 1749522) for causing reftest failures in tests/reftest/bug1749522-1. CLOSED TREE 2022-01-14 17:50:39 +02:00
Henri Sivonen d20ee45260 Bug 1749522 - When plain text encoding speculation fails, restart the plaintext mode of the tokenizer. r=smaug
Differential Revision: https://phabricator.services.mozilla.com/D135830
2022-01-14 13:36:24 +00:00
Henri Sivonen 4c4a0740a9 Bug 1748234 - Sync HTML parser Java source comments with validator repo. NPOTB r=edgar DONTBUILD
Differential Revision: https://phabricator.services.mozilla.com/D134951
2022-01-04 10:40:18 +00:00
Kagami Sascha Rosylight dc3411ed8f Bug 1539884 - Part 32: Mark nsHtml5SVGLoadDispatcher::Run as CAN_RUN_SCRIPT_BOUNDARY r=masayuki
Differential Revision: https://phabricator.services.mozilla.com/D134415
2021-12-23 16:27:24 +00:00
Henri Sivonen 97da4f66b8 Bug 1745142 - Communicate encoding commitment via speculative load queue. r=smaug
Differential Revision: https://phabricator.services.mozilla.com/D133996
2021-12-22 15:54:49 +00:00
Randell Jesup 18dd863826 Bug 1746412: Parser cleanup r=hsivonen
Differential Revision: https://phabricator.services.mozilla.com/D134021
2021-12-19 16:51:41 +00:00
Henri Sivonen 2e97d5cb50 Bug 1746603 - Make mSpeculationFailureCount atomic. r=jesup
Differential Revision: https://phabricator.services.mozilla.com/D134142
2021-12-18 07:37:52 +00:00
Henri Sivonen fe10b6b5da Bug 1746593 - Turn mTerminated and mInterrupted into atomics. r=jesup
Differential Revision: https://phabricator.services.mozilla.com/D134135
2021-12-18 07:37:15 +00:00
Henri Sivonen 527882bab5 Bug 1736248 - Update the charset source if non-ASCII is seen after the first detector guess but the encoding does not change. r=smaug
Differential Revision: https://phabricator.services.mozilla.com/D133731
2021-12-14 15:01:21 +00:00
Henri Sivonen cdcc0b21b5 Bug 1745139 - Check for termination even after the first flush loop in CommitToInternalEncoding. r=smaug
Differential Revision: https://phabricator.services.mozilla.com/D133342
2021-12-09 12:59:04 +00:00
Henri Sivonen 649a5b63d8 Bug 1701828 - meta charset rewrite. r=smaug
Implements https://github.com/whatwg/html/issues/6962 . Improves performance
when <meta charset> occurs in head but after the first kilobyte and aligns
behavior better with WebKit and Blink.

The main change is to avoid reloads when meta appears within head but
after the first kilobyte. Prior to this change, Gecko reloaded in that
case (in compliance with the spec!) even though WebKit and Blink did not.

Differences from WebKit and Blink:

* WebKit and Blink honor <meta charset> in <noscript>. This implementation
  does not.
* WebKit and Blink look for meta as if the tree builder was unaware of
  foreign content. This implementation is foreign content-aware. This
  makes a difference for CDATA sections that contain a > before the meta
  as well as style and script elements within foreign content. This could
  happen if the CDATA section that has mysteriously been introduced around
  what looks like a meta tag also contains another prior tag-looking
  run of text.
* This implementation processes rel=preload and speculative loads that are
  seen before <meta charset> has been seen. WebKit and Blink instead first
  look for the meta and rewind before starting speculative parsing.
* Unlike WebKit, if there is neither an honored meta nor syntax resembling
  an XML declaration, detection from content takes place (as in Blink).
* Unlike Blink, if there is neither an honored meta nor syntax resembling
  an XML declaration, the detection from content is not dependent on network
  buffer boundaries.
* Unlike Blink, detection from content can trigger a reload at the end of
  the stream if the guess made at that point differs from the first guess.
  (See below for the definition of the input to the first guess.)

Differences from the old spec and Gecko previously:

* Meta inside script and RCDATA elements is no longer honored.
* Late meta is now ignored and no longer triggers a reload.
* Later meta counts as early enough meta: In addition to the previous
  meta within the first 1024 bytes, now a meta that started within the first
  1024 bytes counts as early enough. Additionally, if by then there hasn't
  been a template start tag and head hasn't ended, meta occurring before the
  earlier of the end of the head or a template start tag counts as early
  enough.
* Meta now counts as not-late even if the encoding label has numeric
  character reference escapes.
* Syntax resembling an XML declaration longer than a kilobyte is honored if
  there is no honored meta.
* If there is neither an honored meta nor syntax resembling an XML declaration,
  the initial chardetng scan is potentially longer than before: the first 1024
  bytes, the token spanning the 1024-byte boundary if there is such a token,
  and, if by then head hasn't ended and there hasn't been a template start tag,
  until the end of the template start tag or the end of the token that causes
  head to end, whichever comes first. However, if the token implying the end of
  the head is a text token, only bytes up to the end of the previous non-text
  token are considered. (This definition avoids depending on network buffer
  boundaries.)
* XML View Source now uses the code for syntax resembling an XML declaration
  instead of expat for extracting the internal encoding label.

Reftests are added as both WPT and Gecko reftests in order to test both http:
and file: URL scenarios. The Gecko tests retain the WPT <link> tags in order
to use the exact same bytes.

An encoding declaration has been added to a number of old tests that didn't
intend to test the new speculation behavior especially in the context of
https://bugzilla.mozilla.org/show_bug.cgi?id=1727750 .

Differential Revision: https://phabricator.services.mozilla.com/D125808
2021-12-08 11:34:20 +00:00
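
The "early enough" rule from the Bug 1701828 message above can be sketched roughly as follows. This is a simplified illustration only, not code from the patch: the `MetaPosition` struct, its fields, and `IsEarlyEnough` are hypothetical names, and the exclusions listed in the message (meta inside noscript, script, RCDATA elements, or foreign content) are assumed to have been filtered out beforehand.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical summary of where a <meta charset> was found. None of these
// names come from the patch itself.
struct MetaPosition {
  std::size_t startOffset;          // byte offset at which the meta tag starts
  bool headEndedBefore;             // head had already ended before the meta
  bool templateStartTagSeenBefore;  // a template start tag preceded the meta
};

constexpr std::size_t kFirstKilobyte = 1024;

// Returns true if the meta counts as "early enough" under the rule described
// in the commit message (assumption: exclusions were already applied).
constexpr bool IsEarlyEnough(const MetaPosition& m) {
  if (m.startOffset < kFirstKilobyte) {
    return true;  // the meta started within the first 1024 bytes
  }
  // Past the first kilobyte, the meta still counts as long as neither the
  // end of head nor a template start tag came before it.
  return !m.headEndedBefore && !m.templateStartTagSeenBefore;
}
```

Under these assumptions, `IsEarlyEnough(MetaPosition{2000, false, false})` is true (late in the byte stream but still inside head), while `IsEarlyEnough(MetaPosition{2000, true, false})` is false, which corresponds to the "late meta is now ignored" case.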
Cosmin Sabou fdf40d5a31 Backed out changeset 1778ca2ab291 (bug 1744425) for bc failures on browser_xpcom_graph_wait.js. CLOSED TREE 2021-12-08 07:20:54 +02:00
Chris Peterson aae95e46eb Bug 1744425 - Replace nsContentUtils::GenerateUUID() to nsID::GenerateUUID(). r=nika
Bug 1723674 added a new nsID::GenerateUUID() static factory function to generate UUIDs without the overhead of querying and instantiating an nsIUUIDGenerator object. nsContentUtils::GenerateUUID() is a utility function that amortizes that overhead by holding an nsIUUIDGenerator singleton. That's no longer necessary because code that calls nsContentUtils::GenerateUUID() can now just call nsID::GenerateUUID(). No nsIUUIDGenerator is needed.

Differential Revision: https://phabricator.services.mozilla.com/D132866
2021-12-08 03:19:11 +00:00
Norisz Fay 1d6984bc21 Backed out changeset 3dfd3c94a105 (bug 1701828) for causing mochitest failures on browser_hsts_host.js CLOSED TREE 2021-12-07 12:05:44 +02:00
Henri Sivonen 58476d7f17 Bug 1701828 - meta charset rewrite. r=smaug
Implements https://github.com/whatwg/html/issues/6962 . Improves performance
when <meta charset> occurs in head but after the first kilobyte and aligns
behavior better with WebKit and Blink.

The main change is to avoid reloads when meta appears within head but
after the first kilobyte. Prior to this change, Gecko reloaded in that
case (in compliance with the spec!) even though WebKit and Blink did not.

Differences from WebKit and Blink:

* WebKit and Blink honor <meta charset> in <noscript>. This implementation
  does not.
* WebKit and Blink look for meta as if the tree builder was unaware of
  foreign content. This implementation is foreign content-aware. This
  makes a difference for CDATA sections that contain a > before the meta
  as well as style and script elements within foreign content. This could
  happen if the CDATA section that has mysteriously been introduced around
  what looks like a meta tag also contains another prior tag-looking
  run of text.
* This implementation processes rel=preload and speculative loads that are
  seen before <meta charset> has been seen. WebKit and Blink instead first
  look for the meta and rewind before starting speculative parsing.
* Unlike WebKit, if there is neither an honored meta nor syntax resembling
  an XML declaration, detection from content takes place (as in Blink).
* Unlike Blink, if there is neither an honored meta nor syntax resembling
  an XML declaration, the detection from content is not dependent on network
  buffer boundaries.
* Unlike Blink, detection from content can trigger a reload at the end of
  the stream if the guess made at that point differs from the first guess.
  (See below for the definition of the input to the first guess.)

Differences from the old spec and Gecko previously:

* Meta inside script and RCDATA elements is no longer honored.
* Late meta is now ignored and no longer triggers a reload.
* Later meta counts as early enough meta: In addition to the previous
  meta within the first 1024 bytes, now a meta that started within the first
  1024 bytes counts as early enough. Additionally, if by then there hasn't
  been a template start tag and head hasn't ended, meta occurring before the
  earlier of the end of the head or a template start tag counts as early
  enough.
* Meta now counts as not-late even if the encoding label has numeric
  character reference escapes.
* Syntax resembling an XML declaration longer than a kilobyte is honored if
  there is no honored meta.
* If there is neither an honored meta nor syntax resembling an XML declaration,
  the initial chardetng scan is potentially longer than before: the first 1024
  bytes, the token spanning the 1024-byte boundary if there is such a token,
  and, if by then head hasn't ended and there hasn't been a template start tag,
  until the end of the template start tag or the end of the token that causes
  head to end, whichever comes first. However, if the token implying the end of
  the head is a text token, only bytes up to the end of the previous non-text
  token are considered. (This definition avoids depending on network buffer
  boundaries.)
* XML View Source now uses the code for syntax resembling an XML declaration
  instead of expat for extracting the internal encoding label.

Reftests are added as both WPT and Gecko reftests in order to test both http:
and file: URL scenarios. The Gecko tests retain the WPT <link> tags in order
to use the exact same bytes.

An encoding declaration has been added to a number of old tests that didn't
intend to test the new speculation behavior especially in the context of
https://bugzilla.mozilla.org/show_bug.cgi?id=1727750 .

Differential Revision: https://phabricator.services.mozilla.com/D125808
2021-12-07 07:35:32 +00:00
nchevobbe b31e82ce90 Bug 1238861 - Display a warning message when doctype is not standard. r=hsivonen.
This patch adds new error messages that extend the existing ones,
providing extra information to the user.
A webconsole mochitest is added in the following patch of this stack.

Differential Revision: https://phabricator.services.mozilla.com/D131889
2021-12-02 22:46:22 +00:00
Chris Peterson f6fdbf028a Bug 1738401 - Remove -Wno-shadow warning suppressions. r=firefox-build-system-reviewers,glandium
-Wshadow warnings are not enabled globally, so these -Wno-shadow suppressions have no effect. I had intended to enable -Wshadow globally along with these suppressions in some directories (in bug 1272513), but that was blocked by other issues.

There are too many -Wshadow warnings (now over 2000) to realistically fix them all. We should remove all these unnecessary -Wno-shadow flags cluttering many moz.build files.

Differential Revision: https://phabricator.services.mozilla.com/D132289
2021-12-01 06:40:04 +00:00
Henri Sivonen e266e0f13a Bug 1741219 - Remove Expat usage from nsHtml5StreamParser. r=smaug
To be overwritten by the patch for https://bugzilla.mozilla.org/show_bug.cgi?id=1701828 .

Differential Revision: https://phabricator.services.mozilla.com/D131215
2021-11-16 16:02:04 +00:00
ssummar 0992acc367 Bug 1603127 - Replaced mozilla::Tuple with std::tuple and applied structured bindings in mozilla/Encoding.h. r=hsivonen
Differential Revision: https://phabricator.services.mozilla.com/D129920
2021-11-08 08:14:00 +00:00
Cristian Tuns 71486b8924 Backed out changeset 7e8e3747c3f8 (bug 1603127) for causing toolchains build bustages (Bug 1739589). CLOSED TREE 2021-11-05 07:23:45 -04:00
ssummar 508562cc85 Bug 1603127 - Replaced mozilla::Tuple with std::tuple and applied structured bindings in mozilla/Encoding.h. r=hsivonen
Differential Revision: https://phabricator.services.mozilla.com/D129920
2021-11-05 05:33:58 +00:00
Edgar Chen 3f791b5050 Bug 1556352 - Part 1: Do not set form owner from parser for form-associated custom element; r=smaug
See step 14 of https://html.spec.whatwg.org/commit-snapshots/3ad5159be8f27e110a70cefadcb50fc45ec21b05/#create-an-element-for-the-token

From the spec perspective, a FACE needs this in order to enqueue the formAssociated callback
when the FACE is inserted into the document. Otherwise it would bail out in
https://html.spec.whatwg.org/multipage/form-control-infrastructure.html#association-of-controls-and-forms:parser-inserted-flag,
and wouldn't run https://html.spec.whatwg.org/multipage/form-control-infrastructure.html#reset-the-form-owner
steps nor https://html.spec.whatwg.org/multipage/custom-elements.html#custom-element-reactions:reset-the-form-owner

From the implementation perspective, we don't implement the parser-inserted flag, but we do
update the form owner from the parser. Not doing this makes the subsequent part,
which implements the formAssociated callback, a bit simpler, because we don't need to consider
or handle the case where the form owner is set from the parser.

Differential Revision: https://phabricator.services.mozilla.com/D129646
2021-10-28 10:29:37 +00:00
Henri Sivonen 8954e892c8 Bug 1735161 - Update MDN URL in speculation failure message. r=smaug DONTBUILD
Differential Revision: https://phabricator.services.mozilla.com/D128095
2021-10-12 06:49:15 +00:00
Henri Sivonen 7588a257ed Bug 1724243 - Make text/plain and MediaDocuments use the Standards Mode. r=smaug,emilio
Differential Revision: https://phabricator.services.mozilla.com/D123318
2021-10-01 12:55:28 +00:00
Cosmin Sabou 0d612db0fb Backed out 4 changesets (bug 1688452) for assertion and bc failures on browser_translation_bing.js.
Backed out changeset 1a720cffc019 (bug 1688452)
Backed out changeset 797a7e243d43 (bug 1688452)
Backed out changeset 00fd325069fa (bug 1688452)
Backed out changeset 23ef68478e93 (bug 1688452)
2021-09-29 20:13:33 +03:00
Deian Stefan 20476da693 Bug 1688452 - Part 4: Add Wasm sandbox support for RLBoxed libexpat r=tjr
Depends on D126369

Differential Revision: https://phabricator.services.mozilla.com/D106254
2021-09-29 14:31:45 +00:00
Deian Stefan 4bb477c01b Bug 1688452 - Part 1: Retrofit nsHtml5StreamParser to use RLBoxed libexpat r=tjr,peterv
Differential Revision: https://phabricator.services.mozilla.com/D117102
2021-09-29 14:31:44 +00:00
Mike Hommey cf6298e15e Bug 1732208 - Silence the unused-but-set-variable warning in parser. r=hsivonen
parser/html/nsHtml5StreamParser.cpp:1046:10: error: variable 'totalRead' set but not used [-Werror,-Wunused-but-set-variable]
  size_t totalRead = 0;
         ^

Differential Revision: https://phabricator.services.mozilla.com/D126458
2021-09-28 00:02:47 +00:00
Edgar Chen eeb0629c16 Bug 1558793 - Remove invalid comment and assertion in nsHtml5TreeOperation::SetFormElement; r=hsivonen
Per https://html.spec.whatwg.org/#form-associated-element, img is a form-associated
element, and we do handle the img case well.

Differential Revision: https://phabricator.services.mozilla.com/D125754
2021-09-16 12:59:03 +00:00
Henri Sivonen 2a9936f40a Bug 1153920 - Conform ampersand error reporting to HTML spec. r=smaug
Created by inlining the `AMBIGUOUS_AMPERSAND` state in
https://github.com/validator/htmlparser/pull/30 back into the states
that transitioned to `AMBIGUOUS_AMPERSAND` in that PR by
Michael(tm) Smith.

Differential Revision: https://phabricator.services.mozilla.com/D81992
2021-09-02 11:13:37 +00:00
Henri Sivonen 5397b4f0a9 Bug 1727491 - Remove support for BOMless unlabeled Latin1 Supplement-range UTF-16LE|BE. r=emk
Differential Revision: https://phabricator.services.mozilla.com/D123596
2021-09-01 09:13:29 +00:00
criss 02cf484af4 Backed out changeset dc6b9ca8f3fa (bug 1727491) for causing mochitest failures on test_bug631751be.html. CLOSED TREE 2021-08-30 11:14:38 +03:00
Henri Sivonen 4233abeb9e Bug 1727491 - Remove support for BOMless unlabeled Latin1 Supplement-range UTF-16LE|BE. r=emk
Differential Revision: https://phabricator.services.mozilla.com/D123596
2021-08-30 07:11:09 +00:00
Henri Sivonen 58e0b2946c Bug 1716290 - Remove protections against the document changing as part of kCharsetFromFinalUserForcedAutoDetection reload. r=emk,emilio
NOTE! In cases where there is no HTTP-layer encoding declaration, and CSS
parsing inherits the encoding from the HTML document, for preloads, this
changes the inherited encoding from windows-1252 to UTF-8 in order to
make the speculative encoding correct in the common `<meta charset=utf-8>`
case.

Differential Revision: https://phabricator.services.mozilla.com/D123593
2021-08-26 18:02:15 +00:00
criss 2be42eea15 Backed out changeset ab805f2926d5 (bug 1716290) for causing failures on link-header-preload.html. CLOSED TREE 2021-08-26 12:07:17 +03:00
Henri Sivonen ff85d45e69 Bug 1716290 - Remove protections against the document changing as part of kCharsetFromFinalUserForcedAutoDetection reload. r=emk
Differential Revision: https://phabricator.services.mozilla.com/D123593
2021-08-26 06:25:31 +00:00
Andi-Bogdan Postelnicu 2fc4f70e9b Bug 1725145 - Preparation for the hybrid build env. r=necko-reviewers,firefox-build-system-reviewers,valentin,glandium
Automatically generated patch that adds the flag `REQUIRES_UNIFIED_BUILD = True` to `moz.build`
when the module governed by the build config file is not buildable outside of the unified environment.

This needs to be done in order to have a hybrid build system that adds the possibility of combining
unified build components with ones that are built outside of the unified ecosystem.

Differential Revision: https://phabricator.services.mozilla.com/D122345
2021-08-25 10:46:17 +00:00
Andi-Bogdan Postelnicu 9945b94835 Bug 1519636 - Reformat recent changes to the Google coding style. r=emilio
Updated with clang-format version 12.0.1 (taskcluster-dNZqCRqWRTqa6cZxPKxh7Q)

Differential Revision: https://phabricator.services.mozilla.com/D122814
2021-08-23 09:30:23 +00:00
Henri Sivonen 50f9d31d19 Bug 1726374 - Correctly highlight <!-- a <!-->b in View Source. r=smaug
Differential Revision: https://phabricator.services.mozilla.com/D123061
2021-08-20 09:03:33 +00:00
Brindusan Cristian e80f2b4691 Backed out changeset 8370f995a458 (bug 1726374) for causing reftest failures in view-source:bug1319410-1.html
CLOSED TREE
2021-08-19 18:25:51 +03:00
Henri Sivonen 974eddfb01 Bug 1726374 - Correctly highlight <!-- a <!-->b in View Source. r=smaug
Differential Revision: https://phabricator.services.mozilla.com/D123061
2021-08-19 13:00:25 +00:00
Michael[tm] Smith f3bee8a712 Bug 1541822 - Ensure doctype name is set to null when missing r=smaug
This change ensures that the tokenizer sets the doctype name to null
when the doctype name is missing in the input source.

Otherwise, without this change, the doctype name is set to the empty
string — which doesn’t conform to the requirements in the HTML spec, and
which causes us to fail 9 tests in the html5lib-tests suite.

Relates to https://github.com/validator/htmlparser/issues/35

Differential Revision: https://phabricator.services.mozilla.com/D122936
2021-08-19 10:02:57 +00:00
Michael[tm] Smith d76dfd530f Bug 1541846 - Ensure namespace-aware “clear the stack” handling r=smaug
This change ensures that for all cases with spec requirements in the
form “clear the stack back to a foo context” — which involves checking
for elements with particular names — we only look for elements in the
HTML namespace, rather than additionally looking for elements which
aren’t in the HTML namespace but that also have those particular names.

Otherwise, without this change, we aren’t in conformance with the spec
requirements, and we fail several cases in the html5lib-tests suite.

Fixes https://github.com/validator/htmlparser/issues/33

Differential Revision: https://phabricator.services.mozilla.com/D122722
2021-08-17 12:02:13 +00:00
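
The namespace-aware check described in the Bug 1541846 message above can be illustrated with a small sketch. It is not the parser's actual code: the `Element` struct, the `Namespace` enum, and the function name are hypothetical, and only the "clear the stack back to a table context" case from the HTML spec is shown.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified model of an entry on the stack of open elements.
enum class Namespace { HTML, SVG, MathML };

struct Element {
  std::string name;
  Namespace ns;
};

// "Clear the stack back to a table context": pop elements until the current
// node (the back of the vector) is a table, template, or html element.
// The name alone is not enough; the element must be in the HTML namespace.
void ClearStackBackToTableContext(std::vector<Element>& stack) {
  auto isBoundary = [](const Element& e) {
    return e.ns == Namespace::HTML &&
           (e.name == "table" || e.name == "template" || e.name == "html");
  };
  while (!stack.empty() && !isBoundary(stack.back())) {
    stack.pop_back();
  }
}
```

With a stack of `html` (HTML), `table` (HTML), `template` (SVG), a name-only check would stop at the SVG `template`; the namespace-aware check pops it and stops at the HTML `table`.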
Michael[tm] Smith 2a15667bbd Bug 1725946 - Conform tokenizer-only U+0000 NUL handling to spec r=smaug
This change brings the tokenizer’s handling of U+0000 NUL characters in
the DATA state and the CDATA section state into conformance with the
requirements in the HTML spec — for the case where only tokenization is
being performed, without tree construction; that is, the case where the
tokenizer() method is called, rather than parse() or parseFragment().

Specifically, the tokenization steps defined in the spec require that
when a U+0000 NUL is consumed in the DATA state or in the CDATA section
state, the parser must then emit a U+0000 NUL. But when performing tree
construction, the spec requires that when a U+0000 NUL is consumed, the
parser must instead emit a U+FFFD REPLACEMENT CHARACTER.

Without this change, the parser always emits a U+FFFD REPLACEMENT
CHARACTER — even when only tokenization is being performed. That causes
us to fail a number of tests in the html5lib-tests suite.

For more background on the relevant behavior, see the following:

* https://www.w3.org/Bugs/Public/show_bug.cgi?id=9659
* https://github.com/whatwg/html/commit/d98f83e
* https://github.com/validator/htmlparser/commit/9b9c263

Relates to https://github.com/validator/htmlparser/issues/35

Differential Revision: https://phabricator.services.mozilla.com/D122721
2021-08-17 10:09:10 +00:00
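
As a rough illustration of the distinction drawn in the Bug 1725946 message above, the following sketch follows the rule as that message states it. The `Mode` enum and `EmitForChar` are hypothetical names, not identifiers from the parser, and only U+0000 handling in the DATA and CDATA section states is modeled.

```cpp
#include <cassert>

// Hypothetical names for illustration only.
enum class Mode { TokenizeOnly, TreeConstruction };

constexpr char32_t kNul = U'\0';
constexpr char32_t kReplacementCharacter = U'\uFFFD';

// Returns the character emitted when `c` is consumed in the DATA or CDATA
// section state, per the rule described in the commit message.
constexpr char32_t EmitForChar(char32_t c, Mode mode) {
  if (c != kNul) {
    return c;  // other characters are outside the scope of this sketch
  }
  // Tokenization only: emit the NUL itself.
  // Tree construction: emit U+FFFD REPLACEMENT CHARACTER instead.
  return mode == Mode::TokenizeOnly ? kNul : kReplacementCharacter;
}
```

Before the change, the behavior corresponded to always taking the `TreeConstruction` branch, even when only tokenization was being performed.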
Michael[tm] Smith 21de022eae Bug 1650087 - Support “generate all implied end tags thoroughly” r=smaug
When the parser encounters a `</template>` end tag and there are other
open elements, the HTML spec requires the parser to “generate all
implied end tags thoroughly”, which unlike “generate implied end tags”
also includes generating implied end tags for table-parts elements
(caption, colgroup, tbody, thead, tfoot, td, th, and tr).

Differential Revision: https://phabricator.services.mozilla.com/D82020
2021-08-16 11:10:19 +00:00
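
The difference between the two steps named in the Bug 1650087 message above can be sketched as follows. This is a minimal model under stated assumptions, not the parser's code: the stack of open elements is reduced to a vector of element names (namespaces are ignored), and `GenerateImpliedEndTags`, `Contains`, and the two constant names are hypothetical. The first list follows the HTML spec's "generate implied end tags" step; the second is the table-parts list from the message.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Names covered by the ordinary "generate implied end tags" step.
constexpr std::array<std::string_view, 10> kImplied = {
    "dd", "dt", "li", "optgroup", "option", "p", "rb", "rp", "rt", "rtc"};

// Table-parts names additionally covered by the "thoroughly" variant.
constexpr std::array<std::string_view, 8> kTableParts = {
    "caption", "colgroup", "tbody", "thead", "tfoot", "td", "th", "tr"};

template <std::size_t N>
bool Contains(const std::array<std::string_view, N>& names,
              std::string_view name) {
  return std::find(names.begin(), names.end(), name) != names.end();
}

// Pops elements while the current node (back of the vector) is one whose end
// tag is implied. With `thoroughly` set, table-parts elements count as well.
void GenerateImpliedEndTags(std::vector<std::string>& stack, bool thoroughly) {
  while (!stack.empty() &&
         (Contains(kImplied, stack.back()) ||
          (thoroughly && Contains(kTableParts, stack.back())))) {
    stack.pop_back();
  }
}
```

For a stack ending in `table`, `tbody`, `tr`, `td`, `p`, the ordinary variant pops only `p`, whereas the thorough variant pops back to `table`.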
Michael[tm] Smith 9b06c493c5 Bug 1650066 - Correct error for EOF in “in template” state r=smaug
Doing `errUnclosedElements(eltPos, "template")` for EOF in the “in
template” state results in the error message “End tag `template` seen, but
there were open elements”, which is all wrong because the actual problem is
that though a `template` end tag was expected, EOF was reached without a
`template` end tag being seen.

So instead, when we reach this state, let’s just report the list of open elements.

Differential Revision: https://phabricator.services.mozilla.com/D122598
2021-08-16 05:16:58 +00:00
Henri Sivonen de1b71fe15 Bug 1650066 preparation - Add errListUnclosedStartTags for HTML tree builder error reporting. r=smaug
Differential Revision: https://phabricator.services.mozilla.com/D122597
2021-08-16 05:16:58 +00:00
Michael[tm] Smith 3e0410518d Bug 1319410 fixup - Stay in the COMMENT_LESSTHAN state, annotate fall-throughs. r=smaug
Differential Revision: https://phabricator.services.mozilla.com/D82163
2021-08-16 05:01:10 +00:00