Implements https://github.com/whatwg/html/issues/6962 . Improves performance
when <meta charset> occurs in head but after the first kilobyte and aligns
behavior better with WebKit and Blink.
The main change is to avoid reloads when meta appears within head but
after the first kilobyte. Prior to this change, Gecko reloaded in that
case (in compliance with the spec!) even though WebKit and Blink did not.
Differences from WebKit and Blink:
* WebKit and Blink honor <meta charset> in <noscript>. This implementation
does not.
* WebKit and Blink look for meta as if the tree builder was unaware of
foreign content. This implementation is foreign content-aware. This
makes a difference for CDATA sections that contain a > before the meta
as well as style and script elements within foreign content. This could
happen if the CDATA section that has mysteriously been introduced around
what looks like a meta tag also contains another prior tag-looking
run of text.
* This implementation processes rel=preload and speculative loads that are
seen before <meta charset> has been seen. WebKit and Blink instead first
look for the meta and rewind before starting speculative parsing.
* Unlike WebKit, if there is neither an honored meta nor syntax resembling
an XML declaration, detection from content takes place (as in Blink).
* Unlike Blink, if there is neither an honored meta nor syntax resembling
an XML declaration, the detection from content is not dependent on network
buffer boundaries.
* Unlike Blink, detection from content can trigger a reload at the end of
the stream if the guess made at that point differs from the first guess.
(See below for the definition of the input to the first guess.)
Differences from the old spec and Gecko previously:
* Meta inside script and RCDATA elements is no longer honored.
* Late meta is now ignored and no longer triggers a reload.
* Later meta counts as early enough meta: In addition to the previous
meta within the first 1024 bytes, now a meta that started within the first
1024 bytes counts as early enough. Additionally, if by then there hasn't
been a template start tag and head hasn't ended, meta occurring before the
earlier of the end of the head or a template start tag counts as early
enough.
* Meta now counts as not-late even if the encoding label has numeric
character reference escapes.
* Syntax resembling an XML declaration longer than a kilobyte is honored if
there is no honored meta.
* If there is neither an honored meta nor syntax resembling an XML declaration,
the initial chardetng scan is potentially longer than before: the first 1024
bytes, the token spanning the 1024-byte boundary if there is such a token,
and, if by then head hasn't ended and there hasn't been a template start tag,
until the end of the template start tag or the end of the token that causes
head to end, whichever comes first. However, if the token implying the end of the
head is a text token, only the bytes up to the end of the previous non-text token are
considered. (This definition avoids depending on network buffer boundaries.)
* XML View Source now uses the code for syntax resembling an XML declaration
instead of expat for extracting the internal encoding label.
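The "early enough" rule for meta described in the list above can be sketched as follows. This is a minimal illustration and not the Gecko code: the function name, the byte-offset parameters, and the use of `None` for "has not happened yet" are all assumptions made for the example. It also ignores the separate rules about meta inside script, RCDATA, and noscript.

```python
from typing import Optional

PRESCAN_LIMIT = 1024  # byte count named in the commit message


def meta_is_early_enough(
    meta_start: int,
    head_end: Optional[int] = None,
    template_start: Optional[int] = None,
) -> bool:
    """Return True if a meta starting at byte offset meta_start counts as early.

    head_end and template_start are hypothetical inputs: the byte offsets at
    which head ended and at which the first template start tag began, or None
    if neither has been seen yet.
    """
    # A meta that starts within the first 1024 bytes always counts.
    if meta_start < PRESCAN_LIMIT:
        return True
    # Past that point, the meta must come before the earlier of the end of
    # head and the first template start tag.
    before_head_end = head_end is None or meta_start < head_end
    before_template = template_start is None or meta_start < template_start
    return before_head_end and before_template
```

Under these assumptions, a meta at offset 1500 counts when head ends at offset 2000, but not when head already ended at offset 500 or when a template start tag appeared at offset 1200.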
Reftests are added as both WPT and Gecko reftests in order to test both http:
and file: URL scenarios. The Gecko tests retain the WPT <link> tags in order
to use the exact same bytes.
An encoding declaration has been added to a number of old tests that didn't
intend to test the new speculation behavior, especially in the context of
https://bugzilla.mozilla.org/show_bug.cgi?id=1727750 .
Differential Revision: https://phabricator.services.mozilla.com/D125808
This patch adds new error messages that extend existing ones,
providing extra information to the user.
A webconsole mochitest is added in the following patch of this stack.
Differential Revision: https://phabricator.services.mozilla.com/D131889
-Wshadow warnings are not enabled globally, so these -Wno-shadow suppressions have no effect. I had intended to enable -Wshadow globally along with these suppressions in some directories (in bug 1272513), but that was blocked by other issues.
There are too many -Wshadow warnings (now over 2000) to realistically fix them all. We should remove all these unnecessary -Wno-shadow flags cluttering many moz.build files.
Differential Revision: https://phabricator.services.mozilla.com/D132289
The default handler and character-data handler callbacks are identical
and some Windows compilers just reconciled them into a single function.
This, unfortunately, resulted in a RLBox runtime error: the same
callback was registered twice. This patch removes the duplicate handler
implementation and just sets the character-data handler callback as the
default handler.
Depends on D104658
Differential Revision: https://phabricator.services.mozilla.com/D126369
parser/html/nsHtml5StreamParser.cpp:1046:10: error: variable 'totalRead' set but not used [-Werror,-Wunused-but-set-variable]
size_t totalRead = 0;
^
Differential Revision: https://phabricator.services.mozilla.com/D126458
NOTE! In cases where there is no HTTP-layer encoding declaration, and CSS
parsing inherits the encoding from the HTML document, for preloads, this
changes the inherited encoding from windows-1252 to UTF-8 in order to
make the speculative encoding correct in the common `<meta charset=utf-8>`
case.
Differential Revision: https://phabricator.services.mozilla.com/D123593
Automatically generated patch that adds the flag `REQUIRES_UNIFIED_BUILD = True` to `moz.build`
when the module governed by the build config file is not buildable outside of the unified environment.
This needs to be done in order to have a hybrid build system that adds the possibility of combining
unified build components with ones that are built outside of the unified ecosystem.
Differential Revision: https://phabricator.services.mozilla.com/D122345
This change ensures that the tokenizer sets the doctype name to null
when the doctype name is missing in the input source.
Otherwise, without this change, the doctype name is set to the empty
string — which doesn’t conform to the requirements in the HTML spec, and
which causes us to fail 9 tests in the html5lib-tests suite.
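The distinction between a missing doctype name and an empty one can be illustrated with a small sketch. It is not the htmlparser tokenizer: the helper name and its input (the text between `<!DOCTYPE` and `>`) are assumptions made for the example, and public and system identifiers are ignored.

```python
import re
from typing import Optional

HTML_WHITESPACE = " \t\n\f\r"


def doctype_name(after_keyword: str) -> Optional[str]:
    """Return the doctype name, or None when the name is missing.

    after_keyword is a hypothetical input: the text between '<!DOCTYPE'
    and the closing '>'.
    """
    stripped = after_keyword.strip(HTML_WHITESPACE)
    if not stripped:
        # No name characters at all: report "missing" (None), not "".
        return None
    first_word = re.split("[ \t\n\f\r]+", stripped)[0]
    # Only ASCII upper-case letters are lower-cased.
    return "".join(c.lower() if "A" <= c <= "Z" else c for c in first_word)
```

With this sketch, `<!DOCTYPE>` yields `None` while `<!DOCTYPE HTML>` yields `"html"`; the empty string is never returned.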
Relates to https://github.com/validator/htmlparser/issues/35
Differential Revision: https://phabricator.services.mozilla.com/D122936
This change ensures that for all cases with spec requirements in the
form “clear the stack back to a foo context” — which involves checking
for elements with particular names — we only look for elements in the
HTML namespace, rather than additionally looking for elements which
aren’t in the HTML namespace but that also have those particular names.
Otherwise, without this change, we aren’t in conformance with the spec
requirements, and we fail several cases in the html5lib-tests suite.
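As a rough sketch of the namespace-aware check, the following models the stack of open elements as a list of `(namespace, local name)` pairs. That representation and the function name are assumptions made for the example, not the TreeBuilder code. The names for the table context (table, template, html) follow the HTML spec.

```python
HTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Names for "clear the stack back to a table context" in the HTML spec.
TABLE_CONTEXT = {"table", "template", "html"}


def clear_stack_back_to(stack, context_names):
    """Pop from the end of stack until the current node is an HTML-namespace
    element whose local name is in context_names."""
    while stack:
        namespace, name = stack[-1]
        if namespace == HTML_NS and name in context_names:
            break
        stack.pop()
    return stack
```

For a stack ending in an HTML `table` followed by SVG elements named `svg` and `template`, the sketch pops both SVG elements and stops at the HTML `table`. A check on the name alone would have stopped at the SVG `template`.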
Fixes https://github.com/validator/htmlparser/issues/33
Differential Revision: https://phabricator.services.mozilla.com/D122722
This change brings the tokenizer’s handling of U+0000 NUL characters in
the DATA state and the CDATA section state into conformance with the
requirements in the HTML spec — for the case where only tokenization is
being performed, without tree construction; that is, the case where the
tokenizer() method is called, rather than parse() or parseFragment().
Specifically, the tokenization steps defined in the spec require that
when a U+0000 NUL is consumed in the DATA state or in the CDATA section
state, the parser must then emit a U+0000 NUL. But when performing tree
construction, the spec requires that when a U+0000 NUL is consumed, the
parser must instead emit a U+FFFD REPLACEMENT CHARACTER.
Without this change, the parser always emits a U+FFFD REPLACEMENT
CHARACTER — even when only tokenization is being performed. That causes
us to fail a number of tests in the html5lib-tests suite.
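The two behaviors described above can be reduced to a small sketch. The function and its `tokenize_only` flag are hypothetical and stand in for the difference between calling tokenizer() and calling parse() or parseFragment(); the real tokenizer is a state machine, not a per-character helper.

```python
NUL = "\u0000"
REPLACEMENT_CHARACTER = "\ufffd"


def emit_for_data_or_cdata(ch: str, tokenize_only: bool) -> str:
    """Return the character emitted for ch in the DATA or CDATA section state.

    tokenize_only is a hypothetical flag: True when no tree construction
    is being performed.
    """
    if ch == NUL and not tokenize_only:
        # Tree construction case, as described in this commit message.
        return REPLACEMENT_CHARACTER
    # Tokenization-only case: U+0000 passes through unchanged.
    return ch
```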
For more background on the relevant behavior, see the following:
* https://www.w3.org/Bugs/Public/show_bug.cgi?id=9659
* https://github.com/whatwg/html/commit/d98f83e
* https://github.com/validator/htmlparser/commit/9b9c263
Relates to https://github.com/validator/htmlparser/issues/35
Differential Revision: https://phabricator.services.mozilla.com/D122721
When the parser encounters a `</template>` end tag and there are other
open elements, the HTML spec requires the parser to “generate all
implied end tags thoroughly”, which unlike “generate implied end tags”
also includes generating implied end tags for table-parts elements
(caption, colgroup, tbody, thead, tfoot, td, th, and tr).
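The difference between the two algorithms can be sketched as below. The stack is modeled as a plain list of element names, which is an assumption made for the example; the name sets follow the HTML spec's "generate implied end tags" and "generate all implied end tags thoroughly".

```python
IMPLIED_END_TAG_NAMES = {
    "dd", "dt", "li", "optgroup", "option", "p", "rb", "rp", "rt", "rtc",
}
# Extra names covered only by the "thoroughly" variant.
TABLE_PART_NAMES = {
    "caption", "colgroup", "tbody", "td", "tfoot", "th", "thead", "tr",
}


def generate_implied_end_tags(stack, thoroughly=False):
    """Pop elements from the end of stack while the current node's name is
    in the applicable set."""
    names = IMPLIED_END_TAG_NAMES
    if thoroughly:
        names = IMPLIED_END_TAG_NAMES | TABLE_PART_NAMES
    while stack and stack[-1] in names:
        stack.pop()
    return stack
```

Starting from `html, template, table, tbody, tr, td, p`, the plain variant pops only `p`, while the thorough variant pops everything down to `table`.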
Differential Revision: https://phabricator.services.mozilla.com/D82020
Doing `errUnclosedElements(eltPos, "template")` for EOF in the “in
template” state results in the error message “End tag `template` seen, but
there were open elements”, which is all wrong because the actual problem is
that though a `template` end tag was expected, EOF was reached without a
`template` end tag being seen.
So instead, when we reach EOF in that state, let’s just report the list of open elements.
Differential Revision: https://phabricator.services.mozilla.com/D122598
This results in lots of new WPT test passes.
There were also a couple of WPT tests that turned out to be broken;
tab-size-inline-001 and -002 had errors in their reference files such
that they'd never pass anywhere. So those are fixed here.
Depends on D117331
Differential Revision: https://phabricator.services.mozilla.com/D117332
- Add missing include directives and forward declarations.
- Remove some extra include directives.
- Add missing namespace qualifications.
- Move include directives out of namespace in toolkit/xre/GlobalSemaphore.h
Differential Revision: https://phabricator.services.mozilla.com/D98894
Previously, the DocGroup type was not cycle-collected, as it needed to have
references from other threads for Quantum DOM. Nowadays the only off-main-thread
use of DocGroup is for dispatching runnables to the main thread which should be
tracked using a performance counter for about:performance. This means we can
remove the DocGroup references from these dispatching callsites, only storing
the Performance Counter we're interested in, and simply make DocGroup be
cycle-collected itself.
This fixes a leak caused by adding the WindowGlobalChild getter to
WindowContext, by allowing cycles between the document and its BrowsingContext
to be broken by the cycle-collector.
Differential Revision: https://phabricator.services.mozilla.com/D108865
Note that this patch only transforms the use of the nsDataHashtable type alias
to a directly equivalent use of nsTHashMap. It does not change the specification
of the hash key type to make use of the key class deduction that nsTHashMap
allows for in some cases. That can be done in a separate step, but requires more
attention.
Differential Revision: https://phabricator.services.mozilla.com/D106008
GetValue is going to be removed in a subsequent patch. It is no longer needed,
because it can be replaced by functions already provided by nsBaseHashtable,
in particular Lookup and Contains.
Also, its name was confusing, since it specifically returns a pointer that
allows and is intended for modifying the entry within the hashtable, rather
than returning by-value. According to the naming rules to be set on
nsBaseHashtable, it would also need to be renamed to "Lookup*". Removing
its uses saves this effort.
Differential Revision: https://phabricator.services.mozilla.com/D105476
This makes the naming more consistent with other functions called
Insert and/or Update. Also, it removes the ambiguity whether
Put expects that an entry already exists or not, in particular because
it differed from nsTHashtable::PutEntry in that regard.
Differential Revision: https://phabricator.services.mozilla.com/D105473
There are no code changes, only #include changes.
It was a fairly mechanical process: Search for all "AUTO_PROFILER_LABEL", and in each file, if only labels are used, convert "GeckoProfiler.h" into "ProfilerLabels.h" (or just add that last one where needed).
In some files, there were also some marker calls but no other profiler-related calls, in these cases "GeckoProfiler.h" was replaced with both "ProfilerLabels.h" and "ProfilerMarkers.h", which still helps in reducing the use of the all-encompassing "GeckoProfiler.h".
Differential Revision: https://phabricator.services.mozilla.com/D104588
This is equivalent to the check in Document::MaybePreLoadImage, since
otherwise this code won't check the preload service at all. Without
this, we can trigger the assertion in PreloaderBase::NotifyOpen when we
have identical Link header and speculative link element preloads.
It's neither a correctness nor a perf issue, because the CSS loader will
coalesce the stylesheet load anyway, but it seems better to do this
than to remove the assertion, especially given images already do that.
The test-case in the following patch triggers the issue.
Depends on D103568
Differential Revision: https://phabricator.services.mozilla.com/D103569
This is an issue I found while going through this code and
writing/debugging a test for the bug at hand. Without this, the test in
the actual fix for this bug will fail to actually reuse the preloaded
stylesheet.
It seems reasonable to assume that the intersection of quirks mode
documents using link preload headers is small (and in that case we'd
parse the sheet twice, but oh well).
Differential Revision: https://phabricator.services.mozilla.com/D103567
<link media> applies to both <link rel="stylesheet"> (see
HTMLLinkElement::GetStyleSheetInfo) and all link rel="preload" links,
regardless of as value (see HTMLLinkElement::CheckPreloadAttrs), so pass it
down and check them for all of those cases.
Note that in the <link rel="stylesheet"> case we'd still have to load it, but
it doesn't block rendering and we defer its loading until more important
stylesheets are done (see SheetLoadData::ShouldDefer() which returns false if
the media attribute didn't match). So speculatively loading it seems
counter-productive.
Differential Revision: https://phabricator.services.mozilla.com/D103565
Just drive-by cleanup, no behavior change.
There's a `using namespace mozilla`, so also remove some useless
namespace qualifications while at it.
Depends on D103562
Differential Revision: https://phabricator.services.mozilla.com/D103563
Take a step towards replacing the encoding menu with a single menu item that
triggers the autodetection manually. However, don't remove anything for now.
* Add an autodetect item.
* Add telemetry for autodetect used in session.
* Add telemetry for non-autodetect used in session.
* Restore and revise telemetry for how the encoding that is being overridden
was discovered.
Differential Revision: https://phabricator.services.mozilla.com/D81132
Storing the charset on cache entries makes the code path uselessly different
when loading from cache relative to uncached loads. Also, for future
telemetry purposes, caching the charset obscures its original source.
Differential Revision: https://phabricator.services.mozilla.com/D101570
For custom element reaction invocation in the parser, we need a way to distinguish whether a tree-appending operation comes from a fragment or not,
since we don't want to execute reactions until innerHTML finishes running.
(See spec changes https://github.com/whatwg/html/issues/4025)
We don't need to do anything for opAppendText and opAppendCommand since the text and command won't have any chance to
be a custom element.
Differential Revision: https://phabricator.services.mozilla.com/D10226
Allow-list all Python code in tree for use with the black linter, and re-format all code in-tree accordingly.
To produce this patch I did all of the following:
1. Make changes to tools/lint/black.yml to remove include: stanza and update list of source extensions.
2. Run ./mach lint --linter black --fix
3. Make some ad-hoc manual updates to python/mozbuild/mozbuild/test/configure/test_configure.py -- it has some hard-coded line numbers that the reformat breaks.
4. Make some ad-hoc manual updates to `testing/marionette/client/setup.py`, `testing/marionette/harness/setup.py`, and `testing/firefox-ui/harness/setup.py`, which have hard-coded regexes that break after the reformat.
5. Add a set of exclusions to black.yml. These will be deleted in a follow-up bug (1672023).
# ignore-this-changeset
Differential Revision: https://phabricator.services.mozilla.com/D94045
I guess in order to make this 100% sound we should check the whole
template mode stack, but that seemed more expensive than what I'd really
like, and I think it's not likely to be an issue in practice (maybe we
can too-eagerly preload some images inside tables inside templates, or
something of that sort?).
Differential Revision: https://phabricator.services.mozilla.com/D92773
The `clobber` targets are superseded by `mach clobber`, so we don't need them for any reason. The `clean` target is meant to get you to a post-`configure` state, but it doesn't really work, and if it's necessary for you to be in that state for some reason you can just clobber and re-`configure`, so it doesn't seem worth it to get it working again. Instead, delete all of them. Also delete `everything` which is not useful when `clobber` doesn't exist.
Differential Revision: https://phabricator.services.mozilla.com/D93514
Mostly mechanical change, with some extra work where non-literal names are provided.
Also, when this is the only profiler call in a file, `#include "GeckoProfiler.h"` can be changed to `#include "mozilla/ProfilerMarkers.h"`.
Differential Revision: https://phabricator.services.mozilla.com/D89415
Unlike other engine vendors, we process meta elements
in the parser, instead of when they are inserted. This
leads to some web compat issues.
This patch aligns us with other vendors.
Differential Revision: https://phabricator.services.mozilla.com/D84545
* Replace org.xml.sax.helpers.LocatorImpl with nu.validator.htmlparser.impl.LocatorImpl
* Check instanceof before casting to Locator2
Differential Revision: https://phabricator.services.mozilla.com/D82153
In `nu.validator.htmlparser.impl.TreeBuilder`, this change adds
`@SuppressWarnings("unchecked")` to the `getUnusedStackNode()` method.
Differential Revision: https://phabricator.services.mozilla.com/D82149