Fix recipe import from some WordPress sites

I want to import recipes from a site that uses WordPress Recipe Maker.

At least on this site, the schema is structured like so:

{
  "@context": "https://schema.org/",
  "@graph": [
    { /* blah */ },
    { /* blah */ },
    { "@type": "Recipe", "otherStuff": true }
  ]
}

Notably missing is the @context on the Recipe object. It's intended
to be inherited from the parent object, but HttpJsonLdParser didn't
support this kind of inheritance.

Well, now it does! =)
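
For illustration, here is a minimal standalone sketch of the inheritance idea described above. The helper names `isSchemaOrgContext` and `findRecipeInGraph` and the sample document are hypothetical; they are not the methods touched by this commit. Unlike the commit, the sketch also skips a Recipe that declares a non-schema.org context of its own:

```php
<?php
// Hypothetical helpers for illustration only; not the real
// HttpJsonLdParser / JsonService code.

/** True if $context is a string naming the schema.org vocabulary. */
function isSchemaOrgContext($context): bool {
    return is_string($context)
        && preg_match('@^https?://schema\.org/?$@', $context) === 1;
}

/**
 * Return the first Recipe found in $json['@graph']. If the recipe has no
 * @context of its own, copy the parent's schema.org @context onto it.
 */
function findRecipeInGraph(array $json): ?array {
    if (!isset($json['@graph']) || !is_array($json['@graph'])) {
        return null;
    }
    $parentSetsContext = isset($json['@context'])
        && isSchemaOrgContext($json['@context']);

    foreach ($json['@graph'] as $item) {
        if (!is_array($item) || ($item['@type'] ?? null) !== 'Recipe') {
            continue;
        }
        if (isset($item['@context'])) {
            // The child declares its own context: accept only schema.org.
            if (isSchemaOrgContext($item['@context'])) {
                return $item;
            }
            continue;
        }
        if ($parentSetsContext) {
            // The child inherits the context from the parent object.
            $item['@context'] = $json['@context'];
            return $item;
        }
    }
    return null;
}

// Made-up sample shaped like the structure shown above.
$doc = json_decode(<<<'JSON'
{
  "@context": "https://schema.org/",
  "@graph": [
    { "@type": "WebSite" },
    { "@type": "Recipe", "name": "Example" }
  ]
}
JSON, true);

$recipe = findRecipeInGraph($doc);
echo $recipe['@context'], PHP_EOL; // prints "https://schema.org/"
```

Copying the context onto the found recipe means later checks that look at the recipe in isolation still see a schema.org object.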

Signed-off-by: Nathaniel <I@nathaniel.land>
Nathaniel 2024-06-21 14:24:09 -05:00
Parent 2859faba06
Commit ae5d8a2d25
No key found matching this signature
GPG key ID: DA03B047248BC6C3
2 changed files: 24 additions and 5 deletions

@@ -122,9 +122,17 @@ class HttpJsonLdParser extends AbstractHtmlParser {
      */
     private function mapGraphField(array &$json) {
         if (isset($json['@graph']) && is_array($json['@graph'])) {
-            $tmp = $this->searchForRecipeInArray($json['@graph']);
+            // Sometimes the context is set once on the top level object for children to inherit
+            $parentSetsContext = isset($json['@context']) &&
+                $this->jsonService->isSchemaContext($json['@context']);
+            $tmp = $this->searchForRecipeInArray($json['@graph'], $parentSetsContext);
             if ($tmp !== null) {
+                // If the child wants to inherit context from parent, copy it on down
+                if ($parentSetsContext && !isset($tmp['@context'])) {
+                    $tmp['@context'] = $json['@context'];
+                }
                 $json = $tmp;
             }
         }
@@ -153,13 +161,14 @@ class HttpJsonLdParser extends AbstractHtmlParser {
     /**
      * Search for a recipe object in an array
      * @param array $arr The array to search
+     * @param bool $haveSchemaContext Whether Schema context is given, so child needn't set it
      * @return array|NULL The found recipe or null if no recipe was found in the array
      */
-    private function searchForRecipeInArray(array $arr): ?array {
+    private function searchForRecipeInArray(array $arr, bool $haveSchemaContext = false): ?array {
         // Iterate through all objects in the array ...
         foreach ($arr as $item) {
             // ... looking for a recipe
-            if ($this->jsonService->isSchemaObject($item, 'Recipe', true, false)) {
+            if ($this->jsonService->isSchemaObject($item, 'Recipe', !$haveSchemaContext, false)) {
                 // We found a recipe in the array, use it
                 return $item;
             }

@@ -21,13 +21,13 @@ class JsonService {
      * @return bool true, if $obj is an object and optionally satisfies the type check
      */
     public function isSchemaObject($obj, ?string $type = null, bool $checkContext = true, bool $uniqueType = true): bool {
-        if (!is_array($obj)) {
+        if (! is_array($obj)) {
             // Objects must be encoded as arrays in JSON
             return false;
         }
         if ($checkContext) {
-            if (!isset($obj['@context']) || !preg_match('@^https?://schema\.org/?$@', $obj['@context'])) {
+            if (!isset($obj['@context']) || ! $this->isSchemaContext($obj['@context'])) {
                 // We have no correct context property
                 return false;
             }
@@ -64,6 +64,16 @@ class JsonService {
         return (strcmp($obj['@type'], $type) === 0);
     }
+    /**
+     * Check if the value of a @context key matches that of a schema.org object
+     *
+     * @param string $context The value of some object's @context property
+     * @return bool true, if the context matches that of a schema.org object
+     */
+    public function isSchemaContext(string $context): bool {
+        // preg_match() returns int|false, so compare explicitly to get a bool
+        return preg_match('@^https?://schema\.org/?$@', $context) === 1;
+    }
     /**
      * Check if $obj is a schema.org object and contains a named property.
      *