Google Structured Data Testing Tool doesn't like mixing Schema.org and Dublin Core in our RDFa #1074
Comments
That last bit with the subjects is my fault. That comes from the subject's href linking to a resource the tester can't access. It actually works fine. |
@seth-shaw-unlv If we give a resource both a schema and dc type then is it ok? Making everything a schema:article by default (in addition to pcdm) seems appropriate. Not sure about dc (I'm assuming there's a generic thing from the dcmi types we can use or something). In theory that should appease our semantic robot overlords. |
@dannylamb Nope. Still grumpy. I added dcterms:BibliographicResource to the RDFa and Google was still mad, because the dcterm is being applied to a schema Type. It looks like Google doesn't like other vocabularies being used near schema things. It looks like either all schema for a Node or none at all as far as Google is concerned. In semi-related news, the Schema.org architypes proposal was accepted and added to schema.org! This makes schema-only descriptions a bit easier to do. BTW, has anyone thought to do a Dublin Core -> schema.org comparison/map? Could a repository conceivably abandon Dublin Core for pure schema.org without (much) loss? |
@seth-shaw-unlv, 2 cents here: this is one of the reasons why mixing and matching properties from different ontologies is not such a good idea unless you make sure each property is valid in the other's class definition/domain/ontology. It's a bit like the work on MODS to RDF mapping that happened in that great working group: it works for internal use, but is not semantically correct for exposing the data to the outside (and by saying that I now deserve to be hated). Google tries to apply its ontology validation correctly, and in that one, if an object is of type schema:Thing, only properties in that domain are valid. And Google cannot do ontology intersection, alignment, or inference, so specifically in RDFa it will try to match any given property against all classes. A better way of getting away with this is to avoid other ontologies in the RDFa (stick with schema) but embed JSON-LD as a script in the body. It is what Zenodo and DataCite are doing with great success. In that case your JSON-LD can have many contexts and Google will not complain (namespaces will also match, because the expansion will only apply to the right RDF (or OWL) class). Still, it's good to check whether a certain group of properties can freely be moved between ontologies; I highly recommend not doing that without validating. |
@DiegoPino I'm not seeing any examples that would allow us to use multiple ontologies in the JSON-LD without Google freaking out. The multiple contexts seem to mostly be used as namespace definitions (multiple mappings of predicates to field names), but the resulting set of edges still mixes ontologies. The DataCite examples of JSON-LD I found (https://blog.datacite.org/schema-org-register-dois/) only use schema.org. Having one set in the JSON-LD script tag and another in the RDFa doesn't work because Google appears to ignore the RDFa when it finds JSON-LD. So, really, it looks like anything we want to hand off to Google needs ontological consistency, but we can index whatever we want in our Fedora and triple-store. This implies to me that we need to keep the JSON-LD just for indexing and have some way to either filter what gets pushed into the RDFa vs. the JSON-LD OR keep separate configs for each. |
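To illustrate the point about contexts acting as namespace definitions, here is a minimal sketch. It is a simplified prefix expansion, not the full JSON-LD expansion algorithm, and the node, its values, and the helper names (`expand_term`, `expanded_predicates`) are hypothetical. After expansion, the predicates on the one node still come from two vocabularies.

```python
# Hypothetical prefix map, standing in for a JSON-LD @context with two namespaces.
CONTEXT = {
    "schema": "http://schema.org/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical compact node: a schema.org type carrying one schema.org
# property and one Dublin Core property.
node = {
    "@type": "schema:ImageObject",
    "schema:name": "Example image",
    "dcterms:extent": "1 photograph",
}


def expand_term(term, context):
    """Expand a 'prefix:suffix' term to a full IRI using the prefix map."""
    prefix, sep, suffix = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + suffix
    # Keywords such as @type, or terms that are already absolute IRIs.
    return term


def expanded_predicates(node, context):
    """Return the sorted, expanded property IRIs of a node (keywords skipped)."""
    return sorted(expand_term(k, context) for k in node if not k.startswith("@"))


predicates = expanded_predicates(node, CONTEXT)
for p in predicates:
    print(p)
```

The context only changes how the terms are written; the edges on the node are the same mix of schema.org and Dublin Core either way.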
Hi, I will share some examples with you tomorrow (on the phone now). Google can handle some other stuff if it is inside JSON-LD. Contexts can in fact contain many ontologies (that's what namespaces are for, among other things); e.g. the IIIF Presentation context uses quite a few. But also, you just answered your own issue :). Since your Islandora 8 architecture gives you basically no way to remove some predicates from the RDFa without affecting every other mapping you have in Drupal to talk to Fedora, etc., embedding a simpler JSON-LD (and by that I mean schema.org only, which seems the lowest barrier) ensures Google is happy while you keep your full-blown mix and match for your RDFa and triple store needs. Seems like a win-win situation. Now you just need to embed it.
--
Diego Pino Navarro
Digital Repositories Developer
Metropolitan New York Library Council (METRO)
|
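The "embed a schema.org-only JSON-LD script" idea from this comment might look roughly like the sketch below. The record, its values, and the helper name `jsonld_script_tag` are hypothetical; this is not how Islandora or any Drupal module actually renders the element.

```python
import json


def jsonld_script_tag(data):
    """Serialize a dict as JSON-LD wrapped in a <script> element."""
    payload = json.dumps(data, indent=2)
    # Escape "</" so a string value containing "</script>" cannot close the
    # element early; "\/" is a valid JSON escape for "/".
    payload = payload.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


# Hypothetical schema.org-only description of one repository item.
record = {
    "@context": "http://schema.org",
    "@type": "ImageObject",
    "name": "Example image",
    "sameAs": "https://example.org/node/1",
}

tag = jsonld_script_tag(record)
print(tag)
```

The full mixed-vocabulary description can then stay in the RDFa and the triple store, while only this reduced record is handed to the search engine.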
@seth-shaw-unlv @DiegoPino https://www.drupal.org/project/schema_metatag does just that. We can set up how we want stuff for Google and that gets embedded as JSON-LD. At that point there is a discrepancy between the RDFa and the embedded JSON-LD and what goes in Fedora/triplestore, but I guess Google's behaviour works in our favor there with respect to RDFa vs. embedded JSON-LD. And really, we have no choice but to separate what Google wants from how users choose to model their data. |
Based on the devel call this week, this issue will likely wait until someone has an Islandora 8 site live and indexed by Google/Bing so we can test the real-world impact of multiple ontologies. If it truly is a problem, then we can probably have a module pull in the JSON-LD and do a simple filter or map so only schema.org appears in the page's script element and trim the "_format=jsonld" off the URIs. |
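A rough sketch of the "simple filter" idea follows. It assumes the JSON-LD node uses full IRIs as keys and that "_format=jsonld" appears as the sole query string ("?_format=jsonld"); both are assumptions, as are the sample node and the function names.

```python
SCHEMA_PREFIXES = ("http://schema.org/", "https://schema.org/")
FORMAT_SUFFIX = "?_format=jsonld"  # assumed form of the query string


def trim_format(uri):
    """Drop a trailing '?_format=jsonld' from a URI, if present."""
    return uri[: -len(FORMAT_SUFFIX)] if uri.endswith(FORMAT_SUFFIX) else uri


def _trim_values(values):
    """Trim the format suffix from any {'@id': ...} references in a value list."""
    out = []
    for v in values:
        if isinstance(v, dict) and "@id" in v:
            v = {**v, "@id": trim_format(v["@id"])}
        out.append(v)
    return out


def schema_only(node):
    """Keep only schema.org types and properties of one expanded node."""
    kept = {"@id": trim_format(node["@id"])}
    types = [t for t in node.get("@type", []) if t.startswith(SCHEMA_PREFIXES)]
    if types:
        kept["@type"] = types
    for key, values in node.items():
        if key.startswith(SCHEMA_PREFIXES):
            kept[key] = _trim_values(values)
    return kept


# Hypothetical node mixing PCDM, Dublin Core, and schema.org.
node = {
    "@id": "http://localhost:8000/node/1?_format=jsonld",
    "@type": ["http://pcdm.org/models#Object", "http://schema.org/ImageObject"],
    "http://purl.org/dc/terms/title": [{"@value": "Example"}],
    "http://schema.org/name": [{"@value": "Example"}],
    "http://schema.org/author": [{"@id": "http://localhost:8000/user/1?_format=jsonld"}],
}

filtered = schema_only(node)
print(filtered)
```

A mapping step (for example Dublin Core title to schema.org name) could be added in the same loop, but which terms map to which is a modelling decision this sketch does not make.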
Related to our SEO issue #882:
This is probably a documentation issue, but the Google Structured Data Tool doesn't like mixing Dublin Core and Schema.org terms.
Declaring something with a Schema.org Type (e.g. schema:ImageObject) and adding Dublin Core elements to it will throw errors because those properties are not in their scope. E.g.
The reverse is also true: if you add Schema.org properties (e.g. schema:sameAs) to something that doesn't have a Schema.org type, it will also complain:
It doesn't complain that the property is there, just that the PCDM type is not known to Google.
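The two cases described above can be approximated with a small check. This is only a sketch of the observed behaviour, not Google's actual validation logic; the function name and the sample IRIs are illustrative.

```python
SCHEMA = "http://schema.org/"


def vocabulary_warnings(types, properties):
    """Flag the two mixing patterns described in this issue for one node."""
    has_schema_type = any(t.startswith(SCHEMA) for t in types)
    warnings = []
    for prop in properties:
        is_schema_prop = prop.startswith(SCHEMA)
        if has_schema_type and not is_schema_prop:
            # Case 1: non-schema.org property on a schema.org-typed node.
            warnings.append(f"{prop}: not a schema.org property on a schema.org type")
        elif not has_schema_type and is_schema_prop:
            # Case 2: schema.org property on a node with no schema.org type.
            warnings.append(f"{prop}: schema.org property on a node without a schema.org type")
    return warnings


# Case 1: schema:ImageObject with a Dublin Core property.
case1 = vocabulary_warnings(
    ["http://schema.org/ImageObject"],
    ["http://schema.org/name", "http://purl.org/dc/terms/title"],
)

# Case 2: PCDM-only type with schema:sameAs.
case2 = vocabulary_warnings(
    ["http://pcdm.org/models#Object"],
    ["http://schema.org/sameAs"],
)

print(case1)
print(case2)
```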