-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extract parsed CSS property value definitions #494
Comments
I think that it makes more sense to only have unparsed CSS (and IDL) texts in webref. So I would favor publishing css-grammar-parser.js as a standalone parser. For instance, in the case of IDL and as discussed in #489 (comment), the serialized version of the parsed structure as JSON is good for analyses but it cannot be used directly to re-serialize the structure as an IDL string or as HTML (other info, present in the in-memory AST when the original IDL text is parsed, is needed to be able to call the WebIDL writer). In other words, these parsed objects are only half-useful, depending on what you want to use them for. Ping @dontcallmedom who developed the CSS grammar parser and who usually loves to make tools standalone :) |
I'm absolutely supportive of moving css-grammar-parser.js to its own package if there is a use for it; I'm more neutral on whether having parsed or unparsed content in webref, but I think it's a reasonable path forward to start by just having the unparsed fragments and the parser available separately. |
Just an FYI but css-tree also includes a parser/generator for CSS value definition syntax: |
@foolip I'm happy to look into moving the css grammar parser as a separate tool, but I wonder if the parser provided in CSS Tree would already fulfill that need - could you check if it fits the bill? if not, I'll be happy to package ours out. |
@dontcallmedom would it make sense to try using css-tree in Reffy? I imagine it would reveal a bunch of cases which are handled differently by each parser, but if we're lucky css-tree might handle more things than what css-grammar-parser.js does? It would be a nice outcome if CSS definitions and IDL definitions are handled similarly, both using an externally maintained library, but where Reffy depends on a specific version, so that consumers of webref can use that same version and be confident it will work. |
That sounds like a good goal indeed; I'm happy to check if css-tree is able to parse as much as our own parser, but I'm not sure yet how to enforce the broader invariant - will have to think more about it. |
I assume that it would be hopeless to ever get all specs to be valid at the same time, so just like for IDL, we'd need to take an approach of applying local fixes, see w3c/webref#63. |
so the invariant you suggest is that all the CSS fragments available in |
here are the errors that I get today trying to parse the said values with the csstree parser
(haven't looked in the details yet) |
One note about the above output. SVG specs use some custom notations not included in the CSS value definition syntax. See w3c/svgwg#642 (comment) Also, css-tree recently implemented a service that can detect problematic syntax in https://github.com/w3c/csswg-drafts/. It would probably be better (easier) to adapt this service to use webref rather than parsing the repo directly. |
@dontcallmedom that is indeed the invariant I have in mind, with manual fixups to make it possible in practice. Can you tell at a glance if css-tree handles more or less of the scraped syntax than the existing parser? |
I can explain the output.
Looks like missed escaping
Not pretty sure here. CSS Values & Units spec (where CSS Definition Syntax is defined) says nothing about sequence of combinators (is it legit or not). Tab Atkins commented that "stacked combinators" are legit. However, there are just 3 places across all the specs, where it used and, for sure, it can be avoided. I guess it should be discussed and a resolution added to the spec (CSS Values & Units).
Definitions can't have comments, it should be removed.
Invisible zero-width symbol here. I had already fixed that w3c/csswg-drafts#4275, not sure why it appeared again.
Combinators are never used with a comma, a parenthesis etc. I believe
|
@lahmatiy Why couldn't combinators be used with a comma? According to the Value Definition Syntax, the literal Also, the
Well, 3 places is already a good start ;) Based on the discussion you reference, there seems to be agreement in the CSS WG already that stacked operators are the right way to express certain constraints. Any plan to support that in the parser? |
I note that the Value Definition Syntax is clear that comments (and whitespaces) are allowed pretty much everywhere: That said, in this particular case, Reffy's extraction logic should simply not assign the comment to this value, see #534. |
looking in the stacked modifiers discussion, I've filed an issue to try to get more clarity at least in the spec about which modifiers are allowed to be stacked w3c/csswg-drafts#6085 |
Code follows the same structure as for the `@webref/idl` package. Tests make sure that all CSS fragments can be parsed with the Value Definition Syntax parser of CSSTree. A couple of CSS patches need to be applied to the extracts to fix value definitions. Tests currently only guarantee that the value definitions can be parsed with CSSTree, with the exception of a handful value definitions that are a priori valid but that cannot be parsed with the CSSTree parser for the time being, see discussion in: w3c/reffy#494 (comment) Co-authored-by: Philip Jägenstedt <philip@foolip.org>
Code follows the same structure as for the `@webref/idl` package. Tests make sure that all CSS fragments can be parsed with the Value Definition Syntax parser of CSSTree. A couple of CSS patches need to be applied to the extracts to fix value definitions. Tests currently only guarantee that the value definitions can be parsed with CSSTree, with the exception of a handful value definitions that are a priori valid but that cannot be parsed with the CSSTree parser for the time being, see discussion in: w3c/reffy#494 (comment) Co-authored-by: Philip Jägenstedt <philip@foolip.org>
Closing the issue as css-tree added support for all missing syntaxes and successfully parses all CSS value definitions. Enforced through tests in webref. |
Back #113 (comment), @tidoust pointed out that reffy parses CSS value definitions and saves the result in a
parsedValue
incrawl.json
, which for properties happens here:reffy/src/cli/crawl-specs.js
Lines 132 to 141 in eba2d30
In order to test CSS property keyword values for BCD, it would be convenient to have this extracted into webref, or perhaps if css-grammar-parser.js was published as a standalone parser which one could use together with webref.
#110 has a burndown list of remaining syntax issues, but getting everything to 100% isn't necessary in order to provide value here.
The text was updated successfully, but these errors were encountered: