I'm looking at this
<production name="IdentifierStart" rr:inline="true" oc:lexer="true">
  <description>
    Based on the unicode identifier and pattern syntax
    (http://www.unicode.org/reports/tr31/)
    And extended with a few characters.
  </description>
  <alt>
    <character set="ID_Start"/>
    <character set="Pc"/> <!-- Punctuation connectors (underscores) -->
  </alt>
</production>

<production name="IdentifierPart" rr:inline="true" oc:lexer="true">
  <description>
    Based on the unicode identifier and pattern syntax
    (http://www.unicode.org/reports/tr31/)
    And extended with a few characters.
  </description>
  <alt>
    <character set="ID_Continue"/>
    <character set="Sc"/> <!-- Currency symbols -->
  </alt>
</production>
Well, actually I'm looking at the EBNF version that can be downloaded from the website (the legacy one).
How should I implement this? When you look up Unicode ID_Start, there is a jungle of documentation, but I only need the regex for it. Fortunately, Unicode provides a tool for this: https://unicode.org/cldr/utility/regex.jsp?a=%5B%3AID_Start%3A%5D&b=
That leaves the question: what does "And extended with a few characters" mean?
So perhaps, to clear up this section of the spec, some of the following could be done:
- Specify the "extended few characters".
- Give the regular expressions in the comments.
- Link to the Unicode tools.
I think the Unicode tool gives the regular expression in some standard regex format, but other engines such as PCRE may define their own shorthand classes that could be used instead. Therefore it would be useful to include the regular expressions in comments.
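To make the request concrete, here is a minimal sketch of what such a regular expression could look like. It reads the two `<alt>` blocks literally, so it assumes the "few characters" are exactly the `Pc` and `Sc` general categories, which the spec does not confirm. It also assumes an engine with Unicode property escapes (ECMAScript 2018 or later with the `u` flag). The names `identifierStart`, `identifierPart`, `identifierRegex` and `isIdentifier` are made up for illustration, and combining the two productions as "one start character followed by any number of part characters" is my assumption too.

```typescript
// Sketch only, not taken from the openCypher spec.
// Assumes ECMAScript 2018+ Unicode property escapes (requires the "u" flag).

// IdentifierStart: ID_Start plus Pc (connector punctuation such as "_").
const identifierStart = "[\\p{ID_Start}\\p{Pc}]";

// IdentifierPart: ID_Continue plus Sc (currency symbols such as "$").
const identifierPart = "[\\p{ID_Continue}\\p{Sc}]";

// Assumption: an identifier is one start character followed by zero or
// more part characters. The quoted productions do not state this themselves.
const identifierRegex = new RegExp(
  `^${identifierStart}${identifierPart}*$`,
  "u",
);

// Hypothetical helper: true if the whole string matches the pattern above.
function isIdentifier(text: string): boolean {
  return identifierRegex.test(text);
}
```

With these definitions, `_foo` and `foo$` are accepted while `$foo` and `1abc` are rejected, because `$` and digits are only allowed after the first character. Engines without `\p{ID_Start}` support would need the expanded character ranges from the Unicode tool instead.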
(Snippet source: openCypher/grammar/basic-grammar.xml, lines 718 to 740 in a514465.)