Fixing the string tokenization #199

Merged
merged 4 commits into from
Nov 29, 2021

Conversation

Contributor

@mamazu mamazu commented Nov 22, 2021

The current scanner splits the query on spaces and then, in a second pass, collects the whitespace again so that the QueryConverter can put string literals back together. This is wasteful and breaks down on bigger queries (as we found out, even preg_match has a size limitation).

With this merge request, the scanner parses each string in the query directly into a single token, which also simplifies the QueryConverter.
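
To illustrate the idea, here is a minimal sketch of a single-pass tokenizer that keeps a quoted string together as one token. It is not the code from this pull request: the function name `sketchTokenize` is hypothetical, and escape sequences are deliberately left out.

```php
<?php

/**
 * Hypothetical single-pass tokenizer (not the actual Sql2Scanner code).
 * Splits on whitespace, but keeps everything between matching quotes
 * in one token. Escape sequences are not handled in this sketch.
 *
 * @return string[]
 */
function sketchTokenize(string $sql): array
{
    $tokens = [];
    $current = '';
    $quote = null; // the quote character we are currently inside, or null
    $length = strlen($sql);

    for ($i = 0; $i < $length; $i++) {
        $char = $sql[$i];

        if ($quote !== null) {
            // Inside a string literal: copy everything, including spaces.
            $current .= $char;
            if ($char === $quote) {
                $quote = null;
            }
            continue;
        }

        if ($char === '"' || $char === "'") {
            $quote = $char;
            $current .= $char;
            continue;
        }

        if (ctype_space($char)) {
            // Whitespace outside a string ends the current token.
            if ($current !== '') {
                $tokens[] = $current;
                $current = '';
            }
            continue;
        }

        $current .= $char;
    }

    if ($current !== '') {
        $tokens[] = $current;
    }

    return $tokens;
}

print_r(sketchTokenize('SELECT page.* FROM [nt:unstructured] AS page WHERE name = "Hello world"'));
```

In this sketch the literal `"Hello world"` ends up as the last token, so no second pass over the whitespace is needed to reassemble it.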

Member

@dbu dbu left a comment

thanks for this pull request, looks like a very good idea. i also really like the tests you added.

i ran the test suite of jackalope-doctrine-dbal with your changes, and unfortunately there are some failures:

Caused by
PHPCR\Query\InvalidQueryException: Error parsing query, unknown query part "'"" in: 
            SELECT data.quotes
            FROM [nt:unstructured] AS data
            WHERE data.quotes = "\"'"        

vendor/phpcr/phpcr-utils/src/PHPCR/Util/QOM/Sql2ToQomQueryConverter.php:126
src/Jackalope/Transport/DoctrineDBAL/Client.php:2453
vendor/jackalope/jackalope/src/Jackalope/Query/Query.php:107
vendor/phpcr/phpcr-api-tests/tests/Query/CharacterTest.php:86

a couple of special cases with escaping seem to get broken in this PR:
https://github.com/phpcr/phpcr-api-tests/blob/master/tests/Query/CharacterTest.php

as well as the phpcr-api-tests cases PHPCR\Tests\Query\QOM\ConvertQueriesBackAndForthTest::testBackAndForth and PHPCR\Tests\Query\QOM\Sql2ToQomConverterTest::testQueries

could you please look into those? we should probably duplicate some of the escape testing into the unit tests in this repository.

$scanner = new Sql2Scanner(<<<'SQL'
SELECT page.*
FROM [nt:unstructured] AS page WHERE name ="Hello world"
SQL);
Member

this seems to not work with PHP 7.1

which made me realize we should also build with 7.2 and 7.3 - i added that to master.

while i don't mind dropping old PHP versions for good reasons, i would prefer here to use the old nowdocs syntax for this test.

Contributor Author

You mean removing the quotes around the SQL in the HEREdoc string? I did that.

Member

reading https://www.php.net/manual/en/language.types.string.php#language.types.string.syntax.nowdoc i think the problem is the )

i think you best define the query as a string like in the test above. SQL; must stand on its own line without any other code and no indentation.
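
As a sketch of what that means in practice: before PHP 7.3, the closing identifier of a heredoc or nowdoc has to start in the first column, and its line may contain nothing else except an optional semicolon. The variable names below are only for illustration.

```php
<?php

// Nowdoc in the form that also works on PHP versions before 7.3:
// the closing SQL starts in column one and is followed only by ';'.
$query = <<<'SQL'
SELECT page.*
FROM [nt:unstructured] AS page WHERE name ="Hello world"
SQL;

// When the nowdoc is a function argument, the closing parenthesis
// goes on the next line instead of directly after the identifier.
$length = strlen(<<<'SQL'
SELECT page.*
SQL
);

echo $query, PHP_EOL, $length, PHP_EOL;
```

Writing `SQL);` on one line, as in the snippet under review, relies on the relaxed closing-marker rules introduced in PHP 7.3.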

Contributor Author

Should be fixed.

Now the parser can understand escaped characters in the string
Contributor Author

mamazu commented Nov 23, 2021

Thanks for the quick reply. I am glad that you also checked the test cases of the bigger repository and found a bug in the parser.
I have now added support for escaped characters to the tokenizer, along with tests for it.
This should also speed up the whole project, as we no longer have to do any expensive regex matching.
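
A minimal sketch of how backslash escapes can be handled while reading a string literal is shown below. It assumes that a backslash simply protects the character after it; the helper name `readQuotedString` is hypothetical and not taken from the pull request.

```php
<?php

/**
 * Hypothetical helper (not the code from this pull request): reads one
 * quoted string starting at $start and returns the raw token, quotes
 * included. A backslash makes the following character part of the
 * string, so an escaped quote does not end the literal.
 */
function readQuotedString(string $sql, int $start): string
{
    $quote = $sql[$start];
    $token = $quote;
    $length = strlen($sql);

    for ($i = $start + 1; $i < $length; $i++) {
        $char = $sql[$i];

        if ($char === '\\' && $i + 1 < $length) {
            // Keep the backslash and the escaped character together.
            $token .= $char . $sql[$i + 1];
            $i++;
            continue;
        }

        $token .= $char;
        if ($char === $quote) {
            return $token; // unescaped closing quote
        }
    }

    throw new InvalidArgumentException('Unterminated string literal');
}

// The literal from the failing CharacterTest query: "\"'"
$sql = 'WHERE data.quotes = "\\"\'"';
echo readQuotedString($sql, strpos($sql, '"')), PHP_EOL;
```

With this approach the whole literal `"\"'"` comes back as one token instead of being cut at the escaped quote.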

Member

dbu commented Nov 23, 2021

thanks for the updates. would be interesting to benchmark how much speed we gain with this cleanup.

i still have these 2 tests that fail:

1) PHPCR\Tests\Query\QOM\ConvertQueriesBackAndForthTest::testBackAndForth
PHPCR\Query\InvalidQueryException: Syntax error: Expected ')', found '''' in SELECT * FROM [nt:file] AS file WHERE CONTAINS(file.prop, 'expr''')

vendor/phpcr/phpcr-utils/src/PHPCR/Util/QOM/Sql2Scanner.php:94
vendor/phpcr/phpcr-utils/src/PHPCR/Util/QOM/Sql2ToQomQueryConverter.php:553
vendor/phpcr/phpcr-utils/src/PHPCR/Util/QOM/Sql2ToQomQueryConverter.php:422
vendor/phpcr/phpcr-utils/src/PHPCR/Util/QOM/Sql2ToQomQueryConverter.php:361
vendor/phpcr/phpcr-utils/src/PHPCR/Util/QOM/Sql2ToQomQueryConverter.php:118
vendor/phpcr/phpcr-api-tests/tests/Query/QOM/ConvertQueriesBackAndForthTest.php:69

2) PHPCR\Tests\Query\QOM\Sql2ToQomConverterTest::testQueries
PHPCR\Query\InvalidQueryException: Syntax error: Expected ')', found '''' in SELECT * FROM [nt:file] AS file WHERE CONTAINS(file.prop, 'expr''')

vendor/phpcr/phpcr-utils/src/PHPCR/Util/QOM/Sql2Scanner.php:94
vendor/phpcr/phpcr-utils/src/PHPCR/Util/QOM/Sql2ToQomQueryConverter.php:553
vendor/phpcr/phpcr-utils/src/PHPCR/Util/QOM/Sql2ToQomQueryConverter.php:422
vendor/phpcr/phpcr-utils/src/PHPCR/Util/QOM/Sql2ToQomQueryConverter.php:361
vendor/phpcr/phpcr-utils/src/PHPCR/Util/QOM/Sql2ToQomQueryConverter.php:118
vendor/phpcr/phpcr-api-tests/tests/Query/QOM/Sql2ToQomConverterTest.php:75

Member

dbu commented Nov 23, 2021

@wachterjohannes is sulu using sql2 queries? if so, it would be interesting if you could have a look at these changes to validate if they speed up things and if you hit edge cases (something about quote escaping seems to be amiss, but otherwise it looks good to me). there might be more edge cases buried in it...

Contributor Author

mamazu commented Nov 23, 2021

About the failing tests: what exactly is the token stream that you expect from such an expression? I haven't worked with SQL enough to know how this works.

@wachterjohannes
Contributor

@dbu yes we are using SQL2 Queries which load partial nodes. We will take a look at that! @mamazu have you found that performance issue in sulu?

/cc @alexander-schranz

Contributor Author

mamazu commented Nov 24, 2021

Well, we are currently having outages caused by some very large and complicated SQL queries. Looking into the log files, there were a lot of warnings along these lines (paraphrased):
Could not compile regex pattern size exceeded at character 63032

Since we don't need a regex to parse this expression, I thought this might help. (At least it would get rid of the warning messages.)

Member

dbu commented Nov 25, 2021

i refactored the tests a bit to get information which case actually fails: phpcr/phpcr-api-tests#195

the problem is with this query: https://github.com/phpcr/phpcr-api-tests/blob/9db8e412e1b5a995e6a9ba4645a0a4e8c66c7f16/tests/Query/QOM/Sql2TestQueries.php#L161

it seems that jcr knows a second way to escape literal ' in a string by typing '' 🙈

i think that is missing from your refactoring.
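
As a sketch of what handling that second escape style could look like: when the scanner sees the quote character inside a literal, it peeks at the next character, and a doubled quote is treated as escaped content rather than as the end of the string. The helper name `readSqlQuotedString` is hypothetical.

```php
<?php

/**
 * Hypothetical helper (not the code from this pull request): reads a
 * quoted literal in which a doubled quote character stands for one
 * literal quote, as in 'expr'''. Returns the raw token with quotes.
 */
function readSqlQuotedString(string $sql, int $start): string
{
    $quote = $sql[$start];
    $token = $quote;
    $length = strlen($sql);

    for ($i = $start + 1; $i < $length; $i++) {
        $char = $sql[$i];
        $token .= $char;

        if ($char !== $quote) {
            continue;
        }

        if ($i + 1 < $length && $sql[$i + 1] === $quote) {
            // Doubled quote: an escaped quote, the literal continues.
            $token .= $quote;
            $i++;
            continue;
        }

        return $token; // a lone quote closes the literal
    }

    throw new InvalidArgumentException('Unterminated string literal');
}

$sql = "SELECT * FROM [nt:file] AS file WHERE CONTAINS(file.prop, 'expr''')";
echo readSqlQuotedString($sql, strpos($sql, "'")), PHP_EOL;
```

For the failing query this yields `'expr'''` as a single token, which leaves the closing `)` for the parser to find.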

Contributor Author

mamazu commented Nov 25, 2021

So I have also added the SQL string escaping. I think I have added all the needed tests for it. But I could be wrong.

Contributor Author

mamazu commented Nov 25, 2021

I did some performance testing.

Results

Before (mean time in seconds over 100 runs):

Complicated parse: 0.0023
Simple parse: 0.0000085

After:

Complicated parse: 0.0016 (about 30% faster, and doesn't throw warnings from preg_match)
Simple parse: 0.0000097 (about 14% slower)
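
The percentages follow from the relative change in mean run time. A small sketch of that arithmetic, using a hypothetical helper `relativeChange`:

```php
<?php

/**
 * Relative change of a timing in percent: negative means the new
 * version takes less time, positive means it takes more.
 */
function relativeChange(float $before, float $after): float
{
    return ($after - $before) / $before * 100;
}

printf("Complicated: %.1f%%\n", relativeChange(0.0023, 0.0016));
printf("Simple: %.1f%%\n", relativeChange(0.0000085, 0.0000097));
```

This gives roughly -30.4% for the complicated query and +14.1% for the simple one.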

The script that I used to test it:

<?php

include "vendor/autoload.php";

function timeParse($sql) {
    $runs = [];
    for($i = 0; $i < 100; $i++) {
        $start = microtime(true);
        new \PHPCR\Util\QOM\Sql2Scanner($sql);
        $end = microtime(true);
        $runs[] = $end - $start;
    }

    echo (array_sum($runs) / count($runs));
    echo PHP_EOL;
}

echo "Complicated pass: ";
timeParse(<<<SQL
SELECT n0.path AS n0_path, n0.identifier AS n0_identifier, n0.props AS n0_props FROM phpcr_nodes n0 WHERE n0.workspace_name = 'default_live' AND n0.type IN ('nt:unstructured', 'rep:root') AND ((EXTRACTVALUE(n0.props, 'count(//sv:property[@sv:name="jcr:mixinTypes"]/sv:value[text()="sulu:page"]) > 0') OR EXTRACTVALUE(n0.props, 'count(//sv:property[@sv:name="jcr:mixinTypes"]/sv:value[text()="sulu:home"]) > 0')) AND ((0 != FIND_IN_SET("2", REPLACE(EXTRACTVALUE(n0.props, '//sv:property[@sv:name="i18n:de_de-state"]/sv:value'), " ", ",")) OR EXTRACTVALUE(n0.props, '//sv:property[@sv:name="i18n:de_de-shadow-on"]/sv:value[1]') = '1') AND (((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((n0.identifier = '48461e37-dd7b-49b6-bed1-2624cb59e1dd' OR n0.identifier = '1667c2f1-c4fd-49e7-849e-10c250465705') OR n0.identifier = 'b3750d14-5353-4170-88d9-e3073825f014') OR n0.identifier = 'e20ae640-d0e6-490e-a2b9-14912b00294f') OR n0.identifier = '2da5e03e-970d-4fa1-9c8c-6822bc3f9c67') OR n0.identifier = 'a9edea09-c116-4cde-a461-855b27a1136d') OR n0.identifier = '4b8ddc1f-4d0f-4715-8600-be41a0da71d7') OR n0.identifier = '26a4a733-b94d-49b6-b680-b1f615025c25') OR n0.identifier = 'ec4a8979-5946-42fc-abf6-e5f5592faa60') OR n0.identifier = '0d428ece-3beb-4c31-8dd5-0c4c35042a92') OR n0.identifier = '4852dcfe-fe5e-499c-bc90-ca7bea5a8df0') OR n0.identifier = '7a3160b3-76ad-4034-9474-c4ae406d3890') OR n0.identifier = 'e1ca72f7-d3b9-454a-9214-f00f019334d0') OR n0.identifier = 'aa82cb73-1521-4d08-b85f-09217d932bc1') OR n0.identifier = '931523d2-b1fb-4ffc-b955-ca7463066dc1') OR n0.identifier = '8801b5e9-e951-4ecf-813d-ec76a9036892') OR n0.identifier = '60e5a79f-13dd-4d38-ac31-e3d8b6437f2e') OR n0.identifier = 'f3536be3-c0d4-49e7-84d1-9ae9a0dca642') OR n0.identifier = 'e0a0edd8-0c88-48ad-8c1f-424b5b56fe60') OR n0.identifier = 
'377eb180-b3fd-42ae-bce9-c493bcabfe04') OR n0.identifier = 'f0797e8a-d98e-403e-b090-2643f9ad3560') OR n0.identifier = 'ca019094-a4ca-44b1-bc8c-1d169ebcd1ba') OR n0.identifier = '8efbfac2-d5d2-4e54-8802-6ba176a26991') OR n0.identifier = '0b00b633-4b49-495a-84ec-ddb463a62f83') OR n0.identifier = 'ccb02784-2bfd-4848-b93b-c3ce7d3ca297') OR n0.identifier = '11e610ff-48c6-4cc6-b04a-a7162ea5957d') OR n0.identifier = '799d8197-5998-4304-978e-1aacd9d10214') OR n0.identifier = '7d5f35a7-79f6-4ed8-8b0c-c2ec6368cd07') OR n0.identifier = 'bf81e54d-9a25-44fa-8cb5-99ffbb5b7023') OR n0.identifier = 'db2b8b98-f5d0-4873-b51c-689e4874598e') OR n0.identifier = '78f78b3d-aa15-4278-8a6d-ed1677a3d534') OR n0.identifier = '16f77bad-927e-4a47-aeaf-89fa0de1dd95') OR n0.identifier = '209121f0-fd0c-48df-af39-97a7515da95a') OR n0.identifier = '5de47fbd-be2d-4b38-952a-91b00328b0ef') OR n0.identifier = '611ec220-2e3f-4819-9292-9e6f35f7d03b') OR n0.identifier = 'd520b40d-bb25-4ea8-a954-35a76c706ae8') OR n0.identifier = '4a197136-d692-4646-b231-1adc08030637') OR n0.identifier = 'd48f0243-8bc3-4d72-afa5-c811b99e762a') OR n0.identifier = 'c6957555-00e8-494e-b4eb-ab33ce10ef50') OR n0.identifier = 'c044f128-9461-4fc2-85eb-3c1787643357') OR n0.identifier = '245bc2d9-5205-4418-ba91-aafea2e4cf10') OR n0.identifier = '69552bf0-0834-4cd7-9ab9-72105b0a9171') OR n0.identifier = '5f6c1d55-fe8a-4772-b5dd-80b67a4e9569') OR n0.identifier = '4d919a53-7d4a-462b-98bf-1f609c77a77d') OR n0.identifier = 'c226f1f5-034b-4b43-8b19-a09c3a6b8c5d') OR n0.identifier = '3abd9011-a6ab-44fd-aec8-5b517979c5ca') OR n0.identifier = 'f1a33bd9-59dc-4873-91f7-302a641290de') OR n0.identifier = '02dc4649-5717-4501-9729-01ce1e656900') OR n0.identifier = '6a00cc57-fcf0-4bdb-bde4-ecefd37630ca') OR n0.identifier = '43ceef80-51dc-4e22-a4a1-cdad78ae88be') OR n0.identifier = '1a65b935-3c40-4a45-818b-5769b3f2a0c7') OR n0.identifier = '1dcc4825-bd51-4f21-be93-88dbf75b8c21') OR n0.identifier = '8b35d35c-333b-490c-bc04-9655ebc3483e') OR 
n0.identifier = '379301a3-1b67-4845-8fa6-e58a4662adf0') OR n0.identifier = 'c9226f8d-ec9c-43ce-ad26-df1d175bf6e7') OR n0.identifier = 'ceed27d8-b8e6-451d-9a01-0c3838c875f5') OR n0.identifier = '12b0aa76-54f1-49c3-abd7-87f12242e672') OR n0.identifier = '5f7d3bd4-00f6-4277-b18f-faffab62db23') OR n0.identifier = 'ed34b826-7d41-45d1-9d6f-c26c90177ec0') OR n0.identifier = 'e7f2b3bc-400c-476c-8dc4-6b3be605525f') OR n0.identifier = '32d97f7b-41c0-45dd-880f-c9f566dfa7f0') OR n0.identifier = 'efb58e05-25d4-4693-9d30-fbf4c9d66428') OR n0.identifier = 'c2474834-8b6f-466a-be32-15d0de343108') OR n0.identifier = '94e905d7-9682-4492-8ed8-7d9cbcadedd5') OR n0.identifier = 'fbba020e-e730-47f0-bc6d-2af456d45650') OR n0.identifier = 'e3354fea-f8d2-44e4-a5ee-c1cba34324b1') OR n0.identifier = 'e81b1df5-4b4a-4376-9863-6e2eac5571b0') OR n0.identifier = 'd3d6047f-6721-4711-afcd-6773375ce2fb') OR n0.identifier = '591c5736-5b81-404e-8e27-2fd8c8aafc87') OR n0.identifier = '2faf81be-9809-4a75-9e67-9505f66571d3') OR n0.identifier = 'b7cd9058-2a6f-40a3-bd54-0770e8d6b92e') OR n0.identifier = '0ed94d59-8729-4c35-9726-3bd468ec2990') OR n0.identifier = '8b0ca290-727c-4bc7-bd7e-89b101aef36c') OR n0.identifier = 'cc1c3925-e998-44be-aeb7-b2f351c5ad5e') OR n0.identifier = '66a6a0d5-94b3-4353-a4b9-9e23e6e63d80') OR n0.identifier = '4dd65bc6-a3c9-45aa-b441-1910c3c2a936') OR n0.identifier = 'f456f1ee-6e86-434e-a690-aadaa58d5a79') OR n0.identifier = 'fbef0bea-db09-4674-ac18-5b566b11b852') OR n0.identifier = '5efa4f9b-6bc3-4eef-8f72-995d95e96579') OR n0.identifier = 'fb53e17a-fd93-4ec6-adaf-50ca16f72c03') OR n0.identifier = 'b98fd2d5-7893-4751-a600-3f5e51d3c004') OR n0.identifier = '3760b591-2050-4505-b76e-ba1c2cb76acd') OR n0.identifier = 'e5b332d7-f74f-41c7-851a-f8710b4b5bb4') OR n0.identifier = '2f76c813-20de-4e3e-9ad2-9a40e86e4678') OR n0.identifier = '296bbe47-f474-4634-b73c-6eb1b0a68335') OR n0.identifier = 'b1d78d01-499e-4691-bcf0-1b8b39e12b32') OR n0.identifier = 
'b9be9e74-799a-4585-a36e-ab0403f56e48') OR n0.identifier = 'c1b5c898-2a31-4b62-b453-e1d9a15aa9fc') OR n0.identifier = '09950cf7-e6ef-43ef-8e71-12f2143dac37') OR n0.identifier = 'c0db5ee9-ceea-48e9-b976-2a1c24392047') OR n0.identifier = 'c2f6e13a-686e-4e53-923a-cc1442b3240f') OR n0.identifier = '967bd7a7-c2b4-4152-83be-97e7ba9593fb') OR n0.identifier = 'edd5c731-8733-4903-a582-6165be66b445') OR n0.identifier = 'ff92ff2b-ef4c-4a09-8fa0-7aef661c5464') OR n0.identifier = '9c00179e-a5a8-4f9b-a9e8-baac31fefcd6') OR n0.identifier = '1136ef56-32d0-43cd-88a0-0aaf6997bc39') OR n0.identifier = '610387d7-df4b-44dc-91de-40a4afae46e5') OR n0.identifier = 'c1b7897e-1dfe-45f2-9d12-6db0256435f0') OR n0.identifier = '115c534e-7810-4ddc-b896-354d646cb9b9') OR n0.identifier = '1f53bef6-65ea-4ee6-94e2-d0c2538b945f') OR n0.identifier = '4d8aaf94-55f1-4e84-a613-9a8644484e9c') OR n0.identifier = 'f2417a52-bece-4027-81a6-2739c25d0966') OR n0.identifier = 'b09c62f8-4375-4fb8-a7c8-32548d07fa5f') OR n0.identifier = 'a53ae560-6d11-4624-906b-a2631cf841c3') OR n0.identifier = '17916432-ac0e-47ee-b120-722cef7deb83') OR n0.identifier = '3d317fa8-09aa-40da-9147-2adb3825686d') OR n0.identifier = '7bd6108a-be54-49c2-82fd-1ad9b9b91f08') OR n0.identifier = '35d7d939-a9d2-4899-affc-2c2b587c2a20') OR n0.identifier = '9f043d55-d574-4854-bc6d-37eab63b6693') OR n0.identifier = 'd4872afc-8fd5-42c1-8336-cc2eeb8aa983') OR n0.identifier = '6b889600-3097-426c-b2b2-6f4cd3d51f2d') OR n0.identifier = '4d9e1db0-a980-455a-9b5d-6222fdb15e5a') OR n0.identifier = 'a9778aae-31cd-4c44-8d6f-12b5d4453031') OR n0.identifier = 'c594d70b-1a3f-4e68-8760-d020761f3fd0') OR n0.identifier = 'd8bf4ced-8232-4810-887d-d114a1f10789') OR n0.identifier = '24ef92cc-09c9-48b8-ae87-d33092676268') OR n0.identifier = '8565dc8c-cd03-416f-9a53-fa530b5e07cd') OR n0.identifier = 'ef5d274a-c7cc-4288-ac48-a0686fa3615b') OR n0.identifier = '9c6fa7ff-cf3f-4ce4-b355-0af982fd0f90') OR n0.identifier = 'ada31878-ea35-4af8-9523-7765e894cf8b') OR 
n0.identifier = '9c4a6f89-985a-4dcf-81f5-6ad3dedc5e7b') OR n0.identifier = '505f9f55-18a8-4045-91b5-807fcf19d892') OR n0.identifier = '2d5e4a64-3f25-45c8-9951-3a6584a933c1') OR n0.identifier = 'c1dfb405-89a1-45e7-a211-d284e44f730b') OR n0.identifier = 'e05b1434-28a0-42bd-846b-7c86fb14d0dd') OR n0.identifier = '94bd79ae-72df-4273-9bfe-91ea4e0ccb9b') OR n0.identifier = 'e9207fcd-73bb-459b-b867-f203cf2d5864') OR n0.identifier = '07679b2d-f2b1-4e60-8ef6-75bca21499cb') OR n0.identifier = '790d271d-67d2-4665-985c-459f19192be3') OR n0.identifier = '222e3286-3e69-4050-9a92-5a4a5281469d') OR n0.identifier = '78ebb106-7684-48b7-8560-0077f1c2ce35') OR n0.identifier = 'a778d386-4045-4479-be42-ab94f4a3f654') OR n0.identifier = 'c8c22daf-71cd-40de-b0a3-8dcab41c3e9f') OR n0.identifier = '211cb466-7d57-4c69-9d04-08b08e36164a') OR n0.identifier = '1c003e68-219a-47d9-9b0b-efe294438dab') OR n0.identifier = 'aa4b3bdd-653a-497c-88aa-9aebca505a9d') OR n0.identifier = '02ce1d61-aff4-47b5-81f7-9ac16582ab3a') OR n0.identifier = '5a4e5ff6-c737-45ee-98dc-66d5de7414df') OR n0.identifier = '66061db2-520a-4c4a-b834-cc432ad456ec') OR n0.identifier = '7f11ca5f-7a02-4bf8-9b15-fb50e76692c9') OR n0.identifier = '2c811caa-c18d-4716-ac1c-6a621e910553') OR n0.identifier = 'dcb30f32-d7e4-43b0-a7e7-9c1fa3894896') OR n0.identifier = 'ab89596e-ce9f-4efd-9017-9d63d997e3e3') OR n0.identifier = 'a94bd5a6-5479-4163-97d4-1ca035c4c0d2') OR n0.identifier = 'ef821d25-b26d-4637-a319-33b6f0601c1a') OR n0.identifier = '1afa2542-d50e-446f-bcd6-d3289fa0043e') OR n0.identifier = 'f094c3e1-324d-4bc0-a227-2f94f6f4bc2e') OR n0.identifier = '355b85ba-3c70-4b5e-b967-09bb0e003bec') OR n0.identifier = '0070ea5a-c876-470c-909f-e1f15c088b58') OR n0.identifier = '67cc5544-aaa4-48fe-90ab-10a2c5e5c38b') OR n0.identifier = 'c326658c-c264-4c33-a0b2-29207e6a11a4') OR n0.identifier = '5ad587cf-78c5-4a40-8c47-d3de132129ed') OR n0.identifier = '46f762cf-b573-41ac-8f30-fc2243cffac7') OR n0.identifier = 
'8211b825-2033-43d3-b189-e3d9ab44a492') OR n0.identifier = 'e5e7a4b9-0d7f-4705-94e4-091ff8543b73') OR n0.identifier = 'cf753896-d2bc-427b-9f36-148c5583ef62') OR n0.identifier = 'da7be706-9906-41d6-921a-9da44c0cf7d8') OR n0.identifier = '2f78c568-148f-474b-bd7a-ca82f59d3c4b') OR n0.identifier = '0fa2ef37-1626-406c-b102-ad438a419205') OR n0.identifier = '081045da-7924-4f3b-ab21-6d68b3367dd4') OR n0.identifier = '0aff1616-0df5-476c-98f0-f30f7e4149a8') OR n0.identifier = 'fdc03c60-f3a7-4dd8-86c3-464811dcf529') OR n0.identifier = '67fd531f-25a2-4517-9e24-383c960e18c9') OR n0.identifier = 'a3d95903-ddd9-4a5a-981d-98e39d2aa153') OR n0.identifier = '72c64473-a47e-4283-bcfd-bfff740d3537') OR n0.identifier = '63dc2713-13a9-485a-9f52-d54bd892d97e') OR n0.identifier = '9c6975e5-a068-4277-86d8-3002802baa15') OR n0.identifier = '743f160d-92a9-40fa-a950-07710443f489') OR n0.identifier = 'c1bdf5a5-c6ee-4693-84f8-0af8f574df25') OR n0.identifier = '72b360b0-b641-43b1-8efc-31792dfb30fe') OR n0.identifier = '38bbe53d-bb2c-4af4-b2b6-ffc8250c6f7b') OR n0.identifier = '185f2646-1816-4d25-95c0-30ba2ae683a0') OR n0.identifier = '70a1a9fd-3ef6-429a-83a2-f3b8096cfd2f') OR n0.identifier = 'ecc71ba9-0165-450f-998b-bd83c0a42bef') OR n0.identifier = '6376c68c-fd34-4a22-a6a0-6f04d7e20b8d') OR n0.identifier = '8bac30e2-7514-4828-8125-a5a69ef14f42') OR n0.identifier = 'c0e78081-4784-4ff1-8405-7d65b0c7b834') OR n0.identifier = '406eee5e-4261-4b79-9d6d-74ba33dc23b9') OR n0.identifier = 'a54d0bf1-5c37-4e56-8d8f-92d80a5f72f6') OR n0.identifier = '6e6070fc-2ad9-47d5-8bd9-42a3210d8ca0') OR n0.identifier = 'e3afb273-8777-4850-af78-e02c994205c1') OR n0.identifier = '1072adac-95d5-4b1a-8163-2ca41b5774bc') OR n0.identifier = 'b980b9c5-081c-4d89-adeb-414f1e05ed80') OR n0.identifier = '1965c29d-ba4e-47f7-91db-9a4db1ef4020') OR n0.identifier = 'd1545672-d6a0-49e0-92af-c980a296e358') OR n0.identifier = '21d4b1c3-66f7-4ea0-96c2-7b107c5e87cb') OR n0.identifier = '94c048a9-1435-43d9-8395-d82d8b62cde2') OR 
n0.identifier = '6c518417-86ea-43aa-a08e-f4d2e3518165') OR n0.identifier = '2da3a814-0d45-4438-9aeb-2f2f476800dd') OR n0.identifier = 'a79890c5-b6c4-45e0-aab9-805a9f2a5a55') OR n0.identifier = '203ccd31-0600-4675-94f1-d00afb03c342') OR n0.identifier = '0ab711ae-2aa1-43a1-b35c-bab281aabafe') OR n0.identifier = '9f58366c-803e-4c58-8f3a-88bcd03f3bdc') OR n0.identifier = '8aef42b6-324a-4768-b304-cb4794eb6b3a') OR n0.identifier = '3f5c092d-6e97-4b45-87ce-3cf1e9e391b1') OR n0.identifier = 'b8e76908-9d2a-4f28-976f-edd4dffb549b') OR n0.identifier = 'a6b9c437-1a26-41e2-b83c-2ebccdd086a2') OR n0.identifier = '346d11ad-faf1-43b6-8ca8-ee1a996faf4c') OR n0.identifier = '750a8459-9c55-4b23-973f-843e7b800d24') OR n0.identifier = '57e4f077-6a28-4ee4-b1c3-2f284040d51e') OR n0.identifier = '272e97d8-ac1f-47e6-a588-d60d0a2699c6') OR n0.identifier = 'eb6c9737-d3d2-41e6-8ab3-d5bb6885b99b') OR n0.identifier = 'e751e1e0-0e00-41f6-9846-b021cb913174') OR n0.identifier = '4c12e6a9-12a2-4dc5-8193-0f72e110fa3f') OR n0.identifier = 'e2238c31-ebf7-4e79-a077-20ed551418f7') OR n0.identifier = '05babbb8-7a28-4c82-abb9-065f3a6658db') OR n0.identifier = 'b410d373-a791-4bce-9f3f-c153672a8581') OR n0.identifier = 'a6fc2afb-c36b-4254-a61d-bde1dcaa775f') OR n0.identifier = 'a6a305ca-a262-4700-8ce7-d0f4e7a4f4c1') OR n0.identifier = '9cbfe846-fdab-458b-ad78-36d6819c4039') OR n0.identifier = 'a2fc1262-2c73-468d-95d1-8c7ea7422ce9') OR n0.identifier = '6b55c244-8983-40c0-9d94-9ac743dfc961') OR n0.identifier = 'e04163e2-2917-4385-a14f-f7ac8dd6eaa1') OR n0.identifier = '64f1e40f-04de-4df5-9156-600740b40a1c') OR n0.identifier = '30cc4b14-e18e-44af-ab8b-9af836332525') OR n0.identifier = '12a004a3-bb24-4d4b-8e61-aa9eaabe2aea') OR n0.identifier = 'ebb2ecf9-e4aa-4881-82fb-bf48f91828af') OR n0.identifier = '6a9661cd-1301-42fe-80e8-85bf0e9741c2') OR n0.identifier = '6c9964e6-2753-4555-84e3-6d0922f6e290') OR n0.identifier = '4591fb45-0455-4c04-abb5-89c4567b12a9'))) ORDER BY CAST(EXTRACTVALUE(n0.numerical_props, 
'//sv:property[@sv:name="sulu:order"]/sv:value[1]') AS DECIMAL) ASC, EXTRACTVALUE(n0.props, '//sv:property[@sv:name="sulu:order"]/sv:value[1]') ASC
SQL
);

echo "Simple pass: ";
timeParse(<<<SQL
SELECT page.* FROM [nt:unstructured] AS page WHERE page.quotes = "\"'"
SQL
);

Contributor Author

mamazu commented Nov 26, 2021

I also found some issues that might be resolved with this merge request:
#80 #100 #87

@dbu dbu merged commit b814637 into phpcr:master Nov 29, 2021
Member

dbu commented Nov 29, 2021

i ran the tests in jackalope-doctrine-dbal and they are also green. let's get this in, thanks a lot!

regarding the potentially fixed issues: #100 is not fixed, when i try that test i get Syntax error: Expected ')', found ']' in SELECT * FROM [nt:file] AS file WHERE ISSAMENODE(file, ["/home node"]) in vendor/phpcr/phpcr-utils/src/PHPCR/Util/QOM/Sql2Scanner.php:94.

the SQL2 syntax is weird, it seems that ["/home node"] is a valid path specification. [/home node] should also work, but that makes the parser find [/home. may I ask you to add test cases for this as well and try to fix that? i think before, the square brackets worked but not the whitespace in the path. now the whitespace is fine but square brackets are not seen.

Contributor Author

mamazu commented Nov 29, 2021

So the current tokenization is ["home node" and ]. The easy thing would be to make it tokenize as [ and "home node" and ]. But tokenizing it as one token would also be a possibility.
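
A rough sketch of the first option, with brackets emitted as tokens of their own and a double-quoted string kept as one token. The function name `splitBracketedPath` is hypothetical, and the sketch ignores whitespace and escapes, since it only illustrates the bracket handling.

```php
<?php

/**
 * Hypothetical sketch (not the scanner's actual code): '[' and ']'
 * become separate tokens unless they appear inside a double-quoted
 * string, which stays together as one token.
 *
 * @return string[]
 */
function splitBracketedPath(string $input): array
{
    $tokens = [];
    $current = '';
    $inString = false;
    $length = strlen($input);

    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];

        if ($char === '"') {
            // Toggle string mode and keep the quote in the token.
            $inString = !$inString;
            $current .= $char;
            continue;
        }

        if (!$inString && ($char === '[' || $char === ']')) {
            // Flush whatever came before, then emit the bracket itself.
            if ($current !== '') {
                $tokens[] = $current;
                $current = '';
            }
            $tokens[] = $char;
            continue;
        }

        $current .= $char;
    }

    if ($current !== '') {
        $tokens[] = $current;
    }

    return $tokens;
}

print_r(splitBracketedPath('["/home node"]'));
```

For `["/home node"]` this produces the three tokens `[`, `"/home node"` and `]`; treating the whole bracketed path as a single token would be the alternative design.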

There is also still a bug (though I think it was there before): the tokenization ignores ", so AS"all" tokenizes as one token.
