Skip to content

Fixed parse_url(): can not recognize port without scheme #7844

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 11 commits into from

Conversation

PandaLIU-1111
Copy link
Contributor

fixed Before

var_dump(parse_url("127.0.0.1:9999"));
var_dump("==========");
var_dump(parse_url("127.0.0.1:9999?"));

output

array(2) {
  ["host"]=>
  string(9) "127.0.0.1"
  ["port"]=>
  int(9999)
}
string(10) "=========="
array(3) {
  ["scheme"]=>
  string(9) "127.0.0.1"
  ["path"]=>
  string(4) "9999"
  ["query"]=>
  string(0) ""
}

# fixed After

var_dump(parse_url("127.0.0.1:9999?"));

output

array(2) {
  ["host"]=>
  string(9) "127.0.0.1"
  ["port"]=>
  int(9999)
 ["query"]=>
  string(0) ""
}

Copy link
Member

@twose twose left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://3v4l.org/RBvOW
It shows when we add the scheme, its behavior back to normal, so I think it is correct to have the same behavior in the non-scheme branch.

@@ -154,6 +154,10 @@ PHPAPI php_url *php_url_parse_ex2(char const *str, size_t length, bool *has_port
goto parse_port;
}

if (*p == '?') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should fix it by this way on line 153:

if ((p == ue || *p == '/'  || *p == '?') && (p - e) < 7) {

@PandaLIU-1111 PandaLIU-1111 changed the title Fixed parse_url("127.0.0.1:9999?") host and port parse is error Fixed parse_url(): can not recognize port without scheme Dec 28, 2021
@cmb69
Copy link
Member

cmb69 commented Dec 29, 2021

Thank you for the PR! I'm not against fixing this particular issue for partial URLs (although https://3v4l.org/vtrZb should probably fixed as well), but I consider it more important to have an URI parser which strictly conforms to the specification, and which doesn't even attempt to parse partial URIs (that would need to be another function for BC reasons, though).

@PandaLIU-1111
Copy link
Contributor Author

PandaLIU-1111 commented Dec 29, 2021

so you think we need create an URI parser to replace parse_url? @cmb69

@cmb69
Copy link
Member

cmb69 commented Dec 29, 2021

so you think we need create an URI parser to replace parse_url?

Yes, I think we need a proper URI parser, but not to replace parse_url(), but rather in addition to parse_url(). In the long run parse_url() might be deprecated and removed, but we need to keep it for now for backward compatibility reasons.

@PandaLIU-1111
Copy link
Contributor Author

PandaLIU-1111 commented Dec 29, 2021

ok, i prepare to copy parse_url method code and change it for fix scheme , so how we name the method ? parse_uri? @cmb69

@cmb69
Copy link
Member

cmb69 commented Dec 29, 2021

The name of the new function can still be decided later. But I wouldn't reuse the code in ? parse_url()`, because in my opinion, it is broken beyond repair.

@PandaLIU-1111
Copy link
Contributor Author

@cmb69 hello, How progress this pr?

@cmb69 cmb69 closed this in 72d8370 May 16, 2022
@cmb69
Copy link
Member

cmb69 commented May 16, 2022

Thanks for the reminder! I have merged the PR after renaming and slightly tweaking the test case.

@Berdir
Copy link

Berdir commented Sep 14, 2022

This caused a regression in Drupal, where we support URL's that look like this: internal:#foo, now internal is recognized as a path and not as a scheme anymore: https://3v4l.org/nFqLN

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants