[mbstring][PHP 8.4] Add mb_ucfirst and mb_lcfirst to polyfills #466

Merged 1 commit on Apr 12, 2024
1 change: 1 addition & 0 deletions README.md
@@ -67,6 +67,7 @@ Polyfills are provided for:
- the `str_increment` and `str_decrement` functions introduced in PHP 8.3;
- the `Date*Exception/Error` classes introduced in PHP 8.3;
- the `SQLite3Exception` class introduced in PHP 8.3;
- the `mb_ucfirst` and `mb_lcfirst` functions introduced in PHP 8.4;

It is strongly recommended to upgrade your PHP version and/or install the missing
extensions whenever possible. This polyfill should be used only when there is no
47 changes: 47 additions & 0 deletions src/Mbstring/Mbstring.php
@@ -48,6 +48,8 @@
* - mb_strstr - Finds first occurrence of a string within another
* - mb_strwidth - Return width of string
* - mb_substr_count - Count the number of substring occurrences
* - mb_ucfirst - Make a string's first character uppercase
* - mb_lcfirst - Make a string's first character lowercase
*
* Not implemented:
* - mb_convert_kana - Convert "kana" one from another ("zen-kaku", "han-kaku" and more)
@@ -871,6 +873,51 @@ public static function mb_str_pad(string $string, int $length, string $pad_strin
}
}

public static function mb_ucfirst(string $string, ?string $encoding = null): string
{
if (null === $encoding) {
$encoding = self::mb_internal_encoding();
}

try {
$validEncoding = @self::mb_check_encoding('', $encoding);
} catch (\ValueError $e) {
throw new \ValueError(sprintf('mb_ucfirst(): Argument #2 ($encoding) must be a valid encoding, "%s" given', $encoding));
}

// BC for PHP 7.3 and lower
if (!$validEncoding) {
throw new \ValueError(sprintf('mb_ucfirst(): Argument #2 ($encoding) must be a valid encoding, "%s" given', $encoding));
}
Comment on lines +880 to +891
Contributor

self::mb_internal_encoding() already validates the encoding when the internal value is set, so moving the try-catch into an else branch avoids a small amount of overhead when $encoding is null. What do you think?

Suggested change
- }
- try {
- $validEncoding = @self::mb_check_encoding('', $encoding);
- } catch (\ValueError $e) {
- throw new \ValueError(sprintf('mb_ucfirst(): Argument #2 ($encoding) must be a valid encoding, "%s" given', $encoding));
- }
- // BC for PHP 7.3 and lower
- if (!$validEncoding) {
- throw new \ValueError(sprintf('mb_ucfirst(): Argument #2 ($encoding) must be a valid encoding, "%s" given', $encoding));
- }
+ } else {
+ try {
+ $validEncoding = @self::mb_check_encoding('', $encoding);
+ } catch (\ValueError $e) {
+ throw new \ValueError(sprintf('mb_ucfirst(): Argument #2 ($encoding) must be a valid encoding, "%s" given', $encoding));
+ }
+ // BC for PHP 7.3 and lower
+ if (!$validEncoding) {
+ throw new \ValueError(sprintf('mb_ucfirst(): Argument #2 ($encoding) must be a valid encoding, "%s" given', $encoding));
+ }
+ }

Contributor Author

You are right; I understand that the else block would skip the redundant validation when no encoding is passed. I tried to follow the rest of the polyfill methods, so I'd prefer to keep this one as it is now.

Contributor

Perhaps this code can be condensed by moving the check into a method like self::assertEncoding(). Let's try that in another PR.
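
For readers following the thread, here is a minimal sketch of how the consolidation suggested above might look. The name `assert_encoding_sketch` and its signature are hypothetical and are not part of the polyfill in this PR. The sketch assumes ext-mbstring is loaded and that `\ValueError` is available (PHP 8.0+ or a polyfill for it). It only folds the repeated try-catch and fallback check into one place.

```php
<?php

/**
 * Hypothetical helper (not part of this PR): validates $encoding and throws
 * a ValueError built from $errorFormat, which must contain one %s placeholder.
 */
function assert_encoding_sketch(string $encoding, string $errorFormat): void
{
    try {
        // An empty string is always well-formed, so only the encoding name is checked.
        $validEncoding = @mb_check_encoding('', $encoding);
    } catch (\ValueError $e) {
        // PHP 8.0+ throws for unknown encodings; rethrow with the caller's message.
        throw new \ValueError(sprintf($errorFormat, $encoding));
    }

    // Older PHP versions return false instead of throwing.
    if (!$validEncoding) {
        throw new \ValueError(sprintf($errorFormat, $encoding));
    }
}

// Usage: each polyfill function would pass its own message template.
$message = null;
try {
    assert_encoding_sketch('unexisting', 'mb_ucfirst(): Argument #2 ($encoding) must be a valid encoding, "%s" given');
} catch (\ValueError $e) {
    $message = $e->getMessage();
}
echo $message, PHP_EOL;
```

With a helper along these lines, the body of each function would shrink to the null check, one validation call, and the case conversion.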


$firstChar = mb_substr($string, 0, 1, $encoding);
$firstChar = mb_convert_case($firstChar, MB_CASE_TITLE, $encoding);

return $firstChar . mb_substr($string, 1, null, $encoding);
}

public static function mb_lcfirst(string $string, ?string $encoding = null): string
{
if (null === $encoding) {
$encoding = self::mb_internal_encoding();
}

try {
$validEncoding = @self::mb_check_encoding('', $encoding);
} catch (\ValueError $e) {
throw new \ValueError(sprintf('mb_lcfirst(): Argument #2 ($encoding) must be a valid encoding, "%s" given', $encoding));
}

// BC for PHP 7.3 and lower
if (!$validEncoding) {
throw new \ValueError(sprintf('mb_lcfirst(): Argument #2 ($encoding) must be a valid encoding, "%s" given', $encoding));
}
$firstChar = mb_substr($string, 0, 1, $encoding);
$firstChar = mb_convert_case($firstChar, MB_CASE_LOWER, $encoding);

return $firstChar . mb_substr($string, 1, null, $encoding);
}

private static function getSubpart($pos, $part, $haystack, $encoding)
{
if (false === $pos) {
8 changes: 8 additions & 0 deletions src/Mbstring/bootstrap.php
@@ -136,6 +136,14 @@ function mb_str_split($string, $length = 1, $encoding = null) { return p\Mbstrin
function mb_str_pad(string $string, int $length, string $pad_string = ' ', int $pad_type = STR_PAD_RIGHT, ?string $encoding = null): string { return p\Mbstring::mb_str_pad($string, $length, $pad_string, $pad_type, $encoding); }
}

if (!function_exists('mb_ucfirst')) {
function mb_ucfirst(string $string, ?string $encoding = null): string { return p\Mbstring::mb_ucfirst($string, $encoding); }
}

if (!function_exists('mb_lcfirst')) {
function mb_lcfirst(string $string, ?string $encoding = null): string { return p\Mbstring::mb_lcfirst($string, $encoding); }
}

if (extension_loaded('mbstring')) {
return;
}
8 changes: 8 additions & 0 deletions src/Mbstring/bootstrap80.php
@@ -132,6 +132,14 @@ function mb_str_split(?string $string, ?int $length = 1, ?string $encoding = nul
function mb_str_pad(string $string, int $length, string $pad_string = ' ', int $pad_type = STR_PAD_RIGHT, ?string $encoding = null): string { return p\Mbstring::mb_str_pad($string, $length, $pad_string, $pad_type, $encoding); }
}

if (!function_exists('mb_ucfirst')) {
function mb_ucfirst($string, ?string $encoding = null): string { return p\Mbstring::mb_ucfirst($string, $encoding); }
}

if (!function_exists('mb_lcfirst')) {
function mb_lcfirst($string, ?string $encoding = null): string { return p\Mbstring::mb_lcfirst($string, $encoding); }
}

if (extension_loaded('mbstring')) {
return;
}
45 changes: 45 additions & 0 deletions src/Php84/Php84.php
@@ -18,4 +18,49 @@
*/
final class Php84
{
public static function mb_ucfirst(string $string, ?string $encoding = null): string
{
if (null === $encoding) {
$encoding = mb_internal_encoding();
}

try {
$validEncoding = @mb_check_encoding('', $encoding);
} catch (\ValueError $e) {
throw new \ValueError(sprintf('mb_ucfirst(): Argument #2 ($encoding) must be a valid encoding, "%s" given', $encoding));
}

// BC for PHP 7.3 and lower
if (!$validEncoding) {
throw new \ValueError(sprintf('mb_ucfirst(): Argument #2 ($encoding) must be a valid encoding, "%s" given', $encoding));
}

$firstChar = mb_substr($string, 0, 1, $encoding);
$firstChar = mb_convert_case($firstChar, MB_CASE_TITLE, $encoding);

return $firstChar . mb_substr($string, 1, null, $encoding);
}

public static function mb_lcfirst(string $string, ?string $encoding = null): string
{
if (null === $encoding) {
$encoding = mb_internal_encoding();
}

try {
$validEncoding = @mb_check_encoding('', $encoding);
} catch (\ValueError $e) {
throw new \ValueError(sprintf('mb_lcfirst(): Argument #2 ($encoding) must be a valid encoding, "%s" given', $encoding));
}

// BC for PHP 7.3 and lower
if (!$validEncoding) {
throw new \ValueError(sprintf('mb_lcfirst(): Argument #2 ($encoding) must be a valid encoding, "%s" given', $encoding));
}

$firstChar = mb_substr($string, 0, 1, $encoding);
$firstChar = mb_convert_case($firstChar, MB_CASE_LOWER, $encoding);

return $firstChar . mb_substr($string, 1, null, $encoding);
}
}
2 changes: 2 additions & 0 deletions src/Php84/README.md
@@ -3,6 +3,8 @@ Symfony Polyfill / Php84

This component provides features added to PHP 8.4 core:

- [`mb_ucfirst` and `mb_lcfirst`](https://wiki.php.net/rfc/mb_ucfirst)

More information can be found in the
[main Polyfill README](https://github.com/symfony/polyfill/blob/main/README.md).

9 changes: 9 additions & 0 deletions src/Php84/bootstrap.php
@@ -14,3 +14,12 @@
if (\PHP_VERSION_ID >= 80400) {
return;
}


if (!function_exists('mb_ucfirst')) {
function mb_ucfirst($string, ?string $encoding = null): string { return p\Php84::mb_ucfirst($string, $encoding); }
}

if (!function_exists('mb_lcfirst')) {
function mb_lcfirst($string, ?string $encoding = null): string { return p\Php84::mb_lcfirst($string, $encoding); }
}
53 changes: 53 additions & 0 deletions tests/Mbstring/MbstringTest.php
@@ -657,6 +657,20 @@ public function testMbStrPadInvalidArguments(string $expectedError, string $stri
mb_str_pad($string, $length, $padString, $padType, $encoding);
}

/**
* @dataProvider ucFirstDataProvider
*/
public function testMbUcFirst(string $string, string $expected): void {
$this->assertSame($expected, mb_ucfirst($string));
}

/**
* @dataProvider lcFirstDataProvider
*/
public function testMbLcFirst(string $string, string $expected): void {
$this->assertSame($expected, mb_lcfirst($string));
}

public static function paddingStringProvider(): iterable
{
// Simple ASCII strings
@@ -727,4 +741,43 @@ public static function mbStrPadInvalidArgumentsProvider(): iterable
yield ['mb_str_pad(): Argument #4 ($pad_type) must be STR_PAD_LEFT, STR_PAD_RIGHT, or STR_PAD_BOTH', '▶▶', 6, ' ', 123456];
yield ['mb_str_pad(): Argument #5 ($encoding) must be a valid encoding, "unexisting" given', '▶▶', 6, ' ', \STR_PAD_BOTH, 'unexisting'];
}

public static function ucFirstDataProvider(): array {
return [
['', ''],
['test', 'Test'],
['TEST', 'TEST'],
['TesT', 'TesT'],
['ab', 'Ab'],
['ABS', 'ABS'],
['đắt quá!', 'Đắt quá!'],
['აბგ', 'აბგ'],
['lj', 'Lj'],
["\u{01CA}", "\u{01CB}"],
["\u{01CA}\u{01CA}", "\u{01CB}\u{01CA}"],
["łámał", "Łámał"],
// Full case-mapping and case-folding that changes the length of the string is only supported
// in PHP >= 7.3.
["ßst", PHP_VERSION_ID < 70300 ? "ßst" : "Ssst"],
];
}

public static function lcFirstDataProvider(): array {
return [
['', ''],
['test', 'test'],
['Test', 'test'],
['tEST', 'tEST'],
['Ab', 'ab'],
['ABS', 'aBS'],
['Đắt quá!', 'đắt quá!'],
['აბგ', 'აბგ'],
['Lj', PHP_VERSION_ID < 70200 ? 'Lj' : 'lj'],
["\u{01CB}", PHP_VERSION_ID < 70200 ? "\u{01CB}" : "\u{01CC}"],
["\u{01CA}", "\u{01CC}"],
["\u{01CA}\u{01CA}", "\u{01CC}\u{01CA}"],
["\u{212A}\u{01CA}", "\u{006b}\u{01CA}"],
["ß", "ß"],
];
}
}
53 changes: 53 additions & 0 deletions tests/Php84/Php84Test.php
@@ -11,8 +11,61 @@

namespace Symfony\Polyfill\Tests\Php84;

use PHPUnit\Framework\Attributes\DataProvider;
use PHPUnit\Framework\TestCase;

class Php84Test extends TestCase
{
/**
* @dataProvider ucFirstDataProvider
*/
public function testMbUcFirst(string $string, string $expected): void {
$this->assertSame($expected, mb_ucfirst($string));
}

/**
* @dataProvider lcFirstDataProvider
*/
public function testMbLcFirst(string $string, string $expected): void {
$this->assertSame($expected, mb_lcfirst($string));
}

public static function ucFirstDataProvider(): array {
return [
['', ''],
['test', 'Test'],
['TEST', 'TEST'],
['TesT', 'TesT'],
['ab', 'Ab'],
['ABS', 'ABS'],
['đắt quá!', 'Đắt quá!'],
['აბგ', 'აბგ'],
['lj', 'Lj'],
["\u{01CA}", "\u{01CB}"],
["\u{01CA}\u{01CA}", "\u{01CB}\u{01CA}"],
["łámał", "Łámał"],
// Full case-mapping and case-folding that changes the length of the string is only supported
// in PHP >= 7.3.
["ßst", PHP_VERSION_ID < 70300 ? "ßst" : "Ssst"],
];
}

public static function lcFirstDataProvider(): array {
return [
['', ''],
['test', 'test'],
['Test', 'test'],
['tEST', 'tEST'],
['Ab', 'ab'],
['ABS', 'aBS'],
['Đắt quá!', 'đắt quá!'],
['აბგ', 'აბგ'],
['Lj', PHP_VERSION_ID < 70200 ? 'Lj' : 'lj'],
["\u{01CB}", PHP_VERSION_ID < 70200 ? "\u{01CB}" : "\u{01CC}"],
["\u{01CA}", "\u{01CC}"],
["\u{01CA}\u{01CA}", "\u{01CC}\u{01CA}"],
["\u{212A}\u{01CA}", "\u{006b}\u{01CA}"],
["ß", "ß"],
];
}
}
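
The data providers above exercise one detail worth illustrating: the polyfill converts the first character with `MB_CASE_TITLE` rather than `MB_CASE_UPPER`, so a digraph such as U+01CA maps to its title-case form U+01CB. The following is a minimal, self-contained sketch of that behaviour. The function names `sketch_mb_ucfirst` and `sketch_mb_lcfirst` are hypothetical stand-ins, chosen so they never clash with the native PHP 8.4 functions. The sketch assumes ext-mbstring is loaded, defaults to UTF-8, and leaves out the encoding validation shown in the diff.

```php
<?php

// Hypothetical stand-in mirroring the core of the polyfill body in this PR.
function sketch_mb_ucfirst(string $string, string $encoding = 'UTF-8'): string
{
    $firstChar = mb_substr($string, 0, 1, $encoding);
    // Title case, not upper case: digraphs get their dedicated title-case form.
    $firstChar = mb_convert_case($firstChar, MB_CASE_TITLE, $encoding);

    return $firstChar.mb_substr($string, 1, null, $encoding);
}

// Hypothetical stand-in for the lowercase counterpart.
function sketch_mb_lcfirst(string $string, string $encoding = 'UTF-8'): string
{
    $firstChar = mb_substr($string, 0, 1, $encoding);
    $firstChar = mb_convert_case($firstChar, MB_CASE_LOWER, $encoding);

    return $firstChar.mb_substr($string, 1, null, $encoding);
}

// Only the first character changes; the rest of the string is left untouched.
var_dump(sketch_mb_ucfirst('test') === 'Test');
var_dump(sketch_mb_ucfirst('TesT') === 'TesT');
var_dump(sketch_mb_ucfirst('đắt quá!') === 'Đắt quá!');
var_dump(sketch_mb_lcfirst('ABS') === 'aBS');

// Digraph case taken from the data provider: U+01CA becomes U+01CB (title case).
var_dump(sketch_mb_ucfirst("\u{01CA}\u{01CA}") === "\u{01CB}\u{01CA}");
```

Each comparison mirrors a row from the data providers, so on a PHP version with full case-mapping support every line is expected to print `bool(true)`.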