PERF: `nunique` is slower than `unique.apply(len)` on a `groupby`

### Pandas version checks

- [X] I have checked that this issue has not already been reported.

- [X] I have confirmed this bug exists on the [latest version](https://pandas.pydata.org/docs/whatsnew/index.html) of pandas.

- [ ] I have confirmed this bug exists on the [main branch](https://pandas.pydata.org/docs/dev/getting_started/install.html#installing-the-development-version-of-pandas) of pandas.


### Reproducible Example

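```python
import numpy as np
import pandas as pd

# 1,000,000 samples drawn from 30 distinct uint32 values
unique_values = np.arange(30, dtype=np.uint32)
data = np.random.choice(unique_values, size=1_000_000)
s = pd.Series(data)

# %timeit is an IPython magic; run these two lines in IPython or Jupyter
%timeit s.groupby(s).nunique()
%timeit s.groupby(s).unique().apply(len)
```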


### Issue Description

I expect `nunique()` to be at least as fast as `unique().apply(len)`. In the reproducible example above, however, it is roughly 3x slower.
On my computer I get the following timings:
%timeit s.groupby(s).nunique()
103 ms ± 942 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit s.groupby(s).unique().apply(len)
31.1 ms ± 603 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
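
For anyone reproducing this outside IPython, here is a minimal sketch that checks the two expressions return the same counts and then times them with the standard-library `timeit` module. The fixed seed and the helper names `via_nunique` and `via_unique_len` are my own additions for illustration and are not part of the original example; absolute timings will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Fixed seed so runs are repeatable (an assumption, not in the original example)
rng = np.random.default_rng(0)
unique_values = np.arange(30, dtype=np.uint32)
data = rng.choice(unique_values, size=1_000_000)
s = pd.Series(data)


def via_nunique(series):
    # Hypothetical helper: the slow path reported in this issue
    return series.groupby(series).nunique()


def via_unique_len(series):
    # Hypothetical helper: the faster alternative from the issue
    return series.groupby(series).unique().apply(len)


a = via_nunique(s)
b = via_unique_len(s)

# Both approaches should agree on the number of distinct values per group
assert a.index.equals(b.index)
assert (a.to_numpy() == b.to_numpy()).all()

# Report the best of three repeats, three calls each, in ms per call
for fn in (via_nunique, via_unique_len):
    best = min(timeit.repeat(lambda: fn(s), number=3, repeat=3)) / 3
    print(f"{fn.__name__}: {best * 1e3:.1f} ms per call")
```

Taking the minimum over repeats is a common way to reduce noise from other processes; it is not meant to match the mean ± std. dev. that `%timeit` prints.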

### Expected Behavior

`nunique()` should be at least as fast as `unique().apply(len)`.

### Installed Versions

<details>

INSTALLED VERSIONS
------------------
commit           : 0f437949513225922d851e9581723d82120684a6
python           : 3.11.5.final.0
python-bits      : 64
OS               : Linux
OS-release       : 6.5.11-300.fc39.x86_64
Version          : #1 SMP PREEMPT_DYNAMIC Wed Nov  8 22:37:57 UTC 2023
machine          : x86_64
processor        : 
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 2.0.3
numpy            : 1.24.3
pytz             : 2023.3.post1
dateutil         : 2.8.2
setuptools       : 68.0.0
pip              : 23.3
Cython           : 3.0.0
pytest           : None
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : None
lxml.etree       : 4.9.3
html5lib         : None
pymysql          : None
psycopg2         : None
jinja2           : 3.1.2
IPython          : 8.15.0
pandas_datareader: 0.10.0
bs4              : 4.12.2
bottleneck       : 1.3.5
brotli           : 1.0.9
fastparquet      : None
fsspec           : 2023.9.2
gcsfs            : None
matplotlib       : 3.8.0
numba            : 0.57.1
numexpr          : 2.8.7
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : 11.0.0
pyreadstat       : None
pyxlsb           : None
s3fs             : None
scipy            : 1.11.3
snappy           : 
sqlalchemy       : 2.0.21
tables           : None
tabulate         : 0.8.10
xarray           : 2023.6.0
xlrd             : None
zstandard        : None
tzdata           : 2023.3
qtpy             : 2.2.0
pyqt5            : None

</details>


PERF: `nunique` is slower than `unique.apply(len)` on a `groupby` #55972
