CLN: Refactor string special methods to common decorator + safe unicode everywhere #4090

Closed
@jtratner

I was implementing some new objects for another PR and noticed that the string special methods are duplicated throughout the codebase (particularly on objects that don't inherit from each other). I wrote up a refactor on this branch: https://github.com/jtratner/pandas/tree/refactor_string_magic_methods .

If you think this is worthwhile, I'll split it up a little, improve the documentation, and submit it.

This block is repeated on multiple classes:

    def __str__(self):
        """
        Return a string representation for a particular object.

        Invoked by str(obj) in both py2/py3.
        Yields Bytestring in Py2, Unicode String in py3.
        """

        if py3compat.PY3:
            return self.__unicode__()
        return self.__bytes__()

    def __bytes__(self):
        """
        Return a string representation for a particular object.

        Invoked by bytes(obj) in py3 only.
        Yields a bytestring in both py2/py3.
        """
        from pandas.core.config import get_option

        encoding = get_option("display.encoding")
        return self.__unicode__().encode(encoding, 'replace')

    def __repr__(self):
        """
        Return a string representation for a particular object.

        Yields Bytestring in Py2, Unicode String in py3.
        """
        return str(self)

The `__unicode__` implementation tends to vary, but it often looks like this:

    def __unicode__(self):
        """
        Return a string representation for a particular object.

        Invoked by unicode(obj) in py2 only. Yields a Unicode String in both
        py2/py3.
        """
        prepr = pprint_thing(self, escape_chars=('\t', '\r', '\n'), quote_strings=True)
        return '%s(%s)' % (type(self).__name__, prepr)
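Roughly, the shared piece could look like the sketch below: a class decorator that fills in `__str__`, `__bytes__` and `__repr__` for any class that defines `__unicode__`. The names `add_string_methods`, `ENCODING` and `Thing` are placeholders for illustration only, and the hard-coded encoding stands in for `get_option("display.encoding")`; this is not necessarily how the branch does it.

```python
import sys

PY3 = sys.version_info[0] >= 3
# Placeholder: stands in for get_option("display.encoding") in this sketch.
ENCODING = "utf-8"


def add_string_methods(cls):
    """Hypothetical class decorator: derive the string special methods
    from a class's own __unicode__ so they are written only once."""
    if not hasattr(cls, "__unicode__"):
        raise TypeError("%s must define __unicode__" % cls.__name__)

    def __bytes__(self):
        # Bytestring in both py2/py3; unencodable characters are replaced.
        return self.__unicode__().encode(ENCODING, "replace")

    def __str__(self):
        # Unicode string in py3, bytestring in py2.
        if PY3:
            return self.__unicode__()
        return self.__bytes__()

    def __repr__(self):
        return str(self)

    cls.__bytes__ = __bytes__
    cls.__str__ = __str__
    cls.__repr__ = __repr__
    return cls


@add_string_methods
class Thing(object):
    """Toy class used only to exercise the decorator."""

    def __init__(self, name):
        self.name = name

    def __unicode__(self):
        return u"%s(%s)" % (type(self).__name__, self.name)
```

With this, each class would only need to supply `__unicode__`; objects that don't share a base class could still get identical `str`/`bytes`/`repr` behavior by applying the decorator.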

Additionally, a number of objects aren't using a unicode-safe representation of themselves, so this would resolve that as well. Would this be useful to include?
