Skip to content

Contributing Guide for Type Hints #27050

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Aug 25, 2019
Merged
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
101 changes: 101 additions & 0 deletions doc/source/development/contributing.rst
Original file line number Diff line number Diff line change
Expand Up @@ -696,6 +696,107 @@ You'll also need to

See :ref:`contributing.warnings` for more.

Type Hints
----------

*pandas* strongly encourages the use of :pep:`484` style type hints. New development should contain type hints and pull requests to annotate existing code are accepted as well!

Syntax Requirements
~~~~~~~~~~~~~~~~~~~

Because *pandas* still supports Python 3.5, :pep:`526` does not apply and variables **must** be annotated with type comments. Specifically, this is a valid annotation within pandas:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put this comment lower, e.g. show the common / easy case first


.. code-block:: python

primes = [] # type: List[int]

Whereas this is **NOT** allowed:

.. code-block:: python

primes: List[int] = [] # not supported in Python 3.5!

Note that function signatures can always be annotated per :pep:`3107`:

.. code-block:: python

def sum_of_primes(primes: List[int] = []) -> int:

Style Guidelines
~~~~~~~~~~~~~~~~

Types imports should follow the ``from typing import ...`` convention. So rather than

.. code-block:: python

import typing

primes = [] # type: typing.List[int]

You should write

.. code-block:: python

from typing import List

primes = [] # type: List[int]

``Optional`` should be used where applicable, so instead of

.. code-block:: python

from typing import Union

maybe_primes = [] # type: Union[int, None]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't it should be

Suggested change
maybe_primes = [] # type: Union[int, None]
maybe_primes = [] # type: List[Union[int, None]]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea great catch!


You should write

.. code-block:: python

from typing import Optional

maybe_primes = [] # type: Optional[int]

If a function accepts multiple arguments, every parameter should appear on a separate line. So rather than

.. code-block:: python

def some_func(a: str, b: float, c: Union[int, float]) -> float:

The preferred style would be

.. code-block:: python

def some_func(a: str,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should make style recommendations that aren't enforced by a linter,

Black formats this as

def some_func(a: str, b: float, c: Union[int, float]) -> float:
    pass

And when a wrap is required

def some_func(
    a_long_name_really_long_name: str,
    b_long_name: float,
    c_long_name: Union[int, float],
) -> float:
    pass

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know. Haven't used that yet (have seen other issue, will look into it) but what you are saying makes sense.

@topper-123 I think you suggested something like this as well so FYI

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it fits within 79 chars, I don't the point to wrap.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm +1 on requiring multiline defs to be wrapped in the style guide. It makes for much clearer code.

OTOH, if we go with Black, such choices will be made for us?

b: float,
c: Union[int, float]
) -> float:

When dealing with parameters with a default argument of ``None``, you should not use ``Optional`` as this will be inferred by the static type checker. So instead of:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback drawing your attention here as you might disagree

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback we are going with Optional or not?


.. code-block:: python

def maybe_upcase(value: Optional[str] = None) -> Optional[str]:

You should write

.. code-block:: python

def maybe_upcase(value: str = None) -> Optional[str]:

Pandas-specific Types
~~~~~~~~~~~~~~~~~~~~~

Commonly used types specific to *pandas* will appear in pandas._typing and you should use these where applicable. This module is private for now but ultimately this should be exposed to third party libraries who want to implement type checking against pandas.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small list of the types of things found here may be helpful (ArrayLike, Dtype, etc.)

I think inlining an example here would be very helpful; early on, contributors may be adding to this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes adding stuff in _typing.py sometimes confuses me. Some example will be nice.


Validating Type Hints
~~~~~~~~~~~~~~~~~~~~~

*pandas* uses `mypy <http://mypy-lang.org>`_ to statically analyze the code base and type hints. After making any change you can ensure your type hints are correct by running

.. code-block:: shell

mypy pandas
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also type check single files or submodules? (for quicker development turnover, if you are trying out type checking whole pandas takes a while)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could but not generically useful as mypy doggedly follows all imports so wouldn't necessarily save much time:

https://mypy.readthedocs.io/en/latest/running_mypy.html#following-imports

If type checking speed is a concern the suggested approach is to use a daemon:

https://mypy.readthedocs.io/en/latest/mypy_daemon.html#mypy-daemon-mypy-server

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jorisvandenbossche yes you can do something like this: mypy pandas/core/something.py, or mypy pandas/core/generic
It saves a bit of time but not much, but adding how to do that in the contributing guide might not be a bad idea. Personally when I am working with typing I run mypy for just the single file I am working on. For whole project I run it only before committing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t plan on adding this - it’s not value added to do

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well that's true, someone can easily figure out the command for a single file/folder by making a wise guess or going to mypy docs.


.. _contributing.ci:

Expand Down