Make the spec implementation in this repo type checkable #187

Merged
merged 39 commits into from
Jul 10, 2023
Commits
5273b64
wip
MarcoGorelli Jun 24, 2023
da5324e
get mypy --strict passing
MarcoGorelli Jun 24, 2023
436b819
add CI
MarcoGorelli Jun 24, 2023
e8f5ca7
typo
MarcoGorelli Jun 24, 2023
8ce48df
wip
MarcoGorelli Jun 24, 2023
38e0276
keep to .py, but disable empty-body code
MarcoGorelli Jun 24, 2023
8f46009
add missing files
MarcoGorelli Jun 24, 2023
69c3283
fixup!
MarcoGorelli Jun 24, 2023
08f085e
fixup some types
MarcoGorelli Jun 24, 2023
4465e67
some corrections
MarcoGorelli Jun 24, 2023
51f5425
fixup again
MarcoGorelli Jun 24, 2023
b654dad
wip
MarcoGorelli Jun 24, 2023
a096091
getting there?
MarcoGorelli Jun 24, 2023
b3847d0
fixup
MarcoGorelli Jun 24, 2023
ffcd9e0
getting there?
MarcoGorelli Jun 24, 2023
a474d59
export DataFrame and Column
MarcoGorelli Jun 24, 2023
38259ab
ignore some nitpicks for now
MarcoGorelli Jun 24, 2023
a4d81fc
more fixups
MarcoGorelli Jun 24, 2023
cb2f1e4
remove unnecessary file
MarcoGorelli Jun 24, 2023
b7c00fb
few more corrections
MarcoGorelli Jun 24, 2023
cfaafa5
use mypy.ini
MarcoGorelli Jun 24, 2023
e92a784
revert making DataFrame generic
MarcoGorelli Jun 27, 2023
b67398e
revert making Scalar generic
MarcoGorelli Jun 28, 2023
f28ddcd
remove DType class
MarcoGorelli Jun 29, 2023
06f8aa8
fixup
MarcoGorelli Jun 29, 2023
7a26f3a
dont export Scalar, replace it with Any
MarcoGorelli Jun 29, 2023
56c6293
make self: Column[DType] explicit
MarcoGorelli Jun 29, 2023
314ab42
preserve Column[int] in docs
MarcoGorelli Jun 29, 2023
4ddbae1
simplify further
MarcoGorelli Jun 29, 2023
ea7a81a
reduce diff
MarcoGorelli Jun 29, 2023
4a740b9
reduce diff
MarcoGorelli Jun 29, 2023
d6a6e87
get docs building again
MarcoGorelli Jun 29, 2023
526a5d7
further reduce diff
MarcoGorelli Jun 29, 2023
e136060
Merge remote-tracking branch 'upstream/main' into make-type-checkable
MarcoGorelli Jul 7, 2023
5486e22
introduce Scalar type alias
MarcoGorelli Jul 7, 2023
6a8a428
fixup
MarcoGorelli Jul 7, 2023
e2d3068
fixup mypy
MarcoGorelli Jul 7, 2023
59140c2
reduce diff
MarcoGorelli Jul 7, 2023
9fd840a
fix return types;
MarcoGorelli Jul 7, 2023
32 changes: 32 additions & 0 deletions .github/workflows/mypy.yml
@@ -0,0 +1,32 @@
+name: mypy
+
+on:
+  pull_request:
+  push:
+    branches: [main]
+
+jobs:
+  tox:
+    strategy:
+      matrix:
+        python-version: ["3.8", "3.11"]
+        os: [ubuntu-latest]
+
+    runs-on: ${{ matrix.os }}
+    steps:
+      - uses: actions/checkout@v3
+      - uses: actions/setup-python@v4
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Cache multiple paths
+        uses: actions/cache@v3
+        with:
+          path: |
+            ~/.cache/pip
+            $RUNNER_TOOL_CACHE/Python/*
+            ~\AppData\Local\pip\Cache
+          key: ${{ runner.os }}-build-${{ matrix.python-version }}
+      - name: install-reqs
+        run: python -m pip install --upgrade mypy==1.4.0
+      - name: run mypy
+        run: cd spec/API_specification && mypy dataframe_api
5 changes: 5 additions & 0 deletions spec/API_specification/.mypy.ini
@@ -0,0 +1,5 @@
+[mypy]
+strict=True
+
+[mypy-dataframe_api.*]
+disable_error_code=empty-body
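
Note on the `empty-body` override: the spec modules consist of stubs whose bodies are only a docstring and `...`. As far as I understand mypy's behaviour, such a body is reported under `empty-body` whenever the declared return type is not `None`, which is presumably why that code is disabled for `dataframe_api.*`. A minimal sketch of the pattern follows; `row_count` is a hypothetical name for illustration, not a function from the spec.

```python
from __future__ import annotations

from typing import Any, Sequence


def row_count(rows: Sequence[Any]) -> int:
    """
    Return the number of rows (hypothetical spec-style stub).
    """
    ...  # under --strict, mypy would flag this: declared `int`, returns nothing


# At runtime the stub executes without error and simply returns None, so only
# the type checker needs to be told to tolerate the empty body.
result = row_count([1, 2, 3])
```

Disabling the code per module keeps `strict=True` in force for everything else, rather than loosening the whole configuration.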
38 changes: 18 additions & 20 deletions spec/API_specification/dataframe_api/__init__.py
@@ -6,18 +6,19 @@
from typing import Mapping, Sequence, Any

from .column_object import *
from .dataframe_object import *
from .dataframe_object import DataFrame
from .groupby_object import *

from ._types import DType

__all__ = [
"__dataframe_api_version",
"__dataframe_api_version__",
"DataFrame",
"Column",
"column_from_sequence",
"concat",
"dataframe_from_dict",
"is_null",
"null",
"DType",
"Int64",
"Int32",
"Int16",
@@ -59,7 +60,7 @@ def concat(dataframes: Sequence[DataFrame]) -> DataFrame:
     """
     ...

-def column_from_sequence(sequence: Sequence[object], *, dtype: DType) -> Column:
+def column_from_sequence(sequence: Sequence[Any], *, dtype: Any) -> Column[Any]:
Member

`dtype: DType` seems more specific, in that it actually needs to be a dtype instance and not, e.g., a string repr?

Member

I noticed that too, but I think it can't be addressed with a type alias like I suggested for Scalar, because DType is already a TypeVar.

Contributor Author

let's address #189 first, then we can figure out how to type this

for now I've marked as Any to not block the benefits which this PR brings
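
To make the trade-off in this thread concrete, here is a rough sketch under stated assumptions. `DTypeLike`, `from_sequence_any` and `from_sequence_strict` are hypothetical names that do not appear in the PR, and only two dtype classes are reproduced. It contrasts `dtype: Any`, which lets a string through the type checker, with a union of dtype classes, which would not.

```python
from __future__ import annotations

from typing import Any, Sequence, Union


class Int64:
    """Integer type with 64 bits of precision."""


class Float64:
    """Floating point type with 64 bits of precision."""


# Hypothetical alias (not in the PR): a union of the dtype classes.
DTypeLike = Union[Int64, Float64]


def from_sequence_any(sequence: Sequence[Any], *, dtype: Any) -> list[Any]:
    """Mirrors the PR's annotation: any object is accepted as `dtype`."""
    return list(sequence)


def from_sequence_strict(sequence: Sequence[Any], *, dtype: DTypeLike) -> list[Any]:
    """A type checker should reject dtype="int64" here; runtime does not enforce it."""
    return list(sequence)


loose = from_sequence_any([1, 2], dtype="int64")      # accepted by the checker
strict = from_sequence_strict([1, 2], dtype=Int64())  # accepted by the checker
```

Whether a union like this is the right fix is exactly what the thread defers to #189; the sketch only shows what `Any` gives up in the meantime.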

"""
Construct Column from sequence of elements.

@@ -78,7 +79,7 @@ def column_from_sequence(sequence: Sequence[object], *, dtype: DType) -> Column:
     """
     ...

-def dataframe_from_dict(data: Mapping[str, Column]) -> DataFrame:
+def dataframe_from_dict(data: Mapping[str, Column[Any]]) -> DataFrame:
     """
     Construct DataFrame from map of column names to Columns.
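
Background for the `Column` to `Column[Any]` change: `--strict` turns on `disallow_any_generics`, so a bare reference to a generic class is reported as missing type parameters. The sketch below assumes `Column` is generic over a `DType` type variable, as the commit "make self: Column[DType] explicit" suggests. The class body and `column_names` are illustrative stand-ins, not spec code.

```python
from __future__ import annotations

from typing import Any, Generic, Mapping, TypeVar

DType = TypeVar("DType")


class Column(Generic[DType]):
    """Stand-in for the spec's Column; only the generic parameter matters here."""

    def __init__(self, name: str) -> None:
        self.name = name


def column_names(data: Mapping[str, Column[Any]]) -> list[str]:
    """Accept columns of any dtype; annotating with bare `Column` fails --strict."""
    return sorted(data)


names = column_names({"b": Column[Any]("b"), "a": Column[Any]("a")})
```

Spelling out `Column[Any]` keeps the signature permissive while making the "any dtype" choice explicit.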

@@ -144,38 +145,35 @@ def is_null(value: object, /) -> bool:
 # Dtypes #
 ##########

-class DType:
-    """Base class for all dtypes."""
-
-class Int64(DType):
+class Int64:
     """Integer type with 64 bits of precision."""

-class Int32(DType):
+class Int32:
     """Integer type with 32 bits of precision."""

-class Int16(DType):
+class Int16:
     """Integer type with 16 bits of precision."""

-class Int8(DType):
+class Int8:
     """Integer type with 8 bits of precision."""

-class UInt64(DType):
+class UInt64:
     """Unsigned integer type with 64 bits of precision."""

-class UInt32(DType):
+class UInt32:
     """Unsigned integer type with 32 bits of precision."""

-class UInt16(DType):
+class UInt16:
     """Unsigned integer type with 16 bits of precision."""

-class UInt8(DType):
+class UInt8:
     """Unsigned integer type with 8 bits of precision."""

-class Float64(DType):
+class Float64:
     """Floating point type with 64 bits of precision."""

-class Float32(DType):
+class Float32:
     """Floating point type with 32 bits of precision."""

-class Bool(DType):
+class Bool:
     """Boolean type with 8 bits of precision."""
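
One consequence of dropping the shared `DType` base class is that code can no longer write `isinstance(x, DType)` to recognise a dtype. A possible workaround, sketched under the assumption that callers enumerate the classes themselves, is shown below. `_SIGNED_INTEGER_DTYPES` and `is_signed_integer_dtype` are hypothetical helpers, not part of the spec.

```python
from __future__ import annotations

from typing import Any


class Int64:
    """Integer type with 64 bits of precision."""


class Int32:
    """Integer type with 32 bits of precision."""


class Float64:
    """Floating point type with 64 bits of precision."""


# Hypothetical grouping; the spec itself defines no such tuple.
_SIGNED_INTEGER_DTYPES = (Int64, Int32)


def is_signed_integer_dtype(dtype: Any) -> bool:
    """Check membership by listing classes, since there is no common base."""
    return isinstance(dtype, _SIGNED_INTEGER_DTYPES)
```

For example, `is_signed_integer_dtype(Int64())` holds, while a `Float64()` instance or the string `"int64"` does not match.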
5 changes: 4 additions & 1 deletion spec/API_specification/dataframe_api/_types.py
@@ -20,8 +20,11 @@
 )
 from enum import Enum

+# Type alias: Mypy needs Any, but for readability we need to make clear this
+# is a Python scalar (i.e., an instance of `bool`, `int`, `float`, `str`, etc.)
+Scalar = Any
+
 array = TypeVar("array")
-Scalar = TypeVar("Scalar")
 device = TypeVar("device")
 DType = TypeVar("DType")
 SupportsDLPack = TypeVar("SupportsDLPack")
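
For readers weighing the alias against the previous `TypeVar`, the following sketch shows the practical difference. `ScalarT`, `fill_value` and `identity` are hypothetical names introduced here for comparison only.

```python
from __future__ import annotations

from typing import Any, TypeVar

# As in the PR: an alias that is transparent to mypy but names the intent.
Scalar = Any

# Pre-PR style, kept under a different (hypothetical) name for comparison.
ScalarT = TypeVar("ScalarT")


def fill_value(default: Scalar) -> Scalar:
    """`Scalar` documents that a Python scalar is expected; mypy sees `Any`."""
    return default


def identity(value: ScalarT) -> ScalarT:
    """With a TypeVar, the return type is tied to the argument's type."""
    return value
```

The alias does not constrain callers at all. It keeps signatures readable and leaves room to narrow the type later.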