Get and set column names #21

For column names, the following proposal, similar to what pandas currently does, uses a `columns` property to get and set the column names.

In #7, the preference is to restrict column names to strings and to disallow duplicates.

The proposed API with an example is:
```python
>>> df = dataframe({'col1': [1, 2], 'col2': [3, 4]})
>>> df.columns = 'foo', 'bar'
>>> df.columns = ['foo', 'bar']
>>> df.columns = map(str.upper, df.columns)
>>> df.columns
['FOO', 'BAR']
```
The following cases would fail:
```python
>>> df.columns = 1
TypeError: Columns must be an iterable, not int
>>> df.columns = 'foo'
TypeError: Columns must be an iterable, not str
>>> df.columns = 'foo', 1
TypeError: Column names must be str, int found
>>> df.columns = 'foo', 'bar', 'foobar'
ValueError: Expected 2 column names, found 3
>>> df.columns = 'foo', 'foo'
ValueError: Column names cannot be duplicated. Found duplicates: foo
```

Some things that people may want to discuss:
- Using a different name for the property (e.g. `column_names`)
- Being able to set a single column with `df.columns[0] = 'foo'` (the proposal doesn't allow it)
- The return type of the columns (the proposal returns a Python list, pandas returns an Index)
- Setting the column name of a single-column dataframe with `df.columns = 'foo'` (the proposal requires an iterable, so `df.columns = ['foo']` or equivalent is needed).
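
On the second and third points, one option that is not part of the proposal is to return an immutable tuple, so that `df.columns[0] = 'foo'` raises instead of bypassing the setter's checks. The sketch below only illustrates that idea: `Frame` and `rename_column` are hypothetical names, and the validation from the proposal is left out for brevity.

```python
import typing


class Frame:
    """Hypothetical stand-in for the proposed dataframe; it only stores column names."""

    def __init__(self, names: typing.Iterable[str]):
        self._columns = tuple(names)

    @property
    def columns(self) -> typing.Tuple[str, ...]:
        # A tuple is immutable, so item assignment on it raises TypeError.
        return self._columns

    @columns.setter
    def columns(self, names: typing.Iterable[str]):
        # The checks from the proposal (str only, no duplicates, same length)
        # are omitted here; this sketch is only about the return type.
        self._columns = tuple(names)


def rename_column(frame: Frame, position: int, new_name: str) -> None:
    """Rename one column by rebuilding the full sequence and reassigning it."""
    names = list(frame.columns)
    names[position] = new_name
    frame.columns = names  # goes through the setter


df = Frame(['col1', 'col2'])
try:
    df.columns[0] = 'foo'
except TypeError as exc:
    print(f'TypeError: {exc}')

rename_column(df, 0, 'foo')
print(df.columns)
```

With this design, renaming a single column always goes through the setter, at the cost of a slightly more verbose call for that case.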

In case it's useful, this is the implementation of the examples:
```python
import collections.abc  # the submodule must be imported explicitly to use collections.abc.Iterable
import typing


class dataframe:
    def __init__(self, data):
        self._columns = list(data)

    @property
    def columns(self) -> typing.List[str]:
        # Return a copy so that mutating the result cannot bypass the setter's checks
        return list(self._columns)
    
    @columns.setter
    def columns(self, names: typing.Iterable[str]):
        if not isinstance(names, collections.abc.Iterable) or isinstance(names, str):
            raise TypeError(f'Columns must be an iterable, not {type(names).__name__}')

        names = list(names)

        for name in names:
            if not isinstance(name, str):
                raise TypeError(f'Column names must be str, {type(name).__name__} found')
        
        if len(names) != len(self._columns):
            raise ValueError(f'Expected {len(self._columns)} column names, found {len(names)}')

        if len(set(names)) != len(names):
            # sorted() keeps the order of names in the error message deterministic
            duplicates = sorted({name for name in names if names.count(name) > 1})
            raise ValueError(f'Column names cannot be duplicated. Found duplicates: {", ".join(duplicates)}')

        self._columns = names
```
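
One gap in the implementation above is that the constructor takes the dict keys as column names without checking them, so `dataframe({1: [1, 2]})` would end up with a non-string name. A possible way to close it, sketched here as an assumption rather than as part of the proposal, is to move the checks into a helper that both `__init__` and the setter could call. The name `validate_column_names` and its `expected` parameter are hypothetical.

```python
import collections.abc
import typing


def validate_column_names(names: typing.Any,
                          expected: typing.Optional[int] = None) -> typing.List[str]:
    """Return names as a list, raising the same errors as the proposed setter.

    `expected` is the required number of names; None skips the length check,
    which is what a constructor would want.
    """
    if not isinstance(names, collections.abc.Iterable) or isinstance(names, str):
        raise TypeError(f'Columns must be an iterable, not {type(names).__name__}')

    # Materialize once so that iterators such as map() can be checked and stored.
    names = list(names)

    for name in names:
        if not isinstance(name, str):
            raise TypeError(f'Column names must be str, {type(name).__name__} found')

    if expected is not None and len(names) != expected:
        raise ValueError(f'Expected {expected} column names, found {len(names)}')

    if len(set(names)) != len(names):
        duplicates = sorted({name for name in names if names.count(name) > 1})
        raise ValueError(
            f'Column names cannot be duplicated. Found duplicates: {", ".join(duplicates)}')

    return names


# Setter-style call: the length must match the existing number of columns.
print(validate_column_names(map(str.upper, ['foo', 'bar']), expected=2))

# Constructor-style call: iterating a dict yields its keys, which are checked too.
try:
    validate_column_names({1: [1, 2], 'col2': [3, 4]})
except TypeError as exc:
    print(f'TypeError: {exc}')
```

The setter would then reduce to `self._columns = validate_column_names(names, expected=len(self._columns))`, and the constructor to `self._columns = validate_column_names(data)`.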
