BUG: Key Error: range exception when printing DataFrame #3869

Closed
@dmlockhart

Description

Here's a reproduction:

df = pd.DataFrame({ 'A' : ['foo',"~:{range}:0"], 'B' : ['bar','bah'] })
df
             A    B
0          foo  bar
1  ~:{range}:0  bah

df.set_index(['A']).info()
*** KeyError: 'range'
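
The message matches what Python's own `str.format` produces when a string contains a named replacement field and no matching keyword argument is supplied. A minimal sketch, independent of pandas, using the label from the reproduction:

```python
# Plain-Python illustration; no pandas involved.
label = "~:{range}:0"

try:
    # The braces make "range" a named replacement field, so format()
    # looks for a keyword argument called "range" and finds none.
    label.format()
    error = None
except KeyError as exc:
    error = exc

print(repr(error))

# A label without braces passes through format() unchanged.
plain = "foo".format()
```

This would explain why only labels containing braces trigger the exception.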

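In core/index.py, summary() should check that head/tail is not already a str before calling format() on it. A str also has a format method, so the hasattr checks below pass and the label is treated as a format template: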

    def summary(self, name=None):
        if len(self) > 0:
            head = self[0]
            if hasattr(head,'format'):
                head = head.format()
            tail = self[-1]
            if hasattr(tail,'format'):
                tail = tail.format()
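
One way the suggested check could look, written as a standalone sketch rather than the actual pandas patch. The helper name `format_endpoint` and the `FakeTimestamp` class are hypothetical and exist only for illustration; the sketch also assumes Python 3, where all text is `str`:

```python
def format_endpoint(value):
    """Return a display string for an index endpoint (hypothetical helper)."""
    # Strings are already displayable; calling .format() on them would
    # treat any braces as replacement fields.
    if isinstance(value, str):
        return value
    # Objects exposing a zero-argument format() get to render themselves.
    if hasattr(value, "format"):
        return value.format()
    return str(value)


class FakeTimestamp:
    """Stand-in for an index value that provides its own format()."""

    def format(self):
        return "2013-06-12"


head = format_endpoint("~:{range}:0")   # returned untouched, no KeyError
tail = format_endpoint(FakeTimestamp())  # delegated to the object's format()
```

Checking the type first keeps the existing behaviour for values that format themselves while leaving string labels alone.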

Printing a DataFrame created from two Series objects (previously columns in other DataFrames) raises KeyError: 'range'. Creating the DataFrame works fine, and printing other DataFrames that share the same "problematic" index also works.

Test code:

import pstats
import pandas as pd

# Import some cProfile data
run1 = pstats.Stats('run1.prof')
run2 = pstats.Stats('run2.prof')

# Utility function to convert pstats dict into a list of lists
def pstats_to_list( stats ):
  plist = []
  for key, value in stats.strip_dirs().stats.items():
    filename, lineno, func_name = key
    ccalls, ncalls, total_time, cum_time, callers = value
    name = "{}:{}:{}".format( filename, func_name, lineno )
    plist.append( [name, ncalls, total_time, cum_time] )
  return plist

jit_list   = pstats_to_list( run1 )
nojit_list = pstats_to_list( run2 )

# Create DataFrames for the profile run data
columns=['name','ncalls','ttime', 'ctime']
jdf = pd.DataFrame( jit_list,   columns = columns )
ndf = pd.DataFrame( nojit_list, columns = columns )

# Set the 'name' column to be the index (for plotting)
jdf = jdf.set_index( 'name' )
ndf = ndf.set_index( 'name' )

# These DataFrames print fine
print(jdf)
print(ndf)

# Extract out the 'ttime' columns
x = jdf['ttime']
y = ndf['ttime']

# Create a new DataFrame using the 'ttime' Series from jdf and ndf
z = pd.DataFrame( {'jit': x, 'nojit': y } )

# Print the first ten rows; this works
print(z[0:10])

# Print the whole DataFrame; this raises "KeyError: 'range'"
print(z)
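
The problematic labels come from the name-building line in `pstats_to_list`. A small sketch, using a hand-written dictionary in place of real profile data (the entries below are made up for illustration), shows how a function name containing braces ends up inside an index label:

```python
# Hypothetical stand-in for the stats dict; keys are
# (filename, lineno, func_name), values are made-up timings.
fake_stats = {
    ("~", 0, "{range}"): (1, 1, 0.5, 0.5, {}),
    ("mod.py", 12, "work"): (2, 2, 1.0, 1.5, {}),
}

names = []
for (filename, lineno, func_name), value in fake_stats.items():
    # Same formatting as in the test code above.
    names.append("{}:{}:{}".format(filename, func_name, lineno))

# Labels that still contain a brace would be misread as templates
# by a later .format() call.
risky = [n for n in names if "{" in n or "}" in n]
print(risky)
```

Filtering labels this way may help narrow down which index entries trigger the exception in a larger profile.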
