Skip to content

TST: add html5lib to travis #3655

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 21, 2013
Merged

TST: add html5lib to travis #3655

merged 2 commits into from
May 21, 2013

Conversation

cpcloud
Copy link
Member

@cpcloud cpcloud commented May 20, 2013

closes #3654

@cpcloud
Copy link
Member Author

cpcloud commented May 20, 2013

darn :| tox gave me false hope

@jreback
Copy link
Contributor

jreback commented May 20, 2013

any luck?

@cpcloud
Copy link
Member Author

cpcloud commented May 20, 2013

nah i'm not sure what the problem is, haven't had a chance to get into it yet...2.7 fails (slightly weird since that's what i'm running), and all tox versions pass on my machine. my gut tells me that there some simple thing going wrong here, but i need to investigate. i had this problem locally once when bs4 was installed both from the launchpad source and from the pypi mirrors, but I don't see where this could've happened on travis.

@jreback
Copy link
Contributor

jreback commented May 20, 2013

I have a vagrant 32-bit environ setup....just tested and it fails like travis (which is 32-bit btw)...

@cpcloud
Copy link
Member Author

cpcloud commented May 20, 2013

oh gosh. thanks. i have a similar machine at home i will be able to test it like this later...

@jreback
Copy link
Contributor

jreback commented May 20, 2013

First failure

115         def test_banklist(self):
116             df1 = self.run_read_html(self.banklist_data, '.*Florida.*',
117  ->                                  attrs={'id': 'table'})
387         def _parse_tables(self, doc, match, attrs):
388             element_name = self._strainer.name
389             tables = doc.find_all(element_name, attrs=attrs)
390             if not tables:
391  ->             raise AssertionError('No tables found')

@jreback
Copy link
Contributor

jreback commented May 20, 2013

maybe bs4 has a weird thing on 32-bit

@jreback
Copy link
Contributor

jreback commented May 20, 2013

you can setup www.vagrantup.com. ....any type of machine you want (for ssh)...btw...free....

I setup a 32-bit because I can't easily test that otherwise

@cpcloud
Copy link
Member Author

cpcloud commented May 20, 2013

oh that is sweet. thanks.

@cpcloud
Copy link
Member Author

cpcloud commented May 20, 2013

still though if bs4 has a weird 32-bit thing then why are 2.6 and 3.2 passing? so strange

@cpcloud
Copy link
Member Author

cpcloud commented May 20, 2013

sorry 3.3

@jreback
Copy link
Contributor

jreback commented May 20, 2013

yes...that is weird....prob is something really simple

@cpcloud
Copy link
Member Author

cpcloud commented May 20, 2013

i've setup vagrant so i can get to this a little later or this evening

@cpcloud
Copy link
Member Author

cpcloud commented May 20, 2013

okay i can reproduce this...it's a start

@cpcloud
Copy link
Member Author

cpcloud commented May 21, 2013

progress: it works with the ubuntu repo install sounds like a venv prob ugh

@cpcloud
Copy link
Member Author

cpcloud commented May 21, 2013

@jreback not sure what to do here. worth figuring out this bug? don't think it's pandas related

@cpcloud
Copy link
Member Author

cpcloud commented May 21, 2013

soln is to install read_html deps using apt

@jreback
Copy link
Contributor

jreback commented May 21, 2013

oh...that looks fine.....
can you squash to one?

@jreback
Copy link
Contributor

jreback commented May 21, 2013

so looks like will test with full_deps on 2.7, but not 3.2?

its fine, only think I would maybe be concerned about is unicode issues....e.g.

a 3.3/3.2 person parses a table which is treated as unicode (though you can 'simulate') this as well

@cpcloud
Copy link
Member Author

cpcloud commented May 21, 2013

no sorry i fixed so installed on all versions that's what all those commits were for before :)

@cpcloud
Copy link
Member Author

cpcloud commented May 21, 2013

crap unless they need a "python3" prefix...grumbles

@cpcloud
Copy link
Member Author

cpcloud commented May 21, 2013

(2.6 and 3.3 passed because the tests were ignored due to lack of deps)

@cpcloud
Copy link
Member Author

cpcloud commented May 21, 2013

i think this is a bs4 bug, will report. older version of bs4 included in apt works fine whereas pip version doesn't work. this still holds no matter which version of html5lib or lxml i use.

@cpcloud
Copy link
Member Author

cpcloud commented May 21, 2013

@jreback oh joy this worked!

@cpcloud
Copy link
Member Author

cpcloud commented May 21, 2013

should i note this in the docs? something like "if you're on a 32-bit machine use version X of bs4"

@jreback
Copy link
Contributor

jreback commented May 21, 2013

so if I understand

32-bit, use apt-get
64-bit, use pip (OR apt-get)?

yes, just put it in the doc string / docs / install.rst (but be as brief as possibly)....

@cpcloud
Copy link
Member Author

cpcloud commented May 21, 2013

one slight issue there that might matter: if u want (need) to use the apt-get bs4 then you cannot use lxml that comes with it (if u want to use lxml) you must use the pip version. this is so annoying...

@cpcloud
Copy link
Member Author

cpcloud commented May 21, 2013

travis fails bs4 + lxml tests when u use the apt-get version of lxml

@cpcloud
Copy link
Member Author

cpcloud commented May 21, 2013

i'm ready to just write my own f---ing html table parser in cython...

@jreback
Copy link
Contributor

jreback commented May 21, 2013

just put a big warning/note in the docs

@cpcloud
Copy link
Member Author

cpcloud commented May 21, 2013

i just remem'd that doesn't matter ATM since we're disallowing lxml anyway...(the lxml install issue doesn't matter atm the other issue does)

@jreback
Copy link
Contributor

jreback commented May 21, 2013

true....but apparently io.data uses lxml for some parsing (but if that's an issue then the user can deal with it)....
just put in a note

@cpcloud
Copy link
Member Author

cpcloud commented May 21, 2013

oh yeah i forgot about that, some stuff with yahoo stocks i think...almost finished

cpcloud added 2 commits May 21, 2013 08:55
add prefix for python3 versions

bs4lxml failed before let us install lxml from pip

here we go again
@jreback
Copy link
Contributor

jreback commented May 21, 2013

looks great...ready to merge?

@cpcloud
Copy link
Member Author

cpcloud commented May 21, 2013

Yep
On May 21, 2013 9:20 AM, "jreback" notifications@github.com wrote:

looks great...ready to merge?


Reply to this email directly or view it on GitHubhttps://github.com//pull/3655#issuecomment-18207940
.

@jreback
Copy link
Contributor

jreback commented May 21, 2013

ok.cross fingers

jreback added a commit that referenced this pull request May 21, 2013
@jreback jreback merged commit 79cda50 into pandas-dev:master May 21, 2013
@jreback
Copy link
Contributor

jreback commented May 21, 2013

all green on master!

@cpcloud
Copy link
Member Author

cpcloud commented May 21, 2013

Nice!
On May 21, 2013 10:07 AM, "jreback" notifications@github.com wrote:

all green on master!


Reply to this email directly or view it on GitHubhttps://github.com//pull/3655#issuecomment-18210802
.

@cpcloud
Copy link
Member Author

cpcloud commented May 21, 2013

Thanks for your help!
On May 21, 2013 10:07 AM, "jreback" notifications@github.com wrote:

all green on master!


Reply to this email directly or view it on GitHubhttps://github.com//pull/3655#issuecomment-18210802
.

@cpcloud cpcloud deleted the read-html-travis-fix-3654 branch May 22, 2013 02:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

TST: read_html tests failing when html5lib included on travis
2 participants