Skip to content

Removing data files for PROJ.4 #240

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Feb 2, 2017
Merged

Removing data files for PROJ.4 #240

merged 3 commits into from
Feb 2, 2017

Conversation

micahcochran
Copy link
Contributor

These are data files for PROJ.4. They are the same as the ones located here: https://github.com/OSGeo/proj.4/tree/master/nad

Micah Cochran added 2 commits December 14, 2015 10:17
These are data files for PROJ.4.  They are the same as the ones located
here:  https://github.com/OSGeo/proj.4/tree/master/nad
Removing more PROJ.4 files.   Same as these in pyproj:
https://github.com/jswhit/pyproj/tree/master/datumgrid
@WeatherGod
Copy link
Member

Failure on Travis (previous commit):

$ python lib/mpl_toolkits/basemap/test.py
Traceback (most recent call last):
  File "lib/mpl_toolkits/basemap/test.py", line 2, in <module>
    from mpl_toolkits.basemap import Basemap, shiftgrid
  File "/home/travis/build/matplotlib/basemap/venv/lib/python2.6/site-packages/mpl_toolkits/basemap/__init__.py", line 141, in <module>
    epsgf = open(os.path.join(basemap_datadir,'epsg'))
IOError: [Errno 2] No such file or directory: '/home/travis/build/matplotlib/basemap/venv/lib/python2.6/site-packages/mpl_toolkits/basemap/data/epsg'

@WeatherGod
Copy link
Member

Some other things I worry about with this PR:

  1. The test framework isn't anywhere near complete enough to ensure that something needed one of these files. We are going to need to be meticulous with checking over the codebase for any possible use of these files. This can get tricky as we can't just simply grep for their names; the file name could be pieced together from parameters.
  2. That, somehow, some of these files are needed as part of the process for generating updated packaged data (which wouldn't be covered by unit tests at all).
  3. That some people have come to rely on some of these files being here (although, I am more willing to let go of these than I am about imports).

Perhaps @jswhit could chime in with any knowledge you could impart?

@micahcochran
Copy link
Contributor Author

I, obviously, didn't test it out. I have no qualms with you being wary of these changes.

I made a patch to move the epsg file location from basemap's datadir to pyproj datadir. I'll more fully respond when I have some time.

@WeatherGod
Copy link
Member

Understood. No rush. Btw, how much smaller does this make the repository?

On Mon, Dec 14, 2015 at 1:45 PM, Micah Cochran notifications@github.com
wrote:

I, obviously, didn't test it out. I have no qualms with you being wary of
these changes.

I made a patch to move the epsg file location from basemap's datadir to
pyproj datadir. I'll more fully respond when I have some time.


Reply to this email directly or view it on GitHub
#240 (comment).

@micahcochran
Copy link
Contributor Author

Removing these files makes the uncompressed repository almost 6 MiB smaller.

@micahcochran
Copy link
Contributor Author

@WeatherGod Here's my response:

  1. You can grep "basemap_datadir". There are 11 instances of it in __init__.py. Most of which are involved in initialization or are file locations for datasets that are a part of basemap.

Coverage can be a way to identifying what code is not being run by unit testing, but it has its limits.

Geopandas does image comparison unit testing in test_plotting.py using compare_images from matplotlib. Examples plotmap_oo, simpletest_oo, geos_demo_2, and save_background could be used for image comparison, since simply generate an image. Those would be fairly easy to code unit tests (it is mostly coded) that better test matplotlib's capabilites. The downside is that minor changes that improve the image rendering (margin changes, better type, and so on) could be automatically flagged by the unit test. So, future matplotlib upgrades could cause these tests to fail. On the whole, I think image comparison unit testing would be a big benefit to the project.

  1. I visually scanned setup.py and the files in utils/ and none of the files proposed to be removed seems to be used by those files.

  2. All of these are projection oriented data files or for PROJ4. Many of these are lookup tables for a particular projection. epsg is a lookup table by number projection name with a proj4 string to define it. test* files are unit tests for proj4. The datumgrid files are grids that convert between datums NAD27 to NAD83. They should be able to make small changes to upgrade to this version. pyproj will still have them through a correctly installed PROJ4 library.

Caution is a reasonable response to a big change. I understand and respect that.

@WeatherGod
Copy link
Member

  1. That would probably cover most situations, yes. We would have to watch out for uses outside of __init__.py under a different name, though.

As for the coverage comment -- yes, I am quite aware that just because code is covered doesn't mean that it is sufficiently tested. However, code that isn't covered at all isn't tested at all, and the coverage in Basemap is dismal. So, I am just saying that at the moment, a green light from Travis doesn't mean much when we are taking stuff out.

As for image tests -- yes, I do want to start utilizing image testing in basemap. I am actually quite familiar with the framework and how it can be used properly. In particular, v1.5 of matplotlib updated the framework to include a default "style" argument to the image_comparison() decorator. This will allow matplotlib and other projects to keep their current baseline images as-is through the upcoming v2.0 style change release for as long as they want (so long as the specified style is supported). There has also been significant amount of work recently greatly reducing sources of variance across platforms and freetype releases. This to the point now that the vast majority of matplotlib's image_comparison tests (on master and the v2.0.x branch) are now running at zero tolerance level!

  1. Agreed, to a point. The problem is that I have yet to have actually gone through this process myself. Therefore, I just simply don't know that for sure yet. I have been bitten too many times in other projects by hidden and undocumented dependencies. This is why I really want @jswhit's opinion on this as the creator.

  2. That is my thinking as well. It isn't like this was properly packaged data in the first place...

@micahcochran
Copy link
Contributor Author

@jswhit Can you please review this PR?

@jswhit
Copy link

jswhit commented Jan 20, 2016

The original rationale for bundling pyproj (and the proj4 source code inside pyproj) was to make it easier to install basemap - one less external dependency. It also guards against problems caused by an old, or problematic proj4 installation. Having said that, I get that packagers don't want duplicate code, and there is value in slimming down basemap. Are we checking the pyproj and proj4 versions in setup.py? Since we need the hammer inverse projection, don't we need proj4 4.9.3 (which is not yet released)?

@micahcochran
Copy link
Contributor Author

One quick point proj4 code was deleted by PR #236. I see this PR as a cleanup.

Good point jswhit. setup.py is currently not checking pyproj and proj versions. That would be fairly easily to modify the code in diagnostics.py.

You are correct that proj4 4.9.3 is needed (but the git repository currently still has the version number at 4.9.2). The alternative would be to test for the inverse hammer projection. At the moment that test is a no-go because of the segfault that you get when testing for inverse projections that don't exist.

I requiring either a proj4 version greater than 4.9.2. If that is not the case, check if pyproj version greater than 1.9.5.1. If so, test if the inverse hammer test works. If not, give error message "cannot determine that the inverse hammer projection is present". Allow the user to have a '--force-install-proj4'. Carefully avoid segfaulting. Heck, I guess you could --force-inv-hammer test. How do you tell the user that in case this segfaults, your system doesn't have the inverse hammer projection? That is really bad design.

That is a bit messy. The next logical question is, do you need to do that same testing when in the __init__ code to check for inverse hammer?

I'm beginning to understand the method to your madness. 😃

@jdkloe
Copy link

jdkloe commented Jan 21, 2016

one addition from my side here: pyproj has been patched now, so it will no longer segfault if the hammer inverse is not available. So after doing a new release of pyproj it would be safe to just require that release version in basemap. No runtime segfault checking needed (if that is feasible at all...)
At that point you can just do a runtime check for the presence of the hammer inverse, and give users a proper warning if it is missing.
Ofcourse users will still not have this inverse, but they could install a pyproj version with bundled proj sources or a devel version of proj.4 if they really need it.

@micahcochran
Copy link
Contributor Author

Yes that is patched in the development version and requiring the next release of pyproj is a way to solve it. I think it could raise some ire, when this is just one feature of the whole basemap package.

Here's what I'm thinking of adding to diagnotics.py

Here's an ipython session of what that code outputs on my machine:

In [9]: from pyproj import __version__ as pyproj_version

In [10]: pyproj_version
Out[10]: '1.9.5.1'

In [11]: proj4_version()
Out[11]: '4.9.2'

In [12]: check_proj_inv_hammer()
Out[12]: 'Unknown'

In [13]: check_proj_inv_hammer(segfault_protection_off=True)
Out[13]: True

It looks like there might be a way to "catch" a segfault. I just found that and the above code does not reflect this knowledge.

@micahcochran
Copy link
Contributor Author

After digesting the how to "catch" a segfault, I updated that gist and added check_proj_inv_hammer2(). No parameters for this version.

The signal seems to works on Python 2.7. (Running pyproj 1.9.5.1, and proj4 4.9.2 recent git development version.) I've not tested this on pyproj with a proj4 version that is does not support the inverse hammer projection.

This should work okay for setup.py or __init__.py. For setup.py, the users would have to have pyproj already installed for the test (otherwise, ImportError).

@micahcochran
Copy link
Contributor Author

The segfault catching didn't work out. That code caused some systems to hang, thanks @jdkloe for the feedback. I'm going back to the first version of the gist.

@jswhit Here's what I can think of. This may not be the most elegant way to approach this.

  1. In setup.py run check_proj_inv_hammer(segfault_protection_off=False). If True meaning it has the inverse hammer project, continue to install. If false or unknown exit setup.py.
  2. Allow the user, to do a test say $ python setup.py test_inv_hammer_allow_segfault tell the user that if it segfaults that they do not have the inverse hammer projection installed (which is recommended for basemap). The user can then choose to force install $ pip basemap --install-option="force-install=True".

The problem with this is it will currently break all of the Travis installations except "nightly", which uses the pyproj git development version, unless all the other versions are "forced" to install.

The segfault issue very much makes the next version of pyproj a recommendation. I can make a PR for this, but I would recommend waiting on merging this for PROJ 4.9.3 or the next version of pyproj.

@jdkloe
Copy link

jdkloe commented Feb 3, 2016

Yes, having a test option for users would be useful, but even if mentioned in the INSTALL notes, how can you be sure the users are aware of this?
Maybe the setup.py should print a warning and exit (your option 1) unless an environment setting is defined called "PYPROJ_I_DID_READ_THE_SEGFAULT_WARNING" ? In travis you could set the env variable to ensure the builds work as before.
The warning message itself could then give the user details on how to do this test, or how to skip it if he doesn't need the hammer inverse.

@WeatherGod
Copy link
Member

@micahcochran, what is the state of this PR?

@micahcochran
Copy link
Contributor Author

@WeatherGod
I think it is complete. However, I've not looked at this in months. The intent was to remove the rest of the PROJ.4 and leave that to pyproj's responsibility to install.

If it needs to be review or if you need more details, it'll be Monday before I can get to it.

@micahcochran
Copy link
Contributor Author

@WeatherGod

I reviewed my PR. What it contains are PROJ.4 specific files that include unit test shell scripts, unit test result comparison files, projection code files (like epsg), and ascii grid shift files (.lla).

The first two are useless if you don't have bash and aren't used by pyproj itself for testing. The other two are used pyproj, but are installed in its setup.

Also, Inverse of hammer was merged into PROJ.4 release 4.9.3. I assume that version will be in Ubuntu 17.04.

In my opinion, it is ready to merge.

@jenshnielsen
Copy link
Member

I just submitted a pr to upgrade the homebrew version of proj Homebrew/homebrew-core#5289 to get the Inverse hammer fix in homebrew

@WeatherGod WeatherGod merged commit a8dca0b into matplotlib:master Feb 2, 2017
@micahcochran micahcochran deleted the rm-proj4 branch April 5, 2019 03:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants