Datahandle test improvements and Endianness fixes #337

gab1one · 2018-09-28T14:38:18Z

The coverage of DataHandle by DataHandleTest was not sufficient, a lot of methods where untested. This PR greatly extends the test coverage and fixes some discovered bugs.

ctrueden

Thanks for working on this! I think it's great. There is just the performance concern I mentioned. Can you make a byte[8] upfront and just reuse it in all the places you need a small byte array?

ctrueden · 2018-10-02T23:47:47Z

src/main/java/org/scijava/io/handle/DataHandle.java

-		}
-		if ((ch0 | ch1) < 0) throw new EOFException();
-		return (short) ((ch0 << 8) + (ch1 << 0));
+		final byte[] buf = new byte[2];


This will destroy performance. We cannot allocate a new byte array with every short read. You could use an instance field buffer of sufficient size to avoid the issue.

How can we cache this in the interface though? Should we require all DataHandle implementations to provide a buffer array which is used for conversions? Should be easy to put into the AbstractDataHandle

ctrueden · 2018-10-02T23:49:12Z

src/main/java/org/scijava/io/handle/DataHandle.java

-		}
-		if ((ch0 | ch1 | ch2 | ch3) < 0) throw new EOFException();
-		return ((ch0 << 24) + (ch1 << 16) + (ch2 << 8) + (ch3 << 0));
+		final byte[] buf = new byte[4];


Same: performance will tank. Need a small reusable byte[] as instance field. This should be OK, since these methods are already not thread-safe.

ctrueden · 2018-10-02T23:49:23Z

src/main/java/org/scijava/io/handle/DataHandle.java

-			ch0 = read();
-		}
-		if ((ch0 | ch1 | ch2 | ch3 | ch4 | ch5 | ch6 | ch7) < 0) {
+		byte[] buf = new byte[8];


Need to use a reusable buffer.

gab1one · 2018-10-04T09:32:54Z

@ctrueden, I addressed your performance concern in 7b7e03b
I am not sure I like this solution though, it might be better to just push the affected methods into AbstractDataHandle and not expose the conversionBuffer in the interface.

This is both more consistent and correct

- StringBuilder is faster and recommended for most cases

RandomAccessFile is always BigEndian, so we need to use the interface methods for conversion.

The was not sufficient before which resulted in various problems not being discovered.

- this avoids allocating a new byte array for each read

ctrueden · 2018-10-13T15:36:03Z

@gab1one I considered different solutions to address the issue of conversionBuffer() as public API in the interface.

One idea I had was to use a static ThreadLocal<byte[]> in a utility class, to give back a unique byte[] per thread. I benchmarked it, and the performance is equivalent to your direct field storing the byte[] per instance of DataHandle.

Then I thought: where will this utility class live? What will it be called? It is pretty specialized, always returning a byte[8]. It could be generalized, but maybe YAGNI applies here. I figured we want this to be an internal implementation detail, so the utility class itself should maybe be package-protected.

But then I wondered: is it ever useful to be able to override the conversion buffer access? I am torn. There might be cases where it is helpful downstream; I don't know. (But this also might be a case of YAGNI.)

I am definitely not eager to move so many default method implementations in the abstract layer, since it makes implementing the interface directly much more complicated, even though as you point out it would certainly have better encapsulation regarding this issue.

In the absence of a clearly superior alternative, I'm going to merge this as-is with the conversionBuffer() method exposed in the public API.

By the way, in my investigations, I was curious whether interfaces can have inner classes, and indeed they can! But they are always public static. (I was thinking this is a way to embed static fields in an interface: nest them inside an inner class as static fields of that class. But then the interface has a weird inner class exposed to the world, which is still not very nice.)

gab1one added bug enhancement labels Sep 28, 2018

gab1one requested a review from ctrueden September 28, 2018 14:38

ctrueden added the in progress label Sep 28, 2018

ctrueden force-pushed the datahandle-endianness branch from a399c42 to d2ff6ce Compare October 2, 2018 14:14

ctrueden removed the in progress label Oct 2, 2018

ctrueden requested changes Oct 2, 2018

View reviewed changes

ctrueden added the to do label Oct 3, 2018

ctrueden added in progress and removed to do labels Oct 9, 2018

gab1one added 5 commits October 12, 2018 17:44

DataHandle: use Bytes helper methods for correct endian coding

3e72152

This is both more consistent and correct

DataHandle#readLine() use StringBuilder instead of StringBuffer

64f6419

- StringBuilder is faster and recommended for most cases

FileHandle: use DataHandle methods for all endianness affected IO

5835b94

RandomAccessFile is always BigEndian, so we need to use the interface methods for conversion.

DataHandleTest: extend to cover all methods in DataHandle

e1290fa

The was not sufficient before which resulted in various problems not being discovered.

DataHandle: Use persistent buffer for conversions.

e9f820a

- this avoids allocating a new byte array for each read

gab1one force-pushed the datahandle-endianness branch from 7b7e03b to e9f820a Compare October 12, 2018 15:45

ctrueden merged commit 2ea19d2 into master Oct 13, 2018

ctrueden deleted the datahandle-endianness branch October 13, 2018 15:40

ctrueden removed the in progress label Oct 13, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Datahandle test improvements and Endianness fixes #337

Datahandle test improvements and Endianness fixes #337

Uh oh!

gab1one commented Sep 28, 2018 •

edited by ctrueden

Loading

Uh oh!

ctrueden left a comment

Uh oh!

ctrueden Oct 2, 2018

Uh oh!

gab1one Oct 4, 2018 •

edited

Loading

Uh oh!

ctrueden Oct 2, 2018

Uh oh!

ctrueden Oct 2, 2018

Uh oh!

gab1one commented Oct 4, 2018 •

edited

Loading

Uh oh!

ctrueden commented Oct 13, 2018 •

edited

Loading

Uh oh!

Uh oh!

Datahandle test improvements and Endianness fixes #337

Datahandle test improvements and Endianness fixes #337

Uh oh!

Conversation

gab1one commented Sep 28, 2018 • edited by ctrueden Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ctrueden left a comment

Choose a reason for hiding this comment

Uh oh!

ctrueden Oct 2, 2018

Choose a reason for hiding this comment

Uh oh!

gab1one Oct 4, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ctrueden Oct 2, 2018

Choose a reason for hiding this comment

Uh oh!

ctrueden Oct 2, 2018

Choose a reason for hiding this comment

Uh oh!

gab1one commented Oct 4, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ctrueden commented Oct 13, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

gab1one commented Sep 28, 2018 •

edited by ctrueden

Loading

gab1one Oct 4, 2018 •

edited

Loading

gab1one commented Oct 4, 2018 •

edited

Loading

ctrueden commented Oct 13, 2018 •

edited

Loading