Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add iteration mechanism #131

Merged
merged 18 commits into from
Nov 16, 2023
Merged

Add iteration mechanism #131

merged 18 commits into from
Nov 16, 2023

Conversation

faktas2
Copy link
Contributor

@faktas2 faktas2 commented Nov 2, 2023

No description provided.

@faktas2 faktas2 force-pushed the fatih/add-iterator branch 3 times, most recently from 9d31fc8 to b801661 Compare November 2, 2023 15:10
@faktas2 faktas2 force-pushed the fatih/add-iterator branch from b801661 to fb269dd Compare November 2, 2023 15:11
Copy link
Member

@oschwald oschwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to have to do another closer pass, but here are some initial comments.

* Please note that a MaxMind DB may map IPv4 networks into several locations
* in an IPv6 database. This iterator will iterate over all of these locations
* separately. To only iterate over the IPv4 networks once, use the
* SkipAliasedNetworks option.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I mentioned this in the story, but I think we should make skipping the aliased networks the default. It is what most users want. The Go library is the other way only for backwards compatibility. Perhaps we could rename it to be something like includeAliasedNetworks as well.

Copy link
Contributor Author

@faktas2 faktas2 Nov 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry it is enabled by default below. I think the comment is misleading. I've fixed it.

}

/**
* Returns the next NetworksItem. You need to set the class using
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first, I thought "NetworksItem" referred to a particular type, but I don't see it used anywhere else. Maybe we could just say "the next item".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, that is supposed to be DataRecord. I had created another class just like DataRecord until realizing DataRecord did the job.

* Returns if Networks had any errors.
* @return Exception The exception to the Networks iteration.
*/
public Exception getErr() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do other iterators in Java use this pattern? I don't recall seeing it before. I am a bit concerned people will not bother checking it and miss that there was an exception. Perhaps a better alternative would be to create a NetworksIterationException (or something similar) that extends RuntimeException and then wrap the checked exceptions in that and rethrow them.

}
while (!this.nodes.isEmpty()) {
// Pop the last one.
NetworkNode node = this.nodes.remove(this.nodes.size() - 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if a Stack would be a more natural data structure than an ArrayList.

* @throws ClosedDatabaseException Exception for a closed databased.
* @throws InvalidDatabaseException Exception for an invalid database.
*/
public <T> Networks<T> networksWithIn(Network network, boolean skipAliasedNetworks)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public <T> Networks<T> networksWithIn(Network network, boolean skipAliasedNetworks)
public <T> Networks<T> networksWithin(Network network, boolean skipAliasedNetworks)

* @param bitCount The prefix.
* @return int[]
*/
public int[] traverseTree(byte[] ip, int node, int bitCount)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be private. It doesn't make sense for it to be part of the public API.

@@ -149,6 +153,10 @@ public <T> T get(InetAddress ipAddress, Class<T> cls) throws IOException {
return getRecord(ipAddress, cls).getData();
}

protected int getIpv4Start() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make more sense if this and the other methods were package private rather than protected. I suppose it doesn't matter too much as the class is final, but I think it would make it clearer that it isn't part of the API.

@faktas2 faktas2 force-pushed the fatih/add-iterator branch from 3563ca7 to 34c911f Compare November 3, 2023 20:47
Copy link
Member

@oschwald oschwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a few more comments. Also, could you add something to the README.md and the CHANGELOG.md files?


return new DatabaseRecord<T>(data, InetAddress.getByAddress(ip), prefixLength);
} catch (IOException e) {
throw new NetworksIterationException(e.getMessage());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should wrap the whole and set the cause to the original exception. This will allow users to access all the data in the original.

Suggested change
throw new NetworksIterationException(e.getMessage());
throw new NetworksIterationException(e);

this.nodes.push(new NetworkNode(ipRight, node.prefix, rightPointer));
node.pointer = this.reader.readNode(this.buffer, node.pointer, 0);
} catch (InvalidDatabaseException e) {
throw new NetworksIterationException(e.getMessage());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
throw new NetworksIterationException(e.getMessage());
throw new NetworksIterationException(e);

* Sets the Class for the data type in DataRecord.
* @param cls The class object. ( For example, Map.class )
*/
public void setDataClass(Class<T> cls) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason why this can't just be part of the constructor?

If we do need it, it should be package private, but I don't think we need it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code in go lets people retrieve data into different types when network(result any) is called. I thought we would want keep that. But, I agree. I'll move it to the constructor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that would have worked here anyway given that the class Networks<T> has a type parameter and the T here needs to match that.

}
}

public boolean isInIpv4Subtree(byte[] ip) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be private.

* @param prefix The prefix of the node.
* @param pointer The node number
*/
public NetworkNode(byte[] ip, int prefix, int pointer) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can drop public here.

*
* @param <T> The type of data returned by the iterator.
*/
public class Networks<T> implements Iterator<DatabaseRecord<T>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make this final.

@faktas2 faktas2 force-pushed the fatih/add-iterator branch from 6a54119 to 899d6ec Compare November 7, 2023 20:30
Comment on lines 73 to 75
* For example,
* networks.prepareForClass(Map.Class);
* Map test = networks.next();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The references to prepareForClass should be removed.

}

/*
* Next prepares the next network for reading with the Network method. It
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hasNext?

/*
* Next prepares the next network for reading with the Network method. It
* returns true if there is another network to be processed and false if there
* are no more networks or if there is an error.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"or if there is an error" can probably be removed now.

I think we should document that it throws NetworksIterationException.

/**
* Creates a Networks iterator and skips aliased networks.
* Please note that a MaxMind DB may map IPv4 networks into several locations
* in an IPv6 database. This iterator will iterate over all of these locations
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is wrong now. It will only iterate over the canonical location, not the aliases.

* Please note that a MaxMind DB may map IPv4 networks into several locations
* in an IPv6 database. This iterator will iterate over all of these locations
* separately. To disable skipAliasedNetworks, which iterates over the IPv4 networks
* only once, use the SkipAliasedNetworks parameter.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* only once, use the SkipAliasedNetworks parameter.
* only once, use the skipAliasedNetworks parameter.

* Please note that a MaxMind DB may map IPv4 networks into several locations
* in an IPv6 database. This iterator will iterate over all of these locations
* separately. To set the iteration over the IPv4 networks once, use the
* SkipAliasedNetworks option.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* SkipAliasedNetworks option.
* skipAliasedNetworks option.

Comment on lines 239 to 242
InetAddress ipv4 = InetAddress.getByAddress(new byte[4]);
InetAddress ipv6 = InetAddress.getByAddress(new byte[16]);
Network ipAllV4 = new Network(ipv4, 0); // Mask 32.
Network ipAllV6 = new Network(ipv6, 0); // Mask 128.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would save allocations to only create these when you actually need them (i.e., move the v6 stuff to the if block and the v4 stuff after the if block).

* @param bitCount The prefix.
* @return int[]
*/
private int[] traverseTree(byte[] ip, int node, int bitCount)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like this could be shared with getRecord. Also, that has some improvements over this, e.g., the use of startNode.

"mixed",
new String[]{
// ::/96
"1.1.1.1/32",
Copy link
Contributor Author

@faktas2 faktas2 Nov 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to change the ::/96 results to the ipv4 versions after making the change you suggested. Could you please check if this is correct?

Copy link
Member

@oschwald oschwald Nov 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is right based on the database specification, but the ones at ::ffff:0:0/96 should look like ::ffff:1.1.1.1/128 or ::ffff:101:101/128. This is what I was referring to when we talked.

However, I just looked at the InetAddress docs, and it looks like this has behavior similar to net.IP in Go:

In InetAddress and Inet6Address, it is used for internal representation; it has no functional role. Java will never return an IPv4-mapped address. These classes can take an IPv4-mapped address as input, both in byte array and text representation. However, it will be converted into an IPv4 address.

Given that it isn't possible to return ::ffff:1.1.1.1 or whatever, I suppose it would be best to just embrace it. I think we should switch it back to something more similar to what you had. Sorry about that!

Another option would be to just remove the option. Including the aliased networks seems kind of a niche need. I didn't include them with the Python version.

README.md Outdated

```java
Reader reader = new Reader(file);
Networks networks = reader.networks(false, Map.class);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Networks networks = reader.networks(false, Map.class);
Networks networks = reader.networks(Map.class);

Given that we have the overload, we may as well use it here.

Reader reader = new Reader(file);
Networks networks = reader.networks(false, Map.class);

while(networks.hasNext()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would have been nice if we could implement Iterable, but I guess that isn't really possible given that we need to pass the class through. I guess it doesn't matter too much.

"mixed",
new String[]{
// ::/96
"1.1.1.1/32",
Copy link
Member

@oschwald oschwald Nov 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is right based on the database specification, but the ones at ::ffff:0:0/96 should look like ::ffff:1.1.1.1/128 or ::ffff:101:101/128. This is what I was referring to when we talked.

However, I just looked at the InetAddress docs, and it looks like this has behavior similar to net.IP in Go:

In InetAddress and Inet6Address, it is used for internal representation; it has no functional role. Java will never return an IPv4-mapped address. These classes can take an IPv4-mapped address as input, both in byte array and text representation. However, it will be converted into an IPv4 address.

Given that it isn't possible to return ::ffff:1.1.1.1 or whatever, I suppose it would be best to just embrace it. I think we should switch it back to something more similar to what you had. Sorry about that!

Another option would be to just remove the option. Including the aliased networks seems kind of a niche need. I didn't include them with the Python version.

Copy link
Member

@oschwald oschwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I had a couple of small style changes.

/**
* This class represents an error encountered while iterating over the networks.
* The most likely causes are corrupt data in the database, or a bug in the reader code.
*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to have a <p> here?

/**
* Returns the node number and the prefix for the network.
* @param ip The ip address to travese.
* @param node The number of the node.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This param doesn't exist.


/**
* Returns the node number and the prefix for the network.
* @param ip The ip address to travese.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"traverse"

this.reader = reader;
this.includeAliasedNetworks = includeAliasedNetworks;
this.buffer = reader.getBufferHolder().get();
this.nodes = new Stack<NetworkNode>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
this.nodes = new Stack<NetworkNode>();
this.nodes = new Stack<>();

prefixLength -= 96;
}

return new DatabaseRecord<T>(data, InetAddress.getByAddress(ip), prefixLength);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return new DatabaseRecord<T>(data, InetAddress.getByAddress(ip), prefixLength);
return new DatabaseRecord<>(data, ipAddr, prefixLength);

Comment on lines 326 to 330
Networks<T> networks = new Networks<T>(this, includeAliasedNetworks,
new Networks.NetworkNode[]{new Networks.NetworkNode(ipBytes, prefix, node)},
typeParameterClass);

return networks;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Networks<T> networks = new Networks<T>(this, includeAliasedNetworks,
new Networks.NetworkNode[]{new Networks.NetworkNode(ipBytes, prefix, node)},
typeParameterClass);
return networks;
return new Networks<>(this, includeAliasedNetworks,
new Networks.NetworkNode[]{new Networks.NetworkNode(ipBytes, prefix, node)},
typeParameterClass);

@oschwald oschwald merged commit 47492ef into main Nov 16, 2023
@oschwald oschwald deleted the fatih/add-iterator branch November 16, 2023 18:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants