-
Notifications
You must be signed in to change notification settings - Fork 114
AddressBook + HTTP Proxy: store only unique subscription addresses #835
Conversation
4db8e67
to
37a7b30
Compare
New helper functions validate human readable hostnames, and ensure only unique entries are loaded from subscriptions. Refactor subscription validation + saving with new helpers. Refs monero-project#835
Refactor test cases and fixture for added utility/clarity. New test cases for validating only unique, well-formed entries are added to the address book. Refs monero-project#835
Save jump service addresses to in-memory address book, and remove singleton used for directly storing to disk. Add a URL parser, and refactor various components to use it. Refs monero-project#835
Ensure that saving jump service addresses are handled properly. Refs monero-project#835
37a7b30
to
eec1345
Compare
New helper functions validate human readable hostnames, and ensure only unique entries are loaded from subscriptions. Refactor subscription validation + saving with new helpers. Refs monero-project#835
Refactor test cases and fixture for added utility/clarity. New test cases for validating only unique, well-formed entries are added to the address book. Refs monero-project#835
Save jump service addresses to in-memory address book, and remove singleton used for directly storing to disk. Add a URL parser, and refactor various components to use it. Refs monero-project#835
HTTP jump service requests save valid addresses to in-memory address book, separating storage from HTTP message processing. Refs monero-project#835
Update test fixture with a destination that has a valid length. Previous destination fails identity validation checks, and throws a length error when decoded from base64. Refs monero-project#835
eec1345
to
22f9202
Compare
Refactor fixture for added utility + clarity. New test cases for validating only unique, well-formed entries are added to the address book. Refs monero-project#835
Save jump service addresses to in-memory address book, and remove singleton used for directly storing to disk. Add a URL parser, and refactor various components to use it. Refs monero-project#835
HTTP jump service requests save valid addresses to in-memory address book, separating storage from HTTP message processing. Refs monero-project#835 + monero-project#838
Rewrite to fix bugs uncovered by unit-tests. Refs monero-project#835 + monero-project#838
22f9202
to
5944bdd
Compare
Rewrite to fix bugs uncovered by unit-tests. Refs monero-project#835 + monero-project#838
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At 47 NACKs, I'll finish the review once the issues are resolved. Also look at your commit messages: they are inaccurate:
- You didn't rewrite the unit-test: you did unnecessary refactoring and added test-cases.
- You didn't rewrite address saving: you moved parsing from extraction and added a check.
- "HTTPProxy: rewrite to fix uncovered bugs" doesn't describe the refactor, nor does it describe any of the bugs anywhere (message title or body).
@@ -165,7 +209,9 @@ BOOST_AUTO_TEST_CASE(PGPClearSign) { | |||
"=P8Ug" | |||
"-----END PGP SIGNATURE-----"; | |||
lines.push_back(line); | |||
BOOST_CHECK(Validate()); | |||
//TODO(oneiric): PGP-signed subscriptions should pass validation | |||
// mark true after implementing PGP signature verification |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
implementing PGP signature verification
There's no real reason to do that. You probably mean ignoring pgp signature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't sure if PGP signatures are actually verified by AddressBook
. Thought it might be a good idea for validating the shipped hosts.txt
+ other subscriptions.
If AddressBook
ignores PGP signatures in subscriptions, is PGP verification a feature somewhere else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If AddressBook ignores PGP signatures in subscriptions, is PGP verification a feature somewhere else?
Yes, the reason for this is because our monero publisher service pulls the default subscription directly from github (see the pkg directory). I PGP sign the subscription to prevent any tampering; either by github or the very rare potential "I broke your TLS" mitm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok thank you, that answers my question completely.
file << host << "=" << ident.ToBase64() << '\n'; // TODO(anonimal): this is not optimal, especially for large subscriptions | ||
// Add to address book | ||
m_Storage->AddAddress(ident); | ||
m_Addresses[host] = ident.GetIdentHash(); // TODO(anonimal): setter? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Add to address book
m_Storage->AddAddress(ident);
m_Addresses[host] = ident.GetIdentHash(); // TODO(anonimal): setter?
.....do I really have to say why removing this is a bad idea?...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, you don't. This was left over from previous edits, and I re-added it to the try-catch block below.
Thanks for catching it.
src/client/address_book/impl.cc
Outdated
LOG(warning) << "AddressBook: malformed address, skipping"; | ||
continue; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No way. Not only does this vastly decrease efficiency (you're inserting for every line!), but insertion should stay decoupled from validation. There was no reason to refactor this function nor setup a vector nor use boost::split.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was my mistake, I was trying to check if addresses were already loaded in memory during validation. This function is now unchanged.
src/client/address_book/impl.cc
Outdated
std::string line; | ||
// To ensure valid hostname | ||
// Note: uncomment if this regexp fails on some locales (to not rely on [a-z]) | ||
//const std::string alpha = "abcdefghijklmnopqrstuvwxyz"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't remove this unless you've adequately tested across multiple locales.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I re-added the comment. Wasn't sure if this was leftover, and still don't really understand how the alpha string would help if the regex below fails.
src/client/address_book/impl.cc
Outdated
@@ -332,16 +338,8 @@ bool AddressBook::SaveSubscription( | |||
const std::map<std::string, kovri::core::IdentityEx> | |||
AddressBook::ValidateSubscription(std::istream& stream) { | |||
LOG(debug) << "AddressBook: validating subscription"; | |||
// Map host to address identity |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no functional reason to remove this comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, it got caught up when I was trying to separate out host validation from subscription validation.
src/client/proxy/http.cc
Outdated
if (helper_ident.size() != 2) | ||
throw std::length_error("HTTPProxyHandler: invalid number of jump service parameters"); | ||
return helper_ident; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overkill.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
src/client/proxy/http.cc
Outdated
m_Base64Destination = boost::network::uri::decoded(base64); | ||
return true; | ||
std::vector<std::string> const helper_ident = ParseJumpServiceQuery(); | ||
if (!helper_ident[1].empty()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needing to access like this exposes bad design. Don't repeat guzzi's mess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try not to. I feel infected by it.
src/client/proxy/http.cc
Outdated
pos = pos2; | ||
std::vector<std::string> const helper_ident = ParseJumpServiceQuery(); | ||
for (const auto& helper : m_JumpService) | ||
if (helper_ident[0] == helper) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needing to access like this exposes bad design. Don't repeat guzzi's mess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll try not to. I feel infected by it.
src/client/proxy/http.cc
Outdated
return false; | ||
} | ||
if (m_Address.empty()) | ||
ParseURL(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By now, you shouldn't need to parse anything more than once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
100% agree, I went overkill on this. I'll clean up this area in the HTTP proxy rewrite.
src/client/proxy/http.cc
Outdated
ParseURL(); | ||
core::IdentityEx ident; | ||
// Extra sanity check to ensure base64 destination is uri-decoded | ||
ident.FromBase64(boost::network::uri::decoded(m_Base64Destination)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By now, you shouldn't need to parse anything more than once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I was considering not adding this extra URI-decoding step. Reverted, and will keep in mind for the HTTP proxy rewrite.
Note: if your intent is only to |
New helper checks if host + address already loaded into memory. Refactor functions that store addresses to use the new helper. Refs monero-project#835
New helper to validate I2P hostnames, and refactor subscription validation to use it. Refs monero-project#835
New test case to ensure address book only inserts unique entries. Added struct + helper to test fixture for extracting an address book entry from a subscription line. Refs monero-project#835
Save jump service addresses to in-memory address book, and remove singleton used for directly storing to disk. Add a URL parser, and refactor various components to use it. Refs monero-project#835
HTTP jump service requests save valid addresses to in-memory address book, separating storage from HTTP message processing. Refs monero-project#835 + monero-project#838
Rewrite to fix bugs uncovered by unit-tests. Refs monero-project#835 + monero-project#838
Implement inserting an address to the in-memory address book. Will replace the singleton used by HTTP proxy, and is generic to be used by other member functions or external classes. Refs monero-project#835
New test case to ensure address book only inserts unique entries. Added struct + helper to test fixture for extracting an address book entry from a subscription line. Refs monero-project#835
Save jump service addresses to in-memory address book, and remove singleton used for directly storing to disk. Add a URL parser, and refactor various components to use it. Refs monero-project#835
HTTP jump service requests save valid addresses to in-memory address book, separating storage from HTTP message processing. Refs monero-project#835 + monero-project#838
New setter for in-memory address book to insert unique entries. Refactor functions that store addresses to use the new setter. Refs monero-project#835
New test case to ensure address book only inserts unique entries. Added struct + helper to test fixture for extracting an address book entry from a subscription line. Refs monero-project#835
5944bdd
to
0bbd406
Compare
Painful lesson. I get over-eager to do refactoring as I see the problem. Hard habit to break. I moved my HTTP Proxy refactor attempts (the few parts worth saving), and your recommendations to a separate branch dedicated to an HTTP message rewrite. Again, thank you for your patience, and our conversation in IRC about future refactors. Apologize for any time wasted, or blood vessels popped. |
0bbd406
to
75b5712
Compare
New test case to ensure address book only inserts unique entries. New test fixture struct to convert a subscription line into an address book entry. Refs monero-project#835
75b5712
to
40fd3da
Compare
New test case to ensure address book only inserts unique entries. New test fixture struct to convert a subscription line into an address book entry. Refs monero-project#835
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making all the necessary changes.
src/client/address_book/impl.cc
Outdated
{ | ||
const auto it = m_Addresses.find(host); | ||
if (it != m_Addresses.end()) | ||
throw std::runtime_error("AddressBook: host already loaded"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const auto it = m_Addresses.find(host);
if (it != m_Addresses.end())
throw std::runtime_error("AddressBook: host already loaded");
So, not only is this incredibly expensive, but it's not necessary because of m_Addresses[host] = address
std::map::operator[] returns a reference to the value that is mapped to a key equivalent to key, performing an insertion if such key does not already exist.
A much better solution is to tackle file writing / address saving after the map is properly updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't realize std::map
worked that way, pretty cool feature. Removed the quoted block, but kept the check for loaded addresses. Eventually, likely after multiple subscriptions PR, there should be a reverse address => hostname
lookup, making the current naive lookup unnecessary.
A much better solution is to tackle file writing / address saving after the map is properly updated.
I will keep this in mind while working on the multiple subscription PR. Do we want to follow Java I2P naming for filenames, i.e. userhosts.txt, privatehosts.txt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to follow Java I2P naming for filenames, i.e. userhosts.txt, privatehosts.txt?
There's no need, and userhosts.txt
is also confusing.
struct AddressBookEntry | ||
{ | ||
public: | ||
AddressBookEntry() : m_Host(""), m_Address() {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initializing a member string like this is unnecessary. Initializing m_Address
like this is unnecessary because we do that during Tag
construction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't sure here about initialization, so I added it just in case. Removed now.
AddressBookEntry() : m_Host(""), m_Address() {} | ||
|
||
explicit AddressBookEntry(const std::string& subscription_line) | ||
: m_Host(""), m_Address() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initializing a member string like this is unnecessary. Initializing m_Address
like this is unnecessary because we do that during Tag
construction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't sure here about initialization, so I added it just in case. Removed now.
} | ||
} | ||
|
||
inline const std::string& host() const { return m_Host; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
inline
Not necessary as this is implicitly inlined. inline
s are also suggestions which are not always observed by the compiler and may go ignored. The compiler may also inline when not explicitly stated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the inline
suggestion, obvious now it doesn't help. Will look more into inline
for future use.
|
||
inline const std::string& host() const { return m_Host; } | ||
|
||
inline const kovri::core::IdentHash& address() const { return m_Address; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
inline
Not necessary as this is implicitly inlined. inline
s are also suggestions which are not always observed by the compiler and may go ignored. The compiler may also inline when not explicitly stated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed the inline
suggestion, obvious now it doesn't help. Will look more into inline
for future use.
inline const kovri::core::IdentHash& address() const { return m_Address; } | ||
|
||
private: | ||
/// @var m_Host |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// @var m_Host
Not necessary. In fact, with members, you can do std::string m_Host; ///< Human-readable I2P hostname
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So much cleaner! Updated now.
/// @var m_Host | ||
/// @brief Human-readable I2P hostname | ||
std::string m_Host; | ||
/// @var m_Address |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// @var m_Address
Not necessary. In fact, with members, you can do kovri::core::IdentHash m_Address; ///< I2P address hash
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So much cleaner! Updated now.
BOOST_AUTO_TEST_CASE(RejectDuplicateEntry) | ||
{ | ||
// Ensure valid subscription line creates an entry | ||
BOOST_CHECK_NO_THROW(AddressBookEntry const entry(subscription.front())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AddressBookEntry const
const
not necessary here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, guess it is a bit too pedantic given the context.
// Ensure valid subscription line creates an entry | ||
BOOST_CHECK_NO_THROW(AddressBookEntry const entry(subscription.front())); | ||
|
||
AddressBookEntry const entry(subscription.front()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AddressBookEntry const
const
not necessary here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, guess it is a bit too pedantic given the context.
New setter for in-memory address book to insert unique entries. Refactor functions that store addresses to use the new setter. Refs monero-project#835
40fd3da
to
c691441
Compare
New test case to ensure address book only inserts unique entries. New test fixture struct to convert a subscription line into an address book entry. Refs monero-project#835
You're welcome, thanks for the thorough review. |
New test case to ensure address book only inserts unique entries. New test fixture struct to convert a subscription line into an address book entry. Refs monero-project#835
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optimizations (design, not compiler) will eventually be needed because you want to iterate through the entire map for every single address in a subscription.
Works well enough for now.
c691441
to
a673466
Compare
By submitting this pull-request, I confirm the following:
Fixes #814
edit: HTTP Proxy is changed indirectly through the Address Book singleton the proxy uses to save jump service addresses.