Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support multiple URLs in a single entry #1769

Closed
wants to merge 3 commits into from
Closed

Support multiple URLs in a single entry #1769

wants to merge 3 commits into from

Conversation

yan12125
Copy link
Contributor

@yan12125 yan12125 commented Mar 22, 2018

WARNING: This work is far from complete. Use at your own risk.

Description

This patch adds several custom fields to an entry to support more complex features than the existing URL field, including regular expressions and URL blacklists, for URL matching with KeePassXC-Browser.

There are 4 new fields:

  • altURLs: Matching against additional URLs
  • regExURLs: Matching against regular expressions
  • blockedURLs: A blacklist for URL matching
  • regExBlockedURLs: Similar to blockedURLs but using regular expressions

Currently only the first two are implemented. To use them, add custom fields from the Advanced page of an entry. For example, in my Facebook entry, I have https://www.messenger.com/ in URL and the following additional fields:

default
default

With those settings, KeePassXC-Browser will match https://www.messenger.com/, https://www.facebookcorewwwi.onion/ and all HTTPS URLs ending with facebook.com, like https://www.facebook.com/ and https://zh-tw.facebook.com/.

Motivation and context

As described in #398, supporting multiple URLs in a single bring convenience and benefits over using references.

How has this been tested?

I've used this for months.

Screenshots (if appropriate):

N/A

Types of changes

  • ✅ New feature (non-breaking change which adds functionality)

Checklist:

  • ✅ I have read the CONTRIBUTING document. [REQUIRED]
  • ✅ My code follows the code style of this project. [REQUIRED]
  • All new and existing tests passed. [REQUIRED]
  • I have compiled and verified my code with -DWITH_ASAN=ON. [REQUIRED]
  • My change requires a change to the documentation and I have updated it accordingly. (TODO)
  • I have added tests to cover my changes. (TODO)

Discussions

  • 4 custom attributes, too many or too few?

I steal this idea from KeeFox as described in #398 (comment). Another implementation for similar purpose can be found at pfn/keepasshttp#340, which implements one additional field RegExp. Maybe there's a better design than the two existing implementations.

  • Compatibility issues

KeePassHTTP is dying, and KeePassXC is a pioneer for a native-messaging-based protocol, but I think we should at least talk with the author of https://github.com/smorks/keepassnatmsg on the design.

  • Is it better to put additional URLs in KeePassXC-Browser Settings than put them in standalone custom attributes? Currently I choose the latter approach for easy hand-editing, which might not be an issue if a GUI is implemented.

  • Extensions to the Allow key in KeePassXC-Browser Settings? For regExURLs, users need to update the Allow key whenever a new domain name is accepted (by checking "Remember this decision" in confirm access dialog). Maybe adding a regular expression to Allow is a good choice?

TODO

Known issues

  • With KeePassXC-Browser, submitUrl is empty if there are no <form>s. As a result, TOTP does not work with those additional fields.

@droidmonkey
Copy link
Member

droidmonkey commented Mar 22, 2018

I like this, but for "magic attributes" we would want a prefix of some kind like "URL_ALT" and "URL_REGEX" and stick with uppercase. I assume in future revisions you will add a gui dialog or similar to prompt the user for the values and the details will be handled by the gui.

@TheZ3ro
Copy link
Contributor

TheZ3ro commented Mar 22, 2018

I was having an idea about implementing this in a different way.

Just 1 additional attribute with a list of urlencoded urls separated with a ; character.
Some of them can have a regex with * character like in your proposal

Also I don't see the need of the blocked url, you either have a whitelist or a blacklist. Having both is a no sense. Can you explain the need of the blockedURLs ?

@yan12125
Copy link
Contributor Author

Sorry for the late response. Was busy on other stuffs.

for "magic attributes" we would want a prefix of some kind like "URL_ALT" and "URL_REGEX" and stick with uppercase

Sounds reasonable. Added to TODO

Just 1 additional attribute with a list of urlencoded urls separated with a ; character.
Some of them can have a regex with * character like in your proposal

Here's an issue - how to know which is a regular expression and which is not?

Also I don't see the need of the blocked url, you either have a whitelist or a blacklist. Having both is a no sense. Can you explain the need of the blockedURLs ?

Actually I don't know what's the purpose of blockedURLs, either. I write a placeholder to keep consistent with KeeFox, preserving the possibility of migration. For now I can drop that.

@TheZ3ro
Copy link
Contributor

TheZ3ro commented Mar 25, 2018

The * character is not allowed in hostname/domains/subdomains by RFC952 and RFC1123.
For regex in path and query strings the thing get more complicated.

Only my 2cent about making it KISS

@yan12125
Copy link
Contributor Author

Do you suggest that only strings with * should be treated as regular expressions? Then users won't be able to use a pattern like https://travis-ci\.(org|com)

@droidmonkey
Copy link
Member

I am tackling this issue in my advanced search work. I'll be posting it as PR tonight. Check out what i did when posted, we can extract the regex logic i made into a static function if need be.

@droidmonkey droidmonkey changed the title [WIP] Support multiple URLs in a single entry Support multiple URLs in a single entry Apr 3, 2018
@yan12125
Copy link
Contributor Author

yan12125 commented Apr 3, 2018

Rebased after the code formatting branch merged. No changes in functionality yet.

@neochrome
Copy link

What about having a key: KPH: urls which would be multiline (to support multiple urls) and also support shell globbing with * and possible **. I think that would be quite simple to use, and does one really need full regex support?
Example:

https://onesite.com/*/login
https://another-site.net/*
https://*.a-domain.io/*

@TheZ3ro
Copy link
Contributor

TheZ3ro commented Apr 3, 2018

Then users won't be able to use a pattern like https://travis-ci\.(org|com)

Thinking about this a little more..
Now you are checking the host if it's not a regex field and checking for a match if it is.

Just change the order. Check if there is a match when in regex mode. If there is all good. If there isn't fallback and check if the host is the same. if they are indeed the same you return true, if they aren't return false.

This will require a little of validation with the url field but it's required now as well since I don't reallt like passing the raw url content as regex

@droidmonkey droidmonkey added this to the v2.4.0 milestone Apr 5, 2018
@Thulium-Drake
Copy link

Thulium-Drake commented Apr 8, 2018

How does the implementation of your code currently handles punycode attacks? (Or an even better question, is it actually vulnerable to this?)

For example, if you go to this link (it's a demo page, so apart from showing the attack, harmless), and the browser shows https://www.apple.com, the browser is 'vulnerable' by not disambiguating any unicode characters to non-unicode.

https://xn--80ak6aa92e.com/

More information -> https://thehackernews.com/2017/04/unicode-Punycode-phishing-attack.html

@phoerious
Copy link
Member

We don't and there is also not really a reason for it. We are not a web browser, we only save whatever the user enters into the URL field and when the user wants us to open a URL, all we do is open a browser which hopefully has mitigations against this kind of attack.

@yan12125
Copy link
Contributor Author

yan12125 commented May 8, 2018

I added a commit to migrate altURLs and regExURLs to URL_ALT and URL_REGEX. Similar to implicit migration of KDBX 3 -> KDBX 4, such a migraion also occurs at database saving.

@MightyPork
Copy link

Sorry to bother, but is there any workaround or alternative way to do this, that can be used before this is merged?
I'm trying to use the same entry for amazon.de and amazon.co.uk, so they are picked up by the browser plugin correctly.

@yan12125
Copy link
Contributor Author

Sorry to bother, but is there any workaround or alternative way to do this, that can be used before this is merged?

@MightyPork: this might be useful: #398 (comment)

@kronn
Copy link

kronn commented Aug 19, 2018

I now create linked accounts. (Clone Entry -> Replace Username and Password with References) It's not quite the same with regards to expiry and such. OTOH, it works well enough for now.

@MightyPork
Copy link

thanks @kronn, this is easy and actually works. I never tried cloning before!

@erikvanoosten
Copy link

Whatever is chosen for value format, it must either be super simple, or specified in extreme detail.

This because there are more applications that read kdbx files (for example, I also use Keepass2Android) and ideally, these will be able to work seamlessly on the same data.

What about this idea: if a URL value contains one of these characters * or |, it will be considered a regular expression, else it must match exactly.

@Aetf
Copy link
Contributor

Aetf commented Aug 21, 2018

I also like the idea of storing things in a single advanced attribute. But why not just store the URL list as JSON or whatever existing serialization format rather than inventing some ad-hoc way to pack multiple strings together? A GUI can be implemented later to hide the complexity from users.

It also has the bonus point that plain or regex url can be easily identified by adding an extra field to the store object. Something like

[
{ "type": 1, "url": "https://example.com"},
{ "type": 2, "url": "https://regex\\.(com|org)"}
]

@erikvanoosten
Copy link

erikvanoosten commented Aug 21, 2018

@Aetf .kdbx files already supports multi-values so using json is not neccesary.
Also, keep in mind that clients need to iterate over all entries in the database. So forcing expensive json parsing to a client is not nice (think low end mobile phones).

@Aetf
Copy link
Contributor

Aetf commented Aug 21, 2018

@erikvanoosten I'm not sure if JSON parsing is expensive for this small scale... But I get your point. :) I just feel guessing the URL type from special characters isn't right.

@yan12125
Copy link
Contributor Author

@erikvanoosten What are multi-values? Sounds like a good idea for this.

BTW, I guess deserializing JSON values is much faster than decryption :)

@erikvanoosten
Copy link

@yan12125 You are right. However, unless you cache the results of the json deserialization (again not nice on a low end mobile phone), it will have to be done for each and every URL lookup, while the decryption result is always cached.

@MightyPork
Copy link

@erikvanoosten I may be missing your specific use case on mobile, but for me, as a user, that'd be totally negligible. I use the mobile app to find a password once a day, often even less. Even on the PC I need it just a few times a day. I think you're micro-optimizing here.

@droidmonkey
Copy link
Member

droidmonkey commented Aug 22, 2018

Json parsing is how the internet works, it is by far the fastest ascii based syntax. The only thing better that I know of is protobuf (binary).

You CAN store multiple values in kdbx (that is what attributes are) but then you have to uniquely name each key. That presents it's own issues.

We are introducing "custom attributes" from kdbx4 in v2.4, that might be an even better option. Really this PR boils down to UX, everything else trivial.

@Lantizia
Copy link
Contributor

Not sure if this is accidental or not...

But the main / normal attribute that the GUI shows on the normal side tab of 'Entry' (so not under 'Advanced') of simply 'URL'... it would appear that it does accept a semicolon

So for Steam I have...

https://store.steampowered.com;https://steamcommunity.com

When I'm in the main screen of KeePassXC (not editing the entry) it no longer shows this text as a working underlined/blue hyperlink.

But what is interesting is... KeePassXC-Browser can now understand this entry has two sites to prompt for permission on.

I'm not sure if this is intended or not - but please don't fix it until you've got an alternative in mind for multi-URL entries!!!

I've tried the above method (the altURLs advanced attribute thingy) and it just plain doesn't work (yes I'm on 2.3.4)

@saalkom
Copy link

saalkom commented Oct 29, 2018

@Lantizia

Made a GitHub account just to say thanks to you. altURLs has no effect under 2.3.4/KPXCBrowser-1.3.0 here either. I was racking my brain until I tried your workaround. Per #274 (comment) a vertical bar/pipe (|) works the same, although both apparently "break lot of things on KeePass side" (exactly what breaks is not elaborated on). But I second what you have to say about not "fixing" the unintended(?) functionality just yet.

@droidmonkey
Copy link
Member

droidmonkey commented Oct 30, 2018

@yan12125 can you change this to conform with the suggestion in #2356 (comment).

I personally do not think it is necessary to support regex and all the other features. That should be done by the supporting plugins (Browser) instead. Just like how the current URL field works.

@yan12125
Copy link
Contributor Author

Sorry, but I don't have much time before 11/13. I can update this PR then.

@droidmonkey
Copy link
Member

Ok I'll punt this to 2.4.1 or 2.5

@droidmonkey droidmonkey modified the milestones: v2.4.0, Future Oct 30, 2018
@heX16
Copy link

heX16 commented Nov 14, 2018

Sorry, but I don't have much time before 11/13. I can update this PR then.

I look forward to this PR! (I am very tired of making copies of records for different URLs)
Thanks for doing this important job!

Chih-Hsuan Yen added 3 commits November 29, 2018 10:14
@droidmonkey
Copy link
Member

I am closing this because we will implement this in 2.5.0 as part of our templates push.

@droidmonkey droidmonkey removed this from the v2.5.0 milestone Mar 24, 2019
@yan12125
Copy link
Contributor Author

Thanks for the info and sorry for lack of time and knowledge for finishing. It's great to hear there will be a more general feature. Is there an issue or a PR for that feature?

@droidmonkey
Copy link
Member

Not yet, that is a 2.5.0 planned feature

@frederickjh
Copy link

As part of 863 then?

@droidmonkey
Copy link
Member

Yes

@yan12125
Copy link
Contributor Author

Thank you both! I guess I got the idea of templates.

@yan12125 yan12125 deleted the wip/multiple-urls branch April 12, 2019 05:29
@Lantizia
Copy link
Contributor

For anyone else reading this in the future and especially the above comments on using a semicolon or pipe to seperate URLs on the one URL field.

Don't - if you use KeePassXC-Browser. It really confuses it and ends up asking for permission for sites which have nothing to do with the site you're currently on.

@yan12125
Copy link
Contributor Author

Now there is official support for multiple URLs in KeePassXC. In case anyone else is using my branch, here is a patch to migrate from URL_ALT to officially-supported KP2A URLs. URL_REGEX attributes are dropped, though, as I don't use it in many entries and it's non-trivial to migrate them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.