This repository has been archived by the owner on Jun 24, 2022. It is now read-only.

Overhauling Instant Messengers + add Session messenger #2293

Open
wants to merge 18 commits into
base: master
from

Conversation

lrq3000
Contributor

@lrq3000 lrq3000 commented May 15, 2021

Description

Add Session messenger, which uses onion routing and end-to-end encryption. Added a new subsection "Nodal messengers" to include onion-routing messengers and blockchain-based messengers, as their advantages and disadvantages differ from those of other types.

Resolves: #1678
Resolves: #2232 (redundant with #2311 , this will be removed here if the other PR gets merged before)
Resolves: #1357

Check List

  • I understand that by not opening an issue about a software/service/similar addition/removal, this pull request will be closed without merging.

  • I have read and understand the contributing guidelines.

  • The project is Free Libre and/or Open Source Software

@lrq3000 lrq3000 requested a review from a team as a code owner May 15, 2021 02:37
@lrq3000 lrq3000 marked this pull request as draft May 15, 2021 02:53
@lrq3000 lrq3000 closed this May 15, 2021
@lrq3000 lrq3000 reopened this May 15, 2021
@lrq3000
Contributor Author

lrq3000 commented May 15, 2021

Build failing because of #2232 I think. I tried to build a PR with just a new image, no markup change, and it still failed.

@lrq3000 lrq3000 marked this pull request as ready for review May 15, 2021 03:38
@lrq3000 lrq3000 changed the title add Session messenger + new sub-section for nodal messengers Add Session messenger + new sub-section for nodal messengers May 15, 2021
Signed-off-by: Stephen L. <[email protected]>
@lrq3000
Contributor Author

lrq3000 commented May 15, 2021

Ok I fixed #2232 so that the netlify preview build is online. It's ready for review.

@lrq3000 lrq3000 changed the title Add Session messenger + new sub-section for nodal messengers Add Session messenger + new sub-section for nodal messengers + fixes #2232 May 15, 2021
@dngray dngray self-requested a review May 15, 2021 06:14
@dngray
Collaborator

dngray commented May 15, 2021

I would have thought Decentralized would have been a better definition than "Nodal" which I've actually not seen anywhere in CS.

Throwing Tor in there isn't strictly correct either and complicates things. If you're talking about .onion servers that is decentralized as well, as both peers contact a HSDir in order to rendezvous.

With Session I always believed it to be "decentralized" with some degree of being a distributed network. The service nodes are required for message storage.

What I am actually thinking is we might change the "Federated" heading to be "Decentralized" and put Session in that category. The Matrix team also refers to Matrix as being "Decentralized".

With Tox or Ring (Briar requires Tor), it's a mixture, as in "distributed" for peer discovery (DHT) and then point A to point B for actual communication (peer-to-peer). Predominantly, as we care about metadata in this context, I would classify it as "peer-to-peer". Neither is anonymous, and both come with warnings to use Tor if that is required.

268562053

@jeroenev

jeroenev commented May 15, 2021

Tbh when looking at the mentioned upsides/downsides of federated services and p2p applications I think session fits neither
Think the service nodes + onion routing fixes some issues that p2p has and some issues that federated models have

@dngray
Collaborator

dngray commented May 15, 2021

Tbh when looking at the mentioned upsides/downsides of federated services and p2p applications I think session fits neither

I'd still put it under Decentralized, (assuming we change Federated to Decentralized) and then say what things don't apply and why.

Think the service nodes + onion routing fixes some issues that p2p has and some issues that federated models have

By definition that is still a decentralized network. The service nodes contain encrypted message data

onion routing fixes

The main one I saw there was requiring the service node to put down 15k to prevent Sybil attacks, that said to a well financed entity I don't think 15k is a lot. I guess it does depend on how many nodes there are in total though.
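As a rough way to put numbers on that concern, here is a minimal back-of-the-envelope sketch. It assumes the figures quoted later in this thread (about 1750 service nodes, a stake of roughly $21,000 per node) and ignores any price impact from buying that much Oxen. The function names are hypothetical and not part of any Session or Oxen tooling.

```python
import math


def nodes_needed(honest_nodes: int, target_fraction: float) -> int:
    """Smallest number of added nodes k such that k / (honest_nodes + k) >= target_fraction."""
    return math.ceil(target_fraction * honest_nodes / (1 - target_fraction))


def stake_cost_usd(node_count: int, usd_per_node: float) -> float:
    """Total stake in USD, assuming a fixed price per node (an assumption)."""
    return node_count * usd_per_node


if __name__ == "__main__":
    HONEST_NODES = 1750      # approximate figure quoted in this thread
    USD_PER_NODE = 21_000    # approximate stake value quoted in this thread
    for fraction in (0.1, 0.5):
        k = nodes_needed(HONEST_NODES, fraction)
        print(f"{fraction:.0%} of the network: {k} nodes, ~${stake_cost_usd(k, USD_PER_NODE):,.0f}")
```

The cost grows quickly with the targeted share of the network, which is why the total node count matters as much as the per-node stake.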

@lrq3000
Contributor Author

lrq3000 commented May 15, 2021

Thank you both for your replies.

So indeed, I devised this new "nodal" category, because I do think this is a different category from federated. Both are indeed subtypes of decentralized networks, but their conceptual differences produce very significant differences in their threat models and use cases. For instance:

  • Federated servers are still a semi-centralized model, but instead of having one authority controlling the servers, it's multiple authorities. When authorities include the wide public (ie, anybody can run a server), we can say it's decentralized. But this model still imbues controlling powers to the server owners, such as access restrictions, filtering users, content (eg, by keyword) or other servers. Also, the user leaks metadata to the server they connect to.
  • Nodal networks, on the other hand, decouple nodes from authority: nodes are agnostic. The nodes have no means to filter any content or user, and banning other nodes is fruitless. Banning is only used against malicious nodes to protect the network, and it cannot impair users' ability to communicate, since a new route can be created at any moment. Nodes also get no metadata about who the source is (except of course the ID/public key). Nodal networks can in fact be seen as self-contained networks, although I never saw this terminology used for communication systems. This is not surprising given that this is a very new kind of communication system, Session being one of the first to implement it fully and robustly (BCM messenger needed servers that are now down, and other messengers rely on Tor, with less than satisfying reliability, as we know, since they were not designed for instant communication). I used the examples of onion routing (such as Tor) and blockchain because those are the technologies that underlie this new class of communication systems, Session being a precursory example (the nodes use the Oxen blockchain - I should add this info BTW) but certainly not the last; we will certainly see more in the future.

So I'm strongly convinced that merging federated with nodal network messengers would be inaccurate and confusing, as both models work very differently. So although both would fit in a "decentralized" section, I think it's much clearer for users to separate and describe their respective pros and cons.

However I am not attached to the "nodal" typology, if you find a better name...

And that's just my opinion and reasoning for this PR. I will modify this PR according to the editorial board's decision of course.

@gary-host-laptop

@lrq3000 I totally agree with you although I find that decentralized fits perfectly into what Matrix does, and distributed into what Session does, at least if we look at the graphs above, but maybe I am missing something.

I also think it would be a nice idea to add a small graph next to each category to make it easier for end user to understand in my opinion, even more if you choose to stay with the nodal definition which will be more confusing.

@lrq3000
Contributor Author

lrq3000 commented May 15, 2021

@LongJohn-Silver Yes, certainly Matrix fits in the decentralized model. But Session doesn't fully fit in the distributed model either, although I guess it would be a better fit.

The quintessential example of the distributed model is peer-to-peer, where all nodes are connected together and play both the roles of users and relayers for other users. Here, with the nodal model Session uses (which is kind of a hybrid between decentralized and distributed now that I think about it), the nodes aren't users, and users aren't nodes. The users are shielded and anonymized precisely because they are outside of the network, and once they enter the network, the nodes take care of all the work for them, the users do not contribute to the network at all.

If I would draw what I think is a nodal network, it would be something like this (excuse me for the crude photoshopping, I'm no artist):

nodal-network

In green are the users (sender and recipient of a message in a Session communication for example), in black are the nodes selected to route messages for this communication, in grey the other nodes that are not involved for this communication, but can be for others. This shows the users are outside of the network, not part of it contrary to a distributed network. Also, the route is not selected to be the fastest, but by other metrics, so that the route acts as a further isolated subnetwork inside the whole nodal network. (Kinda like how the brain works, functional connectivity vs structural connectivity, not all nodes are used and not necessarily the fastest one but the most effective for the task at hand).
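To make the "users outside the network" idea concrete, here is a toy sketch of layered routing. It is only an illustration under stated assumptions: the route is picked at random (the real selection metrics are not described here), the layers are plain dictionaries with no encryption at all, and `pick_route`, `wrap`, and `trace` are hypothetical names, not Session's API.

```python
import random


def pick_route(nodes, hops=3, rng=random):
    """Pick distinct relay nodes for one conversation (random choice is an assumption)."""
    return rng.sample(nodes, hops)


def wrap(message, route, recipient):
    """Nest the message so each relay only learns the next hop.

    A real onion-routing scheme would encrypt each layer for one node;
    this toy version only shows the nesting.
    """
    next_hops = route[1:] + [recipient]
    packet = message
    for next_hop in reversed(next_hops):
        packet = {"next": next_hop, "inner": packet}
    return packet


def trace(packet, route):
    """Peel one layer per relay and record what each relay gets to see."""
    seen = []
    for node in route:
        seen.append((node, packet["next"]))
        packet = packet["inner"]
    return seen, packet


if __name__ == "__main__":
    nodes = [f"node{i}" for i in range(10)]
    route = pick_route(nodes)
    seen, delivered = trace(wrap("hello", route, "recipient"), route)
    print(seen, delivered)
```

The sender and recipient never appear in the node list, and each relay sees only the hop that follows it, which is the property the figure is meant to convey.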

About the graphs yes it can be nice to add but I wonder if users will understand? We can also add a link to this tutorial maybe, which has a very illustrative animated version of the graph above to demonstrate the difference in resilience: https://web.archive.org/web/20200614011014/https://hackernoon.com/a-state-of-the-art-of-decentralized-web-part-1-54f70fdb7355

@lrq3000
Contributor Author

lrq3000 commented May 15, 2021

What about renaming the section just "Blockchain"? Although this kind of network architecture is not specific to blockchains, most implementations use blockchains, so that would fit most cases.

@KeeJef

KeeJef commented May 16, 2021

I think blockchain could be confusing, because people immediately assume that messages are stored on the blockchain which is not the case for Session. I don't mind decentralised or distributed either fits well imo.

To give a little background on the network: there are ~1750 Service Nodes, each having to stake at least 15,000 Oxen (around $21,000 USD). These Service Nodes are broken up into "swarms", which are groups of 5-7 nodes responsible for a deterministic subset of the network's Session IDs. When you send an encrypted message to a user, you send that message to the swarm of nodes responsible for storing messages belonging to their Session ID. It is then replicated amongst those 5-7 nodes for redundancy and stored until the TTL expires. Users check their swarm for messages belonging to them; once the TTL expires, the messages are purged by all nodes in the swarm.
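The swarm behaviour described above can be sketched roughly as follows. This is a simplified model, not Session's actual algorithm: the hash-and-modulo mapping, the `SWARM_COUNT` value, and the `Swarm` class are all assumptions made for illustration.

```python
import hashlib
import time

SWARM_COUNT = 290  # hypothetical: roughly 1750 nodes split into groups of 5-7


def swarm_for(session_id: str, swarm_count: int = SWARM_COUNT) -> int:
    """Deterministically map a Session ID to a swarm index (illustrative rule only)."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % swarm_count


class Swarm:
    """A group of nodes that each keep a full copy of the stored messages."""

    def __init__(self, node_names):
        # Per node: {session_id: [(expires_at, ciphertext), ...]}
        self.nodes = {name: {} for name in node_names}

    def store(self, session_id, ciphertext, ttl_seconds, now=None):
        """Replicate an encrypted message to every node in the swarm."""
        now = time.time() if now is None else now
        for inbox in self.nodes.values():
            inbox.setdefault(session_id, []).append((now + ttl_seconds, ciphertext))

    def retrieve(self, session_id, node_name, now=None):
        """Return the unexpired messages one node holds for a Session ID."""
        now = time.time() if now is None else now
        stored = self.nodes[node_name].get(session_id, [])
        return [msg for expires_at, msg in stored if expires_at > now]

    def purge_expired(self, now=None):
        """Drop messages whose TTL has passed on all nodes."""
        now = time.time() if now is None else now
        for inbox in self.nodes.values():
            for session_id in list(inbox):
                inbox[session_id] = [(e, m) for e, m in inbox[session_id] if e > now]
                if not inbox[session_id]:
                    del inbox[session_id]
```

Because every node in the swarm holds the same copy, a recipient can fetch messages from any of them, and losing one node does not lose the message.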

When we look at the available categories

Centralised
No, because the network is made up of ~1750 nodes run by different operators, where the user has an equal chance of using any of the Service Nodes

Federated
Not in the traditional sense. Federated generally implies a smaller number of centralised servers, where those servers are interconnected and sharing data and where the user can choose which server they want to provide services to them. Session users don't get a choice over which Service Nodes they use for message storage or onion routing, the protocol makes these choices based on set rules.

Peer to Peer
No, because clients don't store messages, connect to each other directly, or provide any services to the network

In saying this I think moving Element and Session into the same category of "Decentralised" would be confusing since the network layout in practice is very different between the two, for the reasons described above.

@jeroenev

jeroenev commented May 16, 2021

Yeah session definitely is an interesting case
Technically it seems something in between p2p and federated, but UX/usability wise it feels very much like a centralized messenger, since there's no server to choose from like with federation and no always-on requirement like with p2p messengers

@dngray
Collaborator

dngray commented May 16, 2021

I am thinking it might be best to have 3 categories, Centralized, Decentralized, and Distributed.

Then talk about each application. The reason is because "peer-to-peer" brings its own issues in regard to Ring etc, as it is.. but it is also distributed in regard to the DHT tables where peers are matched. I actually think we could improve the pros/cons section as well by marking specifically which ones may not apply. We don't list that many things so it should be easy.

I would have thought Nodal would have been a type of distributed network. To me it seems closer to that than decentralized, because although there are "supernodes" they aren't really the same as servers in matrix, xmpp, or email. etc

I do agree, with @KeeJef, we should not mention blockchain, people will assume the messages are on the chain. The more jargon the more complicated it gets.

I also think it would be a nice idea to add a small graph next to each category to make it easier for end user to understand in my opinion, even more if you choose to stay with the nodal definition which will be more confusing.

Funny you mention that: when I put the picture up there in this post #2293 (comment) I thought about that. I really think this would be a great idea; it would also help break up the page a bit more.

I think it still would be best not to mention "Nodal" specifically, but just refer to it as a kind of Distributed network, what do you think @KeeJef?

@KeeJef

KeeJef commented May 17, 2021

I think it still would be best not to mention "Nodal" specifically, but just refer to it as a kind of Distributed network, what do you think @KeeJef?

Yeah, I think nodal might be confusing since it's not really a widely used term in this space. Most people understand the gist of what decentralized/distributed means, although I think the two are often conflated with each other.

@Dyrimon

Dyrimon commented May 19, 2021

The link for Oxen Dashboard is corrupted, unnecessary https:// included @lrq3000

@lrq3000
Contributor Author

lrq3000 commented May 24, 2021

Mmmm I am getting some error when trying to include some svg images:

This page contains the following errors:
error on line 58 at column 17: Encoding error
Below is a rendering of the page up to the first error.

This happens even when I try to directly access the image URL.

Are there some parameters I need to set when saving the SVG from Inkscape?

@lrq3000
Contributor Author

lrq3000 commented May 24, 2021

Found the issue: Jekyll is not configured to support accented characters. But since my computer uses a locale with accented characters, some metadata were automatically written in my locale, such as the datetime:

Creator: FreeHEP Graphics2D Driver Producer: org.freehep.graphicsio.svg.SVGGraphics2D Revision Source: Date: jeudi 20 mai 2021 ࠲3:53:26 heure d�鴩 d�Europe centrale

Removing the line fixes the issue, but I also found a tool that trims it and also does some size optimizations. It's open source, so it may be useful in the future:

https://jakearchibald.github.io/svgomg/
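For anyone hitting the same error, a quick way to locate the offending metadata is to scan the SVG for bytes outside the ASCII range. The following is a minimal sketch using only the Python standard library; `non_ascii_lines` is a hypothetical helper name, and it only reports lines, it does not fix them.

```python
from pathlib import Path


def non_ascii_lines(svg_path):
    """Return (line_number, text) for every line containing a non-ASCII byte."""
    hits = []
    raw = Path(svg_path).read_bytes()
    for number, line in enumerate(raw.splitlines(), start=1):
        if any(byte > 0x7F for byte in line):
            # Decode leniently so badly encoded lines can still be displayed.
            hits.append((number, line.decode("utf-8", errors="replace")))
    return hits


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        for number, text in non_ascii_lines(path):
            print(f"{path}:{number}: {text}")
```

Reading the file as bytes avoids failing on exactly the kind of mis-encoded line that triggers the error.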

@lrq3000
Contributor Author

lrq3000 commented May 24, 2021

I have updated the PR with the provided feedback.

After researching more and scratching my head, I decided to put Session in distributed networks. Indeed, both onion routing and blockchains are primarily considered as distributed networks.

However, I could not put Session in the same section as peer-to-peer networks, as onion routing is definitely not a peer-to-peer system. We could say that onion routing and blockchains are "indirect distributed networks", where the sender and recipient do not interact with each other directly, as opposed to peer-to-peer distributed networks, where in the end the sender and recipient communicate directly. Unfortunately, apart from peer-to-peer networks being defined as a subtype of distributed networks, no other type has been formally identified. So I resorted to using "Non peer-to-peer" for the subsection where Session is.

I also added figures and explanations for each network type. They were generated using Cytoscape, here are the source files:

ptio-network-schemas.zip

Each file got exported into a svg file, which was then edited in Inkscape to remove the white background and resize to the file's content with 15 px margin, and then with SVGOmg to clean up unnecessary data and reduce filesize.

Please let me know what you think about the changes.

PS: On Windows, Jekyll live reload doesn't work well, I had to instead use jekyll serve --watch

Signed-off-by: Stephen L. <[email protected]>
@dngray
Collaborator

dngray commented May 25, 2021

All SVG images should be either 128x128 or 384x128. If the image is going to warp, make it smaller and center it top/bottom/left/right on a canvas of that size.

They should be optimized, Inkscape does this:

optimize_svg

You may need python-lxml, and scour.
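A size check like the one below could catch wrongly sized images before review. It is a minimal sketch using the Python standard library, assuming sizes are given as plain numbers or `px` values in `width`/`height`, with `viewBox` as a fallback; other units (such as `mm`) are not handled, and `svg_size` and `has_allowed_size` are hypothetical names.

```python
import xml.etree.ElementTree as ET

ALLOWED_SIZES = {(128, 128), (384, 128)}  # the two canvas sizes requested above


def _to_number(value):
    """Parse '128' or '128px' into a float (other units are not supported)."""
    return float(value[:-2] if value.endswith("px") else value)


def svg_size(svg_text):
    """Return (width, height) from width/height attributes, else from viewBox, else None."""
    root = ET.fromstring(svg_text)
    width, height = root.get("width"), root.get("height")
    if width is not None and height is not None:
        return (_to_number(width), _to_number(height))
    view_box = root.get("viewBox")
    if view_box:
        _, _, vb_width, vb_height = (float(v) for v in view_box.replace(",", " ").split())
        return (vb_width, vb_height)
    return None


def has_allowed_size(svg_text):
    """True when the SVG canvas is one of the allowed sizes."""
    return svg_size(svg_text) in ALLOWED_SIZES
```

For example, `has_allowed_size(open("logo.svg").read())` would report whether a hypothetical `logo.svg` uses one of the two accepted canvases.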

@lrq3000
Contributor Author

lrq3000 commented May 29, 2021

Thank you very much @dngray for your help and sorry for the delay. The svg images are now updated according to your instructions.

@lrq3000
Contributor Author

lrq3000 commented Jun 1, 2021

I have decided to change the category "Non Peer-to-Peer" into "Anonymous Routing" and make it a separate section, instead of having both under the "Distributed network" section. I also rewrote the section to focus on anonymous routing, and rewrote the others to restore the old section headers (without "decentralized" or "distributed"), while mentioning the nature of each network with a link. Indeed, the illustration doesn't need to be explicitly labelled IMHO; it only needs to give an idea of how this kind of messenger works.

To explain why I made this change:

Please let me know what you guys think of the latest version :-)

@lrq3000 lrq3000 changed the title Add Session messenger + new sub-section for nodal messengers + fixes #2232 Overhauling Instant Messengers + add Session messenger Jun 2, 2021
@youdontneedtoknow22

not really relevant to the discussion but: don't forget to add the audit of session

@KeeJef

KeeJef commented Jul 9, 2021

Any progress on this, need any help / clarification from our side?

@youdontneedtoknow22

Shouldn't Briar be under the "Anonymous Routing" section?

And I'm not a programmer so I'm not sure about this, but isn't "The protocol was independently audited" wrong?
The clients for all platforms were audited, not "only" the protocol. I'm not sure if auditing the protocol should happen on the client side or server side or both tho.

…s + add audits for all messengers if one is available

Signed-off-by: Stephen L. <[email protected]>
@lrq3000
Contributor Author

lrq3000 commented Jul 9, 2021

@youdontneedtoknow22 Thank you very much for your feedback! Yes you're right, Briar should also be under the Anonymous Routing section, but IMHO it should also be in P2P too, as it allows both approaches (and by default, it is set in P2P mode, ie, it will also use local Wi-Fi if available, which can defeat the purpose of using Tor). I added a warning note on how to set it up for anonymous routing only mode.

You're also correct about the audit for Session: it was for the clients, not the protocol. I confused it with Matrix (I made several PRs in parallel at the time 😅).

@KeeJef Thank you for sticking around :-) I have a question for you: the audit mentions that (encrypted) attachment files are 1) not encrypted on iOS devices (behavior inherited from Signal if I understand correctly), 2) are uploaded and downloaded from a centralized server. Are these points applicable for all platforms or it's only for iOS? If the former, are there any plan to manage attached files in a more decentralized manner in the future (ie, stored on Snodes and deleted when served)?

@KeeJef

KeeJef commented Jul 16, 2021

@KeeJef Thank you for sticking around :-) I have a question for you: the audit mentions that (encrypted) attachment files are 1) not encrypted on iOS devices (behavior inherited from Signal if I understand correctly), 2) are uploaded and downloaded from a centralized server. Are these points applicable for all platforms or it's only for iOS? If the former, are there any plan to manage attached files in a more decentralized manner in the future (ie, stored on Snodes and deleted when served)?

  1. You can now have locally encrypted attachments on all platforms, including iOS (this works differently depending on the platform)

  2. Right now Session uses the Session file server as the default location for uploading encrypted file blobs. Session attachments are sent by inserting into a message a link to the encrypted file's location on the Session file server. This scheme is very general, and the code for running a Session file server is open source https://github.com/oxen-io/session-open-group-server .

This means that anyone can run their own Session file server. In the near future we plan to allow clients to set which file server they use, this will mean that even if our file server fails users could switch to community run options.

Regarding use of the Service Node network for file storage: the Service Node network is not really suitable for storing large files, since it focuses on redundancy, which would require replicating the files and would quickly use up space. The Service Node network is really tailored for high-reliability sending of smaller data packets.

@KeeJef

KeeJef commented Aug 3, 2021

Any further progress on this, need any additional information?

@lrq3000
Contributor Author

lrq3000 commented Aug 3, 2021 via email

@efb4f5ff-1298-471a-8973-3d47447115dc

@lrq3000 i think everything is on hold because of this

https://blog.privacytools.io/the-future-of-privacytools/

8 participants