`about.html`: Include the presence of gVisor as additional security layer #41

EtiennePerot · 2024-07-28T02:04:40Z

This updates about.html to reflect the addition of gVisor as an extra layer of security in the Dangerzone document handling process.

The document sometimes conflated "container" and "sandbox", which is understandable because they were effectively the same thing before adding this extra sandboxing layer in the middle. Now it uses "container" only when talking about containers, otherwise it uses "sandbox". index.html was already using this language, so no update needed there.

Corp shill check: There is only one outgoing link to gvisor.dev which I can remove if you'd prefer. It is marked target="_blank" rel="noopener noreferrer" as are other external links on the page. This mentions the word "gVisor" fewer times than "Linux"; the word "gVisor" is used only when (a) first talking about the use of sandboxing in Dangerzone, and (b) when talking about specific gVisor components like the kernel/syscall filters. Other parts of the page use the unqualified word "sandbox" instead.

$ grep gVisor about.html | wc -l
5
$ grep Linux about.html | wc -l
8

This PR is built on @apyrgio's 2024-05-see-also branch on the assumption that it will be merged into main, and so that the diff shown on GitHub only shows the difference against that branch. Once #39 is merged, I intend to rebase and retarget this PR to main.

apyrgio · 2024-07-29T21:47:41Z

Thanks a lot for the contribution, Etienne 🤩. I haven't managed to look into it just yet, because I want to tie up some other loose ends before switching to it (and documenting gVisor's usage in Dangerzone in general). I'll comment on it as soon as possible though.

apyrgio · 2024-08-19T15:18:20Z

Minor heads up, I've converted the changes in this PR from HTML format to Markdown. I've done the same thing for the parent PR as well.

apyrgio · 2024-08-19T16:49:26Z

src/about.md

@@ -48,30 +48,30 @@ I got the idea for Dangerzone from Qubes, an operating system that runs everythi

 Dangerzone was inspired by TrustedPDF but it works in non-Qubes operating systems, which is important, because most of the journalists I know use Macs and probably won’t be jumping to Qubes for some time.

-It uses Linux containers to sandbox dangerous documents instead of virtual machines. And it also adds some features that TrustedPDF doesn’t have: it works with any office documents, not just PDFs; it uses optical character recognition (OCR) to make the safe PDF have a searchable text layer; and it compresses the final safe PDF.
+It uses [gVisor](https://gvisor.dev/) sandboxes running in Linux containers to sandbox dangerous documents. And it also adds some features that TrustedPDF doesn’t have: it works with any office documents, not just PDFs; it uses optical character recognition (OCR) to make the safe PDF have a searchable text layer; and it compresses the final safe PDF.


Suggested change

-It uses [gVisor](https://gvisor.dev/) sandboxes running in Linux containers to sandbox dangerous documents. And it also adds some features that TrustedPDF doesn’t have: it works with any office documents, not just PDFs; it uses optical character recognition (OCR) to make the safe PDF have a searchable text layer; and it compresses the final safe PDF.
+It uses [gVisor](https://gvisor.dev/) sandboxes running in Linux containers to open dangerous documents, instead of virtual machines. And it also adds some features that TrustedPDF doesn’t have: it works with any office documents, not just PDFs; it uses optical character recognition (OCR) to make the safe PDF have a searchable text layer; and it compresses the final safe PDF.

I wanted to avoid using the word "sandbox" twice here. I also brought back the comparison with virtual machines, since I think it makes sense in this paragraph, where we compare Dangerzone with TrustedPDF.

apyrgio · 2024-08-19T16:56:17Z

src/about.md


 How does Dangerzone work?
 -------------------------

-Dangerzone uses Linux containers (two of them), which are sort of like quick, lightweight virtual machines that share the Linux kernel with their host. The easiest way to get containers running on Mac and Windows is by using [Docker Desktop](https://www.docker.com/products/docker-desktop). So when you first install Dangerzone, if you don’t already have Docker Desktop installed, it helps you download and install it.
+Dangerzone uses Linux containers (two of them), and runs a sandbox inside each. The easiest way to get containers running on Mac and Windows is by using [Docker Desktop](https://www.docker.com/products/docker-desktop). So when you first install Dangerzone, if you don’t already have Docker Desktop installed, it helps you download and install it.


Nit: the gVisor sandbox runs only in the first container. The second container (soon to be removed) does not start a gVisor sandbox; it just recreates the PDF locally.

So, I wouldn't introduce gVisor just yet. If you want to make the container description more accurate and not mention virtual machines, we could describe them as follows:

Suggested change

-Dangerzone uses Linux containers (two of them), and runs a sandbox inside each. The easiest way to get containers running on Mac and Windows is by using [Docker Desktop](https://www.docker.com/products/docker-desktop). So when you first install Dangerzone, if you don’t already have Docker Desktop installed, it helps you download and install it.
+Dangerzone uses Linux containers (two of them), which are isolated application environments that share the Linux kernel with their host. The easiest way to get containers running on Mac and Windows is by using [Docker Desktop](https://www.docker.com/products/docker-desktop). So when you first install Dangerzone, if you don’t already have Docker Desktop installed, it helps you download and install it.

apyrgio · 2024-08-19T16:58:55Z

src/about.md


-When Dangerzone starts the container that will sanitize the suspicious document, it _disables networking_ and does not mount anything. So if a malicious document hacks the container, it doesn’t have access to your data and it can’t use the internet, so there’s not much it could do.
+When Dangerzone starts a container, it will first start a gVisor sandbox _inside_ that container, then runs the potentially-dangerous document processing workload inside the sandbox. This ensures that the process dealing with the document is isolated from the Linux kernel. The sandbox and its parent container are also both configured to _disable networking_ and to not mount anything from the host filesystem. So if a malicious document manages to execute arbitrary code, this code doesn’t have access to the host kernel, doesn't have access to your data, and can't use the internet, so there’s not much it could do.


I think this is the place where we can first mention the gVisor sandbox, given that we are talking here about the first phase of the conversion:

Suggested change

-When Dangerzone starts a container, it will first start a gVisor sandbox _inside_ that container, then runs the potentially-dangerous document processing workload inside the sandbox. This ensures that the process dealing with the document is isolated from the Linux kernel. The sandbox and its parent container are also both configured to _disable networking_ and to not mount anything from the host filesystem. So if a malicious document manages to execute arbitrary code, this code doesn’t have access to the host kernel, doesn't have access to your data, and can't use the internet, so there’s not much it could do.
+When Dangerzone starts the container that will sanitize the suspicious document, it will first start a gVisor sandbox _inside_ that container, then run the potentially-dangerous document processing workload inside the sandbox. This ensures that the process dealing with the document is isolated from the Linux kernel. The sandbox and its parent container are also both configured to _disable networking_ and to not mount anything from the host filesystem. So if a malicious document manages to execute arbitrary code, this code doesn’t have access to the host kernel, doesn't have access to your data, and can't use the internet, so there’s not much it could do.
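
To make the "disable networking and mount nothing" part of this paragraph concrete, here is a minimal sketch of how a host process might assemble the `docker run` invocation for the first container. This is an illustration only, not Dangerzone's actual code: the function name, image name, and in-container command are hypothetical, and the step that starts the gVisor sandbox inside the container is not shown. The flags themselves (`--rm`, `-i`, `--network=none`) are standard Docker CLI options.

```python
from typing import List


def build_run_command(image: str, command: List[str]) -> List[str]:
    """Build a `docker run` argv for the first (sanitizing) container.

    Hypothetical helper for illustration; names are not taken from Dangerzone.
    """
    return [
        "docker", "run",
        "--rm",            # remove the container once it exits
        "-i",              # keep stdin open so the document can be piped in
        "--network=none",  # no network access from inside the container
        # Deliberately no -v / --volume / --mount flags:
        # nothing from the host filesystem is mounted.
        image,
        *command,
    ]


# Image and command names below are placeholders, not real Dangerzone names.
argv = build_run_command("example/sanitizer-image", ["convert-to-pixels"])
print(" ".join(argv))
```

The point of the sketch is that both properties are visible in the argument list itself: networking is switched off explicitly, and the absence of any mount flag is what keeps host data out of reach.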

apyrgio · 2024-08-19T17:01:44Z

src/about.md


 * _Reads the original document from standard input_
 * Uses _LibreOffice_ or _PyMuPDF_ to convert original document to a PDF
 * Uses _PyMuPDF_ to split PDF into individual pages, and to convert those into RGB pixel data
 * _Writes the number of pages and the RGB pixel data to its standard output_

-Then that container quits. The host then writes the RGB pixel data to a volume. A second container starts and:
+Then that sandbox quits. The host then writes the RGB pixel data to a volume. A second sandbox starts and:


Keeping in line with the container/sandbox distinction, and because we don't use gVisor in the second conversion phase, I guess we have to refer to "a second container":

Suggested change

-Then that sandbox quits. The host then writes the RGB pixel data to a volume. A second sandbox starts and:
+Then that sandbox quits. The host then writes the RGB pixel data to a volume. A second container starts and:

We could also explain a bit here that we don't really need containers in the second phase for their security properties; we just want them for code portability. In TrustedPDF, for instance, the second phase happens on the host.
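
For readers following the first-phase steps quoted in the hunk above (read the document from standard input, write the number of pages and the RGB pixel data to standard output), here is a rough sketch of what such a pixel stream could look like. The framing is an assumption made for illustration: a 2-byte big-endian page count, then a 2-byte width and 2-byte height per page followed by exactly `width * height * 3` bytes of RGB data. The function names are hypothetical and this is not presented as Dangerzone's actual wire format.

```python
import io
import struct
from typing import BinaryIO, List, Tuple

Page = Tuple[int, int, bytes]  # (width, height, raw RGB bytes)


def write_pages(out: BinaryIO, pages: List[Page]) -> None:
    """Sandbox side: emit the page count, then each page's size and pixels."""
    out.write(struct.pack(">H", len(pages)))  # assumed 2-byte page count
    for width, height, rgb in pages:
        if len(rgb) != width * height * 3:
            raise ValueError("RGB buffer does not match page dimensions")
        out.write(struct.pack(">HH", width, height))
        out.write(rgb)


def read_exact(src: BinaryIO, n: int) -> bytes:
    """Read exactly n bytes or fail; never trust the sender to be complete."""
    data = src.read(n)
    if len(data) != n:
        raise ValueError("stream ended early")
    return data


def read_pages(src: BinaryIO) -> List[Page]:
    """Host side: parse only fixed-size integers and raw pixel buffers."""
    (count,) = struct.unpack(">H", read_exact(src, 2))
    pages: List[Page] = []
    for _ in range(count):
        width, height = struct.unpack(">HH", read_exact(src, 4))
        pages.append((width, height, read_exact(src, width * height * 3)))
    return pages


# Round-trip a single 2x1 page (one red pixel, one blue pixel) in memory.
buf = io.BytesIO()
write_pages(buf, [(2, 1, bytes([255, 0, 0, 0, 0, 255]))])
buf.seek(0)
pages = read_pages(buf)
```

Whatever the real framing is, the design idea this illustrates is that the host only ever has to read a few integers and a byte buffer of a known length from the untrusted side, rather than parse a complex document format.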

apyrgio · 2024-08-19T17:03:29Z

src/about.md


 * _Mounts a volume with the RGB pixel data_
 * If OCR is enabled, uses _PyMuPDF_ to convert RGB pixel data into a compressed, **searchable** PDF
 * Otherwise uses _PyMuPDF_ to convert RGB pixel data into a compressed, **flat** PDF
 * _Stores safe PDF in separate volume_

-Then that container quits, and the user can open the newly created safe PDF.
+Then that sandbox quits, and the user can open the newly created safe PDF.


Suggested change

-Then that sandbox quits, and the user can open the newly created safe PDF.
+Then that container quits, and the user can open the newly created safe PDF.

apyrgio · 2024-08-19T17:08:22Z

src/about.md

@@ -97,12 +97,13 @@ It’s still possible to get hacked with Dangerzone
 Like all software, it’s possible that Dangerzone (and more importantly, the software that it relies on like LibreOffice and Docker) has security bugs. Malicious documents are designed to target a specific piece of software – for example, Adobe Reader on Mac. It’s possible that someone could craft a malicious document that specifically targets Dangerzone itself. An attacker would need to chain these exploits together to succeed at hacking Dangerzone:

 * An exploit for either LibreOffice or PyMuPDF
-* A container escape exploit in the Linux kernel
+* A sandbox escape exploit in the gVisor kernel


Personally, I would be fine with linking to gVisor's security model, since it's important reading material for those who want to understand the security guarantees of gVisor.

apyrgio · 2024-08-19T17:22:28Z

src/about.md

@@ -97,12 +97,13 @@ It’s still possible to get hacked with Dangerzone
 Like all software, it’s possible that Dangerzone (and more importantly, the software that it relies on like LibreOffice and Docker) has security bugs. Malicious documents are designed to target a specific piece of software – for example, Adobe Reader on Mac. It’s possible that someone could craft a malicious document that specifically targets Dangerzone itself. An attacker would need to chain these exploits together to succeed at hacking Dangerzone:

 * An exploit for either LibreOffice or PyMuPDF
-* A container escape exploit in the Linux kernel
+* A sandbox escape exploit in the gVisor kernel
+* A container escape exploit in the Linux kernel that isn't protected by gVisor's syscall filters
 * In Mac and Windows, a VM escape exploit for Docker Desktop


I'm itching to remove this line. Once the attacker has access to the VM, they have access to the files of the host (sure, subject to some ACL rules depending on your Docker Desktop installation), and access to the internet. By our own standards, at this point the attacker is not "contained".

apyrgio · 2024-08-19T17:27:11Z

src/about.md

 * In Mac and Windows, a VM escape exploit for Docker Desktop

-If you opened such a malicious document with Dangerzone, it would start the first container and begin the conversion process. While it was converting the original document (say, a docx file) into a PDF using LibreOffice, it would exploit a vulnerability in LibreOffice to hack the container. Then, it would exploit a vulnerability in the Linux kernel to escape the container, and from there attempt to take over the computer.
+If you opened such a malicious document with Dangerzone, it would start the first sandbox and begin the conversion process. While it was converting the original document (say, a docx file) into a PDF using LibreOffice, it would exploit a vulnerability in LibreOffice to achieve code execution. Then, it would exploit a vulnerability in the gVisor kernel to escape the sandbox, then it would exploit a vulnerability in the Linux kernel to escape the container, and from there attempt to take over the computer.


Suggested change

-If you opened such a malicious document with Dangerzone, it would start the first sandbox and begin the conversion process. While it was converting the original document (say, a docx file) into a PDF using LibreOffice, it would exploit a vulnerability in LibreOffice to achieve code execution. Then, it would exploit a vulnerability in the gVisor kernel to escape the sandbox, then it would exploit a vulnerability in the Linux kernel to escape the container, and from there attempt to take over the computer.
+For example, let's say that you open a malicious `.docx` file that specifically targets Dangerzone. What Dangerzone would do first is to start a Linux container, then start a gVisor sandbox within it, and finally begin the conversion process into a PDF using LibreOffice. If the malicious document wants to escape to the host, it first needs to exploit a vulnerability in LibreOffice to achieve code execution. Once it has control of LibreOffice, it needs to exploit a vulnerability in the gVisor kernel to escape the sandbox. Assuming it finds one, it then needs to find a different vulnerability in the Linux kernel to escape the container, and from there attempt to take over the computer.

The chain of "Then" made the text a bit difficult to read, so I propose adding some fluff here and there.
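
As a side note for readers of this thread, the exploit chain in the suggestion can be modelled as an ordered list of layers where the attacker must have a working exploit for every layer, in order. The sketch below is a toy model only: the layer labels paraphrase the bullet list under review, the function names are hypothetical, and the Mac/Windows VM layer is left out because its value as a containment boundary is questioned in the comment above.

```python
from typing import List, Set

# Ordered from the innermost layer (closest to the document) outwards.
LAYERS: List[str] = [
    "LibreOffice or PyMuPDF",
    "gVisor kernel (sandbox escape)",
    "Linux kernel (container escape)",
]


def layers_breached(exploits: Set[str], layers: List[str] = LAYERS) -> int:
    """Count how many consecutive layers fall, starting from the innermost."""
    breached = 0
    for layer in layers:
        if layer not in exploits:
            break  # the chain stops at the first layer without an exploit
        breached += 1
    return breached


def reaches_host(exploits: Set[str]) -> bool:
    """The host is only reached if every layer in the chain is breached."""
    return layers_breached(exploits) == len(LAYERS)


# A LibreOffice exploit alone gets code execution inside the sandbox only.
print(layers_breached({"LibreOffice or PyMuPDF"}))
# A Linux kernel exploit alone is useless without the earlier steps.
print(layers_breached({"Linux kernel (container escape)"}))
print(reaches_host(set(LAYERS)))
```

The model makes the ordering explicit: an exploit for an outer layer does nothing unless every inner layer has already been breached.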

eloquence · 2024-08-20T17:45:53Z

I did not mean to close this, sorry - looks like the merge of #39 automatically did so because this targets a branch that no longer exists. @EtiennePerot, could you re-open targeted to main?

EtiennePerot · 2024-08-24T00:56:05Z

> @EtiennePerot, could you re-open targeted to main?

Done in #46.

apyrgio force-pushed the 2024-05-see-also branch from baad530 to 7939fb8 on August 19, 2024 14:19

about.html: Include the presence of gVisor as additional security layer

baa7b63

apyrgio force-pushed the moar-sandboxing branch from 1909771 to baa7b63 on August 19, 2024 15:16

apyrgio reviewed Aug 19, 2024


apyrgio mentioned this pull request Aug 20, 2024

Update "How it works" section and add some articles about Dangerzone #39

Merged

eloquence deleted the branch freedomofpress:2024-05-see-also August 20, 2024 17:44

eloquence closed this Aug 20, 2024

EtiennePerot mentioned this pull request Aug 24, 2024

about.md: Include the presence of gVisor as additional security layer. #46

Merged
