index.html

<!DOCTYPE html>

<html>
   <head>
      <title>GSoC-CRIU</title>
     
      <script src='https://kit.fontawesome.com/a076d05399.js'></script>
      <style>
          h1 {margin-left: 25%;}
          h3 {margin-left: 25%;}
          p {margin-left: 25%; width: 48%;}
          i {margin-left: 25%;}
          ol {margin-left: 25%;}
          li {margin-left: 10%;}
          div {
                 margin-left: 25%;
                 width: 850px;
                 padding: 10px;
                 border: 5px solid gray;
           
          }
          table, th, td {
  		border: 1px solid black;
  		border-collapse: collapse;
	  }
          

      </style>
   </head>

   <body>
      <br />
      <h1> Google Summer of Code 2019</h1>
      <br />
      <i style='font-size:16px' class='fas'>&#xf073; 21 August 2019
		&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &#xf017; 10 minutes read</i>
      <p> </p>
      <p> </p>
      <div>
          <p style="margin-left:5%"> <b>Name:</b> Abhishek Dubey</p>
          <p style="margin-left:5%; width: 58%"> <b>Project Name:</b> Optimizing the Pre-dump Algorithm</p>
          <p style="margin-left:5%"> <b>Organization:</b> CRIU</p>
          <p style="margin-left:5%"> <b>Mentors:</b></p>
          <ul> 
              <li> <a href="xemul@virtuozzo.com">Pavel Emelianov (ScyllaDB)</a> </li>
              <li> <a href="avagin@gmail.com">Andrei Vagin (Google)</a> </li>
              <li> <a href="rppt@linux.ibm.com">Mike Rapoport (IBM)</a> </li>
              <li> <a href="rstoyanov1@gmail.com ">Radostin Stoyanov (RedHat)</a> </li>
          </ul>
          <p style="margin-left:5%"> <b>Commit list (working repo): </b><a href="https://github.com/dubeyabhishek/criu/commits/criu-dev">Github link</a></p>
          <p style="margin-left:5%; width: 58%"> <b>Official Documentation: </b><a href="https://criu.org/Optimizing_pre_dump_algorithm">Wiki page of product</a></p>
      </div>

      <h3> Jump start</h3>
           <p style="background-color: #D3D3D3;">  git clone git@github.com:dubeyabhishek/criu.git<br>
                <br/>
                cd criu/<br>
                <br/>
                make -j8<br>
                <br/>
                &#35;run complete zdtm test-suite<br>
                sudo test/zdtm.py run -a --pre=&lt;pre_dump_count&gt; --ignore-taint --keep-going <b>--pre-dump-mode=read</b><br>
                <br/>
                &#35;run specific zdtm test<br>
                sudo python test/zdtm.py run --pre=&lt;pre_dump_count&gt; -t zdtm/&lt;test_case_dir&gt;/&lt;test_case_name&gt;<b> --pre-dump-mode=read</b>
           </p> 
      <h3> The Project</h3>
      <p>
         I was involved in GSoC’19 with <a href="https://www.criu.org/Main_Page">CRIU</a> organization working on the project - 
         <a href="https://www.criu.org/Memory_pre_dump">Optimizing Pre-dump algorithm</a>.
         The idea behind this project was to reduce memory pressure and frozen time of target process while pre-dumping.
      </p>

      <h3> The Idea</h3>
      <p>
         In current implementation of <a href="https://www.criu.org/Incremental_dumps">pre-dump</a> in CRIU, first the memory content(pages) of target process 
         are stored in pipes. Later the pipes are flushed to
         disk images or page-server. Pipes are backed by pipe-buffer pages to store data. The pipe-buffer pages are pinned to memory making them non swappable. 
         When the count of pipe-buffer pages is high, the system is left with limited swappable pages to serve new memory requests. This incurs memory pressure 
         on overall system and may thrash hampering the performance. So, handling memory pressure is first issue.
         <br /><br />
         Another issue is duration for which current pre-dump algorithm keeps the target process <a href="https://www.criu.org/Freezing_the_tree">frozen</a>. 
         The parasite(blob injected in target process) in 
         current implementation drains the memory pages of target process into pipes and the target process remains frozen unless all pages are not drained. The 
         longer the frozen time, longer the duration for which pipe pages will be pinned to memory leading to memory pressure situation. Another use case which 
         desires shortest frozen time is live migration. So, another objective is to reduce frozen time of target process while pre-dumping.
         <br /><br />
         The optimized implementation must solve above mentioned two issues. In new pre-dump approach, the target process is frozen only till memory mappings are 
         collected. Then the process is unfrozen and it continues normally. Now start draining of process memory. We use 
         <a href="http://man7.org/linux/man-pages/man2/process_vm_readv.2.html"><em>process_vm_readv</em></a> 
         syscall to drain pages to user-space buffer. The syscall is given memory mappings(collected earlier) as input to perform this task. Since draining of 
         memory pages and process execution happen simultaneously, there is a possibility that the running process might have modified some memory mappings after 
         they have been collected by pre-dump. 
         In such case <em>process_vm_readv</em> will encounter the old mappings and fails. This gives rise to a race between pre-dumping and process execution. This race 
         needs to be handled on the fly for <em>process_vm_readv</em> to successfully drain complete memory.

      </p>
      
      <h3> Work done</h3>
          <table style="width:40%; margin-left:27%">
  			<tr align="center">
    				<th>Patch #</th>
    				<th>Title</th> 
    				<th>Status</th>
  			</tr>
  			<tr align="center">
    				<td>[PATCH 0/7]</td>
    				<td align="left"> <a href="https://lists.openvz.org/pipermail/criu/2019-August/044407.html"> GSoC 19: Optimizing the Pre-dump Algorithm </a></td>
    				<td>Submitted</td>
  			</tr>
  			<tr align="center">
    				<td>[PATCH 1/7]</td>
    				<td align="left"><a href="https://lists.openvz.org/pipermail/criu/2019-August/044408.html"> Adding --pre-dump-mode option </a></td>
    				<td>Submitted</td>
  			</tr>
                        <tr align="center">
    				<td>[PATCH 2/7]</td>
    				<td align="left"> <a href="https://lists.openvz.org/pipermail/criu/2019-August/044409.html"> Skip generating iov for non-PROT_READ memory </a></td>
    				<td>Submitted</td>
  			</tr>
  			<tr align="center">
    				<td>[PATCH 3/7]</td>
    				<td align="left"><a href="https://lists.openvz.org/pipermail/criu/2019-August/044410.html"> Skip adding PROT_READ to non-PROT_READ mappings </a></td>
    				<td>Submitted</td>
  			</tr>
                        <tr align="center">
    				<td>[PATCH 4/7]</td>
    				<td align="left"> <a href="https://lists.openvz.org/pipermail/criu/2019-August/044411.html"> Adding cnt_sub for stats manipulation </a></td>
    				<td>Submitted</td>
  			</tr>
  			<tr align="center">
    				<td>[PATCH 5/7]</td>
    				<td align="left"><a href="https://lists.openvz.org/pipermail/criu/2019-August/044412.html"> Handle vmsplice failure for read mode pre-dump </a></td>
    				<td>Submitted</td>
  			</tr>
                        <tr align="center">
    				<td>[PATCH 6/7]</td>
    				<td align="left"> <a href="https://lists.openvz.org/pipermail/criu/2019-August/044413.html"> The read mode pre-dump implementation </a></td>
    				<td>Submitted</td>
  			</tr>
  			<tr align="center">
    				<td>[PATCH 7/7]</td>
    				<td align="left"><a href="https://lists.openvz.org/pipermail/criu/2019-August/044414.html"> Refactor time accounting macros</a></td>
    				<td>Submitted</td>
  			</tr>
  			<tr align="center">
    				<td>[PATCH 8/7]</td>
    				<td align="left"><a href="https://lists.openvz.org/pipermail/criu/2019-August/044422.html"> Added --pre-dump-mode to libcriu</a></td>
    				<td>Submitted</td>
  			</tr>
		</table>      
      <br/>

      <h3> Interesting Issue </h3>
      <p>
          Handling the processing of iovs corresponding to modified memory mappings is most interesting issue of this project. Detailed discussion can be found 
          <a href="https://criu.org/Optimizing_pre_dump_algorithm#Design_issues">here</a>.
      </p>

      <h3> Evaluation </h3>
          <ol>
              <li style="margin-left:0%"> Performance </li>
	      <p style="margin-left:0%">test-config: 1GB</p>
              <p style="margin-left:0%"># python test/zdtm.py run --pre &lt;count> -t zdtm/static/maps04 --pre-dump-mode=&lt;read/splice> </p>
              <br/>
	           <table style="width:30%">
  			<tr align="center">
				<th>Pre-dump #</th>
				<th>splice (original)</th>
				<th>read (optimized)</th>
  			</tr>
  			<tr align="center">
    				<td>1</td>
				<td>0.59*</td>
				<td>0.66*</td>
  			</tr>
  			<tr align="center">
    				<td>2</td>
    				<td>0.06</td>
    				<td>0.06</td>
  			</tr>
  			<tr align="center">
    				<td>3</td>
    				<td>0.07</td>
    				<td>0.06</td>
  			</tr>
			<tr align="center">
    				<td>4</td>
    				<td>0.06</td>
    				<td>0.06</td>
  			</tr>
			<tr align="center">
    				<td>5</td>
    				<td>0.06</td>
    				<td>0.06</td>
  			</tr>
			<tr align="center" bgcolor="#D3D3D3">
    				<td>Total</td>
    				<td>0.84</td>
    				<td>0.90</td>
  			</tr>
		</table>
              <br/>

              <p style="margin-left:0%">Average performance drop : ~7%</p>
	      <p style="margin-left:0%">* First pre-dump in sequence have ~13% performance drop.</p>
              <br/>

              <li style="margin-left:0%"> Frozen Time</li>
              <p style="margin-left:0%">test-config: 1 GB</p>
                   <br/>
	           <table style="width:30%">
  			<tr align="center">
    				<th>Pre-dump #</th>
    				<th>splice (original)</th> 
    				<th>read (optimized)</th>
  			</tr>
  			<tr align="center">
    				<td>1</td>
				<td>125.13*</td>
				<td>82.93*</td>
  			</tr>
  			<tr align="center">
    				<td>2</td>
				<td>76.28</td>
				<td>65.94</td>
  			</tr>
  			<tr align="center">
    				<td>3</td>
				<td>78.05</td>
				<td>69.52</td>
  			</tr>
			<tr align="center">
    				<td>4</td>
				<td>77.80</td>
				<td>69.58</td>
  			</tr>
			<tr align="center">
    				<td>5</td>
				<td>74.61</td>
				<td>65.31</td>
  			</tr>
			<tr align="center" bgcolor="#D3D3D3">
    				<td>Total</td>
				<td>431.87</td>
				<td>353.28</td>
  			</tr>
		</table>
              <br/>

              <p style="margin-left:0%">Average frozen time reduced : ~18%</p>
	      <p style="margin-left:0%">* First pre-dump in sequence is ~35% faster.</p>
              <br/>

          <li style="margin-left:0%"> Memory pressure </li>
          <p style="margin-left:0%"> The new implementation reduces memory pressure on system, as pipes don't hog memory for longer duration.</p>
          </ol>

      <h3> Future work</h3>
      <ul style="margin-left:19%; width: 50%">
          <li> Possible optimization of page-by-page processing when huge memory region is unmapped.</li>
          <li> Develop new syscall by utilizing best of splice syscall and process_vm_readv syscall to avoid intermediate user-buffer.
               The new system call will directly splice memory pages to image files from the target process.</li>
      </ul>

      <h3> Acknowledgements </h3>
        <p> I am thankful to my mentors Pavel, Andrei and Mike for their guidance throughout the project. It was a great learning experience while working with them.
            I got chance to deep dive into memory draining part of CRIU and it was fun. Special thanks to Radostin for his prompt help and feedback. I would love to
            keep contributing to this wonderful community, called CRIU.
        </p>

      <h3> Conclusion </h3>
        <p> Overall, this year’s GSoC was a end-to-end learning experience for me and now I know "Open source" better than ever.</p>
   </body>
</html>