<!doctype html>
<html lang="en">

<head>
  <meta name="google-site-verification" content="ftFOlJETX-2KNjaPh8W6s8lhigItRuu9fOmjHZZ0nY0" />
  <!-- Required meta tags -->
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">

  <!-- Bootstrap CSS -->
  <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.1.3/css/bootstrap.min.css"
    integrity="sha384-MCw98/SFnGE8fJT3GXwEOngsV7Zt27NXFoaoApmYm81iuXoPkFOJwJ8ERdknLPMO" crossorigin="anonymous">

  <title>Pre-Avatar</title>
</head>
<style type="text/css">
  table {
    width: 100%;
    table-layout: fixed;
  }

  audio {
    width: 100%;
  }

  thead>tr>th:first-child {
    width: 96px;
  }

  .bg-secondary {
    background-color: #11568b !important;
  }

  /* Breakpoints follow Bootstrap 4's md boundary so the two rules never overlap. */
  @media (max-width: 767.98px) {
    .big-screen {
      display: none;
    }

    .container {
      max-width: 90% !important;
    }
  }

  @media (min-width: 768px) {
    .small-screen {
      display: none;
    }

    .container {
      max-width: 90% !important;
    }
  }
</style>


<body>
  <header class="header">
    <div class="jumbotron bg-secondary text-center">
      <div class="container">
        <div class="row align-items-center">
          <div class="col-md-12">
 
            <h1><a class="text-light" style="color:red;">Pre-Avatar: An Automatic Presentation Generation Framework Leveraging Talking Avatar</a></h1><br>
            
          </div>
        </div>
      </div>
    </div>
  </header>
  <main>
    <div class="container">
      <div class="row" id="result">
        <div class="col-md-12">
          <h5>Abstract:</h5>
          Since the beginning of the COVID-19 pandemic, remote conferencing and education have become important tools for coping with the difficulties of face-to-face communication. Previous applications aim to lower the delivery cost by providing real-time, zero-commuting solutions. In contrast, our application lowers the production and reproduction costs of delivering communication from one place to another, including to a virtual place in the Metaverse. This paper proposes a system called Pre-Avatar, which generates a presentation video with the talking face of a target speaker from a frontal photo and a 3-minute recording of that speaker. The system consists of three main modules: a user experience interface (UEI), a talking face module, and a few-shot text-to-speech (TTS) module. The system clones the target speaker's voice and, given any text, generates the corresponding speech, which is then used to drive an avatar with appropriate lip and head movements. During inference, users only need to provide slides with notes through the UEI, and they can regenerate their video as often as needed.
          <br>

<p align="center">
<video width="640" height="480" controls>
    <source src="movie.mp4" type="video/mp4">
    Your browser does not support the video tag.
</video>
</p>
        </div>
      </div>
    </div>
  </main>
</body>
</html>