This document describes how to install and configure MegaWise Docker.

Hardware requirements:

Component | Configuration |
---|---|
GPU | NVIDIA Pascal or higher |
CPU | Intel CPU Sandy Bridge or higher |
RAM | 16 GB or higher |
Hard disk | 1 TB or higher |

Software requirements:

Component | Version |
---|---|
Operating system | Ubuntu 16.04 or higher |
NVIDIA driver | 410 or higher. The latest version is recommended. |
Docker | 19.03 or higher |
NVIDIA Container Toolkit | 1.0.5-1 or higher |
-
Disable the Nouveau driver.
You must disable the Nouveau driver before installing the NVIDIA driver. Use the following command to check whether the Nouveau driver is enabled.
$ lsmod | grep nouveau
If the command returns any information about the Nouveau driver, you need to complete the following steps to disable the Nouveau driver:
-
Create the file /etc/modprobe.d/blacklist-nouveau.conf and add the following content:
blacklist nouveau
options nouveau modeset=0
-
Run the following commands to update the initramfs and reboot:
$ sudo update-initramfs -u
$ sudo reboot
-
Confirm the Nouveau driver is disabled. The terminal does not return any information if the Nouveau driver is disabled.
$ lsmod | grep nouveau
lsmod is provided by the kmod package. If lsmod is not installed, install kmod before running the previous command.
$ sudo apt-get install kmod
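The Nouveau check above can also be scripted. The following is a minimal sketch, not part of MegaWise: `nouveau_loaded` is a hypothetical helper that reads `lsmod` output on standard input and succeeds only if a module named exactly `nouveau` is listed.

```shell
# Hypothetical helper (not shipped with MegaWise).
# Reads `lsmod` output on stdin; exits 0 if a module named "nouveau" is loaded,
# and 1 otherwise.
nouveau_loaded() {
    awk '$1 == "nouveau" { found = 1 } END { exit !found }'
}

# Usage on a real host:
#   if lsmod | nouveau_loaded; then
#       echo "Nouveau is still loaded; disable it before installing the NVIDIA driver."
#   fi
```

Unlike `lsmod | grep nouveau`, this sketch only matches the first column, so it ignores modules that merely list `nouveau` as a user.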
-
-
Download the latest NVIDIA driver installation file from the NVIDIA driver download page.
Note: Installing or updating NVIDIA drivers carries some risk and may cause the operating system to crash. Before you start, visit the NVIDIA driver download page to check whether your graphics card is compatible with the latest NVIDIA driver.
-
You must shut down the GUI before installing the NVIDIA driver. Press Ctrl+Alt+F1 to enter the CLI and run the following command to shut down the GUI:
$ sudo service lightdm stop
-
If you already have an NVIDIA driver installed, remove the installed driver before installing a new one.
$ sudo apt-get remove 'nvidia-*'
-
Give execute permission to the installation file and install the driver software. The following example assumes the installation file is downloaded to the /home directory.
$ sudo chmod a+x NVIDIA-Linux-x86_64-430.50.run
$ sudo ./NVIDIA-Linux-x86_64-430.50.run
-
Restart the operating system.
$ sudo reboot
-
Check whether the installation is successful.
$ sudo nvidia-smi
If the installation is successful, the terminal will return driver information which is similar to the following example:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.34       Driver Version: 430.34       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1660    Off  | 00000000:01:00.0  On |                  N/A |
| 28%   49C    P0    24W / 130W |   2731MiB /  5941MiB |      1%      Default |
+-------------------------------+----------------------+----------------------+
-
Update the package lists.
$ sudo apt-get update
-
Use curl to add the Docker GPG key, and then add the Docker repository to the apt sources.
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
$ sudo add-apt-repository \
    "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
    $(lsb_release -cs) \
    stable"
If curl is not installed, you need to install curl before running the previous command.
$ sudo apt-get install curl
-
Update the package lists again so that apt-get picks up the newly added Docker repository.
$ sudo apt-get update
-
Install Docker with the corresponding command-line interface and runtime environment.
$ sudo apt-get install docker-ce docker-ce-cli containerd.io
-
Run the following command to check whether Docker is successfully installed. If the terminal returns version information about Docker, you can assume that Docker is successfully installed.
$ sudo docker -v
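Because the prerequisites call for Docker 19.03 or higher, the version check can be automated. The sketch below is illustrative only: `docker_version_from` and `version_ge` are hypothetical helper names, the parsing assumes the usual `Docker version X.Y.Z, build ...` output format, and the comparison relies on GNU `sort -V`, which is available on Ubuntu.

```shell
# Hypothetical helpers (not part of Docker or MegaWise).

# docker_version_from: reads `docker -v` output on stdin and prints the version,
# assuming the format "Docker version 19.03.5, build 633a0ea838".
docker_version_from() {
    sed -n 's/^Docker version \([0-9][0-9.]*\).*/\1/p'
}

# version_ge A B: succeeds if version A is greater than or equal to version B.
# The smaller of the two versions sorts first under `sort -V`.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Usage on a real host:
#   installed=$(sudo docker -v | docker_version_from)
#   version_ge "$installed" 19.03 || echo "Docker $installed is older than 19.03"
```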
-
Use curl to add the GPG key of the NVIDIA Docker repository, and store the name of your distribution in a variable.
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
    sudo apt-key add -
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
-
Add the NVIDIA Docker package list for your distribution to the apt sources.
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
    sudo tee /etc/apt/sources.list.d/nvidia-docker.list
-
Install NVIDIA runtime.
$ sudo apt-get update
$ sudo apt-get install -y nvidia-container-toolkit
-
Restart Docker daemon.
$ sudo systemctl restart docker
-
Validate whether NVIDIA container toolkit is successfully installed.
$ sudo docker run --gpus all nvidia/cuda:9.0-base nvidia-smi
If the terminal returns version information about the GPU, you can assume that the NVIDIA container toolkit is successfully installed.
-
Download install_megawise.sh and data_import.sh to the same directory and make sure that you have execute permission.
$ wget https://raw.githubusercontent.com/Infini-Analytics/infini/master/script/data_import.sh \
    https://raw.githubusercontent.com/Infini-Analytics/infini/master/script/install_megawise.sh
$ chmod a+x *.sh
-
Install MegaWise and import sample data.
$ ./install_megawise.sh [parameter 1,required] [parameter 2,optional]
parameter 1: Absolute path of the installation folder of MegaWise. You must make sure that this folder does not exist.
parameter 2: MegaWise Docker image id. The default value is '0.4.2'.
Example:
$ ./install_megawise.sh /home/$USER/megawise '0.4.2'
The previous command performs the following operations:
- Pull MegaWise Docker image.
- Download config files and sample data.
- Launch MegaWise.
- Import sample data to MegaWise.
- Modify parameters to restart MegaWise.
If the terminal displays "Successfully installed MegaWise and imported test data", you can assume that MegaWise is successfully installed and sample data is imported.
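Since the first parameter must be an absolute path to a folder that does not exist yet, a small pre-check can catch mistakes before the script starts pulling images. This is a minimal sketch; `check_install_dir` is a hypothetical helper and is not part of install_megawise.sh.

```shell
# Hypothetical helper (not part of install_megawise.sh).
# check_install_dir PATH: succeeds only if PATH is absolute and does not exist.
check_install_dir() {
    case "$1" in
        /*) ;;  # absolute path, continue
        *)  echo "error: '$1' is not an absolute path" >&2
            return 1 ;;
    esac
    if [ -e "$1" ]; then
        echo "error: '$1' already exists" >&2
        return 1
    fi
}

# Usage:
#   check_install_dir "/home/$USER/megawise" && \
#       ./install_megawise.sh "/home/$USER/megawise" '0.4.2'
```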
-
Check the latest version number of the MegaWise image on Docker Hub.
-
Get the latest docker image of MegaWise.
$ sudo docker pull zilliz/megawise:$LATEST_VERSION
-
Install PostgreSQL client.
$ sudo apt-get install curl ca-certificates
$ curl https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
$ sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
$ sudo apt-get update
$ sudo apt-get install postgresql-client-11
The PostgreSQL client is installed to /usr/lib/postgresql/11/bin/ by default. After installation, run which psql. If the terminal does not return the correct location of the PostgreSQL client, add the installation path to the PATH environment variable.
$ export PATH=/usr/lib/postgresql/11/bin:$PATH
-
Go to your working folder ($WORK_DIR) and create a conf folder in it.
$ cd $WORK_DIR
$ mkdir conf
-
Get MegaWise config files.
$ cd $WORK_DIR/conf
$ wget https://raw.githubusercontent.com/Infini-Analytics/infini/master/config/db/chewie_main.yaml \
    https://raw.githubusercontent.com/Infini-Analytics/infini/master/config/db/etcd.yaml \
    https://raw.githubusercontent.com/Infini-Analytics/infini/master/config/db/megawise_config_template.yaml \
    https://raw.githubusercontent.com/Infini-Analytics/infini/master/config/db/render_engine.yaml
-
Modify config files based on the hardware environment of MegaWise.
-
Open chewie_main.yaml in the conf directory.
-
Navigate to the following code:
cache:  # size in GB
  cpu:
    physical_memory: 16
    partition_memory: 16
  gpu:
    gpu_num: 2
    physical_memory: 2
    partition_memory: 2
Configure the parameters based on the hardware environment of the server. All values are in GB.

For the cpu part, physical_memory and partition_memory respectively represent the memory size available to MegaWise and the memory size of the data cache partition. It is recommended that you set both partition_memory and physical_memory to more than 70 percent of the server memory.

For the gpu part, gpu_num represents the number of GPUs used by MegaWise. physical_memory and partition_memory respectively represent the video memory size available to MegaWise and the video memory size of the data cache partition. It is recommended that you reserve 2 GB of video memory to store intermediate results during computation by setting partition_memory and physical_memory to the video memory of a single GPU minus 2.
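As a worked example of these recommendations, the sketch below prints a cache fragment for a given server. It is illustrative only: `ceil_percent` and `print_cache_config` are hypothetical helpers, the 70 percent figure is treated as a starting point rounded up to a whole GB, and the YAML layout is assumed to mirror the fragment shown above.

```shell
# Hypothetical helpers (not part of MegaWise).

# ceil_percent VALUE PERCENT: prints VALUE * PERCENT / 100, rounded up.
ceil_percent() {
    echo $(( ($1 * $2 + 99) / 100 ))
}

# print_cache_config SERVER_GB GPU_NUM VRAM_GB
#   SERVER_GB: total server memory in GB
#   GPU_NUM:   number of GPUs used by MegaWise
#   VRAM_GB:   video memory of a single GPU in GB
print_cache_config() {
    cpu_gb=$(ceil_percent "$1" 70)   # 70 percent of the server memory
    gpu_gb=$(( $3 - 2 ))             # reserve 2 GB for intermediate results
    cat <<EOF
cache:  # size in GB
  cpu:
    physical_memory: $cpu_gb
    partition_memory: $cpu_gb
  gpu:
    gpu_num: $2
    physical_memory: $gpu_gb
    partition_memory: $gpu_gb
EOF
}

# Example: a server with 64 GB of memory and two 8 GB GPUs.
print_cache_config 64 2 8
```

For this example, the CPU values come out to 45 and the GPU values to 6. Review the result against your actual workload before copying it into chewie_main.yaml.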
-
-
Open megawise_config_template.yaml in the conf directory.
-
Navigate to the following code and set parameter values:
worker_config:
  bitcode_lib: @bitcode_lib@
  precompile: true
  stage:
    build_task_context_parallelism: 1
    fetch_meta_parallelism: 1
    compile_parallelism: 1
    fetch_data_parallelism: 1
    compute_parallelism: 1
    output_parallelism: 1
  worker_num : 2
  gpu:
    physical_memory: 2    # unit: GB
    partition_memory: 2   # unit: GB
  cuda_profile_query_cnt: -1  #-1 means don't profile, positive integer means the number of queries to profile, other value invalid
Set the values of the following parameters according to this table:

Parameter | Value |
---|---|
worker_num | The value of gpu_num in chewie_main.yaml |
physical_memory | The value of physical_memory in chewie_main.yaml |
partition_memory | The value of partition_memory in chewie_main.yaml |
-
Navigate to the following code and set parameter values:
string_config:
  dict_config:
    cache_size: 21474836480  # 20G
    split_threshold: 1000000
    split_each: 100000
    small_scale_num: 4000  # try not to use the temporary id
  hash_config:
    cache_size: 21474836480  # 20G
    bucket_num: 1999993  # prime number is a good choice
    bucket_size: 500  # make sure that each string is shorter than bucket_size-5
    file_size: 104857600  # 100M
cache_size in dict_config represents the memory size for encoding string dictionaries, in bytes. cache_size in hash_config represents the memory size for encoding string hashes, in bytes.
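Because these values are written in bytes, it helps to compute them rather than type them by hand. The sketch below uses two hypothetical helpers, `gib_to_bytes` and `mib_to_bytes`, and assumes that "20G" and "100M" in the comments above mean binary units (1 G = 1024^3 bytes), which matches the numbers in the file.

```shell
# Hypothetical helpers (not part of MegaWise).

# gib_to_bytes N: prints N * 1024^3.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# mib_to_bytes N: prints N * 1024^2.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

gib_to_bytes 20    # 21474836480, the cache_size used above
mib_to_bytes 100   # 104857600, the file_size used above
```

The same conversion applies to any other setting in this guide that expects a size in bytes.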
-
-
-
Run MegaWise.
$ sudo docker run --gpus all --shm-size 17179869184 \
    -v $WORK_DIR/conf:/megawise/conf \
    -v $WORK_DIR/data:/megawise/data \
    -v $WORK_DIR/server_data:/megawise/server_data \
    -v /tmp:/tmp \
    -v /home/$USER/.nv:/home/megawise/.nv \
    -p 5433:5432 \
    $IMAGE_ID
Parameter description:

Parameter | Description |
---|---|
--shm-size | The shared memory size allocated to the running Docker container, in bytes. Use the value of the physical_memory parameter under cache -> cpu in chewie_main.yaml, converted from GB to bytes. |
-v | Directory mapping between the host and the Docker container. In each pair separated by a colon, the first part is the directory on the host and the second part is the directory in the container. When launching the container, you can use -v to map local data files into the container so that you can import them into the MegaWise database. |
-p | Port mapping between the host and the Docker container. In the pair separated by a colon, the first part is the port on the host and the second part is the port in the container. You can use any unoccupied port on the host. This tutorial uses 5433. |

Logging starts when the container starts running. If you can find the following content in the log, you can assume the MegaWise server is successfully running.
MegaWise server is running...
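When MegaWise runs in the background, the same check can be done against the container log. This is a minimal sketch: `server_ready` is a hypothetical helper that scans log text on standard input for the message above, and `$CONTAINER_ID` is an assumed variable holding the ID of your MegaWise container.

```shell
# Hypothetical helper (not part of MegaWise).
# server_ready: reads log text on stdin; succeeds if the ready message appears.
server_ready() {
    grep -qF 'MegaWise server is running...'
}

# Usage on a real host ($CONTAINER_ID is an assumption, set it yourself):
#   if sudo docker logs "$CONTAINER_ID" 2>&1 | server_ready; then
#       echo "MegaWise is up"
#   fi
```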
-
Use MegaWise.
$ psql -U zilliz -p 5433 -h $IP_ADDR -d postgres
MegaWise Docker creates a built-in database postgres after launch. A default user zilliz is created in the database. After you run the command, you are prompted to enter the password. The default password is zilliz.

If the terminal displays the following information, you can assume that the connection to MegaWise is successful.
psql (11.1)
Type "help" for help.

postgres=>
Note: If the connection times out, check whether the firewall settings are correct.
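For scripted connections, the same settings can be expressed as a libpq connection URI, which psql accepts in place of the individual options. The sketch below is illustrative: `megawise_uri` is a hypothetical helper that assumes the default user, database, and host port used in this tutorial.

```shell
# Hypothetical helper (not part of MegaWise).
# megawise_uri HOST [PORT]: prints a connection URI for the default user
# (zilliz) and the built-in database (postgres). PORT defaults to 5433,
# the host port used in this tutorial.
megawise_uri() {
    echo "postgresql://zilliz@$1:${2:-5433}/postgres"
}

# Usage (psql still prompts for the password; PGCONNECT_TIMEOUT is a standard
# libpq environment variable that limits how long the connection attempt waits):
#   PGCONNECT_TIMEOUT=5 psql "$(megawise_uri "$IP_ADDR")"
```

Setting a short connection timeout makes a blocked port fail quickly instead of hanging.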