How to Set Up Software

Ubuntu 22.04 is officially supported.
WSL2 is partially supported.
On other distros you are on your own, but setup should be straightforward.
BTRFS is a good choice of filesystem because its zstd compression saves disk space.

Path of least pain

Download the Ubuntu 22.04 (Jammy) live server ISO and install it.
(https://releases.ubuntu.com/22.04/ubuntu-22.04-live-server-amd64.iso)

Install podman + allow rootless quota

apt-get install ca-certificates podman

#allow rootless podman CPU quota
mkdir -p /etc/systemd/system/user@.service.d/
cat <<EOT >> /etc/systemd/system/user@.service.d/delegate.conf
[Service]
Delegate=memory pids io cpu cpuset
EOT
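
To confirm that the delegation took effect (after `systemctl daemon-reload` and a fresh login), the controllers available to your user session can be compared against the list in the drop-in. The following is a minimal sketch, not part of podman or systemd: `missing_controllers` is a hypothetical helper, and the `cgroup.controllers` path is an assumption based on a default systemd user session on cgroups v2.

```shell
#!/usr/bin/env bash
# Sketch: print each wanted cgroup v2 controller that is NOT listed in a
# cgroup.controllers file. missing_controllers is a hypothetical helper name.
missing_controllers() {
    local file="$1"; shift
    local available want
    # cgroup.controllers holds a space-separated list on one line
    available=" $(cat "$file" 2>/dev/null | tr '\n' ' ') "
    for want in "$@"; do
        case "$available" in
            *" $want "*) ;;              # controller is delegated
            *) printf '%s\n' "$want" ;;  # controller is missing
        esac
    done
}

# Assumed path for the current user's systemd session on cgroups v2.
uid=$(id -u)
controllers_file="/sys/fs/cgroup/user.slice/user-${uid}.slice/user@${uid}.service/cgroup.controllers"
if [ -r "$controllers_file" ]; then
    missing_controllers "$controllers_file" memory pids io cpu cpuset
else
    echo "no readable cgroup.controllers at $controllers_file"
fi
```

If the helper prints nothing, all five controllers are delegated. Any name it prints is a controller that rootless podman will not be able to limit.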

Install the NVIDIA driver, CUDA, and nvidia-container-runtime if you have NVIDIA GPUs.

#latest NVIDIA driver currently upstream
apt-get install -y --no-install-recommends nvidia-driver-510

#latest cuda currently available
wget https://developer.download.nvidia.com/compute/cuda/11.6.2/local_installers/cuda_11.6.2_510.47.03_linux.run
sh cuda_11.6.2_510.47.03_linux.run --silent --toolkit --no-drm --no-man-page
rm cuda_11.6.2_510.47.03_linux.run
echo "PATH=\"\$PATH:/usr/local/cuda-11.6/bin\"" > /etc/environment && \
echo "CUDA_HOME=\"/usr/local/cuda-11.6\"" >> /etc/environment && \
echo "CUDA_PATH=\"/usr/local/cuda-11.6\"" >> /etc/environment

#install nvidia-container-runtime + setup OCI hook
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | apt-key add -
curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | apt-key add -
mkdir -p /etc/apt/sources.list.d/
cat <<EOT >> /etc/apt/sources.list.d/nvidia-container-runtime.list
deb https://nvidia.github.io/libnvidia-container/experimental/ubuntu18.04/\$(ARCH) /
deb https://nvidia.github.io/nvidia-container-runtime/experimental/ubuntu18.04/\$(ARCH) /
deb https://nvidia.github.io/nvidia-docker/ubuntu18.04/\$(ARCH) /
EOT
apt-get update
apt-get install -y nvidia-container-runtime
mkdir -p /usr/share/containers/oci/hooks.d
cat <<EOT > /usr/share/containers/oci/hooks.d/oci-nvidia-hook.json
{
"version": "1.0.0",
"hook": {
"path": "/usr/bin/nvidia-container-toolkit",
"args": ["nvidia-container-toolkit", "prestart"],
"env": [
"PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
]
},
"when": {
"always": true,
"commands": [".*"]
},
"stages": ["prestart"]
}
EOT
#TODO: remove this once cgroupsV2 support is stable (likely the next major release)
sed -i 's/^#no-cgroups = false/no-cgroups = true/;' /etc/nvidia-container-runtime/config.toml
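
Since the OCI hook is a JSON file written from a heredoc, it is worth checking that the result still parses, for example if the file ends up holding more than one object after the step was run twice. The sketch below is an assumption-laden convenience, not part of the NVIDIA tooling: `check_hook_json` is a hypothetical helper, and it relies on `python3` being installed (it is on a default Ubuntu server install). It checks JSON syntax only, not whether the hook actually works.

```shell
#!/usr/bin/env bash
# Sketch: succeed only if the given file is a single valid JSON document.
# check_hook_json is a hypothetical helper; it assumes python3 is available.
check_hook_json() {
    python3 -m json.tool "$1" > /dev/null 2>&1
}

hook="/usr/share/containers/oci/hooks.d/oci-nvidia-hook.json"
if [ -f "$hook" ]; then
    if check_hook_json "$hook"; then
        echo "hook JSON is valid"
    else
        echo "hook JSON is invalid, recreate the file"
    fi
fi
```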

(Path of pain) Provision Disk

warning

Make sure you don't format your main disk.

Reference https://github.com/gpuedge/farm_image

export DRIVE="sda"
export HOSTNAME="node1"
BLOCKSIZE=$(cat /sys/block/$DRIVE/queue/physical_block_size)
IOSIZE=$(cat /sys/block/$DRIVE/queue/optimal_io_size)
ALIGN=$(cat /sys/block/$DRIVE/alignment_offset)
SECTOR=$(($IOSIZE/$BLOCKSIZE))
FIRSTSECTOR=$(( (ALIGN + IOSIZE) / BLOCKSIZE ))
SIZEBOOT=$(( (1024 * 1024 * 1024 / IOSIZE) * SECTOR ))

#Align partitions to the drive's optimal I/O size (positions are in sectors)
parted -s /dev/$DRIVE mklabel gpt
parted -s -a optimal /dev/$DRIVE mkpart primary ${FIRSTSECTOR}s $((FIRSTSECTOR + SIZEBOOT))s
parted -s -a optimal /dev/$DRIVE mkpart primary $((FIRSTSECTOR + SECTOR + SIZEBOOT))s 100%
parted -s /dev/$DRIVE set 1 esp on
parted -s /dev/$DRIVE set 1 boot on

#make the disk bootable via UEFI by placing the FAT32 partition first
mkfs.vfat /dev/${DRIVE}1
#make the second partition BTRFS
mkfs.btrfs -f -R free-space-tree /dev/${DRIVE}2
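
The alignment arithmetic above can be checked without touching a disk by plugging in sample values. This is a minimal sketch under assumptions: the numbers (512-byte blocks, a 1 MiB optimal I/O size, zero alignment offset) are illustrative rather than read from a real drive, and the 1 MiB fallback for drives that report an `optimal_io_size` of 0 is an addition not found in the original steps.

```shell
#!/usr/bin/env bash
# Sample values (assumptions); on a real machine these come from /sys/block/$DRIVE/
BLOCKSIZE=512        # queue/physical_block_size, in bytes
IOSIZE=1048576       # queue/optimal_io_size, in bytes (1 MiB)
ALIGN=0              # alignment_offset, in bytes

# Assumption: fall back to 1 MiB when the drive reports 0, to avoid dividing by zero
if [ "$IOSIZE" -eq 0 ]; then IOSIZE=1048576; fi

SECTOR=$(( IOSIZE / BLOCKSIZE ))                        # sectors per optimal I/O unit
FIRSTSECTOR=$(( (ALIGN + IOSIZE) / BLOCKSIZE ))         # first aligned sector
SIZEBOOT=$(( (1024 * 1024 * 1024 / IOSIZE) * SECTOR ))  # 1 GiB expressed in sectors

BOOT_END=$(( FIRSTSECTOR + SIZEBOOT ))
ROOT_START=$(( FIRSTSECTOR + SECTOR + SIZEBOOT ))

echo "boot: ${FIRSTSECTOR}s -> ${BOOT_END}s"
echo "root: ${ROOT_START}s -> 100%"
```

With these sample values the script prints `boot: 2048s -> 2099200s` and `root: 2101248s -> 100%`, so the boot partition is 1 GiB and the root partition starts one I/O unit after it.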

Let's mount the partitions now and extract the Jammy base daily image.

wget https://cdimage.ubuntu.com/ubuntu-base/jammy/daily/current/jammy-base-amd64.tar.gz
mount -o discard=async,space_cache=v2,compress-force=zstd:2,ssd,noatime /dev/${DRIVE}2 /mnt
tar --same-owner -xf jammy-base-amd64.tar.gz -C /mnt
mkdir -p /mnt/boot/efi
mount -o umask=0077 /dev/${DRIVE}1 /mnt/boot/efi

note

By mounting the disk with zstd compression forced from the start, the base system is already compressed as the tar archive is extracted, which saves space.

Let's update our fstab now.

UUID1=$(blkid -s UUID -o value /dev/${DRIVE}1)
UUID2=$(blkid -s UUID -o value /dev/${DRIVE}2)
cat <<EOT > /mnt/etc/fstab
proc /proc proc defaults 0 0
UUID=${UUID2} / btrfs defaults,discard=async,space_cache=v2,compress-force=zstd:2,ssd,noatime 0 0
UUID=${UUID1} /boot/efi vfat umask=0077 0 0
EOT
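
Because the heredoc delimiter `EOT` is unquoted, the shell expands variables inside the heredoc body, while a backslash-escaped dollar sign is written out literally. For fstab the UUID values must be expanded, so it helps to see the difference. The sketch below uses a placeholder UUID (an assumption, not a real filesystem UUID) and captures the heredoc output in variables instead of writing a file.

```shell
#!/usr/bin/env bash
UUID2="1234-ABCD"   # placeholder value for illustration only

# ${UUID2} is expanded by the shell before the text is written
expanded=$(cat <<EOT
UUID=${UUID2} / btrfs defaults 0 0
EOT
)

# \$ produces a literal dollar sign, so no value is substituted
literal=$(cat <<EOT
UUID=\$(UUID2) / btrfs defaults 0 0
EOT
)

echo "$expanded"
echo "$literal"
```

The first line printed is `UUID=1234-ABCD / btrfs defaults 0 0`, and the second is `UUID=$(UUID2) / btrfs defaults 0 0`, which would not be a usable fstab entry.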

Locales

cat <<EOT > /mnt/etc/default/locale
LC_CTYPE="en_US.UTF-8"
LC_ALL="en_US.UTF-8"
LANG="en_US.UTF-8"
LANGUAGE="en_US.UTF-8"
EOT

Hostname and hosts

echo "$HOSTNAME" > /mnt/etc/hostname
cat <<EOT > /mnt/etc/hosts
127.0.0.1 localhost
127.0.0.1 ${HOSTNAME}
EOT

Provision packages