Preparation
- A Linux system.
- A reasonably capable GPU. Only compute capability 8.0 cards are currently supported; an A100 is recommended, and 30/40-series cards are not supported yet.
- git/git-lfs and docker installed.
- nvidia-container-runtime installed (strongly recommended); the steps are as follows:
- Add the NVIDIA package repository key:
# For Ubuntu 18.04
curl -sL https://nvidia.github.io/nvidia-docker/gpgkey | \
sudo apt-key add -
# For Ubuntu 20.04/22.04 (recommended)
curl -sL https://nvidia.github.io/nvidia-docker/gpgkey | gpg --dearmor | sudo tee /usr/share/keyrings/nvidia-docker.gpg >/dev/null
- Add the NVIDIA package repository
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -sL https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
sudo tee /etc/apt/sources.list.d/nvidia-docker.list
- For Ubuntu 20.04/22.04, insert the following after deb:
[arch=amd64 signed-by=/usr/share/keyrings/nvidia-docker.gpg]
sudo vim /etc/apt/sources.list.d/nvidia-docker.list
# After editing, the file should look like this
deb [arch=amd64 signed-by=/usr/share/keyrings/nvidia-docker.gpg] https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/$(ARCH) /
#deb https://nvidia.github.io/libnvidia-container/experimental/ubuntu18.04/$(ARCH) /
deb [arch=amd64 signed-by=/usr/share/keyrings/nvidia-docker.gpg] https://nvidia.github.io/nvidia-container-runtime/stable/ubuntu18.04/$(ARCH) /
#deb https://nvidia.github.io/nvidia-container-runtime/experimental/ubuntu18.04/$(ARCH) /
deb [arch=amd64 signed-by=/usr/share/keyrings/nvidia-docker.gpg] https://nvidia.github.io/nvidia-docker/ubuntu18.04/$(ARCH) /
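Instead of editing the list by hand in vim, the same change can be scripted with sed. A sketch, demonstrated on a scratch copy so nothing on the host is touched; point it at /etc/apt/sources.list.d/nvidia-docker.list with sudo to apply it for real:

```shell
# Prepend the [arch=... signed-by=...] attribute to every active "deb " line;
# commented-out "#deb" lines are left untouched.
LIST=$(mktemp)
cat > "$LIST" <<'EOF'
deb https://nvidia.github.io/libnvidia-container/stable/ubuntu18.04/$(ARCH) /
#deb https://nvidia.github.io/libnvidia-container/experimental/ubuntu18.04/$(ARCH) /
EOF
sed -i 's|^deb |deb [arch=amd64 signed-by=/usr/share/keyrings/nvidia-docker.gpg] |' "$LIST"
cat "$LIST"
```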
- Update the package index
sudo apt-get update
- Install nvidia-container-runtime
sudo apt-get install nvidia-container-runtime
- Edit the Docker daemon configuration
sudo vim /etc/docker/daemon.json
- Add the following settings:
{
"runtimes": {
"nvidia": {
"path": "/usr/bin/nvidia-container-runtime",
"runtimeArgs": []
}
},
"default-runtime": "nvidia"
}
- If you already have custom registry mirrors configured, the file will look roughly like this:
{
"registry-mirrors": [
"https://dockerproxy.com",
"https://docker.mirrors.ustc.edu.cn",
"https://dockerhub.azk8s.cn"
],
"runtimes": {
"nvidia": {
"path": "/usr/bin/nvidia-container-runtime",
"runtimeArgs": []
}
},
"default-runtime": "nvidia"
}
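A syntax error in daemon.json will prevent dockerd from starting at all, so it is worth validating the JSON before restarting the service. A minimal check, shown here on a scratch copy; substitute /etc/docker/daemon.json on a real host:

```shell
# Validate the daemon config as JSON before restarting docker; a malformed
# file would leave the daemon unable to start. Uses a scratch copy here.
CFG=$(mktemp)
cat > "$CFG" <<'EOF'
{
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "default-runtime": "nvidia"
}
EOF
python3 -m json.tool "$CFG" > /dev/null && echo "daemon.json OK"
```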
- Restart the Docker service
sudo systemctl restart docker
Getting started
- Download the model and code.
git lfs install
git clone https://huggingface.co/TMElyralab/lyraChatGLM.git
- Inspect the shared library's dependencies; at a glance it was built against Python 3.8 / CUDA 12 / TensorRT 8.x.
$ ldd glm.cpython-38-x86_64-linux-gnu.so
linux-vdso.so.1 (0x00007ffd0dc3c000)
libtorch_cpu.so => not found
libtorch_cuda.so => not found
libtorch.so => not found
libc10.so => not found
libcudart.so.12 => not found
libnvinfer.so.8 => /usr/local/cuda/lib64/libnvinfer.so.8 (0x00007f1b8e800000)
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f1b8e400000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f1b9d07a000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1b8e000000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f1b9d075000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f1b9d070000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f1b9d069000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f1b8e719000)
/lib64/ld-linux-x86-64.so.2 (0x00007f1b9d0ea000)
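The filename of the extension module confirms part of this: cpython-38 in glm.cpython-38-x86_64-linux-gnu.so is the CPython ABI tag, and the libcudart.so.12 dependency points at CUDA 12. The tag can be read off mechanically:

```shell
# Extract the Python version from the CPython ABI tag embedded in the
# extension module's filename (cpython-38 -> Python 3.8).
SO=glm.cpython-38-x86_64-linux-gnu.so
pyver=$(echo "$SO" | sed -n 's/.*cpython-\([0-9]\)\([0-9]*\)-.*/\1.\2/p')
echo "built for Python $pyver"
```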
- After asking, the author confirmed that TensorRT 8.6 was used (related link).
- Browsing NVIDIA's official TensorRT container page, the latest release, 23.04, is a close match: it ships Python 3.8 / CUDA 12.1 / TensorRT 8.6.1.
- Pull the TensorRT + PyTorch image provided by NVIDIA.
docker pull nvcr.io/nvidia/pytorch:23.04-py3
- Start the container, mounting the model and code downloaded in step one into it.
docker run --gpus all \
-it --rm \
-v ${PWD}/lyraChatGLM:/workspace/ \
nvcr.io/nvidia/pytorch:23.04-py3
- Set the Tsinghua mirror as the default pip index
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
- Install the dependencies
pip install transformers
pip install icetk
- Run the demo
python demo.py
- It fails with the following error:
expecting library version 8.6.1.2 got 8.6.0.12,
- The container ships 8.6.1.2 while the author built the engine with 8.6.0.12, so TensorRT needs to be downgraded to 8.6.0.12.
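To double-check which side is newer before downgrading, the two version strings can be compared with sort -V:

```shell
# Version-sort the engine's build version and the container's library
# version; the one that sorts first is the older one, so the container's
# TensorRT must be downgraded to match the engine.
older=$(printf '8.6.1.2\n8.6.0.12\n' | sort -V | head -n 1)
echo "older version: $older"
```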
- Check all of the TensorRT components currently installed.
dpkg -l | grep nvinfer
- The result:
ii libnvinfer-bin 8.6.1.2-1+cuda12.0 amd64 TensorRT binaries
ii libnvinfer-dev 8.6.1.2-1+cuda12.0 amd64 TensorRT development libraries
ii libnvinfer-dispatch-dev 8.6.1.2-1+cuda12.0 amd64 TensorRT development dispatch runtime libraries
ii libnvinfer-dispatch8 8.6.1.2-1+cuda12.0 amd64 TensorRT dispatch runtime library
ii libnvinfer-headers-dev 8.6.1.2-1+cuda12.0 amd64 TensorRT development headers
ii libnvinfer-headers-plugin-dev 8.6.1.2-1+cuda12.0 amd64 TensorRT plugin headers
ii libnvinfer-lean-dev 8.6.1.2-1+cuda12.0 amd64 TensorRT lean runtime libraries
ii libnvinfer-lean8 8.6.1.2-1+cuda12.0 amd64 TensorRT lean runtime library
ii libnvinfer-plugin-dev 8.6.1.2-1+cuda12.0 amd64 TensorRT plugin libraries
ii libnvinfer-plugin8 8.6.1.2-1+cuda12.0 amd64 TensorRT plugin libraries
ii libnvinfer-vc-plugin-dev 8.6.1.2-1+cuda12.0 amd64 TensorRT vc-plugin library
ii libnvinfer-vc-plugin8 8.6.1.2-1+cuda12.0 amd64 TensorRT vc-plugin library
ii libnvinfer8 8.6.1.2-1+cuda12.0 amd64 TensorRT runtime libraries
- Download the 8.6.0 deb file to downgrade TensorRT
# Download
wget https://developer.nvidia.com/downloads/compute/machine-learning/tensorrt/secure/8.6.0/local_repos/nv-tensorrt-local-repo-ubuntu2004-8.6.0-cuda-12.0_1.0-1_amd64.deb
# Install (this really just unpacks a local apt repo)
dpkg -i nv-tensorrt-local-repo-ubuntu2004-8.6.0-cuda-12.0_1.0-1_amd64.deb
# Enter the unpacked directory
cd /var/nv-tensorrt-local-repo-ubuntu2004-8.6.0-cuda-12.0/
# Install all of the deb files
dpkg -i *.deb
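After the downgrade, every libnvinfer* package should report an 8.6.0 version. A quick awk check over the version column of dpkg -l output; the lines below are a captured sample, so on the container pipe dpkg -l | grep nvinfer into the same awk:

```shell
# Count packages whose version column ($3 in dpkg -l output) is not on
# the 8.6.0 release; zero stragglers means the downgrade is complete.
stragglers=$(awk '$3 !~ /^8\.6\.0/ { n++ } END { print n + 0 }' <<'EOF'
ii libnvinfer8 8.6.0.12-1+cuda12.0 amd64 TensorRT runtime libraries
ii libnvinfer-dev 8.6.0.12-1+cuda12.0 amd64 TensorRT development libraries
EOF
)
echo "packages not on 8.6.0: $stragglers"
```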
- Re-run demo.py and it should work. Since I don't have an A100, it failed again, this time with a compute-capability mismatch.
The engine plan file is generated on an incompatible device, expecting compute 8.9 got compute 8.0, please rebuild.
- The author built with 8.6.0. With 8.6.1, an engine can in theory be built to be compatible with multiple compute capabilities, which would make 30/40-series cards usable.
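For reference, the compute capabilities in play here, as a small lookup sketch. The mapping covers only the cards mentioned in this post; on a live machine, nvidia-smi --query-gpu=compute_cap --format=csv,noheader reports the number directly:

```shell
# Compute capability of the GPUs discussed above: the engine shipped in the
# repo targets 8.0 (A100), while 30-series (8.6) and 40-series (8.9) cards
# would need an engine rebuilt for their architecture.
cap_of() {
  case "$1" in
    A100)    echo 8.0 ;;  # Ampere data-center
    RTX30xx) echo 8.6 ;;  # Ampere consumer
    RTX40xx) echo 8.9 ;;  # Ada Lovelace
    *)       echo unknown ;;
  esac
}
cap_of A100     # prints 8.0, what the shipped engine expects
cap_of RTX40xx  # prints 8.9, matching the error message above
```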