Transformer P32: Training a Model on Kaggle's Free GPU (Demo)

Author: 陈华 • Published: 2023-09-28 • Views: 1241

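In the previous lesson we uploaded the dataset to Kaggle and set up the GPU environment, so we can now migrate the code and train the model. "Migrating the code" simply means copying it into the notebook and deleting the debugging code.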

Overall workflow

1. Register on Kaggle: https://www.kaggle.com. The registration captcha requires a VPN or proxy to load.

2. Upload a Dataset: compress the data files into an archive, then upload the archive.
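
The compression step can be done with any archive tool. As a minimal sketch using only Python's standard library, the helper below packs a directory into a zip file; the function name `zip_dataset` and the directory and archive names in the usage note are placeholders, not names from the course project.

```python
import shutil
from pathlib import Path


def zip_dataset(data_dir: str, archive_name: str) -> Path:
    """Pack every file under data_dir into <archive_name>.zip and return its path."""
    # make_archive appends the ".zip" suffix itself and returns the archive path.
    archive_path = shutil.make_archive(archive_name, "zip", root_dir=data_dir)
    return Path(archive_path)
```

For example, `zip_dataset("data", "dataset")` would write `dataset.zip` to the current directory, ready to be uploaded.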

3. Create a New Notebook.

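4. Switch the accelerator to GPU. If your phone number has not been verified yet, verify it first, then turn on internet access for the notebook.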

5. Copy the code into the notebook, paying attention to the order, and delete the debugging code.

# Install third-party packages
!pip install sacrebleu
!pip install jieba

# Copy the files in this order
config.py
utils.py
data_processor.py
data_loader.py
model.py
data_parallel.py
train.py
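
The copy step above can also be scripted. The sketch below concatenates the files in the listed order and drops top-level `if __name__ == "__main__":` blocks from every file except the last one. Two assumptions are baked in and are not confirmed by the course code: that the debugging code lives in those blocks, and that the main block of `train.py` is what starts training. The helper names `strip_main_block` and `merge_sources` are made up for this illustration.

```python
from pathlib import Path

# Order taken from the list above; later files depend on earlier ones.
COPY_ORDER = [
    "config.py", "utils.py", "data_processor.py", "data_loader.py",
    "model.py", "data_parallel.py", "train.py",
]


def strip_main_block(source: str) -> str:
    """Drop top-level `if __name__ ...:` blocks (assumed to hold debug code)."""
    kept, skipping = [], False
    for line in source.splitlines():
        if line.startswith("if __name__"):
            skipping = True
            continue
        # A non-blank, unindented line ends the skipped block.
        if skipping and line.strip() and not line[0].isspace():
            skipping = False
        if not skipping:
            kept.append(line)
    return "\n".join(kept).rstrip() + "\n"


def merge_sources(project_dir: str, order=COPY_ORDER) -> str:
    """Concatenate the project files in order, one labelled section per file."""
    sections = []
    for name in order:
        code = Path(project_dir, name).read_text(encoding="utf-8")
        if name != order[-1]:
            # Assumption: only the last file (the entry point) needs its main block.
            code = strip_main_block(code)
        sections.append(f"# ---- {name} ----\n{code}")
    return "\n".join(sections)
```

The merged text still needs a manual pass: any imports between the project's own files no longer resolve once everything lives in one notebook, and debug code outside a main block is not detected.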

# Adjust the configuration
BATCH_SIZE = 350
BATCH_SIZE_GPU0 = 50
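
The two settings suggest an uneven split of each batch across GPUs, with the first GPU taking a smaller share. That is a common pattern because, with `torch.nn.DataParallel`, GPU 0 also gathers the outputs and so tends to run out of memory first. How `data_parallel.py` actually divides the batch is not shown here, so the following is only a sketch of one plausible split; `split_batch` is a hypothetical helper, not a function from the project.

```python
from typing import List


def split_batch(batch_size: int, gpu0_batch_size: int, num_gpus: int) -> List[int]:
    """Return per-GPU chunk sizes, giving GPU 0 a fixed, smaller share."""
    if num_gpus == 1:
        return [batch_size]
    rest = batch_size - gpu0_batch_size
    base, extra = divmod(rest, num_gpus - 1)
    # Spread any remainder one sample at a time over the remaining GPUs.
    others = [base + (1 if i < extra else 0) for i in range(num_gpus - 1)]
    return [gpu0_batch_size] + others
```

Under this assumption, `split_batch(350, 50, 2)` gives 50 samples to GPU 0 and 300 to the second GPU.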

6. Run offline. Test in an online (interactive) session first, and start the offline run only once everything works.

# If an out-of-memory error like the one below appears, reduce batch_size
torch.cuda.OutOfMemoryError: CUDA out of memory.
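
One way to pick a workable batch size during the online test is to halve it until a single training step fits. The sketch below is not part of the course code: `find_fitting_batch_size` and the `run_step` callback are hypothetical, and the callback is assumed to run one forward/backward pass at the given batch size. It relies on CUDA out-of-memory errors being `RuntimeError`s whose message contains "out of memory".

```python
def find_fitting_batch_size(run_step, start: int, minimum: int = 1) -> int:
    """Halve the batch size until run_step(batch_size) no longer hits an OOM error."""
    batch_size = start
    while batch_size >= minimum:
        try:
            run_step(batch_size)
            return batch_size
        except RuntimeError as err:
            # Re-raise anything that is not an out-of-memory error.
            if "out of memory" not in str(err):
                raise
            batch_size //= 2
    raise RuntimeError("no batch size at or above the minimum fits in memory")
```

Starting from 350, the candidates tried would be 350, 175, 87, and so on. In a real notebook it may also help to call `torch.cuda.empty_cache()` between attempts.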

7. Download the model files. It is best to download all the intermediate files produced during training as well.

8. Load the saved model and run a prediction test.

texts = [
    "Having gone through the long years, people have come to understand more than ever before how important this doctrine is and how precious peace can be.",
    "Everything is for the people, and everything relies on them.",
]

# Corresponding Chinese translations:
# 经过漫长的岁月，人们更能感受到这一思想的重要，深知和平之弥足珍贵。
# 一切为了人民，一切依靠人民

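The news_big dataset is also provided; if you have your own machine, give it a try. On your own machine you can also enable the softmax in the Generator() layer: convergence is slower, but training is more stable.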