LLM #16

@testpppppp

LLMs from an industry perspective

The Road to AGI: Essentials of Large Language Model (LLM) Technology

Which large models exist

https://zhuanlan.zhihu.com/p/611403556

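Model architecture

Why are today's LLMs all decoder-only architectures?

The low-rank perspective

How to train

Ladder Side-Tuning: a "wall-climbing ladder" for pretrained models
LoRA: Low-Rank Adaptation of Large Language Models, a brief reading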
https://huggingface.co/blog/peft
https://github.com/tloen/alpaca-lora
How should we view the LLaMA model leak? - Su Yang's answer - Zhihu
https://www.zhihu.com/question/587479829/answer/2925378135

How to deploy for inference

Quantization

PyTorch Lightning: a complete guide
https://github.com/Shivanandroy/simpleT5

How many resources are needed

References

T5 fine-tuning

https://www.kaggle.com/code/evilmage93/t5-finetuning-on-sentiment-classification
https://discuss.huggingface.co/t/how-to-fine-tune-t5-base-model/8478
https://shivanandroy.com/fine-tune-t5-transformer-with-pytorch/
https://www.kaggle.com/code/nulldata/training-t5-models-made-easy-with-simplet5/
