Patrick's Blogs

Transformer

By Anonymous; Published on 2024-11-17

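Transformer: Why the Transformer? In the Attention post, we described the advantages that the attention mechanism brings to recurrent neural networks. Is there a network built directly on the attention mechanism, with no dependence on RNNs? That network is the Transformer! The Transformer model was proposed by Google in 2017. It is built directly on self-attention and replaced the RNN structures that were previously common in NLP tasks. Compared with RNN-style...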
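
To make "built directly on self-attention" concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is an illustration rather than the post's own code: the helper names (`softmax`, `matmul`, `transpose`, `self_attention`), the identity projection matrices, and the three toy token vectors are all assumptions chosen to keep the example small.

```python
import math

def softmax(row):
    # Subtract the max for numerical stability before exponentiating.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    # Plain list-of-lists matrix product: (n x k) @ (k x m) -> (n x m).
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def self_attention(x, w_q, w_k, w_v):
    # Project the same input into queries, keys, and values.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    # Every token scores every token, so no recurrence is needed.
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    # Each output row is a weighted average of the value vectors.
    return matmul(weights, v), weights

# Hypothetical toy input: 3 tokens with 2-dimensional embeddings.
identity = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
output, weights = self_attention(tokens, identity, identity, identity)
```

Each row of `weights` sums to 1 and says how strongly one token attends to every token in the sequence, including itself. All rows are computed independently of each other, which is what allows the computation to run in parallel instead of step by step.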

Attention

By Anonymous; Published on 2024-11-15

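Attention. The Seq2Seq bottleneck: the Seq2Seq encoder compresses all of its information into a single context vector. One vector can hardly hold everything in the text, and Seq2Seq handles long sequences poorly. The RNN-based decoder depends on the context from the previous time step, which makes training considerably harder. Attention: attention addresses these bottlenecks as follows. The encoder passes more information to the decoder: the encoder passes all...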
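
The idea of giving the decoder more than one context vector can be sketched as follows. This is a minimal illustration under stated assumptions, not the post's implementation: it assumes dot-product scoring between the current decoder state and every encoder hidden state, and the function name `attention_context` and the toy vectors are hypothetical.

```python
import math

def attention_context(decoder_state, encoder_states):
    # Score each encoder hidden state against the current decoder state
    # (dot-product scoring is an assumption made for this sketch).
    scores = [sum(d * e for d, e in zip(decoder_state, h))
              for h in encoder_states]
    # Softmax turns the scores into weights that sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The context vector is the weighted sum of all encoder states,
    # recomputed at every decoding step.
    dim = len(encoder_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return context, weights

# Hypothetical toy values: three encoder states, one decoder state.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
decoder_state = [2.0, 0.0]
context, weights = attention_context(decoder_state, encoder_states)
```

Because the context is rebuilt from all encoder states at each step, the decoder is no longer limited to one fixed-size summary of the input.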

Seq2Seq

By Patrick; Published on 2024-11-14

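Seq2Seq architecture. Introduction: Seq2Seq stands for "Sequence to Sequence". As the name suggests, it maps one text sequence to another, and it is typically used for machine translation and text summarization. Architecture: the following uses machine translation as an example to explain the Seq2Seq architecture. The Seq2Seq model in a machine translation task consists of an encoder (Encoder) and a decoder (Decoder); the Enc...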

Basic Steps of Deep Learning

By Patrick; Published on 2024-09-17

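Basic configuration. The following hyperparameters can be configured in one place so that they are easy to change later: batch_size, the initial learning rate, the number of training epochs (max_epochs), and the GPU configuration.

    import torch

    batch_size = 16
    lr = 1e-4
    max_epochs = 100
    device = torch.device("cuda:1" if torch.cuda.is_available() else "cpu")

Data loading...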

pytorch-autograd

By Patrick; Published on 2024-09-16

Example code. Let's start with a simple piece of code:

    import torch
    x = torch.tensor([1.0, 2.0], requires_grad=True)
    y = x**2
    loss = y.mean()
    loss.backward()

Forward pass: x = [1.0, 2.0], y = [1.0, 4.0], loss = (1.0 + 4.0) / 2. Backward pass: ...
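
As a cross-check on the same numbers, the gradient of this loss can be worked out by hand and compared with a finite-difference estimate. The sketch below is not from the post and deliberately avoids PyTorch; the function names `loss_fn`, `analytic_grad`, and `numeric_grad` are hypothetical. Since loss = (x_1^2 + ... + x_n^2) / n, the partial derivative with respect to x_i is 2 * x_i / n.

```python
def loss_fn(x):
    # Same computation as the example: y = x**2, loss = mean(y).
    y = [v ** 2 for v in x]
    return sum(y) / len(y)

def analytic_grad(x):
    # d(loss)/d(x_i) = 2 * x_i / n, from differentiating the mean of squares.
    n = len(x)
    return [2.0 * v / n for v in x]

def numeric_grad(x, eps=1e-6):
    # Central finite differences, used only to sanity-check the formula.
    grads = []
    for i in range(len(x)):
        plus, minus = list(x), list(x)
        plus[i] += eps
        minus[i] -= eps
        grads.append((loss_fn(plus) - loss_fn(minus)) / (2 * eps))
    return grads

x = [1.0, 2.0]
loss = loss_fn(x)        # 2.5
grad = analytic_grad(x)  # [1.0, 2.0]
approx = numeric_grad(x)
```

With n = 2 the gradient equals x itself, so for this input `x.grad` after `loss.backward()` in the PyTorch version would be expected to hold the same values, [1.0, 2.0].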

pytorch-computational graph

By Patrick; Published on 2024-09-12

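Goals: understand the principles of computational graphs; understand the algorithms for static and dynamic graph generation; understand the common ways of executing a computational graph. What is a computational graph? A computational graph is a data structure used in machine learning and deep learning to represent a computation. It consists of nodes and edges: nodes represent operations (for example addition, multiplication, or activation functions), and edges represent the state of tensors and the dependencies between them, that is, the direction of data flow. In deep learning frameworks, the computational graph...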
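
A toy version of a dynamically built graph can make the node and edge description concrete. The sketch below is an assumption-laden illustration, not how PyTorch is implemented: the `Node` class, `topological_order`, and `backward` are hypothetical names, values are plain floats rather than tensors, and only addition and multiplication are supported.

```python
class Node:
    def __init__(self, value, parents=(), op="leaf"):
        self.value = value
        # Edges: each entry is (parent_node, local_gradient of this node
        # with respect to that parent).
        self.parents = parents
        self.op = op
        self.grad = 0.0

    def __add__(self, other):
        return Node(self.value + other.value,
                    ((self, 1.0), (other, 1.0)), "add")

    def __mul__(self, other):
        return Node(self.value * other.value,
                    ((self, other.value), (other, self.value)), "mul")

def topological_order(root):
    # Post-order depth-first traversal: parents appear before children.
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(root)
    return order

def backward(root):
    # Reverse-mode execution: walk the graph from the output back to the
    # leaves, accumulating gradients along every edge.
    root.grad = 1.0
    for node in reversed(topological_order(root)):
        for parent, local_grad in node.parents:
            parent.grad += node.grad * local_grad

a = Node(2.0)
b = Node(3.0)
c = a * b + a  # the graph is built while the expression runs
backward(c)
```

Here the graph is generated dynamically: evaluating `a * b + a` creates a "mul" node and an "add" node as a side effect. For c = a * b + a, the gradients are dc/da = b + 1 = 4.0 and dc/db = a = 2.0, which is what `a.grad` and `b.grad` hold after `backward(c)`.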

Thesis Proposal

By Patrick; Published on 2024-08-21

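How to choose a topic? Search CNKI for all related master's and doctoral theses from the past three years at the top-ranked universities in your discipline, then sort and summarize them to find a research direction that you like and that suits you, so as to settle on a general direction. Search exhaustively for related literature from the past ten years; literature from the past five years must be read carefully, with particular attention to review articles. Follow each year's applications to national, provincial, and municipal social science and natural science funding programs, which can also provide topic ideas. A clear topic...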

benchmark

By Patrick; Published on 2024-08-12

What is a benchmark? How to build a benchmark? References: https://zhuanlan.zhihu.com/p/682617717 https://zhuanlan.zhihu.com/p/685171601 https://www.skycaiji.com/aigc/ai15071.html

How to use pipeline?

By Patrick; Published on 2024-08-12

Imports:

    import torch
    from peft import prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
    from transformers import pipeline

Loading the model: we use bit...

Quick Start of LoRA

By Patrick; Published on 2024-08-11

Goals: Dataset Creation. Base Model Choice. Steps: 1. Install packages

    !pip install -qqq bitsandbytes==0.39.0
    !pip install -qqq torch==2.0.1
    !pip install -qqq -U git+https://github.com/huggingface/transfo...

FEATURED TAGS

ABOUT ME

CONTACT

Dr. Patrick Director

ARCHIVES