starred/nanoGPT
mirror of https://github.com/karpathy/nanoGPT.git synced 2026-04-22 08:15:15 +02:00
89 Commits 6 Branches 0 Tags
21675d7755b47415d524a3860a4e8e17ffa4cc66

9 Commits

Author SHA1 Message Date
Andrej Karpathy
89da79eee1 add note of caution for the produced warning, investigate later 2023-01-14 20:38:22 +00:00
Andrej Karpathy
91d02510ce fix bug... if topk > vocab_size, torch.topk will throw error 2023-01-14 03:57:00 +00:00
Andrej Karpathy
43b37fd568 reverse the order, making sure that the final layer init is preserved, and becomes the token embedding instead of the other way around. otherwise the loss can be all messed up from a bad init 2023-01-14 02:16:10 +00:00
Andrej Karpathy
7c8288552b tie the weights of lm_head.weight and transformer.wte.weight, i.e. the last linear layer of decoder and the token embeddings. 2023-01-14 01:00:55 +00:00
Andrej Karpathy
8f85b83347 inference time mini-optimization low-hanging fruit ty @jxtps for raising: when we are running inference we can apply lm_head on only the very last token 2023-01-12 06:02:50 +00:00
Andrej Karpathy
177d5f7dc5 disabling torch.jit.script here for massive performance boost when using torch.compile, our default. see issue #11. thanks @vgoklani for flagging 2023-01-02 23:05:01 +00:00
Andrej Karpathy
2febf4463c candidate changes to apis, have to think through more 2023-01-01 01:29:48 +00:00
ankandrew
7f0e6d9a71 Frozen GPTConfig 2022-12-29 17:07:19 -03:00
Andrej Karpathy
fe8042867c first very bad commit 2022-12-28 00:58:19 +00:00
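
Commit 91d02510ce above notes that torch.topk throws an error when the requested top-k exceeds the vocabulary size. The sketch below illustrates the general idea of such a fix, clamping k to the number of available logits before filtering. It is a pure-Python stand-in, not the repository's code: the function name `top_k_filter` and the use of `heapq.nlargest` in place of `torch.topk` are assumptions made for illustration.

```python
import heapq
import math


def top_k_filter(logits, top_k):
    """Keep the top_k largest logits and mask the rest to -inf.

    Hypothetical helper for illustration only; it operates on a plain
    list of floats rather than a tensor.
    """
    # Clamp k to the vocabulary size so an oversized top_k is harmless.
    k = min(top_k, len(logits))
    if k <= 0:
        return [-math.inf] * len(logits)
    # The k-th largest value acts as the cutoff threshold.
    threshold = heapq.nlargest(k, logits)[-1]
    # Values tied with the threshold are all kept, so slightly more
    # than k entries can survive when there are duplicates.
    return [x if x >= threshold else -math.inf for x in logits]


logits = [2.0, 0.5, 1.0, -1.0]
filtered = top_k_filter(logits, top_k=2)    # only the two largest survive
oversized = top_k_filter(logits, top_k=10)  # k is clamped to 4, nothing is masked
```

With the clamp in place, a caller can pass a fixed top_k regardless of how small the vocabulary is, which matters for tiny character-level vocabularies.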