visualdata

If you are just trying to understand transformers by building, I would start with Andrej Karpathy's Let's build GPT: [https://www.youtube.com/watch?v=kCc8FmEb1nY](https://www.youtube.com/watch?v=kCc8FmEb1nY)
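
The core piece that video builds up to is causal self-attention; here is a from-memory sketch in the same style (PyTorch, not a verbatim excerpt from the video):

```python
import torch
import torch.nn.functional as F

# Sketch of single-head causal self-attention, roughly what the video
# builds toward. x: (batch, time, channels); key/query/value are
# nn.Linear(C, head_size, bias=False) layers supplied by the caller.
def causal_self_attention(x, key, query, value):
    B, T, C = x.shape
    k, q, v = key(x), query(x), value(x)                  # (B, T, head_size)
    # scaled dot-product affinities between all pairs of positions
    wei = q @ k.transpose(-2, -1) * k.shape[-1] ** -0.5   # (B, T, T)
    # mask the future so position t only attends to positions <= t
    tril = torch.tril(torch.ones(T, T, device=x.device))
    wei = wei.masked_fill(tril == 0, float("-inf"))
    wei = F.softmax(wei, dim=-1)
    return wei @ v                                        # (B, T, head_size)
```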


CygnusX1

This is an incredible series, even if you don't have any plans to follow along.


antoine-ross

Can vouch for this. I believe all of Dr Andrej's tutorials are really intuitive and relatively easy to follow along with. I learned a lot from watching all of them.


redditfov

Thanks!


Tacx79

Around a year ago (very shortly before pygmalion-6b and c.ai started getting really popular) I wrote a simple GPT from scratch with 100-600M params. As usual, I wrote the dataloader so it wouldn't just feed the data into the model randomly. I had ~5GB of text (not sure if that was compressed or after tokenizing). The model started to form somewhat logical but still very stupid short sentences after 100k-300k steps (maybe 30k-100k with another architecture), and I calculated it would take 200 years on my PC to do just 1 epoch over that 5GB of text. All the models I trained were useless, but I learned a lot of useful stuff about the 'text' part of AI. It was fun after all.
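
A minimal sketch of what that kind of sequential (non-random) dataloader can look like, assuming a pre-tokenized corpus held in one tensor; the class and names are illustrative, not the commenter's actual code:

```python
import torch

# Sketch: walk a tokenized corpus in order rather than sampling
# windows at random (names and structure are illustrative).
class SequentialTextLoader:
    def __init__(self, tokens: torch.Tensor, block_size: int, batch_size: int):
        self.tokens = tokens          # 1-D LongTensor of token ids
        self.block_size = block_size  # context length
        self.batch_size = batch_size
        self.pos = 0

    def next_batch(self):
        B, T = self.batch_size, self.block_size
        span = B * T + 1              # +1 so targets can be shifted by one
        if self.pos + span > len(self.tokens):
            self.pos = 0              # wrap around at the end of the corpus
        chunk = self.tokens[self.pos : self.pos + span]
        x = chunk[:-1].view(B, T)     # inputs
        y = chunk[1:].view(B, T)      # next-token targets
        self.pos += B * T
        return x, y
```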


timschwartz

Were you training with a GPU or on your CPU?


KvAk_AKPlaysYT

I'm currently in the process of doing so by watching this video; keep in mind that I'm just doing it for the experience. https://youtu.be/UU1WVnMk4E8?si=EAWK-cTAOJQe7Z6W


[deleted]

Would love to hear your experiences after you're done.


[deleted]

Not an LLM (that's too expensive), but I have trained a transformer that outputs random "Florida man" meme news titles lol. I used Colab to train with PyTorch and wrote the entire transformer from scratch. Since it was the free version of Colab, after the training I was banned from using a GPU for about a month.


[deleted]

That's pretty funny. Good ol' florida man.


Wonderful-Camp2553

"Florida man melts GPUs in Google's data center, gets banned"


[deleted]

LMFAO.


stddealer

I've trained very small (a few thousand parameters) LMs based on HMMs, able to generate gibberish that might look like English to non-English speakers, but their actual working use case is to determine whether some text is English or not. I did the same thing for French and German.
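
The same idea can be sketched with plain character bigram models (a degenerate case of an HMM); this is an assumed implementation, not the commenter's, and the corpus filenames are placeholders:

```python
import math
from collections import Counter

# Sketch: one character-level bigram model per language; classify text
# by which model assigns it the highest log-likelihood.
def train_bigram(text):
    pairs = Counter(zip(text, text[1:]))
    unigrams = Counter(text)
    # P(b|a) with add-one smoothing over an assumed ~64-symbol alphabet
    return lambda a, b: (pairs[(a, b)] + 1) / (unigrams[a] + 64)

def log_likelihood(model, text):
    return sum(math.log(model(a, b)) for a, b in zip(text, text[1:]))

# placeholder corpora, one plain-text file per language
models = {
    "en": train_bigram(open("english_corpus.txt").read().lower()),
    "fr": train_bigram(open("french_corpus.txt").read().lower()),
    "de": train_bigram(open("german_corpus.txt").read().lower()),
}

def detect(text):
    text = text.lower()
    return max(models, key=lambda lang: log_likelihood(models[lang], text))
```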


[deleted]

That's a cool project!


[deleted]

[removed]


[deleted]

I have 90k in Google Cloud Credits. I will give them to anyone that wants to try to train their own model.


[deleted]

They run out in February: first come first serve!


Key-Morning-4712

I hope we can make it a unified effort by this sub and train one model that's actually competitive with other 7B models. That would be cool.


[deleted]

We have a lot of brain power in this sub to do such a thing. I've got the credits if we want to collab.


Key-Morning-4712

Let's do it. It would be great if you could create a new GitHub org and a new Reddit post inviting everyone in this sub. Thanks for doing this, btw.


[deleted]

We have a few folks who signed up for credits here: https://join.slack.com/t/halyai/shared_invite/zt-23euqlj0i-kM68jyXT_o__cx_1DkLYpA Join the #gcp channel. We will divvy up the credits with whoever joins by end of day. Update: we have too many people. Join and you can be on the waitlist.


[deleted]

Another update. We made the good people who got in board members (10 people so far), who vote on funding new projects with the Google credits. It's like a communist VC firm. You can pitch your ideas and projects; higher chance of getting approved if you solve a real societal problem. I'll work with Google to get more credits for this communist endeavor. I'm not on the board, so I have no say in what gets funded.


Blonkist

Is this still going on? I would be curious to hop in as an observer.


[deleted]

No, this program has ended.


waxbolt

How many FLOPs is that equivalent to?


[deleted]

No idea


Caffeine_Monster

A fair bit. A smidge under 10k A100 hours, or 1/20th of a Llama-2 7B. Probably better off doing some ambitious finetuning rather than undertraining a small model from scratch.
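
The arithmetic behind that estimate, assuming the roughly $9 all-in rate per A100-hour that it implies, and the 184,320 A100-hours reported for Llama-2 7B further down the thread:

```latex
\frac{\$90{,}000}{\$9/\text{A100-hr}} \approx 10{,}000\ \text{A100-hrs},
\qquad
\frac{10{,}000}{184{,}320} \approx \frac{1}{18},\ \text{i.e. roughly } \tfrac{1}{20}\ \text{of a Llama-2 7B run}
```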


Smallpaul

I'm curious why you would use your GPU time on this rather than on doing something new.


[deleted]

The research project is understanding long term memory for LLMs. https://docs.google.com/document/d/1MY-GSRDR3wt9bIBikUZLyJ1USDWVTr7zcIvDvDAhWQI/edit?usp=drivesdk


Smallpaul

There is no need at all to train an LLM from scratch to execute on that plan and I’m completely confused about why you would want to give away the 90k to someone who wants to.


[deleted]

I'm porting off google cloud so might as well let someone have fun. No skin off my back.


Smallpaul

Why wouldn't you use the tokens to actually explore/deliver the project you linked?


[deleted]

By the time we hear back if the grant was approved the credits are gone.


Smallpaul

So the grant really has nothing to do with the tokens and you are just confusing things by referencing it when I asked you why you want to train an LLM from scratch. And we are back to the original question of why DO you want to train an LLM from scratch?


BackgroundAmoebaNine

/u/Smallpaul, is there a reason you're going so hard on OP right now? Would you rather see them executed than to share 90K cloud credits that they do not have use for and are expiring in February?


[deleted]

Sorry for the confusion. I read your comment wrong. I was just showing that we are trying to get deep understanding about context and context windows.


[deleted]

I see you meant tokens as credits, I thought you meant tokens in LLM context.


Smallpaul

Sorry. Jumping between threads and mixing up my terminology.


[deleted]

All good. I bet you don't get confused as often as I do 😂


johnkapolos

PM'd you :)


mgranin

sent a PM to you


LoadingALIAS

I’m interested. Check your DMs.


[deleted]

😬😬😬


Extraltodeus

Total cumulative A100 hours for all the Llama-2 models was around 3 million, IIRC.


sexybokononist

Training this on just one A100 would take 342 years. If they started training in 1681 they’d be finishing up this year.


Gov_CockPic

How many guys on stationary bikes would it take to produce the electricity needed for the compute of 1 hour of A100 compute training?


m18coppola

I trained a language model on a single copy of the King James Bible. It's hilariously incoherent but surprisingly structured.


Dyonizius

interesting!! some historians believe the bible was written by psyop agents


the_ham_man4

Historians = some uneducated Reddit users who believe anything on YouTube 


Evening_Ad6637

This is my experience from June this year with llama.cpp -> train-from-scratch: https://www.reddit.com/r/LocalLLaMA/comments/14dstqm/tutorial_train_your_own_llamacpp_miniggmlmodel/


Fun_Tangerine_1086

Yes, working on 2k- and 4k-context versions of gpt2-medium and gpt2-large(ish) sized models.

- With care, you can actually do useful work on a 12 GB GPU (RTX 3060 12GB here)
- Using 4- and 8-bit optimizers, and other non-AdamW optimizers, with batch_size=1 (you can actually do gpt2-medium w/ 4k context, w/ 4-bit AdamW, on a 12GB GPU... with 262 MB of VRAM to spare; see the sketch below)
- Datasets: using subsets (10%-30%) of SlimPajama and some openwebtext; also longer-context material (radio transcripts, books, transcripts of old PDF reports). Switching subsets of SlimPajama occasionally does seem to work!
- It's pretty easy for training to get "stuck" or for the loss to explode; keep frequent checkpoints, and be willing to stop and resume training w/ different learning rates. (In ye olde ML days, cyclic learning rates were in vogue; practically, I've been doing that w/ the stops/starts w/ different rates)
- Check your work occasionally vs. some of the open evaluations (ex: hellaswag). It can save a lot of effort when, say, you mistokenize both the training and validation datasets... also sample some output from your models regularly!
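
A minimal sketch of the low-bit optimizer swap mentioned in the second bullet, using bitsandbytes' 8-bit AdamW (the 4-bit case depends on the library; the model and learning rate here are placeholders):

```python
import bitsandbytes as bnb
from transformers import GPT2LMHeadModel

# Sketch: swap AdamW for an 8-bit variant to shrink optimizer-state VRAM.
# Full AdamW keeps two fp32 moment tensors per parameter, so quantizing
# that state is a large part of what makes this fit on a 12GB card.
model = GPT2LMHeadModel.from_pretrained("gpt2-medium").cuda()
optimizer = bnb.optim.AdamW8bit(model.parameters(), lr=3e-4)
# ...then a standard loop: loss.backward(); optimizer.step(); optimizer.zero_grad()
```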


Gov_CockPic

What's your power utility bill been like since you started?


SlowSmarts

I trained a small gpt2 model about a year ago and it was just gibberish. I then started training a model from llama.cpp when I first saw it was possible, about half a year ago. This has been more successful, and it has recently learned to stop itself.

The llama model takes ~750GB of RAM to train. I've been training it on and off, whenever I have CPU time not being used up by other projects. I've tried various methods of CPU clustering, but nothing so far has performed well enough to persist with. I've also tried other training acceleration methods like cuBLAS, but my K80 GPUs are now old enough that getting them working without crashing becomes a Python library nightmare.

So, the llama model has mostly been trained on an average of 80 CPU threads, using most of the 768GB of system RAM, for about 3 months combined... and it just now learned to stop itself, occasionally.


masc98

I've trained a good old GPT2 model on some WhatsApp conversations, a simple dumb project that I honestly suggest to you as well. It's simple to collect the data and you'll get some good laughs, guaranteed. Jokes aside, the important thing you soon realise is that CLM pretraining is SO important if you need good zero-shot performance and common world knowledge in your model. If your model is meant for a narrower context, I'd suggest a lightweight pretraining on domain knowledge and then finetuning on instructions. Lately I've used the [xLLM](https://github.com/BobaZooba/xllm) library, a pretty neat experience.
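
For anyone unfamiliar, CLM (causal language modeling) pretraining just means next-token prediction; a minimal sketch of the objective, with hypothetical tensor names:

```python
import torch
import torch.nn.functional as F

# Sketch of the causal LM objective: predict token t+1 from tokens <= t.
# logits:    (batch, seq_len, vocab) from any decoder-only model
# input_ids: (batch, seq_len) token ids of the training text
def clm_loss(logits: torch.Tensor, input_ids: torch.Tensor) -> torch.Tensor:
    shift_logits = logits[:, :-1, :]   # predictions for positions 0..T-2
    shift_labels = input_ids[:, 1:]    # targets are the next tokens 1..T-1
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
    )
```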


Imaginary_Bench_7294

Unfortunately, this requires a lot of time and effort. You need to create a dataset in the format you want the model to work with. If you want a good dataset, this entails curating it, reading through each entry for spelling or grammatical errors. That in itself takes a lot of work. If you use datasets that have been provided free of charge, you should still check the data for accuracy and appropriate content.

Then comes the compute expense. LoRA training is based on already-trained models, so I don't know exactly how it compares in some aspects. However, for proper training from scratch, you need to use the full-sized models, which is hardware-prohibitive depending on the size of the model. Of course, while small models are convenient for testing and lower hardware requirements, larger models generalize better, since they can develop more intricate relationships between words and concepts.

There is also a fine balance between overfitting the model and the desired results. Overfit the data, and you're likely to have it spit out exact copies of the input texts. Undertrain the model, and it might string together unrelated things. One of the easier, but costlier, ways to handle this is by increasing the epochs (how many times the data is fed in) while decreasing the amount the training alters the relationships per epoch, aka the learning rate. Making the model learn slower, and thus allowing more checkpoints to be saved, lets you select the point at which the training has reached optimal status for your needs. That also means that to reach the final epoch, you're looking at much more compute time. Then you've got batch sizes, input string lengths, noise injection, etc., etc.

Finding the right balance for what you want the model to do is not a simple matter. That's one of the major reasons most of the models are based on pretrained Llama. The fine-tuning of a model can be done relatively quickly in comparison to the initial base model training, as you're only adjusting the internal relationships, not creating them. For the most part.
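
For concreteness, here is how those knobs (epochs, learning rate, batch size, checkpoint frequency) typically appear in a Hugging Face training config; the values are illustrative placeholders, not recommendations:

```python
from transformers import TrainingArguments

# Sketch: the trade-offs described above as concrete settings.
# All values are placeholders, not tuned recommendations.
args = TrainingArguments(
    output_dir="./checkpoints",
    num_train_epochs=10,            # more passes over the data...
    learning_rate=5e-5,             # ...with a smaller step per pass
    per_device_train_batch_size=1,  # batch size, bounded by VRAM
    save_steps=500,                 # frequent checkpoints, so you can
    save_total_limit=20,            # pick the best stopping point later
)
```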


[deleted]

Can you use AI to do that work?


Imaginary_Bench_7294

For some things, sure. Curating the datasets, for example: you could probably use AI for that. Spell-check and grammar-check systems could handle making sure the text isn't full of mistakes, and AI could determine whether it is applicable to what you want the data to contain. The issue would come mostly from fact-checking the data if it is not roleplay content.

Edit: hit post too early. The parts that would require a human touch, such as determining whether your model has reached the desired level of training, would be iffy. You can have some metrics, such as loss, cross-entropy, or other stats, that tell you how closely the model's output matches the training data, but that is a loose representation. For coding or mathematics, that works pretty well. For creativity, not so much, as a higher loss means the model is less likely to reproduce the input data and is therefore more creative.


[deleted]

I've read papers saying most models are actually undertrained.


Imaginary_Bench_7294

I'd have to read the papers you're referencing to really discuss them; however, it depends on the goal of the model. Task-specific models, such as coding- or math-centric models, might not be. Generalist models, such as for chatting, RP, etc., are probably not so much of a concern. Overtraining on wildly varying data such as chat logs will be detrimental to creativity and will also increase the potential of the model spitting out exact copies of the training data. In fact, this can even happen when the model isn't overtrained on the data: [https://www.theregister.com/2023/12/01/chatgpt\_poetry\_ai/](https://www.theregister.com/2023/12/01/chatgpt_poetry_ai/)


CKtalon

Yes, even at 1.5T tokens, a 7B LLM wouldn't have reached convergence. (Chinchilla (20x parameters) is not to be used as a rule of thumb for 'sufficient training'.) Not sure how you are going to train from scratch, though. Even a 1-2B model will require thousands of dollars.
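
For scale, the Chinchilla heuristic being waved off here would call a 7B model "compute-optimal" at roughly

```latex
20\ \text{tokens/param} \times 7\times 10^{9}\ \text{params} = 1.4\times 10^{11} = 140\text{B tokens}
```

yet Llama-2 7B was trained on about 2T tokens, roughly 14x that, and the claim above is that even ~1.5T still leaves a 7B model short of convergence.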


[deleted]

I have 90k in Google Cloud credits that expire in February. Need to use them. Happy to have others help me use them up (no crypto mining because that is against TOS).


artificial_simpleton

No one can possibly read through the entire dataset used for pretraining a large language model, partly because it would take much longer than a human lifetime to do so. You need to curate the data you are using, but you don't do it manually, and knowing what heuristics to use is, of course, critical (some basic ones can be found in, e.g., the RedPajama repo). Overfitting is also largely not a problem for LLM pretraining, simply because you usually have a lot more data than your compute budget allows for. Also, injecting noise during LLM pretraining is something no one does these days.


a_beautiful_rhind

Wasn't someone trying to reproduce phi here?


[deleted]

I'm interested to know if home grown LLMs also suffer from context loss on long prompts.


[deleted]

I'm working with UCSB on a research project and would love to interview anyone who has experience in this.


[deleted]

[removed]


[deleted]

Why'd you drop out?


[deleted]

[removed]


[deleted]

Sounds like you at least had a good time in IV 😁


[deleted]

I went there for 10 years. I was the Van Wilder of UCSB. They couldn't get rid of me.


MindOrbits

Check out Santa Barbara Hacker Space. I have a feeling a few members have been working with AI.


[deleted]

Is Steve still with them? Love that guy.


MindOrbits

I escaped CA a while ago, so I haven't been there in person for some time, and even when I was, who you'd see really depended on the day and time. They had a Slack channel; that's probably the best way to find out.


a_beautiful_rhind

I'm assuming they do. Nobody can train anything substantial though because $$$$.


Sartilas

Hard


LoathsomeNeanderthal

Old article but you get the idea: https://www.mosaicml.com/blog/gpt-3-quality-for-500k


fab_space

I did it from scratch, with the goal of making it able to produce valid words just by generating letter after letter, giving it a score at each generation and using that feedback to adjust the weights. In the other terminal, the generator shows me real-time results, generating a bunch of text (up to 256 chars, spaces and punctuation included). Doing this will make you aware of how hard it is to achieve a general LM based on words instead of a use-specific one based on chars. I'll try to serve this as a web app; then the reinforcement will be done by multiple users, improving the overall generation results faster than just me, but I'm sure it will be hacked by lamers very soon.


randomqhacker

Not an LLM, but I used bi-grams and tri-grams from a large corpus of the Internet, ranked by frequency, to generate likely next words. I also added some variance (think temperature) to make it not always pick the most likely words. It was fun to watch it babble in a way that almost worked grammatically, but otherwise it was pretty useless, unless you want to reinvent next word prediction for a virtual keyboard or something. [https://en.wikipedia.org/wiki/Word\_n-gram\_language\_model](https://en.wikipedia.org/wiki/Word_n-gram_language_model)
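
A minimal sketch of that kind of frequency-ranked trigram generator with a temperature-like knob (not the commenter's code; the corpus path and constants are placeholders):

```python
import random
from collections import Counter, defaultdict

# Sketch: count trigrams from a corpus, then sample the next word with
# some variance instead of always taking the most frequent continuation.
words = open("corpus.txt").read().split()   # placeholder corpus

trigrams = defaultdict(Counter)
for a, b, c in zip(words, words[1:], words[2:]):
    trigrams[(a, b)][c] += 1

def generate(seed, n=30, temperature=1.0):
    out = list(seed)                         # seed: two starting words
    for _ in range(n):
        counts = trigrams.get(tuple(out[-2:]))
        if not counts:
            break
        cands, freqs = zip(*counts.items())
        # temperature > 1 flattens the distribution, adding variance
        weights = [f ** (1.0 / temperature) for f in freqs]
        out.append(random.choices(cands, weights=weights)[0])
    return " ".join(out)

print(generate(("the", "quick"), temperature=1.5))
```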


richhoods

I think people forget what the B stands for in these LLMs. Training these models, even on cloud machines, is many times more expensive than what most people can afford.


Internet--Traveller

It's quite technical; you need to create your own datasets in JSON to train it. I watched a video of it and decided not to try it.


chibop1

Unless you're training a really tiny model like GPT-1 with 117M, no individual can train from scratch. Most people mean finetuning. For full-parameter finetuning, you can get it done with 8x A100 80GB in about 30 hours, depending on the size of the dataset.

As far as training from scratch: according to [this](https://the-decoder.com/gpt-4-architecture-datasets-costs-and-more-leaked/), the training costs for GPT-4 were around $63 million. For [Llama-2](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md), here are the GPU hours spent:

* 7B: 184,320
* 13B: 368,640
* 70B: 1,720,320
* Total: 3,311,616

If you were to rent an A100 80GB at $1.6/hr, that's $294,912 USD to train the 7B model. This only includes GPU cost; it does not include obtaining a quality dataset, extra hardware, and so on.


Revolutionalredstone

I've created a few from absolute scratch. I'm not using transformers, backprop, or even connectionism. Instead I've got a drag-net system where millions of tiny programs are generated and individually graded based on their contribution to successful prediction (collectivism). The technique is incredibly simple and doesn't even use math (no divide or anything even as complicated as that in the programs). It's also extremely fast at inference time. I've got a bunch of other ideas as well; I want to combine ideas carefully to see what's important.


minecraft_simon

why would anyone do that?


Business-Lead2679

Oh, definitely! Here is my little side project with Mistral-7b; I trained it to respond in a more readable way haha https://preview.redd.it/cb965m0go87c1.png?width=682&format=png&auto=webp&s=d53d0cded865d2d2fbf87918a87601221faed1b9


Business-Lead2679

PS: ignore the params at the top. I'm running the model in a Jupyter notebook and copy-pasting its responses into bettergpt.chat so I can see how the responses look in the classic UI.


[deleted]

Awesome! I tried mistral out but the results were really poor. Not sure how they got so much funding from A16Z with an LLM that barely works. This was a month ago so maybe it's better now.


Business-Lead2679

Did you use the instruct version with the correct prompt template? Or perhaps you used the base model (which of course won't respond correctly, as it's not instruction-tuned). I fine-tuned the base model on my dataset, and it works really well. I love how it breaks the problems down into small pieces so you really understand what they're about: https://preview.redd.it/5k7x0ah0r87c1.png?width=682&format=png&auto=webp&s=b545772f2863cd3a3ba87e4c73a2a4beb10b75f2


Business-Lead2679

And I fine-tuned it in such a way that it will address you by the name you set!


[deleted]

It must have been the base model since it was so bad. I need to try the one that actually works.


Mac-Wac-1

lol ya if you have money. Like a min of 500k

