Jul 28, 2024

Reflecting on my experience with learning new technology

4 Comments

Thank you for sharing, Cornellius! One to bookmark for myself!

You are welcome, Mandy. I am glad you find it useful.

The training process for large models mainly includes the following stages:

1. Pretraining Stage

2. Tokenizer Training

3. Language Model Pretraining

4. Dataset Cleaning

5. Model Performance Evaluation

6. Instruction Tuning Stage

7. Open Source Dataset Organization

8. Model Evaluation Methods

Thank you for the input! It's certainly true that the cycle for training an LLM follows the steps you pointed out.

Non-Brand Data
