Category: Model Training & Tuning

Neural Tangent Kernel

The Neural Tangent Kernel (NTK) is a mathematical tool used to study and predict how very large neural networks learn. It simplifies the behaviour of neural networks by treating them like a type of kernel method, which is a well-understood class of machine learning models. Using the NTK, researchers can analyse training dynamics and generalisation, particularly for very wide networks, whose training behaviour becomes approximately linear in their parameters.
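
Concretely, for a model f(x; θ) the empirical NTK between two inputs is the inner product of their parameter gradients, Θ(x, x′) = ⟨∇θ f(x), ∇θ f(x′)⟩. The sketch below computes one such entry in PyTorch; the tiny network and random inputs are illustrative placeholders, not part of any standard API.

```python
import torch

# Hypothetical toy network; the empirical NTK is defined for any
# differentiable model f(x; theta) with a scalar output.
torch.manual_seed(0)
model = torch.nn.Sequential(
    torch.nn.Linear(4, 64),
    torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)

def param_grad(x):
    """Flattened gradient of the scalar output f(x) w.r.t. all parameters."""
    out = model(x).squeeze()
    grads = torch.autograd.grad(out, tuple(model.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

x1, x2 = torch.randn(1, 4), torch.randn(1, 4)

# Empirical NTK entry: Theta(x1, x2) = <df/dtheta(x1), df/dtheta(x2)>
ntk_value = param_grad(x1) @ param_grad(x2)
print(f"Theta(x1, x2) = {ntk_value.item():.4f}")
```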

LoRA Fine-Tuning

LoRA (Low-Rank Adaptation) fine-tuning is a method used to adjust large pre-trained artificial intelligence models, such as language models, with less computing power and memory. Instead of changing all the model’s weights, LoRA freezes them and adds small, trainable low-rank matrices that adapt the model for new tasks. This approach makes it faster and cheaper to customise models for specific needs.
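
The key idea is to represent the weight update as a product of two small matrices, W + (α/r)·BA, where only A and B are trained. Below is a minimal sketch of a LoRA-wrapped linear layer in PyTorch; the rank r and scaling α shown are illustrative defaults, not prescribed values.

```python
import torch

class LoRALinear(torch.nn.Module):
    """Minimal LoRA sketch: output = base(x) + (alpha / r) * x @ A^T @ B^T."""

    def __init__(self, base: torch.nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze the pre-trained weights
        self.A = torch.nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = torch.nn.Parameter(torch.zeros(base.out_features, r))  # zero-init: no change at start
        self.scale = alpha / r

    def forward(self, x):
        # Frozen path plus the trainable low-rank update.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(torch.nn.Linear(512, 512))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable} of {total} parameters")  # 8192 of 270848
```

Because B is initialised to zero, the wrapped layer initially behaves exactly like the pre-trained one, and training only adjusts the low-rank factors.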

Gradient Accumulation

Gradient accumulation is a technique used in training neural networks where gradients from several smaller batches are summed before updating the model’s weights. This allows the effective batch size to be larger than what would normally fit in memory. It is especially useful when hardware limitations prevent the use of large batch sizes during training.
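
A minimal PyTorch sketch of the pattern: each micro-batch loss is divided by the number of accumulation steps so the summed gradients match a large-batch average, and the optimiser steps only once per group. The model, data, and step counts here are placeholders.

```python
import torch

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.MSELoss()
accum_steps = 4  # effective batch size = 8 * 4 = 32

optimizer.zero_grad()
for step in range(100):
    x, y = torch.randn(8, 10), torch.randn(8, 1)  # one micro-batch
    loss = loss_fn(model(x), y) / accum_steps     # scale so the sum averages
    loss.backward()                               # gradients accumulate in .grad
    if (step + 1) % accum_steps == 0:
        optimizer.step()                          # one update per 4 micro-batches
        optimizer.zero_grad()
```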

Dynamic Prompt Tuning

Dynamic prompt tuning is a technique used to improve the responses of artificial intelligence language models by adjusting the instructions or prompts given to them. Instead of using a fixed prompt, the system can automatically modify or optimise the prompt based on context, user feedback, or previous interactions. This helps the AI generate more accurate and relevant responses.
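
There is no single standard implementation; one simple approach is to treat prompt variants like arms of a bandit and favour whichever earns the best average feedback. The sketch below is hypothetical throughout: the variants, feedback scores, and epsilon value are all illustrative.

```python
import random

class DynamicPromptSelector:
    """Hypothetical epsilon-greedy selector over candidate prompt variants."""

    def __init__(self, variants, epsilon=0.1):
        # variant -> [sum of feedback scores, number of uses]
        self.stats = {v: [0.0, 0] for v in variants}
        self.epsilon = epsilon

    def choose(self):
        untried = [v for v, (_, count) in self.stats.items() if count == 0]
        if untried:
            return random.choice(untried)           # try every variant once
        if random.random() < self.epsilon:
            return random.choice(list(self.stats))  # occasional exploration
        # Otherwise exploit the variant with the best average feedback.
        return max(self.stats, key=lambda v: self.stats[v][0] / self.stats[v][1])

    def record(self, variant, score):
        self.stats[variant][0] += score
        self.stats[variant][1] += 1

selector = DynamicPromptSelector([
    "Answer concisely: {question}",
    "Explain step by step, then answer: {question}",
])
variant = selector.choose()
prompt = variant.format(question="What is gradient accumulation?")
# ... send `prompt` to the model, derive a feedback score in [0, 1] ...
selector.record(variant, 0.9)
```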

Parameter-Efficient Fine-Tuning

Parameter-efficient fine-tuning is a machine learning technique that adapts large pre-trained models to new tasks or data by modifying only a small portion of their internal parameters. Instead of retraining the entire model, this approach updates selected components, which makes the process faster and less resource-intensive. This method is especially useful when working with very large models, where full fine-tuning would be slow and expensive.
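
LoRA (above) is one example; another common pattern is to freeze the pre-trained backbone and train only a small new head. A minimal PyTorch sketch, with a stand-in backbone in place of a real pre-trained model:

```python
import torch

# Stand-in for a pre-trained backbone; in practice this would be loaded
# from a checkpoint rather than built from scratch.
backbone = torch.nn.Sequential(
    torch.nn.Linear(768, 768), torch.nn.ReLU(), torch.nn.Linear(768, 768)
)
head = torch.nn.Linear(768, 2)  # new, task-specific parameters

for p in backbone.parameters():
    p.requires_grad = False  # the backbone stays fixed during fine-tuning

model = torch.nn.Sequential(backbone, head)
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-3)  # only the head is updated

total = sum(p.numel() for p in model.parameters())
print(f"training {sum(p.numel() for p in trainable)} of {total} parameters")
```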