Category: Model Training & Tuning

Zero Resource Learning

Zero Resource Learning is a method in artificial intelligence where systems learn from raw data without needing labelled examples or pre-existing resources like dictionaries. Instead of relying on human-annotated data, these systems discover patterns and structure by themselves. This approach is especially useful for languages or domains where labelled data is scarce or unavailable.

Gradient Clipping

Gradient clipping is a technique used when training machine learning models to stop gradients from becoming too large during backpropagation. Very large gradients can destabilise training and make the model’s learning process unreliable. A maximum threshold is set, and any gradient exceeding it is scaled down, helping to keep the learning process steady and…
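As a rough illustration of the scaling step, the sketch below clips by the combined L2 norm of all gradients, which is one common variant (clipping each value independently is another). The function name `clip_by_global_norm` and the example numbers are assumptions made for this sketch, not part of any particular library.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale gradients so their combined L2 norm is at most max_norm.

    Hypothetical helper for illustration; returns the (possibly rescaled)
    gradients and the norm measured before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        # Already within the threshold: leave the gradients untouched.
        return list(grads), total_norm
    scale = max_norm / total_norm
    # One shared factor keeps the gradient direction and shrinks its length.
    return [g * scale for g in grads], total_norm

grads = [3.0, 4.0]  # combined L2 norm is 5.0
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
small, _ = clip_by_global_norm([0.1, 0.2], max_norm=1.0)
```

Because every component is multiplied by the same factor, the update still points in the original direction; only its size is capped.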

Neural Tangent Kernel

The Neural Tangent Kernel (NTK) is a mathematical tool used to study and predict how very wide neural networks learn. It simplifies the behaviour of neural networks by treating them like a type of kernel method, which is a well-understood class of machine learning models. Using the NTK, researchers can analyse training dynamics and generalisation…
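For reference, the kernel is usually written as an inner product of parameter gradients. The notation here is an assumption of this sketch: $f(x;\theta)$ is a scalar network output and $\theta$ collects all of its parameters.

```latex
\Theta(x, x') \;=\; \nabla_\theta f(x;\theta)^{\top}\, \nabla_\theta f(x';\theta)
\;=\; \sum_{p} \frac{\partial f(x;\theta)}{\partial \theta_p}\,
               \frac{\partial f(x';\theta)}{\partial \theta_p}
```

As a simple check, for a model that is linear in its parameters, $f(x;\theta) = \theta^{\top}\phi(x)$, the gradient is $\phi(x)$ and the kernel reduces to the ordinary feature kernel $\phi(x)^{\top}\phi(x')$.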

LoRA Fine-Tuning

LoRA Fine-Tuning is a method used to adjust large pre-trained artificial intelligence models, such as language models, with less computing power and memory. Instead of changing all the model’s weights, LoRA keeps them frozen and adds small, trainable low-rank matrices that adapt the model for new tasks. This approach makes it faster and cheaper to customise models for specific needs…
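A minimal sketch of the idea for a single linear layer, in plain Python with no training loop. It assumes the usual low-rank formulation, in which a frozen weight matrix `W` is left alone and a trainable update `B @ A`, scaled by `alpha / rank`, is added to its output. The class name `LoRALinear`, the layer sizes, and the initialisation values are illustrative assumptions.

```python
import random

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

class LoRALinear:
    """Hypothetical linear layer: frozen W plus a trainable low-rank update."""

    def __init__(self, W, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W  # pre-trained weights, never updated
        # A starts small and random, B starts at zero, so the layer
        # initially behaves exactly like the frozen base layer.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)
        update = matvec(self.B, matvec(self.A, x))  # B @ (A @ x)
        return [b + self.scale * u for b, u in zip(base, update)]

    def num_trainable(self):
        # Only A and B would receive gradient updates.
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

d_out, d_in, rank = 8, 8, 2
W = [[float(i == j) for j in range(d_in)] for i in range(d_out)]  # identity as a stand-in
layer = LoRALinear(W, rank=rank, alpha=2.0)
x = [1.0] * d_in
y_init = layer.forward(x)
```

In this toy case the adapter has 32 trainable values against 64 in the full matrix; the saving grows quickly with layer size, since the adapter scales with `rank * (d_in + d_out)` rather than `d_in * d_out`.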

Gradient Accumulation

Gradient accumulation is a technique used in training neural networks where gradients from several smaller batches are summed before updating the model’s weights. This allows the effective batch size to be larger than what would normally fit in memory. It is especially useful when hardware limitations prevent the use of large batch sizes during training.
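The sketch below illustrates the equivalence on a one-parameter least-squares model, assuming a mean-squared-error loss and equally sized micro-batches. Each micro-batch gradient is divided by the number of micro-batches before being added, so the accumulated value matches the gradient of the mean loss over the full batch; summing without that division would simply scale the step. The function names and data are assumptions for illustration.

```python
def mse_grad(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def accumulated_step(w, xs, ys, micro_batch_size, lr):
    """One weight update built from several micro-batch gradients.

    Hypothetical helper; assumes len(xs) is divisible by micro_batch_size.
    """
    n_micro = len(xs) // micro_batch_size
    accum = 0.0
    for k in range(n_micro):
        lo, hi = k * micro_batch_size, (k + 1) * micro_batch_size
        # Accumulate instead of updating; only one micro-batch is "in memory".
        accum += mse_grad(w, xs[lo:hi], ys[lo:hi]) / n_micro
    # A single update after all micro-batches have been processed.
    return w - lr * accum

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w_full = 0.0 - 0.01 * mse_grad(0.0, xs, ys)  # one step on the full batch
w_accum = accumulated_step(0.0, xs, ys, micro_batch_size=2, lr=0.01)
```

Both paths arrive at the same updated weight, which is the sense in which the effective batch size is four even though only two examples are processed at a time.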