Category: Model Optimisation Techniques

Residual Connections

Residual connections are a technique used in deep neural networks where the input to a layer (or block of layers) is added to its output. This helps the network learn more effectively, especially as it becomes deeper. By allowing information to skip layers, residual connections make it easier for the network to avoid problems like vanishing gradients, which can otherwise make very deep networks difficult to train.
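As a minimal sketch, a residual block in PyTorch might look like the following; the layer sizes and the two-linear-layer body are illustrative assumptions rather than a fixed recipe:

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """A small block whose input is added back to its own output."""
    def __init__(self, dim):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(dim, dim),
            nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, x):
        # The skip connection: gradients flow directly through the addition,
        # which helps counteract vanishing gradients in deep stacks of blocks.
        return x + self.body(x)

x = torch.randn(8, 64)          # hypothetical batch of 8 feature vectors
block = ResidualBlock(64)
print(block(x).shape)           # torch.Size([8, 64])

Because the block's output has the same shape as its input, many such blocks can be stacked to build a deep network.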

Gradient Accumulation

Gradient accumulation is a technique used in training neural networks where gradients from several smaller batches are summed before updating the model’s weights. This allows the effective batch size to be larger than what would normally fit in memory. It is especially useful when hardware limitations prevent the use of large batch sizes during training.
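A minimal PyTorch-style training loop illustrating the idea follows; the model, the synthetic data, and the choice of four accumulation steps are illustrative assumptions:

import torch
import torch.nn as nn

model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

accumulation_steps = 4          # effective batch size = 4 x micro-batch size
optimizer.zero_grad()

for step in range(16):
    # A hypothetical micro-batch; in practice this would come from a DataLoader.
    inputs = torch.randn(8, 10)
    targets = torch.randn(8, 1)

    loss = loss_fn(model(inputs), targets)
    # Scale the loss so the accumulated gradient matches one large batch.
    (loss / accumulation_steps).backward()

    # Only update the weights after gradients from several micro-batches
    # have been summed.
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()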

Parameter-Efficient Fine-Tuning

Parameter-efficient fine-tuning is a machine learning technique that adapts large pre-trained models to new tasks or data by modifying only a small portion of their internal parameters. Instead of retraining the entire model, this approach updates selected components, which makes the process faster and less resource-intensive. This method is especially useful when working with very large models, where full fine-tuning would demand far more memory and compute than most hardware can provide.
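One simple form of the idea, sketched below in PyTorch, freezes the pre-trained backbone and trains only a small task head; the backbone, head, and layer sizes are hypothetical stand-ins, and methods such as adapters or LoRA follow the same principle with different trainable components:

import torch
import torch.nn as nn

# Stand-in for a large pre-trained backbone (hypothetical sizes).
backbone = nn.Sequential(nn.Linear(128, 128), nn.ReLU(), nn.Linear(128, 128))

# Freeze every pre-trained parameter so it is not updated during fine-tuning.
for param in backbone.parameters():
    param.requires_grad = False

# Only this small task-specific head is trainable.
head = nn.Linear(128, 2)
model = nn.Sequential(backbone, head)

trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-3)

x = torch.randn(4, 128)                     # hypothetical fine-tuning batch
labels = torch.tensor([0, 1, 0, 1])
loss = nn.functional.cross_entropy(model(x), labels)
loss.backward()
optimizer.step()

print(sum(p.numel() for p in trainable), "trainable parameters out of",
      sum(p.numel() for p in model.parameters()))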

Hyperparameter Optimisation

Hyperparameter optimisation is the process of finding the best settings for a machine learning model to improve its performance. These settings, called hyperparameters, are not learned from the data but chosen before training begins. By carefully selecting these values, the model can make more accurate predictions and avoid problems like overfitting or underfitting.
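One common way to do this in practice is a grid search over candidate values. The sketch below uses scikit-learn's GridSearchCV on a synthetic dataset; the specific model and parameter grid are illustrative assumptions only:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Candidate hyperparameter values, chosen before any training begins.
param_grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [3, 5, None],
}

# Try every combination and score each with 3-fold cross-validation.
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X, y)
print(search.best_params_, search.best_score_)

Cross-validation on held-out folds is what guards against picking settings that merely overfit the training data.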

Knowledge Distillation

Knowledge distillation is a machine learning technique where a large, complex model teaches a smaller, simpler model to perform the same task. The large model, called the teacher, passes its knowledge to the smaller student model by providing guidance during training. This helps the student model achieve nearly the same performance as the teacher while being much smaller and cheaper to run.
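A minimal sketch of one common distillation setup in PyTorch follows; the teacher and student architectures, the temperature, and the mixing weight alpha are illustrative assumptions:

import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical teacher (large) and student (small) classifiers.
teacher = nn.Sequential(nn.Linear(32, 256), nn.ReLU(), nn.Linear(256, 10))
student = nn.Sequential(nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 10))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature, alpha = 2.0, 0.5   # assumed distillation hyperparameters

x = torch.randn(16, 32)                      # hypothetical training batch
labels = torch.randint(0, 10, (16,))

with torch.no_grad():                        # the teacher is not trained
    teacher_logits = teacher(x)
student_logits = student(x)

# Soft-target loss: match the teacher's softened class probabilities.
soft_loss = F.kl_div(
    F.log_softmax(student_logits / temperature, dim=-1),
    F.softmax(teacher_logits / temperature, dim=-1),
    reduction="batchmean",
) * temperature ** 2

# Hard-target loss: standard cross-entropy on the true labels.
hard_loss = F.cross_entropy(student_logits, labels)

loss = alpha * soft_loss + (1 - alpha) * hard_loss
loss.backward()
optimizer.step()

Raising the temperature softens the teacher's probability distribution, which exposes how it ranks the wrong classes and gives the student richer guidance than the labels alone.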