Memory-Augmented Neural Networks are artificial intelligence systems that combine traditional neural networks with an external memory component. This memory allows the network to store and retrieve information over long periods, making it better at tasks that require remembering past events or facts. By accessing this memory, the network can solve problems that conventional neural networks…
Category: Embeddings & Representations
Sparse Coding
Sparse coding is a technique used to represent data, such as images or sounds, using a small number of active components from a larger set. Instead of using every possible feature to describe something, sparse coding only uses the most important ones, making the representation more efficient. This approach helps computers process information faster and…
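As a rough illustration of "a small number of active components from a larger set", the sketch below uses matching pursuit, one greedy way to pick those components. The dictionary `atoms`, the `signal` values and the function names are made up for this example, and the dictionary is fixed rather than learned, which full sparse coding methods usually also do.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, atoms, n_active):
    """Greedily approximate `signal` using at most `n_active` unit-norm atoms."""
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_active):
        # Pick the atom most correlated with what is still unexplained.
        scores = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        coeffs[best] += scores[best]
        # Subtract that atom's contribution from the residual.
        residual = [r - scores[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


# Hypothetical dictionary: three axis-aligned atoms plus one diagonal atom.
atoms = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.6, 0.8, 0.0],
]
signal = [3.0, 4.0, 0.1]

# Allow only one active component out of the four available.
coeffs, residual = matching_pursuit(signal, atoms, n_active=1)
print(coeffs)
print(residual)
```

With a single active component, the diagonal atom alone captures almost all of the signal, and the small remainder is left in `residual`.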
Knowledge-Augmented Models
Knowledge-augmented models are artificial intelligence systems that combine their own trained abilities with external sources of information, such as databases, documents or online resources. This approach helps the models provide more accurate, up-to-date and contextually relevant answers, especially when the information is too vast or changes frequently. By connecting to reliable knowledge sources, these models…
Self-Attention Mechanisms
Self-attention mechanisms are a method used in artificial intelligence to help a model focus on different parts of an input sequence when making decisions. Instead of treating each word or element as equally important, the mechanism learns which parts of the sequence are most relevant to each other. This allows for better understanding of context…
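A minimal sketch of the idea, assuming the scaled dot-product form used in transformers. To keep it short, queries, keys and values are all the raw input vectors (a real layer would apply learned projections first), and the toy `tokens` are invented for illustration.

```python
import math


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with identity projections (an assumption)."""
    d = len(x[0])
    outputs, weights = [], []
    for q in x:
        # Relevance of every position to the current one.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        weights.append(w)
        # Each output is a relevance-weighted average of all positions.
        outputs.append([sum(w[j] * x[j][c] for j in range(len(x))) for c in range(d)])
    return outputs, weights


# Three toy "token" vectors; the first and third are identical.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(tokens)
print(weights[0])
```

Each row of `weights` sums to 1, and the first token attends more strongly to itself and to its identical twin than to the second token.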
Energy-Based Models
Energy-Based Models are a type of machine learning model that uses an energy function to score how compatible a particular configuration of variables is. The model assigns lower energy to more likely or desirable configurations and higher energy to less likely ones. By finding the configurations that minimise the energy, the model can…
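The sketch below illustrates this with a tiny, hypothetical energy function over three binary variables (an Ising-style form chosen for this example, not a specific published model). It enumerates every configuration, picks the one with the lowest energy, and converts energies into probabilities with the usual exp(-E) weighting.

```python
import itertools
import math


def energy(state, weights, biases):
    """Lower energy means the configuration is more compatible with the model."""
    n = len(state)
    pairwise = sum(
        weights[i][j] * state[i] * state[j]
        for i in range(n)
        for j in range(i + 1, n)
    )
    local = sum(b * s for b, s in zip(biases, state))
    return -(pairwise + local)


# Hypothetical parameters: variables 0 and 1 prefer to agree, 0 and 2 to disagree.
weights = [[0.0, 1.0, -1.0], [1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
biases = [0.5, 0.0, 0.0]

states = list(itertools.product([-1, 1], repeat=3))
energies = {s: energy(s, weights, biases) for s in states}

# Inference by exhaustive search: only feasible for very small problems.
best = min(states, key=energies.get)

# Turn energies into probabilities: p(s) is proportional to exp(-E(s)).
z = sum(math.exp(-e) for e in energies.values())
probs = {s: math.exp(-e) / z for s, e in energies.items()}
print(best, energies[best], probs[best])
```

Exhaustive enumeration is only there to make the idea concrete; practical models rely on gradient-based or sampling-based search.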
Semantic Forking Mechanism
A semantic forking mechanism is a process that allows a system or software to split into different versions based on changes in meaning or interpretation, not just changes in code. It helps maintain compatibility or create new features by branching off when the intended use or definition of data or functions diverges. This mechanism is…
Attention Rollout
Attention Rollout is a technique used to visualise and interpret how information flows through the layers of an attention-based model, such as a transformer. It helps to track which parts of the input the model focuses on at each stage, giving insight into the decision-making process. This method combines attention maps from different layers to…
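One common formulation, sketched here under stated assumptions: each layer's attention map is already averaged over heads, it is mixed equally with the identity matrix to account for residual connections, and the per-layer matrices are multiplied from the first layer to the last. The two 3x3 attention maps are invented values.

```python
def matmul(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def attention_rollout(layers):
    """Combine head-averaged attention maps, ordered from first layer to last."""
    n = len(layers[0])
    rollout = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for attn in layers:
        # Mix in the identity so each token also keeps its own information
        # (a stand-in for the residual connection).
        mixed = [
            [0.5 * attn[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
            for i in range(n)
        ]
        rollout = matmul(mixed, rollout)
    return rollout


# Hypothetical attention maps for a 3-token input; every row sums to 1.
layer1 = [[0.8, 0.1, 0.1], [0.3, 0.4, 0.3], [0.2, 0.2, 0.6]]
layer2 = [[0.5, 0.25, 0.25], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]

rollout = attention_rollout([layer1, layer2])
for row in rollout:
    print([round(v, 3) for v in row])
```

Row i of `rollout` estimates how much each input token contributes to position i after both layers.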
Recursive Neural Networks
Recursive Neural Networks are a type of artificial neural network designed to process data with a hierarchical or tree-like structure. They work by applying the same set of weights recursively over structured inputs, such as sentences broken into phrases or sub-phrases. This allows the network to capture relationships and meanings within complex data structures, making…
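A minimal sketch of the recursive step, assuming a binary parse tree written as nested tuples and a single `tanh` composition layer. The embeddings, weight matrix `W` and bias `b` are arbitrary illustrative numbers, not trained values.

```python
import math


def compose(left, right, W, b):
    """Merge two child vectors into one parent vector with shared weights."""
    child = left + right  # concatenate the two child vectors
    return [
        math.tanh(sum(w * c for w, c in zip(row, child)) + bias)
        for row, bias in zip(W, b)
    ]


def encode(tree, embeddings, W, b):
    """Recursively encode a tree whose leaves are words."""
    if isinstance(tree, str):
        return embeddings[tree]
    left, right = tree
    return compose(
        encode(left, embeddings, W, b),
        encode(right, embeddings, W, b),
        W,
        b,
    )


# Hypothetical 2-dimensional word vectors and composition parameters.
embeddings = {"the": [0.1, 0.3], "cat": [0.7, -0.2], "sleeps": [-0.4, 0.5]}
W = [[0.5, -0.3, 0.2, 0.1], [-0.1, 0.4, 0.3, -0.2]]
b = [0.0, 0.1]

# "the cat" is combined first, then the result is combined with "sleeps".
tree = (("the", "cat"), "sleeps")
sentence_vec = encode(tree, embeddings, W, b)
print(sentence_vec)
```

The same `W` and `b` are reused at every node, which is what lets the network handle trees of any shape or depth.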
Masked Modelling
Masked modelling is a technique used in machine learning where parts of the input data are hidden or covered, and the model is trained to predict these missing parts. This approach helps the model to understand the relationships and patterns within the data by forcing it to learn from the context. It is commonly used…
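The data-preparation side of this can be sketched as follows. The `[MASK]` placeholder and the 30% masking rate are assumptions chosen for illustration, and the model that would predict the hidden tokens is left out.

```python
import random

MASK = "[MASK]"  # placeholder token; the exact symbol is an assumption


def mask_tokens(tokens, mask_prob, rng):
    """Hide a random subset of tokens and remember what was hidden."""
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[i] = tok  # the model would be trained to predict this
        else:
            masked.append(tok)
    return masked, targets


tokens = "the cat sat on the mat near the door".split()
rng = random.Random(0)  # fixed seed so the example is repeatable
masked, targets = mask_tokens(tokens, mask_prob=0.3, rng=rng)
print(masked)
print(targets)
```

During training, the model receives `masked` as input and is scored only on how well it recovers the entries in `targets`.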
Positional Encoding
Positional encoding is a technique used in machine learning models, especially transformers, to tell the model the order of the data, such as words in a sentence. Since transformers process all words at once, they need a way to know which word comes first, second, and so on. Positional encoding adds special values to each input so…
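One widely used choice for those "special values" is the fixed sinusoidal scheme from the original transformer design, sketched below. Other schemes exist (for example learned position embeddings), and the function name and sizes here are illustrative.

```python
import math


def sinusoidal_positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table of position-dependent values."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            # Each pair of dimensions shares one wavelength.
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            # Even dimensions use sine, odd dimensions use cosine.
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


pe = sinusoidal_positional_encoding(seq_len=5, d_model=4)
for pos, row in enumerate(pe):
    print(pos, [round(v, 3) for v in row])
```

Each row would be added element-wise to the embedding of the token at that position, so identical words at different positions end up with different inputs.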