Neural combinatorial optimisation uses neural networks to solve problems where the goal is to find the best combination or arrangement among many possibilities. Such problems are hard for exact methods because the number of candidate solutions grows too quickly to check them one by one. By learning from examples, neural networks can quickly…
Category: Deep Learning
Intrinsic Motivation in RL
Intrinsic motivation in reinforcement learning refers to a method where an agent is encouraged to explore and learn, not just by external rewards but also by its own curiosity or internal drives. Unlike traditional reinforcement learning, which relies mainly on rewards given for achieving specific goals, intrinsic motivation gives the agent additional signals that reward…
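As a rough illustration of an internal reward signal, the sketch below adds a count-based novelty bonus to the external reward. The class name `CountBonus`, the parameter `beta`, and the choice of a `beta / sqrt(visits)` bonus are assumptions made for this example; it is one simple form of intrinsic motivation, not the only one.

```python
import math
from collections import Counter


class CountBonus:
    """Hypothetical count-based novelty bonus: beta / sqrt(visits to the state)."""

    def __init__(self, beta=0.1):
        self.beta = beta          # weight of the intrinsic signal (assumed name)
        self.visits = Counter()   # how often each state has been seen

    def total_reward(self, state, extrinsic_reward):
        # Record the visit, then pay a bonus that shrinks as the state
        # becomes familiar, so rarely seen states are worth more.
        self.visits[state] += 1
        intrinsic = self.beta / math.sqrt(self.visits[state])
        return extrinsic_reward + intrinsic
```

With this kind of bonus, an agent still collects something for reaching a new state even when the environment itself pays nothing.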
Soft Actor-Critic
Soft Actor-Critic is a reinforcement learning algorithm that helps agents learn to make decisions by balancing two goals: getting rewards and staying flexible in their choices. It follows the maximum entropy principle, which means it encourages the agent to try different actions rather than always picking the same one. This…
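The balance between reward and flexibility can be sketched as an entropy-regularised state value. The example below assumes a small discrete action set for readability (Soft Actor-Critic is most often used with continuous actions), and the names `soft_state_value` and `alpha` are illustrative.

```python
import math


def soft_state_value(q_values, probs, alpha=0.2):
    """Expected Q-value plus alpha times the policy's entropy (discrete actions)."""
    value = 0.0
    for q, p in zip(q_values, probs):
        if p > 0.0:  # actions with zero probability contribute nothing
            value += p * (q - alpha * math.log(p))
    return value
```

When two actions have the same Q-value, a policy that spreads its probability across both scores higher than one that always picks a single action, which is the sense in which the objective rewards staying flexible.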
Distributional Reinforcement Learning
Distributional Reinforcement Learning is a method in machine learning where an agent learns not just the average result of its actions, but the full range of possible outcomes and how likely each one is. Instead of focusing solely on expected rewards, this approach models the entire distribution of rewards the agent might receive. This allows…
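To see what is lost by tracking only the average, the sketch below represents a return distribution as a list of possible outcomes (`atoms`) with their probabilities. The function name and the representation are assumptions chosen for illustration.

```python
def mean_and_variance(atoms, probs):
    """Summarise a discrete return distribution by its mean and variance."""
    mean = sum(z * p for z, p in zip(atoms, probs))
    variance = sum(p * (z - mean) ** 2 for z, p in zip(atoms, probs))
    return mean, variance


# Two hypothetical actions with the same expected return but different risk.
safe = mean_and_variance([0.0], [1.0])
risky = mean_and_variance([-10.0, 10.0], [0.5, 0.5])
```

Both actions have the same mean, so an agent that only learns expected rewards cannot tell them apart, while an agent that models the full distribution can.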
Prioritised Experience Replay
Prioritised Experience Replay is a technique used in machine learning, particularly in reinforcement learning, to improve how an algorithm learns from past experiences. Instead of treating all previous experiences as equally important, this method ranks them based on how much they can help the learning process. The algorithm then focuses more on experiences that are…
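A minimal sketch of the sampling step, assuming priority is measured by the absolute temporal-difference error of each stored experience (the common proportional variant). The function names and the defaults for `alpha` and `eps` are illustrative.

```python
import random


def sampling_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Turn TD errors into sampling probabilities: p_i^alpha / sum_k p_k^alpha."""
    # eps keeps experiences with zero error from never being sampled.
    priorities = [(abs(e) + eps) ** alpha for e in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def sample_indices(td_errors, batch_size, alpha=0.6, rng=random):
    """Draw buffer indices, favouring experiences with larger TD error."""
    probs = sampling_probabilities(td_errors, alpha)
    return rng.choices(range(len(td_errors)), weights=probs, k=batch_size)
```

Setting `alpha` to 0 recovers uniform sampling, while larger values focus more strongly on high-error experiences. Full implementations usually also correct for the resulting sampling bias with importance weights, which this sketch leaves out.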
Dueling DQN
Dueling DQN is a type of deep reinforcement learning algorithm that improves upon traditional Deep Q-Networks by separating the estimation of the value of a state from the advantages of possible actions. This means it learns not just how good an action is in a particular state, but also how valuable the state itself is,…
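The way the two estimates are recombined can be sketched as follows. This assumes the widely used mean-subtraction form, Q(s, a) = V(s) + A(s, a) - mean(A); the function name is illustrative, and plain lists stand in for network outputs.

```python
def dueling_q_values(state_value, advantages):
    """Combine a state value and per-action advantages into Q-values."""
    # Subtracting the mean advantage keeps the split between V and A
    # well defined: adding a constant to every advantage changes nothing.
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]
```

For example, a state value of 2.0 with advantages `[1.0, 0.0, -1.0]` gives Q-values `[3.0, 2.0, 1.0]`.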
Double Deep Q-Learning
Double Deep Q-Learning is an improvement on the Deep Q-Learning algorithm used in reinforcement learning. It helps computers learn to make better decisions by reducing errors that can happen when estimating future rewards. By using two separate networks to choose and evaluate actions, it avoids overestimating how good certain options are, making learning more stable…
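A minimal sketch of the learning target, assuming the usual arrangement in which the online network chooses the next action and the target network evaluates it. Plain lists stand in for the two networks' outputs on the next state, and the function names are illustrative.

```python
def double_dqn_target(reward, gamma, q_online_next, q_target_next, done=False):
    """Double DQN target: select with the online net, evaluate with the target net."""
    if done:
        return reward  # no future reward after a terminal state
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


def dqn_target(reward, gamma, q_target_next, done=False):
    """Standard DQN target for comparison: one network both selects and evaluates."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)
```

With reward 1.0, discount 0.5, online values `[1.0, 3.0, 2.0]` and target values `[10.0, 4.0, 20.0]`, the Double DQN target is 3.0, whereas the standard target is 11.0 because it takes the largest (possibly overestimated) target value directly.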
Deep Q-Networks (DQN)
Deep Q-Networks, or DQNs, are a reinforcement learning method that combines Q-learning with deep neural networks. A DQN uses a neural network to estimate the value of taking each action in a given situation, which helps the agent decide what to do next. This method allows machines…
On-Policy Reinforcement Learning
On-policy reinforcement learning is a method where an agent learns to make decisions by following and improving the same policy that it uses to interact with its environment. The agent updates its strategy based on the actions it actually takes, rather than on data generated by a different policy. This approach helps the agent gradually improve its behaviour through…
Off-Policy Reinforcement Learning
Off-policy reinforcement learning is a method where an agent learns the best way to make decisions by observing actions that may not be the ones it would choose itself. This means the agent can learn from data collected by other agents or from past actions, rather than only from its own current behaviour. This approach…
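The difference between on-policy and off-policy learning shows up in the update target. The sketch below uses SARSA and Q-learning as familiar representatives of each family; the function names are illustrative, and `q_next` is assumed to hold the current value estimates for each action in the next state.

```python
def sarsa_target(reward, gamma, q_next, next_action):
    """On-policy target: uses the action the agent actually takes next."""
    return reward + gamma * q_next[next_action]


def q_learning_target(reward, gamma, q_next):
    """Off-policy target: uses the best next action, whoever chose the data."""
    return reward + gamma * max(q_next)
```

Because the Q-learning target does not depend on which action was really taken next, it can be computed from experience gathered by another agent or by an older version of the policy.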