The first HumemAI research paper, A machine with human-like memory systems, didn’t use any machine learning. All the memory management policies used there were handcrafted. Handcrafted policies are fine as long as they work, but they’re rigid and can’t adapt to a new environment. That’s why in this paper we used reinforcement learning (RL) to learn these policies. RL is one of the three pillars of machine learning, alongside supervised learning and unsupervised learning, but it’s a bit different from the other two. Its learning objective is reward maximization, not a simple maximum likelihood, meaning that the agent will do whatever it takes to maximize its rewards. This can lead to superhuman behavior, as with AlphaGo!
We introduced another toy environment called RoomEnv-v1, which is a bit more complicated than its predecessor RoomEnv-v0.
Abstract: Inspired by the cognitive science theory of the explicit human memory systems, we have modeled an agent with short-term, episodic, and semantic memory systems, each of which is modeled with a knowledge graph. To evaluate this system and analyze the behavior of this agent, we designed and released our own reinforcement learning agent environment, “the Room”, where an agent has to learn how to encode, store, and retrieve memories to maximize its return by answering questions. We show that our deep Q-learning based agent successfully learns whether a short-term memory should be forgotten, or rather be stored in the episodic or semantic memory systems. Our experiments indicate that an agent with human-like memory systems can outperform an agent without this memory structure in the environment.
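To give a feel for what “learning whether a short-term memory should be forgotten or stored” means, here is a minimal sketch in Python. It is not the code from the paper: the paper uses deep Q-learning in the Room environment, while this sketch uses plain tabular Q-learning on a made-up two-state toy problem. The state names (`one_off`, `recurring`), the `toy_reward` function, and all hyperparameters are assumptions chosen purely for illustration. Only the three actions (forget, store as episodic, store as semantic) come from the abstract.

```python
import random

# The three choices the agent has for a short-term memory (from the abstract).
ACTIONS = ["forget", "episodic", "semantic"]


def make_q_table(states):
    """One Q-value per (state, action) pair, all starting at zero."""
    return {s: {a: 0.0 for a in ACTIONS} for s in states}


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[state][a])


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard Q-learning update toward reward + gamma * max_a Q(next_state, a)."""
    best_next = max(q_table[next_state].values())
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


def toy_reward(state, action):
    """Hypothetical reward, NOT the Room environment's reward.

    Assumption for this toy: one-off observations are best kept as episodic
    memories, recurring ones as semantic memories, and forgetting earns nothing.
    """
    best = {"one_off": "episodic", "recurring": "semantic"}
    return 1.0 if action == best[state] else 0.0


def train(steps=20000, epsilon=0.2, seed=0):
    """Run Q-learning on the toy problem and return the learned Q-table."""
    rng = random.Random(seed)
    states = ["one_off", "recurring"]
    q_table = make_q_table(states)
    state = rng.choice(states)
    for _ in range(steps):
        action = choose_action(q_table, state, epsilon, rng)
        reward = toy_reward(state, action)
        next_state = rng.choice(states)  # the next observation arrives at random
        q_update(q_table, state, action, reward, next_state)
        state = next_state
    return q_table


def greedy_policy(q_table):
    """The action with the highest Q-value in each state."""
    return {s: max(ACTIONS, key=lambda a: q_table[s][a]) for s in q_table}


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Nobody tells the agent which memory system to use. It only sees rewards, and the policy falls out of maximizing them. In the paper the reward comes from answering questions correctly, and a neural network takes the place of the table.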
Check out the paper at https://arxiv.org/abs/2212.02098