TL;DR: This paper considers adaptive control architectures that integrate active sensory-motor systems with decision systems based on reinforcement learning and proposes a new decision system that overcomes the effects of perceptual aliasing.
Abstract: This paper considers adaptive control architectures that integrate active sensory-motor systems with decision systems based on reinforcement learning. One unavoidable consequence of active perception is that the agent's internal representation often confounds external world states. We call this phenomenon perceptual aliasing and show that it destabilizes existing reinforcement learning algorithms with respect to the optimal decision policy. A new decision system that overcomes these difficulties is described. The system incorporates a perceptual subcycle within the overall decision cycle and uses a modified learning algorithm to suppress the effects of perceptual aliasing. The result is a control architecture that learns not only how to solve a task but also where to focus its attention in order to collect necessary sensory information.
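To make the definition of perceptual aliasing in the abstract above concrete, here is a minimal sketch, not the paper's algorithm. The corridor world, the observation labels, and the helper `aliased_observations` are all hypothetical and chosen only for illustration. The function flags observations shared by world states whose optimal actions differ, which is the situation in which a policy that maps observations directly to actions cannot be optimal in every state.

```python
# Hypothetical toy world (assumption, not from the paper): two rooms look
# identical to the agent's sensors but require opposite actions.
OBSERVATION = {
    "left_room": "wall_ahead",
    "right_room": "wall_ahead",
    "hall": "open",
}
OPTIMAL_ACTION = {
    "left_room": "go_right",
    "right_room": "go_left",
    "hall": "go_forward",
}


def aliased_observations(observe, optimal):
    """Return observations shared by world states whose optimal actions differ."""
    actions_by_obs = {}
    for state, obs in observe.items():
        # Collect the set of optimal actions seen under each observation.
        actions_by_obs.setdefault(obs, set()).add(optimal[state])
    # An observation is problematic when it hides a conflict between actions.
    return {obs for obs, acts in actions_by_obs.items() if len(acts) > 1}


conflicts = aliased_observations(OBSERVATION, OPTIMAL_ACTION)
```

In this toy world only `"wall_ahead"` is flagged: the agent would need additional sensory information (for instance, by redirecting its attention) to tell the two rooms apart.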
TL;DR: After critiquing the decision cycle model, the received theory of interaction in human-computer interaction, the paper shows that humans have several ways of interacting with their environments that resist accommodation in that model.
Abstract: Multimedia technology offers instructional designers an unprecedented opportunity to create richly interactive learning environments. With greater design freedom comes complexity. The standard answer to the problems of too much choice, disorientation, and complex navigation is thought to lie in the way we design the interactivity in a system. Unfortunately, the theory of interactivity is at an early stage of development. After critiquing the decision cycle model of interaction – the received theory in human-computer interaction – I present arguments and observational data to show that humans have several ways of interacting with their environments that resist accommodation in the decision cycle model. These additional ways of interacting include preparing the environment, maintaining the environment, and reshaping the cognitive congeniality of the environment. Understanding how these actions simplify the computational complexity of our mental processes is the first step in designing the right sort of resources and scaffolding necessary for tractable learner-controlled learning environments.
TL;DR: In this paper, the authors present a Recognitional Planning Model (RPM) based on a series of observations of Army, Marine, and Naval command post exercises, which is intended as a prescriptive model to increase the speed of planning.
Abstract: Planning under conditions of uncertainty, time pressure, and other stressors is a complex cognitive activity for military command post staff members. The Army and the Marine Corps have developed formal planning models intended as step-by-step guides. However, these models are inconsistent with the actual strategies of skilled planners, and they slow down the decision cycle. As a result, the formal models are usually ignored in practice in order to generate a faster tempo. Yet some procedure is needed to direct inexperienced staff and to coordinate the actions of a planning team. We present a Recognitional Planning Model (RPM) based on a series of observations of Army, Marine, and Naval command post exercises. The RPM applies the concept of recognitional decision making. The RPM is intended as a prescriptive model to increase the speed of planning, but it is also a descriptive model capturing the strategies of experienced planning teams.
TL;DR: This paper describes the design of the reinforcement-learning-based domination team (RL-DOT), a nonplayer character (NPC) team for playing Unreal Tournament (UT) Domination games, and uses a Q-learning-style algorithm to learn the optimal decision-making policy.
Abstract: In this paper, we describe the design of reinforcement-learning-based domination team (RL-DOT), a nonplayer character (NPC) team for playing Unreal Tournament (UT) Domination games. In RL-DOT, there is a commander NPC and several soldier NPCs. The running process of RL-DOT consists of several decision cycles. In each decision cycle, the commander NPC makes a decision of troop distribution and, according to that decision, sends action orders to other soldier NPCs. Each soldier NPC tries to accomplish its task in a goal-directed way, i.e., decomposing the final ultimate task (attacking or defending a domination point) into basic actions (such as running and shooting) that are directly supported by UT application programming interfaces (APIs). We use a Q-learning-style algorithm to learn the optimal decision-making policy. We carefully choose some opponent policies for our illustrative experiments. In these experiments, RL-DOT shows a distinct learning characteristic, which illustrates its efficiency in playing UT Domination games.
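The abstract above says only that a "Q-learning-style algorithm" learns the commander's policy across decision cycles; it does not give the state encoding, the action set, or the reward. As a hedged illustration, the sketch below shows the standard tabular Q-learning update applied once per decision cycle. The troop-distribution actions, the state labels, and the constants `ALPHA`, `GAMMA`, and `EPSILON` are assumptions made for this example, not details of RL-DOT.

```python
import random
from collections import defaultdict

ALPHA = 0.1    # learning rate (assumed value)
GAMMA = 0.9    # discount factor (assumed value)
EPSILON = 0.1  # exploration probability (assumed value)

# Hypothetical action set: soldiers assigned to each of two domination points.
ACTIONS = [(3, 0), (2, 1), (1, 2), (0, 3)]

# Maps (state, action) to an estimated value; unseen pairs start at 0.0.
q_table = defaultdict(float)


def choose_action(state, rng=random):
    """Epsilon-greedy choice of a troop distribution for one decision cycle."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])


def q_update(state, action, reward, next_state):
    """Apply one standard Q-learning update and return the new estimate."""
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    td_target = reward + GAMMA * best_next
    q_table[(state, action)] += ALPHA * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


# One decision cycle with made-up state labels and a made-up reward.
first = q_update("own_0_points", (2, 1), 1.0, "own_1_point")
second = q_update("own_0_points", (2, 1), 1.0, "own_1_point")
```

In a full agent, `choose_action` would be called at the start of each decision cycle and `q_update` at its end, once the reward for that cycle is known.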
TL;DR: A hierarchical Multi-Agent-based Information Fusion System for Decision Making Support (MAIFSDMS) is proposed and implemented. Information fusion uses the Maximum Score of the Total Sum of Joint Probabilities (MSJP) method and is carried out by a collection of Information Fusion Agents (IFA) that form a multiagent system.
Abstract: Quick, accurate, and complete information is essential for supporting decision making with strategic impact in a Military Operation (MO), in order to shorten the decision cycle and minimize losses. For that purpose, we propose, design, and implement a hierarchical Multi-Agent-based Information Fusion System for Decision Making Support (MAIFSDMS). The information fusion is implemented by applying the Maximum Score of the Total Sum of Joint Probabilities (MSJP) fusion method and is carried out by a collection of Information Fusion Agents (IFA) that form a multiagent system. MAIFS uses a combination of generalizations of the Dasarathy and Joint Directors of Laboratories (JDL) process models as its information fusion mechanism. Information fusion products displayed in graphical form provide comprehensive information about the dynamics of the MO's area. By observing the graphics resulting from the information fusion, the commandant gains the situational awareness and knowledge needed to make the most accurate strategic decision as quickly as possible.