Book details
Publisher: InTech, 2011, 434 pp.

Brains rule the world, and brain-like computation is increasingly used in computers and electronic devices. Brain-like computation is about processing and interpreting data, or directly proposing and performing actions. Learning is a very important aspect. This book is on reinforcement learning, which involves performing actions to achieve a goal. Two other learning paradigms exist. Supervised learning was initially successful in prediction and classification tasks, but is not brain-like. Unsupervised learning is about understanding the world by passively mapping or clustering given data according to some ordering principles, and is associated with the cortex in the brain. In reinforcement learning an agent learns by trial and error to perform an action to receive a reward, thereby yielding a powerful method to develop goal-directed action strategies. It is predominantly associated with the basal ganglia in the brain.

The first 11 chapters of this book, Theory, describe and extend the scope of reinforcement learning. The remaining 11 chapters, Applications, show that there is already wide usage in numerous fields. Reinforcement learning can tackle control tasks that are too complex for traditional, hand-designed, non-learning controllers. As learning computers can deal with technical complexities, the task that remains for human operators is to specify goals at increasingly higher levels.

This book shows that reinforcement learning is a very dynamic area in terms of theory and applications, and it should stimulate and encourage new research in this field. We would like to thank all contributors to this book for their research and effort.

Summary of Theory:
Chapters 1 and 2 create a link to supervised and unsupervised learning, respectively, by regarding reinforcement learning as a prediction problem, and chapter 3 looks at fuzzy control with a reinforcement-based genetic algorithm.
Reinforcement algorithms are modified in chapter 4 for future parallel and quantum computing, and in chapter 5 for a more general class of state-action spaces, described by grammars. Biological views follow: chapter 6 shows how reinforcement learning occurs at the level of a single neuron by considering the interaction between a spatio-temporal learning rule and Hebbian learning, and the global brain view of chapter 7 depicts unsupervised learning as a means of data pre-processing and arrangement for reinforcement algorithms. A table presents a ready-to-implement description of standard reinforcement learning algorithms. The following chapters consider multi-agent systems where a single agent has only a partial view of the entire system. Multiple agents can work cooperatively on a common goal, as considered in chapter 8, or rewards can be individual but interdependent, such as in game play, as considered in chapters 9, 10 and 11.

Summary of Applications:
Chapter 12 continues with game applications, where a RoboCup middle-size league robot learns a strategic soccer move. A dialogue manager for man-machine dialogues in chapter 13 interacts with humans by communication and database queries, dependent on interaction strategies that govern the Markov decision processes. Chapters 14, 15, 16 and 17 tackle control problems that may be typical for classical methods of control like PID controllers and hand-set rules. However, traditional methods fail if the systems are too complex or time-varying, if knowledge of the state is imprecise, or if there are multiple objectives. These chapters report examples of computer applications that are tackled only with reinforcement learning, such as water allocation improvement, building environmental control, chemical processing and industrial process control. The reinforcement-controlled systems may continue learning during operation. The next three chapters involve path optimization.
In chapter 18, internet routers explore different links to find better routes to a destination address. Chapter 19 deals with optimizing a travel sequence with respect to both time and distance. Chapter 20 proposes an atypical application of path optimization: a path from a given pattern to a target pattern provides a distance measure. An unclassified medical image can thereby be classified depending on whether the path from it is shorter to an image of healthy or of unhealthy tissue, specifically considering lung nodule classification using 3D geometric measures extracted from Computerized Tomography (CT) images of lung lesions. Chapter 21 presents a physicians' decision support system for diagnosis and treatment, involving a knowledge base server. In chapter 22 a reinforcement learning sub-module improves the efficiency of message exchange in a decision support system in air traffic management.

Neural Forecasting Systems
Reinforcement learning in system identification
Reinforcement Evolutionary Learning for Neuro-Fuzzy Controller Design
Superposition-Inspired Reinforcement Learning and Quantum Reinforcement Learning
An Extension of Finite-state Markov Decision Process and an Application of Grammatical Inference
Interaction between the Spatio-Temporal Learning Rule (non Hebbian) and Hebbian in Single Cells: A cellular mechanism of reinforcement learning
Reinforcement Learning Embedded in Brains and Robots
Decentralized Reinforcement Learning for the Online Optimization of Distributed System
Multi-Automata Learning
Abstraction for Genetics-based Reinforcement Learning
Dynamics of the Bush-Mosteller learning algorithm in 2x2 games
Modular Learning Systems for Behavior Acquisition in Multi-Agent Environment
Optimising Spoken Dialogue Strategies within the Reinforcement Learning Paradigm
Water Allocation Improvement in River Basin Using Adaptive Neural Fuzzy Reinforcement Learning Approach
Reinforcement Learning for Building Environmental Control
Model-Free Learning Control of Chemical Processes
Reinforcement Learning-Based Supervisory Control Strategy for a Rotary Kiln Process
Inductive Approaches based on Trial/Error Paradigm for Communications Network
The Allocation of Time and Location Information to Activity-Travel Sequence Data by means of Reinforcement Learning
Application on Reinforcement Learning for Diagnosis based on Medical Image
RL based Decision Support System for u-Healthcare Environment
Reinforcement Learning to Support Meta-Level Control in Air Traffic Management
About the author
In physics, the weber (symbol: Wb; ˈveɪbɚ, ˈwi:bɚ) is the unit of magnetic flux. The unit is named after Wilhelm Eduard Weber (1804–1891), a German physicist.