- What is robotics?
- The field of robotics are the problem of building hardwares with sensors and actuators which is capable of performing tasks in the real world.
- What is a robot?
- A robot is an automonous system which exists in the real world, can sense its environment, and act on it to achieve some goals.
What is intelligent robotics/AI Robotics? the field which combines robotics and AI is called intelligent robotics, which is inspired directly from the field of engineering instead of psychology.
What is Cognitive Robotics? Refers to the design of sensorimotor and cognitive capabilities in intelligent robots, taking direct inspiration from cognitive and biological sciences”
Cognitive robotics is the field that combines insights and methods from AI as well as cognitive and biological sciences to robotics.
It contains four key principles: Embodied Cognitive Theory, AI/Knowledge-based systems, Behavioural Robotics and Synthetic Methedologies.
What is Artificial Cognitive Systems? Modelling of simulated and embodied/robotic agents taking inspiration from natural and cognitive systems
- Six key attributes of cognition in artificial cognitive systems:
Embodiment: The body of a robot is the physical structure that enables it to interact with the environment.
A robot’s intelligence and cognitive processes are closely tied to its physical body and the way it interacts with the environment. Rather than treating the robot as a disembodied mind or an abstract information processor, embodiment emphasizes the integration of perception, action, and cognition in a unified system.
What is cognition? “Cognition is the process by which an autonomous system perceives its environment, learns from experience, anticipates the outcome of events, acts to pursue goals, and adapts to changing circumstances” (Vernon 2014)
The first intelligent robot: Shakey the robot
- What are the five main cognitive robotics approaches?
- Developmental Robotics
- Evolutional Robotics
- Swarm Robotics
- Soft Robotics
- Neural Robotics
- What is the Marr’s abstraction?
- Computational / Theory
- Representational / Algorithmic
which decomposes the different level of abstraction and allow the analysation of the problem from highest levels, avoiding the consideration of the details of the implementation before the computational level or theoretical level is solved.
Example of an analysis/design of a cognitive robotics experiment: (Modi):
Follows Marr’s Abstraction, Modi experiment is designed as:
Computational / Theory: Model the language acquisition process exploiting embodiment, following a study showing how posture and space strategies allow word learning in the absence of visual objects.
Representational / Algorithmic: Robots are given both speech and visual input stimuli, which will be lead to some internal semantic representations, result in expected behavior: naming correct object or being able to acting on object such as pointing to it.
Implementation: A robot hardware platform such as iCub or Nao is used, with a camera and microphone to capture visual and speech input, as well as handle the embodiment routine. A LSTM network is used to implement the robot’s cognitive architecture.
Further information: “The robot’s task and the input/output dataset which the model is going to be used”: The robot is presented with N objects visually, then each object is then given a word associated to it => The input consists of images of the objects and corresponding speeches, while the output consists of the name of the object that is learned and the movement of the robot’s arm to point to the object.
“The procedure of training and testing the robot”:
- Provide image only inputs;
- Provide image-speech associated inputs;
- Gather the output and compare with the expected output to evaluate the effect of model training.
Test: use a separate test dataset on the robot to evaluate the model’s performance.
- What is the first robot in the world, the first industrial robot and the first cognitive robot?
- First robot in the world: Waldo
- First industrial robot: PUMA (Unimate)
- First cognitive robot: Shakey
What is cybernetics? Cybernetics is the study of the communication and control of regulatory feedback both in living beings and machines, and in combinations of the two.
- What is the difference between close-loop and open-loop control?
- Open-loop control: The control action is independent of the respond.
- Close-loop control: The control action is dependent on the respond.
- What are the main four components of a robot?
- A Physical body to be embodied in the world
- Sensors to perceive the environment
- Actuators & Effectors to take actions in the environment
- Controller to be automous and have goals
What are the types of robots? Humanoid: resembles a human (have human-like body plan and senses for human-robot interactions), but not necessarily in appearance
Android: robots which are designed to resemble humans in appearance (Geminoid, Hanson, Kaspar)
Biometric Robots: robots which are designed with animal body-plans, to exploit biological structures and mechanisms
What is the uncanny valley? The uncanny valley is a hypothesis in the field of aesthetics which holds that when features look and move almost, but not exactly, like natural beings, it causes a response of revulsion among some observers => caused by the failure to attain full human-like qualities in both behavior and appearance.
What are the difference between effector and actuator? Effector: the device that has an effect on the environment Actuator: the device which enabling effectors to execute the movement
What are the two types of DOF (Degree of Freedom)? Translational DOF: move without rotation, x, y, z Rotational DOF: rotate without translation, roll, pitch, yaw
What are the total and controllable DOFs? Untrollable DOF: DOF which cannot be controlled by the robot, cause problems
Holonomic DOF: All DOFs happens to be able to controlled by Non-holonomic DOF: Some DOFs cannot be controlled by the robot Redundant DOF: The robot has more DOFs than it needs to perform a task
What is active and passive actuation? Activator: the actuators which use energy to provide power to the effectors Passive actuation: the actuators which do not use extra energy to provide power to the effectors by exploiting the body-enviroment interaction (exploit gravity/weight)
What are stiff effector and compliant actuator? Stiff effector: the effectors which are rigid and do not deform when interacting with the environment Compliance actuator: the actuators which are responsive to external force by stopping the movement or deforming
- What are the types of sensors?
- Proprioceptive sensors: sensors which measure the internal state of the robot (joint angles, joint torques, joint velocities, etc.)
- Exteroceptive sensors: sensors which measure the external state of the robot (camera, microphone, etc.)
What are the levels of sensor processing? Low: electronic signals, such as bump sensor Mid: signal processing, such as microphones, sonar, lidar High: include computation or sensor-fusion, suchas vision, NLP, multi-modal sensor fusion
- What are two main perception theories?
- Passive vision: passively observing the world, which is done by most robots
- Active vision: actively observing the world, use knowledge and instict to look for particular stimuli (human’s eye-movement)
What is SONAR? SOund NAvigation and Ranging
What is LIDAR? LIght Detection And Ranging
What is developmental Robotics? Developmental Robotics is the first example of cognitive robotics, which take direct inspiration form developmental psychology.
Developmental Robotics is the interdisciplinary approach to the automous design of behavioral and cognitive capability in artificial agents which take direct inspiration from developmental principles and machenisms observed in natural cognitive systems such as children.
- What are the six main principles of Developmental Robotics?
- Development as a Dynamic System:
- Phylogenetic and Ontogenetic interaction:
- Embodied and Situated development:
- Intrinsic Motivation and Social Learning:
- Non-linear, stage-like development:
- Online, open-ended cumulative learning:
- What are the four Piaget’s stages of cognitive development?
- Concrete Operational
- Formal Operational
What is the difference between navigation and manipulation? Navigation: where to go and were am I Manipulation: go to where and fecth sth (reaching & grasping)
- What are the difference between Localisation and Map making?
- Localisation: where am I
- Map making: where I have been (SLAM)
- What are two main ways of path planning?
- Qualitative Path Planning: Topological/Route navigation, such as distinctive places and associative methods
- Quantitative Path Planning: Metric navigation, precise determine the location
- What are two main ways of feature-based localisation?
- marcov Localization, typically used for global localisation
- extended kalman filters, also used for global localisation
What is the main method used for iconic licalisation? Monte-Carlo Localization
What is SLAM? Simultaneous Localization and Mapping, build maps while localising in a new environment without any prior knowledge
What is RatSLAM? a cognitive navigation model, emulate the spatial navigation capability of rat’s hippocampal system.
good at deal with noisy input, cope with new environment and accomodate complexity.
Discover head-direction cells, grid cells, place cells.
Place Cells: fire when the rat is in a particular location in the environment
Grid cells: fire in a metrically regular way across the environment
What is an artificial neural network? consists of a set of interconnected nodes, can receive input stimuli from the environment and learn new output responses
What are the main activation functions? neuron functions: ReLU, Sigmoid, Tanh.
What are the main output layer function (categorical)? SoftMax.
- What are the main NNet Topologies?
- Feedforward: perception, MLP, Deep NN.
What is CNN? learn features automatically through filters, similar to hierarchical, deep brain visual cortex
What is the main revolution of CNN? 2010, AlexNet, which is based on 1998’s Lenet.
What is Feature Map? Simple neoruns use local receptive fields to filter elementary filters, which are then combined to form feature maps.
What is convolution? Sequential implementation of feature maps to scan the input image with a single local receptive field, extract features and combine them to form a feature map.
What is pooling? The technique to reduce the feature maps’ spatial resolution, which is locally invariant, can reduce the sensitivity of the output to shifts and distortions.
What is robot manipulator? One or more links connected by joints, and the endeffector.
What is endeffector? used to affect and move objects in the environment, such as gripper, hand, arm, etc.
What is joint limit? the extreme of how far a joint can move.
What is Workspace? the space that the endeffector can reach, the set of all poses attainable by a particular endeffector
What is inverse and forward kinemetics? Inverse kinematics: given the endeffector pose, find the joint angles Forward kinematics: given the joint angles, find the endeffector pose
What is compliance and how to achieve it? yielding to the environment forces, maintain safety for contact with humans
solutions: sensors(software compliance), soft materials, spring in joints
What is the difference between close loop and open loop control? close loop ctrl: achieve and maintain a desired state by continuously comparing its current state with the desired one using feedback and computing error (difference between current state and desired state)
open loop ctrl: directly apply force to joints (actuators) towards the desired state without check
What is reaching model? Follows the natural cognitive principle of “reaching model”: infants learns to ctrl their bodies by actively generating trial-and-error arm movements, reaching a U-shaped development: accuracy goes up, then down, then up again.
Theory: replica the human reaching model via imolementing maturation model. Algorithmic: given a robot with one arm low-acuity visual input, let it learn to reach to grasp the object. Implementation: use a humanoid such as iCub or Nao, implement the grasp model using artificial neural networks, continue to give it visual input and update the weights of the network using genetic algorithm, analyze the change of grasp performance.
- What are two main ways of deep learning for manipulation?
- Reinforcement Learning: deep Q reinforcement learning
- Hand-eye coordination
What is Social Interaction and Learning? It is a developmental approach containing for incremental skills: joint attention, imitation, cooperation and TOM (Theory of Mind).
What is joint attention? It refers to the capability that two or more individuals share attention to the same object or event, no matter the object is visible for both or not.
- What are the four stages of joint attention?
- Sensitivity stage
- Ecological Stage
- Geometrical Stage
- Representational Stage
- (Developmental Learning) Joint Gaze Model:
- Computational: use a humanoid robot to exploit the stages of joint attention observed in human infants
- Algorithmic: use a controller of a humanoid robot, continue to train it to follow the gaze of a human
- Implementation: implement the controller with modules such as salient feature detectors, visual feedback controller, self-evaluating learning module, internal evaluator, gate module, with the rboot embodied in the reality as Nao, iCub, etc.
What is “Attention manipulation via pointing”? Imperative Pointing: request an object when the other agent is not initially looking at it. Declarative Pointing: create a shared attention on an object focus of the interaction
Imitation Architecture: HAMMER architecture combine inverse and forward models, compare the “intended” movement outcome with the real outcome, then use the respond to make the system more robust
What is Theory of Mind? The ability of attributing mental stages of others, infer other agent’s intention and beliefs, and predict their behavior.
- What are three main challenges(technical/scientific) of HRI (human-Robot-Interaction)?
- ASR: Automatically Speech Recognition (accuracy problem, can be alleviated using high-quality microphones)
- Action recognizing and intention reading (can use real-time pose estimation like OpenPose with 3D depth cameras such as Kinect)
- Trust and Acceptability (can be alleviated using social robots with more human-like appearance and social gazes)
What is Bayesian ToM Trust Model? use a bayesian network to seperate reliable and unreliable speakers collect statistical information to track the reliability of agents
What roles can robots take in education? Tutor (1-1), Teacher (1-Many), Peer
- Why we can use robots/need to use robots in Education?
- Teaching requlre physical contact: robots are physical, human are physical
- Social behaviors would benefit learning: better attention, participation, enjoyment, …
- Learning gains increase when interacting with physical systems
- What are the results of the review paper?
- Role of robot is application specific
- The role of tutor is the most effective role for robot to take in the field of education
- Robots increase learning sudokos;
- too social robots offer less learning advantage for maths (distractions);
- Robots teach L1/L2: language acquisation benefited from emboddied and situated learning, but not suitable for children in 1-1 case, however effective when robots work as a tutor along with human teacher
- Advantage of social robots:
- allow 1-1 personalized learning
- can help learning in context
- patient and non-judgemental
- can be used in special need cases
- Problem of social robots:
- expensive and need maintainance
- still have limitations in speeching and actions
- Not universally suitable for all cases
Robot-Era Project: build robots as compaions for the elderly, over 80% of the elderly people are satisfied with the robot
- What is Ethics for AI and Machine Ethics?
- Ethics for AI: build ethical rules for AI, ensure AI respect human and do right things
- Machine Ethics: build machines which can automatically understand, create, apply their ethical rules, with the compliance of human ethics
- What are three main ethics approaches taught?
- Deontological Ethics: assume there exists a set of universal ethical rules, they just exists and never changes (bullshit)
- Utlitarian/Consequentialist Ethics: For the greatest good
- Virtue Ethics: Assume one is wise enough to automatically come up with and implement virtue rules, similar to machine ethics
What are two ethical systems that is widely used? Utlitarianism and Virtue Ethics
- Potential Harm of AI and Robotics?
- Bias and Discrimination (use unbiased data)
- Denial of Autonomy
- What can we do to avoid the potential harm?
- Machine Ethics
- Responsible AI (follow ethical guidelines in R&D, RRI)
- What are four main objectives of Responsible AI approaches?
- Ethically permissible: consider as diverse as possible
- fair and non-discriminatory: avoid bias and discrimination on individuals
- public trust
- justfiable: prioritizing transparency and interpretability
- What are two main principles of RRI? (Responsible Research and Innovation)
- involve society in the very upstream during research process, align the outcome of research with the society
- embedded ethics
What is explainable AI? produce explainable models, enable human to understand, trust and manage AI systems
What is the main problem of Explainable AI? Performance VS Explainability
What is classical robotics? Consider developing robots as an engineering problem, break functions down into separated modules such as perception, planning and actions, but have high level of complexity
What is behavior-based robotics? divide the development into a bottom-up approach, build robots with simple behaviors, then combine them to form complex behaviors
What is bio-inspired approach? use natural/biological solutions to evolve complexity, such as use natural selection to evolve complex organs and behaviors, like evolutionary robotics
What is evolutionary robotics? automatic design of robots and their sensorimotor control through an evolutionary computation process.
a method that permits to create robots capable of developing the abilities to perform one or more functions as a result of an adaptation process analogous to natural evolution.
genetic algorithm: initialization + fitness test + selection + reproduction + mutation
Khepera’s Body and Brain: implement neural networks with evolving weights
What is co-evolution? Co-evolution is the process in which two or more species evolve in response to each other.
- What are the advantages of co-evolution?
- select the most effective solutions
- select solutions to match counter-strategies
- obtained higher complexity without increasing supervision
What is evolving morphology? Evolving morphology is the process of evolving the shape of neural network, the shape of body, etc.
- Encode digital (artificial) genes:
- Direct encoding
- Grammar Encoding
What is swarm robotics? the study of how independent robots can interact as a group, forming a collective behevior which cannot be achieved by single agents
- Microscale interaction: local, peer-to-peer
- Macroscale Sructure: global, systemwide
- Principle of self-organization:
- Positive /negative feedback: positive feedback amplifies the effect, negative feedback reduces the effect
- Fluctuation: random change
- Multiple interactions: multiple agents interact with each other
Scalability: the possibility to scale to any system size (decentralized approach + limited communication)
Falut-tolerance: the ability to maintain the collective behavior in the presence of faults, multi-robot systems have higher degree of fault tolerance
- Examples: Ant colony (Ant Colony Optimization to solve TSP problems), EvoRobot, SAGA (drone swarms)