🤖 AI Robot Reinforcement Learning Playground

Train a robot using reward-based learning. Watch it learn to navigate, avoid obstacles, and reach goals in real time.

🌍 Environment & Robot

Episodes: 0
Total Reward: 0
Success Rate: 0%
Steps: 0
📈 Learning Progress (Last 100 Episodes)
(Chart x-axis: Episode -100 to Now)

🎮 Training Controls

Learning Rate (α): 0.10
Exploration Rate (ε): 0.20
Discount Factor (γ): 0.90

🏆 Reward Rules

+10
-5
-0.1
+0.5
+2 (if fast)
Q-Learning Status
States Explored: 0
Best Path Length: --
Average Reward: 0.00
Learning Curve: Starting...
🤖 Robot Status
Ready to learn. Click Start Training.

📖 How To Use AI Robot Reinforcement Learning Playground

🎯 Step 1: Understand the Environment

The robot (blue circle) starts at the bottom-left. The goal (green star) is at top-right. Red squares are obstacles. The robot must learn to reach the goal without hitting obstacles.

Pro Tip: Click anywhere on the canvas to move the goal and create new learning challenges.

⚙️ Step 2: Adjust Reward Rules

Check/uncheck reward rules to shape the robot's behavior. Give higher rewards for desired actions and penalties for undesired ones.
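One way to picture the checkboxes is as a table of rules that are summed only when they are both enabled and triggered. The sketch below is illustrative, not the playground's actual code: the numeric values are the ones shown under Reward Rules, but the rule names, the pairing of each value with a name, and the `compute_reward` helper are assumptions.

```python
# Values come from the Reward Rules panel; the rule names and the
# value-to-name pairing are hypothetical, chosen only for illustration.
REWARD_RULES = {
    "reach_goal": 10.0,
    "hit_obstacle": -5.0,
    "step": -0.1,
    "move_closer": 0.5,
    "fast_finish": 2.0,
}


def compute_reward(events, enabled):
    """Sum the values of rules that are both enabled and triggered this step."""
    return sum(
        value
        for name, value in REWARD_RULES.items()
        if name in enabled and name in events
    )


if __name__ == "__main__":
    all_rules = set(REWARD_RULES)
    # A step that moved closer to the goal, with every rule checked.
    print(compute_reward({"step", "move_closer"}, all_rules))
    # The same step with the "move_closer" rule unchecked.
    print(compute_reward({"step", "move_closer"}, all_rules - {"move_closer"}))
```

Unchecking a rule removes its term from the sum, which changes what the robot is optimizing for without changing the learning algorithm itself.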

🧠 Step 3: Set Training Parameters

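  • Learning Rate (α): How far each update moves a Q-value toward its new estimate; higher values adapt faster but less stably
  • Exploration Rate (ε): Probability of taking a random action instead of the best-known one
  • Discount Factor (γ): How much the robot values future rewards relative to immediate ones; values near 1 favor long-term reward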
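The exploration rate is commonly applied through an epsilon-greedy policy. The following is a minimal sketch of that idea, assuming a dictionary-based Q-table keyed by `(state, action)` and a four-direction action set; both are hypothetical and not taken from the playground.

```python
import random

# Hypothetical action set for a grid-like world.
ACTIONS = ["up", "down", "left", "right"]


def choose_action(q_table, state, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.choice(ACTIONS)
    # Exploit: pick the action with the highest known Q-value.
    # (state, action) pairs not seen yet default to 0.0.
    return max(ACTIONS, key=lambda a: q_table.get((state, a), 0.0))


if __name__ == "__main__":
    q_table = {((0, 0), "right"): 1.0}
    # With epsilon = 0 the choice is always the greedy one.
    print(choose_action(q_table, (0, 0), epsilon=0.0))
    # With epsilon = 0.2, roughly one choice in five is random.
    print(choose_action(q_table, (0, 0), epsilon=0.2))
```

A higher ε makes early training look more erratic but helps the robot discover paths it would otherwise never try.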

▶️ Step 4: Start Training

Click "Start Training" to begin. Watch the robot improve over time. The Learning Progress chart shows how results change across the last 100 episodes.

📊 Learning Concept:
Q(s,a) ← Q(s,a) + α [R + γ max_a' Q(s',a') - Q(s,a)]
// This is the Q-learning update rule, derived from the Bellman optimality equation
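As a rough illustration, the update above can be written as a single function. This is a sketch under assumptions: the dictionary Q-table, the `q_update` name, the `done` flag for terminal states, and the default values of `alpha` and `gamma` are all illustrative choices rather than the playground's implementation.

```python
def q_update(q_table, state, action, reward, next_state, actions,
             alpha=0.10, gamma=0.90, done=False):
    """Apply one tabular Q-learning update and return the new Q(s, a)."""
    old = q_table.get((state, action), 0.0)
    # max over a' of Q(s', a'); a terminal state has no future value.
    best_next = 0.0 if done else max(
        q_table.get((next_state, a), 0.0) for a in actions
    )
    target = reward + gamma * best_next          # R + γ max Q(s', a')
    new = old + alpha * (target - old)           # move Q(s, a) toward the target
    q_table[(state, action)] = new
    return new


if __name__ == "__main__":
    actions = ["up", "down", "left", "right"]    # hypothetical action set
    q_table = {((1, 1), "up"): 2.0}
    # Stepping into a state whose best action is worth 2.0, with reward -0.1:
    # target = -0.1 + 0.9 * 2.0 = 1.7, so Q moves 10% of the way from 0 to 1.7.
    print(q_update(q_table, (0, 1), "up", -0.1, (1, 1), actions))
```

Because each update moves the stored value only a fraction α of the way toward the target, estimates spread backward from the goal gradually over many episodes.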

❓ Frequently Asked Questions

What is reinforcement learning?
How does the robot learn to avoid obstacles?
What do the training parameters mean?
Can I see the robot's decision-making process?
How is this used in real robotics?
Why does the robot sometimes take weird paths?