# Q-Learning Model - FrozenLake (From Scratch)
This is a tabular Q-learning agent trained from scratch on the FrozenLake-v1 environment using only NumPy and Gymnasium.
## Model Description
- Algorithm: Q-Learning (tabular; update rule shown below this list)
- Environment: FrozenLake-v1 (4x4 grid, slippery)
- Implementation: From scratch using NumPy
- Training Episodes: 10,000
- Success Rate: 100%
- Average Reward: 1.000
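
For reference, after each transition $(s, a, r, s')$ the tabular Q-learning update is:

$$
Q(s, a) \leftarrow Q(s, a) + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
$$

where $\alpha$ is the learning rate and $\gamma$ the discount factor listed under Hyperparameters below.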
## Training Details

### Hyperparameters
- Learning Rate (α): 0.1
- Discount Factor (γ): 0.99
- Epsilon Start: 1.0
- Epsilon End: 0.01
- Epsilon Decay: 0.995
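
A minimal sketch of how these values fit into an epsilon-greedy training loop (illustrative only, not the exact code in `01_q_learning_from_scratch.py`; variable names are my own):

```python
import numpy as np
import gymnasium as gym

env = gym.make("FrozenLake-v1", is_slippery=True)
q_table = np.zeros((env.observation_space.n, env.action_space.n))

alpha, gamma = 0.1, 0.99                        # learning rate and discount factor
epsilon, eps_end, eps_decay = 1.0, 0.01, 0.995  # exploration schedule

for episode in range(10_000):
    state, _ = env.reset()
    done = False
    while not done:
        # Epsilon-greedy action selection
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(q_table[state].argmax())

        next_state, reward, terminated, truncated, _ = env.step(action)

        # Tabular Q-learning update (terminal states keep Q = 0, so no extra masking is needed)
        td_target = reward + gamma * q_table[next_state].max()
        q_table[state, action] += alpha * (td_target - q_table[state, action])

        state = next_state
        done = terminated or truncated

    # Decay exploration after each episode, down to the floor value
    epsilon = max(eps_end, epsilon * eps_decay)

env.close()
```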
### Environment

FrozenLake-v1 is a classic RL environment in which the agent navigates a 4x4 frozen grid from the start tile to the goal tile while avoiding holes. Because the surface is slippery, the chosen action only succeeds with some probability, which makes learning a reliable policy harder.
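
Concretely, the 4x4 map exposes 16 discrete states (one per tile) and 4 discrete actions, which is what makes a plain Q-table feasible:

```python
import gymnasium as gym

env = gym.make("FrozenLake-v1", map_name="4x4", is_slippery=True)
print(env.observation_space)  # Discrete(16): one state per grid tile
print(env.action_space)       # Discrete(4): 0=left, 1=down, 2=right, 3=up
env.close()
```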
## Performance

After training, the agent achieves a 100% success rate (see the evaluation sketch after this list):
- Successfully navigates from start to goal
- Avoids all holes
- Optimal path learned
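
A success rate like this can be checked by running the greedy policy for many episodes and counting how often the goal is reached. A sketch (the `evaluate` helper is illustrative and assumes a Q-table loaded as shown in the Usage section below):

```python
import gymnasium as gym

def evaluate(q_table, n_episodes=1000):
    """Run the greedy policy and return the fraction of episodes that reach the goal."""
    env = gym.make("FrozenLake-v1", is_slippery=True)
    successes = 0
    for _ in range(n_episodes):
        state, _ = env.reset()
        done = False
        while not done:
            action = int(q_table[state].argmax())
            state, reward, terminated, truncated, _ = env.step(action)
            done = terminated or truncated
        successes += int(reward == 1.0)  # FrozenLake gives reward 1 only at the goal
    env.close()
    return successes / n_episodes
```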
## Usage
```python
import pickle

import gymnasium as gym
from huggingface_hub import hf_hub_download

# Download model
model_path = hf_hub_download(
    repo_id="aryannzzz/q-learning-frozenlake-scratch",
    filename="q_learning_model.pkl"
)

# Load Q-table
with open(model_path, 'rb') as f:
    data = pickle.load(f)
q_table = data['q_table']
config = data['config']

# Use the agent
env = gym.make('FrozenLake-v1')
state, _ = env.reset()
done = False
total_reward = 0
while not done:
    action = q_table[state].argmax()  # Greedy policy
    state, reward, terminated, truncated, _ = env.step(action)
    total_reward += reward
    done = terminated or truncated

print(f"Total Reward: {total_reward}")
env.close()
```
## Model Files

- `q_learning_model.pkl` - Complete Q-table and configuration
- `q_learning_training.png` - Training progress visualization
- `q_table_visualization.png` - Q-table heatmap
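
The greedy policy encoded in the Q-table can also be inspected directly in the terminal; a small helper like the one below (illustrative, not part of the repository) prints one arrow per state:

```python
import numpy as np

def print_greedy_policy(q_table, n_rows=4, n_cols=4):
    """Print the greedy action for each state as an arrow grid (FrozenLake layout)."""
    arrows = np.array(["<", "v", ">", "^"])  # FrozenLake actions: 0=left, 1=down, 2=right, 3=up
    grid = arrows[q_table.argmax(axis=1)].reshape(n_rows, n_cols)
    for row in grid:
        print(" ".join(row))
```

Calling `print_greedy_policy(q_table)` after loading the model gives a quick sanity check that the policy routes around the holes.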
## Training Code

This model was trained with a custom Q-learning implementation:

- Repository: RL-Competition-Starter
- File: `01_q_learning_from_scratch.py`
## Limitations
- Only works on FrozenLake-v1 (4x4 version)
- Tabular approach - doesn't generalize to unseen states
- Requires discrete state and action spaces
## Citation
If you use this model, please reference:
```bibtex
@misc{q-learning-frozenlake-scratch,
  author    = {aryannzzz},
  title     = {Q-Learning FrozenLake Model from Scratch},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/aryannzzz/q-learning-frozenlake-scratch}
}
```

