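This lecture compares three approaches to policy evaluation: two model-free methods, Monte Carlo (unbiased but high variance) and Temporal Difference learning (biased but lower variance), and model-based Dynamic Programming. TD combines bootstrapping (as in Dynamic Programming) with sampling (as in Monte Carlo), and it updates its value estimate after each transition rather than waiting for the episode to end. The lecture analyzes the bias-variance tradeoff and convergence properties of each method, using a Mars rover example.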
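
The contrast between the Monte Carlo and TD updates can be made concrete with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the lecture: the three-state chain `s3 -> s2 -> s1`, the reward of 1 on leaving `s1`, the step size `alpha = 0.5`, the discount `gamma = 1.0`, and the function names `mc_first_visit` and `td0`. It is not meant to reproduce the lecture's Mars rover setup.

```python
from collections import defaultdict


def mc_first_visit(episodes, gamma=1.0):
    """First-visit Monte Carlo: average the full return that follows
    the first visit to each state in each episode."""
    returns = defaultdict(list)
    for episode in episodes:
        g = 0.0
        first_visit_return = {}
        # Walk backwards so g accumulates the discounted return from each step.
        for state, reward in reversed(episode):
            g = reward + gamma * g
            # Earlier visits are processed later in this loop, so the value
            # left in the dict is the return from the FIRST visit.
            first_visit_return[state] = g
        for state, g_first in first_visit_return.items():
            returns[state].append(g_first)
    return {s: sum(gs) / len(gs) for s, gs in returns.items()}


def td0(episodes, alpha=0.5, gamma=1.0):
    """TD(0): after every transition, move V(s) toward the bootstrapped
    target r + gamma * V(s'). The value of the terminal state is 0."""
    v = defaultdict(float)
    for episode in episodes:
        for t, (state, reward) in enumerate(episode):
            is_last = t + 1 == len(episode)
            next_value = 0.0 if is_last else v[episode[t + 1][0]]
            v[state] += alpha * (reward + gamma * next_value - v[state])
    return dict(v)


# Hypothetical episode: each entry is (state, reward received on leaving it).
# The episode terminates after the last entry.
episode = [("s3", 0.0), ("s2", 0.0), ("s1", 1.0)]

mc_values = mc_first_visit([episode])
td_values = td0([episode])
print("MC:", mc_values)
print("TD(0):", td_values)
```

After this single episode, Monte Carlo assigns every visited state the full observed return of 1.0, while TD(0) has only moved `s1` halfway toward its target (0.5) and leaves `s2` and `s3` at 0, because their bootstrapped targets were still zero when they were updated. This illustrates, under the assumptions above, why TD's estimates depend on the current (possibly wrong) value function while Monte Carlo's depend only on sampled returns.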