LLMs have evolved from basic next-token predictors into self-improving systems. Reinforcement learning, self-rewarding methods, and chain-of-thought prompting strengthen both reasoning and self-evaluation. Models now iteratively refine their own outputs, approaching or exceeding human performance on specific tasks, as exemplified by breakthroughs such as DeepSeek-R1. Future work focuses on enhancing internal reasoning and vector-based thought processes.
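The iterative self-refinement idea can be illustrated with a toy sketch. The functions `generate` and `self_reward` below are hypothetical stand-ins for real LLM calls (generation and self-evaluation); the loop structure — propose a candidate, score it yourself, keep only improvements — is the general pattern, not any particular system's implementation.

```python
import random

def generate(prompt, seed_answer=None):
    # Stand-in for an LLM generation call: proposes a numeric "answer",
    # optionally perturbing a previous best answer (refinement step).
    base = seed_answer if seed_answer is not None else random.uniform(0, 10)
    return base + random.uniform(-1, 1)

def self_reward(answer, target=7.0):
    # Stand-in for a self-evaluation step: the model scores its own
    # output. Higher is better; 0 is a perfect score here.
    return -abs(answer - target)

def refine(prompt, rounds=20):
    # Propose an initial answer, then iteratively refine: generate a
    # candidate from the current best and keep it only if the
    # self-assigned reward improves.
    best = generate(prompt)
    best_score = self_reward(best)
    for _ in range(rounds):
        candidate = generate(prompt, seed_answer=best)
        score = self_reward(candidate)
        if score > best_score:  # keep only strict improvements
            best, best_score = candidate, score
    return best, best_score

random.seed(0)
answer, score = refine("toy prompt")
```

By construction the self-assigned score is monotonically non-decreasing across rounds; real self-rewarding pipelines face the harder problem that the evaluator itself must be reliable.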