This conversation follows Arvind, a Chennai-raised AI expert, discussing his journey from IIT Madras to OpenAI and his current work at Perplexity AI. He details his AI learning process, emphasizing the importance of simple, scalable ideas over complex ones, and the role of massive compute in AI's recent advancements. He contrasts narrow AI with general (AGI) and superintelligence, highlighting the challenges of creating truly autonomous, self-improving AI. Arvind also discusses the current AI landscape, the differences between major players, and the potential for future disruption in various sectors, including the data center and advertising industries. He suggests personalized apps and voice-based AI as promising areas for Indian entrepreneurs.

The speaker's journey began with the expectation that he would excel in the IITs; he moved into competitive programming and then discovered machine learning through a Kaggle competition. Early self-learning through online resources like Andrew Ng's lectures and Stanford materials, coupled with formal coursework and research, led to a PhD and internships at OpenAI and DeepMind. A humbling experience at OpenAI highlighted the importance of continuous learning and accepting not always being the best. The speaker's work at OpenAI involved research on AI, including generative AI and reinforcement learning, and he later contributed to the development of ChatGPT.

The core concept of AI is discussed, focusing on the distinction between narrow and general intelligence. General AI aims to create systems capable of performing a wide range of tasks, unlike specialized AI. The evolution of large language models (LLMs) is explained, emphasizing the crucial roles of massive datasets, transformer networks, and fine-tuning for specific applications. The speaker contrasts LLMs with the need for physical common sense in achieving Artificial General Intelligence (AGI), highlighting the challenges in replicating human-like abilities.
Recent advancements in AI are attributed to a confluence of factors: massive compute power, high-quality data, reinforcement learning from human feedback (RLHF), and improved model architectures. The speaker discusses the current landscape of AI chatbots, focusing on the race for accuracy, speed, and unique features like source citations and agentic behavior. Challenges in building and deploying AI products are highlighted, including the need for contextual reasoning, seamless API integrations, and efficient backend infrastructure. The speaker's perspective on the future of AI includes the potential for personalized AI assistants, the disruption of existing industries, and the importance of addressing ethical concerns. The speaker also shares insights into the data center industry, the competitive landscape of AI chip manufacturers (like Nvidia), and the opportunities for Indian entrepreneurs in the AI space. Finally, the speaker touches upon the need for responsible AI development and regulation, emphasizing the potential risks and benefits of this rapidly evolving technology.

The speaker's internship at OpenAI occurred in the summer of 2018. Here's a breakdown of the experience:
- Initial Confidence: The speaker initially believed they were skilled in AI and machine learning.
- Humbling Experience: They quickly realized they were "very, very bad" compared to others at OpenAI.
- Feedback from Ilya Sutskever: The speaker presented their ideas to Ilya Sutskever, who, at the time, was essentially running the company. Sutskever listened for about half a minute and then directly told the speaker that their ideas were useless.
- Respectful Delivery: Sutskever's feedback, while blunt, was delivered respectfully.
- Emotional Impact: The speaker was upset to hear this, as they had believed their ideas were good.
- Learning and Growth: The internship served as a crucial learning period, during which the speaker delved into the details of AI and machine learning.
- Foundation Building: The experience at OpenAI, along with other experiences, helped build the speaker's fundamentals.
- Improved Peer Group: The speaker benefited from an increasingly strong peer group and exposure.
- Questioning Understanding: The speaker constantly questioned their understanding of the world.
- Comfort with Not Being the Best: The speaker became comfortable with not being the best person in the room, a shift from their previous mindset.

Here's how the speaker defines AI and differentiates its types:
- Broad Definition of AI: AI can be broadly defined as anything that makes any kind of prediction.
- Calculator as AI: Even a calculator could be considered a form of AI because it performs a task (math) that a human could do, and it does it better than a human.
- Intelligence Definition: The speaker believes a calculator doing maths can definitely be called intelligence.
- Shift from Narrow to General: The concept of AI has shifted from narrow (performing one specific task) to more general (capable of learning and performing multiple tasks).
- Example of Generality: A general AI system, unlike a narrow one, could perform a wide range of tasks, even with slight variations, using a single piece of code.
- Superintelligence: The AI community now also discusses superintelligence, which goes beyond general intelligence; once it is achieved, there may be no way to control it.
- Self-Improvement: Until it is shut down, a superintelligent system would just keep improving itself.
- Smart System: Such a system would know what to do at all times and think ahead of everyone else.

Arvind describes his upbringing in Chennai, highlighting his early interest in statistics, fueled by his passion for cricket and strong math skills.
He recounts his programming journey, beginning in 11th standard, and his mother's aspiration for him to attend IIT Madras, setting the stage for his competitive academic path.

Arvind details his academic journey, emphasizing his early aptitude for numbers and programming, which led him to pursue competitive programming and ultimately a career in AI. He shares the significant influence of his mother's expectation that he would attend IIT Madras, shaping his competitive spirit and drive to excel.

Arvind discusses his experience at IIT Madras, his involvement in competitive programming, and his fortuitous introduction to machine learning through a Kaggle competition. He describes how his participation in this competition, despite initially lacking knowledge of machine learning concepts, sparked his interest in the field.

Arvind recounts his rapid progress in machine learning, including a short internship where he finished a 2.5-month project in 3 weeks, leading to significant self-learning. He explains his self-directed learning approach, using online resources like Andrew Ng's lectures and Stanford materials, ultimately leading to his PhD at Berkeley.

Arvind shares his experiences during internships at OpenAI and DeepMind, emphasizing the humbling experience of working alongside exceptionally talented individuals. He discusses the shift in his mindset from striving to be the smartest person in the room to embracing continuous learning and seeking knowledge from the best.

Arvind details his 2018 summer internship at OpenAI, highlighting the circumstances that led him to the opportunity, including working in a less-than-ideal lab space at Berkeley and his proactive approach to networking and self-promotion. He explains how his early research caught the attention of John Schulman, who later co-created ChatGPT.

Arvind recounts a pivotal interaction with Ilya Sutskever, where his "fancy" ideas were deemed impractical.
This encounter underscores a crucial lesson: the importance of practical results over complex, theoretical approaches in the field of AI. He reveals the nature of his initially dismissed ideas.

Arvind elaborates on his initially rejected research ideas, focusing on AI's ability to learn its own loss function. He contrasts this approach with the simpler, more effective methods that ultimately proved more successful, highlighting the importance of practicality and scalability in AI development.

Arvind provides a simplified explanation of the history of computing leading up to the current state of AI, tailored for a non-technical audience. This segment is valuable for viewers seeking a basic understanding of the context and evolution of the field.

Arvind engages in a discussion about the definition of AI and general intelligence, differentiating between narrow AI (designed for specific tasks) and general AI (capable of performing a wide range of tasks). He clarifies the distinction and the challenges in achieving true general intelligence.

Arvind delves into the complexities of AGI, discussing the concept of an AI agent that can continuously learn, improve itself, and set its own goals. He highlights the current limitations of AI systems in terms of autonomy, self-awareness, and the definition of an objective function.

Arvind discusses the concept of superintelligence, an AI that surpasses AGI and possesses the ability to recursively improve itself without human intervention. He addresses concerns about AI taking over humanity, offering a more nuanced perspective on the current state of AI development.

Arvind proposes a practical definition of AGI as a digital remote knowledge worker, capable of performing various tasks currently done by humans. He explores the evolving definition of intelligence, considering both human-like behavior and functional capabilities.

Arvind continues the discussion on defining intelligence, comparing human capabilities with those of AI.
He argues that if AI can outperform humans in tasks for which humans are paid, then it should be considered intelligent, regardless of whether it mimics human behavior exactly.

Arvind concludes by discussing the broader implications of AI capabilities, emphasizing that the definition of intelligence might need to be revised as AI systems continue to advance and surpass human abilities in various domains. He highlights the limitations of human capabilities as a benchmark for intelligence.

This segment offers a visual representation of a neural network, explaining how it processes input numbers through layers, applying mathematical functions and learning patterns. The speaker describes the process of updating parameters to minimize loss across a large dataset, contrasting it with the practice of curve-fitting data for desired outputs.

This segment explains the significant shift in AI capabilities from 2010 to the 2020s, attributing it to the success of neural networks, particularly due to the application of massive datasets and computational power. The speaker credits Ilya Sutskever's work and the surprising simplicity of the breakthrough, emphasizing the role of "blind faith" in achieving success.

The segment details the multi-stage process of training large language models (LLMs): initial training on massive datasets to predict the next word, followed by a crucial post-training phase involving fine-tuning on data relevant to practical tasks like software programming, email summarization, and general conversation, to create a usable chatbot like ChatGPT. The speaker also introduces the contrasting viewpoint that current LLMs are not on the path to Artificial General Intelligence (AGI).

This segment clarifies the relationship between neural networks and machine learning. It defines machine learning as training computer programs to make intelligent predictions on data, with neural networks being one specific method.
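The parameter-update loop described above can be sketched in a few lines. This is a minimal illustration under simplifying assumptions, not how production networks are trained: a single linear "neuron" fits an invented dataset by gradient descent on a squared-error loss. Real networks stack many such units into layers with nonlinear functions, but the update rule is the same idea at massive scale.

```python
# Minimal sketch: one linear unit learning y = 2x + 1 by gradient descent.
# The dataset and learning rate are invented for illustration.

def train(data, lr=0.05, epochs=2000):
    w, b = 0.0, 0.0                       # parameters, start at zero
    for _ in range(epochs):
        gw = gb = 0.0                     # gradients of mean squared error
        for x, y in data:
            err = (w * x + b) - y         # forward pass, then error
            gw += 2 * err * x / len(data)
            gb += 2 * err / len(data)
        w -= lr * gw                      # update: step against the gradient
        b -= lr * gb
    return w, b

data = [(x, 2 * x + 1) for x in [-2, -1, 0, 1, 2]]
w, b = train(data)
print(round(w, 2), round(b, 2))  # parameters approach 2 and 1
```

The contrast the speaker draws with curve-fitting is one of scale and generality: the same update loop, applied to billions of parameters, learns patterns rather than memorizing a fixed curve.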
The speaker highlights the scalability of neural networks compared to other machine learning techniques, particularly when dealing with large datasets.

This segment explains large language models (LLMs) as giant neural networks trained on vast text datasets to predict the next word in a sequence. The speaker uses GPT as an example, describing the pre-training phase focused on next-word prediction and the post-training phase for tasks like image captioning. The process involves tokenizing text from sources like Wikipedia and Reddit.

This section compares the offerings of major players in the AI chatbot market (e.g., Google, Meta, Microsoft, Anthropic), noting the current lack of significant differentiation. The speaker predicts that future differentiation will come from more agentic behavior, moving beyond simple text responses to incorporate charts, images, and interactive elements like product cards and booking options. The speaker highlights Perplexity's accuracy and speed as current strengths, but emphasizes the future importance of AI agents capable of performing tasks like booking reservations, sending emails, and managing calendars.

The speaker envisions a future where AI can perform complex tasks autonomously, acting as a personal assistant capable of handling various interactions and actions on behalf of the user. The limitations of current LLMs in achieving this are discussed, emphasizing the need for improved reasoning capabilities to enable agentic behavior. The speaker argues that the recent progress in reasoning is crucial for enabling AI to go beyond simple input-output relationships and handle more complex, nuanced tasks.

The speaker proposes that the path to AI capable of physical tasks lies in developing reasoning and planning abilities. Instead of relying solely on vast datasets of videos, the AI would need to parse scenes and construct plans to solve tasks, similar to how humans mentally simulate actions before performing them.
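As a toy illustration of the next-word-prediction objective described above, one can count word bigrams in a made-up corpus and predict the most frequent follower. A real LLM instead learns billions of parameters over subword tokens, but the training signal is the same: given the words so far, predict what comes next.

```python
# Toy "language model": count which word follows which, then predict.
# The corpus is invented for illustration.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Count next-word frequencies for every word ("tokenization" here is
# just whitespace splitting; real models use subword tokenizers).
follows = defaultdict(Counter)
for cur, nxt in zip(corpus, corpus[1:]):
    follows[cur][nxt] += 1

def predict_next(word):
    """Return the most frequent token observed after `word`."""
    return follows[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" most often in this corpus
```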
The discussion emphasizes the need for AI to build mental models that allow it to reason and act in novel scenarios, not just those seen in training data.

This segment identifies the key factors behind the recent surge in AI capabilities: unprecedented scale of computation, high-quality data, reinforcement learning from human feedback, and training on tasks relevant to human labor. The speaker emphasizes that simply throwing compute at the problem is insufficient; high-quality data and task-relevant training are equally vital. The importance of curated datasets, such as transcripts of lectures and textbooks, is highlighted for fostering reasoning abilities in models. The concept of "chain of thought" prompting is introduced as a method to improve model reasoning and iterative problem-solving.

This segment highlights Yann LeCun's counter-opinion on the current trajectory of LLMs toward AGI. He emphasizes the need for physical common sense, the ability to perform basic everyday tasks like pouring water or handling multiple objects simultaneously, as a prerequisite for true AGI. The speaker uses examples of everyday actions to illustrate the complexity of tasks that current models struggle with, highlighting the significant computational resources required for such abilities.

This section delves into the difficulties of training computer models to perform simple physical tasks, contrasting the ease with which humans accomplish them. It explains that teaching a computer to pick up a glass would necessitate vast computational resources, building a robotic arm, and extensive training in various physical and visual settings. The speaker argues that current methods lack the ability to generalize across different physics settings and emphasizes the need for more efficient learning with limited data, similar to human evolution.
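The "chain of thought" prompting mentioned above can be sketched as follows. This is a hedged illustration: `call_model` is a hypothetical stand-in for any chat-completion API, not a real library call, and the prompt wording is one common pattern rather than a canonical recipe.

```python
# Sketch of chain-of-thought prompting: instead of asking for the answer
# directly, the prompt asks the model to write out intermediate steps.

def build_cot_prompt(question: str) -> str:
    """Wrap a question so the model is nudged to reason step by step."""
    return (
        f"Question: {question}\n"
        "Let's think step by step, showing each intermediate result, "
        "then state the final answer on its own line as 'Final answer: ...'."
    )

def call_model(prompt: str) -> str:
    # Placeholder: a real implementation would send `prompt` to an LLM API.
    return "Step 1: distance is 120 km. Step 2: time is 2 h. Final answer: 60 km/h"

prompt = build_cot_prompt("A train travels 120 km in 2 hours; what is its speed?")
answer = call_model(prompt).split("Final answer:")[-1].strip()
print(answer)
```

The iterative-problem-solving benefit comes from the model conditioning on its own intermediate steps, so each step is a smaller prediction problem than jumping straight to the answer.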
This segment details Perplexity's innovative approach to query processing, utilizing multiple models concurrently for different tasks such as query rewriting, page chunking, summarization, and question suggestion. The speaker addresses concerns about latency, explaining how clever techniques like streaming answers and optimized backend infrastructure ensure a fast and responsive user experience, even with multiple models involved.

This segment delves into the technical aspects of minimizing latency in AI responses, emphasizing the importance of tail latency (99th percentile) over mean latency. The speaker discusses strategies like streaming answers in chunks to create a real-time feel, even if the full response isn't yet generated. They also highlight the use of custom runtimes and specialized hardware to optimize efficiency and reduce latency. The discussion concludes by acknowledging that while users may not perceive minor latency differences, these become significant at scale.

This section analyzes the cost dynamics of AI services, explaining how the cost per query is constantly decreasing due to advancements in open-source models. The speaker discusses the pricing strategies of Perplexity and OpenAI, highlighting the impact of open-source models on pricing and margins. The discussion also touches on the increasing cost of more complex tasks like deep research.

This segment explores the subtle differences between various AI models and how user preferences influence the choice of platform. The speaker discusses how the core functionalities of many AI models are similar due to shared benchmarks, highlighting the importance of unique features and user experience in attracting and retaining subscribers. The discussion emphasizes the need for product differentiation beyond the underlying model itself.

This segment presents an investment perspective on the AI market, focusing on Meta's unique position.
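The two latency ideas in the segment above, tracking the 99th percentile rather than the mean and streaming answers in chunks, can be sketched briefly. The numbers and chunk size are invented for illustration; this is not Perplexity's actual infrastructure.

```python
# Sketch: why tail latency matters, and what chunked streaming looks like.

def percentile(samples, p):
    """p-th percentile by nearest rank on a sorted copy."""
    s = sorted(samples)
    idx = min(len(s) - 1, int(len(s) * p / 100))
    return s[idx]

# 99 fast queries and one slow outlier: the mean hides the outlier,
# while the p99 exposes exactly the experience the slowest users get.
latencies_ms = [100] * 99 + [3000]
mean = sum(latencies_ms) / len(latencies_ms)
p99 = percentile(latencies_ms, 99)
print(mean, p99)  # 129.0 vs 3000

def stream_answer(text, chunk_size=8):
    """Yield the answer in small chunks, as a streaming API would,
    so the user sees text before the full response is generated."""
    for i in range(0, len(text), chunk_size):
        yield text[i:i + chunk_size]

print("".join(stream_answer("Streaming makes responses feel instant.")))
```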
The speaker argues that Meta is well-positioned for future growth due to the enduring importance of human connection and brand value in an AI-driven world, contrasting this with Google's potential challenges in integrating AI with its existing ad-based model. The discussion highlights the importance of human-to-human connection in a world increasingly reliant on AI.

This segment shifts the focus to the Indian advertising market, analyzing the dominance of Meta and Google in acquiring clients and the potential for disruption. The speaker questions whether an Indian company could successfully challenge these giants, acknowledging the significant challenges involved.

The speaker outlines potential strategies for a new company to disrupt the established players in the Indian market, focusing on superior targeting and a unique value proposition. They emphasize the importance of solving the cold-start problem and building a user base from scratch, drawing parallels to TikTok's growth strategy.

This segment analyzes TikTok's growth strategy, highlighting the use of paid advertising on existing platforms to gain traction. The speaker emphasizes the need for a new company to offer a unique value proposition that existing platforms lack, using TikTok's success as a case study for market disruption.

This segment discusses Perplexity's potential future, focusing on the market for personalized AI assistants and the possibility of incorporating ads. The speaker argues that a highly personalized assistant could command a premium price, making it a viable business model.
The discussion contrasts the effectiveness of personalized ads on platforms like Instagram with the limitations of Google's ad model.

This segment analyzes the challenges faced by companies attempting to compete with Google's dominance in search and related services. The speaker highlights Google's strategic advantages, including its control over the Android ecosystem and its intricate integration of search and advertising. The discussion focuses on the significant barriers to entry and the need for innovative strategies to overcome Google's market dominance.

This segment delves into the booming Indian data center market, examining its valuation, growth potential, and inherent challenges. The discussion assesses the viability of investing in data center businesses, considering factors like commoditization, competition from hyperscalers, and the need for software integration to enhance margins and create a sustainable competitive advantage. The speaker also considers the role of data sovereignty and the potential for faster buildouts due to lower labor costs.

This segment explores the shifting landscape of content consumption from text and images to short-form and long-form video, podcasts, and live streams. It highlights the untapped potential of aggregating Indian podcasts, integrating interactive features like live Q&A and AI-powered editing tools, and leveraging AI for transcription and personalized content delivery, presenting opportunities for a new platform to differentiate itself from existing giants like YouTube and Instagram.

This segment details the predicted boom in personalized apps driven by AI's ability to simplify software creation. It explains how AI will empower individuals to build custom apps tailored to their specific needs, eliminating the need for generic, one-size-fits-all solutions and creating a massive new market for personalized software.
The discussion also touches upon the challenges and uncertainties surrounding monetization strategies for this emerging market.

This segment focuses on the transformative potential of AI coding assistants like Cursor, Replit, and Bolt. It explores how these tools are democratizing software development, enabling individuals without coding expertise to create functional applications. The discussion also assesses the quality of AI-generated code compared to human-developed code and considers the implications for the future of software engineering education.

This segment analyzes Nvidia's market position and the reasons behind its success in the AI hardware market. The speaker discusses Nvidia's competitive advantages, including its flexible and general-purpose chips, strong software ecosystem (CUDA), and established relationships with hyperscalers. The discussion also touches upon potential disruptions from emerging competitors and the long-term implications for Nvidia's margins and market share. Google's full-stack approach is also discussed as a potential source of competition.

This segment focuses on India's potential contributions to the AI revolution and provides actionable advice for young entrepreneurs. It emphasizes the importance of building India's own AI models, focusing on areas that Western labs may not prioritize, such as Indian languages and dialects. The speaker also outlines a multi-stage process for starting an AI-focused venture, beginning with building a compelling product and gradually scaling up to model training and data center infrastructure.

This segment identifies specific, under-exploited opportunities in the Indian AI market, particularly in voice technology and the need for improved speech recognition and synthesis capabilities for Indian languages and dialects.
It also explores the impact of AI on Indian outsourcing giants, predicting a shift towards increased efficiency and potentially reduced workforce size, but also continued reliance on human expertise for complex tasks requiring high reliability.

This segment delves into the crucial topic of AI regulation, advocating for a focus on regulating AI applications rather than the underlying models themselves. The speaker emphasizes the importance of addressing potential harms, such as the impact of AI chatbots on children's mental well-being, while cautioning against stifling innovation through overly restrictive regulations. The segment concludes with a balanced perspective, urging cautious optimism and a focus on responsible development.

This segment explores the evolving landscape of data ownership in the age of AI. It considers the possibility of a future where data is compartmentalized by geography or other criteria, requiring payment for access to train AI models. The discussion differentiates between the consumption of data by humans and by AI models, highlighting the unique challenges posed by the latter's ability to process and retain vast amounts of information. The speaker's uncertainty about the future of data access and pricing models is clear.

This segment discusses the profound impact of AI on the future of work, particularly in software development and other fields. It highlights the potential for significant labor displacement while acknowledging the creation of new opportunities and the need for upskilling and adaptation. The speaker also considers the challenges faced by recent graduates entering a job market disrupted by AI-driven automation and layoffs.

The speaker implies that Facebook (Meta) has a strong incentive for AI to succeed within its advertising business model: AI's success would make its ads business flourish even more. Google's ads business and AI agents, by contrast, have opposing business incentives.
Google has the least incentive to bring AI-native search or agents to its core platforms. AI should facilitate transactions natively rather than depend on search-bar placement. AI tools are already used for fact-checking, research, sourcing, and financial research, including charts and stock prices.