David AI Raises $50 Million Series B to Power Next-Generation Voice and Audio AI
David AI, a San Francisco–based startup developing the foundational data layer for audio artificial intelligence, has raised $50 million in a Series B funding round. The round was led by Meritech Capital Partners, with participation from NVIDIA and several existing investors, including Alt Capital, First Round Capital, Amplify Partners, and Y Combinator. The raise brings David AI’s total funding to approximately $80 million across its seed, Series A, and Series B rounds.
David AI describes itself as the world’s first audio data research lab, developing large-scale, research-grade datasets designed for training advanced speech and voice models. The company’s mission is to provide the highly structured, multilingual, and acoustically rich data needed to power next-generation voice interfaces across devices and applications. Today, the company reports that its datasets include one of the world’s largest collections of speaker-separated conversational audio across roughly 15 languages, with extensive metadata on dialects, accents, environments, and topics.
The company’s rapid funding momentum reflects growing demand for high-fidelity audio data as AI expands beyond text and images into real-world, multimodal interactions. Earlier this year, David AI secured $5 million in a seed round led by First Round Capital, with participation from BoxGroup, SV Angel, Liquid 2, Y Combinator, and several prominent angel investors. It then raised $25 million in a Series A led by Alt Capital and Amplify Partners, alongside returning investors First Round Capital, Y Combinator, and BoxGroup. The new $50 million Series B, led by Meritech and joined by NVIDIA, marks the company’s largest injection of capital to date and signals escalating interest from both venture capital firms and major AI ecosystem players.
Co-founders Tomer Cohen and Ben Wiley, both alumni of Scale AI, launched David AI after recognizing a major gap: while model development in speech AI had accelerated, the underlying data, especially diverse, well-annotated audio, lagged significantly behind. They set out to take a dedicated research-lab approach to audio data, focused not only on collection at scale but also on rigorous quality, structure, and experimental design. Their goal is to make it possible for any AI lab or enterprise developer to build robust, real-world voice systems without wrestling with inconsistent, noisy, or incomplete data.
David AI says it now works closely with top AI labs globally and provides datasets that help train, benchmark, and evaluate state-of-the-art speech models. The company also reports that it has reached eight-figure annual revenue, reflecting strong commercial demand for high-quality audio datasets across industries such as consumer tech, automotive, robotics, and enterprise AI.
The company plans to use the new funding to expand its research and engineering teams, scale its dataset production infrastructure, and deepen partnerships with leading AI developers. The capital will also support new investments in evaluation frameworks: tools that test how AI models behave in real acoustic environments, under noisy conditions, and across different global accents and dialects.
For investors, David AI represents a critical infrastructure company for the coming wave of AI interfaces, many of which will rely on voice. Meritech Capital Partners, known for backing some of the most successful enterprise and consumer technology companies of the past two decades, led the round with a focus on helping David AI scale globally. NVIDIA’s participation underscores the strategic alignment between high-quality multimodal data generation and the computational platforms that power advanced AI models.