Eira May

The framework helping devs build LLM apps

Ben and Eira talk with LlamaIndex CEO and cofounder Jerry Liu, along with venture capitalist Jerry Chen, about how the company is making it easier for developers to build LLM apps. They touch on the importance of high-quality training data to improve accuracy and relevance, the role of prompt engineering, the impact of larger context windows, and the challenges of setting up retrieval-augmented generation (RAG).

Why we built Staging Ground

A two-part episode: In part one, Ben chats with friend of the show and senior software engineer Kyle Mitofsky about Staging Ground, a private space within Stack Overflow where new users can receive guidance from experienced users before their question is posted. In part two, Ben talks to Stack Overflow moderator Spevacus, who participated in the beta of Staging Ground. They talk about why we wanted to build a safer asking experience for new users, the positive feedback we’ve gotten from the community so far, and the challenges of building Staging Ground within the existing Stack Overflow architecture.

OverflowAI and the holy grail of search

Product manager Ash Zade joins the home team to talk about the journey to OverflowAI, a GenAI-powered add-on for Stack Overflow for Teams that’s available now. Ash describes how his team built Enhanced Search, the problems they set out to solve, how they ensured data quality and accuracy, the role of metadata and prompt engineering, and the feedback they’ve gotten from users so far.

The reverse mullet model of software engineering

Ben and Ryan are joined by software developer and listener Patrick Carlile for a conversation about how the job market for software engineers has changed since the dot-com days, navigating boom-and-bust hiring cycles, and the developers finding work at Walmart and In-N-Out. Plus: “Party in the front, business in the back” isn’t just for haircuts anymore.

Why configuration is so complicated

Ben and Ryan explore why configuration is so complicated, the right to repair, the best programming languages for beginners, how AI is grading exams in Texas, Automattic’s $125M acquisition of Beeper, and why a major US city’s train system still relies on floppy disks. Plus: The unique challenge of keeping up with a field that’s changing as rapidly as GenAI.

How do you evaluate an LLM? Try an LLM.

On this episode: Stack Overflow senior data scientist Michael Geden tells Ryan and Ben about how data scientists evaluate large language models (LLMs) and their output. They cover the challenges involved in evaluating LLMs, how LLMs are being used to evaluate other LLMs, the importance of validating data, the need for human raters, and the needs and tradeoffs involved in selecting and fine-tuning LLMs.

Are long context windows the end of RAG?

The home team is joined by Michael Foree, Stack Overflow’s director of data science and data platform, and occasional cohost Cassidy Williams, CTO at Contenda, for a conversation about long context windows, retrieval-augmented generation, and how Databricks’ new open LLM could change the game for developers. Plus: How will FTX co-founder Sam Bankman-Fried’s sentence of 25 years in prison reverberate in the blockchain and crypto spaces?