🍔🧠 How Stripe Runs a 50M-Line Monorepo With 0 Pain

Happy Monday! ☀️

Welcome to the 135 new hungry minds who have joined us since last Monday!

If you aren't subscribed yet, join smart, curious, and hungry folks by subscribing here.

📚 Software Engineering Articles

AI agents accelerate Liger Kernel engineering exponentially
Things you didn't know about database indexes
Latency vs throughput vs bandwidth explained clearly
Balancing cost and reliability for Spark on Kubernetes
Null has cost billions in software failures

🗞️ Tech and AI Trends

Angular v22 brings major framework improvements
When AI builds itself: recursive self-improvement unlocked
Instagram accounts hacked via Meta's AI chatbot abuse

👨🏻‍💻 Coding Tip

Sliding-window idempotency caches trade memory for duplicate detection; Bloom filters + time-partitioned LRU prevent costly re-processing

Time-to-digest: 5 minutes

Selective Test Execution at Stripe: Fast CI for a 50M-line Ruby monorepo 🪫

Stripe's Ruby monorepo contains 50 million lines of code and 1.2 million test units. Running them all sequentially would take four months. Instead, Stripe built Selective Test Execution (STE)—a system that runs only ~5% of tests per build while maintaining full confidence in code safety.

The challenge: Static dependency analysis fails on dynamic languages like Ruby. Metaprogramming, runtime configuration, and non-code dependencies (YAML, JSON, fixtures) make it nearly impossible to predict what a test actually touches without running it.
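
To make the challenge concrete, here is a minimal sketch in Python (used as a stand-in for Ruby) of a dependency that only exists at runtime. The function name load_serializer and the settings.json layout are hypothetical, made up purely for illustration; nothing here comes from Stripe's codebase.

```python
import importlib
import json
import tempfile
from pathlib import Path


def load_serializer(config_path):
    """Return the module named in a JSON config file (hypothetical layout)."""
    config = json.loads(Path(config_path).read_text())
    # The module name exists only as data at runtime, so a scanner reading
    # this source file cannot tell which module gets loaded, nor that the
    # config file itself is a dependency of whoever calls this function.
    return importlib.import_module(config["serializer"])


with tempfile.TemporaryDirectory() as tmp:
    config_file = Path(tmp) / "settings.json"
    config_file.write_text(json.dumps({"serializer": "json"}))
    serializer = load_serializer(config_file)
    encoded = serializer.dumps({"ok": True})
```

A test calling load_serializer depends on both settings.json and whatever module it names, yet neither shows up in an import graph. Observing the file opens while the test runs is one way to catch both.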

Implementation highlights:

Dynamic file access interception: Build a C++ shared library (file_access_interceptor) loaded via LD_PRELOAD that records every file opened during test execution at the OS syscall level
Hierarchical scope tracking: Attribute file access to specific tests using a scope stack that naturally handles child processes and distinguishes global dependencies from test-specific ones
Roaring bitmap indexes: Store three billion data points as compressed bitmaps mapping changed files to impacted tests, enabling fast union/intersection operations during selection
Monotonic Revision IDs: Order build metadata without git queries using ancestry-preserving identifiers, making baseline selection fast and reproducible
Pragmatic guardrails: Force rerun of directory-globbing tests and linters with selective scans to handle edge cases where file access doesn't capture behavioral changes
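
The scope-stack and bitmap ideas above can be sketched roughly as follows. This is a toy model under stated assumptions, not Stripe's implementation: AccessRecorder, build_index, and select_tests are hypothetical names, record() stands in for the callback a preloaded interceptor would make, and plain Python integers stand in for Roaring bitmaps.

```python
from collections import defaultdict


class AccessRecorder:
    """Attributes each file access to the innermost active scope."""

    GLOBAL = "<global>"

    def __init__(self):
        self.scope_stack = [self.GLOBAL]
        self.accesses = defaultdict(set)  # scope name -> files touched

    def enter(self, test_name):
        self.scope_stack.append(test_name)

    def leave(self):
        self.scope_stack.pop()

    def record(self, path):
        # Stand-in for the interceptor callback (assumption for this sketch).
        self.accesses[self.scope_stack[-1]].add(path)


def build_index(recorder, test_names):
    """Invert test -> files into file -> bitmap, one bit per test."""
    index = defaultdict(int)
    for bit, test in enumerate(test_names):
        for path in recorder.accesses.get(test, ()):
            index[path] |= 1 << bit
    # Files read outside any test scope count as global dependencies,
    # so a change to one of them impacts every test.
    all_tests = (1 << len(test_names)) - 1
    for path in recorder.accesses.get(recorder.GLOBAL, ()):
        index[path] = all_tests
    return dict(index)


def select_tests(index, test_names, changed_files):
    """Union the bitmaps of all changed files and decode the result."""
    everything = (1 << len(test_names)) - 1
    selected = 0
    for path in changed_files:
        # Assumption: a file never seen before selects every test.
        selected |= index.get(path, everything)
    return [t for bit, t in enumerate(test_names) if (selected >> bit) & 1]


tests = ["test_charge", "test_refund", "test_payout"]
observed = {
    "test_charge": ["lib/charge.rb", "fixtures/cards.json"],
    "test_refund": ["lib/refund.rb", "fixtures/cards.json"],
    "test_payout": ["lib/payout.rb"],
}
recorder = AccessRecorder()
recorder.record("config/boot.yml")  # opened before any test scope starts
for name, files in observed.items():
    recorder.enter(name)
    for path in files:
        recorder.record(path)
    recorder.leave()

index = build_index(recorder, tests)
only_refund = select_tests(index, tests, ["lib/refund.rb"])
shared_fixture = select_tests(index, tests, ["fixtures/cards.json"])
global_change = select_tests(index, tests, ["config/boot.yml"])
```

Here a change to lib/refund.rb selects one test, the shared fixture selects two, and the globally loaded config selects all three. A compressed bitmap library would replace the integers at real scale, but the union logic stays the same.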

Results and learnings:

Massive speedup: Reduced median test execution to <0.5% of the full suite while maintaining safety and confidence
Cost efficient: Uses <10% of the compute of an "always run everything" strategy across 50,000+ builds per week
Operationally sound: Single fast database query per build makes selection reproducible and debuggable

Stripe's approach shows that observing what code actually does beats predicting what it might do. By intercepting at the syscall boundary and storing results efficiently, they've cracked a long-standing problem: how to scale testing without sacrificing safety.

EP217: Latency vs Throughput vs Bandwidth

Latency, throughput, and bandwidth often get used interchangeably, but each one tells a different story about performance.

Balancing cost and reliability for Spark on Kubernetes

Today, in collaboration with AWS, we’re excited to open-source Spot Balancer, a tool that helped us reduce Spark compute costs by up to 90% while maintaining reliability.

AI Engineering for Developers

A tour through AI engineering for developers who already know how to ship software. Fourteen chapters, no LinkedIn voice, no slow warm-up. We will go from 'what is a foundation model' to 'how do you run agents in production on Google Cloud' without skipping the parts that matter.

Things you didn't know about indexes

AI helping build better AI: How agents accelerate Liger Kernel engineering

ARTICLE (thumb it up)
I Built a Free Video Thumbnail Generator That Never Uploads Your Files

GITHUB REPO (bench the keys)
Keybench – Scriptable, extensible performance tool for key value stores

ESSENTIAL (billion dollar oopsie)
Null Looks Like an Empty Value — Until You Realize It Has Caused Billions of Dollars in Software Failures

ARTICLE (ai blog go brrr)
How I Redesigned 4 Years of Blog Posts (196 of them!) Overnight with AI

ESSENTIAL (tired but wiser)
Being oncall taught me everything

ARTICLE (queue the drama)
Building observability into Notion's dead-letter queue

ARTICLE (gemma go smart)
Gemma 4 12B: The Developer Guide

GITHUB REPO (git trees grow)
Treehouse – isolate multiple dev environments on different Git worktrees

Want to reach 200,000+ engineers?

Let’s work together! Whether it’s your product, service, or event, we’d love to help you connect with this awesome community.

WORK WITH US

🚀 Angular v22 Released: Production-Ready Signal Forms, Aria, and AI-Native Features (8 min)

Brief: Angular v22 brings three features to production-ready status: Signal Forms for reactive, composable forms, Angular Aria for accessible components, and asynchronous reactivity APIs for handling async operations; plus new AI-native capabilities including agentic tooling, WebMCP support, and integration with Google AI Studio and Gemini Canvas for no-code Angular app building.

🤖 When AI Builds Itself: Anthropic's Path to Recursive Self-Improvement (8 min)

Brief: Anthropic reveals that AI systems are accelerating their own development, with engineers shipping 8x more code per quarter than in 2021-2025, Claude now writing over 80% of production code, and models progressing from executing specified tasks to autonomously running research projects—signaling a potential future where AI could design its own successors, bringing both transformative benefits and significant risks that require urgent coordination and alignment research.

🔐 Meta Confirms Thousands of Instagram Accounts Hacked Via AI Chatbot Exploit (4 min)

Brief: Meta disclosed that over 20,000 Instagram users had their accounts hijacked through a vulnerability in its AI-assisted account recovery system, which hackers exploited by tricking the chatbot into resetting passwords and sending verification codes to attacker-controlled emails, granting full access to accounts, messages, and linked profiles for months until the flaw was patched this week.

🤖 Uber Caps AI Tool Usage to Rein In Spiraling Costs (3 min)

Brief: Uber is limiting employees to $1,500 monthly per AI coding tool after exhausting its 2026 AI budget in just four months, a rational cost-control measure that suggests the company values these agentic tools at roughly 11% of engineer compensation while signaling that subsidized API plans are no longer available to enterprise-scale customers.

🚀 VoidZero Joins Cloudflare to Power Open-Source JavaScript Toolchain (7 min)

Brief: VoidZero, creator of Vite, Vitest, Rolldown, and Oxc, is joining Cloudflare with all team members, while the projects remain open source, vendor-agnostic, and community-driven; Cloudflare commits $1M to a Vite ecosystem fund and plans to build its CLI on top of Vite to support AI-driven development and full-stack applications.

This week’s tip:

Implement request-level idempotency keys with a sliding-window probabilistic cache to trade bounded memory for rare duplicates. Use a Bloom filter for negative lookups and a time-partitioned LRU for positive confirmations; evict old partitions on window slide.
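
One way this tip could look in code is sketched below. It is a minimal, single-process illustration under assumptions: the class names, the default sizes, and the choice of SHA-256 for the Bloom hashes are all hypothetical, and the caller passes the current time in explicitly so the behavior is easy to follow.

```python
import hashlib
from collections import OrderedDict


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=8192, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


class Partition:
    """One time bucket: a Bloom filter plus a bounded exact LRU."""

    def __init__(self, lru_capacity):
        self.bloom = BloomFilter()
        self.exact = OrderedDict()
        self.lru_capacity = lru_capacity

    def add(self, key):
        self.bloom.add(key)
        self.exact[key] = True
        self.exact.move_to_end(key)
        if len(self.exact) > self.lru_capacity:
            self.exact.popitem(last=False)  # evict the least recently used key

    def contains(self, key):
        if not self.bloom.might_contain(key):
            return False  # definite miss, no exact lookup needed
        if key in self.exact:
            self.exact.move_to_end(key)
            return True  # confirmed duplicate
        return False  # Bloom false positive, or the key was LRU-evicted


class IdempotencyCache:
    def __init__(self, window_seconds=300, partition_seconds=60, lru_capacity=1000):
        self.partition_seconds = partition_seconds
        self.max_partitions = window_seconds // partition_seconds
        self.lru_capacity = lru_capacity
        self.partitions = {}  # bucket id -> Partition

    def seen_before(self, key, now):
        """Return True for a confirmed duplicate, else remember the key."""
        bucket = int(now // self.partition_seconds)
        oldest_allowed = bucket - self.max_partitions + 1
        for stale in [b for b in self.partitions if b < oldest_allowed]:
            del self.partitions[stale]  # the window slid past this bucket
        if any(p.contains(key) for p in self.partitions.values()):
            return True
        partition = self.partitions.get(bucket)
        if partition is None:
            partition = self.partitions[bucket] = Partition(self.lru_capacity)
        partition.add(key)
        return False


cache = IdempotencyCache(window_seconds=120, partition_seconds=60, lru_capacity=2)
first = cache.seen_before("charge-123", now=0)
retry = cache.seen_before("charge-123", now=90)
after_window = cache.seen_before("charge-123", now=200)
```

In this run the first call is new, the retry at 90 seconds is flagged as a duplicate, and the call at 200 seconds is treated as new again because its partition has been evicted. Memory stays bounded by the number of partitions times the LRU capacity, at the cost of occasionally reprocessing a key that was evicted early.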

When?

Payment processing at scale: Detect duplicate charge requests without maintaining a full history database; safe to evict very old keys.
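Event deduplication in async workflows: A Bloom filter never yields false negatives; its false positives are resolved by the exact partition re-check, and a key that has aged out of the window causes at most a re-execution, which is harmless when handlers are idempotent.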
Multi-region consistency: Each region maintains its own partition set; reconcile via event log on conflicts to avoid global consensus.

Just get out and do it. You will be very, very glad that you did.
Christopher McCandless

That’s it for today! ☀️

Enjoyed this issue? Send your friends here to sign up, or share it on Twitter!

If you want to submit a section to the newsletter or tell us what you think about today’s issue, reply to this email or DM me on Twitter! 🐦

Thanks for spending part of your Monday morning with Hungry Minds.
See you in a week — Alex.

Icons by Icons8.

*I may earn a commission if you get a subscription through the links marked with “aff.” (at no extra cost to you).
