Happy Monday! ☀️

Welcome to the 93 new hungry minds who have joined us since last Monday!

If you aren't subscribed yet, join smart, curious, and hungry folks by subscribing here.

📚 Software Engineering Articles

🗞️ Tech and AI Trends

👨🏻‍💻 Coding Tip

  • Property-based testing with fast-check finds edge cases automatically through random input generation and shrinking

Time-to-digest: 5 minutes

Search is mission-critical at Palantir. When schemas evolve or indices corrupt, you need to rebuild them while users are actively searching. That's where the Elasticsearch reindex machine comes in — automating what would otherwise be a painful, manual nightmare across hundreds of deployments.

The challenge: Rebuild massive search indices from the database without taking the service offline, saturating resources, or requiring an ops team to babysit the process.

Implementation highlights:

  1. Shadow index pattern: Create a duplicate index with new mappings in parallel while the old one keeps serving reads. Dual-write to both until the shadow catches up, then flip the alias.

  2. Parallel pipelining: Workers dynamically read pages from the database while others index them concurrently, keeping I/O flowing without explicit role assignment.

  3. Three-dimensional rate limiting: Cap documents per batch, memory per batch, and total batches per run — ensuring no single dimension becomes the bottleneck that tanks your service.

  4. Crash-safe state machine: Persist every index lifecycle step to the database before touching Elasticsearch. If the server dies mid-reindex, it resumes safely from the last known state.

  5. Cluster-agnostic design: Writes fan out to all connected clusters, making single-cluster repairs and cross-cluster migrations work with nearly identical code paths.
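To make the three-dimensional rate limiting in item 3 concrete, here is a minimal Python sketch of a batch planner that respects all three caps at once. It is not Palantir's implementation: the names `Limits` and `plan_batches`, and the choice to measure memory as a byte size per document, are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Limits:
    """Hypothetical caps; field names are illustrative, not from the article."""
    max_docs_per_batch: int
    max_bytes_per_batch: int
    max_batches_per_run: int


def plan_batches(doc_sizes: Iterable[int], limits: Limits) -> List[List[int]]:
    """Group document sizes (in bytes) into batches that honor all three caps.

    Documents that do not fit in this run are left for the next one.
    A single document larger than the byte cap still gets a batch of its
    own, so the run cannot stall on it.
    """
    if limits.max_batches_per_run <= 0:
        return []

    batches: List[List[int]] = []
    current: List[int] = []
    current_bytes = 0

    for size in doc_sizes:
        # Close the current batch if adding this document would break
        # either the document-count cap or the byte cap.
        batch_is_full = bool(current) and (
            len(current) >= limits.max_docs_per_batch
            or current_bytes + size > limits.max_bytes_per_batch
        )
        if batch_is_full:
            batches.append(current)
            if len(batches) >= limits.max_batches_per_run:
                return batches  # run budget exhausted; stop reading
            current, current_bytes = [], 0
        current.append(size)
        current_bytes += size

    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    limits = Limits(max_docs_per_batch=3, max_bytes_per_batch=100, max_batches_per_run=2)
    print(plan_batches([40, 40, 40, 10, 10, 10, 10], limits))
    # prints [[40, 40], [40, 10, 10]]
```

In the example run, the first batch closes on the byte cap, the second on the document cap, and the run stops once two batches are planned, so each of the three limits ends up being the binding one at some point.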

Results and learnings:

  • Truly online: Reindex millions of documents while live traffic continues without degradation or downtime

  • Hands-free operation: Automated detection, repair, and execution for 100+ schemas per deployment, across 100+ deployments

  • Battle-tested observability: Logs for investigation, metrics for dashboards, and on-demand diagnostics for root causes — because different problems need different tools

Palantir's approach proves that reindexing at scale isn't black magic — it's careful design. Shadow indices, pipelining, and crash-safe state machines turn a risky operation into something you can sleep through.

ARTICLE (brain-bake-101)
How LLMs are Actually Trained

ESSENTIAL (test-the-new-way)
A new era for software testing

Want to reach 200,000+ engineers?

Let’s work together! Whether it’s your product, service, or event, we’d love to help you connect with this awesome community.

Brief: Apple announces a major Apple Intelligence overhaul built on foundation models co-developed with Google, featuring on-device processing and Private Cloud Compute for image creation, advanced editing, and visual understanding, while maintaining privacy guarantees that prevent Apple or third parties from accessing user data.

Brief: OpenAI is overhauling ChatGPT into a "superapp" combining coding tools and AI agents to generate higher-margin revenue ahead of its planned IPO, shifting focus from the free chatbot toward paid business products like Codex, which now has 5 million weekly active users and accounts for 40% of revenue.

Brief: OpenAI's GPT-5.5, GPT-5.4, and Codex are now generally available on Amazon Bedrock, allowing 100,000+ organizations to access frontier AI models with native AWS security controls (IAM, VPC, KMS encryption, CloudTrail) without new vendor relationships, while Codex shifts from per-seat to pay-per-token pricing for developer teams.

Brief: Anthropic faced backlash for secretly throttling Claude Fable 5 with hidden safeguards that degraded responses to distillation attempts without notifying users, but now pledges to make these restrictions transparent by routing blocked queries to an older model with visible warnings.

Brief: After Anthropic acquired Bun, the JavaScript runtime's commit activity shifted from primarily human to over 80% AI-generated code, while external contributors dropped significantly, raising questions about whether the company understands open source stewardship or will treat the project as internal infrastructure rather than a community-driven standard.

This week’s tip:

Use property-based testing to find edge cases in state machines and parsers. Tools like fast-check (JavaScript) or Hypothesis (Python) generate random inputs and automatically shrink any that fail, surfacing minimal counterexamples that hand-written unit tests often miss, especially for off-by-one errors and boundary conditions.

When?

  • Parser validation: Generate random input strings; verify the parser rejects invalid syntax and accepts valid syntax correctly.

  • State machine correctness: Run sequences of random state transitions; assert invariants hold after each step.

  • Serialization round-trips: Create arbitrary objects, serialize, deserialize, compare; catch encoding bugs automatically.
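To show the mechanics behind these use cases without depending on either library, here is a hand-rolled Python sketch of the generate, check, and shrink loop applied to a serialization round-trip. This is not the fast-check or Hypothesis API. Every name in it (`encode_length`, `decode_length`, `find_counterexample`, and so on) is hypothetical, and the one-byte length encoder is deliberately buggy so there is something to find.

```python
import random
from typing import Callable, Iterator, Optional


def encode_length(n: int) -> bytes:
    """Hypothetical encoder with a deliberate bug: one byte holds only 0..255."""
    return bytes([n % 256])  # larger values silently wrap around


def decode_length(data: bytes) -> int:
    return data[0]


def round_trips(n: int) -> bool:
    """The property under test: decoding an encoded value gives it back."""
    return decode_length(encode_length(n)) == n


def shrink_candidates(n: int) -> Iterator[int]:
    """Yield smaller non-negative integers to try, simplest first."""
    seen = []
    for candidate in (0, n // 2, n - 1):
        if 0 <= candidate < n and candidate not in seen:
            seen.append(candidate)
            yield candidate


def shrink(prop: Callable[[int], bool], failing: int) -> int:
    """Greedily move to smaller failing inputs until none can be found."""
    current = failing
    while True:
        for candidate in shrink_candidates(current):
            if not prop(candidate):
                current = candidate
                break
        else:
            return current  # every smaller candidate passes


def find_counterexample(
    prop: Callable[[int], bool],
    rng: random.Random,
    trials: int = 200,
    max_value: int = 100_000,
) -> Optional[int]:
    """Try random inputs; on the first failure, return its shrunk form."""
    for _ in range(trials):
        n = rng.randint(0, max_value)
        if not prop(n):
            return shrink(prop, n)
    return None


if __name__ == "__main__":
    print(find_counterexample(round_trips, random.Random(0)))
    # prints 256
```

Whatever large value the random search hits first, shrinking walks it down to 256, the exact boundary where a single byte overflows. A real library adds much more (generators for structured data, smarter shrinking, replayable failures), so treat this as a teaching aid rather than a replacement.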

Some of your greatest lessons come from your darkest moments.
Roger Lee

That’s it for today! ☀️

Enjoyed this issue? Send it to your friends so they can sign up here, or share it on Twitter!

If you want to submit a section to the newsletter or tell us what you think about today’s issue, reply to this email or DM me on Twitter! 🐦

Thanks for spending part of your Monday morning with Hungry Minds.
See you in a week — Alex.

Icons by Icons8.

*I may earn a commission if you get a subscription through the links marked with “aff.” (at no extra cost to you).
