By Adrin Jalali, VP of Labs at Probabl and scikit-learn core maintainer, & Cailean Osborne, Head of...
How you can embrace AI in your open source project's contribution cycle: Insights from dlt

A topic at the top of many maintainers’ minds is what to do about AI-assisted or generated contributions. The surge of such contributions, including the infamous AI slop, is a shared experience and oftentimes a shared frustration among many open source maintainers.
As I’ve written about previously on the Probabl blog and on my personal blog, the arrival of AI coding tools has fundamentally changed the social dynamics of developing and maintaining open source. While the cost of writing and contributing code to an open source repo has shrunk dramatically, the cost of reviewing and maintaining it hasn’t. You may have heard of maintainers who are struggling to keep up with the skyrocketing influx of issues and PRs, and often it’s hard to tell if first-time contributors even understand what they’ve contributed.
In February, my colleague Cailean Osborne and I published a blog post about “Maintaining open source in the age of generative AI” in which we summarized discussions and solutions we’ve seen in open source communities. We described how some maintainers reject all AI-generated contributions because they say they have never seen useful ones or that the risks of copyright infringement or license violations are simply too great. At the same time, others are proactively trying to steer the types of AI-assisted or generated contributions they receive by implementing solutions like AI contribution policies and AGENTS.md files.
Building on my previous blog posts, I’ve begun reaching out to maintainers who are trying new things to learn more about their considerations, choices, and experiences. Since many of us are having conversations about what to do about this new normal, I think it’ll be valuable if we share our insights about what works well and what doesn’t. This way, we can collectively build resources about best practices that may help any maintainer, no matter which ecosystem they’re in and which AI-related questions they’re wrestling with.
An area where I want to begin is AI contribution policies and AGENTS.md files because this is quite new and maintainers are trying out different approaches. An open source project that has adopted a rather proactive stance to AI contributions is dlt, the Python library that makes data loading easy. To learn more about the design and experience of their AI contribution policy, I interviewed Marcin Rudolf, the CTO of dltHub, this week.
My conversation with Marcin Rudolf
Adrin Jalali: First of all, tell me a bit about yourself and dltHub. What does the project do, and who's building it?
Marcin Rudolf: I’m the co-founder and CTO of dltHub, which I started in Berlin together with Matthaus Krzykowski, Anna Hoffmann, and Adrian Brudaru. Our open source project, dlt (data load tool), is a Python library for building data pipelines. Our mission is to make a new generation of Python users autonomous when they create and use data in their organizations – so they don’t have to wait on a dedicated data engineering team to move data around for them. On top of that we build dltHub Pro, our agentic platform that deploys, monitors, and scales those pipelines in production. Besides that I’m one of the main contributors of our library and a frequent reviewer.
Adrin Jalali: How has your team and/or community experienced AI-assisted or generated contributions, either from your own developers or from external contributors?
Marcin Rudolf: We have two streams of work: one in the dlt repository and the other in several private ones. As mentioned above, I’m a frequent reviewer of both and I’m pretty sure that most, if not all, OSS contributions are LLM-generated. We are getting way more of them and the quality (or mergeability) has increased. So actually our experience here is pretty positive. There are interesting challenges in our commercial product, where we do way more prototyping and increased productivity has side effects. I’ll talk more about it below.
Adrin Jalali: So, we’re here to discuss your CONTRIBUTING_AI.md file. It opens with a pretty strong statement: "We strongly encourage using AI coding agents." I’d say that's a more proactive stance than most open source projects take. What led you to that position?
Marcin Rudolf: A small note upfront: CONTRIBUTING_AI was a radical proposal that we were testing. In the end, it landed in our private repos. In the OSS repo, we have a pretty strong setup for agents via rules, skills, and async tasks but no explicit guidelines. Why? Because we realized that contributed code is now almost 100% generated by agents, and we just decided to control the agents’ behavior instead. We do not need to encourage anyone… For us the transition already happened.
Adrin Jalali: You then mention that AI agents are "particularly effective in this codebase because we've invested in strong typing, generated API bindings, and comprehensive test patterns." Was that infrastructure built with agents in mind, or did you retroactively realize it was making agent output more reliable? And what does the failure mode look like in codebases that don't have such a foundation?
Marcin Rudolf: Good software architecture, coding patterns, test setup, and all those software engineering good practices are fundamental to having coding agents under control. Our OSS repo is pretty mature, with really good patterns that over the years have proven to solve our problems. Agents are great at following them - way better than human newcomers who haven’t invested weeks in onboarding. Without patterns to follow, agents will go haywire and start to increase entropy in the project beyond human control. The same problems will be solved from scratch in different ways, code is not reused, docstrings are too verbose… Once slop starts spreading, agents will just copy it further. It is pretty dangerous.
Adrin Jalali: In terms of prompt hygiene, you have an explicit rule about not pasting production data, credentials, or unsanitized logs into agent sessions. How do you actually enforce that in a distributed open source community?
Marcin Rudolf: Here not much has changed with agents. dlt is a library and we never touch the production data of our users. Same thing in the commercial systems. The dev team does not have access to production secrets (not to mention the secrets of our customers). We have internal policies on what can be pasted into prompts and also a whitelist of LLM models we can use. Overall, there’s a fundamental tension between your agent both impersonating you and being prevented from accessing secrets. This is an actual challenge that we are solving in our commercial product with strict separation of dev/prod environments, local and remote sandboxing, etc. A real unsolved problem and also an opportunity to build something interesting.
Adrin Jalali: In your tickets and issues guidelines, you describe two types of AI-assisted contributions. On the one hand, AI output that a human has fully reviewed and approved, which contributors can include as their own. On the other hand, output that a human hasn't verified, which should come with an explicit disclaimer at the very top. Your stated goal is transparency, specifically so that reviewers know which parts carry human judgement behind them and which are AI suggestions. How has that played out in practice? Does the disclaimer change how reviewers are engaging with issues?
Marcin Rudolf: In practice, code that is not fully reviewed by a human will never get merged. The review process itself has changed. We expect self review and use of automated review skills. The interaction with contributors during the review also changed. There’s way more automation at play and way less time (per line of contributed code) to make decisions. Examples:
1. Review produces an implementation plan that brings the PR to our expected standards. I often just ask the contributor for permission to use it, then apply and merge the PR soon after.
2. PRs where the contributor could not control the agent (e.g. thousands of lines of code) will be quickly closed or fully reimplemented from scratch by us (if the idea itself is valuable).
3. Overall, there’s way more communication about intentions and value of solving particular issues than the code itself.
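The second example above (oversized PRs) lends itself to a mechanical first pass. Here is a minimal sketch, assuming the input is the text printed by `git diff --numstat`. The function names and the 1,000-line threshold are hypothetical illustrations, not something dlt is known to use.

```python
def count_changed_lines(numstat_output: str) -> int:
    """Sum added and deleted lines from `git diff --numstat` text output."""
    total = 0
    for line in numstat_output.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        added, deleted = parts[0], parts[1]
        if added == "-" or deleted == "-":
            continue  # git prints "-" for binary files, which have no line counts
        total += int(added) + int(deleted)
    return total


def needs_scope_discussion(numstat_output: str, max_lines: int = 1000) -> bool:
    """Flag a diff as too large to review line by line.

    The default threshold is an arbitrary placeholder; tune it per project.
    """
    return count_changed_lines(numstat_output) > max_lines
```

A check like this only flags size. Whether the idea behind the PR is valuable remains a human call.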
Adrin Jalali: Now, turning to PRs. You state "AI-generated code follows the same rules as human-written code" and back that up with six concrete instructions before submitting a PR: reading every line of the diff yourself, flagging code you can't explain, requiring tests, running format and lint, running the relevant tests, and running your /review skill. How did you arrive at those six specifically?
Marcin Rudolf: All our agent skills and rules follow good engineering practices we were doing ourselves before.
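The mechanical parts of that pre-PR checklist (format, lint, type check, tests) can be bundled into one script that a contributor or an agent runs before opening a PR. The following is a minimal sketch under stated assumptions: the interview does not name the tools, so `ruff`, `mypy`, and `pytest` stand in for whatever the project uses, and `run_checks` is a hypothetical helper.

```python
import subprocess

# Hypothetical check commands; swap in the tools your project actually uses.
DEFAULT_CHECKS: list[tuple[str, list[str]]] = [
    ("format", ["ruff", "format", "--check", "."]),
    ("lint", ["ruff", "check", "."]),
    ("types", ["mypy", "."]),
    ("tests", ["pytest", "-q"]),
]


def run_checks(checks: list[tuple[str, list[str]]]) -> dict[str, bool]:
    """Run each command and record whether it exited with status 0."""
    results: dict[str, bool] = {}
    for name, command in checks:
        try:
            completed = subprocess.run(command, capture_output=True, text=True)
            results[name] = completed.returncode == 0
        except FileNotFoundError:
            # The tool is not installed; treat that as a failed check.
            results[name] = False
    return results
```

Calling `run_checks(DEFAULT_CHECKS)` from the repository root returns a mapping such as `{"format": True, "lint": False, ...}`. Reading every line of the diff and flagging code you cannot explain still have to be done by a person.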
Adrin Jalali: I’m curious about the /review skill that you built which runs an automated branch review before PRs are opened. What does that workflow look like in practice? What is it catching? Did it require significant investment to build and maintain? Also, how often do you change the skill itself?
Marcin Rudolf: Here’s the skill we use in OSS: https://github.com/dlt-hub/dlt/tree/devel/.claude/skills/review-pr and it is pretty darn good. It basically creates a deep context out of the issue, PR comments, code, external docs, etc., and then an implementation plan that is compared with the contributor’s code (mostly to find missing edge cases that need to be tested). The skill itself is based on what we were doing and what we've learned in the last few years when doing reviews. It is updated from time to time if we encounter any gaps. I personally always use it after I understand the contributor's PR myself and pass my findings and doubts to the skill to drill down further. Pretty useful.
Adrin Jalali: Interesting. You have a section specifically about improving the AI rules, and you even suggest having the agent update its own rules when it repeatedly gets something wrong. Tell me a bit about how that works in practice. Who notices when a rule is missing or broken? Who owns the fix? How do you prevent that configuration from drifting into noise or internal contradictions over time? And most importantly, how do you prevent prompt injections there?
Marcin Rudolf: In practice, you must identify what your source of truth is and, for now, this truth must come from humans. In our case, the truth is code which we actively maintain and an ontology with terms, glossaries, use cases, entity diagrams, etc. that encode the knowledge of the company. We have https://github.com/dlt-hub/dlthub-ai-workbench where we validate PRs against existing documentation and code bases, official terminology, etc. At this moment, we do not have self-updating skills. We have settled on a human-curated source of truth.
Adrin Jalali: You have a .planning/ folder in your repo, where developers can commit context for feature planning, implementation decisions, and research done with AI agents. That’s an interesting experiment. What's the signal so far? Are people actually going back and reading those files, or are they mostly write-only?
Marcin Rudolf: Yes, this is an experiment coming from a radical idea that your agent session (not code) is the ultimate source of truth. You can even use https://github.com/dadlerj/tin and model your commit structure based on sessions. In practice, the planning folder is used a lot to share knowledge and generate documentation, but information there becomes stale and old plans are deleted. Usage is reserved for bigger refactors or fundamental features only. And we do not use it in OSS at all.
Adrin Jalali: You have a section called "Help the Agent Help You" built around a clear principle: "if a machine can verify it, don't make the AI or a human reviewer verify it." Concretely, you tell contributors to use strong typing everywhere, use generated API bindings, and lean on the linter and type checker. I’m particularly curious about these API bindings for Python clients. That’s not always the norm in the Python community. Do you think that impacts the kinds of contributors you get? Has it changed some people’s opinion on typing in Python?
Marcin Rudolf: This question touches on a pattern that I was hinting at earlier. Most of our contributors are practitioners who come to fix their production edge case. Very often they are not software engineers and come to fix a single issue. Our rules, patterns, and coding style are indeed pretty tough, but now it is up to agents to onboard and follow those rules. And it really works - we have more PRs and they are pretty good - agents enabled those users to get the stuff they need merged.
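To make the principle "if a machine can verify it, don't make the AI or a human reviewer verify it" concrete, here is a minimal sketch of what strong typing buys in Python. `LoadMode`, `LoadRequest`, and `describe` are hypothetical names invented for illustration and are not part of dlt's API.

```python
from dataclasses import dataclass
from typing import Literal

# Only these two string values are valid; a type checker rejects anything else.
LoadMode = Literal["append", "replace"]


@dataclass(frozen=True)
class LoadRequest:
    """Hypothetical request object with fully annotated fields."""
    table: str
    mode: LoadMode
    batch_size: int = 1000


def describe(request: LoadRequest) -> str:
    """Render a human-readable summary of the request."""
    return f"{request.mode} into {request.table} in batches of {request.batch_size}"


# A static type checker such as mypy would reject the call below,
# because "upsert" is not one of the allowed LoadMode values:
# describe(LoadRequest(table="events", mode="upsert"))
```

Note that Python does not enforce `Literal` at runtime, so the benefit only materializes when a type checker runs in CI or in the agent's loop. The reviewer then never has to ask whether a mode string is valid.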
Adrin Jalali: You draw a clear line between things AI is good at, like scaffolding, E2E tests, and refactoring with coverage, and things that need double human attention, like migrations, auth, and CI/CD. How did you arrive at that specific list? Did it come from incidents, or is it more of a principled heuristic?
Marcin Rudolf: Those are rules that went into our commercial/private repos and they come from incidents from before any LLMs existed. Those rules are good practices that we’ve learned from our past mistakes.
Adrin Jalali: Based on your experience, if a maintainer of another open source project came to you and said "we're starting to see AI-generated PRs, what should we do first?", what would your top recommendation be?
Marcin Rudolf: I’d suggest investing in a solid setup for linting, type checking, and testing. Refresh your knowledge of software architecture and use well-established coding patterns. In short: establish good patterns for agents to copy from. Make it explicit by investing, e.g., in a `.claude` setup. Tighten the review discipline - do not allow any slop or black-box code into your repo or risk it spreading. Don’t get dragged into endless code generation; get used to spending 90% of your time reviewing LLM-generated code. Tbh, we are heading into the unknown where you do not write code anymore but (hopefully) you can still be an engineer.
Adrin Jalali: Finally, do you think AGENTS.md files, or something like them, will become standard for open source projects the way CONTRIBUTING.md or CODE_OF_CONDUCT.md did?
Marcin Rudolf: I really don’t know! Will the way we instruct agents become even more human-like? If so - we are back to CONTRIBUTING.md - no need for special agent files. Or will we develop some sort of not-so-human-readable DSL - then human and agent setups will be more and more apart. By the way, our AGENTS.md is even more human than our CONTRIBUTING.md document… just a few principles of being a good dlt engineer:
1. This is a mature library used by thousands of the best engineers in the world
2. dlt is a library, not a platform. Users add it to their code. We respect existing workflows and other libraries they use
3. No black boxes: clean, minimal Pythonic interfaces, human readable file formats, no side effects, documentation
4. Multiply - don't add: we do more work here so our users do less
About Marcin Rudolf
Marcin Rudolf is co-founder and CTO of dltHub, the company behind dlt, the open source Python library for building data pipelines. With over 17 years as a CTO across machine learning, blockchain, and data engineering, he focuses on making a new generation of Python users autonomous when they create and use data in their organizations. Marcin remains one of the main contributors to dlt and a frequent reviewer.