:probabl.blog

Probabl is a mission-driven company: This is what that means in practice

Written by Yann Lechelle | Thursday, May 21 2026

By Yann Lechelle, Executive President & Chairman of Probabl

When we founded Probabl in 2023, we made a deliberate choice: to incorporate the company as a mission-driven, for-profit company (in French, Société à Mission) – a legal designation under French commercial law that writes our social purpose directly into our bylaws and makes us accountable for it. Not a values statement on a website. Not a voluntary certification. A legally binding commitment.

Our company’s mission, inscribed in our bylaws, is:

"To develop, maintain at the state of the art, and sustain a complete suite of open source tools for data science to benefit France, Europe, and the world."

Today, I'm proud to announce the launch of the independent body that holds us to it: the Probabl Mission Alignment Committee.

Why this mission matters

Supporting the scikit-learn ecosystem that millions of data scientists depend on every day

Scikit-learn is the Python library for machine learning. It has been downloaded over 4.6 billion times, with more than 200 million downloads last month. It’s a dependency for over 1.3 million GitHub repositories and over 28,700 open source packages. It’s cited in over 130,000 academic papers, including more than 7,300 Nature publications.

It’s taught in classrooms and bootcamps across the world, and used in research labs and production systems across every major industry, from manufacturing and energy to healthcare and financial services. More recently, as we can see from the skyrocketing downloads, it’s used by agents, too.

In short, scikit-learn is the open digital infrastructure of data science and machine learning that millions of practitioners depend on every day.

Scikit-learn is the open standard for machine learning, but it is not alone. When a data scientist opens a Jupyter notebook, they may reach instinctively for a set of tools: NumPy and SciPy for scientific computing, pandas for data wrangling, Matplotlib for data visualization, scikit-learn for machine learning, PyTorch for deep learning, Transformers for accessing millions of pretrained models hosted on the Hugging Face Hub, and so on. The chart below illustrates a number of such open source tools.

The broader ecosystem of Python libraries for data science and AI follows a similar pattern to scikit-learn: massive, global adoption of tools that are too often developed and maintained by small teams with uncertain long-term funding. This is the open source sustainability problem, and it is not hypothetical. It is happening now, to tools that millions of people and organizations depend on every day.


Figure 1: Download trends of the most downloaded Python libraries per category of data science and AI

 

The open digital infrastructure of data science and AI

There is a useful analogy here, and it is not flattering to how we've treated open source software. We would not build roads and then expect the crews who paved them to keep them maintained for free, indefinitely, out of passion for asphalt. We would not build a public water system and then shrug when it starts to decay because no one could be found to fund its upkeep.

Yet this is roughly how large swaths of the software industry have treated open source. We have extracted enormous value from it. We have built trillion-dollar businesses on top of it. And we have systematically underinvested in the people and organizations who maintain it.

We should consider the Python libraries shown in Figure 1, among many other important ones that are not shown, as core open digital infrastructure of data science and AI. In particular, scikit-learn is open digital infrastructure because it is:

  • Non-excludable: Anyone, anywhere, can use it freely.
  • Non-rivalrous: One person's use does not diminish another's.
  • Foundational: It sits beneath and enables a vast ecosystem of tools, research, and commercial applications, with over 200 million monthly downloads, over 1.3 million dependent repositories on GitHub, and over 28,700 dependent open source packages. Given this scale of adoption and dependency, the cost of replacing it would be staggering.

Scikit-learn is a global open patrimonial asset

Scikit-learn is both a European and a global open source success story. It was created at Inria in Paris-Saclay, France, in 2007, and the majority of the core maintainers are based in Europe, with a center of gravity in France and Germany.

In light of its widespread use, it represents one of the rare examples of genuine European leadership in the global AI software stack – comparable in this regard to ASML in semiconductors. The difference is that ASML has a market capitalization in the hundreds of billions based on a unique hardware business. Scikit-learn's maintainers, until Probabl was founded, were largely sustained by the goodwill of research institutions and a handful of forward-thinking companies willing to sponsor the project.

Scikit-learn is also, proudly, a global open source project, with core maintainers and both technical and non-technical contributors from many countries, including the USA, Canada, China, Japan, India, Australia, and others. The maps of issue and PR contributors on OSS Insight show how global the scikit-learn community truly is.

The need for stewardship to sustain critical open source software

Open source projects – be they critical libraries like scikit-learn or PyTorch, or any other project in AI or beyond – are not self-sustaining. Behind every release is a team of engineers fixing bugs, reviewing pull requests, adopting new APIs, updating documentation, and keeping pace with a field that moves extraordinarily fast. Without sustained investment, even the most widely used tools in the world quietly decay.

This is where stewardship comes in. Stewards – whether companies like Probabl, Red Hat, or Hugging Face; foundations like the Linux Foundation, the Python Software Foundation, or the PyTorch Foundation; or research institutions like Inria – are the organizations that support the sustained, long-term health of open source projects. They may employ maintainers or fund contributors. They may apply for grants, manage governance, nurture communities, and make the unglamorous investments – documentation, security audits, backward compatibility – that keep a project trustworthy and usable over years and decades.

Stewardship is not always glamorous work, and it is rarely the thing that gets celebrated. But without it, the open source ecosystem that data science and AI depend on does not hold together. The question for every critical open source project is not whether it needs a steward, but whether it has one, and whether that steward has the resources and the commitment to do the job properly.

What we've done so far as a mission-driven company

Probabl was founded with the conviction that open source tools and open digital infrastructure of such importance need a steward – one with the resources, the expertise, and, uniquely, the legal obligation to keep them running. That is what our Société à Mission status means in practice.

Since our founding, we have:

  • Invested ~€2 million cumulatively in the development, maintenance, and sustainability of scikit-learn, skore, skrub, and skops, including by employing core maintainers and contributors.
  • Helped win competitive grants supporting the development of scikit-learn: from NASA's ROSES program with Quansight Labs, and from the Essential Open Source Software for Science program by the Chan Zuckerberg Initiative and the Wellcome Trust with NumFOCUS.
  • Launched the free and open source Skolar MOOC, ensuring that world-class machine learning education remains freely accessible to everyone. Since July 2025, over 13,700 unique users have levelled up their machine learning skills with Skolar.
  • Produced over 115 tutorial videos, explaining technical concepts and best practices in machine learning with Python. All videos are freely available on YouTube.
  • Raised €18.5 million in funding – one of the largest combined pre-seed and seed rounds ever raised in Europe for a commercial open source company – to sustain this mission long-term.

Introducing Probabl’s Mission Alignment Committee

A mission written into law requires oversight. That's not a burden – it's the point. Anyone can claim to care about open source. A Société à Mission has to prove it, annually, to a committee of experts and to the French government.

Our Mission Alignment Committee is composed of five domain experts drawn from open source software, enterprise technology, and the AI ecosystem. They are empowered to review our evidence, contest our findings, access our financial information, and escalate concerns directly to our Board. They produce an annual Mission Alignment Report, which is submitted to an independent third-party auditor.

The five members of our inaugural Mission Alignment Committee are:

  • Peter Wang, Chief AI & Innovation Officer, Co-Founder, Anaconda Inc.
  • Emily Omier, Positioning consultant for open source companies and co-founder of Open Source Founders Summit
  • Mark Surman, President, Mozilla
  • Arnaud Le Hors, Senior Technical Staff Member, Open Technologies – Open Source Security and AI, IBM
  • Cailean Osborne, PhD, Head of Ecosystem Development, Probabl

I'm grateful to each of them for lending their expertise, their independence, and their credibility to this endeavor.

Statements from our Mission Alignment Committee

Peter Wang, Chief AI & Innovation Officer and Co-Founder of Anaconda Inc.:

“Machine learning has become central to every aspect of human life, and trusted, innovative open source projects like scikit-learn are a crucial part of that foundation.”

Emily Omier, Positioning consultant for open source companies and co-founder of Open Source Founders Summit:

“It's important for data science tools to remain as independent and transparent as possible; keeping them open source is the best way to do so.”

Mark Surman, President of Mozilla:

“Open source creates choice. This is no different in data science and AI. With open source tools like scikit-learn and any-llm, data scientists can train, compare, and choose the best models for their use cases. Teams can swap models without rebuilding. Enterprises can switch vendors without starting over. That chain of freedom runs from the individual data scientist all the way to the top – but only if the foundations stay open and it takes dedicated people to develop, maintain, and sustain those foundations. That's what Probabl exists to do.”

Arnaud Le Hors, Senior Technical Staff Member, Open Technologies – Open Source Security and AI, IBM:

“In an era where we see some companies pull the rug on open source communities by suddenly switching to a business license, it is great to see projects like scikit-learn that so many depend on in the data science and AI space being backed by companies like Probabl that are committed to staying on the open source path.”

Cailean Osborne, Head of Ecosystem Development at Probabl (our company representative):

“Open source is the bedrock of data science and AI. Python libraries like scikit-learn, PyTorch, and many more are used daily by millions in classrooms, labs, enterprises, and public institutions all over the world. But open source projects like these – and crucially the communities that underpin them – don't just sustain themselves. Their continued development requires dedicated stewardship and investment. Probabl was founded with this conviction, and that’s why I’m proud to be part of the team.”

What comes next

The Mission Alignment Committee will meet twice a year. It will produce an annual report. It will challenge us – and we welcome this, because that is exactly what it is there for.

The goal was never to build a compliance function. The goal was to build something that lasts – a sustainable model for stewarding open digital infrastructure for data science and machine learning that the world depends on. The Mission Alignment Committee is how we prove, year after year, that we mean it.

If you work in data science or AI, the tools that Probabl stewards are almost certainly part of your daily life. We don't ask for anything in return. But we do ask you to care – about who maintains the foundations you build on, about whether they are adequately supported, and about what happens if they are not.

