Thursday, June 9th, 2022
In early 2020, we submitted a proposal to the Chan Zuckerberg Initiative’s (CZI) Essential Open Source Software for Science (EOSS) grant program. The goal of the proposal was to accelerate progress on key parts of our roadmap by providing dedicated support to augment regular volunteer contributions, enabling work on issues that are often too large to tackle through volunteer-only contributions. The proposal was funded in late 2020, and our team spent the last year and a half working on a handful of new features and complex internal refactors. Having just closed out the grant, we use this blog post to look at what we proposed and how the development turned out.
More than any single feature addition, we felt like a key need for the Xarray project was a concerted focus on understanding and supporting our user community. To this end, we took on three tasks:
The second focus area in our proposal was a long-awaited reworking of Xarray’s Indexes in order to better support complex index objects (e.g. MultiIndex or KDTreeIndex). Indexes in Xarray support coordinate-based indexing, slicing, and alignment. Benoît Bovy led this work, which culminated in a very large Pull Request refactoring the internals of Xarray objects to explicitly include indexes as part of the data model. With the internal refactor in place, it is now possible to develop custom indexes for Xarray objects (see GH Project #1 for more detail on the status of this effort). Expect more on this new functionality in another blog post.
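To make the three roles of an index concrete, here is a toy, dependency-free sketch. It is not Xarray's implementation or its custom index API: the `ToyIndex` class and its method names are hypothetical, and it assumes sorted, unique labels. It only illustrates how an index translates coordinate labels into integer positions for indexing, slicing, and alignment.

```python
from bisect import bisect_left, bisect_right


class ToyIndex:
    """Hypothetical stand-in for an index over sorted, unique coordinate labels."""

    def __init__(self, labels):
        self.labels = list(labels)
        if self.labels != sorted(set(self.labels)):
            raise ValueError("labels must be sorted and unique")
        # Label -> integer position lookup for exact matches.
        self._pos = {label: i for i, label in enumerate(self.labels)}

    def sel(self, label):
        """Coordinate-based indexing: translate a label to a position."""
        return self._pos[label]

    def sel_slice(self, start, stop):
        """Label-based slicing, inclusive of both endpoints."""
        return slice(bisect_left(self.labels, start), bisect_right(self.labels, stop))

    def align(self, other):
        """Inner-join alignment: shared labels and their positions in each index."""
        shared = [label for label in self.labels if label in other._pos]
        left = [self.sel(label) for label in shared]
        right = [other.sel(label) for label in shared]
        return shared, left, right


lat = ToyIndex([10.0, 20.0, 30.0, 40.0])
values = [1.5, 2.5, 3.5, 4.5]

point = values[lat.sel(20.0)]               # value at latitude 20.0
window = values[lat.sel_slice(15.0, 30.0)]  # values with 15.0 <= latitude <= 30.0
shared, left, right = lat.align(ToyIndex([20.0, 40.0, 60.0]))
```

A dictionary lookup is only one possible strategy. A tree-based index such as the KDTreeIndex mentioned above would answer the same kinds of queries with a different data structure, which is the flexibility the refactor aims to provide.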
Finally, the third focus area in our proposal was a refactor of Xarray’s storage backends to enable reading from and writing to a variety of storage formats. In this work, we wanted to standardize the backend API and support the development and integration of third-party backends through a plugin interface (i.e. entrypoints). Aureliana Barghini of B-Open led this work which, starting with Xarray v0.18, added read-only support for custom backends. Since then, multiple third-party libraries have made use of this new functionality (e.g. rioxarray, cfgrib). For now, we’ve held off on adding entrypoint support for write operations but hope to see this progress in the future. GH Project #3 provides a current view on the status of this topic. If you are interested in writing a custom backend, check out this step-by-step guide.
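As a rough illustration of how entry-point-based plugins are discovered, the sketch below lists whatever backend plugins are installed in the current environment using only the standard library. It assumes the entry point group is named `xarray.backends`; the `discover_backends` helper is hypothetical and is not part of Xarray.

```python
from importlib.metadata import entry_points

# Assumption: backend plugins register under this entry point group.
GROUP = "xarray.backends"


def discover_backends(group=GROUP):
    """Return {plugin name: 'module:attribute' target} for installed plugins."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points can be filtered with select().
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a dict of group -> entries.
        selected = eps.get(group, [])
    return {ep.name: ep.value for ep in selected}


backends = discover_backends()
for name, target in sorted(backends.items()):
    print(f"{name}: {target}")
```

The output depends on which packages are installed, so it may be empty. Each plugin name is what a user would typically pass as the `engine` argument when opening a dataset.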
Overall, our first CZI EOSS grant was an overwhelming success. Each of the core areas of our proposal saw significant development progress. We did find, however, that we were overly optimistic about the rate of development. We ended up needing a 1-year extension to complete the work on the proposal and left a few development tasks in partially completed states. Going forward, we’ll carry these lessons into future proposals and grant-funded development work.
Credits: Joe Hamman (@jhamman) led the proposal and grant coordination efforts. Stephan Hoyer (@shoyer), Ryan Abernathey (@rabernat) and Deepak Cherian (@dcherian) contributed to the proposal and provided input to the various development efforts. Benoît Bovy (@benbovy), Anderson Banihirwe (@andersy005), Alessandro Amici (@alexamici) and Aureliana Barghini (@aurghs) contributed to the proposal and made significant contributions to Xarray under this grant.