[!IMPORTANT] On June 26, 2024, the Linux Foundation announced the merger of its financial services umbrella, the Fintech Open Source Foundation (FINOS), with OS-Climate, an open source community dedicated to building data technologies, modeling, and analytic tools that will drive global capital flows into climate change mitigation and resilience. OS-Climate projects are in the process of transitioning to the FINOS governance framework. Read more at finos.org/press/finos-join-forces-os-open-source-climate-sustainability-esg
OS-Climate Data Commons
OS-Climate Data Commons is a unified, open Multimodal Data Processing platform used by OS-Climate members to collect, normalize and integrate climate and ESG data from public and private sources, in support of:
- Corporates in efficiently disclosing and managing their own climate and ESG data, including correcting, reporting and confirming the information in an auditable and secure manner.
- Data scientists in collaboratively solving data collection, cleaning and normalization issues, based on shared modeling standards, tooling and community development following a data-pipeline-as-code approach.
- Decision makers such as investors, financial institutions, and regulators in integrating new or existing scenario-based predictive analytics with an open repository of trustworthy climate data.
Overview
The Data Commons platform aims to bridge climate-related data gaps across three dimensions:
- Data Availability: The platform supports data availability through data democratization, providing self-service data infrastructure as a platform. A self-service platform is fundamental to a successful data mesh architecture, in which existing data sources are federated and can easily be made discoverable and shareable across an organization and ecosystem through open tools and readily available infrastructure supporting data creation, storage, transformation and distribution.
- Data Comparability: The platform supports data comparability through domain-oriented, decentralized data ownership and architecture, i.e. data is treated like a product. The goal is to stop the proliferation of data puddles and to “connect” the data with proper referential and relevant industry identifiers, so that collections of data are aligned with business goals.
- Data Reliability: The platform supports data reliability through federated data access, data lifecycle management, security and compliance. This supports a data-as-code approach in which the data pipeline code, the data itself and the data schema are versioned to provide transparency and reproducibility (time machine), while enforcing the authentication and authorization required for data access management, with consistent policies across the platform and throughout the data lineage.
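As a rough illustration of the data-as-code idea described under Data Reliability, the minimal Python sketch below versions the data, its schema and the pipeline revision together, and checks authorization before a read. It is not the Data Commons implementation: `VersionedDataset`, `commit`, `read`, the field names and the sample values are all hypothetical and exist only for this example.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class VersionedDataset:
    """Toy in-memory dataset that keeps every committed snapshot (hypothetical)."""

    name: str
    # Maps a version id to the serialized snapshot it identifies.
    snapshots: dict = field(default_factory=dict)

    def commit(self, schema: dict, rows: list, pipeline_rev: str) -> str:
        """Store schema, rows and pipeline revision as one snapshot.

        The version id is derived from the snapshot content, so identical
        inputs always map to the same id.
        """
        payload = json.dumps(
            {"schema": schema, "rows": rows, "pipeline_rev": pipeline_rev},
            sort_keys=True,
        )
        version_id = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
        self.snapshots[version_id] = payload
        return version_id

    def read(self, version_id: str, user: str, allowed_users: set) -> dict:
        """Return the snapshot for a version id, after an authorization check."""
        if user not in allowed_users:
            raise PermissionError(f"{user} may not read {self.name}")
        return json.loads(self.snapshots[version_id])


# Assumed sample data, for illustration only.
dataset = VersionedDataset("emissions")
schema = {"company_id": "str", "co2_tonnes": "float"}
v1 = dataset.commit(schema, [["ACME", 120.0]], pipeline_rev="rev-a")
v2 = dataset.commit(schema, [["ACME", 118.5]], pipeline_rev="rev-b")

# "Time machine": the earlier snapshot is still readable by its version id.
earlier = dataset.read(v1, user="analyst", allowed_users={"analyst"})
print(earlier["rows"], earlier["pipeline_rev"])
```

Deriving the version id from the content keeps the sketch deterministic and makes it easy to see that a change to the data, the schema or the pipeline revision yields a new version. A real platform would rely on its own storage and access-control layers instead of an in-memory dictionary.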
For more information on these dimensions and how Data Commons fits into the picture, good introductions include the official Data Commons page on the OS-Climate website and the video recording of the Data Commons Platform Overview at COP26 in Glasgow. Detailed platform documentation maintained by our community is available in this repository and accessible through the links below.
Architecture
Data Commons Architecture Blueprint