Human-centric data governance models have been on the rise as a way to unlock the value of responsible data sharing and bottom-up governance, but there have been myriad roadblocks along the way. Yet the landscape of data stewardship has matured considerably in recent years, both in the volume of stewardship efforts and in the increased interest from other players within the ecosystem in magnifying initiatives that foreground community value. One of the primary challenges of these stewardship efforts has been an inability to scale: to move beyond pilots or otherwise temporally bound structures. Scalability, particularly for data stewards, is a multi-faceted and multi-stakeholder challenge. While this play speaks to stewards themselves, sustainable scale cannot be realised without the support of other pillars in the ecosystem. There is a dearth of literature and evidenced research for stewarding entities to fall back on when considering revenue models, partnership avenues and more. Avenues for funding remain limited or unknown to many stewarding initiatives, despite a marked increase in investors’ interest in data sharing. Technical capacity, both in the communities being served and within the steward itself, also poses a major challenge.
Collective governance is strengthened most significantly by an increase in the volume of the collective, the capacity of the collective, and support for the collective. Thus, if we envision data stewardship as a means to unlock wider societal value from data while also preserving and meaningfully amplifying data rights, scalability becomes crucial. The question of scale operates not just at the level of an individual steward but also at the level of the ecosystem, where an increase in the number of stewarding initiatives across communities is also a matter of concern. Our research has found a disconnect between various stewardship efforts, despite the inherent possibilities of sharing learnings, challenges and opportunities. Without such sharing, the ecosystem of data stewardship, and consequently its benefits, will remain scattered.
Challenge g3.1
Ensuring financial sustainability beyond the pilot phase
The field of data stewardship has been dominated by pilot projects and experiments that fail to reach scale or are abandoned when funding dries up. Reliance on time-limited funding, such as grants, makes future revenues difficult to predict. To maximise the value and impact of their work, data stewards are faced with the task of managing institutional change towards cementing the culture of data stewardship. While data stewards might generate revenue from varied sources (including selling data and membership fees), the most common source of funding for stewards is grants received from institutions, be they private philanthropies or public sector actors. While a sustainable business model for a steward will rely on a healthy mix of funding sources, stewards should look to prioritise specific sources based on the stage of their lifecycle. Critical to this is drawing a distinction between earned revenue (revenue derived from the supply of data and allied services) and non-earned revenue (donations, grants and other funding sources). Typically, a steward will have to rely more on non-earned revenue sources during the initial stages of its lifecycle, and slowly shift the dependency towards earned revenue as it grows. A healthy diversity in revenue sources has been associated with the sustainability of an organisation.
This is easy enough on paper, and is generally the model followed by most businesses. However, for data stewards it is particularly challenging for a few reasons: (a) the altruistic goals of data protection and seeking public value from data are ones that people are ordinarily not inclined to pay for; (b) data collected by stewards is of value only when it reaches a certain critical mass, making it representative of the population it relates to; and (c) sharing data under conditions of privacy with strict purpose limitations is not how the digital economy has worked so far. Identifying a clear business model is therefore critical for the financial sustainability of a steward, but it is also a challenging task.
The public sector can also play a crucial role in ensuring the growth and sustainability of data stewardship initiatives. However, funding from the public sector is more often than not limited (for more, please see Play I). Similarly, tensions may arise if funding agencies and data stewards have goals and priorities that are not in sync. Funding for data management is often irregular and limited in time and scope. This, in turn, limits the potential to increase the technical capacity of data stewardship efforts through the involvement of developers or technologists, keeping most pilots suspended as small-scale, volunteer efforts. Philanthropic funding is often fixed over multi-year contracts, so there is a danger that funding will not keep pace with growing data volumes, affecting the scalability plans of data stewards.
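The distinction between earned and non-earned revenue, and the idea of a gradual shift from one to the other, can be made concrete with a little arithmetic. The sketch below is purely illustrative: the category names, the two example budgets and the use of a Herfindahl-Hirschman-style concentration index as a proxy for "diversity in revenue sources" are assumptions made for this example, not figures or methods drawn from any steward discussed in this Playbook.

```python
# Hypothetical revenue categories, following the earned / non-earned split
# described above. Real stewards will have their own chart of accounts.
EARNED = {"membership_fees", "data_services"}
NON_EARNED = {"grants", "donations"}


def earned_share(revenue: dict[str, float]) -> float:
    """Fraction of total revenue that comes from earned sources."""
    unknown = set(revenue) - EARNED - NON_EARNED
    if unknown:
        raise ValueError(f"unclassified revenue sources: {sorted(unknown)}")
    total = sum(revenue.values())
    if total <= 0:
        raise ValueError("total revenue must be positive")
    return sum(v for k, v in revenue.items() if k in EARNED) / total


def concentration(revenue: dict[str, float]) -> float:
    """Sum of squared revenue shares (a Herfindahl-Hirschman-style index).

    1.0 means a single source; lower values mean a more even spread.
    """
    total = sum(revenue.values())
    if total <= 0:
        raise ValueError("total revenue must be positive")
    return sum((v / total) ** 2 for v in revenue.values())


# Invented example budgets for two lifecycle stages (assumed figures).
pilot_stage = {"grants": 90_000, "donations": 10_000,
               "membership_fees": 0, "data_services": 0}
growth_stage = {"grants": 40_000, "donations": 10_000,
                "membership_fees": 20_000, "data_services": 30_000}

for name, budget in [("pilot", pilot_stage), ("growth", growth_stage)]:
    print(name, round(earned_share(budget), 2), round(concentration(budget), 2))
```

In this invented example the earned share rises from none of the budget at the pilot stage to half of it at the growth stage, while the concentration index falls from roughly 0.82 to roughly 0.30. A steward could track both numbers year on year as a rough check on whether it is moving away from dependence on a single funder.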
Strategy g3.1.1
Assess value delivery and product accessibility in order to identify suitable business models
Earned revenue is generated from two main sources: (a) from the community members (either through membership fees or through services provided to members); and (b) from external sources (through the provision of data and allied services). At the outset, identifying the governance structure of the organisation and the value that the steward seeks to deliver to the community is key to delineating strategies for membership or subscription pricing. Research shows that membership rather than subscription fees might offer additional benefits for some data institutions. Membership encourages active participation and is used in a way that ensures the people contributing data are the ones governing access to it. Membership models convey a sense of belonging, trust and community based on shared values and interests. Members may be expected to contribute in both monetary and non-monetary ways, for example with their energy, expertise and time. Subscriptions, on the other hand, are a simpler, transactional exchange of services for a fee. Crossref, a membership organisation that assigns and maintains identifiers for research outputs, supports membership because it emphasises the notion that data stewardship is a collective endeavour.
However, not all members will always have the time and energy to participate in the stewarding organisation in the same way. Based on its driving aims, a steward could have a decision-making system that involves each and every one of its members, or it could opt for a model in which the governance structure delegates the management burden and entrusts the steering responsibility to properly chosen people or organisations. Stewards can also opt for a hybrid model and stagger membership fees accordingly, thus maximising revenue from community members.
Equally important are the incentives for members to participate in the activities of the steward. Most participatory stewarding organisations are structured as community-owned, so it is important for the steward to determine how returns from its activities are distributed amongst its members in a way that incentivises participation, but not at the cost of overall membership figures. Research into sustainable revenue-allocation schemes for data cooperatives shows that a “Robin Hood” model works best, where the right amount of additional incentive is provided for privacy sensitivity. The research cautions that such an incentive cannot be too lopsided, as it can be taxing on one set of members, possibly fracturing the membership group.
In terms of external revenue sources, data stewards may seek to change how they deliver value by restructuring their business models to make their products and services more accessible. For example, the Idaho Health Data Exchange’s partnership with the Amadeus platform proved successful in enhancing value for patients. The partnership provided dedicated resources for the technical onboarding of new data providers and their interfaces. This increased the overall value for Idaho Health Data Exchange participants while creating system-wide improvements in the value of patient care. This example also points to another way in which stewards can generate value outside of their core business. While stewards can sell the data they collect, they can also look into partnerships that provide in-kind support, such as technical infrastructure, administrative support, or access to a particular set of actors.
Assessing value delivery and product accessibility is not a one-time task. The results of the assessment are liable to change and evolve with the stage of the steward, so it is important for stewards to undertake this exercise periodically. Our research and conversations with CSOs have identified key questions and considerations for data stewards based on where they are in their lifecycle.
Strategy g3.1.2
Funder flexibility to understand long-term stewardship goals
For funders, the priority must be to provide sustainable funding to support the infrastructure required for long-term stewardship of data. Supporting bottom-up approaches that are already in place with grants is recommended over developing new data governance programmes. On a practical level, funders should offer multi-year flexible funding and streamlined applications and reporting, grounded in a commitment to build relationships based on feedback, transparency and mutual learning. For example, Co-impact, a philanthropic collaborative, recommends trust-based philanthropy in everyday practice to address social issues.
Grant seekers are often encouraged to customise their proposals to fit funder priorities, which may have been developed based on inadequate consultation with the target audience. This can create tension between the functioning of a data steward and its intended goal. Giving grant seekers the space to step back and proactively articulate their own strategy and vision can lead to greater sustained change and success over time. Pooled funding models encourage collaboration among funders. This can help to reduce the transaction costs associated with multiple processes for managing, verifying and sourcing, and it reduces the risk of duplicating efforts. Network-building efforts from funders such as collaborative philanthropies can be useful in strengthening the ecosystem as well, for example by arranging convenings of funders (and/or grantees) to expand their knowledge of ongoing work.
Challenge g3.2
Identifying incentives and community-building
An important factor in a steward’s ability to scale up is its ability to continually expand its user base. However, this has proven to be a major stumbling block for many stewards. Our research and conversations with ecosystem stakeholders indicate that this might have to do with the incentives for being part of a data stewarding initiative. The primary goal of most data stewarding initiatives, if not all, is to provide community members with more control and agency over their data, including in matters of who it is shared with and what uses it is put to. However, this is unlikely to be a sufficient incentive for most people. While there are groups of people concerned about their privacy and the impact on their rights of misuse or widespread sharing of their data, this remains a relatively restricted group of people. How pervasive this problem is depends on the context and the community in question.
The problem of scaling of data stewardship, particularly for non-steward stakeholders in the ecosystem, is not restricted to scaling of one particular initiative alone, but scaling of the concept of stewardship as a whole. Despite the growth of data stewardship in discourse on participatory data governance, stewardship remains a rather esoteric concept. Even within the data governance community, many people and organisations are entirely unaware of the concept. And in some cases, organisations that would be categorised as data stewards were unaware of the literature around this concept, or were unsure of the taxonomy to be used in finding adequate resources.
Strategy g3.2.1
Identifying incentives beyond agency over data
As mentioned above, agency over one’s data is often not incentive enough for people to join a stewarding initiative. Marginalised or poorer communities from the Global South, for example, typically either do not know about or do not prioritise their data rights. For example, a research effort undertaken by Aapti Institute and the Open Data Institute for the Global Partnership on AI involved co-designing data trusts for climate action. This included the design of three data trusts: one for cyclists in London, one for smallholder farmers in India, and one for climate migrants in Peru. While there was a clear indication of demand for a bottom-up data trust from cyclists in London, this demand was missing in the case of India. The farmers and the CSOs working with farmers that we spoke to all articulated that agency and control over data was quite low on the list of priorities for farmers, and that unless we could show incentives in terms of better financial access or access to wider markets, farmers would be unlikely to sign up for a data trust.
This issue of identifying incentives is especially critical at the stage when stewards are looking to expand their user base. Undoubtedly, there are likely to be data-minded community members who have not joined a stewarding initiative simply because they were unaware of it. In such cases, the usual strategies of widening communication channels and advertising the initiative to the target audience will be successful. However, in order to reach a wider audience and gain the increase in membership base that is required to scale, it is imperative that stewards identify incentives beyond simply agency and control over data. These incentives will vary based on the community the steward is looking to serve and the nature of the data it collects. However, given that the data the steward collects can be put to various uses, once stewards have built a strong foundational dataset, they can use the services and partnerships this dataset unlocks to provide varied incentives to attract new members. Aapti’s recent work as part of the 17 Rooms project involved assessing the value of adding a data layer to an existing agricultural cooperative. As part of this, we identified the specific benefits that would accrue to the cooperative members from adding the data layer. These included better access to credit by building a digital identity for farmers, generating data on group funds to help secure more credit, and possible improvements in yield through data-driven advice on better farming techniques. Women farmers Aapti spoke to noted that these were much more attractive incentives for them.
It should also be noted that the incentives need not be linked solely to the data collection effort. Abalobi, a steward for small-scale fisheries in South Africa, for example, has helped make visible the labour of women in the fishing value chain and has been able to realise actual value for them. This is only one in a range of benefits that Abalobi can point to as a result of its data stewarding efforts, including strengthening the community, capacity building, and sustainable fishing.
Strategy g3.2.2
Identifying and communicating challenges, enablers and strategies
Scaling of data stewardship at the ecosystem level relies on proper communication of learnings and principles from existing initiatives, and on reporting the challenges these initiatives have faced and the enablers they have had, given their socio-economic and political contexts. In our engagement with data stewardship over the past three years, we have come across numerous steward or steward-like initiatives from across the globe, as well as various other organisations in the stewardship ecosystem, that would benefit from the work and learnings of other organisations in the ecosystem. However, in many cases, there was a lack of awareness of the other organisations, or even a dearth of literature around these efforts. As an anecdotal example, in addition to Abalobi referenced above, we spoke to two other stewards or steward-like initiatives operating in the small fisheries space, and came across three more such initiatives, all from different parts of the world. Even where these initiatives were aware of the work the others were doing, which was not always the case, there was a dearth of resources that an initiative could access to learn from the lessons of the others.
The burden of creating these resources however must not be placed entirely on stewards themselves. Civil society organisations can play a major role in identifying the key learnings from these stewards and distilling them into resources that can be made accessible widely. An example of this is the paper that Aapti Institute wrote on the lessons that could be learned from the various stewards for small scale fisheries, and some of these learnings were applicable not just to fisheries, but to stewards in other sectors as well. Indeed, even this Playbook serves as a resource to provide stakeholders in the stewarding ecosystem with knowledge about other initiatives that they could possibly benefit from.
A steward’s community focus might limit the number of members it can reach. In many cases, attempting to expand beyond that specific community might not be the most feasible option from a business perspective. However, the model itself can be replicated in another community or another context, and resources that speak to the challenges faced by the steward, as well as the strategies that worked for it, can be critical in helping new initiatives thrive. Additionally, such resources can inform governments and funders as they set their funding priorities and identify possible avenues for funding.
Challenge g3.3
Limited technical capacity
Collecting data, implementing stringent systems and protocols to ensure data security, analysing datasets for insights and building applications to collect data are all complex tasks that require a high degree of technical capacity. The 2022 MongoDB Report on data and innovation revealed that 73% of respondents agreed that working with data is the hardest part of building and developing applications. Addressing these challenges is critical to ensuring a data steward’s ability to scale successfully: not only is technical capacity a core component of how the steward functions, it can also have a significant impact on internal efficiencies. The challenges faced in technical pathways for operationalising community participation in data stewardship are dealt with in Play 3. This Play will focus instead on strategies stewards can adopt to address the challenge of limited technical capacity.
Strategy g3.3.1
Collaboration between data stewards and tech developers
The ability to anticipate and manage risk in design and development, particularly for systems that are complex because they handle sensitive data, can prevent ‘techlash’. Stewards can look to partner with tech developers who want to avoid the pitfalls of the current digital economy and are instead focused on creating new alternatives designed to foster a more participatory future. The UK Behavioral Insights team, for example, has designed two platforms: Your Priorities, a citizen engagement platform connecting citizens with the government, and Applied, a recruitment platform with a focus on fostering diversity.
Such organisations can anticipate user needs more accurately given their wealth of experience. This helps build applications that are easy and intuitive to use, thereby improving user retention and helping to acquire new users. However, stewards don’t always work with communities that are digitally literate. It is therefore key that such partnerships ensure developers focus not only on safe and secure design but also on keeping the interface accessible, developing applications that work equally well for groups that lack digital skills in order to allow for equal access. Play 3 looks at how co-designing strategies can address such challenges.
Strategy g3.3.2
Easier data discovery, privacy enhancing technologies and trusted research environments as drivers of scalability
Enabling easier discovery of data and of its potential applications can help a data steward scale. To address the challenge of stale data, stewards can start with a core discovery platform. This can empower data stewards to uncover context for data usage, reduce the time needed for impact analysis and derive meaningful insights from data. Further, adopting Privacy Enhancing Technologies (PETs) can offer the steward the ability to accelerate secure data collaboration and exchange, build trust and maximise data value without compromising privacy. Given that privacy is at the forefront of a data steward’s activities, PETs can enable a host of data collaborations that make data more valuable to internal teams as well as external partners. However, as the Global Partnership on AI notes, there are challenges to data usability commonly faced when working with PETs, and it is incumbent on other stakeholders in the ecosystem to address these in order to give stewards, who are typically smaller, the ability to compete with larger organisations.
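To give a sense of what a PET looks like in practice, and of the usability trade-off mentioned above, the sketch below illustrates one widely used technique, differential privacy, by adding Laplace noise to a simple counting query. It is a minimal teaching example under stated assumptions: the record fields, the `private_count` helper and the choice of epsilon are hypothetical, and it is not the approach of any steward or report cited here, nor a production-grade implementation (real deployments need privacy-budget accounting and hardened noise generation).

```python
import random
from typing import Callable, Iterable


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two
    independent exponential variables with mean `scale`."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records: Iterable[dict],
                  predicate: Callable[[dict], bool],
                  epsilon: float,
                  rng: random.Random) -> float:
    """Noisy count of records matching `predicate`.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon gives
    epsilon-differential privacy for this single query.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical member records held by a steward (invented data).
members = [{"id": i, "has_credit_access": i % 3 == 0} for i in range(300)]

rng = random.Random(42)  # seeded only so the illustration is repeatable
noisy = private_count(members, lambda m: m["has_credit_access"],
                      epsilon=0.5, rng=rng)
print(f"noisy count shared with a partner: {noisy:.1f}")
```

A smaller epsilon gives stronger privacy but noisier answers, which is one concrete form of the usability challenge: a steward with a few hundred members loses proportionally more accuracy to the same noise than an organisation with millions of records.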