A review of 80 assessment tools measuring water security

Scholars and practitioners have been working on methodologies to measure water security at a variety of scales and with varying foci. In this paper, we critically examine the landscape of water security metrics, discussing the progress and gaps of this rich scholarship. We reviewed a total of 107 publications, consisting of 17 conceptual papers and 90 methodological papers that propose 80 metrics to measure water security, and observed that there are two dominant research clusters in this field: experiential scale‐based metrics and resource‐based metrics. The former mainly focus on measuring the water experiences of households and their impact on human well‐being, while the majority of the latter assess freshwater availability or water resources security. We compare their approaches and the arguments used to develop them. We posit that the more local the scale and the more specific the water domain, the more meaningful the results that the metrics can provide. Acknowledging the interrelationship between different water domains (e.g., water resources and water hazards) is important, but their aggregation for measurement may be problematic. We offer our views on future work in this field relating to topics beyond water, the need to conduct validation tests, and collaboration among academics and with other stakeholders.


| INTRODUCTION
Water security scholarship emerged in the early 2000s and has become a growing body of work, with diverse conceptualizations (for compilations of definitions of water security, see Zeitoun et al., 2016; Jepson, Wutich, et al., 2017). To evaluate water security status in a city, country, or other unit of analysis, scholars and practitioners have been operationalizing the concept and developing measurement tools. Despite these efforts, many metrics tend to be limited in both their application and their influence on the policymaking process (Howlett & Cuenca, 2017). Before the widespread use of water security terminology, tools proposed to measure water conditions, primarily the hydrology of freshwater resources, were numerous (Plummer, de Loë, & Armitage, 2012), varying in spatial and temporal resolution and ranging from single to multiple dimensions. For example, Dunn and Bakker (2009) observed that in

| METHODS

| Data collection
We used Web of Science with the search string "water *security" AND "tool*," and other similar terms, such as indicators and metrics, in the title, keywords, and abstract of articles. The search string was constructed over numerous iterations to capture as many publications on water security metrics as possible. For example, "water *security" AND "metric*" yielded only 42 publications, while "water *security" AND "index" yielded 194 publications. We also checked the references of some key papers to ensure that we had included all relevant publications. The iterative search yielded 264 articles. This paper focuses on peer-reviewed publications, but it builds on reports from the gray literature to capture tools developed by practitioners (e.g., ADB, 2013; GWP, 2014).
While trying to be as comprehensive as possible but keeping the number of articles manageable, we set the following criteria for selecting articles: (a) their full texts are available in English; (b) they explicitly employ water security as a concept (it is not just mentioned in passing); and (c) they propose or test specific analytical tools for evaluating water security. Following Albrecht, Crootof, and Scott (2018), who conducted a systematic review of tools measuring the nexus of food, energy, and water (FEW) security, we categorize papers that do not explicitly propose or test analytical tools as "conceptual" (a list of conceptual papers is presented in Table A1). This category includes papers reviewing the progress of research at a certain scale (e.g., Jepson, Wutich, et al., 2017, reviewing household water security metrics) or in a particular location (e.g., Sun, Staddon, & Chen, 2016, reviewing water security metrics in China). Of the 264 articles identified, 90 are methodological and 17 are conceptual. We excluded the remaining 157 articles. From the 90 methodological papers, 80 distinct assessment tools were identified (Tables 1 and 2).
While we focus on methodological papers, all 17 conceptual papers were reviewed and informed our analysis. They highlighted research gaps and needs and therefore were useful when we assessed the extent to which methodological papers meet these needs. In relation to this, we derived normative attributes or desirable qualities of an analytical approach in water security measurement from these conceptual papers.

| Bibliometric analysis
Our main approach for reviewing progress in water security metrics is qualitative analysis, the findings of which are discussed in Section 3. However, as a first step, we also conducted a bibliometric analysis to help us identify research clusters. We used VOSviewer, a free software tool specifically designed for visualizing and exploring the landscape of a scientific field using bibliometric data (van Eck & Waltman, 2020). It enables us to "construct networks of scientific publications, researchers, research organizations, countries, keywords, or terms. Items in these networks can be connected by co-authorship, co-occurrence, citation, bibliographic coupling, or co-citation links" (van Eck & Waltman, 2020, p. 3). For this analysis, we include both conceptual and methodological papers because the former inform the latter in developing specific tools to measure water security.
Table notes: (a) The format "scale1,scale2" indicates that water security was measured at the first level (scale 1) and then aggregated at the second level (scale 2). (b) Some studies developed their framework from the DPSIR framework (D = Driver, P = Pressure, S = State, I = Impact, R = Response). (c) Acronyms: AHP (analytical hierarchy process), SWAT (Soil and Water Assessment Tool), and PCA (principal component analysis).
We first imported the bibliometric information of articles, such as authors, year, and journal title, from Web of Science into VOSviewer, and then explored the landscape of the scholarship, to identify influential publications and the relationships within the body of literature (Zhao & Strotmann, 2015). After exploring some visualizations, we concluded that bibliographic coupling and co-occurrences of keywords maps provided the most useful information for the purpose of our research. Figure 1 shows six research clusters in the data set from a bibliographic coupling analysis that uses citation analysis to establish a similarity relationship between documents. The clusters are denoted by different colors and are a mix of small and large clusters.
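To illustrate what a bibliographic coupling link captures, the following minimal sketch counts the references shared by each pair of documents. The document names and reference lists are hypothetical, and the sketch does not reproduce VOSviewer's own normalization, clustering, or layout steps.

```python
from itertools import combinations


def coupling_strength(references):
    """Return {(doc_a, doc_b): number of cited references the two documents share}."""
    strengths = {}
    for doc_a, doc_b in combinations(sorted(references), 2):
        shared = len(set(references[doc_a]) & set(references[doc_b]))
        if shared:  # keep only pairs that are actually coupled
            strengths[(doc_a, doc_b)] = shared
    return strengths


# Hypothetical reference lists, for illustration only.
refs = {
    "paper_A": ["r1", "r2", "r3"],
    "paper_B": ["r2", "r3", "r4"],
    "paper_C": ["r5"],
}
links = coupling_strength(refs)
print(links)
```

In this toy data set, only paper_A and paper_B are coupled, because they are the only pair with cited references in common.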
The first cluster (red) represents publications that focus on assessing water resources, with a majority of case studies focused on China. The second cluster (blue) groups papers that measure experiences of households in accessing water. The third cluster (green) comprises research measuring water security at the urban level. The papers in the fourth cluster (yellow) are not so distinct in terms of topic but are grouped because they focus on non-China case studies, including Africa, the Middle East, Indonesia, and South America. The fifth cluster (purple) groups publications that measure "blue" and "green" water to assess water security using a water footprint approach. Blue water is the water in freshwater lakes, rivers, and aquifers, whereas green water refers to the portion of precipitation that infiltrates to become soil moisture (Rodrigues, Gupta, & Mendiondo, 2014). The last and smallest cluster (light blue) represents papers, including Nilsson, Destouni, et al. (2013), that propose metrics to measure food and water security in the Arctic.
Those six clusters are not completely distinct from each other and provide limited insights; therefore, for a more substantial analysis, we used a co-occurrences of keywords map to explore whether more informative research clusters could be identified. From Figure 2, we can discern two large clusters of publications using similar keywords. The first cluster (red), with water security at its center, consists of publications that use similar terms, such as resources, governance, and model. The second cluster (blue) groups papers that have keywords such as emotional distress, water insecurity, and community. We combine the green cluster with the red cluster because most papers assessing water resources use vulnerability as a framework and often associate water security with water scarcity.
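The idea behind a keyword co-occurrence map can be sketched as a simple pair count: two keywords are linked each time they appear together in the same publication. The keyword lists below are hypothetical examples built from terms mentioned above, and the sketch covers only the counting step, not the mapping or clustering performed by VOSviewer.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(keyword_lists):
    """Count how many publications mention each unordered keyword pair."""
    counts = Counter()
    for keywords in keyword_lists:
        unique = sorted({k.lower() for k in keywords})  # ignore case and duplicates
        counts.update(combinations(unique, 2))
    return counts


# Hypothetical keyword lists, one per publication.
papers = [
    ["water security", "governance", "model"],
    ["water security", "governance"],
    ["water insecurity", "emotional distress", "community"],
]
cooc = keyword_cooccurrence(papers)
print(cooc.most_common(1))
```

Pairs that never co-occur, such as "community" and "water security" here, have a count of zero, which is why the two groups of terms would fall into separate clusters.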
The co-occurrences of keywords map corresponds nicely with the previous bibliographic coupling map, with experiential scale-based metrics (blue cluster in Figure 2) being distinct in the scholarship (Cluster 2 in Figure 1), while the resource-based metrics (red and green clusters in Figure 2) represent a combination of the rest of the publications (Clusters 1, 3, 4, 5, and 6 in Figure 1).
This finding also triangulates with our qualitative review, which identified two dominant research clusters in water security metrics scholarship: the experiential scale-based metrics and the resource-based metrics. This classification will be used for further analysis in this paper.
The next sections present our observations upon reviewing methodological papers based on spatial scale, water domains, and development techniques. In Section 4, we summarize the different characteristics of dominant research clusters and discuss the substantial arguments behind their formulation followed by our views on future work. The final section concludes our arguments.

| THE LANDSCAPE OF WATER SECURITY METRICS
Focusing on 90 methodological papers, this section assesses the progress of water security metrics based on three variables: (a) the scale of analysis, (b) the water domains in which the tools were developed, and (c) the methods used to measure water security, including their high-level development techniques. This qualitative review corroborates our bibliometric analysis that two research clusters dominate the literature. The first relates to tools that measure experiences of households (Table 1), and the second relates to tools measuring water resources security at other levels, such as city and country (Table 2).

| Spatial scales
Three types of scale units are used as the basis of water security assessment: political boundaries (from municipality to country), hydrological boundaries (basin), and the social construction of scale (urban-rural, community, household). In this review, we classify the spatial scale of the papers based on authors' self-identification in those scale types.
Earlier works on water security measurement were developed at the global level, such as the analysis by Vörösmarty et al. (2010), who considered human water security and biodiversity perspectives simultaneously. The significance of their work is confirmed by the bibliographic coupling in Figure 1 because it triggered further research in this field. The scholarship then grew to cover other macro levels (e.g., ADB, 2013; Dumont, Williams, Keller, Voß, & Tattari, 2012). For global and regional assessments, data were usually collected from countries and then aggregated to create a global or regional picture of water security. While this information is useful to see which regions or countries are facing water insecurity, its uptake by local stakeholders is often limited. Such aggregated assessments may also mask striking variations happening at a much lower scale (Molle & Mollinga, 2003).
At the other end of the spectrum is individual or household water (in)security assessment. We observe that studies in this cluster developed differently compared with other nonhousehold water metrics, with one defining feature being the use of an experiential scale (not indicators) to measure how water security is felt by individuals or household representatives. Hence, household surveys are essential for this measurement. Jepson, Wutich, et al. (2017) have critically reviewed studies in this cluster, which were initially developed by anthropologists, such as Wutich and Ragsdale (2008). One exception to the majority of household-level works, which used an experiential scale, is the study by Hailu, Tolossa, and Alemu (2020). Drawing on Sullivan's (2002) Water Poverty Index (WPI), they developed a Household Water Security Index based on water resources availability, access, utilization, capacity, environment, and water institution indicators.
In between the global and individual scales are metrics capturing the dynamics of water security at the scale of region (within or beyond countries), basin, urban (and rural), and municipality, with urban (n = 17, 20%) and basin (n = 17, 20%) scales being the most popular ones. We discuss our main observations for each scale below.
A total of eight (9%) articles that we studied propose metrics to measure water security at the regional level. They refer to either a small region within a country, such as China's One Belt and One Road Region and Gansu Province (Li, Su, & Wei, 2019), or a number of countries, such as the Arctic, Central Asia (Wang et al., 2020), and the Middle East and North Africa (MENA) region (Scardigno et al., 2017). Data were typically collected per unit of analysis: at the country level for multicountry analysis or at the province or state level if the region is within a country.
Studies using the basin scale as a unit of analysis, ranging from subbasin (e.g., Ma, Liu, & Chen, 2010; Zende, Patil, & Patil, 2018) to multiple catchments (e.g., Nie et al., 2018), have been growing rapidly. This is partly because such a unit enables modeling exercises to assess freshwater availability. Authors in this group regard the basin as the most appropriate scale for giving useful information to local stakeholders.
In terms of a socially constructed scale, water security assessment at the urban level has attracted much interest since 2015. In their conceptual paper, Romero-Lankao and Gnatz (2016) call for more research on urban water security, given that more than 50% of the world's population resides in urban areas. Extreme events, such as the prolonged drought in Sao Paulo in 2013-2015 (Empinotti, Budds, & Aversa, 2019), can compromise the economic outlook of the country and present a great challenge to achieving water security in urban areas. While urban areas range from small towns to megacities, most papers in this cluster use large cities as case studies, such as Addis Ababa in Ethiopia (Assefa et al., 2019), Kolkata in India (Mukherjee et al., 2020), Islamabad in Pakistan (Khan et al., 2020), and Singapore and Hong Kong (Jensen & Wu, 2018). Still on the urbanicity spectrum, rural areas have been increasingly investigated by researchers as these settings pose different water challenges (Basu, Hoshino, & Hashimoto, 2016; Dickson, Schuster-Wallace, & Newton, 2016). We classify the urban-rural spectrum as a "social" scale because the definition of what constitutes an urban area is debatable. For example, Brenner and Schmid (2014) argue that there is no specific rule for drawing the boundaries of an urban, or urbanizing, area. Hence, countries differ in their classification.
Another social scale is community. A research consortium in Canada (Norman et al., 2013) developed the Water Security Status Indicator as a method to measure water security at a community scale, integrating variables pertaining to water quality and quantity and their relationship to human health and aquatic ecosystems. They also developed other water metrics, such as water security risk assessment (Dunn et al., 2012). They argue that measuring water security at such a scale provides useful information for end users of the metrics, including the community itself.
Nonetheless, there has been little development in this cluster, which may have been translated into other local scales, such as municipality.
With regard to location, China is the most popular case study (n = 25, 29%). This is partly because water security has become a rising priority in Chinese national policy (China Power, 2020). Sun et al. (2016), reviewing water security metrics in China, found 160 publications, of which 91% were written in Chinese. This suggests that the country has vibrant water security-related research. Some of our findings echo those of Sun et al. (2016), such as the use of the Driver-Pressure-State-Impact-Response (DPSIR) framework in formulating metrics.
Looking at the temporal dimension, resource-based water security tools evaluate the past and present (if data are available) status of water security using secondary data, thus measuring progress over time. Some of these studies extend the timeline to include predictions of water security at a given time in the future through modeling exercises (e.g., Giri, Arbab, & Lathrop, 2018). This means they are able to provide recommendations on how to improve water security, such as by increasing supply and adaptive capacity.
On the other hand, experiential scale-based tools measure the present status of water security, with a recall period ranging from 1 week to 1 month (Jepson, Wutich, et al., 2017) or up to 4 months (Tomaz, Jepson, & de Oliveira Santos, 2020). Given this snapshot picture, researchers suggest conducting multiple surveys in different seasons or conducting a follow-up survey if the effectiveness of particular interventions is part of the investigation.

| Water domains
The diverse conceptualizations of water security have led to various operationalizations of the term, that is, what water domains count under the terminology. There are three main water domains being measured by the tools in the literature: (a) domestic water and human well-being, (b) freshwater availability, and (c) water-related hazards (notably flooding).
FIGURE 1 A network map of bibliographic coupling of documents, which shows six research clusters in the field of water security assessment tools. The size of the label and the circle is determined by the weight of the item or citation links. The distance between two items approximately indicates the relatedness of the items
First, researchers developing experiential scale-based metrics focus on the relationship between domestic water and human well-being. They measure water supply and hygiene behavior of households via questions such as the following: "In the last 4 weeks, how frequently did you or anyone in your household worry you would not have enough water for all of your household needs?" and "In the last 4 weeks, how frequently have you or anyone in your household had to go without washing hands after dirty activities?" (Young, Boateng, et al., 2019, p. 8; Young, Collins, et al., 2019). This directly links to Sustainable Development Goal (SDG) target 6.1, which is monitored through the proportion of the population using safely managed drinking water services. The experiential scale is yet to be adopted by the monitoring surveys (JMP, 2018) that gather data on physical access to water supplies. Some household water insecurity studies (Aihara, Shrestha, Kazama, & Nishida, 2015; Stevenson, Ambelu, Caruso, Tesfaye, & Freeman, 2016) investigate the connection between water insecurity and wider well-being (e.g., psychological stress) felt by the households.
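As a rough illustration of how frequency-based survey items of this kind can be turned into a household score, the sketch below maps each answer to an ordinal value and sums the values per household. The response categories, their numeric coding, the number of items, and the cutoff are all hypothetical and are not taken from the scales cited above.

```python
# Hypothetical ordinal coding of frequency responses (an assumption, not a published scale).
RESPONSE_SCORES = {"never": 0, "rarely": 1, "sometimes": 2, "often": 3}


def household_score(responses):
    """Sum the item scores for one household; unknown answers raise KeyError."""
    return sum(RESPONSE_SCORES[answer.lower()] for answer in responses)


def share_at_or_above(scores, cutoff):
    """Share of households whose summed score reaches a chosen cutoff."""
    scores = list(scores)
    return sum(score >= cutoff for score in scores) / len(scores)


# Hypothetical answers to three items for two households.
survey = {
    "hh_1": ["never", "rarely", "never"],
    "hh_2": ["often", "sometimes", "often"],
}
scores = {hh: household_score(answers) for hh, answers in survey.items()}
print(scores, share_at_or_above(scores.values(), cutoff=4))
```

Because the score reflects the recall period of the questions, repeating the same calculation for surveys taken in different seasons would give a series of snapshots rather than a single status.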
Second, resource-based metrics give a more complicated picture of which waters are considered under the proposed metrics. However, we observe that the selection of indicators is underpinned by the need to understand freshwater availability or water resources security (e.g., Xiao, Li, Xiao, & Liu, 2008; Zhou, Su, & Zhang, 2019). Some use Falkenmark, Lundqvist, and Widstrand's (1989) water stress indicators (water availability measured in m³ per capita per year) as a reference. These water resources data are typically combined with water use data from domestic, agricultural, industrial, and other sectors. Sanitation coverage or treated wastewater data are often included to represent threats from untreated wastewater in water bodies. Some include this information to assess water environment security (Su, Gao, & Guan, 2019). Very few studies put ecology or the environment, such as river health (Nichols & Dyer, 2013) and biodiversity (Dumont et al., 2012), at the center of their inquiry. Another approach to measuring freshwater availability is to evaluate "blue-green water" availability (e.g., Kaur et al., 2019), taking into account, for example, groundwater availability (blue water) and the evapotranspiration process (green water). Water hazards, primarily drought, are often incorporated into the metrics as their occurrence compromises freshwater availability. A further approach to measuring freshwater availability is to use the FEW nexus lens. Mohammadpour et al. (2019) calculate the security of each sector before aggregating them, while Zhu, Jia, Devineni, Lv, and Lall (2019) take a deeper look at food and water connections and focus only on indicators that quantify their relationship.
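The per capita water availability indicator mentioned above can be sketched as a simple classification. The thresholds of 1,700, 1,000, and 500 m³ per capita per year are the values commonly associated with the Falkenmark indicator; they are not stated in the text and are included here as an assumption, as are the category labels and the example figures.

```python
def falkenmark_category(renewable_water_m3, population):
    """Classify annual renewable water per capita (m3 per person per year).

    Thresholds are the commonly cited Falkenmark values (an assumption here).
    """
    if population <= 0:
        raise ValueError("population must be positive")
    per_capita = renewable_water_m3 / population
    if per_capita >= 1700:
        label = "no stress"
    elif per_capita >= 1000:
        label = "stress"
    elif per_capita >= 500:
        label = "scarcity"
    else:
        label = "absolute scarcity"
    return per_capita, label


# Hypothetical unit of analysis: 1.2 million m3 per year shared by 1,000 people.
print(falkenmark_category(1_200_000, 1_000))
```

A classification like this uses a single resource figure for the whole unit of analysis, so it says nothing about how water is distributed among households within that unit.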
Third, some studies (n = 28, 33%) include water hazards (droughts and floods) as one of the domains in the water security measurement. While the occurrence of drought is relevant in analyzing the availability of freshwater, as noted above, flood events disrupt livelihoods in different ways (Octavianti, 2020). Water hazard security (focusing on flooding) is an important domain in water security, but it risks providing a distorted picture when aggregated with other domains of water security assessment.
FIGURE 2 A network map of co-occurrences of keywords in the data set, showing three research clusters that were further collapsed into two large research clusters: experiential scale-based metrics (blue) and resource-based metrics (red and green)

| Measurement techniques
We now analyze how metrics have been developed by focusing on their data requirement and weighting process.

| Data requirement
For experiential scale-based metrics, data come from household surveys that are developed in several stages, from formative research through survey development to implementation. Some studies have validated their scale, such as Stevenson et al. (2016) and Young, Collins, et al. (2019). Survey design (e.g., sample size, sampling methods, and the timing of the survey) and questions (e.g., recall period) are key to the success of these metrics.
By contrast, resource-based metrics usually use the following frameworks or methods in their assessment: pressure-state-response, system dynamics, "blue and green water" footprint, risk-based water management, integrated assessment, carrying capacity, and vulnerability (Li et al., 2019). They generally rely on secondary data published by governmental agencies or others, such as the total annual freshwater withdrawals of a country from the World Bank (2015) and the total renewable freshwater resources of a country from FAO (2015). Authors admit that their selection of indicators is based, to some extent, on the availability of data and is therefore an iterative process. When data are not available, or not of sufficient quality, researchers have to balance this against the costs of conducting data collection. Some researchers extrapolate or model data when only partial data can be obtained (e.g., Ghosh, Kansal, & Venkatesh, 2019).
The choice of secondary data can affect the results of water security measurement. Venghaus and Dieken (2019), who analyzed the differences and inconsistencies of FEW indices, found that the RAND FEW Security Index and STE FEW Security Index provided different results, particularly for Kazakhstan (out of the 40 countries assessed). The differences occurred primarily because the former index mostly uses data from the Food and Agriculture Organization (FAO) for water security (Willis et al., 2016), while the latter builds on the water risk framework in which the FAO is only one source of many (Gassert et al., 2015). Holmatov et al. (2017), assessing water security in the Southern African Development Community (SADC), also found that their findings differed from those of Fischer et al. (2015), who suggest that all SADC countries are water secure, except Zimbabwe. In contrast, Holmatov et al. (2017) found that Malawi is the most water insecure in the SADC. The different focus of the two studies, with Holmatov focusing on economic water security and Fischer on water security in general, and the variables selected may explain this discrepancy. The above examples show that the effects of data choices can be significant for water security measurement.
In developing an experiential scale and selecting indicators, some researchers opened up the research process by engaging stakeholders and/or end users. Dunn et al. (2012) were among the first to incorporate insights from not only local policymakers but also the community who would be using the indicators and insights from the study. More recent papers (e.g., Jensen & Wu, 2018; Krueger et al., 2019) used a participatory approach, acknowledging the increasing importance of co-producing knowledge with other sectors.
Despite the recognition of the importance of qualitative data in measuring water security, or what Garfin et al. (2016) called "Metric Plus," most metrics in our database use only quantitative data (e.g., water resources availability and connection to wastewater services). Challenges associated with integrating quantitative and qualitative data include determining the appropriate unit of analysis where meaningful qualitative information can add value to quantitative data. Technically, such integration may lead to questions about how to actually incorporate them into the metrics: whether the qualitative data are used only for making sense of numbers in data analysis or are incorporated early, which may mean that the qualitative information is "quantified" to a certain extent (Zhu & Li, 2014).
With two distinct metrics developed (experiential scale- and resource-based metrics), Shrestha et al. (2018) sought to integrate both the "experiential (subjective) and physical (objective)" dimensions of water security (Shrestha et al., 2018, p. 277). They developed what they call an objective index of water security to measure the physical dimension of water security in communities and to integrate the data with a cross-sectional household survey (n = 1,500). However, since many scholars view water security as a relational concept, which "demands a fuller consideration of the political structures and processes through which water is secured" (Jepson, Budds, et al., 2017, p. 47), they would argue that there cannot be an "objective" water security measure.
In addition to proposing a metric, the majority of papers also pilot it in one site, while a minority conduct measurements in multiple sites. For example, Young, Collins, et al. (2019) implemented the Household Water Insecurity Experience (HWISE) Scale survey in 28 sites in 23 low- and middle-income countries to validate the scale so that it can be used widely. Jensen and Wu (2018) developed urban water security indicators and implemented the metric in Singapore and Hong Kong, two locations that share similar urban challenges. Krueger et al. (2019) developed an integrated framework for the quantification of urban water supply security and produced seven urban case studies, selected from a wide range of hydroclimatic and socioeconomic conditions.

| Weighting and aggregation
The main challenges of composite indices have been identified as the selection of variables or subindices, an appropriate aggregation function, and the weighting formula (Santeramo, 2016). The usefulness of aggregated indices has been constantly questioned as they need to balance two demands: making large data sets accessible for informing policymakers and providing the details and interrelations of variables in order not to prompt incorrect conclusions (Burgass et al., 2017). With these challenges in mind, we examine how authors weight variables and aggregate them in developing composite indices of water security.
Composite indicators consist of variables that have various units of measurement. The data therefore need to be normalized (e.g., to a 0-1 range) so that the diverse variables can be aggregated. One of the problematic steps in the process is assigning weights to variables to reflect their importance, as authors weight variables differently.
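The normalization and aggregation steps described above can be sketched as follows. This is a minimal illustration that assumes min-max rescaling and a weighted arithmetic mean, which is only one of several possible aggregation functions; the indicator names and values are hypothetical.

```python
def min_max(values, higher_is_better=True):
    """Rescale raw values to the 0-1 range; flip the scale when lower raw values are better."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("indicator has no spread, so it cannot be rescaled")
    scaled = [(v - low) / (high - low) for v in values]
    return scaled if higher_is_better else [1 - s for s in scaled]


def composite(normalized, weights):
    """Weighted arithmetic mean of normalized indicators, one score per unit."""
    total_weight = sum(weights.values())
    n_units = len(next(iter(normalized.values())))
    return [
        sum(weights[name] * normalized[name][i] for name in weights) / total_weight
        for i in range(n_units)
    ]


# Hypothetical raw data for three units of analysis.
raw = {
    "availability_m3_per_capita": [500, 1000, 1500],  # higher is better
    "untreated_wastewater_pct": [10, 30, 20],         # lower is better
}
normalized = {
    "availability_m3_per_capita": min_max(raw["availability_m3_per_capita"]),
    "untreated_wastewater_pct": min_max(
        raw["untreated_wastewater_pct"], higher_is_better=False
    ),
}
equal_weights = {name: 1.0 for name in normalized}
scores = composite(normalized, equal_weights)
print(scores)
```

Min-max rescaling makes each score relative to the other units in the data set, so adding or removing a unit can change every normalized value.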
For example, the WPI (Sullivan, 2002) consists of five variables (resource, access, use, capacity, and environment) whose weights are determined by local experts. Some authors (e.g., Gong et al., 2017) use WPI as the basis for formulating their metrics and one can see how authors assign weights to different variables depending on local context and their judgment of importance. It is not uncommon to find that authors propose a metric and leave the weighting formula up to the users. In such cases, comparing the results is a clear issue. Assigning weights to variables is an important part of water security measurement because different weights can lead to a completely different water security status (Sun et al., 2016). Because weighting reflects the hierarchy of importance, it is possible that this process may favor certain outcomes. Therefore, it is recommended to conduct sensitivity and uncertainty analysis during the process, to test how sensitive the result is to different weights assigned to the variables (Burgass et al., 2017).
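A very simple form of the sensitivity test recommended above is to recompute the ranking of units under alternative weight sets and check whether the ordering changes. The sketch below assumes already normalized scores and a weighted arithmetic mean; the units, variables, and weights are hypothetical.

```python
def weighted_score(scores, weights):
    """Weighted arithmetic mean of one unit's normalized variable scores."""
    total = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total


def rank_units(units, weights):
    """Return unit names ordered from highest to lowest composite score."""
    return sorted(units, key=lambda u: weighted_score(units[u], weights), reverse=True)


# Hypothetical normalized scores (0-1) for two units on two variables.
units = {
    "unit_A": {"resource": 0.9, "access": 0.2},
    "unit_B": {"resource": 0.4, "access": 0.8},
}
weight_sets = {
    "equal": {"resource": 1, "access": 1},
    "resource_heavy": {"resource": 3, "access": 1},
    "access_heavy": {"resource": 1, "access": 3},
}
rankings = {name: rank_units(units, w) for name, w in weight_sets.items()}
for name, order in rankings.items():
    print(name, order)
```

In this toy case the top-ranked unit switches when the resource variable is weighted more heavily, which is the kind of weight dependence a sensitivity analysis is meant to expose.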
In an attempt to deal with this subjectivity problem, researchers from statistical, engineering, and computer science backgrounds have developed weighting methodologies that determine index weights according to the degree of connection between indices. Techniques that can be applied include the entropy method, fuzzy clustering analysis, the multilevel fuzzy method, the system dynamics method, principal component analysis, and gray relational degree analysis (Ali et al., 2014; Zarghami, Gunawan, & Schultmann, 2018).
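As one example of such data-driven weighting, the entropy method can be sketched as below. This follows a commonly used textbook formulation, in which indicators whose values vary more across units receive larger weights; it is not necessarily the variant applied in the studies cited above, and the input matrix is hypothetical.

```python
import math


def entropy_weights(matrix):
    """Entropy weights for a matrix of non-negative values (rows = units, columns = indicators)."""
    n_units, n_indicators = len(matrix), len(matrix[0])
    if n_units < 2:
        raise ValueError("at least two units are needed")
    k = 1 / math.log(n_units)
    divergences = []
    for j in range(n_indicators):
        column = [row[j] for row in matrix]
        total = sum(column)
        if total <= 0:
            raise ValueError("each indicator needs a positive column sum")
        entropy = 0.0
        for value in column:
            p = value / total
            if p > 0:  # treat 0 * log(0) as 0
                entropy -= k * p * math.log(p)
        divergences.append(1 - entropy)  # low entropy means more discriminating power
    spread = sum(divergences)
    if spread <= 0:
        raise ValueError("all indicators are uniform, so weights are undefined")
    return [d / spread for d in divergences]


# Hypothetical data: the first indicator is identical across units, the second varies.
weights = entropy_weights([[1, 1], [1, 3], [1, 5]])
print(weights)
```

Here the uniform first indicator receives a weight of approximately zero, which shows that entropy weighting rewards statistical variation rather than any judgment about which domain matters more.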
Some prefer to assign equal weight to variables in the metrics, such as Krueger et al. (2019), because they consider all five components (natural capital/water resources, physical capital/infrastructure, political capital/management efficacy, financial capital, and social capital/community adaptation) equally important as urban water supply indicators.
Another problem that may arise during this process is when different water domains are aggregated into one single score to determine water status. A different approach was outlined by Su et al. (2019), who did not attempt to aggregate information from three different water security domains into a single score.
Features of water security metrics, such as spatial scale and water domains, of the reviewed papers are summarized in Table 1 (experiential scale-based metrics) and Table 2 (resource-based metrics).

| DISCUSSION
The first half of this section discusses the different characteristics of experiential scale-based and resource-based metrics and the conceptions underpinning their development. We then offer our views regarding topics on which future work could focus to contribute to knowledge production in this field.

| Experiential scale-based and resource-based metrics
The two research clusters have quite distinct characteristics (outlined in Table 3). Most of these (i.e., spatial scale, water domains, and development technique) have been discussed in Section 3. The way authors conceptualize water security plays a significant role in how they measure it.
Analyzing the substantial arguments underpinning the development of water security metrics, we found that experiential scale-based metrics were developed to understand how water insecurity was associated with psychological ill health (Stevenson et al., 2012, 2016). Later developments in this field were informed by the capabilities approach of Amartya Sen (Sen, 2009) and Martha Nussbaum (Nussbaum, 2011). This approach "assess[es] how well-being and social arrangements contribute to or detract from human flourishing and freedom" (Jepson, Budds, et al., 2017, p. 47), and hence, water access is important to "allow people to enjoy a host of capabilities" (Mehta, 2014, p. 66). For resource-based metrics, the approaches tend to be mixed: developed from existing theories, models, or frameworks (deductive), such as a blue-green water-based accounting framework (Rodrigues et al., 2014), but also based on available data (inductive) and on value judgments (normative), especially those of experts, in the aggregation of indicating variables. Deductive arguments are usually applied first in their methodology to select variables.
Analyzing the purpose of metrics using Hinkel's (2011) conceptual framework to review climate change vulnerability indicators, we observe that the two research clusters are meant to address different problems of water security. The purpose of experiential scale-based metrics is to identify water insecure people and communities and evaluate the performance of water interventions. By contrast, resource-based metrics primarily aim to identify mitigation targets, where awareness raising and allocation of funding could then take place (partly because of the benchmarking function of the metrics). This reflects a subtle yet fundamental discrepancy between the two clusters: the use of the terminology of water security and insecurity in the metrics. Experiential scale-based metrics measure household water insecurity as a minimum basis from which humans can thrive, as derived from the capabilities approach. These studies aim to speak "directly to the question of equity and experiences underpinning… demand-side analysis at the household scale" (Tomaz et al., 2020, p. 2). Resource-based metrics measure water security status by quantifying the freshwater availability of a particular unit of analysis, usually at the level of the nation-state. Although politically useful, choosing a high-level scale may conceal issues of inequality. Socioeconomic structures may limit the availability of national resources for households and individuals, which can differ significantly from the national-level estimates (Schlör, Venghaus, & Hake, 2018). However, experiential scale-based metrics also have some weaknesses, such as not including information on water quality, which is an important component of household water security, and not providing much insight into the underlying causes of household water insecurity, that is, whether the barriers are primarily physical ability, economic, or social (Slaymaker & Johnston, 2020).
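The point that a national-level estimate can conceal household-level inequality can be illustrated with a minimal sketch. All numbers, the threshold, and the function names below are hypothetical and chosen only for illustration; they are not drawn from any of the reviewed metrics.

```python
from statistics import mean

# Hypothetical daily water availability (litres per person) for ten
# households in one imaginary country. Illustrative values, not real data.
household_litres = [250, 240, 230, 220, 210, 60, 40, 30, 20, 20]

# Hypothetical sufficiency threshold, assumed only for this illustration.
THRESHOLD_LITRES = 50


def national_average(values):
    """Aggregate figure of the kind a country-level metric might report."""
    return mean(values)


def share_below(values, threshold):
    """Fraction of households whose availability falls below the threshold."""
    return sum(1 for v in values if v < threshold) / len(values)


avg = national_average(household_litres)
share = share_below(household_litres, THRESHOLD_LITRES)

print(f"National average: {avg} litres per person per day")
print(f"Share of households below threshold: {share:.0%}")
```

In this toy case the aggregate sits well above the assumed threshold even though a sizeable share of households falls below it, which is the kind of variation a household-scale instrument is designed to surface.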
An early observation in the water security scholarship about broad versus narrow framing is relevant in assessing water metrics. Some authors who perceive water security as a broad issue tend to include many water domains in its quantification, including water supply, sanitation, flooding, and others, whereas those who use a narrow framing measure only one or a limited number of topics. Nonetheless, there are some authors who use a broad definition in defining the topic but focus on certain issues when it comes to measurement. In line with Cook and Bakker (2012) and Jepson, Wutich, et al. (2017), we posit that a narrow framing is more appropriate for operationalizing the concept of water security in meaningful ways, as it can provide precise analytics including those related to the weighting and aggregation issues discussed above.

| Universal metrics
Some authors have attempted to create universal metrics to measure water security, but this may raise a question about how local contexts can be incorporated and valued in the assessment. Like a spectrum with locality at one end and universality at the other end, one may want to strike a balance between the two to enable results to be compared with other sites without sacrificing important contexts.
For resource-based metrics, there are studies, such as Vörösmarty et al.'s (2010) global human security link to biodiversity and Gain, Giupponi, and Wada's (2016) global water supply security, that focus on assessing certain aspects of water security at a high level with metrics intended to be universal. The majority usually collect available secondary data and collate them to produce a regional or global figure. However, it is unclear whether the metrics have been validated to be applicable across different biophysical and sociocultural contexts. For composite indicators that use secondary data, it may be challenging to validate the metrics particularly because of layers of uncertainties (following Wilby and Dessai's (2010) cascade of uncertainty) and some other methodological drawbacks (Molle & Mollinga, 2003).
Multidimensional thinking will clearly be important to the construction of reliable and valid metrics. In some locations, water insecurity may derive from challenges related to quantity of available water; in other locations, it may be characterized more by water quality challenges; and in still other locations, the type of water insecurity may change temporally or vary by sociodemographic group.
In the literature of household water insecurity, some authors have validated their metrics. Tsai et al. (2016) validated a Household Water Insecurity Access Scale for rural Uganda, and their work has been a benchmark for subsequent studies. Data collected using the HWISE Scale in 28 sites globally, with a target of at least 250 participants from each site, were used for validation of the scale, which can now be implemented in widely varying biophysical and sociocultural settings. Methodologically, selecting the most relevant questions to ask about household experience was a difficult process, requiring robust statistical methods to test the questions and the methods in general.
Seeking to advance the work on household water insecurity metrics, Tomaz et al. (2020) developed the Household Water Insecurity Index to assess differences across the urban waterscape using a small urban center as a case study. Their vision is to develop a validated regional tool for Brazil's semiarid regions. They tried to develop a metric that is applicable in multiple sites, but which retains the contextuality (a semiarid region in this case), so that policymakers can develop targeted policy interventions from the measurement.

| Future work
After discussing some issues underpinning the current literature of water security metrics, we now outline some avenues for future work, focusing on topic, methodology, and collaboration. First, we posit that studies on water security measurement could be more impactful if they also quantify the impact of water insecurity on other sectors. Some authors have progressed research from a water-centric approach (Briscoe, 2009) to quantifying connections between water security and other sectors. Research on measuring FEW security is one of them, although, in our view, attempts to cover broad issues of the three sectors often give only high-level observations that are not specific enough to trigger policy actions (see Albrecht et al., 2018). Research on the significance of water security to livability and the feasibility of investment in a city, for example, could be a way to raise awareness among the public and local stakeholders about the importance of having a water secure city. Topics beyond water are increasingly important, and experimentation on the marriage of water metrics and other metrics is worth investigating. In addition, we observe that recent publications have researched a different geographical scale, such as an island (Holding & Allen, 2016). This niche area of research, measuring water security at a specific geographical location, might become more frequent in the future, considering those areas have different water challenges.
Second, we see the need to conduct validation tests on metrics, especially for those claimed to have universal application. Metrics are often developed from limited case studies; for them to be applicable outside their original context, robust validation testing should be conducted. Many authors do not deal with this issue, perhaps because they think that the long-term acceptance of their tools by users is sufficient to assure credibility. Bockstaller and Girardin (2003) present three kinds of validation for indicators: a) design validation, to evaluate if the indicators are scientifically founded; b) output validation, to assess the soundness of the indicators' outputs; and c) end-use validation, to ensure the indicator is useful for a decision aid tool. For scale development and validation, the process of creating new, valid, and reliable scales has been reviewed in nine steps. Testing the validity of the scale is the last step to ensure that an instrument measures the dimension it was developed to evaluate. However, validation is an ongoing process that should be considered when defining the scope of the study. The most common tests of scale validity are content validity, which can be done prior to fieldwork, and criterion and construct validity, which occur after survey administration. As far as one can tell, validation tests can be resource-intensive and filled with unfamiliar techniques, but this is an important step to ensure the credibility of metrics and the results they produce.
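As a hedged illustration of what a criterion validity check might involve after survey administration, the sketch below correlates household scale scores with an external criterion expected to move with water insecurity. The data, the choice of criterion (hours spent fetching water), and the use of a Pearson correlation are assumptions made for illustration; they do not reproduce the procedure of any study reviewed here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical summed scale scores for six households (higher = more insecure).
scale_scores = [2, 5, 9, 12, 15, 20]

# Hypothetical external criterion: daily hours spent fetching water.
hours_fetching = [0.5, 1.0, 1.5, 2.5, 3.0, 4.0]

r = pearson_r(scale_scores, hours_fetching)
print(f"Correlation between scale score and criterion: r = {r:.2f}")
```

A strong positive correlation in real data would be one piece of evidence that the scale tracks the construct it is meant to measure; it would not replace content or construct validation, and a real study would also report sample size and uncertainty.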
Finally, with rich scholarship and active publication, it would be more fruitful and impactful if scholars could build collaboration with each other and with practitioners and policymakers. Currently, there are limited connections between studies in the literature, especially in the resource-based metrics. By contrast, authors in the experiential scale-based metrics show close collaboration with each other (Figure 3). This might be because the field is relatively new and the relevant literature is much smaller compared with that of the resource-based metrics. Authors or groups of researchers in the resource-based cluster seem to be competing to create the best metrics while only minimally considering, or building from, previous work. Looking at some of the appendices or explanatory materials in some publications (e.g., van Ginkel, Hoekstra, Buurman, & Hogeboom, 2018), one can tell the amount of effort being put into creating such metrics. With a joint effort, scholars can not only create and validate a metric together but also mainstream the metrics for wider application. Engagement with other stakeholders outside academia, including the community, is equally important as it will improve the applicability of the metrics.

FIGURE 3 A network map of bibliographic coupling of authors showing that the red cluster (authors working on experiential scale-based metrics) has close citation links

| CONCLUSION
The prevalent use of water security as a concept in academic and policy papers prompts publications that offer tools, frameworks, or methods for quantifying water security. In this paper, we assessed those publications based on their spatial scale, water domains, and development techniques. Two research clusters were observed in the literature: experiential scale-based metrics at the household level and resource-based metrics at other levels beyond household. The former measures household water experiences through surveys and focuses its analysis on water supply and human well-being, whereas the latter mainly assesses freshwater availability by considering the interrelationships between different water domains and usually relies on secondary data. We found that metrics measuring freshwater availability have dominated the landscape of water security methods, and while these approaches provide useful insights, the emerging scholarship on household water insecurity offers new perspectives that expand our understanding of water security, while also addressing social and political questions, notably distributional inequity. Selecting which approach to use will be based on the purpose of the investigation and scholarly interests, which depend on the disciplinary backgrounds of the researchers. By discussing both research clusters, including their strengths and weaknesses, we hope that this paper will help researchers make informed choices when developing or using a water security metric. We further identify that the discrepancies between the two clusters have been driven by different theoretical underpinnings. The experiential scale-based metrics are developed from a capabilities approach that recognizes the need for human beings to thrive, including through water provision, while resource-based metrics are motivated by the need to secure physical water, often at a high (such as country) level.
In the quantification process, there are always risks associated with the reduction of a complex system. These include models that tend to be black boxes because of difficulties in unraveling assumptions embedded in their calculation and the risk of obscuring important variations at the local scale. Despite these caveats, measuring water security is an essential step toward reducing water-related risks and for future planning. This effort can be understood as a way to operationalize the broadly stated water security concept into a more tangible and measurable concept.

RELATED WIREs ARTICLES
Progress in household water insecurity metrics: A cross-disciplinary approach
Community water governance for urban water security in the Global South: Status, lessons, and prospects
On considering climate resilience in urban water security: A review of the vulnerability of the urban poor in sub-Saharan Africa