
A statistical-temporal analysis of Phoenix Water Data from 2000*
by
Tom Taylor
Department of Mathematics and Statistics
Arizona State University

This document discusses an analysis of water use data provided by the City of Phoenix Water Services Department. The dataset includes monthly water use totals for more than 300,000 meters during a 15-year period (1990-2005). The study presented here is a pilot study which utilizes only data from single family units for the year 2000.

The following graphics are an animation of the histograms of this water data by month in 2000. The horizontal axis is monthly water consumption in hundreds of cubic feet of water; the vertical axis denotes the number of water meters at the given level of water consumption. The vertical black line is the mean, and the red lines are the mean plus and minus one standard deviation. We see that both the average water consumption and the consumption variability depend on the season.

Note that in all months of the year there are a surprising number of water meters at zero consumption. These histograms also show the increased water use in the warm months of May through October.
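The quantities drawn in each histogram frame can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `monthly_summary` and the sample readings are hypothetical, and the population standard deviation is used because the source does not say which estimator was applied.

```python
from collections import Counter
from statistics import mean, pstdev

def monthly_summary(consumption):
    """Histogram, mean and standard deviation of one month's readings.

    consumption: per-meter readings in hundreds of cubic feet.
    """
    histogram = Counter(consumption)   # reading -> number of meters
    mu = mean(consumption)
    sigma = pstdev(consumption)        # assumption: population standard deviation
    return histogram, mu, sigma

# Hypothetical readings for six meters in a cool and a warm month.
january = [0, 5, 8, 8, 10, 11]
july = [0, 12, 20, 24, 30, 46]

for name, readings in (("January", january), ("July", july)):
    histogram, mu, sigma = monthly_summary(readings)
    # The three vertical lines of a frame: mean - std, mean, mean + std.
    print(name, dict(histogram), mu - sigma, mu, mu + sigma)
```

With data of this shape, both the mean and the spread come out larger for the warm month, which is the seasonal dependence described above.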

One question that these histograms suggest is: are the zero-consumption water meters the same ones each month? Not usually, as it happens. More generally, how does consumption vary from month to month? Are water users consuming according to a plan?
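One way to check whether the zero-consumption meters persist from month to month is to compare the sets of meter identifiers at zero in two successive months. The sketch below assumes readings are held in dictionaries keyed by a meter identifier; the function name and the sample data are hypothetical, not taken from the Phoenix dataset.

```python
def zero_meter_overlap(this_month, next_month):
    """Fraction of meters at zero this month that are also at zero next month.

    Both arguments map a meter identifier to that month's consumption.
    Returns None when no meter reads zero this month.
    """
    zero_now = {meter for meter, use in this_month.items() if use == 0}
    if not zero_now:
        return None
    zero_next = {meter for meter, use in next_month.items() if use == 0}
    return len(zero_now & zero_next) / len(zero_now)

# Hypothetical readings for four meters in two successive months.
january = {"m1": 0, "m2": 0, "m3": 7, "m4": 12}
february = {"m1": 0, "m2": 4, "m3": 0, "m4": 11}

print(zero_meter_overlap(january, february))
```

A fraction well below one would mean that the zero-consumption meters are mostly different ones each month.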

One tool to get at this information would be to look at the second-order statistics of this data set. This involves computing the mean monthly water consumption as well as the covariance matrix. The following are the eigenvalues of the covariance matrix; note that the largest eigenvalue predominates by about 1500%.

This is followed by a plot of the mean monthly consumption (in black) and the eigenvectors corresponding to the three largest eigenvalues (in red, blue and green, in order of decreasing eigenvalue):

Note that the red eigenvector, corresponding to the largest eigenvalue, has much the same shape as the mean. Thus the predominant mode of variability of water consumption is that the overall water consumption may increase or decrease, but the change in summer consumption will be larger proportionately than the change in winter.
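The second-order computation can be illustrated with a small, dependency-free sketch. It is not the code used for the study: the function names are hypothetical, the sample covariance (division by n - 1) is an assumption, and power iteration stands in for whatever eigen-solver was actually used. Power iteration recovers only the dominant eigenpair, which is the one discussed above.

```python
def column_means(rows):
    """Mean of each column (month) over all rows (meters)."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def covariance_matrix(rows):
    """Sample covariance between months; each row is one meter's readings."""
    n = len(rows)
    mu = column_means(rows)
    centered = [[x - m for x, m in zip(row, mu)] for row in rows]
    k = len(mu)
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(k)]
            for i in range(k)]

def dominant_eigenpair(matrix, iterations=200):
    """Power iteration for the largest eigenvalue and its unit eigenvector.

    Assumes a symmetric positive semi-definite matrix whose dominant
    eigenvector is not orthogonal to the all-ones starting vector.
    """
    k = len(matrix)
    v = [1.0 / k ** 0.5] * k
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break  # zero matrix: nothing to iterate on
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
    mv = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
    eigenvalue = sum(a * b for a, b in zip(v, mv))
    return eigenvalue, v

# Hypothetical data: three "months" (winter, spring, summer) for four meters,
# each meter a scaled copy of the same seasonal profile [1, 2, 4].
rows = [[s * 1.0, s * 2.0, s * 4.0] for s in (1, 2, 3, 4)]

mean_profile = column_means(rows)
eigenvalue, eigenvector = dominant_eigenpair(covariance_matrix(rows))
print(mean_profile, eigenvalue, eigenvector)
```

In this toy example every meter follows the same seasonal profile at a different overall level, so the dominant eigenvector is proportional to the mean profile. That is the same qualitative pattern described for the red eigenvector: a change in overall consumption moves summer use more than winter use.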

Other tools to get at temporal variability are the joint histograms of water consumption in a given month with the consumption the previous month. Here are those joint distributions for 2000. The vertical axis still represents number of meters at a given pair of water consumptions for the two successive months; the horizontal axis parallel to the plane of the page is consumption in a given month; the horizontal axis perpendicular to the plane of the page is consumption the previous month. Equivalent information would be in the consumption distributions conditioned on consumption the previous month, which are the joint distribution normalized by the total number of meters at a given consumption the previous month.

Note that this distribution has a knife-edge shape for small to moderate consumption values, but is much broader for larger consumption values. These features have an interpretation in terms of the mean and variance of the conditional distribution. The conditional histograms that follow give a more detailed presentation of the conditional distribution.
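The relationship between the joint histogram and the conditional distributions can be made concrete with a short sketch. The function names and the sample readings are hypothetical; the normalization is the one described in the text, dividing each slice of the joint histogram by the number of meters at that previous-month consumption.

```python
from collections import Counter, defaultdict

def joint_histogram(previous, current):
    """Count meters at each (previous-month, current-month) consumption pair."""
    return Counter(zip(previous, current))

def conditional_distributions(joint):
    """Normalize the joint histogram by the total at each previous-month value."""
    totals = defaultdict(int)
    for (prev, _), count in joint.items():
        totals[prev] += count
    conditional = defaultdict(dict)
    for (prev, cur), count in joint.items():
        conditional[prev][cur] = count / totals[prev]
    return dict(conditional)

# Hypothetical paired readings for six meters, in hundreds of cubic feet.
january = [5, 5, 5, 5, 8, 8]
february = [5, 5, 6, 4, 8, 12]

joint = joint_histogram(january, february)
conditional = conditional_distributions(joint)
print(joint[(5, 5)], conditional[5])
```

Each inner dictionary sums to one, so it can be read as the distribution of current-month consumption given the previous month's value.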

The following graphics are conditional histograms. Again the horizontal axis represents monthly water consumption and the vertical axis the number of water meters at the given consumption. Now, however, each frame represents a histogram of water consumption in a given month, conditioned on a particular value of consumption the previous month. The frames are displayed in order of increasing consumption. The following are the conditional histograms for February consumption conditioned on January consumption; the value of the January consumption is listed in the graph header and is also denoted by the black vertical line in the graph. The red vertical line denotes the mean of the conditional histogram. The standard deviation is listed as "Std" in the header, and the conditional sample size is listed in the header as "No". (Division by this number turns the conditional histogram into the conditional distribution.)

We note the following features of these Feb-Jan graphs. Meters with zero water consumption in January are most likely to show positive water consumption in February; however, zero is still the single most likely February consumption value. For small January consumption (i.e. less than 700 cu. ft.), the conditional mean is larger than the previous month's consumption: there is a trend to use more water the next month. From about 700 to 1200 cu. ft. of consumption, the conditional mean of February consumption and the January consumption are approximately equal, after which point the conditional mean is strictly and statistically significantly smaller than the January consumption, although within one standard deviation. From about 3000 cu. ft. of January consumption, the conditional mean of February consumption is well within one standard deviation of a constant value of about 2500 cu. ft.; however, the standard deviation of the conditional distribution is very large, about 1000 cu. ft. Thus, for small to moderate values of January consumption the conditional mean is a good predictor of February consumption, while for larger values it is a relatively poor predictor. This seems to indicate that small to moderate consumption follows a specific habit or pattern of water use, while large consumption is a sporadic event most likely to be followed by much smaller water use.

Conditional histograms for the remaining month pairs: March-February, April-March, May-April, June-May, July-June, August-July, September-August, October-September, November-October, December-November.
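The header statistics of the conditional histograms (conditional mean, "Std" and "No") can be computed by grouping meters on their previous-month consumption. The sketch below is illustrative only: the function name and the sample readings are hypothetical, and the population standard deviation is an assumption, since the source does not state which estimator was used.

```python
from collections import defaultdict
from statistics import mean, pstdev

def conditional_summary(previous, current):
    """Per previous-month value: sample size, conditional mean and std."""
    groups = defaultdict(list)
    for prev, cur in zip(previous, current):
        groups[prev].append(cur)
    return {
        prev: {"No": len(values), "Mean": mean(values), "Std": pstdev(values)}
        for prev, values in sorted(groups.items())
    }

# Hypothetical readings in hundreds of cubic feet for nine meters.
january = [3, 3, 3, 10, 10, 10, 40, 40, 40]
february = [4, 5, 3, 10, 11, 9, 20, 35, 14]

for prev, stats in conditional_summary(january, february).items():
    # Compare the conditional mean with the conditioning value itself.
    print(prev, stats, stats["Mean"] - prev)
```

The sample data were chosen to mimic the qualitative pattern described above: the conditional mean sits above the previous month's value for small consumption, near it for moderate consumption, and well below it, with a much larger spread, for large consumption.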

The following animation is a representation of the statistical variability in monthly water consumption as a function of urban geography, as represented by census tract. The vertical axis represents monthly water consumption in hundreds of cubic feet; the horizontal axis represents the month of the year. The red graph is the mean monthly water consumption over all single-family residences in Phoenix. The black graphs are the mean monthly water consumption of a specific census tract in Phoenix. The vertical bar represents relative sample size, to a maximum of 2856. We see that census tract mean monthly consumption tends to stick quite close to the overall mean monthly consumption. There are a few census tracts which show much lower water use than the mean; all of these census tracts have an extremely small number of water meters and hence little statistical strength. There are also a few census tracts for which the mean monthly water consumption is much larger than the overall mean; by contrast, these census tracts have a large number of water meters and correspondingly greater statistical strength. As it happens, these are areas with larger properties and a high proportion of swimming pools.
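The per-tract curves and the overall curve in the animation amount to grouped means, together with the meter count that gives each tract its statistical strength. A minimal sketch follows, assuming each meter is represented as a hypothetical (tract identifier, list of monthly readings) pair; the names and the three-month sample data are for illustration only.

```python
from collections import defaultdict

def tract_monthly_means(records):
    """Mean monthly consumption and meter count for each census tract.

    records: iterable of (tract_id, monthly_readings) pairs, one per meter,
    where every monthly_readings list has the same length.
    """
    sums = {}
    counts = defaultdict(int)
    for tract, readings in records:
        running = sums.setdefault(tract, [0.0] * len(readings))
        for month, value in enumerate(readings):
            running[month] += value
        counts[tract] += 1
    return {tract: ([total / counts[tract] for total in totals], counts[tract])
            for tract, totals in sums.items()}

def overall_monthly_mean(records):
    """Mean monthly consumption over all meters, ignoring tract."""
    pooled = tract_monthly_means(("all", readings) for _, readings in records)
    return pooled["all"][0]

# Hypothetical data: three months of readings for three meters in two tracts.
records = [
    ("T1", [8, 10, 20]),
    ("T1", [10, 12, 24]),
    ("T2", [20, 30, 50]),
]

print(overall_monthly_mean(records))
for tract, (profile, meters) in tract_monthly_means(records).items():
    print(tract, profile, meters)
```

Returning the meter count alongside each profile keeps the sample size visible, so that a tract far from the overall mean can be judged against how many meters it contains.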

To make further progress in exploring the nature of water use in Phoenix, a number of efforts can be undertaken. The effect of precipitation on water use could be explored by correlating these two. These types of analyses can be continued for all of the years 2000-2005. If we manage to get geographic coordinates for meters from the City of Phoenix, we could correlate consumption with land use data from the Maricopa Association of Governments and Will Stepanov's land cover data. We are poised to obtain some data from the City of Tempe, and could extend this analysis to Tempe and other cities.

*This material is based upon work supported by the National Science Foundation under Grant No. SES-0345945, Decision Center for a Desert City (DCDC). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).
 
