### Documentation

### A Population-Interaction Index (PII)

The PIZA codes are derived from a classification scheme that indexes small geographic areas according to the size and proximity of population concentrations. Designation of the zones begins by using common GIS software to assign an index number to each of many small (five-kilometer) grid cells laid out (figuratively) across the contiguous 48 States. These "population-interaction indexes" (PII) are designed to provide a cardinal measure of the potential interaction between nearby urban-related population and agricultural production activities in each grid cell. (By a cardinal measure, we mean that the indexes effectively place each location or area on a continuous scale.) The population-interaction indexes are based on the regional economist's or geographer's concept of a "gravity" model, which provides measures of accessibility to population concentrations (Shi, Phipps, and Colyer, 1997).^{1} This model measures population interaction by accounting for the size of all populations in the proximity of a given location or grid cell and the distance of that location or grid cell from those populations.

In our case, the population-interaction index (**PII**) for a single location (grid cell) is defined as follows:

$$\mathrm{PII}_{ij} = \frac{P_j}{D_{ij}}$$

where $\mathrm{PII}_{ij}$ is the computed index number representing the influence on cell $i$ of population located in cell $j$, $P_j$ is the population of cell $j$, and $D_{ij}$ is the distance from cell $i$ to cell $j$.
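As a rough illustration with made-up numbers (and assuming distance is measured in miles, which the definition above does not state), a cell $j$ holding 20,000 people contributes four times as much to a cell 10 miles away as to a cell 40 miles away:

$$\mathrm{PII}_{ij} = \frac{20{,}000}{10} = 2{,}000 \qquad \text{versus} \qquad \mathrm{PII}_{ij} = \frac{20{,}000}{40} = 500$$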

In order to assess the effect of proximity to multiple population concentrations, the index is aggregated across a number (N) of possible locations (cells). In an aggregate form, the index used in this study is given by:

$$\mathrm{PII}_i = \sum_{j=1}^{N} \frac{P_j}{D_{ij}}$$

where the index $j$ runs over the $N$ grid cells within a 50-mile radius of cell $i$.

Essentially, the population-interaction index provides a continuous measure of proximity to nearby population concentrations, accounting for both population size (within a 50-mile radius in our study) and location of the parcel relative to the population (distance).^{2} The index increases as population increases (since population is in the numerator) and/or as distance to the population decreases (since distance is in the denominator).
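The aggregate index can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GIS procedure actually used: the function name, the list-of-lists grid layout, and the use of center-to-center distances are hypothetical, and the cell's own population is skipped because its distance would be zero (the text does not say how that case is handled).

```python
import math

MILES_PER_KM = 0.621371


def population_interaction_index(pop, cell_km=5.0, radius_miles=50.0):
    """Return PII_i = sum over j of P_j / D_ij for every cell i of a grid.

    pop is a list of rows of grid-cell populations. Distances are taken
    between cell centers, in miles. Assumption: a cell's own population is
    excluded, since D_ii would be zero.
    """
    rows, cols = len(pop), len(pop[0])
    cell_miles = cell_km * MILES_PER_KM
    reach = int(radius_miles // cell_miles)  # farthest offset, in cells

    # Precompute (row offset, column offset, distance) for neighbors in range.
    offsets = []
    for dr in range(-reach, reach + 1):
        for dc in range(-reach, reach + 1):
            dist = math.hypot(dr, dc) * cell_miles
            if 0 < dist <= radius_miles:
                offsets.append((dr, dc, dist))

    pii = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for dr, dc, dist in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    total += pop[rr][cc] / dist  # population weighted by 1/D
            pii[r][c] = total
    return pii


# One populated cell at the left end of a single row of 20 cells.
grid = [[1000.0] + [0.0] * 19]
index = population_interaction_index(grid)
```

In this toy grid the index falls off with distance from the populated cell and drops to zero once a cell lies more than 50 miles away. The plain loops are written for readability; a nationwide grid would call for an array or GIS library.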

The first step in constructing the population-interaction index was to develop a nationwide grid of population density. This was accomplished by assigning the geographic centroid of each census block to a 5-km grid cell, then using GIS techniques to add up the population in each grid cell, and then dividing grid-cell population by the area of the grid cell.
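The gridding step described above might look like the following minimal sketch. It assumes block centroids have already been projected to planar coordinates measured in kilometers; the projection, the tuple layout, and the function name are hypothetical and are not specified in the text.

```python
from collections import defaultdict

CELL_KM = 5.0  # grid-cell size given in the text


def grid_population_density(blocks, cell_km=CELL_KM):
    """Sum block populations per grid cell and divide by cell area.

    blocks is an iterable of (x_km, y_km, population) tuples, one per census
    block centroid, in assumed planar coordinates. Returns a dict mapping
    (column, row) to persons per square kilometer.
    """
    totals = defaultdict(float)
    for x_km, y_km, population in blocks:
        # Floor division assigns each centroid to the cell that contains it.
        cell = (int(x_km // cell_km), int(y_km // cell_km))
        totals[cell] += population
    area_km2 = cell_km * cell_km
    return {cell: total / area_km2 for cell, total in totals.items()}


# Three illustrative blocks: two fall in cell (0, 0), one in cell (1, 0).
sample_blocks = [(1.0, 1.0, 100), (4.9, 0.2, 150), (5.1, 0.2, 50)]
density = grid_population_density(sample_blocks)
```

Cells that contain no block centroid are simply absent from the result, which corresponds to a density of zero.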

The next step involved using GIS software to calculate the population-interaction index for each grid cell using the formula described above. We calculate the index on the basis of population within a 50-mile radius of each grid cell. Essentially, population (or, equivalently, in our case, population density) is weighted by the inverse of distance. With readily available GIS software, the population-interaction index for any latitude/longitude in the contiguous U.S. can be obtained. By aggregation within the GIS system, a PII can be calculated for any specified region.

### Using Population-Interaction Indexes to Classify Agricultural Land: PIZA Codes

In order to classify grid cells into either a "rural" zone or a "population-interaction" zone, regional thresholds were established based on index levels in the most rural areas of each region. Index numbers below a threshold represent rural (background) levels of population interaction, which exist even in the absence of urban-related population interaction. Any grid cell whose index exceeds the rural threshold set for its region is classified into a "population-interaction zone."

The rural or background level includes population that supports an active commercial farming industry, including employees of input and output industries that support production agriculture as well as other population associated with the rural-community infrastructure. That background level can be expected to vary regionally due to differences in the productivity of farmland. Consequently, we established thresholds for each of the twenty U.S. Department of Agriculture regions called Land Resource Regions (LRRs).

In order to establish the rural thresholds for each region, we examined levels of the PII in areas that clearly had not been subject to nonfarm-related population influence. Cromartie (2001) and Cromartie and Swanson (1996) identify Census tracts that are "totally rural," a designation based on 1990 commuting data and U.S. Census Bureau geographic definitions. The term "totally rural" means that the tract does not contain any part of a town of 2,500 or more residents and the primary commuting pattern was to sites within the tract. (These are category 10 in the RUCA codes.) Thresholds for individual LRRs were established at the 95th percentile of the distribution of PII for 5-kilometer grid cells in the set of totally rural tracts in the LRR.
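The threshold rule can be sketched as follows. The record layout and function names are hypothetical, and the nearest-rank percentile definition is an assumption: the text does not say which percentile method was used.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile (an assumed definition).

    Returns the smallest observed value with at least pct percent of the
    observations at or below it.
    """
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]


def rural_thresholds(cells, pct=95.0):
    """Compute one rural threshold per Land Resource Region.

    cells is an iterable of (lrr_code, pii, is_totally_rural) tuples, one
    per grid cell. Only cells in totally rural tracts enter the calculation.
    """
    by_region = defaultdict(list)
    for lrr_code, pii, is_totally_rural in cells:
        if is_totally_rural:
            by_region[lrr_code].append(pii)
    return {lrr: percentile(vals, pct) for lrr, vals in by_region.items()}


# Illustrative data: region "A" has 100 rural cells with PII 1..100 plus one
# non-rural cell that must be ignored; region "B" has a single rural cell.
sample_cells = [("A", float(v), True) for v in range(1, 101)]
sample_cells += [("A", 9999.0, False), ("B", 5.0, True)]
thresholds = rural_thresholds(sample_cells)
```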

Grid cells initially classified into the population-interaction zone were further classified into one of three categories representing increasingly higher levels of population interaction. Thresholds to distinguish the three categories were set at levels of the index that split the original sample into three sets containing equal numbers of sample observations. The resulting Population-Interaction Zones for Agriculture (PIZA) consist of a four-category classification:

- 1 = rural (little or no urban-related population interaction)
- 2 = population interaction, low
- 3 = population interaction, medium
- 4 = population interaction, high
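The four-category coding can be expressed as a short sketch. The function names are hypothetical. One assumption matters here: the tercile cut points are computed from whatever list of index values is passed in, because the text does not spell out exactly which observations make up the "original sample."

```python
def tercile_cuts(values):
    """Return two cut points splitting values into three equal-count groups.

    Each cut is the largest value in the lower group, so a value equal to a
    cut belongs to the lower of the two categories.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n < 3:
        raise ValueError("need at least three observations")
    return ordered[n // 3 - 1], ordered[2 * n // 3 - 1]


def piza_code(pii, rural_threshold, low_cut, medium_cut):
    """Map an index value to a PIZA code from 1 (rural) to 4 (high)."""
    if pii <= rural_threshold:
        return 1  # rural: at or below the regional background level
    if pii <= low_cut:
        return 2  # population interaction, low
    if pii <= medium_cut:
        return 3  # population interaction, medium
    return 4      # population interaction, high


# Illustrative values only: a regional threshold of 10 and six index values
# above it that are used to set the tercile cuts.
zone_values = [12, 15, 20, 30, 45, 80]
low_cut, medium_cut = tercile_cuts(zone_values)
codes = [piza_code(v, 10, low_cut, medium_cut) for v in [5] + zone_values]
```

With these inputs each of the three interaction categories receives two of the six cells, and the cell with an index of 5 is coded rural.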

The indexes (PII) and zone codes (PIZA), which can be used to classify any geographic point in the 48 contiguous states, are available for download. GIS software is necessary, however, to retrieve the indexes and zone codes and relate them to any given geographic point (latitude/longitude) or 5-km grid cell. A complementary scheme similarly indexes and classifies the geographic center of U.S. counties, providing a county-level version of both.

^{1}The concept of a "gravity" model evolved from marketing analysis, where it was first used to assess the attraction of consumers to retail markets (as described in Shi, Phipps, and Colyer). Recently the concept has been applied in the agricultural and resource economics literature. Shi, Phipps, and Colyer describe the gravity model as a "parsimonious method for capturing urban influence in a single variable that combines [population] size and distance [from urban concentrations]."

^{2}In Shi, Phipps, and Colyer and in Hardie, Narayan, and Gardner, distance is accounted for using the square of D. Based on results reported in Song, we used D rather than D².