# Introduction

In addition to shaping its hydrography and history, the particular topography of Quebec City has had an important impact on its vegetation. This post explores how the cliffs separating the upper and lower parts of the city form hard-to-build terrain that serves as a sanctuary for fauna and flora in an urban environment.

# A quick glance at the canopy

As the following maps reveal<sup>1</sup>, there seems to be a qualitative relationship between the presence of trees (top, in green) and the slope of the region (bottom)<sup>2</sup>. We can also make out the location of the rivers, train tracks and highways simply by looking at the canopy polygons.

The importance of the cliffs for the biodiversity of the region is well documented. The Coteau Sainte-Geneviève and Quebec City Promontory cliffs are categorized as special natural zones by the city. These constrained regions harbour rare vegetation such as the endangered butternut tree, as well as the not-so-rare trash-can-emptying raccoon. The following map shows these three special natural zones: the Coteau Sainte-Geneviève - sector Montcalm Saint-Jean-Baptiste in red, Coteau Sainte-Geneviève - Saint-Sacrement sector in green and the Quebec promontory cliff - southern sector in blue.

The cliffs form a natural green belt in the heart of the city that offers all the benefits of vegetation without the usual downside. Indeed, the land is unfit for housing and, because of the slope of the terrain, takes up little horizontal space. It therefore cannot be accused of unfairly driving up the price of housing in sought-after areas, as has been argued in some regions like the Golden Horseshoe near the Greater Toronto Area. In addition, the presence of vegetation may reduce the risk of landslides, which is extremely useful given the past tragedies that occurred in the region.

# Assessing the relationship between slope and vegetation

Although this qualitative analysis is valuable and interesting, it is also worthwhile to build a model that quantitatively evaluates the strength of the relationship between slope and vegetation. It is also useful to control for other factors to determine how much is actually attributable to the slope effect. For instance, overlaying public parks and river banks suggests that other spatial covariates may also be correlated with the presence of trees and cliffs.

It is possible to perform this analysis by considering a simple pixel-based<sup>3, 4, 5, 6, 7</sup> binomial generalized linear model (GLM):

$\log(\frac{p(x)}{1-p(x)}) = \beta_c x_c + \beta_bx_b + \beta_p x_p$

where $$x_c, x_b, x_p$$ are binary variables indicating the presence of cliffs, banks and parks respectively, and $$p(x)$$ is the probability that a given pixel contains a tree given the covariate vector $$x$$. The coefficients $$\beta$$ are real-valued parameters that are estimated from the data and can later be analyzed.
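To make the link between the linear predictor and the tree probability concrete, here is a minimal Python sketch of the inverse-logit step. It is only an illustration: the function name is hypothetical and the coefficient values are placeholders chosen for readability, not estimates from the fitted model.

```python
import math

def tree_probability(x_c, x_b, x_p, beta_c, beta_b, beta_p):
    """Return p(x) for one pixel under the logit model above.

    x_c, x_b, x_p are 0/1 indicators for cliff, river bank and park.
    """
    # Linear predictor: the right-hand side of the model equation.
    eta = beta_c * x_c + beta_b * x_b + beta_p * x_p
    # The inverse logit maps the log-odds back to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients, for illustration only (not fitted values).
betas = dict(beta_c=1.0, beta_b=0.5, beta_p=0.8)

p_plain = tree_probability(0, 0, 0, **betas)  # no cliff, bank or park
p_cliff = tree_probability(1, 0, 0, **betas)  # cliff pixel only
```

Note that, with the equation exactly as written (no intercept term), a pixel with all three indicators at zero gets a probability of 0.5. A fitted model would normally also include an intercept that shifts this baseline.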

Parks were added because they likely contain more trees than other neighbouring urban areas. River banks were added because many were renaturalized and enjoy a significant amount of vegetation (see this past post for more details).

Neighbourhood was also added as a categorical variable. As mentioned in this previous post, the different histories, built environments and socioeconomic compositions of the city's neighbourhoods have led to large differences in vegetation cover. Newer and more affluent neighbourhoods like Saint-Sacrement and Montcalm have roughly 25% tree cover, while older and poorer sectors like Saint-Jean-Baptiste and Saint-Roch have only roughly 10%. This leads to the enhanced model:

$\log(\frac{p(x)}{1-p(x)}) = \beta_c x_c + \beta_bx_b + \beta_p x_p + \beta_n x_n$

where $$x_n$$ is the categorical variable indicating the neighbourhood.

Interactions between the presence of parks and neighbourhoods were also added, because some large parks, like Battlefields Park in the Montcalm/Colline Parlementaire neighbourhoods, have large patches of grass, while others, like Domaine Maizeret in the Maizeret neighbourhood, are covered by a larger proportion of trees (more details on this in a following section). We obtain the following model:

$\log(\frac{p(x)}{1-p(x)}) = \beta_c x_c + \beta_bx_b + \beta_p x_p + \beta_n x_n + \beta_{np}x_nx_p$
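In the equation above, $$\beta_n x_n$$ and $$\beta_{np}x_nx_p$$ are shorthand for one coefficient per neighbourhood level. One common way to encode this is treatment (dummy) coding, sketched below in Python. The coding scheme is an assumption, since the post does not say which contrasts were used; the function and column names are hypothetical, and Lairet is taken as the base level to match the results section.

```python
def design_row(neighbourhood, is_park, levels, base):
    """Build dummy columns for neighbourhood and its interaction with park."""
    row = {"isPark": int(is_park)}
    for level in levels:
        if level == base:
            continue  # the base level is absorbed by the intercept
        in_level = int(neighbourhood == level)
        row[f"n[{level}]"] = in_level
        # Interaction column: 1 only for park pixels in this neighbourhood.
        row[f"n[{level}]:isPark"] = in_level * int(is_park)
    return row

# A small subset of neighbourhood names taken from the text.
levels = ["Lairet", "Montcalm", "Maizeret"]
row = design_row("Maizeret", True, levels, base="Lairet")
```

Each interaction column lets the park effect differ from one neighbourhood to the next, which is what allows a grassy park and a wooded park to receive different adjustments.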

Residual analysis of this simple model revealed correlated residuals in certain specific areas. Adding land use significantly improved the quality of the fit and reduced violations of residual independence. In particular, it helped account for poorly vegetated industrial zones, correct for minor inconsistencies in the hydrography shapefiles and ensure that trees have a low probability of occurring in water.

Adding interactions with neighbourhood also proved very valuable. The model identified a statistically significant and strong negative relationship between the presence of trees and land dedicated to transport in the Vieux-Quebec neighbourhood, while this is not the case in general. This makes sense, as the map below indicates that this area corresponds to the largely industrialized and treeless port. In the rest of the city, however, land dedicated to public utilities and transport enjoys a fair amount of vegetation cover. Accounting for port activities also helped improve the inference on the overall coefficients of river banks.

Interactions also proved helpful since many cliffs facing the St. Lawrence River are zoned as low-density housing, while those near Saint-Sacrement are predominantly vacant lots or parks. As a side note, it seems questionable to allow construction in these dangerous zones, where landslides could have devastating effects on the inhabitants of the private properties and beyond.

Adding land use results in the following model:

$\log(\frac{p(x)}{1-p(x)}) = \beta_c x_c + \beta_bx_b + \beta_p x_p + \beta_n x_n +\beta_l x_l + \beta_{np}x_nx_p + \beta_{nl}x_nx_l$

where $$x_l$$ is the categorical variable indicating land use.

Another interesting variable to consider is population density. Indeed, preliminary analysis of the univariate relationship between population density and vegetation seemed to show a non-linear decreasing relationship (see figure below).

Even though land use already accounts for high- and low-density housing areas, a spline of the population density was added to create the following generalized additive model (GAM):

$\log(\frac{p(x)}{1-p(x)}) = \beta_c x_c + \beta_bx_b + \beta_p x_p + \beta_n x_n +\beta_l x_l + \beta_{np}x_nx_p + \beta_{nl}x_nx_l + s_{\beta_d}(x_d)$

where $$x_d$$ is the population density and $$s_{\beta_d}$$ is the spline considered, which depends on coefficients $$\beta_d$$. Although the spline did not substantially improve the model fit, it was still statistically significant.

## Model results

The estimated model suggests that, after accounting for all variables, the presence of cliffs increases the log-odds of tree presence by 2.047, all else being equal. This is a relatively strong effect. By comparison, the presence of river banks increases the log-odds by only 0.243.
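Log-odds are easier to interpret once exponentiated into odds ratios. The short Python sketch below applies this standard conversion to the two coefficients quoted above; only the variable names are introduced here.

```python
import math

# Log-odds coefficients reported in the text.
log_odds = {"cliff": 2.047, "river bank": 0.243}

# Exponentiating a logit coefficient gives the multiplicative change in odds.
odds_ratios = {name: math.exp(beta) for name, beta in log_odds.items()}

for name, ratio in odds_ratios.items():
    print(f"{name}: odds multiplied by about {ratio:.2f}")
```

With these values, a cliff pixel has odds of tree presence roughly 7.7 times those of an otherwise identical non-cliff pixel, against roughly 1.3 times for river banks.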

The table below shows the statistically significant (at the 5% level) univariate effects sorted by absolute estimated value. The intercept corresponds to the estimated effect for the base neighbourhood, Lairet, for a pixel outside parks, cliffs and river banks and in the base land use level, administration and service.

Perhaps surprisingly, the variable associated with the manually created park shapefiles (isPark) and the area classified as parks and green spaces in the land use data (Parc et espace vert) are both highly statistically significant and strongly associated with the log-odds of tree presence.

## Model limitations

Although useful, the statistical model suffers from various limitations. First, it explains only roughly 16% of the total deviance. Other covariates or more complex interactions may therefore be needed. The residuals also display a fair amount of spatial correlation. As seen below, certain portions of parks in particular seem to be either over- or underrepresented:

Adding a non-linear function of a normalized difference vegetation index (NDVI) raster could perhaps improve inference and precision by discriminating between grass and trees. Indeed, the figure below shows that grass seems to have higher reflectance than trees.

The variograms below also highlight the presence of heterogeneity and spatial correlation. Greener neighbourhoods like Montcalm show more variance. Certain patterns such as the one in Maizeret should also be investigated.
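The two diagnostics mentioned in this section, the share of deviance explained and the empirical variogram of the residuals, can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions rather than the code behind the post: the function names are hypothetical, the toy inputs are invented, raw response residuals are used, and the variogram relies on simple fixed-width distance bins.

```python
import math
from collections import defaultdict

def binomial_deviance(y, p):
    """Deviance of 0/1 outcomes y under predicted probabilities p."""
    total = 0.0
    for yi, pi in zip(y, p):
        # Log-likelihood contribution of one Bernoulli observation.
        total += math.log(pi) if yi == 1 else math.log(1.0 - pi)
    return -2.0 * total

def deviance_explained(y, p):
    """1 - residual deviance / null deviance (intercept-only model)."""
    p_null = sum(y) / len(y)
    null_dev = binomial_deviance(y, [p_null] * len(y))
    return 1.0 - binomial_deviance(y, p) / null_dev

def empirical_semivariogram(points, values, bin_width):
    """Return {bin_index: semivariance} for pairs grouped by distance bin."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            k = int(math.hypot(dx, dy) // bin_width)
            sums[k] += (values[i] - values[j]) ** 2
            counts[k] += 1
    # Semivariance is half the mean squared difference within each bin.
    return {k: sums[k] / (2 * counts[k]) for k in sorted(counts)}

# Toy data, invented for illustration only.
y = [1, 1, 0, 0]                 # observed tree presence per pixel
p = [0.7, 0.6, 0.4, 0.3]         # hypothetical fitted probabilities
share = deviance_explained(y, p)

points = [(0, 0), (1, 0), (2, 0), (3, 0)]       # pixel centroids on a line
residuals = [yi - pi for yi, pi in zip(y, p)]   # raw response residuals
gamma = empirical_semivariogram(points, residuals, bin_width=1.0)
```

A semivariance that keeps growing with distance, as in this toy transect, is the usual sign of spatially correlated residuals. The pairwise loop is quadratic in the number of pixels, so a dataset of tens of thousands of pixels would likely need sub-sampling or a dedicated geostatistics library.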

# Conclusion

This post explored the relationship between terrain and vegetation in Quebec City. The simple GAM showed that there is a statistically significant and strong relationship between cliffs and the presence of trees. Although the model is admittedly very simple, it provides interesting insights into certain lesser-known facts about the city.

Regardless of these interesting quantitative findings, the value of this 30 km-long natural green belt in the center of the city remains unquestionable. These cliffs showcase some of the potential benefits of vertical green zones, which could prove extremely valuable in densely populated areas across the world.

1. Canopy polygons downloaded from the canopée dataset on the Quebec City open data portal. The polygons were post-processed in QGIS and in R with the sf package.

2. To preprocess the data, all polygons must be rasterized. The resulting layers are then merged on the longitude and latitude of their cell centroids, producing a large data frame in which each observation is a cell/pixel with its associated covariates. Parameter inference and overall results may therefore change depending on the resolution of the grid. The resolution used in this post is relatively fine, since it is that of the slope raster, which was computed from the raw digital elevation model provided by Natural Resources Canada. The final dataset contains over 46 000 rows/pixels.

3. Various data engineering steps also play an important role in the final results. The procedure used to identify cliffs is particularly important. In this post, a slope greater than or equal to 13.7 degrees was categorized as a cliff. This corresponds to the 95th percentile of slope over the entire La Cite borough elevation raster and qualitatively represents the boundary of the promontory very well.

4. River banks were identified by taking the hydrography shapefiles from the Quebec open data geoportal, building a 250-meter buffer around the water bodies and then taking the symmetric difference between the two polygons to keep only the banks.

5. Even though the Quebec Promontory extends beyond La Cite-Limoilou to Sainte-Foy and Sillery, the model was purposely calibrated only on this borough because 1) it is rather computationally intensive to perform raster operations on large regions at a fixed resolution and 2) the presence of cliffs presumably has a stronger effect in more densely populated areas like La Cite-Limoilou, which have fewer trees overall than the suburbs.

6. Park shapefiles were manually created because the quality of the land use data is variable and at times inconsistent. For instance, some parks are classified as public utilities, but this category also contains other types of urban zoning.
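The preprocessing described in notes 2 and 3 (joining rasterized layers on their cell centroid, then flagging cliffs with a slope percentile) can be sketched in plain Python as below. This is an illustrative sketch, not the pipeline used for the post: the function names, the `isCliff` column and the toy grids are hypothetical, the join keeps only cells present in every layer, and the percentile is taken over the joined cells rather than over the whole borough raster.

```python
import math

def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def build_pixel_table(layers, slope_key="slope", q=95):
    """Inner-join layers on cell centroid and flag cliffs by slope percentile.

    `layers` maps a covariate name to a dict {(lon, lat): value}.
    """
    names = list(layers)
    common = set(layers[names[0]])
    for name in names[1:]:
        common &= set(layers[name])  # keep cells found in every layer
    threshold = percentile([layers[slope_key][c] for c in common], q)
    rows = []
    for lon, lat in sorted(common):
        row = {"lon": lon, "lat": lat}
        row.update({n: layers[n][(lon, lat)] for n in names})
        # A cell is a cliff when its slope reaches the percentile threshold.
        row["isCliff"] = int(row[slope_key] >= threshold)
        rows.append(row)
    return rows, threshold

# Tiny made-up grids standing in for the real rasters.
layers = {
    "slope": {(0, 0): 1.0, (0, 1): 2.0, (0, 2): 3.0, (0, 3): 4.0, (0, 4): 20.0},
    "tree":  {(0, 0): 0, (0, 1): 0, (0, 2): 1, (0, 3): 1, (0, 4): 1, (0, 5): 1},
}
rows, threshold = build_pixel_table(layers)
```

Real centroids are floating-point coordinates, so an actual join would likely need the layers to share exactly the same grid, or the keys to be rounded to a common precision, before matching.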