Urban Grammar AI
We know little about how the way we organise cities over space influences social, economic and environmental outcomes, in part because it is hard to measure.
Satellite imagery, combined with cutting-edge AI, can provide a source of data to track the evolution of the built environment at unprecedented detail.
This project develops a conceptual framework to characterise urban structure through the notions of spatial signatures and urban grammar, and will deploy it to generate open data products and insight about the evolution of cities.
29 September - How to create a vector-based web map and host it on GitHub
This is the map we have created for the Urban Grammar AI project. It was created using an open source software stack and is hosted on GitHub, for free.
This post will walk you through the whole process of generating the map, step by step, so you can create your own. It is a bit longer than usual, so here is a quick outline for better orientation:
Tile format and compression
28 September - Scientific Data paper
The second output of the month, and one closely related to the previous one! If in the Habitat paper we set out to describe the core concepts behind spatial signatures, in this one we showed it is possible to deploy them at scale while retaining detail and consistency. We applied the notion of spatial signatures to Great Britain to create an open data product. Here are the full coordinates:
Fleischmann, M., & Arribas-Bel, D. (2022). “Geographical characterisation of British urban form and function using the spatial signatures framework”. Scientific Data, 9(1), 1-15. https://doi.org/10.1038/s41597-022-01640-8
Published version (Open Access)
Code repository@Github (Web render)
27 September - Habitat paper
This month has been a big one in terms of academic outputs. The first one has been the conceptual paper putting forth many of the ideas that underpin much of the Urban Grammar project. Here are the full coordinates:
Arribas-Bel, D., & Fleischmann, M. (2022). “Understanding (urban) spaces through form and function”. Habitat International, 128, 102641. https://doi.org/10.1016/j.habitatint.2022.102641
27 July - The Spatial Signatures on ITINERANT
Dani and Martin participated earlier this month in the stakeholder engagement workshop organised by the ITINERANT project. This was an event where the project presented final results to a series of stakeholders including the ONS Data Science Campus, the Liverpool City Region, or the UK 2070 Commission. The event was held in Liverpool but streamed online and the talks are available on YouTube now (Spatial Signatures starting at minute 23).
14 July - “Open by Default” talk at the RSS Merseyside
Last month, Dani participated in the “Using open data sources” event put
together by the lovely folks at the Merseyside chapter of the RSS
and HiPy. There, he delivered the talk “Open by
default - Developing reproducible, computational research”. You can find the
slides in the usual place (and
in PDF directly).
The video is also available on YouTube:
This was a similar, albeit refined, talk to the one he did in Chicago and it is great
that it is now available to watch online.
01 July - World Urban Forum
Last June, Martin and Dani attended the World Urban Forum. It was a fascinating experience in many ways. Dani wrote down his thoughts on a blog post you can read here.
28 June - GeoPython
Earlier this month, Martin represented the Urban Grammar at this year’s edition of GeoPython. During his time there, he kept himself rather busy. Here is a quick summary with references of all the bits he participated in:
Delivered the talk “Open by Default - Developing reproducible, computational research” (slides available here)
Co-delivered the talk “State of GeoPandas”, together with Joris Van den Bossche (slides available here)
Co-delivered with Joris Van den Bossche the workshop “Scaling up vector analysis with Dask-Geopandas” (materials available in this repo)
10 June - Data in Brief paper
This month saw a new paper/open data product of the Urban Grammar sphere see the light of day. As part of the spin-off ITINERANT project, and in collaboration with GDSL colleagues Krasen Samardzhiev, Alessia Calafiore, and Francisco Rowe, we published an open data product and data descriptor that presents a new classification of signatures entirely based on function. Here are the coordinates where you can find everything:
Samardzhiev, K.; Fleischmann, M.; Arribas-Bel, D.; Calafiore, A.; Rowe, F. (2022).
“Functional signatures in Great Britain: A dataset”.
Data in Brief, 43.
Published Version (Open Access)
09 June - ISUF Italy
In early June, Martin attended the 6th ISUF Italy conference “Morphology and Urban Design. New strategies for a changing society”, organised by the University of Bologna in a hybrid mode (Martin was present online). We took the opportunity to talk about Detecting urban typology from multispectral satellite imagery using neural networks, where we combined the work we currently focus on in the Urban Grammar project and the one published in a recent CEUS paper. It allowed us to illustrate the application of remote sensing and neural networks to urban form in both supervised (detection of signatures from Sentinel 2) and unsupervised (the CEUS paper) use cases. With an audience drawn from the ISUF community, which revolves heavily around architecture, the talk started an engaging discussion on the current limits of openly available satellite imagery in the context of urban morphology.
The talk itself was not recorded but the slides are available from the usual place in HTML (this time we couldn’t build a PDF for download satisfactorily).
19 May - OS visit
On May 17th and 18th, Martin and Dani spent two days visiting the Ordnance
Survey in Southampton. The trip was planned as an institutional visit to
exchange ideas, discuss the project and, more generally, interact with our
advisory board member, Dr. Isabel Sargent and her team at OS Research.
We had a fantastic time. We presented our current work on using computer
vision, a bit of geographic magic, and satellite imagery to automate the
recognition of our upcoming spatial signatures for Great Britain. You can
check out the slides we used for that presentation in our usual talks
repository in our two common formats:
HTML for the
PDF for download
Beyond the purely scheduled, we had the opportunity to see OS from the inside
and to chat to a lot of really clever people about our project, its value, and
potential solutions to the challenges we’re currently facing.
05 May - Chicago visit
The first week of May, Dani visited the University of Chicago. As part of the
trip, he (re-)connected with folks at the Center for Spatial Data Science
and met a bunch of new friends at the Mansueto Institute for Urban
Innovation. All in all, it was a fantastic week where there was even a bit
of time to discuss all things urban form and function.
On May 4th, he delivered the lecture “‘Open by Default’ - Developing
reproducible, computational research” on open workflows for modern
computational research at one of the GIS courses offered at Chicago and
taught by Dr. Marynia Kolak. The
slides of the talk are available at the Urban Grammar’s talks
repository in our two common formats:
This was a particularly tricky talk to conceive and prepare. Conceptually, it
was a bit out of our “comfort zone” in that it is not really about the
research we are doing at the Urban Grammar, but about the process we follow
to realize it. It was also hard to structure it in a way that made sense because
it pulled from many different aspects of the project, from the approach we
take to writing slide decks to the computational infrastructure on which all
of our computations rely. We are nevertheless happy with the outcome. It’s not
ideal, and we will probably refine it in successive iterations (we’re planning
on giving similar talks in the near future, stay tuned!), but this is a great
25 April - CEUS paper
This week saw the first peer-reviewed publication related to the Urban Grammar
see the light of day. Full coordinates of the paper are available here:
Singleton, A.; Arribas-Bel, D.; Murray, J.; Fleischmann, M. (2022). “Estimating generalized measures of local neighbourhood context from multispectral satellite images using a convolutional neural network”. Computers, Environment and Urban Systems, 95. 10.1016/j.compenvurbsys.2022.101802
20 April - Form-based Signatures paper and data
Last year, we attended the International Seminar on Urban Form 2021,
which was held virtually in Glasgow, Scotland and presented a classification of Great
Britain based on the form component of Spatial Signatures; we called those morphosignatures. Earlier this year the
proceedings from the conference were published, including our paper describing the work.
On that occasion, we also published the dataset of resulting morphosignatures.
You can get the paper either from the original proceedings under DOI
10.17868/strath.00080527 or from our Github
(8MB). The bibtex citation for the paper is attached below.
The code we used to generate the morphosignatures is available on the WP2 Jupyter book of the project, since a lot of it is shared with the codebase we developed to generate our main signature classification. If you want to have a peek into the specific notebooks for the clustering step (the most distinctive part of this exercise), you can find them at the following links:
Second level form signatures
11 April - GISRUK talk
On April 7th, we had a chance to attend the first large in-person academic conference
since the project started more than two years ago. Martin joined GISRUK, coincidentally
organised by the Geographic Data Science Lab, and presented ongoing work on developing
an AI model to detect spatial signatures based on satellite imagery. The talk focused on
the progress we have made so far and specifically on geographical questions arising when
we try to link granular geometry of signatures to satellite images and rectangular chips
sampled from them.
We have received good feedback and suggestions for the next steps we are currently
trying to implement in our plans. For anyone interested, the slides presented during the
conference are available
here or as a PDF (33MB)
here. The extended abstract
of the talk is also available in the conference
29 March - Contribution to tobler once again: Areal interpolation even faster
Last year, we needed areal interpolation to transfer data from various administrative
and statistical spatial units like Output areas to enclosed tessellation cells. To do
that, we used the Tobler Python package from the PySAL
family. And since we needed to scale it to a national level, we
spent some time refactoring the tools to make it much more performant. All that is
discussed in this blog post.
After a year, we needed areal interpolation again. This time, we took a slightly
different approach. We did not want to partition the data a priori. We also knew that
the first part of the function would work well as we had already reimplemented it
before. However, we hit another bottleneck, this time in the second part of the code,
which we had not touched before. What happened?
It is a bit technical. We need to store a relation between source and target geometries.
Say that we have 100 000 source geometries and 1 000 000 target ones. We essentially
want an array of 100 000 rows and 1 000 000 columns. That does not fit in memory. But since a
lot of the cells of such an array would say 0, we can omit them and use a sparse array
(or sparse matrix) format, which behaves like a normal dense array but is way more
memory efficient. However, there are multiple ways of storing the data in a sparse
array. Tobler was using DOK - Dictionary of Keys. That is really fast if you need to
access individual records quickly. But that is not the application we have in tobler. We
need quick row indexing (apart from a few other things). Fortunately, the fix was
simple. We replaced DOK with CSR - Compressed Sparse Row matrix, and the results were
miraculous. While the small test benchmark was about 45x faster, our actual
interpolation was faster by several orders of magnitude. The original code did not
finish after hours. The new one was done in under a second.
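To make the difference between the two layouts concrete, here is a minimal pure-Python sketch. It is a toy illustration only, not the code in tobler or the SciPy implementation: the function names and the tiny `weights` table are hypothetical. It stores the same non-zero entries once as a dictionary of keys and once as the three CSR arrays, and shows why pulling out a whole row is cheap in CSR but requires scanning every stored entry in this naive DOK layout.

```python
# Toy illustration only: hypothetical helpers, not tobler's or SciPy's code.

def dok_row(dok, row):
    """Fetch one row from a DOK table by scanning every stored key."""
    return {c: v for (r, c), v in dok.items() if r == row}


def dok_to_csr(dok, n_rows):
    """Convert a {(row, col): value} dictionary into CSR arrays."""
    # indptr[r] .. indptr[r + 1] delimits the entries that belong to row r
    indptr = [0] * (n_rows + 1)
    for (r, _c) in dok:
        indptr[r + 1] += 1
    for r in range(n_rows):
        indptr[r + 1] += indptr[r]
    # Store column indices and values ordered by row, then by column
    data, indices = [], []
    for (_r, c), v in sorted(dok.items()):
        indices.append(c)
        data.append(v)
    return data, indices, indptr


def csr_row(data, indices, indptr, row):
    """Fetch one row from CSR arrays with a single contiguous slice."""
    start, end = indptr[row], indptr[row + 1]
    return dict(zip(indices[start:end], data[start:end]))


# Hypothetical weights: 3 source geometries (rows) x 5 target geometries (columns)
weights = {(0, 1): 0.4, (0, 3): 0.6, (2, 0): 1.0, (2, 4): 0.25}
data, indices, indptr = dok_to_csr(weights, n_rows=3)

for row in range(3):
    # Both layouts hold the same information, only the access cost differs
    assert csr_row(data, indices, indptr, row) == dok_row(weights, row)
```

In this sketch the DOK row lookup touches every stored entry each time, while the CSR lookup only reads the slice between two offsets, which is the kind of row access the interpolation needs.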
Small things, like storage formats, matter. If you want to see the effect of different
sparse array formats, see this
notebook. And the
change in tobler is here and will be part of
the next release.
18 March - Third Advisory Board
On March 17th 2022, the (hopefully) last online Advisory Board meeting was held over
Zoom. The assumption is that the final one in June will happen as an in-person meeting.
It was the first meeting when the discussion could focus on the third WP dealing with
developing an AI model to detect signatures from satellite imagery. However, AI was not
the only point of discussion as we covered an overview of the current progress on the
publications, open data, dissemination, and future plans for an impact strategy. The
final section of the meeting was dedicated to the remaining time on the project and
tried to answer the question, “what should be done by the end of the year?”.
We enjoyed a fruitful discussion and are looking forward to meeting all board members in
10 February - The Urban Grammar on MapScaping podcast!
Last month, Dani had a chance to speak with Daniel from the Mapscaping podcast about the Urban Grammar project. You can listen to the conversation over at the podcast’s website:
Or on your preferred purveyor of podcasts (Apple Podcasts, Stitcher, Google Podcasts, Spotify).
17 December - British Spatial Signatures at Towards urban analytics 2.0
On Tuesday, November 30th, Martin gave a lightning talk on the Spatial Signatures of Great Britain at the Alan Turing Institute event Towards urban analytics 2.0, which was held in person in Leeds, UK. The talk introduced the classification of Great Britain that is now available in the form of an interactive map and for download either from the Consumer Data Research Centre’s open data portal or the archived version from figshare. While there was no space for a Q&A directly after the talk, it raised interest that led to several great informal discussions afterwards.
You can watch the talk at:
The slides from the talk are here or here is the PDF file.
10 December - Third Advisory Board
On December 9th 2021, we held the meeting of the Advisory Board for the project. Still being limited by the global pandemic, we had a fruitful Zoom call filled with an exciting discussion on the opportunities Spatial Signatures offer.
The session started with an overview of the current progress and an explanation of the whole process of generating spatial signatures, from data acquisition to empirical exploration. We then spent
some time discussing the open infrastructure built around the project, resulting in an open data product and maximum transparency of the process behind it. We followed with a focused discussion on dissemination and impact of the classification.
Three hours later, we finished with a lot of ideas and potential research avenues and collaborations to be explored. We hope that the situation permits another Advisory board meeting soon and hopefully, in person.
10 December - British Spatial Signatures talk at CASA
On Monday, November 29th, Dani gave a talk on the Spatial Signatures project at the CASA seminars. The talk covered known bits from previous talks, mostly about the framing of the problem and the foundational blocks of the Spatial Signatures but, more importantly, presented for the first time our results from the British signatures. The audience was engaged and had many super interesting questions. Thanks!
CASA recorded and published the video on YouTube:
The slides used are here or, if you prefer it, here is the PDF file.