Urban Grammar AI
We know little about how the way we organise cities over space influences social, economic and environmental outcomes, in part because it is hard to measure.
Satellite imagery, combined with cutting-edge AI, can provide a source of data to track the evolution of the built environment at unprecedented detail.
This project develops a conceptual framework to characterise urban structure through the notions of spatial signatures and urban grammar, and will deploy it to generate open data products and insight about the evolution of cities.
14 July - “Open by Default” talk at the RSS Merseyside
Last month, Dani participated in the “Using open data sources” event put
together by the lovely folks at the Merseyside chapter of the RSS
and HiPy. There, he delivered the talk “Open by
default - Developing reproducible, computational research”. You can find the
slides in the usual place (and
in PDF directly).
The video is also available on YouTube:
This was a similar, albeit refined, talk to the one he did in Chicago and it is great
it is now available to watch online.
10 June - Data in Brief paper
This month saw a new paper/open data product of the Urban Grammar sphere see the light of day. As part of the spin-off ITINERANT project, and in collaboration with GDSL colleagues Krasen Samardzhiev, Alessia Calafiore, and Francisco Rowe, we published an open data product and data descriptor that presents a new classification of signatures entirely based on function. Here are the coordinates where you can find everything:
Samardzhiev, K.; Fleischmann, M.; Arribas-Bel, D.; Calafiore, A.; Rowe, F. (2022).
“Functional signatures in Great Britain: A dataset”.
Data in Brief, 43.
Published Version (Open Access)
09 June - ISUF Italy
In early June, Martin attended the 6th ISUF Italy conference “Morphology and Urban Design. New strategies for a changing society”, organised by the University of Bologna in a hybrid mode (Martin was present online). We took the opportunity to talk about Detecting urban typology from multispectral satellite imagery using neural networks, where we combined the work we currently focus on in the Urban Grammar project and the one published in a recent CEUS paper. It allowed us to illustrate the application of remote sensing and neural networks on urban form in both supervised (detection of signatures from Sentinel 2) and unsupervised (the CEUS paper) use cases. With the audience composed of the ISUF community revolving heavily around architecture, the talk started an engaging discussion on the current limits of openly available satellite imagery in the context of urban morphology.
The talk itself was not recorded but the slides are available from the usual place in HTML (this time we couldn’t build a PDF for download satisfactorily).
19 May - OS visit
On May 17th and 18th, Martin and Dani spent a week visiting the Ordnance
Survey in Southampton. The trip was planned as an institutional visit to
exchange ideas, discuss the project and, more generally, interact with our
advisory board member, Dr. Isabel Sargent and her team at OS Research.
We had a fantastic time. We presented our current work on using computer
vision, a bit of geographic magic, and satellite imagery to automate the
recognition of our upcoming spatial signatures for Great Britain. You can
check out the slides we used for that presentation in our usual talks
repository in our two common formats:
HTML for the
PDF for download
Beyond the purely scheduled, we had the opportunity to see OS from the inside
and to chat to a lot of really clever people about our project, its value, and
potential solutions to the challenges we’re currently facing.
05 May - Chicago visit
The first week of May, Dani visited the University of Chicago. As part of the
trip, he (re-)connected with folks at the Center for Spatial Data Science
and met a bunch of new friends at the Mansueto Institute for Urban
Innovation. All in all, it was a fantastic week where there was even a bit
of time to discuss all things urban form and function.
On May 4th, he delivered the lecture “‘Open by Default’ - Developing
reproducible, computational research” on open workflows for modern
computational research at one of the GIS courses offered at Chicago and
taught by Dr. Marynia Kolak. The
slides of the talk are available at the Urban Grammar’s talks
repository in our two common formats:
This was a particularly tricky talk to conceive and prepare. Conceptually, it
was a bit out of our “comfort zone” in that it is not really about the
research we are doing at the Urban Grammar, but about the process we follow
to realize it. It was also hard to structure it in a way that made sense because
it pulled from many different aspects of the project, from the approach we
take to writing slide decks to the computational infrastructure on which all
of our computations rely. We are nevertheless happy with the outcome. It’s not
ideal, and we will probably refine it in successive iterations (we’re planning
on giving similar talks in the near future, stay tuned!), but this is a great
25 April - CEUS paper
This week saw the first peer-reviewed publication related to the Urban Grammar
see the light of day. Full coordinates of the paper are available here:
Singleton, A.; Arribas-Bel, D.; Murray, J.; Fleischmann, M. (2022). “Estimating generalized measures of local neighbourhood context from multispectral satellite images using a convolutional neural network”. Computers, Environment and Urban Systems, 95. 10.1016/j.compenvurbsys.2022.101802
20 April - Form-based Signatures paper and data
Last year, we attended the International Seminar on Urban Form 2021,
which was held virtually in Glasgow, Scotland and presented a classification of Great
Britain based on the form component of Spatial Signatures; we called those morphosignatures. Earlier this year the
proceedings from the conference were published, including our paper describing the work.
On that occasion, we also published the dataset of resulting morphosignatures.
You can get the paper either from the original proceedings under DOI
10.17868/strath.00080527 or from our Github
(8MB). The bibtex citation for the paper is attached below.
The code we used to generate the morphosignatures is available on the WP2 Jupyter book of the project, since a lot of it is shared with the codebase we developed to generate our main signature classification. If you want to have a peek into the specific notebooks for the clustering step (the most unique part of this exercise), you can find them at the following links:
Second level form signatures
11 April - GISRUK talk
On April 7th, we had a chance to attend the first large in-person academic conference
since the project started more than two years ago. Martin joined GISRUK, coincidentally
organised by the Geographic Data Science Lab, and presented ongoing work on developing
an AI model to detect spatial signatures based on satellite imagery. The talk focused on
the progress we have made so far and specifically on geographical questions arising when
we try to link granular geometry of signatures to satellite images and rectangular chips
sampled from them.
We have received good feedback and suggestions for the next steps we are currently
trying to implement in our plans. For anyone interested, the slides presented during the
conference are available
here or as a PDF (33MB)
here. The extended abstract
of the talk is also available in the conference
29 March - Contribution to tobler once again: Areal interpolation even faster
Last year, we needed areal interpolation to transfer data from various administrative
and statistical spatial units like Output areas to enclosed tessellation cells. To do
that, we used the Tobler Python package from the PySAL
family. And since we needed to scale it to a national level, we
spent some time refactoring the tools to make them much more performant. All that is
discussed in this blog post.
After a year, we needed areal interpolation again. This time, we took a slightly
different approach. We did not want to partition the data a priori. We also knew that
the first part of the function would work well as we had already reimplemented it
before. However, we hit another bottleneck, this time in the second part of the code, which we
had not touched before. What happened?
It is a bit technical. We need to store a relation between source and target geometries.
Say that we have 100 000 source geometries and 1 000 000 target ones. We essentially
want an array of 100 000 rows and 1 000 000 columns. That does not fit in memory. But since a
lot of the cells of such an array would say 0, we can omit them and use a sparse array
(or sparse matrix) format, which behaves like a normal dense array but is way more
memory efficient. However, there are multiple ways of storing the data in a sparse
array. Tobler was using DOK - Dictionary of Keys. That is really fast if you need to
access individual records quickly. But that is not the application we have in tobler. We
need quick row indexing (apart from a few other things). Fortunately, the fix was
simple. We replaced DOK with CSR - Compressed Sparse Row matrix, and the results were
miraculous. While the small test benchmark was about 45x faster, our actual
interpolation was faster by several orders of magnitude. The original code did not
finish after hours. The new one was done in under a second.
Small things, like storage formats, matter. If you want to see the effect of different
sparse array formats, see this
notebook. And the
change in tobler is here and will be part of
the next release.
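To make the difference between the two storage formats more concrete, here is a minimal sketch in plain Python. It is not the tobler or SciPy implementation: the helper names (`dense_size_gb`, `dok_row`, `dok_to_csr`, `csr_row`) are hypothetical and the data structures are simplified stand-ins. It only illustrates why a dense array of the size mentioned above is out of reach, and why reading a whole row is cheap in a CSR layout but requires a scan in a dictionary-of-keys layout.

```python
def dense_size_gb(n_rows, n_cols, bytes_per_value=8):
    """Memory needed by a dense array of 64-bit floats, in gigabytes."""
    return n_rows * n_cols * bytes_per_value / 1e9


def dok_row(dok, row):
    """Row lookup in a dictionary of keys {(row, col): value}.

    Every stored entry has to be inspected, so the cost grows with the
    total number of non-zero values, not with the size of the row.
    """
    return {col: val for (r, col), val in dok.items() if r == row}


def dok_to_csr(dok, n_rows):
    """Convert the dictionary into three CSR lists: indptr, indices, data."""
    indptr = [0] * (n_rows + 1)
    for (r, _col) in dok:
        indptr[r + 1] += 1          # count non-zeros per row
    for r in range(n_rows):
        indptr[r + 1] += indptr[r]  # cumulative sum gives row boundaries
    indices, data = [], []
    for (_r, col), val in sorted(dok.items()):  # row-major order
        indices.append(col)
        data.append(val)
    return indptr, indices, data


def csr_row(csr, row):
    """Row lookup in CSR: two pointer reads and one contiguous slice."""
    indptr, indices, data = csr
    start, stop = indptr[row], indptr[row + 1]
    return dict(zip(indices[start:stop], data[start:stop]))


# Toy weights linking 3 source geometries to 4 target geometries.
weights = {(0, 1): 0.5, (0, 3): 0.5, (2, 0): 1.0}
csr = dok_to_csr(weights, n_rows=3)
print(dense_size_gb(100_000, 1_000_000))  # size of the dense equivalent
print(dok_row(weights, 0), csr_row(csr, 0))
```

Both lookups return the same row; the difference is that the CSR version never touches entries belonging to other rows, which is what matters when the same operation is repeated for every source geometry.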
18 March - Third Advisory Board
On March 17th 2022, the (hopefully) last online Advisory Board meeting was held over
Zoom. The assumption is that the final one in June will happen as an in-person meeting
It was the first meeting when the discussion could focus on the third WP dealing with
developing an AI model to detect signatures from satellite imagery. However, AI was not
the only point of discussion as we covered an overview of the current progress on the
publications, open data, dissemination, and future plans for an impact strategy. The
final section of the meeting was dedicated to the remaining time on the project and
tried to answer the question, “what should be done by the end of the year?”.
We enjoyed a fruitful discussion and are looking forward to meeting all board members in
10 February - The Urban Grammar on MapScaping podcast!
Last month, Dani had a chance to speak with Daniel from the MapScaping podcast about the Urban Grammar project. You can listen to the conversation over at the podcast’s website:
Or on your preferred purveyor of podcasts (Apple Podcasts, Stitcher, Google Podcasts, Spotify).
17 December - British Spatial Signatures at Towards urban analytics 2.0
On Tuesday, November 30th, Martin gave a lightning talk on the Spatial Signatures of Great Britain at the Alan Turing Institute event Towards urban analytics 2.0 that was held in person in Leeds, UK. The talk introduced the classification of Great Britain that is now available in the form of an interactive map and for download either from the Consumer Data Research Centre’s open data portal or the archived version from figshare. While there was no space for a Q&A directly after the talk, it raised interest, leading to several great informal discussions afterwards.
You can watch the talk at:
The slides from the talk are here or here is the PDF file.
10 December - Third Advisory Board
On December 9th 2021, we held the meeting of the Advisory Board for the project. Still being limited by the global pandemic, we had a fruitful Zoom call filled with an exciting discussion on the opportunities Spatial Signatures offer.
The session started with an overview of the current progress and an explanation of the whole process of generating spatial signatures, from data acquisition to empirical exploration. We then spent
some time discussing the open infrastructure built around the project, resulting in an open data product and maximum transparency of the process behind it. We followed with a focused discussion on dissemination and impact of the classification.
Three hours later, we finished with a lot of ideas and potential research avenues and collaborations to be explored. We hope that the situation permits another Advisory board meeting soon and hopefully, in person.
10 December - British Spatial Signatures talk at CASA
On Monday November 29th, Dani gave a talk on the Spatial Signatures project at the CASA seminars. The talk covered known bits from previous talks, mostly about framing of the problem and the foundational blocks of the Spatial Signatures but, more importantly, presented for the first time our results from the British signatures. The audience was engaged and had many super interesting questions. Thanks!
CASA recorded and published the video on YouTube:
The slides used are here or, if you prefer it, here is the PDF file.
09 December - Urban Grammar in the Turing Catch Up
Last month, on November 23rd, Dani gave a quick overview of the Urban Grammar project at the monthly catch up of the Alan Turing Institute. This is an internal call open to all fellows and staff. Dani gave an overview of the project, and covered in a bit more detail the aspects that have been completed already, including the development of a Spatial Signature classification for Great Britain.
There is no video of the talk, but you can check out the slides used here in the talks page of the project or, if you prefer to download a PDF version, you can do so here.
03 August - xyzservices: a unified source of XYZ tile providers in Python
Within the project, we often need to map the results within different contexts ranging from static to interactive maps. We felt that it could be a smoother experience and built xyzservices.
The Python ecosystem offers numerous tools for the visualisation of data
on a map. A lot of them depend on XYZ tiles, providing a base map layer,
either from OpenStreetMap, satellite or other sources. The issue is that
each package that offers XYZ support manages its own list of supported
We have built the xyzservices package to support any Python library making
use of XYZ tiles. I’ll try to explain the rationale for doing that,
without going into the details of the package. If you want those
details, check its documentation.
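For readers unfamiliar with XYZ tiles, the sketch below shows what such a tile source boils down to: a URL template with `{z}`, `{x}` and `{y}` placeholders, filled in with a zoom level and tile indices. It does not use xyzservices itself. The function names are hypothetical, the conversion follows the commonly documented Web Mercator "slippy map" tile scheme, and the OpenStreetMap template is included purely as an illustration.

```python
import math

# Illustrative XYZ URL template (OpenStreetMap's standard tile layer).
OSM_TEMPLATE = "https://tile.openstreetmap.org/{z}/{x}/{y}.png"


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) indices of the tile containing a lon/lat point.

    Assumes the Web Mercator tiling used by most XYZ providers, where
    zoom level z splits the world into 2**z by 2**z tiles.
    """
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that points on the eastern/southern edge stay in range.
    return min(x, n - 1), min(y, n - 1)


def tile_url(template, lon, lat, zoom):
    """Fill an XYZ template for the tile covering the given point."""
    x, y = lonlat_to_tile(lon, lat, zoom)
    return template.format(z=zoom, x=x, y=y)


print(tile_url(OSM_TEMPLATE, lon=0.0, lat=0.0, zoom=1))
```

Every mapping library needs such templates (plus attribution and other metadata) for each provider it supports, which is why keeping that list in one shared place is attractive.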
Let me quickly look at a few popular packages and their approach to tile
management - contextily, folium, ipyleaflet and holoviews.
contextily brings contextual base maps to static geopandas plots. It
comes with a dedicated contextily.providers module, which contains a
hard-coded list of providers scraped from the list used by leaflet
(as of version 1.1.0).
09 July - Signatures for (all of) GB at Turing Urban Analytics
On June 30th, Dani provided an overview of ongoing work to build Spatial Signatures for all
of Great Britain at the Turing’s Urban Analytics monthly meetup.
This was the first public presentation covering our work at a national scale and we are excited at how positively it was
The talk is published now on YouTube:
You can have a look at Martin’s presentation at:
And if you want to check out the slides, you can see the HTML deck or the PDF version.
We are nearing the point where we can release some of the data products we have been building and feature in this talk so, if you are interested, stay
tuned as we will have more news to share shortly!
05 July - Form-based Signatures at ISUF 2021
On June 30th, Martin presented a classification of Great Britain based on the form
component of Spatial Signatures at the International Seminar on Urban Form 2021, which
was held virtually in Glasgow, Scotland. This was the first time we have presented results covering the
whole GB, albeit using a single component only (function has been temporarily excluded). The work was received positively and spurred
a bit of discussion. Within a few months, the results should be published in the conference proceedings.
The slides used are here or, if you prefer it, here is the PDF file (24MB).
04 June - Google Summer of Code
Members of the Urban Grammar project are getting involved in developing the next
generation set of tools for distributed processing of geospatial vector data. For its
part, the Urban Grammar
project heavily depends on the processing of vector geospatial data using
the GeoPandas Python library. However, to scale GeoPandas
algorithms to the extent of Great Britain, we need to do more than the library can do by
default. GeoPandas operations are currently all single-threaded, severely limiting the
scalability of its usage and leaving most of the CPU cores just lying around, doing
nothing. Dask is a library that brings parallel and distributed
computing to the ecosystem. For example, it provides a Dask DataFrame that consists of
partitioned pandas DataFrames. Each partition can be processed by a different process
enabling the computation to be done in parallel or even out-of-core.
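The partition-then-combine idea can be sketched with the Python standard library alone. This is not Dask code and does not reflect its API: the function names are hypothetical, a plain list stands in for a DataFrame, and a thread pool stands in for the worker processes or cluster a real deployment would use.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(rows, n_partitions):
    """Split a list into n contiguous partitions of near-equal size."""
    size, extra = divmod(len(rows), n_partitions)
    partitions, start = [], 0
    for i in range(n_partitions):
        stop = start + size + (1 if i < extra else 0)
        partitions.append(rows[start:stop])
        start = stop
    return partitions


def partition_total(partition):
    """Work done independently on one partition (here, a simple sum)."""
    return sum(partition)


def parallel_total(rows, n_partitions=4):
    """Process each partition separately, then combine partial results."""
    partitions = split_into_partitions(rows, n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        partials = list(pool.map(partition_total, partitions))
    return sum(partials)


# Hypothetical building areas; each partition is summed on its own worker.
areas = list(range(10))
print(split_into_partitions(areas, 3))
print(parallel_total(areas, n_partitions=3))
```

Because each partition is handled on its own, no single worker ever needs the full dataset, which is also what makes out-of-core processing possible when partitions are loaded from disk one at a time.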
We are using Dask within our workflows in bespoke scripts. However, Dask could provide
ways to scale geospatial operations in GeoPandas in a similar way it does with pandas.
There has been some effort to build a bridge between Dask and GeoPandas, currently
taking the shape of the dask-geopandas
library. While that already supports basic parallelisation, which we used in our code,
some critical components are not ready yet. That should change during this summer within
the Google Summer of Code project Martin is (co-)mentoring. We hope that this effort
will allow us to significantly simplify and even speed up the custom machinery we built
to create spatial signatures in
29 April - Second Advisory Board
On April 15th, 2021, we held the second meeting of the Advisory Board for the project. We are delighted that all board members joined us on Zoom for a few hours of exciting discussions on the progress and the future of the project.
Dani started with an overview of our progress since the last meeting, which you can check in his UBDC talk. We followed with a focused discussion on the concepts of Spatial Signature and Enclosed Tessellation and our initial paper illustrating both on a sample of cities worldwide. We discussed the clarity of our ideas and the needs for new spatial units and classification methods, and their potential drawbacks and enhancements. In the last part, we tried to zoom out to see a bigger picture and fit the research within existing projects within academia and the public sector.
After three hours of a very fruitful discussion, we finished with a lot of food for thought and ideas to be explored in the future. Let’s just hope that the Advisory Board meetings will soon happen physically in Liverpool to have an even more productive and friendly environment!