Urban Grammar AI research project

We know little about how the way we organise cities over space influences social, economic and environmental outcomes, in part because it is hard to measure.

Satellite imagery, combined with cutting-edge AI, can provide a source of data to track the evolution of the built environment at unprecedented detail.

This project develops a conceptual framework to characterise urban structure through the notions of spatial signatures and urban grammar, and will deploy it to generate open data products and insight about the evolution of cities.

GDSL University of Liverpool ATI


  • 14 July - “Open by Default” talk at the RSS Merseyside

    Last month, Dani participated in the “Using open data sources” event put together by the lovely folks at the Merseyside chapter of the RSS and HiPy. There, he delivered the talk “Open by default - Developing reproducible, computational research”. You can find the slides in the usual place (or here and in PDF directly). The video is also available on YouTube:

    This was a similar, albeit refined, talk to the one he gave in Chicago, and it is great that it is now available to watch online.

  • 10 June - Data in Brief paper

    This month saw a new paper/open data product of the Urban Grammar sphere see the light of day. As part of the spin-off ITINERANT project, and in collaboration with GDSL colleagues Krasen Samardzhiev, Alessia Calafiore, and Francisco Rowe, we published an open data product and data descriptor that presents a new classification of signatures entirely based on function. Here are the coordinates where you can find everything:

    Samardzhiev, K.; Fleischmann, M.; Arribas-Bel, D.; Calafiore, A.; Rowe, F. (2022). “Functional signatures in Great Britain: A dataset”. Data in Brief, 43. 10.1016/j.dib.2022.108335

    Published Version (Open Access)

    Code repository@GitHub

    Data product@Figshare

  • 09 June - ISUF Italy

    In early June, Martin attended the 6th ISUF Italy conference “Morphology and Urban Design. New strategies for a changing society”, organised by the University of Bologna in a hybrid mode (Martin was present online). We took the opportunity to talk about Detecting urban typology from multispectral satellite imagery using neural networks, where we combined the work we currently focus on in the Urban Grammar project and the one published in a recent CEUS paper. It allowed us to illustrate the application of remote sensing and neural networks to urban form in both supervised (detection of signatures from Sentinel 2) and unsupervised (the CEUS paper) use cases. With an audience drawn from the ISUF community, which revolves heavily around architecture, the talk started an engaging discussion on the current limits of openly available satellite imagery in the context of urban morphology.

    The talk itself was not recorded but the slides are available from the usual place in HTML (this time we couldn’t build a PDF for download satisfactorily).

  • 19 May - OS visit

    On May 17th and 18th, Martin and Dani visited the Ordnance Survey in Southampton. The trip was planned as an institutional visit to exchange ideas, discuss the project and, more generally, interact with our advisory board member, Dr. Isabel Sargent, and her team at OS Research.

    We had a fantastic time. We presented our current work on using computer vision, a bit of geographic magic, and satellite imagery to automate the recognition of our upcoming spatial signatures for Great Britain. You can check out the slides we used for that presentation in our usual talks repository in our two common formats:

    HTML for the browser

    PDF for download

    Beyond the purely scheduled, we had the opportunity to see OS from the inside and to chat to a lot of really clever people about our project, its value, and potential solutions to the challenges we’re currently facing.

  • 05 May - Chicago visit

    The first week of May, Dani visited the University of Chicago. As part of the trip, he (re-)connected with folks at the Center for Spatial Data Science and met a bunch of new friends at the Mansueto Institute for Urban Innovation. All in all, it was a fantastic week where there was even a bit of time to discuss all things urban form and function.

    On May 4th, he delivered the lecture “‘Open by Default’ - Developing reproducible, computational research” on open workflows for modern computational research at one of the GIS courses offered at Chicago and taught by Dr. Marynia Kolak. The slides of the talk are available at the Urban Grammar’s talks repository in our two common formats:

    HTML for the browser

    PDF for download

    This was a particularly tricky talk to conceive and prepare. Conceptually, it was a bit out of our “comfort zone” in that it is not really about the research we are doing at the Urban Grammar, but about the process we follow to realize it. It was also hard to structure it in a way that made sense because it pulled from many different aspects of the project, from the approach we take to writing slide decks to the computational infrastructure on which all of our computations rely. We are nevertheless happy with the outcome. It’s not ideal, and we will probably refine it in successive iterations (we’re planning on giving similar talks in the near future, stay tuned!), but this is a great start.

  • 25 April - CEUS paper

    This week saw the first peer-reviewed publication related to the Urban Grammar see the light of day. Full coordinates of the paper are available here:

    Singleton, A.; Arribas-Bel, D.; Murray, J.; Fleischmann, M. (2022). “Estimating generalized measures of local neighbourhood context from multispectral satellite images using a convolutional neural network”. Computers, Environment and Urban Systems, 95. 10.1016/j.compenvurbsys.2022.101802

    Published Version (Open Access)

    Code repository@GitHub

    Data repository@Dataverse

  • 20 April - Form-based Signatures paper and data

    Last year, we attended the International Seminar on Urban Form 2021, which was held virtually in Glasgow, Scotland and presented a classification of Great Britain based on the form component of Spatial Signatures; we called those morphosignatures. Earlier this year the proceedings from the conference were published, including our paper describing the work. On that occasion, we also published the dataset of resulting morphosignatures.

    You can get the paper either from the original proceedings under DOI 10.17868/strath.00080527 or from our GitHub repository (8MB). The bibtex citation for the paper is attached below.

    The code we used to generate the morphosignatures is available on the WP2 Jupyter book of the project, since a lot of it is shared with the codebase we developed to generate our main signature classification. If you want to have a peek into the specific notebooks for the clustering step (the most distinctive part of this exercise), you can find them at the following links:

    Form-based signatures

    Second level form signatures

  • 11 April - GISRUK talk

    On April 7th, we had a chance to attend the first large in-person academic conference since the project started more than two years ago. Martin joined GISRUK, coincidentally organised by the Geographic Data Science Lab, and presented ongoing work on developing an AI model to detect spatial signatures based on satellite imagery. The talk focused on the progress we have made so far and specifically on geographical questions arising when we try to link granular geometry of signatures to satellite images and rectangular chips sampled from them.

    We received good feedback and suggestions for the next steps, which we are currently working into our plans. For anyone interested, the slides presented during the conference are available here or as a PDF (33MB) here. The extended abstract of the talk is also available in the conference proceedings.

  • 29 March - Contribution to tobler once again: Areal interpolation even faster

    Last year, we needed areal interpolation to transfer data from various administrative and statistical spatial units, like Output Areas, to enclosed tessellation cells. To do that, we used the Tobler Python package from the PySAL family. And since we needed to scale it to a national level, we spent some time refactoring the tools to make them much more performant. All that is discussed in this blog post.

    After a year, we needed areal interpolation again. This time, we took a slightly different approach. We did not want to partition the data a priori. We also knew that the first part of the function would work well as we had already reimplemented it before. However, we hit another bottleneck, this time in the second part of the code, which we had not touched before. What happened?

    It is a bit technical. We need to store a relation between source and target geometries. Say that we have 100 000 source geometries and 1 000 000 target ones. We essentially want an array of 100 000 rows and 1 000 000 columns. That does not fit in memory. But since a lot of the cells of such an array would be 0, we can omit them and use a sparse array (or sparse matrix) format, which behaves like a normal dense array but is far more memory efficient. However, there are multiple ways of storing the data in a sparse array. Tobler was using DOK - Dictionary of Keys. That is really fast if you need to access individual records quickly. But that is not the application we have in tobler. We need quick row indexing (apart from a few other things). Fortunately, the fix was simple. We replaced DOK with CSR - Compressed Sparse Row matrix, and the results were miraculous. While the small test benchmark was about 45x faster, our actual interpolation was faster by several orders of magnitude. The original code did not finish after hours. The new one was done in under a second.

    Small things, like storage formats, matter. If you want to see the effect of different sparse array formats, see this notebook. And the change in tobler is here and will be part of the next release.
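
    To make the difference between the two layouts concrete, here is a minimal sketch in plain Python. It is not tobler's code, and in practice SciPy's sparse module provides both formats; the function names (dok_to_csr, csr_row, dok_row) and the toy overlap values are made up for illustration. Rows stand for source geometries, columns for target geometries, and values for overlap areas.

```python
def dok_to_csr(dok, n_rows):
    """Convert a {(row, col): value} dict into CSR lists (indptr, indices, data)."""
    # Count stored entries per row, then accumulate into row start offsets.
    indptr = [0] * (n_rows + 1)
    for row, _col in dok:
        indptr[row + 1] += 1
    for row in range(n_rows):
        indptr[row + 1] += indptr[row]

    indices = [0] * len(dok)
    data = [0.0] * len(dok)
    next_slot = list(indptr[:-1])  # next free position within each row
    for (row, col), value in sorted(dok.items()):
        slot = next_slot[row]
        indices[slot] = col
        data[slot] = value
        next_slot[row] += 1
    return indptr, indices, data


def csr_row(indptr, indices, data, row):
    """Row lookup in CSR: one contiguous slice, independent of the other rows."""
    start, end = indptr[row], indptr[row + 1]
    return list(zip(indices[start:end], data[start:end]))


def dok_row(dok, row):
    """Row lookup in DOK: every stored entry has to be inspected."""
    return sorted((col, value) for (r, col), value in dok.items() if r == row)


# Hypothetical overlap areas between 3 source and 4 target geometries.
overlap = {(0, 0): 2.0, (0, 1): 1.0, (1, 1): 3.0, (2, 2): 4.0, (2, 3): 4.0}
indptr, indices, data = dok_to_csr(overlap, n_rows=3)

for row in range(3):
    print(row, csr_row(indptr, indices, data, row))
```

    Both lookups return the same pairs, but the dictionary version scans all stored entries for every row it is asked about, while the CSR version only reads the slice between two offsets. That is the kind of access pattern that matters when the table has millions of entries.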

  • 18 March - Third Advisory Board

    On March 17th 2022, the (hopefully) last online Advisory Board meeting was held over Zoom. The assumption is that the final one in June will happen as an in-person meeting in Liverpool.

    It was the first meeting in which the discussion could focus on the third WP, which deals with developing an AI model to detect signatures from satellite imagery. However, AI was not the only point of discussion, as we also covered the current progress on publications, open data, dissemination, and future plans for an impact strategy. The final section of the meeting was dedicated to the remaining time on the project and tried to answer the question, “what should be done by the end of the year?”.

    We enjoyed a fruitful discussion and are looking forward to meeting all board members in Liverpool soon.

  • 10 February - The Urban Grammar on MapScaping podcast!

    Last month, Dani had a chance to speak with Daniel from the Mapscaping podcast about the Urban Grammar project. You can listen to the conversation over at the podcast’s website:

    Or on your preferred purveyor of podcasts (Apple Podcasts, Stitcher, Google Podcasts, Spotify).

  • 17 December - British Spatial Signatures at Towards urban analytics 2.0

    On Tuesday, November 30th, Martin gave a lightning talk on the Spatial Signatures of Great Britain at the Alan Turing Institute event Towards urban analytics 2.0, which was held in person in Leeds, UK. The talk introduced the classification of Great Britain that is now available in the form of an interactive map and for download either from the Consumer Data Research Centre’s open data portal or the archived version from figshare. While there was no space for a Q&A directly after the talk, it sparked interest that led to several great informal discussions afterwards.

    You can watch the talk at:

    The slides from the talk are here or here is the PDF file.

  • 10 December - Third Advisory Board

    On December 9th 2021, we held the meeting of the Advisory Board for the project. Still being limited by the global pandemic, we had a fruitful Zoom call filled with an exciting discussion on the opportunities Spatial Signatures offer.

    The session started with an overview of the current progress and an explanation of the whole process of generating spatial signatures, from data acquisition to empirical exploration. We then spent some time discussing the open infrastructure built around the project, resulting in an open data product and maximum transparency of the process behind it. We followed with a focused discussion on the dissemination and impact of the classification.

    Three hours later, we finished with a lot of ideas and potential research avenues and collaborations to be explored. We hope that the situation permits another Advisory board meeting soon and hopefully, in person.

  • 10 December - British Spatial Signatures talk at CASA

    On Monday, November 29th, Dani gave a talk on the Spatial Signatures project at the CASA seminars. The talk covered known bits from previous talks, mostly about the framing of the problem and the foundational blocks of the Spatial Signatures but, more importantly, presented for the first time our results from the British signatures. The audience was engaged and had many super interesting questions. Thanks!

    CASA recorded and published the video on YouTube:

    The slides used are here or, if you prefer it, here is the PDF file.

  • 09 December - Urban Grammar in the Turing Catch Up

    Last month, on November 23rd, Dani gave a quick overview of the Urban Grammar project at the monthly catch up of the Alan Turing Institute. This is an internal call open to all fellows and staff. Dani gave an overview of the project, and covered in a bit more detail the aspects that have been completed already, including the development of a Spatial Signature classification for Great Britain.

    There is no video of the talk, but you can check out the slides used here in the talks page of the project or, if you prefer to download a PDF version, you can do so here.

  • 03 August - xyzservices: a unified source of XYZ tile providers in Python

    Within the project, we often need to map the results within different contexts ranging from static to interactive maps. We felt that it could be a smoother experience and built xyzservices.

    The Python ecosystem offers numerous tools for the visualisation of data on a map. A lot of them depend on XYZ tiles, providing a base map layer, either from OpenStreetMap, satellite or other sources. The issue is that each package that offers XYZ support manages its own list of supported providers.

    We have built the xyzservices package to support any Python library making use of XYZ tiles. I’ll try to explain the rationale behind it, without going into the details of the package. If you want those details, check its documentation.

    Let me quickly look at a few popular packages and their approach to tile management - contextily, folium, ipyleaflet and holoviews.

    contextily brings contextual base maps to static geopandas plots. It comes with a dedicated contextily.providers module, which contains a hard-coded list of providers scraped from the list used by leaflet (as of version 1.1.0).
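
    The duplication problem described above can be illustrated with a small sketch of a single provider registry shared by several consumers. This is not the xyzservices API (check its documentation for that); the Provider class, the PROVIDERS dictionary and the two consumer functions are hypothetical names chosen for this example. Only the OpenStreetMap URL template is real.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    """Hypothetical record describing one XYZ tile source."""
    name: str
    url: str          # template with {z}, {x} and {y} placeholders
    attribution: str

    def tile_url(self, x: int, y: int, z: int) -> str:
        """Fill the template for one tile (column x, row y, zoom z)."""
        return self.url.format(x=x, y=y, z=z)


# One shared registry instead of a separate hard-coded list per package.
PROVIDERS = {
    "OpenStreetMap": Provider(
        name="OpenStreetMap",
        url="https://tile.openstreetmap.org/{z}/{x}/{y}.png",
        attribution="(C) OpenStreetMap contributors",
    ),
}


def static_basemap_tile(provider_name: str, x: int, y: int, z: int) -> str:
    """Stand-in for a static plotting library: it needs concrete tile URLs."""
    return PROVIDERS[provider_name].tile_url(x=x, y=y, z=z)


def interactive_layer_config(provider_name: str) -> dict:
    """Stand-in for an interactive map library: it needs template and credit."""
    provider = PROVIDERS[provider_name]
    return {"url": provider.url, "attribution": provider.attribution}


print(static_basemap_tile("OpenStreetMap", x=1, y=2, z=3))
print(interactive_layer_config("OpenStreetMap"))
```

    Both consumers read from the same registry, so adding or correcting a provider happens in one place rather than in every package.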

  • 09 July - Signatures for (all of) GB at Turing Urban Analytics

    On June 30th, Dani provided an overview of ongoing work to build Spatial Signatures for all of Great Britain at the Turing’s Urban Analytics monthly meetup. This was the first public presentation covering our work at a national scale and we are excited at how positively it was received.

    The talk is published now on YouTube:

    You can have a look at Martin’s presentation at:

    And if you want to check out the slides, you can see the HTML deck or the PDF version.

    We are nearing the point where we can release some of the data products we have been building and feature in this talk so, if you are interested, stay tuned as we will have more news to share shortly!

  • 05 July - Form-based Signatures at ISUF 2021

    On June 30th, Martin presented a classification of Great Britain based on the form component of Spatial Signatures at the International Seminar on Urban Form 2021, which was held virtually in Glasgow, Scotland. This was the first time we have presented results covering the whole GB, albeit using a single component only (function has been temporarily excluded). The work was received positively and spurred a bit of discussion. Within a few months, the results should be published in the conference proceedings.

    You can have a look at Martin’s presentation at:

    The slides used are here or, if you prefer it, here is the PDF file (24MB).

  • 04 June - Google Summer of Code

    Members of the Urban Grammar project are getting involved in developing the next generation of tools for distributed processing of geospatial vector data. In its first part, the Urban Grammar project heavily depends on the processing of vector geospatial data using the GeoPandas Python library. However, to scale GeoPandas algorithms to the extent of Great Britain, we need to do more than the library can do by default. GeoPandas operations are currently all single-threaded, severely limiting the scalability of its usage and leaving most of the CPU cores just lying around, doing nothing. Dask is a library that brings parallel and distributed computing to the ecosystem. For example, it provides a Dask DataFrame that consists of partitioned pandas DataFrames. Each partition can be processed by a different process, enabling the computation to be done in parallel or even out-of-core.

    We are using Dask within our workflows in bespoke scripts. However, Dask could provide ways to scale geospatial operations in GeoPandas in a similar way to how it does with pandas. There has been some effort to build a bridge between Dask and GeoPandas, currently taking the shape of the dask-geopandas library. While that already supports basic parallelisation, which we used in our code, some critical components are not ready yet. That should change during this summer within the Google Summer of Code project Martin is (co-)mentoring. We hope that this effort will allow us to significantly simplify and even speed up the custom machinery we built to create spatial signatures in WP2.
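
    The partitioning idea can be sketched with the standard library alone. This is not Dask or dask-geopandas code, just an illustration of the pattern under simple assumptions: the data is split into partitions, each partition is processed independently, and the partial results are combined at the end. The function names and the toy rectangles are made up, and a thread pool stands in for the separate worker processes a real setup would use.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(rows, n_partitions):
    """Split a list into n contiguous chunks of near-equal size."""
    size, extra = divmod(len(rows), n_partitions)
    partitions, start = [], 0
    for i in range(n_partitions):
        end = start + size + (1 if i < extra else 0)
        partitions.append(rows[start:end])
        start = end
    return partitions


def total_area(partition):
    """Per-partition work: sum the areas of (width, height) rectangles."""
    return sum(width * height for width, height in partition)


def map_partitions(func, partitions, max_workers=4):
    """Apply func to every partition concurrently and collect the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, partitions))


# Ten toy rectangles standing in for a table of geometries.
rectangles = [(w, w + 1) for w in range(1, 11)]
partitions = split_into_partitions(rectangles, n_partitions=4)
partial_sums = map_partitions(total_area, partitions)
grand_total = sum(partial_sums)
print(partial_sums, grand_total)
```

    Because each partition is handled on its own, the same structure also allows partitions to be loaded one at a time when the full dataset does not fit in memory.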

  • 29 April - Second Advisory Board

    On April 15th, 2021, we held the second meeting of the Advisory Board for the project. We are delighted that all board members joined us on Zoom for a few hours of exciting discussions on the progress and the future of the project.

    Dani started with an overview of our progress since the last meeting, which you can check in his UBDC talk. We followed with a focused discussion on the concepts of Spatial Signature and Enclosed Tessellation and our initial paper illustrating both on a sample of cities worldwide. We discussed the clarity of our ideas and the need for new spatial units and classification methods, along with their potential drawbacks and enhancements. In the last part, we tried to zoom out to see the bigger picture and fit the research within existing projects in academia and the public sector.

    After three hours of a very fruitful discussion, we finished with a lot of food for thought and ideas to be explored in the future. Let’s just hope that the Advisory Board meetings will soon happen physically in Liverpool to have an even more productive and friendly environment!