On September 9, somewhere in the Pacific Ocean, a small weather disturbance led to a tropical depression. Ten days later, 9 million people were asked to evacuate as super typhoon Nanmadol bore down on Japan, bringing with it winds gusting at up to 250 km/h and rains “like never before”. Houses in Japan were hit by a devastating mix of wind, floods and landslides.
In the aftermath, our award-winning Tractable Property Estimator solution was working hard to help homeowners process and settle their insurance claims as quickly as possible, with some payouts reaching bank accounts on the same day. This is revolutionary for the industry and has a real-world impact on people’s livelihoods, allowing them to recover from a natural disaster much faster than before.
But can we do even better?
Going further and predicting damage from natural disasters
When so much is at stake, it is vital to form an accurate map of the damage so that relief and support can be provided where they are needed most. The quicker we can build this picture, the more people we can help.
To really change the game, we cannot wait for individual homeowners to send information or wait for satellites to take images of the damage on the ground. We need to do the best we can with the best information available to us.
At Tractable, we asked the question: “Can we predict which homes will be damaged by a hurricane, and how badly, without direct visual evidence of the damage?” The motivation to predict damage goes beyond helping homeowners and communities recover from the aftermath of a disaster. It could also help avert loss of life.
Together with researchers from Georgian, our biggest investor, we formed a crack team of ten artificial intelligence (AI) researchers to try to answer this question.
What data can we use?
In the literature, researchers typically estimate how badly a building has been damaged by a natural disaster using what we call “gray skies” or “post-disaster” images. These are images captured by drone, plane or satellite in the immediate aftermath of the disaster. In these images, it is relatively easy to assess the damage to the building:
If the building is visually undisturbed, then we can say there is No Damage;
If roof elements are missing, water surrounds the building, or there are cracks, then we call it Minor Damage;
If we can see that the wall or roof has collapsed, then it is considered Major Damage; and
If it has completely collapsed or is no longer present in the image, then it is Destroyed.
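The four-level scale above can be sketched as a simple lookup from visual cues to labels. This is purely an illustrative sketch: the label names follow the convention described here, but the flag names and rule function are hypothetical, not the actual annotation pipeline.

```python
from enum import IntEnum

class DamageLevel(IntEnum):
    """The ordinal four-level damage scale described above."""
    NO_DAMAGE = 0
    MINOR_DAMAGE = 1
    MAJOR_DAMAGE = 2
    DESTROYED = 3

def grade(roof_damage_or_flooding: bool,
          wall_or_roof_collapsed: bool,
          completely_collapsed: bool) -> DamageLevel:
    """Map the visual cues from the list above to a damage level,
    taking the most severe cue that applies."""
    if completely_collapsed:
        return DamageLevel.DESTROYED
    if wall_or_roof_collapsed:
        return DamageLevel.MAJOR_DAMAGE
    if roof_damage_or_flooding:
        return DamageLevel.MINOR_DAMAGE
    return DamageLevel.NO_DAMAGE
```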
Such a task is standard for modern computer vision based on deep learning, and several successful papers have been published on the subject.
However, capturing post-disaster images can be delayed by several days, and developing countries may not have sufficient resources to capture such images so quickly after a disaster.
With GaLeNet, we investigated whether proactive modelling can predict the severity and location of damage before post-disaster imagery is available, reducing this lost time.
Therefore, we limited ourselves to three data sources:
Weather data (from OpenWeather API) in the seven days before a disaster hits
Data about the trajectory of the hurricane up until the point of impact (from NOAA)
Pre-disaster or “blue skies” images that contain visual context about the building structures and materials, but no visual evidence of the damage (from xBD dataset)
To align and merge these disparate data sources, we used the longitude and latitude of each building. This yielded a dataset of 36,625, 9,283, and 12,791 building instances for training, validation and testing, respectively.
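One simple way to align records by coordinates is to round each building’s latitude and longitude to a common grid and join on the resulting key. This is a hedged sketch of the idea, not the actual pipeline; all field names (`lat`, `lon`, etc.) are assumptions.

```python
def grid_key(lat, lon, precision=4):
    """Quantize a coordinate pair so nearby records share a join key
    (4 decimal places is roughly 11 m of latitude)."""
    return (round(lat, precision), round(lon, precision))

def merge_sources(buildings, weather, trajectory_features):
    """Join three data sources on the quantized building location."""
    merged = {}
    for b in buildings:
        merged[grid_key(b["lat"], b["lon"])] = {"building": b}
    for w in weather:
        k = grid_key(w["lat"], w["lon"])
        if k in merged:
            merged[k]["weather"] = w
    for t in trajectory_features:
        k = grid_key(t["lat"], t["lon"])
        if k in merged:
            merged[k]["trajectory"] = t
    return merged
```

In practice a tolerance-based nearest-neighbour match would be more robust than exact key equality, but the grid join keeps the idea visible in a few lines.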
We compared the accuracy of our predictions to a baseline that made use of the post-disaster images, to see the difference in performance when direct visual evidence of the damage was provided to the model and when it was not.
We designed and implemented a multi-modal neural network that we call GaLeNet. The name comes from the conjunction of “Gale” (a very strong wind) and “LeNet” (a famous neural network proposed by Yann LeCun). The model uses a concept called late fusion, a paradigm in which modality-specific latent representations are learned independently before fusing all data sources for the downstream task.
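The late-fusion idea can be shown in a few lines of NumPy: each modality passes through its own encoder, and only the resulting latent vectors are concatenated for the downstream head. Everything below is a toy stand-in with made-up dimensions, not GaLeNet’s actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(x, w):
    """Stand-in one-layer modality encoder (linear + ReLU);
    GaLeNet's real encoders are modality-specific and pre-trained."""
    return np.maximum(x @ w, 0.0)

# Hypothetical input sizes: weather features, trajectory features,
# and a pre-disaster image embedding. Each maps to a 16-d latent.
w_weather = rng.normal(size=(10, 16))
w_traj = rng.normal(size=(4, 16))
w_image = rng.normal(size=(512, 16))
w_head = rng.normal(size=(48, 4))  # fused latents -> 4 damage classes

def late_fusion_forward(weather, traj, image):
    # Encode each modality independently...
    latents = [encoder(weather, w_weather),
               encoder(traj, w_traj),
               encoder(image, w_image)]
    # ...and fuse (concatenate) only for the downstream classifier.
    fused = np.concatenate(latents, axis=-1)
    return fused @ w_head  # class logits

logits = late_fusion_forward(rng.normal(size=(1, 10)),
                             rng.normal(size=(1, 4)),
                             rng.normal(size=(1, 512)))
```

A nice property of this design is that a missing modality can be handled by dropping or zeroing its latent without retraining the other encoders.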
The most important part of this work was figuring out how to represent the input data given the limited amount of data we had to play with.
Making the most of our data
Due to the limited size of our dataset, it was not possible to train a visual representation from scratch. Instead, we tried several pre-trained models, including the well-known CLIP model from OpenAI.
We varied the level of zoom of the pre- and post-disaster images and found that for pre-disaster images, the performance improved the more we zoomed out. We think this is because it exposed the model to the surroundings of the building and allowed it to gather a useful context to help predict the damage to the building that would occur if a disaster were to hit.
The opposite was true for the post-disaster image, as performance improved progressively as we zoomed in. This allowed the model to closely inspect the damage caused by the disaster. For both approaches, we found that the performance was best if we provided the model with zooms at four different scales.
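Extracting zooms at four scales amounts to taking progressively larger crops centred on the building. A minimal sketch, assuming the image is a NumPy array and the building centre is given in pixels (the crop sizes and edge handling are illustrative only):

```python
import numpy as np

def multi_scale_crops(image, cx, cy, base=32, scales=(1, 2, 4, 8)):
    """Square crops at four zoom levels centred on pixel (cx, cy).
    Tighter crops inspect the building; wider ones add surrounding context.
    Edge padding is omitted for brevity."""
    crops = []
    for s in scales:
        half = base * s // 2
        crops.append(image[cy - half:cy + half, cx - half:cx + half])
    return crops

img = np.zeros((512, 512, 3))
crops = multi_scale_crops(img, 256, 256)  # 32x32 up to 256x256
```

Each crop would then be resized to the encoder’s input resolution before being embedded.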
For the hurricane trajectory, we came up with a novel featurization method. Since the Earth is a sphere, we computed the Haversine distance between a given building and each location along the trajectory of the hurricane. We then found the point at which the hurricane was closest to the building, where, in theory, it would have had maximum impact. At this point, we took the wind speed and pressure as well as the distance from the building. This featurization was then fed into a multi-layer perceptron to refine the representation so that it could be fused by GaLeNet with the other modalities. We found this simple representation worked better than a more complex one that attempted to make use of the entire hurricane trajectory.
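The closest-point featurization can be sketched directly from the description above. The Haversine formula is standard; the track record field names (`lat`, `lon`, `wind_speed`, `pressure`) are assumptions for illustration.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle (Haversine) distance in km between two points,
    treating the Earth as a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def trajectory_features(building_lat, building_lon, track):
    """At the track point nearest the building (maximum theoretical impact),
    take wind speed, pressure and the distance to the building."""
    closest = min(track, key=lambda p: haversine_km(
        building_lat, building_lon, p["lat"], p["lon"]))
    dist = haversine_km(building_lat, building_lon,
                        closest["lat"], closest["lon"])
    return [closest["wind_speed"], closest["pressure"], dist]
```

This three-value vector is what a small MLP could then refine before fusion with the other modalities.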
To our surprise, we found that GaLeNet was able to predict damage to property well, even without visual evidence of that damage. It achieved this by combining information about the weather, the trajectory of the hurricane and images of the building and its surroundings before they were damaged. Fusing these different streams of data boosted performance beyond that of any single modality on its own. Performance was still somewhat lower than when the model had access to direct visual evidence of the damage, which was to be expected.
The main caveat to our work was that the dataset was relatively small, and we were unable to test generalization of the model to completely unseen natural disasters. This is a critical step before the method can be used in production.
However, we feel that this opens up a new direction for disaster management and relief, by making early predictions of where to focus resources before the full extent of the damage is known. Since publication, GaLeNet has been well received and was awarded best paper at the Fragile Earth Workshop 2022 and was also presented at the CVPR MultiEarth Workshop.