#5488. A Natural Language Processing Approach to Understanding Context in the Extraction and GeoCoding of Historical Floods, Storms, and Adaptation Measures
August 2026 | publication date |
Proposal available till | 20-05-2025 |
4 total number of authors per manuscript | 0 $ |
The title of the journal is available only for the authors who have already paid for |
|
|
Journal’s subject area: |
Library and Information Sciences;
Media Technology;
Management Science and Operations Research;
Information Systems;
Computer Science Applications; |
Places in the authors’ list:
1 place - free (for sale)
2 place - free (for sale)
3 place - free (for sale)
4 place - free (for sale)
Abstract:
Despite the known financial, economical, and humanitarian impacts of hurricanes and the floods that follow, datasets consisting of flood and flood risk reduction projects are either small in scope, lack in details, or held privately by commercial holders. However, with the amount of online data growing exponentially, we have seen a rise of information extraction techniques on unstructured text to drive insights. On one hand, social media in particular has seen a tremendous increase in popularity. On the other hand, despite this popularity, social media has proven to be unreliable and difficult to extract full information from. In contrast, online newspapers are often vetted by a journalist, and consist of more fine details. As a result, in this paper we leverage Natural Language Processing (NLP) to create a hybrid Named-Entity Recognition (NER) model that employs a domain-specific machine learning model, linguistic features, and rule-based matching to extract information from newspapers. To the knowledge of the authors, this model is the first of its kind to extract detailed flooding information and risk reduction projects over the entire contiguous United States. The approach used in this paper expands upon previous similar works by widening the geographical location and applying techniques to extract information over large documents, with minimal accuracy loss from the previous methods. Specifically, our model is able to extract information such as street closures, project costs, and metrics.
Keywords:
Floods; Information Extraction; Machine Learning; Natural language processing; Newspapers
Contacts :