Improving Data Integrity and Efficiency in Geospatial Applications
Recently, while working as an IT analyst, I kept receiving complaints about a tool widely used by many departments within the company. This tool prints maps with georeferenced and alphanumeric customer information, making it easier to locate customers and carry out support tasks. Lately, however, it had been failing and disrupting daily work.
The main problem was that the table storing the entities' information along with their coordinates (located in an Oracle geospatial database) did not contain the data needed to run the required queries. As a result, the tool could not generate PDF maps showing the entities' locations, affecting the field technicians who rely on this information to carry out their work efficiently.
Upon analyzing the situation, I found that the process responsible for populating that table was implemented in a Java application and followed these steps (a simplified sketch follows the list):
- Perform a complete table deletion (TRUNCATE).
- Execute a query to fetch new records to be inserted (SELECT).
- Insert the records into the target table (INSERT).
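For context, the original flow looked roughly like the JDBC sketch below. The table and column names (CUSTOMER_LOCATIONS, SRC_CUSTOMER_LOCATIONS, CUSTOMER_ID, X_COORD, Y_COORD) are hypothetical, and the real application surely handles connections and errors differently; what matters here is the order of the operations.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class LegacyCoordinateLoad {

    // Hypothetical reload routine; table and column names are illustrative only.
    public static void reload(Connection source, Connection destination) throws Exception {
        // Step 1 - TRUNCATE: empty the destination table.
        // In Oracle, TRUNCATE is DDL and commits implicitly, so it cannot be rolled back.
        try (Statement truncate = destination.createStatement()) {
            truncate.execute("TRUNCATE TABLE CUSTOMER_LOCATIONS");
        }

        // Step 2 - SELECT: fetch the records to load from the source database.
        // A connection failure at this point leaves the destination table empty.
        String select = "SELECT CUSTOMER_ID, X_COORD, Y_COORD FROM SRC_CUSTOMER_LOCATIONS";
        String insert = "INSERT INTO CUSTOMER_LOCATIONS (CUSTOMER_ID, X_COORD, Y_COORD) "
                      + "VALUES (?, ?, ?)";

        try (Statement query = source.createStatement();
             ResultSet rows = query.executeQuery(select);
             PreparedStatement load = destination.prepareStatement(insert)) {
            // Step 3 - INSERT: copy every row into the destination table.
            while (rows.next()) {
                load.setLong(1, rows.getLong("CUSTOMER_ID"));
                load.setDouble(2, rows.getDouble("X_COORD"));
                load.setDouble(3, rows.getDouble("Y_COORD"));
                load.addBatch();
            }
            load.executeBatch();
            destination.commit(); // assumes auto-commit is disabled on this connection
        }
    }
}
```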
Everything seemed fine up to this point, right? Well, no. The issue arose when the connection to the source database (where the data to be inserted lives) failed right after the destination table had been truncated. Because TRUNCATE is DDL in Oracle and commits implicitly, the deletion could not be rolled back, leaving the table empty and without updated data.
How Did I Solve It?
To address this problem, I implemented a new flow that preserves data integrity even if the connection fails. The updated flow performs the following steps (a sketch follows the list):
- Check the connection to the source database; if it is not available, abort without touching the destination table, so the existing data remains usable.
- If the connection is successful, execute the following steps:
  - Insert rows that exist in the source table but not in the destination table.
  - Delete rows that exist in the destination table but not in the source table.
  - Update rows that exist in both tables but whose coordinate fields differ.
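The sketch below shows one way such a diff-based synchronization could look from a Java application, again with hypothetical table and column names and two plain JDBC connections; the real implementation may differ (the comparison could also be pushed down into SQL, for instance with a MERGE over a database link).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.HashMap;
import java.util.Map;

public class CoordinateSync {

    /** Simple value holder for a pair of coordinates. */
    record Coords(double x, double y) {}

    public static void sync(Connection source, Connection destination) throws Exception {
        // Step 1: check the connection to the source database before touching anything.
        if (!source.isValid(5)) {
            return; // abort: the destination table keeps its current data
        }

        Map<Long, Coords> sourceRows = load(source, "SRC_CUSTOMER_LOCATIONS");
        Map<Long, Coords> destRows = load(destination, "CUSTOMER_LOCATIONS");

        try (PreparedStatement insert = destination.prepareStatement(
                 "INSERT INTO CUSTOMER_LOCATIONS (CUSTOMER_ID, X_COORD, Y_COORD) VALUES (?, ?, ?)");
             PreparedStatement update = destination.prepareStatement(
                 "UPDATE CUSTOMER_LOCATIONS SET X_COORD = ?, Y_COORD = ? WHERE CUSTOMER_ID = ?");
             PreparedStatement delete = destination.prepareStatement(
                 "DELETE FROM CUSTOMER_LOCATIONS WHERE CUSTOMER_ID = ?")) {

            for (Map.Entry<Long, Coords> entry : sourceRows.entrySet()) {
                Coords existing = destRows.get(entry.getKey());
                if (existing == null) {
                    // Step 2: insert rows that exist only in the source table.
                    insert.setLong(1, entry.getKey());
                    insert.setDouble(2, entry.getValue().x());
                    insert.setDouble(3, entry.getValue().y());
                    insert.addBatch();
                } else if (!existing.equals(entry.getValue())) {
                    // Step 4: update rows whose coordinates differ between the two tables.
                    update.setDouble(1, entry.getValue().x());
                    update.setDouble(2, entry.getValue().y());
                    update.setLong(3, entry.getKey());
                    update.addBatch();
                }
            }
            for (Long id : destRows.keySet()) {
                if (!sourceRows.containsKey(id)) {
                    // Step 3: delete rows that no longer exist in the source table.
                    delete.setLong(1, id);
                    delete.addBatch();
                }
            }

            insert.executeBatch();
            update.executeBatch();
            delete.executeBatch();
            destination.commit(); // assumes auto-commit is disabled on this connection
        }
    }

    private static Map<Long, Coords> load(Connection conn, String table) throws Exception {
        Map<Long, Coords> rows = new HashMap<>();
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                 "SELECT CUSTOMER_ID, X_COORD, Y_COORD FROM " + table)) {
            while (rs.next()) {
                rows.put(rs.getLong("CUSTOMER_ID"),
                         new Coords(rs.getDouble("X_COORD"), rs.getDouble("Y_COORD")));
            }
        }
        return rows;
    }
}
```

Because only the differences are written, a run in which almost nothing has changed does very little work, and the destination table is never emptied at any point.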
Results
Thanks to this optimization, the process, which previously took almost three hours with the old configuration, now completes in at most 10 minutes. The improvement comes mainly from the fact that the coordinates in the source table rarely change, so each run only needs to insert, delete, or update a small fraction of the rows.
These small adjustments to the workflow not only improve efficiency but also help guarantee service continuity by avoiding unnecessary downtime. Implementing similar strategies can help you optimize internal processes in your organization and keep the applications under your responsibility performing well.