From chaos to relief: How we solved a critical problem in the telemetered meter disconnection and reconnection system
For weeks, our system responsible for remote disconnection, reconnection, and meter readings was failing consistently. Every command we sent got stuck waiting for a timeout, and this was not a mere technical hiccup:
- Customers who had already paid remained disconnected.
- An entire department had to submit commands manually.
- Service orders (SOs) were being closed outside the normal flow between the Commercial interface and the Remote Management system.
It drained both our internal teams and our customers, and we needed an urgent solution.
Problem Context
Under normal conditions, the process works like this:
1. The Commercial interface generates the service order.
2. The Remote Management system, built in Java, sends the command to the meter (cut, reconnect, or read).
3. The meter responds, confirming the action.
4. The service order is closed automatically in the system.
However, once the issue began, step 3 never happened. The system sent the command… and then waited until it hit the timeout.
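To make the flow and the failure mode concrete, here is a minimal sketch of the send-and-wait logic in Java. This is not our actual Remote Management code: MeterClient, ServiceOrder, and the 90-second timeout are illustrative stand-ins for the real components.

```java
import java.util.concurrent.*;

// Minimal sketch of the send-and-confirm flow described above.
// MeterClient, ServiceOrder, and the timeout value are illustrative only.
public class RemoteCommandFlow {

    /** Pretend client that talks to a meter over the telemetry network. */
    interface MeterClient {
        /** Sends a cut/reconnect/read command and eventually yields the meter's confirmation. */
        CompletableFuture<String> send(String meterId, String command);
    }

    /** Pretend handle to a service order created by the Commercial interface. */
    interface ServiceOrder {
        String meterId();
        String command();
        void close(String confirmation);   // step 4: automatic closure
        void markTimedOut();               // what we were seeing instead
    }

    static void execute(MeterClient client, ServiceOrder so) {
        // Step 2: the Remote Management system sends the command.
        CompletableFuture<String> reply = client.send(so.meterId(), so.command());
        try {
            // Step 3: wait for the meter's confirmation, bounded by a timeout.
            String confirmation = reply.get(90, TimeUnit.SECONDS);
            // Step 4: close the service order automatically.
            so.close(confirmation);
        } catch (TimeoutException e) {
            // The branch the system kept falling into during the incident:
            // no confirmation ever arrived, so the SO stayed open.
            so.markTimedOut();
        } catch (InterruptedException | ExecutionException e) {
            so.markTimedOut();
        }
    }
}
```

During the incident, every command ended in the timeout branch: the confirmation in step 3 never arrived, so the automatic closure in step 4 never ran.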
Investigating the Cause
At first, all signs pointed to a network infrastructure problem. We ran multiple checks:
- Monitoring communications with meters.
- Verifying servers and links.
- Reviewing logs for connection errors.
Everything looked fine… yet the failure persisted. That’s when we considered that the real issue might not be technical at the network level, but logical at the system level.
The Key Finding
On deeper review, we discovered that the user account the system used to send commands had limited permissions. Because of how the interface is designed, those permissions restricted bulk command sending and interfered with automatic SO closures.
In other words, the system wasn’t failing due to connectivity—it was failing because the user account lacked the necessary access level to execute all operations.
The Solution
The fix was straightforward: we requested elevated privileges for the Java program's service user so it could send the corresponding commands.
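In hindsight, a fail-fast check at startup would have pointed us to the real cause much sooner. Below is a minimal sketch of that idea, assuming a hypothetical PermissionService and example privilege names; the real system may expose account privileges quite differently.

```java
import java.util.List;
import java.util.Set;

// Hedged sketch of a fail-fast startup check. PermissionService and the
// privilege names are assumptions, not the real Remote Management API.
public class ServiceAccountCheck {

    /** Illustrative lookup of the privileges granted to an account. */
    interface PermissionService {
        Set<String> privilegesOf(String account);
    }

    /** Privileges the command sender needs; names are examples only. */
    static final List<String> REQUIRED =
            List.of("SEND_BULK_COMMANDS", "CLOSE_SERVICE_ORDERS");

    static void verify(PermissionService permissions, String serviceAccount) {
        Set<String> granted = permissions.privilegesOf(serviceAccount);
        for (String privilege : REQUIRED) {
            if (!granted.contains(privilege)) {
                // Fail loudly at startup instead of letting every command
                // die quietly in a timeout later on.
                throw new IllegalStateException(
                        "Service account '" + serviceAccount
                        + "' is missing privilege: " + privilege);
            }
        }
    }
}
```

Run once at startup, a check like this turns a silent permission gap into an explicit error message instead of a stream of timeouts.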
The permission change had an immediate impact:
- Commands started sending and completing correctly.
- SOs closed automatically within the Commercial–Remote Management flow.
- The department that had been doing manual work returned to its normal function.
Lessons Learned
- It’s not always the network: sometimes the failure is as simple as user permissions.
- Document roles and privileges: knowing what each user can do prevents blockers.
- Keep a high-privilege test user: vital for diagnosing critical issues.
- Avoid manual workarounds: breaking the automated flow creates more errors long term.
Final Reflection
This case reminded me that in technology, root causes aren’t always obvious. We spent days checking cables, routers, and servers, when the answer was a single permission setting. Thank God we restored the service and spared customers from further, unnecessary delays. In the end, seeing everything flow correctly again is the best reward for any technical team.