Completely lost. Traccar stops receiving data

imarinkovic3 years ago

Hello friends, I have the following problem to solve, but I already ran out of ideas:

  • Windows 2016 Server 16GB RAM, 4vcpu (VPS Hostgator)
  • Traccar version 4.7
  • Mysql 8
  • Java: jre that comes with traccar and Oracle jre 1.8

Problem: in random periods of time, Traccar stops processing messages received from the GPS (different brands and models, in total about 200 devices), the traccar log simply stops being written and only logins to the web interface are observed. It is observed in the web interface that the devices begin to appear in the "Unknown" state gradually until they are all in that state. Some times keeps runnig hours, some time just 20 minutes.

Analysis performed: I have gone through different layers of analysis, going through

  • Database check (maintenance of the table of positions and events was carried out), without finding problems.
  • Analysis at Java level, here I am more of a beginner and I have limited myself to move some memory parameters, without success
  • Restarting traccar solves the problem, data traffic comes again.

I applied a monitoring software to perform some correlations, finding the following:

  • Prior to the need to restart traccar, it's observed that the number of database connections begins to drop; In the MySQL error log it's observed that the connections begin to be aborted.

Aborted connection 31 to db: 'traccar' user: 'traccar' host: '127.0.0.1' (Got an error reading communication packets)

  • Along with the above, it is observed that the number of connections in CLOSE_WAIT state begins to grow (see image attached)

1 hour detail
24 hour detail

What have I done?
I have modified Windows network parameters like TcpTimedWaitDelay, MaxUserPort
Modified timeouts from traccar (server, status, gt06)

I'm already completely lost about it. Any ideas?

Relevant debug just before stop receiving data

2021-04-19 09:51:35  INFO: [3a5799f8: gt06 < 186.11.44.161] HEX: 78780d01086520503168352601ce88510d0a
2021-04-19 09:51:35  INFO: [3a5799f8: gt06 > 186.11.44.161] HEX: 7878050101cefeff0d0a
2021-04-19 09:51:37  INFO: [c6ad2944: h02 < 181.43.226.178] HEX: 2a48512c383137303835313430362c56312c3039353133362c412c333332342e393935362c532c30373034302e323635312c572c3030302e30332c3030302c3139303432312c46464646424246462c3733302c30312c31333031342c323433363323
2021-04-19 09:51:37  INFO: [c6ad2944: h02 > 181.43.226.178] HEX: 2a48512c383137303835313430362c56342c56312c323032313034313931323531333723
2021-04-19 09:51:38  INFO: [c6ad2944] id: 8170851406, time: 2021-04-19 06:51:36, lat: -33.41659, lon: -70.67109, speed: 0.0, course: 0.0
2021-04-19 09:51:39  INFO: [089047fb: gt06 < 181.42.41.92] HEX: 78782222150413093326ca039828d807965bc600180002da0132cc00602f000000007306b90d0a
2021-04-19 09:51:39  INFO: [089047fb: gt06 > 181.42.41.92] HEX: 78780522007365160d0a
2021-04-19 09:51:39  INFO: [089047fb] id: 352672102546082, time: 2021-04-19 06:51:38, lat: -33.50199, lon: -70.71911, course: 0.0
2021-04-19 09:51:39  INFO: [964722f2: h02 < 181.43.227.212] HEX: 2a48512c3836353230353033343939363032342c56312c3039353133382c412c333334322e30313239362c532c30373035302e30343039382c572c302e30302c3138392c3139303432312c4646464646424646230d0a
2021-04-19 09:51:39  INFO: [964722f2: h02 > 181.43.227.212] HEX: 2a48512c3836353230353033343939363032342c56342c56312c323032313034313931323531333923
2021-04-19 09:51:40  INFO: [964722f2] id: 865205034996024, time: 2021-04-19 06:51:38, lat: -33.70022, lon: -70.83402, course: 189.0
2021-04-19 09:51:40  INFO: [6798f436: gt06 < 181.42.47.21] HEX: 78781f120d0b0b000000c0000000000000000000200002da01138f006113026205b10d0a
2021-04-19 09:51:40  INFO: [6798f436: gt06 > 181.42.47.21] HEX: 787805120262d1000d0a
2021-04-19 09:51:40  INFO: [6798f436] id: 865205031687485, time: 2013-11-11 01:00:00, lat: -0.00000, lon: 0.00000, course: 0.0
2021-04-19 09:51:40  INFO: [c13978d5: h02 < 181.43.225.217] HEX: 2a48512c3836353230353033373033393632342c56312c3039353133392c412c333333332e32313139312c532c30373034332e32343236372c572c302e30302c3134362c3139303432312c4646464646424646230d0a
2021-04-19 09:51:40  INFO: [c13978d5: h02 > 181.43.225.217] HEX: 2a48512c3836353230353033373033393632342c56342c56312c323032313034313931323531343023
2021-04-19 09:51:41  INFO: [c13978d5] id: 865205037039624, time: 2021-04-19 06:51:39, lat: -33.55353, lon: -70.72071, course: 146.0
2021-04-19 09:51:41  WARN: [c13978d5] error - An existing connection was forcibly closed by the remote host - IOException (...)
2021-04-19 09:51:41  INFO: [c13978d5] disconnected
2021-04-19 09:51:42  INFO: [aed3755f] connected
2021-04-19 09:51:47  INFO: [316e81a8: h02 < 181.42.39.14] HEX: 2a48512c3836353230353033363936383736352c56312c3039353134362c412c333333342e35313033322c532c30373034302e38313232302c572c302e30302c3133302c3139303432312c4646464646424646230d0a
2021-04-19 09:51:47  INFO: [316e81a8: h02 > 181.42.39.14] HEX: 2a48512c3836353230353033363936383736352c56342c56312c323032313034313931323531343723
2021-04-19 09:51:47  INFO: [316e81a8] id: 865205036968765, time: 2021-04-19 06:51:46, lat: -33.57517, lon: -70.68020, course: 130.0
2021-04-19 09:51:47  WARN: [316e81a8] error - An existing connection was forcibly closed by the remote host - IOException (...)
2021-04-19 09:51:47  INFO: [316e81a8] disconnected
2021-04-19 09:51:47  INFO: [fee6851a] connected
2021-04-19 09:51:49  INFO: [2f9c7dc3: h02 < 186.11.40.194] HEX: 2a48512c3836353230353033393235393337382c56312c3039353134372c412c333332312e36383939332c532c30373034312e39303130352c572c302e30302c3335332c3139303432312c4646464646424646230d0a
2021-04-19 09:51:49  INFO: [2f9c7dc3: h02 > 186.11.40.194] HEX: 2a48512c3836353230353033393235393337382c56342c56312c323032313034313931323531343923
2021-04-19 09:52:02 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 09:52:32 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 09:53:02 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 09:53:32 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 09:54:02 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 09:54:32 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 09:55:02 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 09:55:32 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 09:56:02 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 09:56:32 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 09:57:02 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 09:57:32 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 09:58:02 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 09:58:32 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 09:59:02 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 09:59:26 DEBUG: node0 scavenging sessions
2021-04-19 09:59:26 DEBUG: org.eclipse.jetty.server.session.SessionHandler1602422340==dftMaxIdleSec=-1 scavenging sessions
2021-04-19 09:59:26 DEBUG: org.eclipse.jetty.server.session.SessionHandler1602422340==dftMaxIdleSec=-1 scavenging session ids []
2021-04-19 09:59:26 DEBUG: org.eclipse.jetty.server.session.DefaultSessionCache@3f9d3913[evict=-1,removeUnloadable=false,saveOnCreate=false,saveOnInactiveEvict=false] checking expiration on []
2021-04-19 09:59:32 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 10:00:02 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 10:00:32 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 10:01:02 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 10:01:32 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 10:02:02 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 10:02:32 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 10:03:02 DEBUG: HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2021-04-19 10:03:13 DEBUG: Queued change org.eclipse.jetty.io.ManagedSelector$Accept@d4f8a05 on ManagedSelector@7542a6a4{STARTED} id=0 keys=0 selected=0 updates=0
2021-04-19 10:03:13 DEBUG: Wakeup on submit ManagedSelector@7542a6a4{STARTED} id=0 keys=0 selected=0 updates=1
2021-04-19 10:03:13 DEBUG: Selector sun.nio.ch.WindowsSelectorImpl@457b3a05 woken with none selected
2021-04-19 10:03:13 DEBUG: Selector sun.nio.ch.WindowsSelectorImpl@457b3a05 woken up from select, 0/0/0 selected
2021-04-19 10:03:13 DEBUG: Selector sun.nio.ch.WindowsSelectorImpl@457b3a05 processing 0 keys, 1 updates
2021-04-19 10:03:13 DEBUG: updateable 1
2021-04-19 10:03:13 DEBUG: update org.eclipse.jetty.io.ManagedSelector$Accept@d4f8a05
2021-04-19 10:03:13 DEBUG: queue org.eclipse.jetty.io.ManagedSelector$Accept@d4f8a05 startThread=0
2021-04-19 10:03:13 DEBUG: run org.eclipse.jetty.io.ManagedSelector$Accept@d4f8a05 in QueuedThreadPool[qtp374465550]@1651e40e{STARTED,8<=8<=200,i=2,r=4,q=0}[ReservedThreadExecutor@8d5bae4{s=0/4,p=0}]
2021-04-19 10:03:13 DEBUG: updates 0```
Anton Tananaev3 years ago

You should probably start by upgrading your server to the latest version.

imarinkovic3 years ago

Upgraded to 4.12, no change in behavior.

Anton Tananaev3 years ago

Have you configured connection timeouts?

imarinkovic3 years ago

Yes, the following.

<entry key='gt06.timeout'>30</entry>
<entry key='osmand.port'>5055</entry>
<entry key='server.timeout'>180</entry>
<entry key='status.timeout'>120</entry>

I can see a LOT of this messages in log file:

2021-05-03 17:48:13 WARN: [35a5329f] error - An existing connection was forcibly closed by the remote host - IOException (...)
2021-05-03 17:48:13 WARN: [b7dbbaf7] error - An existing connection was forcibly closed by the remote host - IOException (...)
2021-05-03 17:48:13 WARN: [561f603c] error - An existing connection was forcibly closed by the remote host - IOException (...)
2021-05-03 17:48:18 WARN: [dbd69c69] error - An existing connection was forcibly closed by the remote host - IOException (...)
2021-05-03 17:48:18 WARN: [3d70611f] error - An existing connection was forcibly closed by the remote host - IOException (...)
2021-05-03 17:48:18 WARN: [5b8b32f5] error - An existing connection was forcibly closed by the remote host - IOException (...)
2021-05-03 17:48:19 WARN: [a5cf1291] error - An existing connection was forcibly closed by the remote host - IOException (...)
2021-05-03 17:48:19 WARN: [2f7dbac5] error - An existing connection was forcibly closed by the remote host - IOException (...)
2021-05-03 17:48:20 WARN: [284e8184] error - An existing connection was forcibly closed by the remote host - IOException (...)
2021-05-03 17:48:21 WARN: [bbe4877e] error - An existing connection was forcibly closed by the remote host - IOException (...)
2021-05-03 17:48:22 WARN: [2020afa6] error - An existing connection was forcibly closed by the remote host - IOException (...)
2021-05-03 17:48:22 WARN: [7ef05fab] error - An existing connection was forcibly closed by the remote host - IOException (...)
2021-05-03 17:48:22 WARN: [415d5774] error - An existing connection was forcibly closed by the remote host - IOException (...)
2021-05-03 17:48:22 WARN: [bc720fa6] error - An existing connection was forcibly closed by the remote host - IOException (...)
2021-05-03 17:48:23 WARN: [3a60ff85] error - An existing connection was forcibly closed by the remote host - IOException (...)
2021-05-03 17:48:24 WARN: [4fc6405f] error - An existing connection was forcibly closed by the remote host - IOException (...)

I supposed they are from devices connections, but I can't see in the same log a connection first (like [ID] connected):

PS C:\Program Files\Traccar\logs> cat .\tracker-server.log |findstr "bbe4877e"
2021-05-03 17:48:21  WARN: [bbe4877e] error - An existing connection was forcibly closed by the remote host - IOException (...)
PS C:\Program Files\Traccar\logs>

I'm going to take a look with Wireshark on those connections to see where they come from, maybe an attack??

Any help or idea will be appreciated.

Anton Tananaev3 years ago

An existing connection was forcibly closed by the remote host

This means device force-closed connetion.

imarinkovic3 years ago

That's correct, but I cannot identify origin connection.

Anyway, traccar keeps failing at random periods. The only thing I can detect is a high number in CLOSE_WAIT state connections, a pair of "Aborted connections" in MySQL error an nothing else. Out of ideas.

Quade dennis2 years ago

did you ever get to the bottom of this, also seeing same issue

imarinkovic2 years ago

After a lot analysis, we faced two problems:
1.- devices disconnection: caused by phone operator, nothing to do here unless changing SIM's to 300 devices...

2.- random failures of service, disconnection from database: didn't find the cause, finally created a kind of watchdog program restarting the service when detected stopped.

After some time, that server was migrated to a linux server, and no more problems.

Quade dennis2 years ago

interesting, so you think it was mainly windows issue?
Was is a terminal version of linux you went for or something with an desktop IDE ? which version did you use?

Gaspar2 years ago

Hi, in our case the problem was generated because our SIM Card was running out of credit

imarinkovic2 years ago

Sorry Quade, I lost your question.

The server was migrated to Amazon Linux 2

Hello imarinkovic, first of all, let me tell you that you made a perfect explanation of the problem and what you did to try to identify the problem.

I currently have exactly the same error and I practically did everything you did without being able to find the cause. I want to ask you clearly: was this completely solved by migrating your server to Linux? If so, could you recommend the exact version of Linux and MySQL that you used? Thank you very much, I am looking forward to reading your response.

imarinkovic7 months ago

Hi Datasoft, greetings from Chile.

Yes, the problem was completely gone after migration to linux server; in this case we used a Amazon Linux 2 server, but in my experience, any linux flavor is completely capable of run tracer without a problem.

Thank you very much for your response, really thank you very much, I was going crazy with this topic. Greetings from Colombia, I hope we can contact each other to share about Traccar, which for me is simply one of the best GPS tracking projects.