Most likely type 0x8a means command response or something.
Messages with the 7979 header are documented in some newer protocol documents.
I think you need to get the latest protocol documentation from the device vendor and check if we need to modify anything on the server side. It sounds like the device doesn't receive the response that it expects.
As for CLOSE_WAIT sockets, you can set a timeout.
Hello, Anton,
Thanks for the swift response. We are waiting for the latest protocol documentation from them to see what's changed. I hope it will change things, because it's a production system.
I wonder if there have been many changes in the protocol decoding, because we used the traccar core from a year ago (around version 2.4) with no complications.
As for the timeout, I already set a 60-second timeout for the gt06 protocol (port 5023), and it worked, sometimes. But most of the time I see devices going from ESTABLISHED to CLOSE_WAIT, and the connection is not terminated for a long time (>10-15 mins), which clogs the server's sockets unless traccar is restarted. Any idea why? Thx.
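For what it's worth, CLOSE_WAIT means the remote side has already closed and the kernel is waiting for the local application (traccar here) to close its end, so kernel timers such as tcp_fin_timeout shouldn't shorten it. To watch how the states evolve on port 5023, here is a minimal Python sketch that counts sockets per state from text in the /proc/net/tcp format. The function name `count_states` is made up for illustration. Note that a Java server may show up in /proc/net/tcp6 rather than /proc/net/tcp, so it is probably worth feeding it both files.

```python
from collections import Counter

# State codes as they appear in the "st" column of /proc/net/tcp and tcp6.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}

def count_states(proc_net_tcp_text, local_port):
    """Count sockets per TCP state for one local port (hypothetical helper)."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue
        # local_address looks like "0100007F:139F" (hex address, hex port)
        port = int(fields[1].rsplit(":", 1)[1], 16)
        if port == local_port:
            state = fields[3].upper()
            counts[TCP_STATES.get(state, state)] += 1
    return counts

# Example usage on a Linux host:
#   with open("/proc/net/tcp6") as f:
#       print(count_states(f.read(), 5023))
```

Running this every minute or so and logging the result should show whether CLOSE_WAIT keeps growing while ESTABLISHED stays flat, which would point at sockets that the application never closes.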
Please share the document if you manage to get it.
The problem might be that the response format from the server has changed. I can check that as soon as I get up-to-date documentation.
A timeout should solve the problem with waiting connections. You can also try to increase the connection limit on the server.
OK, I will. I don't have a problem with memory or the connection limit as far as I can tell. I have raised and tuned the TCP/IP, file descriptor and conntrack settings, and it can get up to about 20000-30000 connections in netstat before packets start to drop. But I'll look into it; it's a bit hard to tune for >5000 concurrent connections, really... Any suggestions for tuning? Thanks a lot.
If you have 5000 devices, I would recommend setting the OS connection limit to at least 50k.
Usually connection limit + timeout is enough. Not sure what else can be done. I guess we need to fix the protocol issue if it causes devices to re-connect all the time. That would definitely help.
This is my sysctl.conf. I hope someone can tell me whether I'm tuning this correctly or not.
net.core.rmem_max = 33554432
net.core.wmem_max = 33554432
net.ipv4.tcp_rmem = 4096 87380 33554432
net.ipv4.tcp_wmem = 4096 65536 33554432
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.ip_local_port_range = 10024 65535
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 1
net.nf_conntrack_max = 196608
net.netfilter.nf_conntrack_tcp_timeout_established=600
net.ipv4.tcp_fin_timeout = 1
net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_keepalive_probes = 20
net.ipv4.tcp_no_metrics_save = 1
net.core.netdev_max_backlog = 4096
net.ipv4.tcp_keepalive_time = 600
net.netfilter.nf_conntrack_tcp_timeout_close = 10
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 10
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 10
net.netfilter.nf_conntrack_tcp_timeout_last_ack = 10
net.netfilter.nf_conntrack_tcp_timeout_syn_recv = 10
net.netfilter.nf_conntrack_tcp_timeout_syn_sent = 10
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 10
net.core.somaxconn=1024
net.ipv4.tcp_max_syn_backlog=4096
It should be able to handle 40000-50000 connections with no problem, as it is only limited by the API range. Also, traccar always runs with ulimit -n 1000000, and that should be enough. Weirdly, it seems some devices are 'queueing' and have delayed packets to send to the server. It might be a problem with the GSM network changing IPs too fast, but I'm still searching for clues.
Any suggestions regarding this? Thanks.
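One small thing in the listing above: net.netfilter.nf_conntrack_tcp_timeout_time_wait is set twice (1 and then 10). As far as I know the file is applied top to bottom, so the later value should be the one that sticks. A quick Python sketch to catch this kind of duplicate is below; `find_duplicate_keys` is just an illustrative name, and it assumes the plain `key = value` format with `#` or `;` comments.

```python
def find_duplicate_keys(conf_text):
    """Return {key: [values...]} for keys assigned more than once."""
    seen = {}
    for line in conf_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith(("#", ";")) or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        seen.setdefault(key, []).append(value)
    return {key: values for key, values in seen.items() if len(values) > 1}

# Example usage:
#   with open("/etc/sysctl.conf") as f:
#       print(find_duplicate_keys(f.read()))
```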
Looks good to me. Let me know if you find the cause of your problem.
Hello,
It seems like the GSM mobile internet provider here has some... bad infrastructure design. Its IP lifetime is below 2 minutes at best, and from what I observed the device's IP can change as fast as every 30 seconds. It might disrupt the handshake between device and server, as it disrupts persistent connections too, and it seems the GT06 series relies on persistent connections. It might be one of the reasons connections get delayed and stuck in CLOSE_WAIT without being terminated: the device on the other side has changed IP, but the connection pipe is still there.
Any suggestion regarding this?
Usually the IP changes when the device reconnects, so it might be the result of the problem, not the cause.
Hello Anton.
I've run into a problem reading the track history of a gt06 tracker:
http://52.34.254.76/1/786320161014_084024.jpg
The track looks like a grid. What could this be, and how can I fix it?
Please create a separate topic.
How do I create a topic? My English isn't very good, and I can't figure out how to create a thread.
Right here, at the bottom of the page:
Hello everyone, thank you for all the information that you share here on the forum. Right now I have 5000 assets working on a virtual server with 4 CPUs and 15 GB of RAM. This is my sysctl.conf configuration:
I want to know if this is the correct configuration for 10,000 assets, or what I would have to change. Thanks for your help.
Hello,
I have used the traccar server listener version 3.7 on Ubuntu 14.04 with Java 1.7/1.8, with custom database handling and a custom front end, with no problems. The devices are GT06 (I think it's GT06N; I need confirmation from the factory first), with 6000-8000 devices connected constantly. Lately, though, I have a problem.
First, the log:
It has that '8a' protocol (7878058a0027b8a20d0a) that's not listed anywhere in the protocol manual or in the decoder source code of the latest traccar. Any idea what that is? I can see this on almost all of my devices.
Next is the '7979' header. It's not listed anywhere in the device protocol manual (unless the manufacturer did some customization, which they haven't confirmed), but it's there in your code (gt06protocoldecoder.java) for MSG_INFO. But the subtype isn't right: it's 04 (797900709404), which isn't handled at the moment.
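To make the two samples easier to compare, here is a rough Python sketch that splits a frame into header, length and message type. It is not traccar's decoder. It assumes a 1-byte length after 7878 and a 2-byte length after 7979, which matches the two samples above but should be checked against the vendor's document. The name `parse_gt06_header` is made up, and the rest of the frame is left uninterpreted.

```python
def parse_gt06_header(frame_hex):
    """Split a hex-encoded frame into start marker, length, type and the rest."""
    data = bytes.fromhex(frame_hex)
    start = data[:2]
    if start == b"\x78\x78":
        length_size = 1  # assumption: short frames carry a 1-byte length
    elif start == b"\x79\x79":
        length_size = 2  # assumption: extended frames carry a 2-byte length
    else:
        raise ValueError("unknown start marker: " + start.hex())
    length = int.from_bytes(data[2:2 + length_size], "big")
    type_index = 2 + length_size
    return {
        "start": start.hex(),
        "length": length,
        "type": data[type_index],
        "rest": data[type_index + 1:],  # everything after the type byte
    }

short = parse_gt06_header("7878058a0027b8a20d0a")
extended = parse_gt06_header("797900709404")  # truncated sample from the log
print(hex(short["type"]), hex(extended["type"]), extended["rest"].hex())
```

With these assumptions the first sample has type 0x8a with length 5, and the second has type 0x94 with 04 as the first byte after it.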
Those two things are happening a lot, and devices aren't responding after we send the response data. A device should respond with position/location data every 30 seconds, but it doesn't, and it times out quite fast, after 60 seconds (I set the timeout/resetDelay to 60 seconds for this device).
This fills up CLOSE_WAIT sockets really fast (it rose to 18k CLOSE_WAIT sockets in nearly 10 minutes), so I am forced to restart traccar every 5-10 minutes to make sure packets aren't dropped.
Another problem is that sometimes a device sends login data and the server responds, but no position/location data is sent back at all, while the device is still connected on the same IP. And sometimes the device sends new login data from a DIFFERENT IP, so no wonder it thinks we didn't send the response data and doesn't send a position packet.
Any idea how to handle these problems? Thank you a lot for the response in advance.
-Arshvein-