Server / Database scaling or optimization discussion thread

jaimzj, 5 years ago

Hello All,

I am starting this thread in the hope that everyone interested will share their thoughts and suggestions on how Traccar or its database can be improved to better store a large number of records.

The discussion is not about the number of devices or connections Traccar can handle, but about the amount of data stored in the primary (live) database tables, where report generation and a few other aspects may be impacted.

I am sharing a suggestion below to set the discussion in motion. It may not be feasible or practical in many ways, but I hope it gives the thread a push so that everyone shares their own suggestions, and hopefully one of them turns out to be good, efficient and possible to implement for real.

  1. My suggestion is to have a process in place that archives or migrates all data older than "X" days to a different server/database. Any report queries covering data older than "X" days would then be served from that other server/database.
  2. The ability to configure multiple database servers by year/month, for example:
    Note: the config below is only an example and does not follow actual Traccar config standards.
<config archiveOlderThanDays>7</config>
<config archiveRange>Quarterly</config>
<config archive2019_1>192.168.0.22</config>
<config archive2019_2>192.168.0.23</config>
<config archive2019_3>192.168.0.24</config>
<config archive2019_4>192.168.0.25</config>
<config archive2020_1>192.168.0.26</config>
<config archive2020_2>192.168.0.27</config>
<config archive2020_3>192.168.0.28</config>
<config archive2020_4>192.168.0.29</config>
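
The lookup this example config implies can be sketched roughly as follows. This is only an illustration in Python, not Traccar code: the setting names mirror the made-up config above, and `LIVE_HOST`, `archive_key` and `host_for` are hypothetical names introduced for the sketch.

```python
from datetime import date, timedelta

# Hypothetical settings mirroring the illustrative config above
# (these are not real Traccar configuration keys).
ARCHIVE_OLDER_THAN_DAYS = 7
ARCHIVE_HOSTS = {
    "archive2019_1": "192.168.0.22",
    "archive2019_2": "192.168.0.23",
    "archive2019_3": "192.168.0.24",
    "archive2019_4": "192.168.0.25",
    "archive2020_1": "192.168.0.26",
    "archive2020_2": "192.168.0.27",
    "archive2020_3": "192.168.0.28",
    "archive2020_4": "192.168.0.29",
}
LIVE_HOST = "live"  # placeholder for the primary database server


def archive_key(day: date) -> str:
    """Build the quarterly config key for a date, e.g. archive2020_2."""
    quarter = (day.month - 1) // 3 + 1
    return f"archive{day.year}_{quarter}"


def host_for(day: date, today: date) -> str:
    """Pick the server that would hold data recorded on `day`."""
    cutoff = today - timedelta(days=ARCHIVE_OLDER_THAN_DAYS)
    if day >= cutoff:
        return LIVE_HOST  # recent data stays on the live server
    # Raises KeyError if no archive server is configured for that quarter.
    return ARCHIVE_HOSTS[archive_key(day)]
```

A report spanning several quarters would have to call `host_for` for each part of its range and merge the results, which is where most of the real complexity would sit.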

Benefits:

  1. Whenever the live server needs to be migrated or needs maintenance, the live database won't be bulky.
  2. The archive database servers will need zero management, as they are written to once with older records and from then on are only read to generate reports etc.
  3. A high-end server handling tens of thousands of devices won't cost much, but as you keep adding storage to it the costs rise steeply. The solution above therefore also saves on storage cost, since the primary Traccar server can have less storage and more processing capacity and memory.

Anton Tananaev, 5 years ago

Have you heard of database clustering?

jaimzj, 5 years ago

Yes

Anton Tananaev, 5 years ago

OK, so how is your approach different?

jaimzj, 5 years ago

I was not looking at this from a clustering perspective. Rather, I took the approach of archiving data (with Traccar still able to fetch the archived data for reports etc.), keeping in mind that the position table gets really big. (Even where clustering is used, the position table will still be the same size and keep growing over time.)

Other than the performance aspect, I was also looking at it from a cost point of view. I believe clustering will involve a substantial increase in the cost of running servers as well.

This thread is just meant for discussion. With so many users here, I am hoping it may lead to some good suggestions that you could validate and, if found feasible, maybe even implement.

I am not claiming my suggestion is a good approach, just that when we have 6 years' worth of data, getting reports to work can sometimes be difficult.

Anton Tananaev, 5 years ago

I don't really see the difference between clustering and your suggestion. Maybe you can elaborate on what the difference would be.

jaimzj, 5 years ago

Sure, I can try to explain the difference.

In clustering, all instances within the cluster hold the same data/content, which means same schema, same data.

  • The suggestion I made basically means retaining the Traccar schema on the primary Traccar server, and then deploying multiple instances of only the database setup.
  • Instead of clustering, we then archive data to the archive servers based on date/time range. In this case the primary Traccar server only holds data up to "x" days old, and any older data is moved to these archive servers based on the config.
  • Thereafter, any user running a report for a date range older than "x" days gets their data retrieved by Traccar from the archive server assigned to that date range (thus faster loading, and less data to run the query through).
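
The archiving step in the points above could look roughly like this. It is a minimal sketch using Python's built-in `sqlite3` with two in-memory databases standing in for the live and archive servers. The simplified `positions` table and the names `make_db` and `archive_old_positions` are assumptions made for the illustration, not Traccar's actual schema or code.

```python
import sqlite3

# Simplified, hypothetical position table (not Traccar's real schema).
# fixtime is stored as ISO-8601 text so that string comparison orders by time.
SCHEMA = (
    "CREATE TABLE positions ("
    "id INTEGER PRIMARY KEY, deviceid INTEGER, fixtime TEXT)"
)


def make_db() -> sqlite3.Connection:
    """Create an in-memory database standing in for a live or archive server."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def archive_old_positions(live, archive, cutoff: str) -> int:
    """Move rows with fixtime older than `cutoff` from live to archive.

    Rows are committed to the archive before they are deleted from the live
    database, so a failure in between leaves duplicates rather than data loss.
    """
    rows = live.execute(
        "SELECT id, deviceid, fixtime FROM positions WHERE fixtime < ?",
        (cutoff,),
    ).fetchall()
    with archive:  # commits on success, rolls back on error
        archive.executemany(
            "INSERT INTO positions (id, deviceid, fixtime) VALUES (?, ?, ?)",
            rows,
        )
    with live:
        live.execute("DELETE FROM positions WHERE fixtime < ?", (cutoff,))
    return len(rows)
```

With real servers the copy and the delete would be on separate connections to separate machines, so the job would need to be safe to re-run after a partial failure.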

So the primary difference is that in clustering all instances have the same data/schema,
whereas the suggestion I made separates older data into smaller database instances (each server holds old data for a specific time period or set of months).

I hope I was able to explain the difference?

Anton Tananaev, 5 years ago

In clustering, instances don't all hold the same data. You should probably read a bit more about clustering, because it does exactly what you want. It can split data by any index (for example, time).
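
To picture what splitting by time means for a report query, here is a rough Python sketch. It assumes monthly partitions with a made-up naming scheme (`p<year>_<month>`); a partitioned or clustered database does this kind of routing internally, so this is only an illustration of the idea, not how any particular product implements it.

```python
from datetime import date
from typing import List


def month_partitions(start: date, end: date) -> List[str]:
    """List the monthly partitions a report over [start, end] has to read.

    Partition names like p2020_01 are a hypothetical naming scheme.
    """
    names = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        names.append(f"p{year}_{month:02d}")
        month += 1
        if month > 12:  # roll over into the next year
            year, month = year + 1, 1
    return names
```

A report for a single week only touches one or two partitions, no matter how many years of data exist in total.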

jaimzj, 5 years ago

Okay, I shall do that. Thanks, Anton.