First time I've heard about something like this, but I'm sure it's possible with some scripting.
Claude 4 generated the script below, but I don't trust it:
#!/bin/bash
# Traccar data sampling script - keeps 5-minute intervals for data older than 1 year
# This preserves historical trends while significantly reducing storage

echo "Starting Traccar data sampling process..."
echo "Keeping 5-minute intervals for data older than 1 year"

# Function to log with timestamp
log() {
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1"
}
# Position data sampling
log "Starting position data sampling..."
result=""
total_deleted=0
while [[ "$result" != *" 0 rows affected"* ]]; do
    # Delete positions older than 1 year, but keep one record every 5 minutes
    # per device. A single-table DELETE is used because MySQL does not allow
    # LIMIT on the multi-table (DELETE p1 FROM ...) form, and a derived table
    # cannot reference the outer query's p1.deviceId anyway; the GROUP BY on
    # deviceId already keeps one row per device and 5-minute bucket.
    result=$(mysql -u root -proot traccar -vve "
        DELETE FROM tc_positions
        WHERE fixTime < DATE(DATE_SUB(NOW(), INTERVAL 1 YEAR))
        AND id NOT IN (
            SELECT positionId FROM tc_devices WHERE positionId IS NOT NULL
        )
        AND id NOT IN (
            SELECT id FROM (
                SELECT MIN(p2.id) AS id
                FROM tc_positions p2
                WHERE p2.fixTime < DATE(DATE_SUB(NOW(), INTERVAL 1 YEAR))
                GROUP BY
                    p2.deviceId,
                    YEAR(p2.fixTime),
                    MONTH(p2.fixTime),
                    DAY(p2.fixTime),
                    HOUR(p2.fixTime),
                    FLOOR(MINUTE(p2.fixTime) / 5)
            ) AS keeper_ids
        )
        LIMIT 10000")
    if [[ "$result" == *"rows affected"* ]]; then
        affected=$(echo "$result" | grep -o '[0-9]\+ rows affected' | grep -o '[0-9]\+')
        total_deleted=$((total_deleted + affected))
        log "Deleted $affected position records (total: $total_deleted)"
    fi
    sleep 1
done
log "Position sampling completed. Total records processed: $total_deleted"
# Define important event types that should never be deleted
# (note: in Traccar, values like sos/panic/jamming are usually stored in the
# "alarm" attribute of an event of type "alarm", not as separate
# tc_events.type values, so some of these entries may never match)
IMPORTANT_EVENTS="'alarm','geofenceEnter','geofenceExit','panic','sos','jamming','tampering','powerOff','accident','fatigueDriving','hardAcceleration','hardBraking','hardCornering','overspeed','maintenance'"
log "Important events that will be preserved: $IMPORTANT_EVENTS"
# Event data sampling
log "Starting event data sampling..."
result=""
total_deleted=0
while [[ "$result" != *" 0 rows affected"* ]]; do
    # Delete events older than 1 year, but:
    # 1. Keep ALL important events (alarms, geofence violations, etc.)
    # 2. Keep one record every 5 minutes per device and type for other events
    # (single-table DELETE again, since MySQL does not allow LIMIT on the
    # multi-table DELETE e1 FROM ... form)
    result=$(mysql -u root -proot traccar -vve "
        DELETE FROM tc_events
        WHERE eventTime < DATE(DATE_SUB(NOW(), INTERVAL 1 YEAR))
        AND type NOT IN ($IMPORTANT_EVENTS)
        AND id NOT IN (
            SELECT id FROM (
                SELECT MIN(e2.id) AS id
                FROM tc_events e2
                WHERE e2.eventTime < DATE(DATE_SUB(NOW(), INTERVAL 1 YEAR))
                AND e2.type NOT IN ($IMPORTANT_EVENTS)
                GROUP BY
                    e2.deviceId,
                    e2.type,
                    YEAR(e2.eventTime),
                    MONTH(e2.eventTime),
                    DAY(e2.eventTime),
                    HOUR(e2.eventTime),
                    FLOOR(MINUTE(e2.eventTime) / 5)
            ) AS keeper_ids
        )
        LIMIT 10000")
    if [[ "$result" == *"rows affected"* ]]; then
        affected=$(echo "$result" | grep -o '[0-9]\+ rows affected' | grep -o '[0-9]\+')
        total_deleted=$((total_deleted + affected))
        log "Deleted $affected event records (total: $total_deleted)"
    fi
    sleep 1
done
log "Event sampling completed. Total records processed: $total_deleted"
log "Data sampling process finished!"
# Optional: Analyze the results and show preserved important events
log "Analyzing remaining data..."
mysql -u root -proot traccar -e "
    SELECT
        'Positions' AS table_name,
        COUNT(*) AS total_records,
        COUNT(CASE WHEN fixTime < DATE(DATE_SUB(NOW(), INTERVAL 1 YEAR)) THEN 1 END) AS old_records,
        MIN(fixTime) AS oldest_record,
        MAX(fixTime) AS newest_record
    FROM tc_positions
    UNION ALL
    SELECT
        'Events (All)',
        COUNT(*),
        COUNT(CASE WHEN eventTime < DATE(DATE_SUB(NOW(), INTERVAL 1 YEAR)) THEN 1 END),
        MIN(eventTime),
        MAX(eventTime)
    FROM tc_events
    UNION ALL
    SELECT
        'Events (Important)',
        COUNT(*),
        COUNT(CASE WHEN eventTime < DATE(DATE_SUB(NOW(), INTERVAL 1 YEAR)) THEN 1 END),
        MIN(eventTime),
        MAX(eventTime)
    FROM tc_events
    WHERE type IN ($IMPORTANT_EVENTS);
"
log "Important event breakdown by type:"
mysql -u root -proot traccar -e "
    SELECT
        type AS event_type,
        COUNT(*) AS total_count,
        COUNT(CASE WHEN eventTime < DATE(DATE_SUB(NOW(), INTERVAL 1 YEAR)) THEN 1 END) AS old_records_preserved,
        MIN(eventTime) AS oldest_event,
        MAX(eventTime) AS newest_event
    FROM tc_events
    WHERE type IN ($IMPORTANT_EVENTS)
    GROUP BY type
    ORDER BY total_count DESC;
"
If you don't trust the script, you could create a copy of your database and try it out on a 2-month subset of the data; you don't need the whole time range to test it.
Whatever you do, please try doing it in the database itself: that saves the round trip between the script and the database. For MySQL it's MySqlScript.
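Running the purge server-side could, for instance, look like this hypothetical sketch using MySQL's Event Scheduler (the event name, schedule, and batch size are assumptions, and it requires `event_scheduler=ON`; a single run only deletes one 10000-row batch per day, so a stored procedure with a loop would be needed to catch up faster):

```sql
-- Hypothetical sketch: same 5-minute keeper logic as the script above,
-- but scheduled inside MySQL itself so no client round trips are needed.
CREATE EVENT IF NOT EXISTS purge_old_positions
ON SCHEDULE EVERY 1 DAY
DO
    DELETE FROM tc_positions
    WHERE fixTime < DATE_SUB(NOW(), INTERVAL 1 YEAR)
    AND id NOT IN (
        SELECT positionId FROM tc_devices WHERE positionId IS NOT NULL
    )
    AND id NOT IN (
        SELECT id FROM (
            SELECT MIN(p2.id) AS id
            FROM tc_positions p2
            WHERE p2.fixTime < DATE_SUB(NOW(), INTERVAL 1 YEAR)
            GROUP BY p2.deviceId, YEAR(p2.fixTime), MONTH(p2.fixTime),
                     DAY(p2.fixTime), HOUR(p2.fixTime),
                     FLOOR(MINUTE(p2.fixTime) / 5)
        ) AS keeper_ids
    )
    LIMIT 10000;
```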
As soon as TimescaleDB is officially supported, it should be quite easy to do exactly what you want directly in TimescaleDB itself.
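In TimescaleDB that could look roughly like this sketch: a continuous aggregate keeps a 5-minute downsample while a retention policy drops the raw rows (the view name and column choices are assumptions, and it presumes tc_positions has been made a hypertable):

```sql
-- Hypothetical sketch, assuming tc_positions is a TimescaleDB hypertable.
-- Continuous aggregate: one downsampled row per device per 5 minutes.
CREATE MATERIALIZED VIEW positions_5min
WITH (timescaledb.continuous) AS
SELECT deviceid,
       time_bucket('5 minutes', fixtime) AS bucket,
       last(latitude, fixtime)  AS latitude,
       last(longitude, fixtime) AS longitude
FROM tc_positions
GROUP BY deviceid, bucket;

-- Drop raw data older than 1 year; the aggregate keeps the 5-minute trend.
SELECT add_retention_policy('tc_positions', INTERVAL '1 year');
```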
In my Traccar PostGIS database I'm using a script on a cron schedule that thins the whole dataset, based on time and distance parameters, into different level-of-detail tables for mapping purposes, and I hope that I can implement the same with TimescaleDB too.
It would be nice to have an extra thinned table for reports and replays, or even to load the data dynamically.
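For reference, that kind of time-and-distance thinning can be approximated in one pass with window functions; this is a hedged sketch only, with assumed table names, column names, and thresholds, and it compares each point to its immediate predecessor rather than to the last *kept* point, so it only approximates a true iterative thinning:

```sql
-- Hedged sketch: copy into a level-of-detail table only those points that
-- are at least 5 minutes or 500 m away from the previous point of the
-- same device (PostGIS).
INSERT INTO tc_positions_lod1 (id, deviceid, fixtime, geom)
SELECT id, deviceid, fixtime, geom
FROM (
    SELECT id, deviceid, fixtime, geom,
           LAG(fixtime) OVER w AS prev_time,
           LAG(geom)    OVER w AS prev_geom
    FROM tc_positions
    WINDOW w AS (PARTITION BY deviceid ORDER BY fixtime)
) t
WHERE prev_time IS NULL
   OR fixtime - prev_time >= INTERVAL '5 minutes'
   OR ST_DistanceSphere(geom, prev_geom) >= 500;
```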
I wonder if anyone has managed to find a way to trim older data without deleting everything.
For example:
after 100 days, only keep 5-minute data
after 200 days, only keep 15-minute data
Meaning: take the first row of day 101 as a starting point, add 5 minutes, keep the row that is +5 minutes, and remove everything in between.
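A bucket-based variant of that walk can be built on the script above. This is a hedged sketch: the thresholds and the helper name `bucket_minutes` are my own assumptions, and calendar buckets only approximate exact "+5 minutes from the first row of day 101" anchoring, which would need a cursor or loop instead.

```shell
#!/bin/bash
# Hedged sketch of tiered retention: after 100 days keep 5-minute data,
# after 200 days keep 15-minute data. bucket_minutes() picks the sampling
# interval for a record of a given age in days.
bucket_minutes() {
    local age_days=$1
    if   (( age_days >= 200 )); then echo 15   # coarse tier
    elif (( age_days >= 100 )); then echo 5    # fine tier
    else                             echo 0    # keep everything
    fi
}

# Each tier then gets its own DELETE, coarsest first, e.g. for the
# 15-minute tier (same keeper_ids pattern as the script above):
#
#   DELETE FROM tc_positions
#   WHERE fixTime < DATE_SUB(NOW(), INTERVAL 200 DAY)
#   AND id NOT IN (
#       SELECT id FROM (
#           SELECT MIN(id) AS id FROM tc_positions
#           WHERE fixTime < DATE_SUB(NOW(), INTERVAL 200 DAY)
#           GROUP BY deviceId, YEAR(fixTime), MONTH(fixTime), DAY(fixTime),
#                    HOUR(fixTime), FLOOR(MINUTE(fixTime) / 15)
#       ) AS keeper_ids
#   )
#   LIMIT 10000;
#
# The 5-minute tier then targets only rows between 100 and 200 days old,
# so it does not re-thin what the coarse tier already kept.
```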