Maintenance · preventive & corrective

Maintenance that prevents failures, not just repairs them.

An enterprise server rarely dies suddenly. It almost always announces the failure in its logs days or weeks in advance. Preventive maintenance is the discipline of reading those signals in time: hardware refresh, health check, thermal optimization, firmware updates, and targeted replacements before a blocking failure occurs.

Eight maintenance areas, one logic: keep the system healthy.

Preventive maintenance

Calendar of scheduled interventions: cleaning, thermal paste renewal, health check, connector verification, preventive replacements. Typically every 24-36 months.

Hardware refresh & health check

Deep cleaning, CPU thermal paste, BMC/CMOS battery replacement, capacitor inspection. Health check with SEL/IPMI/SMART reading and written priority report.
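
As a rough illustration of how a SEL reading can become a priority report, here is a minimal Python sketch. It assumes the pipe-separated text layout produced by `ipmitool sel elist`; the sample lines, the keyword rules, and the names `classify`, `parse_sel`, and `priority_report` are hypothetical and only meant to show the idea. Real event wording varies by vendor.

```python
from dataclasses import dataclass

# Hypothetical keyword rules, checked in order (first match wins).
# "uncorrectable" and "non-critical" come before "correctable" and "critical"
# because the shorter words are substrings of the longer ones.
RULES = [
    ("uncorrectable", "high"),
    ("non-critical", "medium"),
    ("non-recoverable", "high"),
    ("critical", "high"),
    ("failure", "high"),
    ("correctable", "medium"),
    ("predictive", "medium"),
]

@dataclass
class SelEvent:
    record_id: str
    timestamp: str
    sensor: str
    description: str
    priority: str

def classify(description: str) -> str:
    """Map an event description to a priority using the rules above."""
    text = description.lower()
    for keyword, priority in RULES:
        if keyword in text:
            return priority
    return "low"

def parse_sel(text: str) -> list[SelEvent]:
    """Parse pipe-separated SEL lines: id | date | time | sensor | event | ..."""
    events = []
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        record_id, date, time, sensor, description = fields[:5]
        events.append(
            SelEvent(record_id, f"{date} {time}", sensor, description,
                     classify(description))
        )
    return events

def priority_report(events: list[SelEvent]) -> list[SelEvent]:
    """Return events sorted high, medium, low (original order kept within a tier)."""
    order = {"high": 0, "medium": 1, "low": 2}
    return sorted(events, key=lambda e: order[e.priority])

# Illustrative sample only, not taken from a real machine.
SAMPLE = """\
   1 | 03/02/2024 | 08:15:02 | Memory #0x53 | Correctable ECC | Asserted
   2 | 03/09/2024 | 14:40:11 | Temperature #0x30 | Upper Critical going high | Asserted
   3 | 03/10/2024 | 09:01:47 | Power Supply #0x65 | Presence detected | Asserted
"""

if __name__ == "__main__":
    for event in priority_report(parse_sel(SAMPLE)):
        print(event.priority, event.timestamp, event.sensor, event.description)
```

In practice the keyword list would be replaced by rules matched to the specific platform, and SMART data would be folded into the same report.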

Firmware, BIOS, microcode

Planned updates based on a validated compatibility matrix, with a rollback plan. Includes CPU microcode updates that mitigate side-channel vulnerabilities. Performed in agreed maintenance windows.

Thermal optimization

Historical temperature analysis from BMC data, targeted intervention to reduce throttling, fan curve calibration. Often the fastest way to recover the 10-20% of performance lost to throttling.

Hardware + software maintenance contracts with tnsolutions group.

We offer hardware-only maintenance contracts and hardware + software contracts. Three SLA tiers: Essential (annual health check, on-site within 5 business days), Business (semiannual health check, scheduled hardware refresh, on-site within 2 business days), Critical (quarterly health check, on-site in Lombardy within 4 business hours, pre-allocated cold-spare pool, dedicated technical account).

FAQ
How often should preventive maintenance be scheduled?

It depends on load, thermal environment, and criticality. A pragmatic rule: a complete hardware refresh every 24-36 months for servers in continuous production, and a health check with SEL log analysis every 6-12 months.

What does a complete hardware refresh include?

Deep cleaning, CPU heatsink thermal paste replacement, internal connector verification, RAM heatsink re-tightening, BMC/CMOS battery replacement, visual capacitor inspection, fan lubrication or replacement, redundant PSU verification, critical firmware updates.

My server has recurring kernel panics or BSOD: is it the OS or the hardware?

Often it is a hardware fault that the OS surfaces as a software error. Typical signs: kernel panics correlated with MCE events in the logs, BSODs with WHEA_UNCORRECTABLE_ERROR codes, random reboots under load. Our analysis starts from the BMC/IPMI logs before touching the OS.
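
A first pass over OS-side logs can be sketched in a few lines of Python. The example below scans log text for machine-check markers as they commonly appear in Linux kernel messages, plus the Windows WHEA_UNCORRECTABLE_ERROR code. The sample log, the function names, and the `min_hits` threshold are illustrative assumptions, and a match is a hint to check the BMC/IPMI logs, not a diagnosis.

```python
import re

# Markers that usually point to hardware rather than the OS.
# The exact wording can differ between kernel versions and platforms.
HARDWARE_MARKERS = (
    re.compile(r"mce: \[Hardware Error\]"),
    re.compile(r"Machine check events logged", re.IGNORECASE),
    re.compile(r"WHEA_UNCORRECTABLE_ERROR"),
)

def hardware_error_lines(log_text: str) -> list[str]:
    """Return the log lines that match at least one hardware marker."""
    return [
        line
        for line in log_text.splitlines()
        if any(pattern.search(line) for pattern in HARDWARE_MARKERS)
    ]

def likely_hardware(log_text: str, min_hits: int = 1) -> bool:
    """True when at least min_hits lines carry a hardware marker."""
    return len(hardware_error_lines(log_text)) >= min_hits

# Illustrative sample only, not taken from a real machine.
SAMPLE_LOG = """\
[ 1234.567890] mce: [Hardware Error]: Machine check events logged
[ 1300.000001] EXT4-fs (sda1): mounted filesystem
[ 1402.118822] mce: [Hardware Error]: CPU 3: Machine Check Exception: 5 Bank 4: be00000000800400
[ 1402.118900] Kernel panic - not syncing: Fatal machine check
"""

if __name__ == "__main__":
    for line in hardware_error_lines(SAMPLE_LOG):
        print(line)
    print("Hardware suspected:", likely_hardware(SAMPLE_LOG))
```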

Do you update firmware, BIOS and microcode without breaking production?

Yes, with planning. We verify the vendor compatibility matrix, check release notes for licensing and feature impact, and prepare a rollback plan. Updates run in an agreed window with a full configuration backup, and the post-update state is validated with stress tests before the server returns to production.

What does "thermal optimization" mean on a server already in production?

A targeted intervention to reduce CPU thermal throttling and bring sensors back to nominal ranges. It includes: reading historical data from the BMC (CPU, DIMM, VRM, ambient temperatures), airflow inspection, cleaning heatsinks and filters, thermal paste replacement (it degrades measurably after 3-4 years), fan curve calibration, and sometimes replacing fans that are inadequate for the current load. Often the fastest way to recover 10-20% of performance lost to throttling.
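
To make "reading historical data" concrete, the following Python sketch estimates how much of a temperature history sits at or above a throttling threshold. The threshold value and the sample readings are hypothetical: the real throttle point depends on the CPU model, and the readings would come from the BMC sensor history rather than a hard-coded list.

```python
def throttle_share(samples_c: list[float], threshold_c: float) -> float:
    """Fraction of samples at or above the threshold (0.0 for empty input)."""
    if not samples_c:
        return 0.0
    hot = sum(1 for temp in samples_c if temp >= threshold_c)
    return hot / len(samples_c)

def longest_hot_streak(samples_c: list[float], threshold_c: float) -> int:
    """Length of the longest run of consecutive samples at or above the threshold."""
    longest = current = 0
    for temp in samples_c:
        if temp >= threshold_c:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest

if __name__ == "__main__":
    # Hypothetical CPU temperatures in degrees Celsius, sampled at a fixed interval.
    history = [70, 88, 92, 95, 75]
    threshold = 90  # assumed throttle point; check the CPU's specification
    print(f"Share of samples at/above threshold: {throttle_share(history, threshold):.0%}")
    print(f"Longest hot streak (samples): {longest_hot_streak(history, threshold)}")
```

Comparing these two figures before and after an intervention gives a simple, repeatable way to check whether cleaning, new thermal paste, or a revised fan curve actually reduced the time spent near the throttle point.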

How old is your server without ever being opened?

If the answer is "more than three years", it is probably losing performance to thermal throttling and accumulating events in the logs. Health check + hardware refresh is an investment that pays back in recovered useful life.