@@ -1376,6 +1376,61 @@ def cleanup_old_snapshots(table_name: str, snapshot_ids: list[int]):
cleanup_old_snapshots("analytics.user_events", [12345, 67890, 11111])
```

#### Advanced Retention Strategies

PyIceberg provides additional retention helpers on `ExpireSnapshots` to balance safety and cleanup.

Key table properties used as defaults (all optional):

- `history.expire.max-snapshot-age-ms`: Default age threshold for `with_retention_policy`
- `history.expire.min-snapshots-to-keep`: Minimum total number of snapshots to retain
- `history.expire.max-ref-age-ms`: Reserved for future cleanup logic for protected refs and branches

Protected snapshots (referenced by branches or tags) are never expired by these APIs.
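
As a rough sketch of how these defaults might be prepared, the helper below builds the two active properties as string values. `retention_properties` is a hypothetical name used only for illustration, and the commented-out call assumes `Transaction.set_properties` accepts a dict of properties.

```python
from datetime import timedelta


def retention_properties(max_age: timedelta, min_snapshots: int) -> dict[str, str]:
    """Build retention-related table properties (hypothetical helper)."""
    if min_snapshots < 1:
        raise ValueError("min_snapshots must be >= 1")
    # Table property values are stored as strings; the age is in milliseconds.
    max_age_ms = int(max_age.total_seconds() * 1000)
    return {
        "history.expire.max-snapshot-age-ms": str(max_age_ms),
        "history.expire.min-snapshots-to-keep": str(min_snapshots),
    }


props = retention_properties(timedelta(days=7), min_snapshots=5)

# With a loaded `table`, the properties could then be applied in a transaction
# (assumed usage, not verified against this PR):
# with table.transaction() as transaction:
#     transaction.set_properties(props)
```

Keeping the values in one place makes it easier to review the defaults before any automated expiration reads them.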

Keep only the last N snapshots (plus any protected ones):

```python
table.maintenance.expire_snapshots().retain_last_n(5).commit()
```

Expire snapshots older than a cutoff, while always keeping the N most recent and never dropping below a minimum count:

```python
from datetime import datetime, timedelta

# Cutoff in epoch milliseconds: snapshots older than 7 days become candidates
cutoff = int((datetime.now() - timedelta(days=7)).timestamp() * 1000)
table.maintenance.expire_snapshots().older_than_with_retention(
    timestamp_ms=cutoff,
    retain_last_n=3,
    min_snapshots_to_keep=4,
).commit()
```
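
The cutoff arithmetic can also be factored into a small helper so it is easy to test with a fixed clock. This is only a sketch: `cutoff_ms` is a hypothetical name, not part of PyIceberg, and it uses a timezone-aware UTC clock by assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def cutoff_ms(days: int, now: Optional[datetime] = None) -> int:
    """Return the epoch-millisecond timestamp `days` days before `now`."""
    if now is None:
        now = datetime.now(timezone.utc)
    return int((now - timedelta(days=days)).timestamp() * 1000)


# A fixed "now" keeps the result reproducible.
fixed_now = datetime(2024, 1, 8, tzinfo=timezone.utc)
week_ago = cutoff_ms(7, now=fixed_now)  # midnight UTC on 2024-01-01
```

The resulting integer can be passed wherever a `timestamp_ms` argument is expected.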

Unified policy that also reads table property defaults:

```python
# Uses table properties if arguments are omitted
table.maintenance.expire_snapshots().with_retention_policy().commit()

# Override selectively
table.maintenance.expire_snapshots().with_retention_policy(
    retain_last_n=2,  # keep 2 newest regardless of age
    min_snapshots_to_keep=5,  # never go below 5 total
    # timestamp_ms omitted -> falls back to history.expire.max-snapshot-age-ms if set
).commit()
```

Parameter interaction rules:

- The newest `retain_last_n` snapshots are always kept (plus protected refs)
- `timestamp_ms` filters candidates: only snapshots older than it can be expired
- `min_snapshots_to_keep` stops expiration once the floor would be violated
- If all of (`timestamp_ms`, `retain_last_n`, `min_snapshots_to_keep`) are `None` in `with_retention_policy`, nothing is expired
- Passing invalid values (`< 1`) for counts raises `ValueError`
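
The rules above can be modeled in plain Python to see how they combine. This is a toy model of one plausible reading, not PyIceberg's implementation: `Snap` and `select_expired` are hypothetical names, candidates are assumed to be expired oldest first, and the floor is assumed to count all snapshots, including protected ones.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Snap:
    snapshot_id: int
    timestamp_ms: int


def select_expired(
    snapshots: list[Snap],
    protected_ids: Iterable[int] = (),
    timestamp_ms: Optional[int] = None,
    retain_last_n: Optional[int] = None,
    min_snapshots_to_keep: Optional[int] = None,
) -> list[int]:
    """Return the ids this toy model would expire, oldest first."""
    for name, value in (
        ("retain_last_n", retain_last_n),
        ("min_snapshots_to_keep", min_snapshots_to_keep),
    ):
        if value is not None and value < 1:
            raise ValueError(f"{name} must be >= 1")

    # No criteria at all: expire nothing.
    if timestamp_ms is None and retain_last_n is None and min_snapshots_to_keep is None:
        return []

    newest_first = sorted(snapshots, key=lambda s: s.timestamp_ms, reverse=True)

    # Protected refs and the N newest snapshots are always kept.
    keep = set(protected_ids)
    if retain_last_n is not None:
        keep.update(s.snapshot_id for s in newest_first[:retain_last_n])

    # Candidates, oldest first: not kept, and older than the cutoff if one is given.
    candidates = [
        s.snapshot_id
        for s in reversed(newest_first)
        if s.snapshot_id not in keep
        and (timestamp_ms is None or s.timestamp_ms < timestamp_ms)
    ]

    # Stop once expiring more would drop the total below the floor.
    floor = min_snapshots_to_keep or 0
    budget = max(len(snapshots) - floor, 0)
    return candidates[:budget]


snaps = [Snap(i, i * 100) for i in range(1, 7)]  # ids 1..6, oldest to newest
expired = select_expired(
    snaps, protected_ids={1}, timestamp_ms=450, retain_last_n=2, min_snapshots_to_keep=4
)
```

In this run, snapshot 1 is protected, snapshots 5 and 6 are the two newest, and snapshots 2, 3, and 4 are older than the cutoff. The floor of 4 out of 6 leaves room to expire only two, so `expired` is `[2, 3]`.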

Safety tip: Start with a higher `min_snapshots_to_keep` value when first enabling automated cleanup.

## Views

PyIceberg supports view operations.