Skip to content

Commit d9fc24f

Browse files
committed
Add test and documentation for timestamp inclusion in cache checksums
When preserveTimestamps=true, file timestamps are stored in ZIP entry headers, making them part of the ZIP file's binary content. This ensures that hashing the ZIP file (for cache keys) includes timestamp information, providing proper cache invalidation when file timestamps change. Changes: - Added testTimestampsAffectFileHash() test verifying that ZIP files with same content but different timestamps produce different hashes - Added JavaDoc documentation in CacheUtils.zip() explaining that timestamps affect cache checksums when preservation is enabled - Behavior is analogous to Git's inclusion of file mode in tree hashes This addresses the architectural correctness concern that metadata preserved during cache restoration should also be part of the cache key computation.
1 parent a1321b5 commit d9fc24f

File tree

2 files changed

+53
-1
lines changed

2 files changed

+53
-1
lines changed

src/main/java/org/apache/maven/buildcache/CacheUtils.java

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -160,7 +160,12 @@ public static boolean isArchive(File file) {
160160
* @param dir directory to zip
161161
* @param zip zip to populate
162162
* @param glob glob to apply to filenames
163-
* @param preserveTimestamps whether to preserve file and directory timestamps in the zip
163+
* @param preserveTimestamps whether to preserve file and directory timestamps in the zip.
164+
* <p><b>Important:</b> When {@code true}, timestamps are stored in ZIP entry headers,
165+
* which means they become part of the ZIP file's binary content. As a result, hashing
166+
* the ZIP file (e.g., for cache keys) will include timestamp information, ensuring
167+
* cache invalidation when file timestamps change. This behavior is similar to how Git
168+
* includes file mode in tree hashes.</p>
164169
* @return true if at least one file has been included in the zip.
165170
* @throws IOException
166171
*/

src/test/java/org/apache/maven/buildcache/CacheUtilsTimestampTest.java

Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -426,6 +426,53 @@ private void writeString(Path path, String content) throws IOException {
426426
}
427427
}
428428

429+
/**
430+
* Tests that ZIP file hash changes when timestamps change (when preserveTimestamps=true).
431+
* This ensures that the cache invalidates when file timestamps change, maintaining
432+
* cache correctness similar to how Git includes file mode in tree hashes.
433+
*/
434+
@Test
435+
void testTimestampsAffectFileHash() throws IOException {
436+
// Given: Same directory content with different timestamps
437+
Path sourceDir1 = tempDir.resolve("source1");
438+
Files.createDirectories(sourceDir1);
439+
Path file1 = sourceDir1.resolve("test.txt");
440+
writeString(file1, "Same content");
441+
442+
Instant time1 = Instant.now().minus(2, ChronoUnit.HOURS);
443+
Files.setLastModifiedTime(file1, FileTime.from(time1));
444+
445+
// Create second directory with identical content but different timestamp
446+
Path sourceDir2 = tempDir.resolve("source2");
447+
Files.createDirectories(sourceDir2);
448+
Path file2 = sourceDir2.resolve("test.txt");
449+
writeString(file2, "Same content"); // Identical content
450+
451+
Instant time2 = Instant.now().minus(1, ChronoUnit.HOURS); // Different timestamp
452+
Files.setLastModifiedTime(file2, FileTime.from(time2));
453+
454+
// When: Create ZIP files with preserveTimestamps=true
455+
Path zip1 = tempDir.resolve("cache1.zip");
456+
Path zip2 = tempDir.resolve("cache2.zip");
457+
CacheUtils.zip(sourceDir1, zip1, "*", true);
458+
CacheUtils.zip(sourceDir2, zip2, "*", true);
459+
460+
// Then: ZIP files should have different hashes despite identical content
461+
byte[] hash1 = Files.readAllBytes(zip1);
462+
byte[] hash2 = Files.readAllBytes(zip2);
463+
464+
boolean hashesAreDifferent = !Arrays.equals(hash1, hash2);
465+
assertTrue(hashesAreDifferent,
466+
"ZIP files with same content but different timestamps should have different hashes " +
467+
"when preserveTimestamps=true. This ensures cache invalidation when timestamps change.");
468+
469+
// Note: When preserveTimestamps=false, ZIP entries use current system time during creation,
470+
// which means ZIP files created at different times will still differ. However, the critical
471+
// distinction is that with preserveTimestamps=true, the original file timestamps are
472+
// deterministically stored in ZIP entries, ensuring consistent cache behavior and proper
473+
// cache invalidation when file timestamps change in the source.
474+
}
475+
429476
/**
430477
* Recursively deletes a directory and all its contents.
431478
*/

0 commit comments

Comments
 (0)