supported splitting packages #19

Open: wants to merge 3 commits into base: master
2 changes: 1 addition & 1 deletion ValvePak/ValvePak.Test/ValvePak.Test.csproj
@@ -1,4 +1,4 @@
<Project Sdk="Microsoft.NET.Sdk">
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<TargetFramework>net8.0</TargetFramework>
185 changes: 152 additions & 33 deletions ValvePak/ValvePak/Package.Save.cs
@@ -1,13 +1,25 @@
using System;
using System.Buffers;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.IO.Hashing;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Text.RegularExpressions;

namespace SteamDatabase.ValvePak
{
internal sealed class WriteEntry(ushort archiveIndex, uint fileOffset, PackageEntry entry)
Member:

I don't think this is needed. You can calculate the ArchiveIndex directly in AddFile.

You can look at Valve's packedstore.cpp to see how they handle adding files:

  • CPackedStore::AddFile has a bMultiChunk bool.
  • They keep track of m_nHighestChunkFileIndex and then increase it if the file offset is higher than m_nWriteChunkSize which defaults to 200 * 1024 * 1024 bytes.
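
The scheme in those bullets can be sketched as follows. This is a hypothetical illustration, not code from the PR or from Valve: WriteChunkSize mirrors m_nWriteChunkSize (200 * 1024 * 1024 by default), the index bump mirrors m_nHighestChunkFileIndex, and the PlaceFile helper and its signature are invented for the example.

```csharp
using System;

// Valve's default write chunk size, per the comment above (an assumption here).
const int WriteChunkSize = 200 * 1024 * 1024;

ushort highestChunkFileIndex = 0;
long currentChunkOffset = 0;

// Invented helper: decide which archive a file lands in while adding it,
// instead of computing placements in a separate WriteEntry pass.
(ushort ArchiveIndex, long FileOffset) PlaceFile(long fileLength, bool multiChunk)
{
    if (!multiChunk)
        return (0x7FFF, currentChunkOffset); // 0x7FFF: data stored in the _dir vpk itself

    if (currentChunkOffset > 0 && currentChunkOffset + fileLength > WriteChunkSize)
    {
        highestChunkFileIndex++; // crossing the chunk size starts a new chunk file
        currentChunkOffset = 0;
    }

    var placement = (highestChunkFileIndex, currentChunkOffset);
    currentChunkOffset += fileLength;
    return placement;
}
```

With this shape, AddFile would call PlaceFile once per file and never need to keep a separate list of write entries.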

Author:

That sounds good. I should go take a look at packedstore.cpp. Can you tell me where it is?

Member:

Search for cstrike15_src

Author:

I found it, thank you

{
internal ushort ArchiveIndex { get; set; } = archiveIndex;
internal uint FileOffset { get; set; } = fileOffset;
internal PackageEntry Entry { get; set; } = entry;
}
public partial class Package
{
/// <summary>
@@ -101,37 +113,38 @@ public PackageEntry AddFile(string filePath, byte[] fileData)
/// Opens and writes the given filename.
/// </summary>
/// <param name="filename">The file to open and write.</param>
public void Write(string filename)
public void Write(string filename, int maxFileBytes = int.MaxValue)
{
using var fs = new FileStream(filename, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.None);
ArgumentOutOfRangeException.ThrowIfNegative(maxFileBytes);

using var fs = new FileStream(filename, FileMode.Create, FileAccess.ReadWrite, FileShare.None);
fs.SetLength(0);

Write(fs);
Write(fs, maxFileBytes);
}

/// <summary>
/// Writes to the given <see cref="Stream"/>.
/// </summary>
/// <param name="stream">The input <see cref="Stream"/> to write to.</param>
public void Write(Stream stream)
public void Write(FileStream stream, int maxFileBytes)
{
if (IsDirVPK)
{
throw new InvalidOperationException("This package was opened from a _dir.vpk, writing back is currently unsupported.");
}

ArgumentNullException.ThrowIfNull(stream);

if (!stream.CanSeek || !stream.CanRead)
{
throw new InvalidOperationException("Stream must be seekable and readable.");
}

using var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true);

// TODO: input.SetLength()
var streamOffset = stream.Position;
ulong fileDataSectionSize = 0;

List<PackageEntry> entries = Entries.SelectMany(e => e.Value).ToList();

if (entries.Any(e => e.TotalLength > maxFileBytes))
throw new InvalidOperationException("One or more files exceed maxFileBytes");


var tree = new Dictionary<string, Dictionary<string, List<PackageEntry>>>();

@@ -152,13 +165,6 @@ public void Write(Stream stream)
}

directoryEntries.Add(entry);

fileDataSectionSize += entry.TotalLength;

if (fileDataSectionSize > int.MaxValue)
{
throw new InvalidOperationException("Package contents exceed 2GiB, and splitting packages is currently unsupported.");
}
}
}

@@ -168,15 +174,20 @@ public void Write(Stream stream)
writer.Write(0); // TreeSize, to be updated later
writer.Write(0); // FileDataSectionSize, to be updated later
writer.Write(0); // ArchiveMD5SectionSize
writer.Write(48); // OtherMD5SectionSize
writer.Write(48); //OtherMD5SectionSize
writer.Write(0); // SignatureSectionSize

var headerSize = (int)(stream.Position - streamOffset);
uint fileOffset = 0;

const byte NullByte = 0;

// File tree data
bool isSingleFile = entries.Sum(s => s.TotalLength) + headerSize + 64 <= maxFileBytes;
Member:

I don't like using maxFileBytes here, we should just have a bool to specify that we want to multi chunk.

This size calculation is also gonna be incorrect if we want to write file hashes.


var groups = CreatePacketsGroup(entries, maxFileBytes, isSingleFile);

if (groups.Count >= 0x7FFF)
throw new InvalidOperationException("The number of packages exceeds 32766");


uint fileOffset = 0;
foreach (var typeEntries in tree)
{
writer.Write(Encoding.UTF8.GetBytes(typeEntries.Key));
@@ -191,12 +202,24 @@
{
var fileLength = entry.TotalLength;

var fullPath = entry.GetFullPath();
WriteEntry writeEntry = null;

foreach (var group in groups)
{
if (group.TryGetValue(fullPath, out writeEntry))
break;
}
if (writeEntry is null)
throw new InvalidOperationException("No write entry was found for this file");


writer.Write(Encoding.UTF8.GetBytes(entry.FileName));
writer.Write(NullByte);
writer.Write(entry.CRC32);
writer.Write((short)0); // SmallData, we will put it into data instead
writer.Write(entry.ArchiveIndex);
writer.Write(fileOffset);
writer.Write(writeEntry.ArchiveIndex);
writer.Write(writeEntry.FileOffset);
writer.Write(fileLength);
writer.Write(ushort.MaxValue); // terminator, 0xFFFF

@@ -211,22 +234,47 @@

writer.Write(NullByte);

var fileTreeSize = stream.Position - headerSize;
// Delete chunk files left over from previous writes
for (ushort i = 0; i < 999; i++)
Member:

Remove this loop

Author:

This deletes the chunk files produced by previous writes. I believe that when users reduce the maximum byte count and regenerate the chunks, leftover chunk files from earlier writes could be very confusing.

xPaw (Member), Jul 13, 2024:

That's up to them to clean up then, not really our job to arbitrarily loop for 1k files. We only care that the _dir.vpk references correct chunk file which will be overwritten.

Author:

> That's up to them to clean up then, not really our job to arbitrarily loop for 1k files. We only care that the _dir.vpk references correct chunk file which will be overwritten.

You're right, we shouldn't make that decision for users unasked.

{
string sub_FilePath = GetSubFilePath(stream.Name, i);
if (File.Exists(sub_FilePath))
File.Delete(sub_FilePath);
}

// File data
foreach (var typeEntries in tree)
if (isSingleFile)
{
foreach (var directoryEntries in typeEntries.Value)
//Write file data
foreach (var writeEntry in groups[0].Values)
{
foreach (var entry in directoryEntries.Value)
{
ReadEntry(entry, out var fileData, validateCrc: false);
ReadEntry(writeEntry.Entry, out var fileData, validateCrc: false);
writer.Write(fileData);
}
}
else
{
//Create and write sub file data
for (ushort i = 0; i < groups.Count; i++)
{
string sub_FilePath = GetSubFilePath(stream.Name, i);

writer.Write(fileData);
using var fs = new FileStream(sub_FilePath, FileMode.Create, FileAccess.ReadWrite, FileShare.None);
using var writer_sub = new BinaryWriter(fs, Encoding.UTF8, leaveOpen: true);

var group = groups[i];
foreach (var writeEntry in group.Values)
{
ReadEntry(writeEntry.Entry, out var fileData, validateCrc: false);
writer_sub.Write(fileData);
}
}

}


long fileTreeSize = stream.Position - headerSize;


var afterFileData = stream.Position;
var fileDataSize = afterFileData - fileTreeSize - headerSize;

@@ -293,5 +341,76 @@ public void Write(Stream stream)
ArrayPool<byte>.Shared.Return(buffer);
}
}

/// <summary>
/// Get the sub file name
/// </summary>
/// <param name="indexFilePath">Index file path</param>
/// <param name="indexNumber">Index number</param>
/// <returns>Full path of the chunk file for the given index</returns>
static string GetSubFilePath(string indexFilePath, ushort indexNumber)
{
FileInfo sub_FileInfo = new FileInfo(indexFilePath);

string sub_FileName = Path.GetFileNameWithoutExtension(sub_FileInfo.FullName);
if (sub_FileName.EndsWith("_dir", StringComparison.OrdinalIgnoreCase))
sub_FileName = $"{sub_FileName[..^4]}";

sub_FileName = $"{sub_FileName}_{indexNumber:D3}";
return Path.Combine(sub_FileInfo.DirectoryName, $"{sub_FileName}{sub_FileInfo.Extension}");
}
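
The naming rule above can be exercised with a standalone re-implementation. This is a sketch assuming the rule is "strip a trailing _dir, append _NNN"; it uses Path.Combine rather than a hard-coded backslash so it stays portable, and it is not the PR's method itself.

```csharp
using System;
using System.IO;

// Re-implementation of the chunk-naming rule for illustration only.
static string GetChunkPath(string dirVpkPath, ushort index)
{
    var name = Path.GetFileNameWithoutExtension(dirVpkPath);

    // "pak01_dir" -> "pak01"
    if (name.EndsWith("_dir", StringComparison.OrdinalIgnoreCase))
        name = name[..^4];

    // Append a zero-padded three-digit chunk index and the original extension.
    var fileName = $"{name}_{index:D3}{Path.GetExtension(dirVpkPath)}";
    return Path.Combine(Path.GetDirectoryName(dirVpkPath) ?? "", fileName);
}
```

So a directory file named pak01_dir.vpk yields chunk files pak01_000.vpk, pak01_001.vpk, and so on, which matches how shipped VPKs are named.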

/// <summary>
/// Split the entries into per-chunk groups based on the maximum chunk size
/// </summary>
/// <param name="entries">Entries to distribute across chunk files</param>
/// <param name="maxFileBytes">Maximum file byte count</param>
/// <param name="isSingleFile">True when all data fits into a single file</param>
/// <returns>One group of write entries per chunk file</returns>

static List<Dictionary<string, WriteEntry>> CreatePacketsGroup(List<PackageEntry> entries, int maxFileBytes, bool isSingleFile)
{
List<Dictionary<string, WriteEntry>> groups = new List<Dictionary<string, WriteEntry>>();
uint totalLength = 0;
ushort archiveIndex = 0;
Dictionary<string, WriteEntry> group = new Dictionary<string, WriteEntry>();
groups.Add(group);

if (isSingleFile)
{
foreach (var entry in entries)
{
group.Add(entry.GetFullPath(), new(0x7FFF, totalLength, entry));
totalLength += entry.TotalLength;
}
}
else
{
group.Add(entries[0].GetFullPath(), new(archiveIndex, totalLength, entries[0]));
totalLength += entries[0].TotalLength;

entries.RemoveAt(0);
do
{
PackageEntry entry = entries.Find(e => e.TotalLength < (ulong)maxFileBytes - totalLength);
if (entry is not null)
{
group.Add(entry.GetFullPath(), new(archiveIndex, totalLength, entry));
totalLength += entry.TotalLength;
entries.Remove(entry);
}
else
{
group = new Dictionary<string, WriteEntry>();
groups.Add(group);
totalLength = 0;
archiveIndex++;
}

} while (entries.Count != 0);
}


return groups;
}
}
}
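
CreatePacketsGroup above is essentially a first-fit packing loop: each remaining entry goes into the current chunk if it still fits, otherwise a new chunk is opened. A self-contained sketch of the same idea, with invented entry names and sizes (note that, as the PR's Write does before grouping, entries larger than the limit must be rejected up front, or this loop would never terminate):

```csharp
using System;
using System.Collections.Generic;

// First-fit grouping sketch: not the PR's CreatePacketsGroup, just the core idea.
static List<List<(string Name, uint Size)>> GroupFirstFit(
    List<(string Name, uint Size)> entries, uint maxBytes)
{
    var groups = new List<List<(string Name, uint Size)>> { new() };
    uint used = 0;
    var remaining = new List<(string Name, uint Size)>(entries);

    while (remaining.Count > 0)
    {
        // First fit: any remaining entry that still fits the current chunk.
        var i = remaining.FindIndex(e => e.Size <= maxBytes - used);
        if (i >= 0)
        {
            groups[^1].Add(remaining[i]);
            used += remaining[i].Size;
            remaining.RemoveAt(i);
        }
        else
        {
            // Nothing fits: start a new chunk.
            groups.Add(new());
            used = 0;
        }
    }

    return groups;
}
```

Because later, smaller entries can backfill an earlier chunk, the output order differs from the input order, which is why the PR keys each group by full path rather than by position.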