Wjjsoft Structured Storage Library (SSG-5)

Introduction

Wjjsoft Structured Storage Library provides a 64-bit cross-platform solution for applications to store hierarchical information locally inside a single file. The information can be files of any type, such as images, documents, spreadsheets, presentations and videos, whether text or binary. All information is automatically compressed and saved as hierarchical files and folders inside the storage file. The library provides a set of ANSI C APIs for manipulating the file/folder entries in a storage.

The 64-bit storage library supports large files (larger than 2GiB, up to 8EiB). In theory, a single storage file can keep growing until it fills the disk it lives on. Whatever its size, a storage opens within about a second, or a few seconds at most.

The storage file format is designed for performance, safety, stability, security, simplicity, and recoverability, both from a corrupted storage and from partial data slices rescued from crashed disks. The library package includes a shell command line tool that helps create a storage and retrieve information, either in batch or by hand, and a data recovery tool that helps recover information from a corrupted storage file or from data slices.

Key Features

The latest version of the structured storage library is 5.0. We spent more than a year designing, coding, testing, debugging, and optimizing it for size, speed and stability, and it brings significant improvements over the previous version 3.x, which was initially released in 2000.

Files and Folders (Directory Tree)

Files and folders are the basic concepts of the storage library. To store information within a storage, you create an 'inner' file under a specified 'inner' folder, and the information is saved within that inner file. Each inner folder maintains a collection of inner files and a collection of sub folders. In fact, an inner folder is a special kind of inner file whose content simply keeps the entries of its files and sub folders. Inner files and folders can be enumerated and searched by name. Their names follow the common filename conventions but have no length limit, and you can programmatically rename a file/folder as you see fit.

Data Compression

Wjjsoft Structured Storage Library integrates the Zlib compression library. All information is automatically compressed with Zlib, which can save a great deal of disk space (up to 90%) without much loss of performance. The compression level is customizable from 0 to 9.
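The effect of the compression level can be tried with zlib directly. The following is a minimal sketch using Python's standard zlib module rather than the storage library itself; the sample payload is made up for illustration, and real-world ratios depend heavily on the data.

```python
import zlib

# Sample payload: repetitive text compresses well; real-world ratios vary.
payload = b"structured storage " * 1000

# Compress at each zlib level (0 = no compression, 9 = maximum compression).
sizes = {level: len(zlib.compress(payload, level)) for level in range(10)}

for level, size in sizes.items():
    print(f"level {level}: {size} bytes ({size / len(payload):.1%} of original)")

# Round trip: decompression restores the exact input.
restored = zlib.decompress(zlib.compress(payload, 9))
```

Level 0 only wraps the data in the zlib container, so its output is slightly larger than the input; higher levels trade CPU time for a smaller result.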

Data Integrity Checksum

For each inner file/folder, a data integrity checksum is saved alongside it in the storage, so the library can verify integrity while extracting information and ensure that the data is intact.
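The idea of verifying a stored checksum on extraction can be sketched as follows. This is an illustration only: the text does not say which checksum the storage uses for entries, so Adler-32 is an assumption here, and the store/extract helpers are hypothetical names, not library functions.

```python
import zlib

def store(content: bytes) -> tuple[bytes, int]:
    """Return the content together with its Adler-32 checksum (assumed algorithm)."""
    return content, zlib.adler32(content)

def extract(content: bytes, expected: int) -> bytes:
    """Return the content only if its checksum still matches the stored one."""
    if zlib.adler32(content) != expected:
        raise ValueError("checksum mismatch: entry is damaged")
    return content

data, checksum = store(b"inner file content")
intact = extract(data, checksum)

damaged = b"inner file c0ntent"  # one byte changed
try:
    extract(damaged, checksum)
    detected = False
except ValueError:
    detected = True
```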

Large File Support

With 64-bit programming, Wjjsoft Structured Storage Library supports large files with a capacity greater than 2GiB. In theory, a single storage can be extended up to 8EiB (2^63 bytes, roughly 9.2 × 10^18).
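The size figures can be checked with a few lines of arithmetic. The link to signed 64-bit offsets in the comment is an assumption about where the 2^63 limit comes from; the text only states the limit itself.

```python
GIB = 2**30   # one gibibyte in bytes
EIB = 2**60   # one exbibyte in bytes

# 2^63 is the count of non-negative values in a signed 64-bit integer
# (assumed here to be the offset type behind the stated limit).
limit = 2**63

print(limit)            # 9223372036854775808
print(limit // EIB)     # 8
print(limit // GIB)     # 8589934592
print(f"{limit:.2e}")   # 9.22e+18
```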

Load-on-Access

It is hard to imagine opening a large storage (>300MiB) without a 'load-on-access' option. Wjjsoft Structured Storage Library implements load-on-access by reading as few bytes as possible, which keeps the 'Load/Open' operation fast. Basically, it loads only the first 512 bytes (the Storage Header) when opening a storage, and applications can delay-load additional files/folders on access.
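The load-on-access idea can be modelled in a few lines. This is a toy sketch, not the library's implementation: the LazyStorage class and its load method are hypothetical, the header is treated as 512 opaque bytes because its layout is not described here, and the dummy file stands in for a real storage.

```python
import os
import tempfile

HEADER_SIZE = 512  # header size stated in the text; its layout is not modelled

class LazyStorage:
    """Toy model: read the header on open, read other regions on demand."""

    def __init__(self, path: str):
        self._file = open(path, "rb")
        self.header = self._file.read(HEADER_SIZE)  # the only read at open time
        self.bytes_read = len(self.header)

    def load(self, offset: int, length: int) -> bytes:
        """Delay-load one region; a hypothetical stand-in for loading an entry."""
        self._file.seek(offset)
        chunk = self._file.read(length)
        self.bytes_read += len(chunk)
        return chunk

    def close(self) -> None:
        self._file.close()

# Build a 1 MiB dummy file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"H" * HEADER_SIZE + b"D" * (1024 * 1024 - HEADER_SIZE))
    path = tmp.name

storage = LazyStorage(path)
read_at_open = storage.bytes_read   # 512, regardless of file size
chunk = storage.load(4096, 16)      # touch one small region later
read_total = storage.bytes_read     # 528
storage.close()
os.remove(path)
```

The cost of opening depends on the header size only, which is why the open time does not grow with the size of the storage.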

Transaction Commit

Applications can save modifications to an opened storage at any time. Closing a storage without saving changes reverts it to the last committed state, and all uncommitted changes are discarded. This behavior resembles the 'transactions' implemented by relational database systems.
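The commit/revert behavior described above can be illustrated with a small in-memory model. The TransactionalStore class and its method names are hypothetical and only mimic the semantics; they are not the library's API and say nothing about how the library implements commits on disk.

```python
import copy

class TransactionalStore:
    """Toy in-memory model of commit/revert semantics (not the library's API)."""

    def __init__(self):
        self._committed = {}   # state as of the last commit
        self._working = {}     # working copy holding uncommitted changes

    def write(self, path: str, data: bytes) -> None:
        self._working[path] = data

    def commit(self) -> None:
        self._committed = copy.deepcopy(self._working)

    def close_and_reopen(self) -> None:
        """Closing without a commit discards the working copy."""
        self._working = copy.deepcopy(self._committed)

    def read(self, path: str):
        return self._working.get(path)

store = TransactionalStore()
store.write("/notes/a.txt", b"first")
store.commit()
store.write("/notes/a.txt", b"second")   # modified but not committed
store.write("/notes/b.txt", b"new")      # created but not committed
store.close_and_reopen()

print(store.read("/notes/a.txt"))   # b'first'
print(store.read("/notes/b.txt"))   # None
```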

Storage Space Recycling

Wjjsoft Structured Storage Library implements an effective and efficient storage space allocator, and internally maintains a recycle bin for releasing and reallocating storage space. Applications don't need to do anything about storage space allocation.

Storage Size Optimization

Releasing files/folders may temporarily produce fragments, which can then be reallocated when other files/folders are added. The library always tries to allocate the lowest-address fragments first, which helps eliminate fragment space automatically without running the optimization process. Nevertheless, applications can still run the 'optimization' process at any time to eliminate fragments and make the storage compact.
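The lowest-address-first reuse of released space can be sketched with a simple first-fit allocator. This is an assumption-laden toy: the ExtentAllocator class is hypothetical, it does not merge adjacent holes, and the library's real allocator and recycle bin may work quite differently.

```python
class ExtentAllocator:
    """Toy first-fit allocator that prefers the lowest free address."""

    def __init__(self):
        self.free = []   # list of (offset, length) holes, kept sorted by offset
        self.end = 0     # current end of the storage

    def allocate(self, length: int) -> int:
        for i, (offset, size) in enumerate(self.free):
            if size >= length:
                if size == length:
                    del self.free[i]
                else:
                    self.free[i] = (offset + length, size - length)
                return offset
        offset = self.end          # no hole fits: grow the storage
        self.end += length
        return offset

    def release(self, offset: int, length: int) -> None:
        self.free.append((offset, length))
        self.free.sort()

alloc = ExtentAllocator()
a = alloc.allocate(100)   # offset 0
b = alloc.allocate(200)   # offset 100
c = alloc.allocate(50)    # offset 300
alloc.release(a, 100)     # hole at 0..100
alloc.release(b, 200)     # hole at 100..300
d = alloc.allocate(80)    # reuses the lowest hole: offset 0
e = alloc.allocate(150)   # next hole that fits: offset 100
```

Because new data lands in the lowest fitting hole, the storage does not grow (its end stays at 350 here) as long as released space can absorb the new entries.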

Recoverability

The storage file format is designed for performance, safety, stability, security and recoverability. If a storage gets corrupted for any reason, the recovery tool makes it possible to retrieve the inner files/folders that are individually still in good condition. The data checksums help determine whether the content of each file/folder was retrieved exactly. The recovery tool doesn't depend on any descriptive info saved within the storage header; it simply scans the data slices for individual files/folders that are still intact. Therefore, the recovery tool not only helps recover from a wholly corrupted storage file, but also helps retrieve info from separate data slices extracted from crashed disks with disk utilities.

In-memory Storage Support

Wjjsoft Structured Storage Library internally supports both ordinary files and memory buffers as the storage media, and provides an interface for applications to define any other type of randomly accessible stream as the storage media.
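The point of a pluggable storage medium is that the same code can run over a file or a memory buffer, as long as the stream supports seek, read and write. A minimal sketch in Python, using the standard io and tempfile modules; the write_then_read helper is a made-up name and is unrelated to the library's own stream interface.

```python
import io
import tempfile

def write_then_read(stream, offset: int, data: bytes) -> bytes:
    """Works on any seekable binary stream, file-backed or memory-backed."""
    stream.seek(offset)
    stream.write(data)
    stream.seek(offset)
    return stream.read(len(data))

# Memory buffer as the storage medium.
mem_result = write_then_read(io.BytesIO(), 512, b"payload")

# Ordinary file as the storage medium.
with tempfile.TemporaryFile() as f:
    file_result = write_then_read(f, 512, b"payload")
```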

Cross-platform Support

Wjjsoft Structured Storage Library is written in ANSI C/C++ and can be compiled and deployed on the common operating systems. So far, we've successfully built the storage library (.dll, .so and .dylib) for Windows/Linux/FreeBSD/MacOSX. The same is true of the recovery tool and the shell command line.

Benefits

Saving All Information in a Single File

Wjjsoft Structured Storage Library offers a convenient way to store all your information within a single storage file, without cluttering the file system with many small (or large) files. This helps a lot when managing a large number of software configurations, personal information and various types of documents.

Fully Customizable Folder Tree

Each storage file maintains a fully customizable folder (directory) tree. As in a traditional file system, applications can create any number of files, folders and sub folders inside the storage to store any information, without having to care about how the disk space is allocated.

A Great Solution to Store Unstructured Info

Thanks to the fully customizable hierarchical folders, the structured storage is well suited to storing and managing unstructured information, such as images, documents, spreadsheets, presentations, videos and any ordinary files, whether text or binary. Over the past ten years, myBase Desktop has been one such program, helping many users store and organize notes and random information in tree outline form effectively and efficiently.

Data Compression for Saving Disk Space

By utilizing the Zlib compression library, all information is automatically compressed before it is inserted into the storage, which helps save disk space.

Instant-Open and Load-on-Access

No matter what size a storage is, only the first 512 bytes (the storage header) are loaded and manipulated when opening it, which usually takes less than one second. Access stays fast unless a large number (hundreds) of files/folders must be loaded at the same time. In that case, applications may want to re-arrange the structure of files/folders to avoid loading/saving a large number of entries at once.

Storage Size Optimization

By running the size-optimization process, most space fragments can be eliminated from the storage, yielding a compact storage. This saves disk space, and saves time when archiving, creating backups, or transferring the storage file over a network.

Recoverability from Corrupted Storage

The storage file format is designed and optimized for data safety, and a recovery tool is included in the library package. The recovery tool makes it possible to retrieve individual folders/files that are still in good health. It works with a corrupted storage, or with data slices rescued from crashed disks, and it attempts to find and retrieve as much information as can be retrieved.

Supported Platforms

Wjjsoft Structured Storage Library is written in ANSI C++ and linked with a few free 3rd-party libraries (compression, digest, etc.) that are written entirely in ANSI C. In theory, the library works on almost all common platforms and architectures. So far, we've built and tested it on the following operating systems and some of their variants.

  • Windows
  • GNU/Linux
  • FreeBSD
  • MacOSX

Using the Library

The current version of Wjjsoft Structured Storage Library is pre-built as a .dll module (dynamic-link library) for Windows, a .so (shared object) for Linux/FreeBSD, and a .dylib for MacOSX. It exposes all its APIs as ANSI C functions, without C++ classes/functions. Manipulating information within a structured storage through the APIs is simple. The typical steps look like this:

  1. Open a storage;
  2. For viewing, just load some folders or relevant file content;
  3. For editing, create new folders/files, or delete/overwrite/move existing folders/files;
  4. Commit changes if any;
  5. Close the storage;

To integrate the structured storage library, you'll need to read about the following terms and concepts:

  1. Storage Handle
  2. File/Folder Handle
  3. Stream Handle

The Wjjsoft Structured Storage Library package includes a shell command line and a recovery tool. The shell command line wraps the library APIs, so you can manipulate a storage by hand in Terminal, or with shell scripts for batch processing. The recovery tool is intended for retrieving info from a corrupted storage, or from data slices of a storage. For detailed info about the command line tools, please read the relevant topics below.

Internals

A storage simply consists of a storage header and a series of data extents.

Storage Header

The storage header is a data block of 512 bytes located at the beginning of the storage. It contains the essential data fields describing the storage, and a customizable area intended for storing application-defined bits.

Extents

The 'Extents' concept is a lower-layer concept of Wjjsoft Structured Storage Library. Some modern file systems (e.g. EFS, EXT4, BTRFS, etc.) take advantage of the concept as well. However, the 'Extents' implementation in Wjjsoft Structured Storage Library is somewhat different from that of file systems: it is intended only to allocate space and store data, and has no knowledge of hierarchical or tree-structured information. This keeps the lower layer of the storage simple and stable.

Folder Tree

To store hierarchical information in the storage, the 'Folder Tree' concept is introduced and implemented over the 'Extents' concept. Each storage maintains a folder tree internally. Each entry in the folder tree maintains a list of sub entries, and stores them within one or a few data extents allocated in the storage. In this sense, the 'Folder Tree' is designed as an application built on top of the 'Extents' concept, and the storage header reserves a customizable area in which the 'Folder Tree' application stores the root folder info. This way, opening a storage is simple and fast: it only needs to load the first 512 bytes (the storage header) and read the root folder info, and the application then determines which entries to load and when.

Files

The 'Folder Tree' concept just introduces the tree structure, while the 'Files' concept is designed for storing the bulk of the actual data. As in a traditional file system, each file entry must have a parent folder, and file content is stored within one or more data extents allocated in the storage.

Byte Order

All relevant integer numbers are saved in Little Endian byte order.
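For readers unfamiliar with byte order, the following sketch shows how a 64-bit integer looks when written in little-endian order, using Python's standard struct module. The sample value is arbitrary, and the choice of a 64-bit unsigned field is only for illustration; the actual field widths in the storage format are not described here.

```python
import struct

value = 0x0102030405060708

little = struct.pack("<Q", value)   # 64-bit unsigned, little-endian
big = struct.pack(">Q", value)      # same value, big-endian, for contrast

print(little.hex())   # 0807060504030201
print(big.hex())      # 0102030405060708

# Reading the bytes back with the same byte order restores the value.
decoded, = struct.unpack("<Q", little)
```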

Shell Command Line

Currently the storage library includes a shell command line and a recovery tool built for Windows Console and Unix Shell.

SSG Shell

The shell command line program is named 'ssg'. It accepts an operation tag followed by a few parameters, so a typical command line may look like this: "./ssg new stg=1.nyf -passwd=abc", where 'new' is recognized as the sub command tag, followed by its two parameters: 'stg=1.nyf -passwd=abc'. Some ssg commands allow the names of required parameters to be omitted. For example, 'stg=' is a required parameter of the 'ssg new' command, so its name can be omitted (provided the values are typed in the original order), like this: "./ssg new 1.nyf -passwd=abc".
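When driving the tool from a script, the two command lines quoted above can be assembled programmatically. The sketch below only builds and prints the argument lists and does not run 'ssg'; the ssg_command helper is a hypothetical name, and it reproduces the parameter spelling exactly as quoted in the paragraph above.

```python
import shlex

def ssg_command(sub: str, required: dict, options: dict, short: bool = False) -> list:
    """Build an argv list for the 'ssg' tool (hypothetical helper).

    required: parameters whose names may be omitted (order matters).
    options:  dashed '-name=value' parameters.
    short:    if True, drop the names of the required parameters.
    """
    argv = ["./ssg", sub]
    for name, value in required.items():
        argv.append(value if short else f"{name}={value}")
    for name, value in options.items():
        argv.append(f"-{name}={value}")
    return argv

full = ssg_command("new", {"stg": "1.nyf"}, {"passwd": "abc"})
short = ssg_command("new", {"stg": "1.nyf"}, {"passwd": "abc"}, short=True)

print(shlex.join(full))    # ./ssg new stg=1.nyf -passwd=abc
print(shlex.join(short))   # ./ssg new 1.nyf -passwd=abc
```

An argument list like this could be handed to subprocess.run on a machine where the tool is installed.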

  1. ssg new: Create a storage over a specified disk file.
    • -stg=[file-path]: specifies a filename including path to create a storage in the file (Required*);
    • -version=[3 or 5]: determines which version of the storage library is selected; Default value: 5;
    • -zlib=[0-9]: determines the zlib compression level; 0: no compression, 9: maximum compression ratio; Default value: 9;
    • -passwd=[password]: Set a password for the newly created storage;
    • -offset=[#]: specifies a positive number of bytes for the storage to offset; Default value: 0;
    • -signature=[xxxx]: specifies a 4-byte application-defined signature (aka. magic number) which is saved within the storage header; Default value: ":NYF";
  2. ssg folder: Create a folder entry by a specified SSG path.
    • -stg=[file-path]: specifies a file path to the destination storage (Required*);
    • -path=[ssg-path]: specifies an inner SSG path to the folder entry being created (Required*);
    • -offset=[#]: specifies a positive number of bytes for the storage to offset; It must be the exact number which was input during the 'ssg new' command; Default value: 0;
  3. ssg file: Create a file entry without inserting any content.
    • -stg=[file-path]: specifies a file path to the destination storage (Required*);
    • -fn=[ssg-path]: specifies an inner SSG path to the destination file entry being created (Required*);
    • -offset=[#]: specifies a positive number of bytes for the storage to offset; It must be the exact number which was input during the 'ssg new' command; Default value: 0;
  4. ssg import: Create a file entry with content of the specified file.
    • -stg=[file-path]: specifies a file path to the destination storage (Required*);
    • -fndst=[ssg-path]: specifies an inner SSG path to the destination file entry being created (Required*);
    • -fnsrc=[file-path]: specifies a disk file path to the source content (Required*); If not supplied, it attempts to accept text input from console/keyboard (i.e. stdin or std::cin) as the new file entry's content;
    • -offset=[#]: specifies a positive number of bytes for the storage to offset; It must be the exact number which was input during the 'ssg new' command; Default value: 0;
  5. ssg export: Export an inner file content to a specified disk file.
    • -stg=[file-path]: specifies a file path to the source storage (Required*);
    • -fnsrc=[ssg-path]: specifies an inner SSG path to the source file entry being exported (Required*);
    • -fndst=[file-path]: specifies a disk file to save content (Required*); If not supplied, it attempts to print content on console/screen (i.e. stdout or std::cout);
    • -offset=[#]: specifies a positive number of bytes for the storage to offset; It must be the exact number which was input during the 'ssg new' command; Default value: 0;
  6. ssg hint: Set hint info for the specified file or folder entry. The 'hint' info can be a short text that describes the folder/file entry, but is different from the file/folder's name.
    • -stg=[file-path]: specifies a file path to the destination storage (Required*);
    • -entry=[ssg-path]: specifies an inner SSG path to the destination file/folder entry (Required*);
    • -hint=[text]: specifies a hint text for the file/folder entry;
    • -act=[get|set]: specifies an act for getting or setting hint text for the file/folder entry;
    • -offset=[#]: specifies a positive number of bytes for the storage to offset; It must be the exact number which was input during the 'ssg new' command; Default value: 0;
  7. ssg list: List out sub entries of a folder entry.
    • -stg=[file-path]: specifies a file path to the storage (Required*);
    • -path=[ssg-path]: specifies an inner SSG path to the folder entry being listed (Required*); Default value: /;
    • -type=[files|folders|all]: specifies what type of entries to be listed; Default value: all;
    • -fields=[name|date|time|hint|size/packed-size/extent-size|all]: specifies which fields of the entries to be displayed; Default value: all+size;
    • -offset=[#]: specifies a positive number of bytes for the storage to offset; It must be the exact number which was input during the 'ssg new' command; Default value: 0;
  8. ssg delete: Delete an inner file/folder entry.
    • -stg=[file-path]: specifies a file path to the destination storage (Required*);
    • -entry=[ssg-path]: specifies an inner SSG path to the file/folder entry being deleted (Required*);
    • -offset=[#]: specifies a positive number of bytes for the storage to offset; It must be the exact number which was input during the 'ssg new' command; Default value: 0;
  9. ssg copy: Copy a folder tree branch from one storage to another.
    • -stgsrc=[file-path]: specifies a file path to the source storage (Required*);
    • -stgdst=[file-path]: specifies a file path to the destination storage (Required*);
    • -pathsrc=[ssg-path]: specifies an inner SSG path to the source folder entry being copied (Required*);
    • -pathdst=[ssg-path]: specifies an inner SSG path to the destination folder entry to accept content (Required*);
    • -offsrc=[#]: specifies a positive number of bytes for the source storage to offset; It must be the exact number which was input during the 'new' command; Default value: 0;
    • -offdst=[#]: specifies a positive number of bytes for the destination storage to offset; It must be the exact number which was input during the 'new' command; Default value: 0;
  10. ssg stat: Display a list of statistics info about the specified storage.
    • -stg=[file-path]: specifies a file path to a storage (Required*);
    • -offset=[#]: specifies a positive number of bytes for the storage to offset; It must be the exact number which was input during the 'new' command; Default value: 0;
  11. ssg digest: Calculate the digest value of the specified text info or file content.
    • -alg=[adler32|sha1|sha224|sha256|sha384|sha512]: specifies the digest algorithm (Required*);
    • -info=[text info]: specifies the text info to be calculated with the digest algorithm; The text info would need quotation marks if it contains blank spaces.
    • -fn=[file-path]: specifies a file path to an ordinary file to calculate checksum values;
  12. ssg copyfile: Copy content (or a portion) of an ordinary file to another one.
    • -fnsrc=[file-path]: specifies a file path to the source file being copied (Required*);
    • -fndst=[file-path]: specifies a file path to the destination file (Required*);
    • -offset=[#]: specifies a positive number of bytes as the starting point to copy;
    • -length=[#]: specifies a positive number of bytes to be copied to the destination file;
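The algorithm names accepted by 'ssg digest' are all available in Python's standard library, which makes it easy to cross-check a digest independently. The sketch below is not the tool's implementation; the digest helper is a hypothetical name, and the hexadecimal output format is an assumption, since the tool's own output format is not described here.

```python
import hashlib
import zlib

def digest(alg: str, data: bytes) -> str:
    """Return a hex digest for one of the algorithm names listed for 'ssg digest'."""
    if alg == "adler32":
        return f"{zlib.adler32(data):08x}"
    if alg in {"sha1", "sha224", "sha256", "sha384", "sha512"}:
        return hashlib.new(alg, data).hexdigest()
    raise ValueError(f"unsupported algorithm: {alg}")

text = b"hello world"
for alg in ("adler32", "sha1", "sha256"):
    print(alg, digest(alg, text))
```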

Recovery Tool

The recovery tool is simple to use. In most cases, you can simply type a command line like this in Terminal from the tool's directory:

./ssg5recover filename.nyf

The recovery tool explores as many possibilities as it can to recover data. If any file/folder entries are successfully retrieved, a new storage file is automatically generated in the same folder, and the retrieved entries are copied into the new storage.

Nevertheless, it's still recommended to back up important storage files; you cannot expect every file/folder entry to be restored to its original state from a given corrupted storage file.

The recovery tool accepts a few more parameters, which may help, in some cases, retrieve more file/folder entries from within a given corrupted storage file or rescued data slices.

  • -src=[file-path]: specifies a file path to a source storage file or data slice (Required*);
  • -dst=[file-path]: specifies a file path to the destination storage file;
  • -offset=[#]: specifies a positive number of bytes for the storage to offset within the source data slice file, for cases where the slice is only a part of the original storage file, for instance, when it was rescued from a crashed file system;
  • -minsize=[#]: specifies a positive number of bytes, your estimate of the minimal size of the original storage file;
  • -maxsize=[#]: specifies a positive number of bytes, your estimate of the maximal size of the original storage file;

Downloads

To download the SSG-5 command line tools (including the recovery tool), please click the link below:

SSG-5 command line (.zip package) for Windows/Linux/FreeBSD/MacOSX

The SSG-5 source code is currently not available to download. If you have any questions or comments, please contact us.