USRC Software

USRC developers contribute to a variety of open source projects.

Parallel Fine-grained Soft Error Fault Injector (P-FSEFI)

PFSEFI is a software fault injector that uses a virtual machine (VM) backend to inject emulated faults into running parallel applications. Users have advanced control over the injected faults, including support for a complex fault model. PFSEFI supports Docker to ease installation and deployment.

PFSEFI GitHub Page


MarFS

MarFS provides a scalable near-POSIX file system by using one or more POSIX file systems as a scalable metadata component and one or more data stores (object, file, etc.) as a scalable data component.

MarFS GitHub Page


GUFI

The Grand Unified File Index (GUFI) takes a new, hierarchical approach to storing file metadata, allowing rapid parallel searches across many internal databases.

GUFI GitHub Page


Charliecloud

Charliecloud provides user-defined software stacks (UDSS) for high-performance computing (HPC) centers.

Charliecloud GitHub Page


TensorFI

TensorFI is a TensorFlow Fault Injector (FI) that enables users to explore the resiliency of machine learning applications to soft errors.

TensorFI GitHub Page


PFTool

PFTool (Parallel File Tool) can stat, copy, and compare files in parallel.  PFTool is optimized for HPC workloads and uses MPI for message passing.

PFTool GitHub Page


fsstats

fsstats is a Python script that collects statistics about a file system hierarchy. It is used in several of the collections available under Data Sources. Credit for this tool goes to Marc Unangst, Panasas, DOE, SciDAC-PDSI, and CMU.

Download fsstats Now


Kraken

Kraken is a distributed state engine that can maintain state across a large set of computers. It was designed to provide full-lifecycle maintenance of HPC compute clusters, from cold boot to ongoing system state maintenance and automation.

Kraken GitHub Page