Student Research Symposiums
The student research symposium shows off our interns' findings from the summer.
Overview
USRC hosts a summer research symposium showcasing the work of summer interns, full-time staff, and PIs. Please join us to see the work in progress (LANL badge required; external collaborators may contact usrc-contact@lanl.gov for assistance).
Summer Research Symposium 2019
- USRC 2019 Summer Symposium Program
- 2019 High Performance Computing Intern Mini-Showcase (related)
Topics covered included:
- Deep I/O: Smart Networks for Fast Storage
- Improving SaNSA: Spark Integration and Anomaly Detection in HPC State Analysis
- Shortening Hamming Codes to Better Correct 2-bit Errors
- Differential Privacy for Supercomputer Sensor Data
- Profiling HPC Application Resilience using DisCVar
- Examining Contextual Based Error Correction Techniques in CLAMR
- Performance Characterization of DRAM-NVM Hybrid Memory Architecture for HPC Applications using Intel Optane DC Persistent Memory Modules
- Algorithm Learning with the Diagonal Neural GPU
- Revere: HPC Job Failure Early Alert
- FI-VIS: Towards Understanding Fault Propagation through Visualization
- In-Situ Partitioning for Range Queries
- Tiered Stripeset: Data Availability During Failure Bursts
- Providing order to the world: Range query for KV-SSD
- Petavision: Interpolating Video and Up-Sampling Simulations
- Analyzing Excessive Memory Faults on Trinity and Trinitite
- KrakenBoot: Firmware-Level Cluster Provisioning via UEFI Surgery
Photos from this event can be found here.
Summer Research Symposium 2018
Topics covered included:
- Node resilience analysis on HPC clusters via state transition from multiple fused data sources
- Compiler-based intelligent data placement for heterogeneous memory architectures to improve performance and resilience: a foundation for automated placement of data
- Investigation of dynamic routing protocols to route network traffic between compute and IO nodes within a cluster
- Scalable in-situ indexing mechanism for manycore platforms
- Assessing Ansible to replace core components of the HPC software stack
- Software engineering, debugging, testing, and enhancements to the DECAF-FSEFI software fault injection system to improve its functionality
- Using machine learning for system log analysis applied to clusters
- Building Monte Carlo simulations, using MCNP6, of neutron scattering in HPCs to be used in the determination of error rates due to cosmic radiation
- Improving IO and recovery performance of storage systems using declustered RAID in ZFS
- Software development for controlling power to nodes in the Raspberry Pi cluster
- Enhancement of system log analysis by identifying the most likely source code origin of each message via cloud search tools
- Comparing the explanations of LIME and an authentic approach using internal structures of a random forest developed at LANL called “LogAn” for the purpose of determining the potential and reliability of quantitative automatically-generated explanations.
- Analysis of context-sensitive correction with error correction codes applied to adaptive mesh refinement application
- Creating fast storage endpoints for the next generation of HPC systems
- Applying machine learning to supercomputer telemetry data
- Evaluating storage system performance impacts of hardware acceleration
- Parallel simulation of mosquito-borne diseases using highly-asynchronous programming model based on OpenSHMEM
- Using CI/CD tools to automate a cluster installation
- Using Raspberry Pis and Amazon Web Services to test the launching of OpenMPI across clusters
- Pioneering the testing of the Open Build Service which will potentially be used to create dedicated clusters to build packages and virtual machine images in the future
- Applications of deep neural networks to high performance computing
Photos from this event can be found here.
Summer Research Symposium 2017
Topics covered included:
- Node resilience analysis on HPC clusters via state transition from multiple fused data sources
- Compiler-based intelligent data placement for heterogeneous memory architectures to improve performance and resilience: a foundation for automated placement of data
- Anomaly Detection in System Logs
- Burst Buffer Simulation In Dragonfly Network
- Radiation Induced Effects on HPC Hardware
- Reining in Complex Memory
- DeltaFS - Parallel In-situ Data Indexing
- Error Injection with a DIMM Fault Injector
- Inexact Computing for Pre-Post-Moore's and Post-Moore's Time Frames
- Effects of Cosmic Ray Neutrons on Modern HPC Components
- To Share or Not To Share: Comparing Burst Buffer Architecture
- Soft Error Resilience and Failure Recovery for Continuum Dynamics Applications
- Metadata Load Balancing and Key-Value Stores
- Probabilistic Graphical Models for DRAM Faults
- Interpretable Context-Aware Anomaly Detection for High Performance Computing Systems Monitoring Syslog
- Multi-scale Analysis of Resilience and Energy Efficiency for Scientific Applications
- Temporal and Spatial Analysis of SEDC Energy Data
- Overview of USRC, the Ultrascale Systems Research Center
Photos from this event can be found here.