# Introduction

## Overview

This documentation is a design guide for building a **scalable, secure, and reproducible High-Performance Computing (HPC) environment** on Amazon Web Services (AWS), tailored to **genomic Nextflow workflows**.

The infrastructure is built on **AWS ParallelCluster** and managed as Infrastructure as Code (IaC) with **Terraform**. The design covers:

* **Production-grade genomic variant discovery pipeline**
* **AWS ParallelCluster with SLURM scheduler**
* **CPU and GPU partitions for optimized workload execution**
* **Nextflow workflow orchestration**
* **Spack + Lmod for software management**
* **FSx for Lustre and EFS for high-performance storage**
* **Wazuh security monitoring on ECS Fargate**
* **Prometheus/Grafana observability stack**
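
As a rough illustration of how the SLURM scheduler, the CPU and GPU partitions, and Nextflow fit together, a Nextflow configuration might route processes along the following lines. This is a minimal sketch, not the project's actual configuration: the partition names `cpu` and `gpu`, the `gpu` process label, and the `--gres=gpu:1` request are assumptions chosen for illustration.

```groovy
// nextflow.config (illustrative sketch; partition and label names are assumptions)
process {
    // Submit every process as a SLURM job
    executor = 'slurm'

    // Hypothetical default partition for CPU-bound steps
    queue = 'cpu'

    // Processes labeled 'gpu' are sent to a hypothetical GPU partition
    withLabel: 'gpu' {
        queue = 'gpu'
        // Request one GPU per task; assumes GRES is configured on the cluster
        clusterOptions = '--gres=gpu:1'
    }
}
```

With a setup like this, a process would opt in by declaring `label 'gpu'` in its definition, and every other process would fall through to the default partition.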

## Quick Navigation

| Section                                                                                                     | Description                                        |
| ----------------------------------------------------------------------------------------------------------- | -------------------------------------------------- |
| [Project Overview](/genomics-nf-hpc-on-aws-parallelcluster/getting-started/01-project-overview.md)          | Objectives, design principles, and target audience |
| [System Architecture](/genomics-nf-hpc-on-aws-parallelcluster/getting-started/02-architecture.md)           | Component breakdown and deployment model           |
| [Technology Stack](/genomics-nf-hpc-on-aws-parallelcluster/design-and-implementation/03-tech-stack.md)      | Compute, storage, and software decisions           |
| [Terraform Provisioning](/genomics-nf-hpc-on-aws-parallelcluster/design-and-implementation/04-terraform.md) | Infrastructure as Code setup                       |
| [Workflow Design](/genomics-nf-hpc-on-aws-parallelcluster/design-and-implementation/05-workflow.md)         | Nextflow pipeline execution flow                   |
| [Security & Observability](/genomics-nf-hpc-on-aws-parallelcluster/operations/06-security-observability.md) | Wazuh, Prometheus, and Grafana integration         |
| [Cost Optimization](/genomics-nf-hpc-on-aws-parallelcluster/operations/07-cost-optimization.md)             | Strategies for minimizing total cost of ownership  |
| [Conclusion](/genomics-nf-hpc-on-aws-parallelcluster/appendix/08-conclusion.md)                             | Summary and future enhancements                    |
| [References](/genomics-nf-hpc-on-aws-parallelcluster/appendix/09-references.md)                             | External resources and citations                   |
| [Developer Guidance](/genomics-nf-hpc-on-aws-parallelcluster/guides/10-developer-guidance.md)               | SSM access, SLURM, and GPU management              |
| [Troubleshooting](/genomics-nf-hpc-on-aws-parallelcluster/guides/11-troubleshooting-gpu.md)                 | GPU and CUDA issue resolution                      |
| [Validation Checklist](/genomics-nf-hpc-on-aws-parallelcluster/guides/12-validation-checklist.md)           | Post-deployment verification steps                 |

## Getting Started

1. Review the [Project Overview](/genomics-nf-hpc-on-aws-parallelcluster/getting-started/01-project-overview.md) to understand the goals.
2. Study the [System Architecture](/genomics-nf-hpc-on-aws-parallelcluster/getting-started/02-architecture.md) to learn how the components fit together.
3. Follow the [Terraform Provisioning](/genomics-nf-hpc-on-aws-parallelcluster/design-and-implementation/04-terraform.md) guide to deploy the infrastructure.
4. Use the [Validation Checklist](/genomics-nf-hpc-on-aws-parallelcluster/guides/12-validation-checklist.md) to verify the deployment.

## Repository Structure

```
hpc-genomics-nf/
├── README.md            # Project overview (this file)
├── docs/                # GitBook documentation
├── terraform/           # Infrastructure as Code
├── nextflow/            # Pipeline definitions
└── modules/             # Reusable components
```

