Running the photogrammetry workflow (original monolithic version)
Step-Based Workflow Recommended
A new step-based workflow with optimized resource allocation is now available and recommended for all new processing.
Benefits of step-based workflow:
- 60-80% reduction in GPU costs
- Individual step monitoring and debugging
- Configurable GPU vs CPU scheduling
- Disabled steps are completely skipped (no resource allocation)
This page documents the original monolithic workflow for reference only.
This guide describes how to run the original OFO photogrammetry workflow, which processes drone imagery using automate-metashape in a single monolithic container and performs post-processing steps.
Prerequisites
Before running the workflow, ensure you have:
- Installed and set up the `openstack` and `kubectl` utilities
- Installed the Argo CLI
- Added the appropriate type and number of nodes to the cluster (cluster-access-and-resizing.md#cluster-resizing)
- Set up your `kubectl` authentication env vars (part of the instructions for adding nodes). Quick reference:
source ~/venv/openstack/bin/activate
source ~/.ofocluster/app-cred-ofocluster-openrc.sh
export KUBECONFIG=~/.ofocluster/ofocluster.kubeconfig
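To confirm the authentication is working, a quick check (assuming the kubeconfig above) is to list the cluster nodes and any existing workflows:

kubectl get nodes
argo list -n argo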
Workflow overview
The workflow performs the following steps:
- Pulls raw drone imagery from `/ofo-share-2` onto the Kubernetes VM cluster
- Processes the imagery with Metashape
- Writes the imagery products to `/ofo-share-2`
- Uploads the imagery products to `S3:ofo-internal` and deletes them from `/ofo-share-2`
- Downloads the imagery products from S3 back to the cluster and performs postprocessing (CHMs, clipping, COGs, thumbnails)
- Uploads the final products to `S3:ofo-public`
Setup
1. Prepare inputs
Before running the workflow, you need to prepare three types of inputs on the cluster's shared storage:
- Drone imagery datasets (JPEG images)
- Metashape configuration files
- A config list file specifying which configs to process
All inputs must be placed in /ofo-share-2/argo-data/.
Directory structure
Here is a schematic of the /ofo-share-2/argo-data directory:
/ofo-share-2/argo-data/
└── argo-input/
    ├── datasets/
    │   ├── dataset_1/
    │   │   ├── image_01.jpg
    │   │   └── image_02.jpg
    │   └── dataset_2/
    │       ├── image_01.jpg
    │       └── image_02.jpg
    ├── configs/
    │   ├── config_dataset_1.yml
    │   └── config_dataset_2.yml
    └── config_list.txt
Add drone imagery datasets
To add new drone imagery datasets to be processed using Argo, transfer the files from your local machine (or the cloud) to the /ofo-share-2 volume. Place each drone imagery project to be processed in its own directory under /ofo-share-2/argo-data/argo-input/datasets.
One data transfer method is the scp command-line tool:
scp -r <local/directory/drone_image_dataset/> exouser@<vm.ip.address>:/ofo-share-2/argo-data/argo-input/datasets
Replace <vm.ip.address> with the IP address of a cluster node that has the share mounted.
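If rsync is available on both your machine and the VM, it is an alternative to scp that can resume interrupted transfers (a sketch using the same placeholder paths as above):

# -a preserves timestamps and permissions; --partial --progress lets you resume
# an interrupted transfer by re-running the same command.
rsync -a --partial --progress <local/directory/drone_image_dataset> \
  exouser@<vm.ip.address>:/ofo-share-2/argo-data/argo-input/datasets/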
Specify Metashape parameters
Metashape processing parameters are specified in configuration YAML files which need to be located at /ofo-share-2/argo-data/argo-input/configs/.
Every dataset to be processed needs to have its own standalone configuration file.
Naming convention: Config files should be named following the convention <config_id>_<datasetname>.yml. For example:
- 01_benchmarking-greasewood.yml
- 02_benchmarking-greasewood.yml
Setting the photo_path: Within each Metashape config.yml file, you must specify `photo_path`, which is the location of the drone imagery dataset to be processed. When running via Argo workflows, this path refers to the location of the images inside the Docker container. For example, if your drone images were uploaded to /ofo-share-2/argo-data/argo-input/datasets/dataset_1, the photo_path should be written as the path at which that directory appears inside the container.
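The exact prefix depends on how the workflow mounts the shared storage into the automate-metashape container, so treat the following as an illustrative sketch only; it assumes the share is visible at the same path inside the container as on the host (check the workflow manifest's volume mounts for the real prefix):

# Hypothetical example -- adjust the prefix to match the workflow's volume mounts
photo_path: /ofo-share-2/argo-data/argo-input/datasets/dataset_1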
Parameters handled by Argo: The output_path, project_path, and run_name configuration parameters are handled automatically by the Argo workflow:
- `output_path` and `project_path` are determined via the arguments passed to the automate-metashape container, which in turn are derived from the `RUN_FOLDER` workflow parameter passed when invoking `argo submit`
- `run_name` is pulled from the name of the config file (minus the extension) by the Argo workflow
Any values specified for these parameters in the config.yml will be ignored.
Create a config list file
We use a text file, for example config_list.txt, to tell the Argo workflow which config files should be processed in the current run. This text file should list the paths to each config.yml file you want to process (relative to /ofo-share-2/argo-data), one config file path per line.
For example:
argo-input/configs/01_benchmarking-greasewood.yml
argo-input/configs/02_benchmarking-greasewood.yml
argo-input/configs/01_benchmarking-emerald-subset.yml
argo-input/configs/02_benchmarking-emerald-subset.yml
This allows you to organize your config files in subdirectories or different locations. The dataset name will be automatically derived from the config filename (e.g., argo-input/configs/dataset-name.yml becomes dataset dataset-name).
You can create your own config list file and name it whatever you want, placing it anywhere within /ofo-share-2/argo-data/. Then specify the path to it (relative to /ofo-share-2/argo-data) using the CONFIG_LIST parameter when submitting the workflow.
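For example, a list covering every config under argo-input/configs/ could be generated on a machine that has the share mounted (a sketch; the config-lists/ subdirectory is just one possible location, matching the submit example below):

# Paths in the list must be relative to /ofo-share-2/argo-data
cd /ofo-share-2/argo-data
mkdir -p argo-input/config-lists
find argo-input/configs -name '*.yml' | sort > argo-input/config-lists/config_list.txt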
Submit the workflow
Once your cluster authentication is set up and your inputs are prepared, run:
argo submit -n argo photogrammetry-workflow.yaml \
-p CONFIG_LIST=argo-input/config-lists/config_list.txt \
-p RUN_FOLDER=gillan_june27 \
-p PHOTOGRAMMETRY_CONFIG_ID=01 \
-p S3_BUCKET_PHOTOGRAMMETRY_OUTPUTS=ofo-internal \
-p S3_BUCKET_POSTPROCESSED_OUTPUTS=ofo-public \
-p OUTPUT_DIRECTORY=jgillan_test \
-p BOUNDARY_DIRECTORY=jgillan_test \
-p WORKING_DIR=/argo-output/temp-working-dir \
-p POSTPROCESSING_IMAGE_TAG=latest
Database parameters (not currently functional):
-p DB_PASSWORD=<password> \
-p DB_HOST=<vm_ip_address> \
-p DB_NAME=<db_name> \
-p DB_USER=<user_name>
Workflow parameters
| Parameter | Description |
|---|---|
| `CONFIG_LIST` | Path to a text file listing paths to Metashape config files (all paths relative to `/ofo-share-2/argo-data`) |
| `RUN_FOLDER` | Name for the parent directory of the Metashape outputs (locally under `argo-data/argo-outputs` and at the top level of the S3 bucket). Example: `photogrammetry-outputs` |
| `PHOTOGRAMMETRY_CONFIG_ID` | Two-digit configuration ID (e.g., `01`, `02`) used to organize outputs into `photogrammetry_NN` subdirectories in S3 for both raw and postprocessed products. If not specified, both raw and postprocessed products are stored directly in `RUN_FOLDER` (no `photogrammetry_NN` subfolder) |
| `S3_BUCKET_PHOTOGRAMMETRY_OUTPUTS` | S3 bucket where raw Metashape products (orthomosaics, point clouds, etc.) are uploaded (typically `ofo-internal`). When `PHOTOGRAMMETRY_CONFIG_ID` is set, products are uploaded to `{bucket}/{RUN_FOLDER}/photogrammetry_{PHOTOGRAMMETRY_CONFIG_ID}/`; when not set, products go to `{bucket}/{RUN_FOLDER}/` |
| `S3_BUCKET_POSTPROCESSED_OUTPUTS` | S3 bucket for final postprocessed outputs and where boundary files are stored (typically `ofo-public`) |
| `OUTPUT_DIRECTORY` | Name of the parent folder where postprocessed products are uploaded. When `PHOTOGRAMMETRY_CONFIG_ID` is set, products are organized as `{OUTPUT_DIRECTORY}/{mission_name}/photogrammetry_{PHOTOGRAMMETRY_CONFIG_ID}/`; when not set, products go to `{OUTPUT_DIRECTORY}/{mission_name}/` |
| `BOUNDARY_DIRECTORY` | Parent directory where mission boundary polygons reside (used to clip imagery) |
| `WORKING_DIR` | Directory within the container used for downloading and postprocessing (typically `/tmp/processing`, which downloads data to the processing node; can be changed to a persistent volume) |
| `POSTPROCESSING_IMAGE_TAG` | Docker image tag for the postprocessing container (default: `latest`). Use a specific branch name or tag to test development versions (e.g., `dy-manila`) |
| `DB_*` | Database parameters for logging Argo status (not currently functional; credentials are in the OFO credentials document) |
Secrets configuration:
- S3 credentials: S3 access credentials, provider type, and endpoint URL are configured via the `s3-credentials` Kubernetes secret
- Agisoft license: The Metashape floating license server address is configured via the `agisoft-license` Kubernetes secret
These secrets should have been created (within the argo namespace) during cluster creation.
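To confirm they are present, list the secrets in the argo namespace:

kubectl get secrets -n argo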
Monitor the workflow
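The workflow can be monitored from the command line with the Argo CLI or through the Argo UI. A minimal CLI sketch (substitute the workflow name reported by argo submit):

argo list -n argo                           # all workflows and their current phase
argo get -n argo <workflow-name>            # per-step status for one workflow
argo logs -n argo <workflow-name> --follow  # stream logs from the workflow's pods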
Using the Argo UI
The Argo UI is great for troubleshooting and checking additional logs. Access it at argo.focal-lab.org, using the credentials from Vaultwarden under the record "Argo UI token".
Navigating the Argo UI
The Workflows tab on the left side menu shows all running workflows. Click a current workflow to see a schematic of the jobs spread across multiple instances:
Click on a specific job to see detailed information including which VM it is running on, the duration of the process, and logs:
A successful Argo run looks like this:
Workflow outputs
The final outputs will be written to S3:ofo-public in the following directory structure:
/S3:ofo-public/
└── <OUTPUT_DIRECTORY>/
    ├── dataset1/
    │   ├── images/
    │   ├── metadata-images/
    │   ├── metadata-mission/
    │   │   └── dataset1_mission-metadata.gpkg
    │   ├── photogrammetry_01/
    │   │   ├── full/
    │   │   │   ├── dataset1_cameras.xml
    │   │   │   ├── dataset1_chm-ptcloud.tif
    │   │   │   ├── dataset1_dsm-ptcloud.tif
    │   │   │   ├── dataset1_dtm-ptcloud.tif
    │   │   │   ├── dataset1_log.txt
    │   │   │   ├── dataset1_ortho-dtm-ptcloud.tif
    │   │   │   ├── dataset1_points-copc.laz
    │   │   │   └── dataset1_report.pdf
    │   │   └── thumbnails/
    │   │       ├── dataset1_chm-ptcloud.png
    │   │       ├── dataset1_dsm-ptcloud.png
    │   │       ├── dataset1_dtm-ptcloud.png
    │   │       └── dataset1-ortho-dtm-ptcloud.png
    │   └── photogrammetry_02/
    │       ├── full/
    │       │   ├── dataset1_cameras.xml
    │       │   ├── dataset1_chm-ptcloud.tif
    │       │   ├── dataset1_dsm-ptcloud.tif
    │       │   ├── dataset1_dtm-ptcloud.tif
    │       │   ├── dataset1_log.txt
    │       │   ├── dataset1_ortho-dtm-ptcloud.tif
    │       │   ├── dataset1_points-copc.laz
    │       │   └── dataset1_report.pdf
    │       └── thumbnails/
    │           ├── dataset1_chm-ptcloud.png
    │           ├── dataset1_dsm-ptcloud.png
    │           ├── dataset1_dtm-ptcloud.png
    │           └── dataset1-ortho-dtm-ptcloud.png
    └── dataset2/
This directory structure should already exist prior to running the Argo workflow.