Introduction

This page has visualizations for the CCA model example, Atlantis Summit Common Scenario 1. For full explanation of methods, see the file linked at the beginning of each section.

Simulate a survey part 4: sample for length composition

Full methods are explained here.

The following settings should achieve a survey that samples all Atlantis model output timesteps, all fish and shark species, and all model polygons, with perfect efficiency and full selectivity for all ages:

# should return a perfectly scaled survey 
effic1 <- data.frame(species=funct.group.names,
                     efficiency=rep(1.0,length(funct.group.names)))

# should return all lengths fully sampled (Atlantis output is 10 age groups per spp)
selex1 <- data.frame(species=rep(funct.group.names, each=10),
                     agecl=rep(c(1:10),length(funct.group.names)),
                     selex=rep(1.0,length(funct.group.names)*10))

# should return all model areas
boxpars <- load_box(d.name, box.file)
boxall <- c(0:(boxpars$nbox - 1))

# generalized timesteps all models
runpar <- load_runprm(d.name, run.prm.file)
noutsteps <- runpar$tstop/runpar$outputstep
stepperyr <- if(runpar$outputstepunit=="days") 365/runpar$toutinc

timeall <- c(0:noutsteps)
  
# define set of species we expect surveys to sample (e.g. fish only? vertebrates?)
# for ecosystem indicator work test all species, e.g.
survspp <- funct.group.names 

# for length and age groups lets just do fish and sharks
# NOBA model has InvertType, changed to GroupType in file, but check Atlantis default
if(initNOBA) funct.groups <- rename(funct.groups, GroupType = InvertType)

survspp <- funct.groups$Name[funct.groups$IsTurnedOn==1 &
                           funct.groups$GroupType %in% c("FISH", "SHARK")]

Here we use create_survey on the numbers output of run_truth to create the survey census of age composition.

survey_testNall <- create_survey(dat = truth$nums,
                                 time = timeall,
                                 species = survspp,
                                 boxes = boxall,
                                 effic = effic1,
                                 selex = selex1)

# consider saving this interim step if it takes a long time go generate

Then we use the sample_fish function with very high effN and compare to true annual age comp calculated above as a test, which matches as we have shown here.

The following gets both numbers and weights to get the length comp, and ultimately the weight at age for use in assessments.

First we subset the nums, resn, and structn components of run_truth using the same create_survey design selecting boxes, time, and species. We need only apply survey efficiency and selectivity to nums, however (done above), so we apply aggregateDensityData here.

# aggregate true resn per survey design
survey_aggresnall <- aggregateDensityData(dat = truth$resn,
                                 time = timeall,
                                 species = survspp,
                                 boxes = boxall)

# aggregate true structn per survey design
survey_aggstructnall <- aggregateDensityData(dat = truth$structn,
                                 time = timeall,
                                 species = survspp,
                                 boxes = boxall)

Now we should have inputs to sample_fish on the same scale, and they need to be aggregated across boxes into a single biological sample for the whole survey. We are not maintaining spatial structure in sampling because it isn’t used in most assessments.

To do the proper aggregation and not apply the multinomial sampling to the density data, I rewrote sample_fish to apply the median if sample=FALSE in the function call.

# this one is high but not equal to total for numerous groups
effNhigh <- data.frame(species=survspp, effN=rep(1e+8, length(survspp)))

# apply default sample fish as before to get numbers
numsallhigh <- sample_fish(survey_testNall, effNhigh)

#dont sample these, just aggregate them using median
structnall <- sample_fish(survey_aggstructnall, effNhigh, sample = FALSE)

resnall <-  sample_fish(survey_aggresnall, effNhigh, sample = FALSE)

Select a single species for testing. Generating length composition for a single species should take just a few minutes to run.

# this should still represent a census but with polygon and layer aggregated
# WARNING: selecting a species by name is hardcoded for a particular model

ss_numsallhigh <- numsallhigh[numsallhigh$species == "Yelloweye_rockfish",]
ss_structnall <- structnall[structnall$species == "Yelloweye_rockfish",]
ss_resnall <- resnall[resnall$species == "Yelloweye_rockfish",]

ss_length_censussurvsamp <- calc_age2length(structn = ss_structnall,
                                 resn = ss_resnall,
                                 nums = ss_numsallhigh,
                                 biolprm = truth$biolprm, fgs = truth$fgs,
                                 CVlenage = 0.1, remove.zeroes=TRUE)

This gives annual length comps:

Run for all species then SAVE

This code block can take hours (5+ for CCA) to run, so be prepared, but then the saved file can be read back in to have length comps for all surveyed species.

length_censussurvsamp <- calc_age2length(structn = structnall,
                                 resn = resnall,
                                 nums = numsallhigh,
                                 biolprm = truth$biolprm, fgs = truth$fgs,
                                 CVlenage = 0.1, remove.zeroes=TRUE)

#save for later use, takes a long time to generate
saveRDS(length_censussurvsamp, file.path(d.name, paste0(scenario.name, "length_censussurvsamp.rds")))

Read it back in:

length_censussurvsamp <- readRDS(file.path(d.name, paste0(scenario.name, "length_censussurvsamp.rds")))

Demo plots for other species:

ss_length_censussurvsamp <- length_censussurvsamp$natlength %>%
  filter(species == "Herring")

lfplot <- ggplot(ss_length_censussurvsamp, aes(upper.bins)) +
  geom_bar(aes(weight = atoutput)) +
  theme_tufte()

lfplot + facet_wrap_paginate(~time, ncol=4, nrow = 4, page = 1, scales="free_y")

lfplot + facet_wrap_paginate(~time, ncol=4, nrow = 4, page = 2, scales="free_y")

lfplot + facet_wrap_paginate(~time, ncol=4, nrow = 4, page = 3, scales="free_y")

lfplot + facet_wrap_paginate(~time, ncol=4, nrow = 4, page = 4, scales="free_y")

lfplot + facet_wrap_paginate(~time, ncol=4, nrow = 4, page = 5, scales="free_y")

lfplot + facet_wrap_paginate(~time, ncol=4, nrow = 4, page = 6, scales="free_y")

lfplot + facet_wrap_paginate(~time, ncol=4, nrow = 4, page = 7, scales="free_y")

So this looks like our best approximation of a true length comp for a particular Atlantis run. Comparisons with standard surveys next.