Building an input prey dataset

Use food habits data from Laurel’s condition repository, object is called allfh

load(url("https://github.com/Laurels1/Condition/raw/master/data/allfh.RData"))

Range of years in allfh :

1973, 1974, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989, 1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, NA, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020

Following data prep in Ng et al. (2021). First collapsed to first and second half of year, combining winter and summer surveys with spring and fall to expand datasets.

Want to compute the average mass of each prey category for combined predators per tow. Predator-specific covariates need to also be averaged by tow (size, Nspecies, evenness?).

First explore, how many predators per tow, how many stomachs each, how many prey in each predator? Filter to piscivores first.

# code for piscivore table
piscivores <- ecodata::species_groupings %>%
  select(COMNAME, SizeCat, Garrison.Link) %>%
  filter(!is.na(Garrison.Link),
         Garrison.Link == "Piscivores") %>%
  mutate(PiscGuild = case_when(COMNAME == "WINTER SKATE" ~ "c",
                               COMNAME == "WEAKFISH" ~ "b", 
                               COMNAME == "BLUEFISH" & SizeCat == "S" ~ "b",
                               TRUE ~ "a")) %>%
  distinct()

# same as Scott's?
guild_dat <- read.csv("fhdat/predator_guilds.csv", stringsAsFactors = FALSE)

pisc2 <- guild_dat %>%
  filter(guild %in% "piscivore")

# no, typo in Scott's BSB should not be a piscivore and bluefish S and M missing

rm(pisc2, guild_dat)

# food habits 1973-2020 piscivores only; include empties

fh.nefsc.pisc <- allfh %>%
  #filter(pynam != "EMPTY") %>%
  left_join(piscivores, by = c("pdcomnam" = "COMNAME",
                               "sizecat" = "SizeCat")) %>%
  filter(!is.na(Garrison.Link))

 preycount <- fh.nefsc.pisc %>%
   #group_by(year, season, pdcomnam, pynam) %>%
   group_by(pdcomnam, pynam) %>%
   summarise(count = n()) %>%
   #arrange(desc(count))
   pivot_wider(names_from = pdcomnam, values_from = count)

How many times in the full dataset was each prey name observed per predator?

datatable(preycount, rownames = FALSE,
          extensions = c("FixedColumns"),
          #caption = "Prey observed in piscivores guild (Garrison and Link 2000) 1973-2020",
          options = list(pageLength = 25,
                         scrollX = TRUE,
                         fixedColumns = list(leftColumns = 1)))

Start by aggregating all pelagic nekton prey that have at least 10 observations for bluefish (leaving out empty, fish, osteichthes, AR, blown):

gencomlist <- fh.nefsc.pisc %>%
  select(pynam, gencom2) %>%
  distinct()

blueprey <- preycount %>%
  filter(BLUEFISH > 9) %>%
  filter(!pynam %in% c("EMPTY", "BLOWN",
                       "FISH", "OSTEICHTHYES",
                       "ANIMAL REMAINS",
                       "FISH SCALES")) %>%
  #filter(!str_detect(pynam, "SHRIMP|CRAB")) %>%
  left_join(gencomlist) %>%
  filter(!gencom2 %in% c("ARTHROPODA", "ANNELIDA",
                         "CNIDARIA", "UROCHORDATA",
                         "ECHINODERMATA", "WORMS",
                         "BRACHIOPODA", "COMB JELLIES",
                         "BRYOZOA", "SPONGES",
                         "MISCELLANEOUS", "OTHER"))
datatable(blueprey, rownames = FALSE,
          extensions = c("FixedColumns"),
          #caption = "Bluefish prey in all piscivores 1973-2020",
          options = list(pageLength = 25,
                         scrollX = TRUE,
                         fixedColumns = list(leftColumns = 1)))

This gives us 20 individual prey names.

We should compare this list with NEAMAP bluefish prey to see if any are missing.

How many tows with piscivore diets had these preynames?

fh.nefsc.pisc.blueprey <- fh.nefsc.pisc %>%
  mutate(blueprey = case_when(pynam %in% blueprey$pynam ~ "blueprey",
                              TRUE ~ "othprey"))

preystn <- fh.nefsc.pisc.blueprey %>%
  group_by(year, season, station) %>%
  count(blueprey) %>%
  pivot_wider(names_from = blueprey, values_from = n) 

#dim(preystn)[1]

bluepreystn <- preystn %>% 
  arrange(desc(blueprey)) %>%
  filter(!is.na(blueprey))

#dim(bluepreystn)[1]

Between 1973 and 2020 we have 24820 individual tows where piscivores were collected and 8510 of those tows had bluefish prey. Therefore, 34.2868654 percent of piscivore tows will be used to get the aggregate prey index.

Which years and seasons have bluefish prey observed? How many tows each?

# coverage <- bluepreystn %>%
#   ungroup() %>%
#   select(year, season) %>%
#   distinct() %>%
#   arrange(year, desc(season))

coverage <- bluepreystn %>%
  group_by(year, season) %>%
  count(station) %>%
  summarize(nstation = sum(n))

datatable(coverage, rownames = FALSE,
          #extensions = c("FixedColumns"),
          caption = "Piscivore stations with bluefish prey 1973-2020",
          options = list(pageLength = 10,
                         autoWidth = TRUE,
                         columnDefs = list(list(width = '200px'))
                         )
          )

(Note that the fall 2020 surveys were cancelled, and spring 2020 was partial due to pandemic restrictions.)

No year/season combinations have 0 stations with aggregated bluefish prey. Summarizing the number of stations per year/season with bluefish prey, 1973-2020:

summary(coverage$nstation)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    3.00   36.25   65.00   67.54   88.75  158.00

So we can aggregate over all these prey in all predator stomachs at a particular station to get mean bluefish prey weight per predator stomach. Calculate variance in prey weight too, n prey species, prey diversity index? The catchability covariates at each station could be number of predators, number of predator species, diversity/evenness of predator species, mean predator length, variance predator length. The habitat covariates at each station could be temperature, bottom depth?

bluepyall_stn <- fh.nefsc.pisc.blueprey %>%
  #create id linking cruise6_station
  #create season_ng spring and fall Spring=Jan-May, Fall=June-Dec
  mutate(id = paste0(cruise6, "_", station),
         year = as.numeric(year),
         month = as.numeric(month),
         season_ng = case_when(month <= 5 ~ "SPRING",
                               month >= 6 ~ "FALL",
                               TRUE ~ as.character(NA))
         ) %>%
  select(year, season_ng, id, 
         pynam, pyamtw, pywgti, pyvoli, blueprey, 
         pdcomnam, pdid, PiscGuild, pdlen, pdsvol, pdswgt, 
         beglat, beglon, declat, declon, 
         bottemp, surftemp, setdepth) %>%
  group_by(id) %>%
  #mean blueprey g per stomach per tow: sum all blueprey g/n stomachs in tow
  mutate(bluepywt = case_when(blueprey == "blueprey" ~ pyamtw,
                               TRUE ~ 0.0),
         bluepynam = case_when(blueprey == "blueprey" ~ pynam,
                               TRUE ~ NA_character_)) 

stndat <- bluepyall_stn %>%
  select(year, season_ng, id, 
         beglat, beglon, declat, declon, 
         bottemp, surftemp, setdepth) %>%
  distinct()

#pisc stomachs in tow count pdid for each pred and sum
piscstom <- bluepyall_stn %>%
  group_by(id, pdcomnam) %>%
  summarise(nstompd = n_distinct(pdid)) %>%
  group_by(id) %>%
  summarise(nstomtot = sum(nstompd))

#mean and var pred length per tow
pisclen <- bluepyall_stn %>%
  summarise(meanpisclen = mean(pdlen),
            varpisclen = var(pdlen))

bluepyagg_stn <- bluepyall_stn %>%
  summarise(sumbluepywt = sum(bluepywt),
            nbluepysp = n_distinct(bluepynam, na.rm = T),
            npreysp = n_distinct(pynam),
            npiscsp = n_distinct(pdcomnam)) %>%
  left_join(piscstom) %>%
  mutate(meanbluepywt = sumbluepywt/nstomtot) %>%
  left_join(pisclen) %>%
  left_join(stndat)

Some tests to ensure calcs look right, station data and summary:

datatable(bluepyall_stn %>% filter(id=="197303_134"),
          options= list(pageLength = 25,
                         scrollX = TRUE))
datatable(bluepyagg_stn %>% filter(id=="197303_134"),
          options= list(pageLength = 25,
                         scrollX = TRUE))
datatable(bluepyall_stn %>% filter(id=="197303_140"),
          options= list(pageLength = 25,
                         scrollX = TRUE))
datatable(bluepyagg_stn %>% filter(id=="197303_140"),
          options= list(pageLength = 25,
                         scrollX = TRUE))

Then look at summary stats of prey per tow, predators per tow by season each year.

sumstats <- bluepyagg_stn %>%
  group_by(year, season_ng) %>%
  summarise(nstn = n(),
            meanpreyn = mean(npreysp),
            meanbluepreyn = mean(nbluepysp),
            meanpiscn = mean(npiscsp),
            meanstomn = mean(nstomtot))

datatable(sumstats)

Save the input data for the initial aggregated prey index VAST:

saveRDS(bluepyagg_stn, here("fhdat/bluepyagg_stn.rds"))

Suggestions from Bluefish WG review 10 Dec

Consider truncating the time series could start in 1985 similar to bluefish assessment.

Consider leaving out more dissimilar predators, especially really different life history types and habitat/foraging mode. First to go may be the winter skate which was in a different subgroup from bluefish. Second could be split out small bluefish and weakfish group from Garrison and Link (2000).

Flag the other managed species that are bluefish prey abd who manages them.

Do we want to include inverts among the bluefish prey? ID some important ones.

What covariates could be used to describe differences in foraging efficiency between predators? Possibly predator length as in Ng et al. (2021), but think about other predator attributes.

Next steps

After the first run with all prey aggregated we can explore possible prey categories:

Presence of family level categories (Cephalopoda, Engraulidae) makes habitat splitting difficult. Do some dynamic mapping if engraulidae mostly with anchoa make inshore? If cephalapods with loligo make inshore; Illex offshore? Circular though.

References

Garrison, L., and Link, J. 2000. Dietary guild structure of the fish community in the Northeast United States continental shelf ecosystem. Marine Ecology Progress Series, 202: 231–240. http://www.int-res.com/abstracts/meps/v202/p231-240/ (Accessed 22 October 2018).
Ng, E. L., Deroba, J. J., Essington, T. E., Grüss, A., Smith, B. E., and Thorson, J. T. 2021. Predator stomach contents can provide accurate indices of prey biomass. ICES Journal of Marine Science, 78: 1146–1159. https://doi.org/10.1093/icesjms/fsab026 (Accessed 1 September 2021).