Workshops & Courses
W-4
Workshop
Emerging Applications of Large Language Models in Ecology and Conservation Science
Saturday, June 27th, 2026
Organizer(s):
Christos Mammides, Xishuangbanna Tropical Botanical Garden
Description
Workshop Overview and Objectives
Large Language Models (LLMs) are rapidly becoming integral components of ecological and conservation research workflows. Their ability to process, extract, organise, and synthesise information at scale offers new opportunities for accelerating scientific inquiry and supporting evidence-based decision making. LLMs are now capable of assisting with tasks such as literature synthesis, structured data extraction, coding and analytical support, biodiversity data interpretation, and the analysis of conservation policy documents. As these tools become more widely adopted, the research community faces an urgent need for structured training that emphasises scientific rigour, transparency, reproducibility, and responsible use. Without clear guidance, the risk of misapplication, overreliance, or methodological errors increases substantially.
This workshop is designed to introduce participants to the emerging capabilities of LLMs and to demonstrate how these tools can be incorporated into ecological research in a rigorous and methodologically sound manner. Through a combination of structured instruction and hands-on group exercises, participants will gain both conceptual understanding and practical experience with LLM-supported research workflows. The workshop is suitable for researchers at all career stages, and no prior experience with artificial intelligence or machine learning is required.
By the end of the session, participants will be able to:
a) Design effective, domain-specific prompts tailored to ecological research tasks.
b) Apply LLMs to core analytical and interpretive activities, including evidence extraction and synthesis.
c) Evaluate the accuracy, consistency, and reliability of model-generated outputs using multiple verification strategies.
d) Identify and mitigate risks associated with hallucinations, bias, reproducibility challenges, and ethical considerations.
The workshop places strong emphasis on enabling researchers to integrate LLMs into their workflows responsibly and with appropriate methodological safeguards. All materials used during the workshop are drawn from publicly available scientific literature, and no personal or sensitive data will be processed.
Program Outline
The workshop consists of two main parts:
Part I: Introduction and Critical Discussion (~40 minutes)
This session provides an overview of current and emerging applications of LLMs in ecology and conservation science, drawing on the review by Mammides et al. (Preprint). It covers applications including scientific writing support, coding and analytical assistance, evidence synthesis, improved biodiversity monitoring, and policy analysis.
The session also discusses important technical and ethical challenges related to LLM use, such as hallucination risks, algorithmic bias, reproducibility issues, disparities in access to computational resources, and environmental costs. It concludes by exploring potential solutions, emphasising prompt-engineering principles and how clarity, specificity, and context influence model behaviour.
Part II: Extended Hands-On Exercise (~110 minutes in total)
This section of the workshop offers an extended practical session in which participants apply LLMs to an ecological literature-extraction task. It allows in-depth exploration of prompt development, model behaviour, and validation approaches.
Step 1: Prompt Design and Application (~45 minutes)
In small groups (3-4 people), participants will develop a clear, structured prompt to extract predefined categories of information from ecological research articles. Typical fields to be extracted include, among others, focal taxa, geographic location in standardised format, biodiversity threats, and conservation status. Groups will refine their prompts through guided discussion, analysing how prompt structure, phrasing, and constraints affect model outputs.
After finalising their prompts, groups will apply them to a curated set of eight to ten ecological research papers (provided in both PDF and plain-text formats to minimise upload delays). Participants will use browser-based or desktop LLM tools for extraction. This format enables groups to test different model settings, identify error patterns, and compare outputs across documents.
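A structured extraction prompt of this kind can be sketched in Python. The field names, descriptions, and JSON-output convention below are illustrative assumptions for this sketch, not the workshop's exact template; groups will define their own fields during the exercise:

```python
import json

# Illustrative extraction fields (assumed for this sketch).
FIELDS = {
    "focal_taxa": "the focal species or higher taxa studied",
    "location": "the study location, in a standardised 'Country: Site' form",
    "threats": "biodiversity threats reported, as a list",
    "conservation_status": "conservation status of the focal taxa, if stated",
}

def build_prompt(article_text: str) -> str:
    """Assemble a structured prompt: task description, field definitions,
    an explicit output-format constraint, then the article text."""
    field_lines = "\n".join(f"- {name}: {desc}" for name, desc in FIELDS.items())
    return (
        "Extract the following fields from the ecological research article "
        "below. Reply with a single JSON object containing exactly these "
        "keys; use null for any field not stated in the text.\n\n"
        f"Fields:\n{field_lines}\n\n"
        f"Article:\n{article_text}"
    )

def parse_response(raw: str) -> dict:
    """Parse the model's reply and check every expected key is present."""
    data = json.loads(raw)
    missing = set(FIELDS) - set(data)
    if missing:
        raise ValueError(f"response missing fields: {sorted(missing)}")
    return data
```

Constraining the reply to a fixed JSON schema makes outputs easy to compare across papers, models, and groups, which is what the later validation step relies on.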
Step 2: Validation and Verification (~35 minutes)
Groups will conduct a structured evaluation of the extracted outputs using several complementary approaches:
Direct comparison between the extracted information and the original text;
Self-consistency testing: repeating prompts to assess output stability;
Self-verification: asking the model to evaluate the accuracy or completeness of its own extraction; and
Cross-group comparison: examining differences arising from prompt structure, model choice, or parameter settings.
These activities will help participants develop a detailed understanding of where extraction succeeds, where it fails, and why.
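The self-consistency check above can be sketched as a simple field-wise agreement score over repeated runs. The toy data below stands in for real repeated extractions; in practice the runs would come from re-submitting the same prompt to the model:

```python
from collections import Counter

def field_agreement(runs):
    """For each extracted field, report the share of repeated runs that
    match the modal (most common) value; 1.0 means the output for that
    field was fully stable across runs."""
    scores = {}
    for field in runs[0]:
        # repr() makes unhashable values such as lists countable.
        values = [repr(run.get(field)) for run in runs]
        top_count = Counter(values).most_common(1)[0][1]
        scores[field] = top_count / len(runs)
    return scores
```

For example, three repeated extractions that agree on the focal taxon but disagree once on the threat list would score 1.0 for the taxon field and about 0.67 for the threats field, flagging the latter for manual checking against the source text.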
Step 3: Group Synthesis and Discussion (~30 minutes)
Groups will present their findings and compare the performance of different prompt designs. A facilitated discussion will synthesise common patterns, highlight effective strategies, and identify limitations encountered during the exercise. The session will conclude with a set of recommended best practices for using LLMs in ecological evidence extraction and related research tasks.
To ensure continuity in the event of technical disruptions, pre-generated outputs will be available.
Materials that participants need to bring:
Participants should bring their own laptop. They will also need access to DeepSeek (https://www.deepseek.com/). They may use either the DeepSeek desktop application or the browser-based interface. Please note that access requires prior registration (free of charge).


