1. What is the eRisk lab's focus?
The eRisk lab focuses on the early detection of mental disorder risk from social media data. It started in 2017 with a pilot task on detecting depression and has organized tasks every year since, expanding to other conditions such as eating disorders, pathological gambling, and self-harm. The current task involves retrieving and ranking social media posts that express the depression symptoms covered by the BDI-II questionnaire. The proposed method generates synthetic Reddit posts resembling BDI-II responses to add diversity to the data and improve retrieval of relevant sentences.
2. How can LLMs be used for mental health assessment?
Recent work has evaluated large language models (LLMs) for mental health assessment in three main ways: zero-shot classification, emotional reasoning, and data generation and augmentation. Yang et al. compared ChatGPT with three supervised baselines and found that, although ChatGPT achieved good results in a zero-shot classification setting, it lagged behind specialized transformer-based models on downstream tasks such as identifying suicide risk and depression from social media data. Amin et al. performed an interpretable mental health analysis through emotional reasoning, using ChatGPT on 11 datasets across 5 tasks related to depression, stress, and suicide ideation. Their results indicated that zero-shot ChatGPT outperformed traditional neural network architectures but could not surpass specialized transformer-based models. The authors also conducted human evaluations and tested the impact of emotional reasoning, finding that it improved ChatGPT's performance and enabled the model to generate explanations for its predictions. LLMs have also been used to generate and augment data. Meyer et al. evaluated synthetic data generated by GPT-3 for conversational tasks and found that classifiers trained on synthetic data performed worse than those trained on fewer samples of real user-generated data. Generating synthetic data may still be a suitable approach when data or resources are limited.
3. How are sentences ranked for relevance to BDI-II symptoms?
Sentences are ranked by their relevance to the symptoms of the Beck Depression Inventory-II (BDI-II). The BDI-II is a questionnaire used to screen for depression; it consists of 21 questions, each corresponding to one symptom, such as sadness, pessimism, loss of pleasure, or tiredness. The answer options for each question form a Likert-type scale that measures the intensity of the symptom. In the eRisk 2023 Lab task, sentences from Reddit are ranked by their relevance to each BDI-II symptom. The data for this task includes 4 million sentences from 3,107 users, provided in TREC format. System performance is evaluated with top-k pooling, with k equal to 50: the top 50 sentences that each system returns for a symptom are combined into a pool, and three annotators then assess the pooled sentences for relevance. A sentence is considered relevant if it is topically related to the BDI-II symptom and contains information about the user's state regarding that symptom, even if the user does not suffer from it.
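The pooling step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the lab's actual evaluation code: the function name `top_k_pool`, the nested-dictionary layout of the runs, and the sentence IDs are all hypothetical.

```python
from collections import defaultdict


def top_k_pool(runs, k=50):
    """Build the judgment pool for each symptom.

    `runs` maps a system name to a dict that maps a symptom to that
    system's ranked list of sentence IDs (best first). This layout is
    an assumption made for the sketch.
    """
    pool = defaultdict(set)
    for rankings_by_symptom in runs.values():
        for symptom, ranked_ids in rankings_by_symptom.items():
            # Keep only the k highest-ranked sentences from this system;
            # the set removes sentences returned by several systems.
            pool[symptom].update(ranked_ids[:k])
    return dict(pool)


# Toy example with hypothetical systems and sentence IDs.
runs = {
    "system_a": {"Sadness": ["s1", "s2", "s3"]},
    "system_b": {"Sadness": ["s2", "s4", "s5"]},
}
pool = top_k_pool(runs, k=2)
print(sorted(pool["Sadness"]))  # ['s1', 's2', 's4']
```

Because the pool is a union, a sentence ranked highly by several systems is judged only once, which keeps the annotation effort bounded by k times the number of systems per symptom.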
4. How are diverse Reddit posts generated for a BDI symptom?
To create diverse Reddit posts for the BDI depression questionnaire, the prompt asks for {N} examples for the '{symptom}' symptom, with '{item}' as the BDI answer of interest. The posts should be in English, 2-3 sentences long, diverse in language, and specific to personal experiences, and they should avoid reusing the exact words of the BDI item. They should combine descriptions of past experiences with feelings or events, so that ranking models have substantial content to work with. Examples may include self-disclosure, such as 'My cat passed away' or 'I just broke up with my partner'.
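As a rough sketch of how the placeholders {N}, {symptom}, and {item} might be filled in, the snippet below builds a prompt string with Python's `str.format`. The template wording is a paraphrase of the description above rather than the original prompt, and `PROMPT_TEMPLATE`, `build_prompt`, and the example answer text are hypothetical names and values chosen for illustration.

```python
# Paraphrased template; the exact wording of the original prompt is not
# reproduced here and should be treated as an assumption.
PROMPT_TEMPLATE = (
    "Generate {N} diverse Reddit posts for the '{symptom}' symptom of the "
    "BDI depression questionnaire. The BDI answer of interest is '{item}'. "
    "Each post must be in English, 2-3 sentences long, diverse in language, "
    "and specific to personal experiences. Do not reuse the exact words of "
    "the BDI item. Combine descriptions of past experiences with feelings "
    "or events."
)


def build_prompt(n, symptom, item):
    """Fill the three placeholders of the template."""
    return PROMPT_TEMPLATE.format(N=n, symptom=symptom, item=item)


# Illustrative values only; the answer text is a placeholder, not
# necessarily the official BDI-II wording.
prompt = build_prompt(5, "Sadness", "I feel sad much of the time")
print(prompt)
```

Calling `build_prompt` once per symptom and answer option would yield one generation request for each combination; how the resulting posts are filtered or deduplicated is outside the scope of this sketch.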