This job is no longer available.

Research Assistant

 

Job Description

Research Assistant – sample size investigation for clinical text classification using NLP



A team in BHI is looking for a motivated research assistant to work on a project investigating optimal sample size for Natural Language Processing (NLP) classification tasks that require manual annotations. The tasks involve:



·       running simulations using deep learning models to investigate model performance across language settings and training corpus sizes;



·       developing a GitHub project page and well-documented code;



·       participating in dissemination activities (e.g. preparing and participating in an interactive online seminar).



The successful candidate can start as soon as possible and work until the end of July, in collaboration with myself, Angus Roberts, Jaya Chaturvedi, and Daniel Stahl. The hours (14h/week) can be worked as two full days or spread over the week, remotely or in the office (Denmark Hill). Please write to diana.shamsutdinova@kcl.ac.uk to express your interest.



Project Description: Natural Language Processing methods are widely applied to extract information from clinical texts and present it in a structured way. However, unlike in statistical data analysis, there are no methods available for estimating the sample size needed. Our project aims to assess the optimal sample size for developing clinical NLP models and how this requirement changes with document and language properties. By taking a simulation approach and following modern guidance on model validation, we will be able to investigate model performance in various scenarios and provide guidance on sample sizes for clinical NLP tasks.



Qualifications

The role is suitable for a current MSc or PhD student, postdoc, or early career researcher in Computer Science, Engineering, Health Informatics, Statistics, or a related field.



Skills

·       Strong analytical skills;



·       Knowledge of Python programming language;



·       Understanding of model validation techniques such as cross-validation, and of evaluation metrics such as AUC-ROC, sensitivity, specificity, and precision;



·       Ability to work independently and in a team.

MORE JOBS LIKE THIS

About the Role:

We are seeking a highly motivated Research Assistant to contribute to the development of a medical timeline builder using Large Language Models (LLMs). This project aims to extract and organize temporal information from clinical narratives to construct structured medical timelines that enhance clinical decision-making and patient care. The successful candidate will work at the intersection of natural language processing (NLP), clinical informatics, and AI-driven healthcare applications.

Key Responsibilities:

  • Data Processing & Annotation: Preprocess and structure clinical text datasets (e.g., i2b2, MIMIC) for training and evaluation.
  • LLM Fine-Tuning & Evaluation: Fine-tune state-of-the-art LLMs for temporal information extraction and reasoning in clinical texts.
  • Pipeline Development: Develop and implement a two-stage LLM-based framework for extracting temporal references and constructing medical timelines.
  • Model Benchmarking: Design benchmark datasets and evaluate models on clinical temporal reasoning tasks.
  • Visualization & Integration: Assist in integrating timeline generation results into interactive visualization tools for clinical use.
  • Collaboration & Dissemination: Work closely with interdisciplinary teams, including clinicians and AI researchers, and contribute to publications and conference presentations.


Qualifications

Education: Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Biomedical Informatics, or a related field.



Skills
  • Programming Skills: Proficiency in Python, with experience in NLP libraries (e.g., Hugging Face Transformers, spaCy, NLTK).
  • Machine Learning & LLMs: Understanding of deep learning, LLM fine-tuning, and model evaluation techniques.
  • Clinical NLP Experience: Familiarity with medical text processing, clinical terminologies (e.g., SNOMED, UMLS), and temporal reasoning in healthcare.
  • Data Handling: Experience working with structured and unstructured clinical datasets (e.g., i2b2, MIMIC-III).
  • Research & Communication: Strong analytical skills, ability to conduct literature reviews, and contribute to academic writing.
  • assisting in conducting research activities related to computer vision, including literature reviews, data collection, experimentation, and analysis
  • assisting in the development and implementation of computer vision algorithms, including image processing, object detection, recognition, segmentation, and tracking
  • preparing and annotating datasets for training and evaluation purposes, ensuring data quality and relevance to research objectives
  • contributing to the solution in the form of software tools and frameworks for computer vision research, using programming languages such as Python or C/C++
  • assisting in the analysis of qualitative and quantitative data, as directed.


Qualifications

N/a



Skills
  • Some prior experience and strong interest in the subject of Computer Vision
  • Understanding of deep learning frameworks (e.g., TensorFlow/Keras, PyTorch) and some proficiency in training convolutional neural networks (CNNs) for computer vision tasks.
  • Familiarity with training deep learning models on preprocessed and augmented datasets, and with monitoring model performance and convergence during training.
  • Practical knowledge in utilizing programming languages relevant to machine learning, deep learning and computer vision (Python 3.4 and above is an absolute must).
  • Experience working with video / image data, including data preprocessing, annotation and analysis using popular libraries (e.g. OpenCV)
  • Knowledge of common evaluation metrics for assessing model performance in computer vision tasks, such as accuracy, precision, recall, and F1 score.
  • Knowledge of web frameworks written in Python (e.g. Flask) is desirable but not essential
