Medical bioinformatics with a focus on near-patient data acquisition

Prof. Dr. Alexander Schliep is FGW Professor for Medical Bioinformatics with a focus on near-patient data acquisition at the Brandenburg University of Technology Cottbus-Senftenberg (BTU).

Many disciplines collect large amounts of data, often in highly diverse formats, that need to be processed with bioinformatics methods. This usually requires developing new tools that provide the best possible answer for the data set and the specific question at hand.

A further focus of this professorship is the development of new techniques, such as mobile diagnostics or networked care for chronically ill people, as well as the bias-free evaluation of these techniques.

Prof. Dr. Alexander Schliep
Head of the Professorship for Medical Bioinformatics with a Focus on Near-Patient Data Acquisition
phone: +49 (0)355 5818 721

The SchliepLAB is part of the Brandenburg Faculty of Health Sciences and is located at the Brandenburg University of Technology Cottbus-Senftenberg. Part of the group is based at the Faculty of Computer Science and Engineering, which is a joint faculty of the University of Gothenburg and Chalmers University of Technology.

A list of current and completed research projects can be found at https://schlieplab.org/Research/, a list of publications at https://schlieplab.org/Publications/ and a list of software packages at https://schlieplab.org/Software/.

Research Focuses

AI for the design of oligonucleotide therapeutics

For antisense oligonucleotides (ASOs) that act by RNase H1-mediated knockdown, the binding energies and kinetics of ASO-mRNA duplexes are critical for predicting efficacy and safety. We predict binding energies from sequences, study the kinetics of ASO action to make the ASO drug design process more predictable, and combine molecular dynamics and artificial intelligence in collaborative projects to extend predictive models to a wider range of nucleotide modifications. Our federated, privacy-preserving learning approach enables competitors to pool data for training predictors of binding energies.

ML for pan-genomic graphs

Pan-genomic graphs provide a principled approach for dealing with structural variants and the high degree of diversity between genomes. ML on pan-genomic graphs will make it possible to tackle prediction and regression tasks for different populations, including quantities relevant for oligonucleotide therapeutics, such as transcription rate or accessibility, as well as clinically relevant variables.

ML and algorithmics for sequencing data

Data generated by high-throughput experimental platforms such as high-throughput sequencing (HTS) pose computational challenges, in particular when advanced statistical approaches such as Bayesian methods are used for analysis. In the past, we have developed a compressive genomics approach funded by the NIH Big Data to Knowledge Program (BD2K), used wavelet compression in Bayesian HMMs for copy number variant detection, and significantly improved the utility of statistical ML models representing genomes – variable length Markov chains – through faster learning algorithms. This enables, for example, alignment-free genome comparisons from raw data.

Education

Computational thinking is a basic requirement for all disciplines. The teaching of computational and algorithmic ideas can benefit greatly from software tools. We develop animation systems for graph algorithms that are available on desktop, as a web app, and soon as an iOS app; the Springer textbook CATBox uses our animation system Gato. With our Hidden Markov Model library, learners can focus on solving exciting bioinformatics problems.

Winter Semester 2024/25

Computing at Scale in Machine Learning: Distributed computing and algorithmic approaches (Module: 14038)

Learning Outcomes
Students will obtain an overview of how to solve large-scale computational problems in data science and machine learning using a) parallel approaches, from multi-threaded computation on individual machines to implicit parallelism frameworks on compute clusters, and b) algorithms and data structures supporting efficient exact or approximate computation with massive data sets in and out of core. In particular, they will learn how to analyze relevant probabilistic data structures and algorithms and how to select and implement appropriate computational approaches for large-scale problems.

Contents
The focus will be on five fundamental problem areas:

  • A review of memory-compute co-location and its impact on big data computations.
  • Solving Machine Learning (ML) workloads using explicit parallelism, specifically multi-threaded computation on an individual machine.
  • Introduction to implicit parallelism programming models, as implemented for example in MapReduce, Spark and Ray, and their application in ML.
  • Probabilistic algorithms such as sketching algorithms (incl. CountMinSketch, HyperLogLog) or Bloom filters.
  • Implementing ML methods using index data structures such as suffix or kd-trees.
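
As a rough illustration of the probabilistic data structures named in the list above, the following is a minimal Bloom filter sketch in Python. It is not course material: the class name, the SHA-256 based double hashing, and the default false-positive rate are assumptions chosen for this example. The sizing uses the standard formulas for the number of bits and hash functions.

```python
import hashlib
import math


class BloomFilter:
    """Minimal Bloom filter: a bit array probed by several hash functions."""

    def __init__(self, expected_items: int, false_positive_rate: float = 0.01):
        # Standard sizing: m = -n * ln(p) / (ln 2)^2 bits, k = (m / n) * ln 2 hashes.
        self.num_bits = max(1, math.ceil(
            -expected_items * math.log(false_positive_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing (an assumption of this sketch): derive all probe
        # positions from two 64-bit halves of a single SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd, so never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # May return a false positive, but never a false negative.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


seen = BloomFilter(expected_items=1000)
for kmer in ("ACGT", "CGTA", "GTAC"):
    seen.add(kmer)
print("ACGT" in seen)  # True: items that were added are always reported as present
```

The trade-off is memory against accuracy: the filter stores no items, only bits, so membership queries for items that were never added can occasionally return True at roughly the configured rate.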

Recommended Prerequisites

Introduction to machine learning at Master’s level. Advanced knowledge of programming in Python and the Linux command line.

Lecture: Thursday, 15:30 – 17:00, Central Campus, LG 1A, HS 1
Exercise: Thursday, 17:30 – 19:00, Central Campus, LG 1A, HS 1

Detailed information for participants is available at <<coming soon>>


Introduction to Bioinformatics (Module: <<coming soon>>)

Learning Outcomes
After successfully completing the module, students will have acquired an overview of the fundamentals of bioinformatics. This includes an introduction to relevant molecular processes, the scientific instruments used to investigate these processes, and the data they generate. For central computational problems, students will be able to discuss the advantages and disadvantages of statistical and basic algorithmic approaches and to adapt them to specific biological questions. Students will be able to analyze specific biological data using appropriate software libraries for Python.

Contents
The focus will be on the basics of the following areas:

  • An introduction to molecular biology, including relevant scientific instruments and the omics data generated by them.
  • Pair-wise and multiple sequence alignments, seed-and-extend approaches, and genome indexes
  • Evolutionary models and phylogenetic trees
  • Signals in sequences: identification of motifs
  • Assembly of genomes and transcriptomes
  • Gene expression analysis
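
To give a flavor of the pairwise alignment topic in the list above, here is a minimal Python sketch of the Needleman-Wunsch dynamic program for the global alignment score with a linear gap penalty. It is an illustration only, not taken from the course: the function name and the scoring values (match +1, mismatch -1, gap -2) are assumptions made for this example.

```python
def global_alignment_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -2) -> int:
    """Needleman-Wunsch score with a linear gap penalty.

    Runs in O(len(a) * len(b)) time and keeps only two rows in memory.
    """
    # Row 0: aligning the empty prefix of a with b[:j] costs j gaps.
    previous = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] with the empty prefix of b costs i gaps.
        current = [i * gap]
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            current.append(max(
                previous[j - 1] + substitution,  # align a[i-1] with b[j-1]
                previous[j] + gap,               # gap in b
                current[j - 1] + gap,            # gap in a
            ))
        previous = current
    return previous[-1]


print(global_alignment_score("ACGT", "AGT"))  # 1: three matches and one gap
```

Recovering the alignment itself, rather than only its score, would require keeping the full table (or traceback pointers), which this sketch omits for brevity.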

Recommended Prerequisites
Good knowledge of discrete probability, algorithms and data structures at the undergraduate level. Advanced knowledge of programming in Python and the Linux command line. 

Exercise: Friday, 9:15 – 10:45, Campus Sachsendorf, building 9, room 9.122
Lecture: Friday, 11:30 – 13:00, Campus Sachsendorf, building 9, room 9.122

Detailed information for participants is available at <<coming soon>>


Bioinformatics (Module: 13866)

Seminar
Friday
13:45 – 15:15
Sachsendorf Campus, LG 9, HS 9.122

Detailed information for participants is available at <<coming soon>>


Research Module in Artificial Intelligence (Module: 14060)

Appointment by arrangement in Sachsendorf


Oberseminar Medizinische Bioinformatik (Advanced Seminar in Medical Bioinformatics) (Module: 13600)

Appointment by arrangement in Sachsendorf


A list of current and previously offered courses and seminars can be found at https://schlieplab.org/Teaching/

Katharina Mansfeld
Assistant
phone: +49 (0)355 5818 720
Nathalie Gocht, M. Sc.
Research Associate
phone: +49 (0)355 5818 373
Aleksandra Khatova, M. Sc.
Research Associate
phone: +49 (0)355 5818 723


Joint faculty of the University of Potsdam, the Brandenburg Medical School Theodor Fontane and the Brandenburg University of Technology Cottbus-Senftenberg