Know Cancer

Platform for Medical Information Extraction From Incomplete Data

Not Enrolling
Liver Cancer

Trial Information

With the increasing adoption of Electronic Medical Record (EMR) systems, EMR data are
becoming more and more convenient to access. However, it remains difficult to analyze all
clinical data directly, because a large number of records are written in narrative format.
For research to proceed smoothly, information extraction is required to translate the data
in clinical text into a format suitable for analysis and statistics. Missing data are also a
frequent problem in medical research, so it is important to develop imputation methods with
better stability and accuracy. The purposes of this project are:

  • to provide data integration and extraction methods that handle structured and
    unstructured data sources more efficiently;
  • to provide a validation scheme that facilitates review of the results produced by the
    information extraction modules;
  • to increase the quality of clinical data by comparing data from different sources and
    correcting errors and inconsistencies;
  • to handle clinical data that are time series and incomplete;
  • to increase the accuracy of data analysis and the quality of health care by improving
    the completeness and correctness of clinical data;
  • to provide flexibility in the methods available on the platform.

The project focuses on the clinical data of liver cancer patients. We hope that its methods
can be extended to other diseases in the future by replacing the knowledge models.
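One of the stated aims is to handle clinical data that are both time series and incomplete. The record does not say which imputation method the project uses, so the following is only a minimal sketch of one common baseline, last observation carried forward (LOCF). The function name `impute_locf` and the sample measurements are hypothetical and are not taken from the study.

```python
from typing import List, Optional, Sequence


def impute_locf(values: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Fill each missing value (None) with the most recent observed value.

    Leading gaps stay None because there is no earlier observation to carry forward.
    This is an illustrative baseline, not the project's actual method.
    """
    filled: List[Optional[float]] = []
    last_seen: Optional[float] = None
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


# Hypothetical series of one patient's lab measurements over successive visits.
measurements = [None, 12.0, None, None, 30.5, None]
print(impute_locf(measurements))
# prints [None, 12.0, 12.0, 12.0, 30.5, 30.5]
```

LOCF is simple and keeps the time ordering of the record, but it can understate change between visits; the stability and accuracy concerns mentioned in the description are the reason more careful imputation methods are studied.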

Inclusion Criteria

Patients with liver cancer

Type of Study:


Study Design:

Time Perspective: Retrospective

Outcome Measure:

The number of patients correctly identified by the recurrence predictive model

Outcome Description:

The recurrence predictive model is developed using the incomplete data set. It is used to predict the recurrence status of patients who received the specific treatment for liver cancer. The number of patients correctly identified by the recurrence predictive model is regarded as the primary outcome measure.
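As a rough illustration of how such an outcome measure could be tallied, the sketch below counts the patients whose predicted recurrence status matches the observed status. The function name `count_correctly_identified` and the example labels are hypothetical; the record does not describe how the study computes or validates this count.

```python
from typing import Sequence


def count_correctly_identified(predicted: Sequence[bool], observed: Sequence[bool]) -> int:
    """Count patients whose predicted recurrence status equals the observed status."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must cover the same patients")
    return sum(p == o for p, o in zip(predicted, observed))


# Hypothetical labels for four patients (True = recurrence).
predicted_status = [True, False, True, False]
observed_status = [True, True, True, False]
print(count_correctly_identified(predicted_status, observed_status))
# prints 3
```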

Outcome Time Frame:

3 years

Safety Issue:


Principal Investigator

Feipei Lai

Investigator Role:

Principal Investigator

Investigator Affiliation:

National Taiwan University


Taiwan: Department of Health

Study ID:




Start Date:

March 2013

Completion Date:

March 2016

Related Keywords:

  • Liver Cancer
  • clinical narrative report
  • time series data
  • missing value
  • Liver Neoplasms