Gigascience. Sep 11, 2021;10(9):giab059. doi: 10.1093/gigascience/giab059.
BACKGROUND: High-quality phenotype definitions are desirable because they enable the extraction of patient cohorts from large electronic health record repositories. They are characterized by properties such as portability, reproducibility, and validity. Phenotype libraries, where definitions are stored, have the potential to contribute significantly to the quality of the definitions they host. In this work, we present a set of desiderata for the design of a next-generation phenotype library that can ensure the quality of hosted definitions by combining the functionality currently offered by disparate tools.
METHODS: A group of researchers reviewed work to date on phenotype models, implementation, and validation, as well as contemporary phenotype libraries developed within their own phenomics communities. Existing phenotype frameworks were also examined. All the authors then distilled and refined this work into a set of best practices.
RESULTS: We present 14 library desiderata that promote high-quality phenotype definitions, in the areas of modeling, logging, validation, sharing, and storage.
CONCLUSIONS: There are a number of choices to be made when constructing phenotype libraries. Our considerations distill best practices in the field and include pointers toward their further development to support portable, reproducible, and clinically valid phenotype design. Providing high-quality phenotype definitions enables electronic health record data to be used more effectively in medical domains.
PMID: 34508578 | DOI: 10.1093/gigascience/giab059