Article overview
Adversarial Learning of Privacy-Preserving Text Representations for De-Identification of Medical Records
Authors: Max Friedrich; Arne Köhn; Gregor Wiedemann; Chris Biemann
Date: 12 Jun 2019
Abstract: De-identification is the task of detecting protected health information (PHI) in medical text. It is a critical step in sanitizing electronic health records (EHRs) to be shared for research. Automatic de-identification classifiers can significantly speed up the sanitization process. However, obtaining a large and diverse dataset to train such a classifier that works well across many types of medical text poses a challenge, as privacy laws prohibit the sharing of raw medical records. We introduce a method to create privacy-preserving, shareable representations of medical text (i.e. they contain no PHI) that does not require expensive manual pseudonymization. These representations can be shared between organizations to create unified datasets for training de-identification models. Our representation allows training a simple LSTM-CRF de-identification model to an F1 score of 97.4%, which is comparable to a strong baseline that exposes private information in its representation. A robust, widely available de-identification classifier based on our representation could potentially enable studies for which de-identification would otherwise be too costly.
Source: arXiv, 1906.05000
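The de-identification model mentioned in the abstract is an LSTM-CRF sequence tagger: the LSTM produces per-token emission scores and the CRF layer decodes the best tag sequence with Viterbi. As a minimal sketch of that decoding step (the BIO tags, tokens, and all scores below are toy values invented for illustration, not the paper's actual model or data):

```python
def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag sequence.

    emissions:   list of {tag: score} dicts, one per token (from the LSTM)
    transitions: {(prev_tag, tag): score} learned CRF transition scores
    tags:        list of tag names, e.g. BIO labels over PHI spans
    """
    # best score of any path ending in each tag, plus backpointers
    best = {t: emissions[0][t] for t in tags}
    back = []
    for em in emissions[1:]:
        new_best, ptr = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: best[p] + transitions[(p, t)])
            new_best[t] = best[prev] + transitions[(prev, t)] + em[t]
            ptr[t] = prev
        best, back = new_best, back + [ptr]
    # follow backpointers from the best final tag
    last = max(tags, key=lambda t: best[t])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

tags = ["O", "B-NAME", "I-NAME"]
# toy emission scores for a 3-token note fragment, e.g. "discharged John Smith"
emissions = [
    {"O": 2.0, "B-NAME": 0.1, "I-NAME": 0.0},
    {"O": 0.2, "B-NAME": 1.5, "I-NAME": 0.3},
    {"O": 0.3, "B-NAME": 0.2, "I-NAME": 1.4},
]
# transitions discourage I-NAME that does not continue a name span
transitions = {(p, t): 0.0 for p in tags for t in tags}
transitions[("O", "I-NAME")] = -2.0
transitions[("B-NAME", "I-NAME")] = 0.5

print(viterbi(emissions, transitions, tags))  # → ['O', 'B-NAME', 'I-NAME']
```

The CRF's contribution over per-token classification is visible here: the transition scores let the decoder prefer well-formed PHI spans (I-NAME only after a name tag) rather than scoring each token independently.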