vocab_verbatim_term_mapper
VocabVerbatimTermMapper
Maps source terms to concept IDs using a pre-built index of normalized terms. The index is created from vocabulary term files stored in Parquet format, downloaded using the download_terms module.
- If an index file exists at the verbatim_mapping_index_file path specified in the config, it is loaded.
- If not, the index is created by processing all Parquet files in the terms folder specified in the config.
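The load-or-build behaviour described above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the function names, the pickle storage format, the lowercase and whitespace normalization, and the step that writes a freshly built index back to disk are all assumptions. Reading the Parquet term files is stood in for by a hypothetical `read_term_rows` callable.

```python
import pickle
from pathlib import Path


def normalize(term: str) -> str:
    # Hypothetical normalization: lowercase and collapse runs of whitespace.
    return " ".join(term.lower().split())


def build_index(term_rows):
    # term_rows yields (concept_id, concept_name, term) tuples; in the real
    # module these would come from the Parquet files in the terms folder.
    index = {}
    for concept_id, concept_name, term in term_rows:
        index.setdefault(normalize(term), []).append((concept_id, concept_name))
    return index


def load_or_build_index(index_file: Path, read_term_rows):
    # Reuse the index file if it exists, otherwise build the index from the terms.
    if index_file.exists():
        with index_file.open("rb") as fh:
            return pickle.load(fh)
    index = build_index(read_term_rows())
    # Assumption: the new index is saved so later runs can skip the build.
    with index_file.open("wb") as fh:
        pickle.dump(index, fh)
    return index
```

With this shape, the expensive pass over the term files happens only when no index file is present at the configured path.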
Source code in src/ariadne/verbatim_mapping/vocab_verbatim_term_mapper.py
map_term(source_term)
Maps a source term to concept IDs using the pre-built index.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source_term | str | The source clinical term to map. | required |
Returns:
| Type | Description |
|---|---|
| List[tuple[int, str]] | A list of concept ID - concept name tuples, possibly empty if no match is found. |
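As a rough illustration of the lookup contract described above, the sketch below treats the index as a plain dictionary from normalized term to a list of matches. The `normalize` helper, the `map_term_sketch` function, and the toy index contents are hypothetical and only stand in for the real class.

```python
from typing import Dict, List, Tuple

Index = Dict[str, List[Tuple[int, str]]]


def normalize(term: str) -> str:
    # Hypothetical normalization: lowercase and collapse runs of whitespace.
    return " ".join(term.lower().split())


def map_term_sketch(index: Index, source_term: str) -> List[Tuple[int, str]]:
    # Return every (concept_id, concept_name) match, or an empty list.
    return list(index.get(normalize(source_term), []))


# Toy index with made-up concept IDs, for illustration only.
toy_index: Index = {
    "toy term": [(1, "Toy concept"), (2, "Another toy concept")],
}

matched = map_term_sketch(toy_index, "  Toy   TERM ")
unmatched = map_term_sketch(toy_index, "unknown")
print(matched)    # [(1, 'Toy concept'), (2, 'Another toy concept')]
print(unmatched)  # []
```

Returning an empty list instead of `None` lets callers iterate over the result without a separate check for the no-match case.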
Source code in src/ariadne/verbatim_mapping/vocab_verbatim_term_mapper.py
map_terms(source_terms, term_column='cleaned_term', mapped_concept_id_column='mapped_concept_id', mapped_concept_name_column='mapped_concept_name')
Maps source terms in a DataFrame column to concept IDs using the pre-built index.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| source_terms | DataFrame | DataFrame containing the source clinical terms to map. | required |
| term_column | str | Name of the column with terms to map. | 'cleaned_term' |
| mapped_concept_id_column | str | Name of the column to store matched concept IDs. | 'mapped_concept_id' |
| mapped_concept_name_column | str | Name of the column to store matched concept names. | 'mapped_concept_name' |
Returns:
| Type | Description |
|---|---|
| DataFrame | A DataFrame with the original columns and their mapped concept IDs and names. |
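The DataFrame variant can be sketched in the same spirit, using pandas. The documentation above does not say how multiple or missing matches are laid out, so this sketch assumes one output row per match and keeps unmatched terms with missing values. `map_terms_sketch`, `normalize`, and the toy index are hypothetical; only the column-name parameters and their defaults are taken from the signature above.

```python
import pandas as pd

# Toy index with made-up concept IDs, for illustration only.
TOY_INDEX = {"toy term": [(1001, "Toy concept")]}


def normalize(term: str) -> str:
    # Hypothetical normalization: lowercase and collapse runs of whitespace.
    return " ".join(term.lower().split())


def map_terms_sketch(
    source_terms: pd.DataFrame,
    term_column: str = "cleaned_term",
    mapped_concept_id_column: str = "mapped_concept_id",
    mapped_concept_name_column: str = "mapped_concept_name",
) -> pd.DataFrame:
    rows = []
    for record in source_terms.to_dict(orient="records"):
        matches = TOY_INDEX.get(normalize(record[term_column]), [])
        # Assumption: one row per match; unmatched terms keep missing values.
        for concept_id, concept_name in matches or [(None, None)]:
            rows.append(
                {
                    **record,
                    mapped_concept_id_column: concept_id,
                    mapped_concept_name_column: concept_name,
                }
            )
    return pd.DataFrame(rows)


source = pd.DataFrame({"cleaned_term": ["Toy  term", "unknown"]})
result = map_terms_sketch(source)
print(result)
```

The original columns are carried through unchanged, and the two mapped columns are appended under the names given by the keyword arguments.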
Source code in src/ariadne/verbatim_mapping/vocab_verbatim_term_mapper.py