Healthcare Series — SNOMED Release Files

Siva Gollapalli
Feb 2, 2024

In this blog post, let’s discuss the SNOMED release file distribution and why it matters. Before moving ahead, please read this blog post for context on what SNOMED is. Let’s begin.

A SNOMED distribution is a set of text files that list concepts together with their descriptions (fully specified names and synonyms), their relationships with other concepts, attributes, and so on. Each of these has its own significance. In a SNOMED release, you see two folders: 1) Full and 2) Snapshot.

  • Full: Contains the same set of files as Snapshot, but with every version of every concept, description, and relationship since the first SNOMED release. This is useful when you need to analyze how the data changed over time.
  • Snapshot: Contains only the most recent version of each concept, description, and relationship as of the release date. It covers the same set of concepts as Full, without the history.
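As a rough illustration of how the two folders relate, here is a minimal Python sketch. It assumes each row in a Full file is one version of a component, identified by an id column and dated by an effectiveTime column in YYYYMMDD form. The sample rows and the function name snapshot_from_full are made up for this example and are not part of any SNOMED tooling.

```python
import csv
import io


def snapshot_from_full(rows):
    """Keep only the most recent version of each component id."""
    latest = {}
    for row in rows:
        current = latest.get(row["id"])
        # effectiveTime is YYYYMMDD, so comparing the strings orders the dates.
        if current is None or row["effectiveTime"] > current["effectiveTime"]:
            latest[row["id"]] = row
    return latest


# Made-up Full-style rows: component 1001 has two versions, 1002 has one.
FULL_SAMPLE = (
    "id\teffectiveTime\tactive\n"
    "1001\t20020131\t1\n"
    "1001\t20200731\t0\n"
    "1002\t20200131\t1\n"
)

rows = list(
    csv.DictReader(io.StringIO(FULL_SAMPLE), delimiter="\t", quoting=csv.QUOTE_NONE)
)
snapshot = snapshot_from_full(rows)
```

Here snapshot holds one row per id, and for component 1001 it is the later, inactive version. This is why a Snapshot can still contain inactive components.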

The following diagram provides clarity on how files are related

Source: https://confluence.ihtsdotools.org/display/DOCRELFMT/4.1+Associations+Between+Release+Files

Concept: This is the root of the entire SNOMED release. The file lists all concepts, both active and inactive. It is recommended to use only active concepts for data entry. You will find the list of concepts in the “sct2_Concept_Snapshot_INT_20231201” file.
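A minimal Python sketch of picking out the active concepts could look like the following. It assumes the concept file is tab-separated with a header row that includes id and active columns, where active is "1" for active concepts. The sample rows below are placeholders for illustration, not real release data, and active_concept_ids is a hypothetical helper name.

```python
import csv
import io


def active_concept_ids(lines):
    """Return the ids of all rows whose active flag is "1"."""
    # QUOTE_NONE: treat quote characters as ordinary text, not as field quoting.
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    return {row["id"] for row in reader if row["active"] == "1"}


# Placeholder rows; "x" marks columns this example does not use.
CONCEPT_SAMPLE = (
    "id\teffectiveTime\tactive\tmoduleId\tdefinitionStatusId\n"
    "840539006\tx\t1\tx\tx\n"
    "111111111\tx\t0\tx\tx\n"
)

active_ids = active_concept_ids(io.StringIO(CONCEPT_SAMPLE))
```

To run it against a real release, you would pass an open file handle instead of the StringIO object, for example the concept file named above opened with encoding="utf-8". The exact path depends on where you unpacked the release.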

Description: It provides the human-readable terms for a given concept, and you will find them in the “sct2_Description_Snapshot-en_INT_20231201” file. There are two types of descriptions:

  1. FSN (Fully Specified Name): This is the official, unambiguous name given to a concept. FSN rows are marked with the type concept ID “900000000000003001”. The FSN for COVID-19 is “Disease caused by severe acute respiratory syndrome coronavirus 2 (disorder)”.
  2. Synonym: Most of the time we don’t use the FSN, and each country may have its own names. In the context of COVID-19, the synonyms are “COVID-19”, “Disease caused by 2019 novel coronavirus”, “Disease caused by 2019-nCoV”, “Disease caused by SARS-CoV-2”, and “Disease caused by severe acute respiratory syndrome coronavirus 2”. Synonym rows are marked with the type concept ID “900000000000013009”.

To pull the above information, we query the description file and filter its rows by the type concept IDs above, depending on whether we want the FSN or the synonyms.
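The query described above can be sketched in Python as follows. This is only a sketch: it assumes the description file is tab-separated with a header row containing conceptId, active, typeId, and term columns. The description ids (d1, d2, ...), the "x" placeholders, and the inactive sample term are invented for illustration, and terms_for_concept is a hypothetical helper name.

```python
import csv
import io

FSN_TYPE_ID = "900000000000003001"
SYNONYM_TYPE_ID = "900000000000013009"


def terms_for_concept(lines, concept_id):
    """Collect the active FSN and synonym terms for one concept."""
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    result = {"fsn": [], "synonyms": []}
    for row in reader:
        if row["conceptId"] != concept_id or row["active"] != "1":
            continue
        if row["typeId"] == FSN_TYPE_ID:
            result["fsn"].append(row["term"])
        elif row["typeId"] == SYNONYM_TYPE_ID:
            result["synonyms"].append(row["term"])
    return result


HEADER = ["id", "effectiveTime", "active", "moduleId", "conceptId",
          "languageCode", "typeId", "term", "caseSignificanceId"]
# Placeholder rows; only conceptId, active, typeId and term matter here.
ROWS = [
    ["d1", "x", "1", "x", "840539006", "en", FSN_TYPE_ID,
     "Disease caused by severe acute respiratory syndrome coronavirus 2 (disorder)", "x"],
    ["d2", "x", "1", "x", "840539006", "en", SYNONYM_TYPE_ID, "COVID-19", "x"],
    ["d3", "x", "1", "x", "840539006", "en", SYNONYM_TYPE_ID,
     "Disease caused by SARS-CoV-2", "x"],
    ["d4", "x", "0", "x", "840539006", "en", SYNONYM_TYPE_ID,
     "Made-up inactive term", "x"],
]
SAMPLE = "\n".join("\t".join(r) for r in [HEADER] + ROWS) + "\n"

terms = terms_for_concept(io.StringIO(SAMPLE), "840539006")
```

The inactive row is skipped, so terms ends up with one FSN and two synonyms for the COVID-19 concept.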

Note: With SNOMED you can easily club the different COVID-19 names together and treat them as one disease rather than as different diseases. Also, depending on the country you live in, you can use the respective language refsets.

Relationship: It describes how each concept relates to other concepts. SNOMED is a poly-hierarchy, as depicted below:

source: https://confluence.ihtsdotools.org/display/DOCGLOSS/DAG

So each concept may have multiple parents, and every chain of relationships ends at the root concept. The structure is a directed acyclic graph, and each child is linked directly to its parent using the “is a | 116680003” relationship. So in our example, COVID-19 | 840539006 is a type of Coronavirus infection (disorder) | 186747009, which is a type of Disease caused by Coronaviridae (disorder) | 27619001, which is a type of Viral disease (disorder) | 34014006, which is a type of Infectious disease (disorder) | 40733004, which is a type of Disease (disorder) | 64572001, which is a type of Clinical finding (finding) | 404684003.
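To make the hierarchy walk concrete, here is a minimal Python sketch that collects all ancestors of a concept by following “is a” links upward. The edges are just the single chain from the example above, written as (source, destination, type) tuples. A real concept may have more parents than shown here, and the function names are hypothetical.

```python
from collections import defaultdict

IS_A = "116680003"


def build_parent_index(relationships):
    """Map each concept id to the set of its direct "is a" parents."""
    parents = defaultdict(set)
    for source_id, destination_id, type_id in relationships:
        if type_id == IS_A:
            parents[source_id].add(destination_id)
    return parents


def ancestors(concept_id, parents):
    """Walk upward through the graph and return every ancestor id."""
    seen = set()
    stack = [concept_id]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:  # a concept can be reached via several paths
                seen.add(parent)
                stack.append(parent)
    return seen


# The chain from the COVID-19 example, child first.
CHAIN = ["840539006", "186747009", "27619001", "34014006",
         "40733004", "64572001", "404684003"]
EDGES = [(child, parent, IS_A) for child, parent in zip(CHAIN, CHAIN[1:])]

parent_index = build_parent_index(EDGES)
covid_ancestors = ancestors("840539006", parent_index)
```

Because the graph is acyclic the walk always terminates, and the seen set keeps a concept with multiple parents from being visited twice. Descendants can be found the same way by indexing the edges in the opposite direction.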

So using the above relationships we can easily find ancestors or descendants for a given concept. In our release files, you can find these relationships in the “sct2_Relationship_Snapshot_INT_20231201” file. Below is the snapshot of the relationship file.

For example, the fact that COVID-19 | 840539006 is a type of Coronavirus infection (disorder) | 186747009 is represented in the release files as a row with sourceId 840539006, destinationId 186747009, and typeId 116680003.
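Reading such rows out of the relationship file could be sketched as below. This assumes a tab-separated file with a header row containing active, sourceId, destinationId, and typeId columns. The relationship ids (r1, r2), the "x" placeholders, and the second sample row are made up for illustration, and is_a_pairs is a hypothetical helper name.

```python
import csv
import io

IS_A = "116680003"


def is_a_pairs(lines):
    """Yield (child, parent) pairs for every active "is a" row."""
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row["active"] == "1" and row["typeId"] == IS_A:
            yield row["sourceId"], row["destinationId"]


HEADER = ["id", "effectiveTime", "active", "moduleId", "sourceId",
          "destinationId", "relationshipGroup", "typeId",
          "characteristicTypeId", "modifierId"]
# Placeholder rows; the second one has a made-up, non-"is a" typeId.
ROWS = [
    ["r1", "x", "1", "x", "840539006", "186747009", "0", IS_A, "x", "x"],
    ["r2", "x", "1", "x", "840539006", "222222222", "1", "333333333", "x", "x"],
]
SAMPLE = "\n".join("\t".join(r) for r in [HEADER] + ROWS) + "\n"

pairs = list(is_a_pairs(io.StringIO(SAMPLE)))
```

Only the first row survives the filter, which gives the single (child, parent) pair from the COVID-19 example.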

After reading this, you may be wondering how to work with these text files and how to integrate them into your stack. Don’t worry, I understand the pain point, and so I am building a tool backed by Postgres that lets you load the files as SQL tables and interact with SNOMED effectively. Please stay tuned to my blog for details!

Happy weekend. Any feedback or suggestions would be welcome.
