# X12 EDI Format for Mere Mortals

## Overview

The X12 EDI format is widely used for exchanging healthcare "administrative" data, e.g., health insurance claims. EDI is governed by [X12](https://x12.org), chartered by the American National Standards Institute.

Since X12 EDI predates JSON and probably XML (it was originally developed in the 1960s for transmitting railroad schedules), its notation and terminology can be confusing to a modern reader. Here is a quick example:

```
CLM*26463774*100***11:B:1*Y*A*Y*I~
REF*D9*17312345600006351~
HI*BK:0340*BF:V7389~
LX*1~
SV1*HC:99213*40*UN*1***1~
DTP*472*D8*20061003~
```

At first glance this seems fairly inscrutable, but we will demystify all of it below.

It is worth mentioning that there are alternatives to EDI in the healthcare space. The format called [FHIR](https://www.hl7.org/fhir/), supported by [HL7](https://www.hl7.org/), has become a de facto standard for information exchange, including APIs. This format supports JSON and other encoding options. FHIR comes with a [model for claims](https://www.hl7.org/fhir/claim.html) ("claim resource"). However, EDI is still very prevalent in the industry, and it is mandated by all the major players in the space, such as EDI clearing houses.

The pros and cons of EDI vs. FHIR are a separate topic of discussion. We will only mention that EDI is more compact because it is devoid of almost any metadata, such as field names. This very trait also makes it difficult to understand.

## Key Characteristics of EDI

EDI is a text file format that uses only ASCII characters. It uses delimiters, such as "*", to separate values, so it has similarities to the [CSV](https://en.wikipedia.org/wiki/Comma-separated_values) format. EDI also has some commonalities with XML and JSON in that it supports hierarchies and repeating groups. Unlike YAML, EDI assigns no meaning to indentation or line breaks; whitespace between segments can be discarded.
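
The delimiter-based layout described above can be explored with plain string splitting. The sketch below is only an illustration: it assumes that "~" ends each record and "*" separates values, exactly as in the quick example, and the function name `split_segments` is made up for this document. Real files may use different delimiters, so treat the hard-coded characters as an assumption.

```python
# A minimal sketch, not a full EDI parser.
# Assumption: "~" ends a record and "*" separates values, as in the example above.
SAMPLE = """
CLM*26463774*100***11:B:1*Y*A*Y*I~
REF*D9*17312345600006351~
HI*BK:0340*BF:V7389~
"""


def split_segments(text, segment_end="~", element_sep="*"):
    """Return a list of records, each one a list of value strings."""
    # Line breaks carry no meaning, so drop them before splitting.
    flat = text.replace("\r", "").replace("\n", "")
    # Skip the empty chunk that follows the final "~".
    return [chunk.split(element_sep) for chunk in flat.split(segment_end) if chunk]


segments = split_segments(SAMPLE)
for seg in segments:
    # The first item is the record identifier, the rest are positional values.
    print(seg[0], seg[1:])
```

Empty strings in the output correspond to consecutive "*" characters: values are identified by position, so an unused position is left blank rather than omitted.
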
Unlike traditional CSV files, EDI is not line-oriented, so the newline character "\n" is ignored. Our earlier example could therefore be written as a single string:

```
CLM*26463774*100***11:B:1*Y*A*Y*I~REF*D9*17312345600006351~HI*BK:0340*BF:V7389~LX*1~SV1*HC:99213*40*UN*1***1~DTP*472*D8*20061003~
```

It is customary, however, to start a new line after the "~" character (end of segment) when presenting EDI files for "human consumption". We follow this convention here.

Key concepts of the EDI format are explained below.

## EDI Files and Transactions

EDI format is batch-oriented. EDI data is packaged in files; files can pass through multiple intermediaries, such as clearing houses. A file can contain multiple types of data; the schema of each is defined by its "transaction type". A transaction type is very similar to an XML or JSON schema. However, it does not come with a machine-readable representation, akin to an ".xsd", that parsers could use automatically.

Healthcare-related transaction types are buried within the [Insurance Transaction Set](https://x12.org/products/transaction-sets). This transaction set contains schemas for claims (institutional, professional, dental) and payments.

Each transaction type has a unique ID or name. "837" transaction types refer to claim data, "835" to payments.

The transaction header (`ST` segment) states the transaction type:

```
ST*837*12345*005010X222~
```

The file content following this line must conform to the schema of this transaction type, here `837` and `005010X222`, which refers to professional claims.

### Loop

* Analogous to a JSON object, contains data elements that belong to the same logical entity
* Like a JSON object, it can also contain nested loops (objects)
* Can repeat multiple times. Unlike JSON, EDI does not have a special notation for arrays ("[]" in JSON)

Loops consist of segments.
The fragment below represents a single loop, the "Submitter" loop (`1000A`):

```
NM1*41*2*PREMIER BILLING SERVICE*****46*TGJ23~
PER*IC*JERRY*TE*3055552222*EX*231~
```

### Segment

* A reusable group of elements, akin to a relational database table
* A segment always starts with a segment "ID" denoting its purpose and ends with "~"
* Unlike loops, segments may not contain nested segments
* Segments contain elements, "*" is used as a separator
* Many segments contain "qualifier" or "identifier code" elements. A qualifier specifies the role or purpose of a segment

This is a single segment:

```
NM1*85*2*ABC Group Practice*****XX*1234567890~
```

The prefix `NM1` is the segment ID. The rest of the string consists of values (elements), ending with `~`. Elements are identified by position, so consecutive `*` characters mark elements that were left empty.

`NM1` is "Individual or Organizational Name". `NM1` is used for any name in an EDI file: a provider, a patient, a payer and so on. `NM1` can contain up to 12 elements (like a database table containing 12 columns). Some elements in a segment are optional. For example, the last element in our example, `1234567890`, is a provider's NPI. This does not apply to an `NM1` used for a patient name.

The first element, `85`, is the "Entity Identifier Code". Code `85` identifies a billing provider.

The segment ID is the only explicit piece of "metadata" in an EDI file. Loops and elements do not have their names stated anywhere in the file; their names have to be gleaned from context.

### Element

* Similar to a column in a database table or to an Excel cell
* Supports various primitive data types such as string, numeric, date
* The same element can repeat, but this is relatively rare. Think of repeating elements as an array of values
* EDI has a notion of a composite element, see below. Most elements, however, are simple value elements

Let's take the following segment:

```
DTP*454*D8*20050108~
```

We have the following elements:

* `454` is a "Date Time Qualifier"; "454" stands for "Initial Treatment"
* `D8` is a "Date Time Period Format Qualifier"; "D8" means a single date written as CCYYMMDD
* `20050108` is the date itself, January 8, 2005

### Composite Element

* Consists of multiple sub-elements separated by ":"
* Can repeat multiple times
* Typically used for specifying healthcare codes, such as procedures and diagnoses
* The first sub-element is usually a "qualifier" defining the type of code

Composite elements are conceptually similar to segments. The difference is that they can exist only within a segment. You can think of a composite element as a JSON object or array of objects within another JSON object (segment).

Example:

```
SV2*0300*HC:81099*73.42*UN*1~
```

The `SV2` segment defines an "Institutional Service Line"; it includes the charge amount (73.42) and the procedure code `HC:81099`. The `:` tells us that this is a composite element. The first sub-element, `HC`, defines the type of the code set (HCPCS/CPT); the second, `81099`, states the actual procedure code. The description for this code is "Unlisted urinalysis procedure".

### Full Example

Here is an example of EDI content with a single transaction (837P) containing a single claim. An explanation is provided before each line. An identifier that starts with a number (e.g., 2000A) is a loop, followed by the segments contained within that loop. We added a blank line before each loop for readability.
```
ST TRANSACTION SET HEADER
ST*837*0021*005010X222~
BHT BEGINNING OF HIERARCHICAL TRANSACTION
BHT*0019*00*244579*20061015*1023*CH~

1000A SUBMITTER
NM1 SUBMITTER NAME
NM1*41*2*PREMIER BILLING SERVICE*****46*TGJ23~
PER SUBMITTER EDI CONTACT INFORMATION
PER*IC*JERRY*TE*3055552222*EX*231~

1000B RECEIVER
NM1 RECEIVER NAME
NM1*40*2*KEY INSURANCE COMPANY*****46*66783JJT~

2000A BILLING PROVIDER HL LOOP
HL - BILLING PROVIDER
HL*1**20*1~
PRV BILLING PROVIDER SPECIALTY INFORMATION
PRV*BI*PXC*203BF0100Y~

2010AA BILLING PROVIDER
NM1 BILLING PROVIDER NAME
NM1*85*2*BEN KILDARE SERVICE*****XX*9876543210~
N3 BILLING PROVIDER ADDRESS
N3*234 SEAWAY ST~
N4 BILLING PROVIDER LOCATION
N4*MIAMI*FL*33111~
REF - BILLING PROVIDER TAX IDENTIFICATION
REF*EI*587654321~

2010AB PAY-TO PROVIDER
NM1 PAY-TO PROVIDER NAME
NM1*87*2~
N3 PAY-TO PROVIDER ADDRESS
N3*2345 OCEAN BLVD~
N4 PAY-TO PROVIDER CITY
N4*MIAMI*FL*33111~

2000B SUBSCRIBER HL LOOP
HL - SUBSCRIBER
HL*2*1*22*1~
SBR SUBSCRIBER INFORMATION
SBR*P**2222-SJ******CI~

2010BA SUBSCRIBER
NM1 SUBSCRIBER NAME
NM1*IL*1*SMITH*JANE****MI*JS00111223333~
DMG SUBSCRIBER DEMOGRAPHIC INFORMATION
DMG*D8*19430501*F~

2010BB PAYER
NM1 PAYER NAME
NM1*PR*2*KEY INSURANCE COMPANY*****PI*999996666~
REF BILLING PROVIDER SECONDARY IDENTIFICATION
REF*G2*KA6663~

2000C PATIENT HL LOOP
HL - PATIENT
HL*3*2*23*0~
PAT PATIENT INFORMATION
PAT*19~

2010CA PATIENT
NM1 PATIENT NAME
NM1*QC*1*SMITH*TED~
N3 PATIENT ADDRESS
N3*236 N MAIN ST~
N4 PATIENT CITY/STATE/ZIP
N4*MIAMI*FL*33413~
DMG PATIENT DEMOGRAPHIC INFORMATION
DMG*D8*19730501*M~

2300 CLAIM
CLM CLAIM LEVEL INFORMATION
CLM*26463774*100***11:B:1*Y*A*Y*I~
REF CLAIM IDENTIFICATION NUMBER FOR CLEARING HOUSES (Added by C.H.)
REF*D9*17312345600006351~
HI HEALTH CARE DIAGNOSIS CODES
HI*BK:0340*BF:V7389~

2400 SERVICE LINE
LX SERVICE LINE COUNTER
LX*1~
SV1 PROFESSIONAL SERVICE
SV1*HC:99213*40*UN*1***1~
DTP DATE - SERVICE DATE(S)
DTP*472*D8*20061003~

2400 SERVICE LINE
LX SERVICE LINE COUNTER
LX*2~
SV1 PROFESSIONAL SERVICE
SV1*HC:87070*15*UN*1***1~
DTP DATE - SERVICE DATE(S)
DTP*472*D8*20061003~

2400 SERVICE LINE
LX SERVICE LINE COUNTER
LX*3~
SV1 PROFESSIONAL SERVICE
SV1*HC:99214*35*UN*1***2~
DTP DATE - SERVICE DATE(S)
DTP*472*D8*20061010~

2400 SERVICE LINE
LX SERVICE LINE COUNTER
LX*4~
SV1 PROFESSIONAL SERVICE
SV1*HC:86663*10*UN*1***2~
DTP DATE - SERVICE DATE(S)
DTP*472*D8*20061010~

TRAILER
SE TRANSACTION SET TRAILER
SE*42*0021~
```
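
To tie the concepts together, here is a rough Python sketch that reads the claim (`2300`) and service line (`2400`) portion of the example and rebuilds the hierarchy that the file leaves implicit. It is an illustration under stated assumptions, not a general 837 parser: the delimiters are hard-coded, element positions are taken from this example only, and the function name `parse_claim` and the dictionary keys are invented for this document.

```python
from decimal import Decimal

# Claim and service line segments copied from the full example, as one string.
EDI = (
    "CLM*26463774*100***11:B:1*Y*A*Y*I~"
    "HI*BK:0340*BF:V7389~"
    "LX*1~SV1*HC:99213*40*UN*1***1~DTP*472*D8*20061003~"
    "LX*2~SV1*HC:87070*15*UN*1***1~DTP*472*D8*20061003~"
    "LX*3~SV1*HC:99214*35*UN*1***2~DTP*472*D8*20061010~"
    "LX*4~SV1*HC:86663*10*UN*1***2~DTP*472*D8*20061010~"
)


def parse_claim(text):
    """Group service lines under their claim (hypothetical helper)."""
    claim = None
    for raw in text.split("~"):
        raw = raw.strip()
        if not raw:
            continue
        seg_id, *elements = raw.split("*")
        if seg_id == "CLM":
            # CLM starts the claim loop; element 2 is the total charge.
            claim = {"id": elements[0], "total": Decimal(elements[1]),
                     "diagnoses": [], "lines": []}
        elif claim is None:
            continue  # ignore anything that precedes the claim
        elif seg_id == "HI":
            # Each element is a composite: qualifier and code separated by ":".
            claim["diagnoses"] = [tuple(e.split(":")) for e in elements if e]
        elif seg_id == "LX":
            # LX starts a new repetition of the service line loop.
            claim["lines"].append({"number": int(elements[0])})
        elif seg_id == "SV1" and claim["lines"]:
            code_type, code = elements[0].split(":")[:2]
            claim["lines"][-1].update(
                code_type=code_type, procedure=code, charge=Decimal(elements[1]))
        elif seg_id == "DTP" and claim["lines"] and elements[0] == "472":
            # Qualifier 472 marks the service date; the value is CCYYMMDD.
            claim["lines"][-1]["service_date"] = elements[2]
    return claim


claim = parse_claim(EDI)
line_total = sum(line["charge"] for line in claim["lines"])
print(claim["id"], claim["total"], line_total)
for line in claim["lines"]:
    print(line["number"], line["procedure"], line["charge"], line["service_date"])
```

Note how the loop boundaries are never stated in the data: the code infers that a new service line begins whenever it sees an `LX` segment, which is the same "glean it from context" reasoning a human reader has to apply.
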