The X12 EDI standard is a proprietary text-based format for encoding and transferring complex, structured, hierarchical data. Even though the standard is very different from modern JSON/XML formats, it is governed by simple concepts explained below.
Did you know that EDI was inspired by the 1948 Berlin airlift and the challenges of exchanging data over a 300-baud teletype modem? Yes, the EDI format is that old.
The Electronic Data Interchange (EDI) format is widely used for exchanging healthcare “administrative” data, e.g., health insurance claims.
In the US, EDI is governed by X12, a non-profit organization chartered by the American National Standards Institute. Here we’re using EDI and X12 almost interchangeably, although other EDI flavors, such as EDIFACT, are in use outside of the US.
Since X12 EDI’s development predates both JSON and XML (it was originally developed in the 60s for transmitting railroad schedules), its notation and terminology can be confusing to a modern-day reader.
Here is a quick example:
CLM*26463774*100***11:B:1*Y*A*Y*I~
REF*D9*17312345600006351~
HI*BK:0340*BF:V7389~
LX*1~
SV1*HC:99213*40*UN*1***1~
DTP*472*D8*20061003~
At first glance, this seems fairly inscrutable, but we will demystify all of it below.
It is worth mentioning, though, that there are alternatives to using EDI in the healthcare space.
The format called FHIR, supported by HL7, has become a de-facto standard for information exchange, including APIs. This format supports JSON and other encoding options.
FHIR comes with a model for claims (the “claim resource”). However, EDI is still very prevalent in the industry, and it is mandated by all the major players in the space, such as EDI clearing houses.
The pros and cons of EDI vs FHIR is a topic for another post.
We would only mention that EDI is more compact as it is devoid of almost any metadata, such as field names.
But this very trait also makes it difficult to understand.
EDI is a text file format that uses only ASCII characters.
It uses delimiters, such as `*`, to separate values, so it has similarities to the CSV format. EDI also has some commonalities with XML or JSON in that it supports hierarchies and repeating groups.
EDI does not use white space or indentation the way YAML does; white space between segments, such as line breaks, can be discarded from an EDI file.
Unlike traditional CSV files, EDI does not use lines, so a new line character “\n” is ignored. Therefore, our earlier example could be presented as a single string:
CLM*26463774*100***11:B:1*Y*A*Y*I~REF*D9*17312345600006351~HI*BK:0340*BF:V7389~LX*1~SV1*HC:99213*40*UN*1***1~DTP*472*D8*20061003~
It is somewhat customary, however, to start a new line after the `~` character (end of segment) when presenting EDI files for “human consumption”; we will follow this convention here.
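To make the "delimiters, not lines" idea concrete, here is a minimal sketch of splitting raw EDI text into segments and elements. It assumes the `~` and `*` delimiters used throughout this post; the function name `split_segments` is ours and not part of any library.

```python
def split_segments(raw, segment_terminator="~", element_separator="*"):
    """Split raw EDI text into segments, each a list of element strings."""
    segments = []
    for chunk in raw.split(segment_terminator):
        chunk = chunk.strip()  # drop line breaks around the segment
        if chunk:              # skip the empty piece after the last "~"
            segments.append(chunk.split(element_separator))
    return segments


raw = """CLM*26463774*100***11:B:1*Y*A*Y*I~
REF*D9*17312345600006351~
HI*BK:0340*BF:V7389~"""

segments = split_segments(raw)

# The multi-line and single-line presentations parse identically.
assert segments == split_segments(raw.replace("\n", ""))

print(segments[1])  # ['REF', 'D9', '17312345600006351']
```

Note that empty elements (the `***` run in `CLM`) are kept as empty strings, so element positions stay meaningful.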
Key concepts of the EDI format are explained below.
The EDI format is batch-oriented. EDI data is packaged in files; files can pass through multiple intermediaries, such as clearing houses.
A file can contain multiple types of data; the schema of the data is defined by its “transaction type”.
The transaction type is very similar to an XML or JSON schema; however, it does not provide a machine-readable representation akin to an “.xsd” that parsers can use automatically.
Healthcare-related transaction types are buried within the [Insurance Transaction Set](https://x12.org/products/transaction-sets).
This transaction set contains schemas for claims (institutional, professional, dental) and payments.
Each transaction type has a unique ID or name. “837” transaction types refer to claim data, “835” to payments.
The transaction header (`ST` segment) stipulates the transaction type:
ST*837*12345*005010X222~
The file content following this line must conform to the schema identified by this header, here `837` and `005010X222`, which refers to professional claims.
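As a rough illustration, a parser could read the transaction type from the `ST` segment before deciding how to interpret the rest of the file. The sketch below is an assumption-laden example: the class and field names are ours, and we label the middle element as a control number.

```python
from typing import NamedTuple


class TransactionHeader(NamedTuple):
    # Field names are illustrative, not taken from a library.
    transaction_type: str  # e.g. "837"
    control_number: str    # e.g. "12345"
    version: str           # e.g. "005010X222"


def parse_st(segment):
    """Parse a single ST segment such as 'ST*837*12345*005010X222~'."""
    elements = segment.strip().rstrip("~").split("*")
    if elements[0] != "ST":
        raise ValueError(f"expected an ST segment, got {elements[0]!r}")
    # Pad so that a header without a version element still parses.
    elements += [""] * (4 - len(elements))
    return TransactionHeader(elements[1], elements[2], elements[3])


header = parse_st("ST*837*12345*005010X222~")
print(header.transaction_type, header.version)  # 837 005010X222
```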
A loop is analogous to a JSON object: it contains data elements that belong to the same logical entity.
Like a JSON object, a loop can also contain nested loops (objects).
A loop can repeat multiple times. Unlike JSON, EDI does not have a special notation for arrays (“[]” in JSON).
Loops consist of segments.
The fragment below represents a single loop, the submitter loop:
NM1*41*2*PREMIER BILLING SERVICE*****46*TGJ23~
PER*IC*JERRY*TE*3055552222*EX*231~
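Because loops have no start or end markers in the file, a parser has to infer them from the segments it sees. One simple approach, sketched below, is to start a new loop instance whenever a known leading segment appears. This is a simplification: it assumes the loop always begins with that segment (here `LX` for service lines) and that everything up to the next `LX` belongs to it. The helper name `group_by_trigger` is ours.

```python
def group_by_trigger(segments, trigger_id):
    """Group a flat list of segments into repeated loop instances.

    A new group starts at every segment whose ID equals trigger_id;
    segments before the first trigger are ignored.
    """
    groups = []
    for segment in segments:
        if segment[0] == trigger_id:
            groups.append([segment])
        elif groups:
            groups[-1].append(segment)
    return groups


raw = (
    "LX*1~SV1*HC:99213*40*UN*1***1~DTP*472*D8*20061003~"
    "LX*2~SV1*HC:87070*15*UN*1***1~DTP*472*D8*20061003~"
)
segments = [s.split("*") for s in raw.split("~") if s]

service_lines = group_by_trigger(segments, "LX")
print(len(service_lines))                         # 2
print([seg[0] for seg in service_lines[0]])       # ['LX', 'SV1', 'DTP']
```

The result is effectively the "array of objects" that JSON would express with `[]`.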
A segment is a reusable group of elements, akin to a relational database table.
A segment always starts with a segment “ID” denoting its purpose and ends with “~”.
Unlike loops, segments may not contain nested segments.
Segments contain elements; `*` is used as a separator.
Many segments contain “qualifier” or “identifier code” elements. A qualifier specifies the role or purpose of a segment.
This is a single segment:
NM1*85*2*ABC Group Practice*****XX*1234567890~
The prefix `NM1` is the segment ID. The rest of the string consists of values (elements), ending with `~`.
`NM1` is “Individual or Organizational Name”.
`NM1` is used for any name in an EDI file: it could be a provider, a patient, a payer and so on.
`NM1` can contain up to 12 elements (like a database table containing 12 columns).
Some elements in a segment could be optional. For example, the last element in our example, `1234567890`, is a provider’s NPI. This will not apply to `NM1` used for patient names.
The first element `85` is “Entity Identifier Code”. Code `85` identifies a billing provider.
The segment ID is the only explicit piece of “metadata” in an EDI file. Loops and elements do not have their names stated anywhere in the file; they have to be gleaned from context.
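Since element names are not in the file, a parser has to supply them by position. The sketch below labels a few `NM1` elements using a hand-written position map. The map is our own partial illustration (names and positions as we understand the `NM1` layout), not an official or complete definition.

```python
# Position -> illustrative name. Partial and hand-written for this example.
NM1_ELEMENT_NAMES = {
    1: "entity_identifier_code",
    2: "entity_type_qualifier",
    3: "last_or_organization_name",
    8: "identification_code_qualifier",
    9: "identification_code",
}


def label_nm1(segment):
    """Return a dict of the non-empty NM1 elements we have names for."""
    elements = segment.strip().rstrip("~").split("*")
    if elements[0] != "NM1":
        raise ValueError(f"expected an NM1 segment, got {elements[0]!r}")
    return {
        name: elements[position]
        for position, name in NM1_ELEMENT_NAMES.items()
        if position < len(elements) and elements[position]
    }


provider = label_nm1("NM1*85*2*BEN KILDARE SERVICE*****XX*9876543210~")
patient = label_nm1("NM1*QC*1*SMITH*TED~")

print(provider["identification_code"])        # 9876543210
print("identification_code" in patient)       # False
```

The patient segment simply stops early, which is how optional trailing elements show up in practice.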
An element is similar to a column in a database table or to an Excel cell.
Elements support various primitive data types, such as string, numeric, and date.
Sometimes an element can repeat, but this is relatively rare. Think of repeating elements as an array of values.
EDI has a notion of a composite element (see below). Most elements, however, are simple value elements.
Let’s take the following segment:
DTP*454*D8*20050108~
We have the following elements:
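`454` is a “Date Time Qualifier”; “454” stands for “Initial Treatment”.
`D8` is the date format qualifier; it indicates a date in CCYYMMDD format.
`20050108` is the date itself: January 8, 2005.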
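A small sketch of turning this segment into a typed value is shown below. It assumes that `D8` means an eight-digit CCYYMMDD date and deliberately handles only that format; the function name `parse_dtp` is ours.

```python
from datetime import date, datetime


def parse_dtp(segment):
    """Parse a DTP segment into (qualifier, date). Handles the D8 format only."""
    seg_id, qualifier, date_format, value = segment.strip().rstrip("~").split("*")
    if seg_id != "DTP":
        raise ValueError(f"expected a DTP segment, got {seg_id!r}")
    if date_format != "D8":
        # Other formats exist; they are out of scope for this sketch.
        raise NotImplementedError(f"unsupported date format {date_format!r}")
    return qualifier, datetime.strptime(value, "%Y%m%d").date()


qualifier, treatment_date = parse_dtp("DTP*454*D8*20050108~")
print(qualifier, treatment_date.isoformat())  # 454 2005-01-08
```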
A composite element consists of multiple sub-elements separated by “:”.
It can repeat multiple times.
It is typically used for specifying healthcare codes, such as procedures and diagnoses.
The first sub-element is usually a “qualifier” defining the type of code.
Composite elements are conceptually similar to segments. The difference is that they can exist only within a segment. You can think of a composite element as a JSON object or array of objects within another JSON object (segment).
Example:
SV2*0300*HC:81099*73.42*UN*1~
The `SV2` segment defines an “Institutional Service Line”; it includes the charge amount (73.42) and the procedure code `HC:81099`.
The `:` tells us that this is a composite element. The first sub-element, `HC`, defines the type of the code set (HCPCS/CPT); the second, `81099`, states the actual procedure code.
The description for this code is “Unlisted urinalysis procedure”.
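In code, composite elements can be handled with a second split on `:`. The sketch below keeps simple elements as strings and turns composites into lists; it assumes the `*`, `:` and `~` delimiters used in this post, and the function name `split_composites` is ours.

```python
def split_composites(segment, element_separator="*", component_separator=":"):
    """Split a segment into elements; composite elements become lists."""
    elements = segment.strip().rstrip("~").split(element_separator)
    return [
        element.split(component_separator)
        if component_separator in element
        else element
        for element in elements
    ]


parsed = split_composites("SV2*0300*HC:81099*73.42*UN*1~")
print(parsed)  # ['SV2', '0300', ['HC', '81099'], '73.42', 'UN', '1']

code_type, procedure_code = parsed[2]
print(code_type, procedure_code)  # HC 81099
```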
Here is an example of EDI content with a single transaction (837P) containing a single claim.
An explanation is provided before each line. An identifier that starts with a number (e.g., 2000A) is a loop, followed by the segments contained within that loop.
We added a new line before each loop for readability.
ST TRANSACTION SET HEADER
ST*837*0021*005010X222~
BHT BEGINNING OF HIERARCHICAL TRANSACTION
BHT*0019*00*244579*20061015*1023*CH~
1000A SUBMITTER
NM1 SUBMITTER NAME
NM1*41*2*PREMIER BILLING SERVICE*****46*TGJ23~
PER SUBMITTER EDI CONTACT INFORMATION
PER*IC*JERRY*TE*3055552222*EX*231~
1000B RECEIVER
NM1 RECEIVER NAME
NM1*40*2*KEY INSURANCE COMPANY*****46*66783JJT~
2000A BILLING PROVIDER HL LOOP
HL - BILLING PROVIDER
HL*1**20*1~
PRV BILLING PROVIDER SPECIALTY INFORMATION
PRV*BI*PXC*203BF0100Y~
2010AA BILLING PROVIDER
NM1 BILLING PROVIDER NAME
NM1*85*2*BEN KILDARE SERVICE*****XX*9876543210~
N3 BILLING PROVIDER ADDRESS
N3*234 SEAWAY ST~
N4 BILLING PROVIDER LOCATION
N4*MIAMI*FL*33111~
REF - BILLING PROVIDER TAX IDENTIFICATION
REF*EI*587654321~
2010AB PAY-TO PROVIDER
NM1 PAY-TO PROVIDER NAME
NM1*87*2~
N3 PAY-TO PROVIDER ADDRESS
N3*2345 OCEAN BLVD~
N4 PAY-TO PROVIDER CITY
N4*MIAMI*FL*33111~
2000B SUBSCRIBER HL LOOP
HL - SUBSCRIBER
HL*2*1*22*1~
SBR SUBSCRIBER INFORMATION
SBR*P**2222-SJ******CI~
2010BA SUBSCRIBER
NM1 SUBSCRIBER NAME
NM1*IL*1*SMITH*JANE****MI*JS00111223333~
DMG SUBSCRIBER DEMOGRAPHIC INFORMATION
DMG*D8*19430501*F~
2010BB PAYER
NM1 PAYER NAME
NM1*PR*2*KEY INSURANCE COMPANY*****PI*999996666~
REF BILLING PROVIDER SECONDARY IDENTIFICATION
REF*G2*KA6663~
2000C PATIENT HL LOOP
HL - PATIENT
HL*3*2*23*0~
PAT PATIENT INFORMATION
PAT*19~
2010CA PATIENT
NM1 PATIENT NAME
NM1*QC*1*SMITH*TED~
N3 PATIENT ADDRESS
N3*236 N MAIN ST~
N4 PATIENT CITY/STATE/ZIP
N4*MIAMI*FL*33413~
DMG PATIENT DEMOGRAPHIC INFORMATION
DMG*D8*19730501*M~
2300 CLAIM
CLM CLAIM LEVEL INFORMATION
CLM*26463774*100***11:B:1*Y*A*Y*I~
REF CLAIM IDENTIFICATION NUMBER FOR CLEARING HOUSES
REF*D9*17312345600006351~
HI HEALTH CARE DIAGNOSIS CODES
HI*BK:0340*BF:V7389~
2400 SERVICE LINE
LX SERVICE LINE COUNTER
LX*1~
SV1 PROFESSIONAL SERVICE
SV1*HC:99213*40*UN*1***1~
DTP DATE - SERVICE DATE(S)
DTP*472*D8*20061003~
2400 SERVICE LINE
LX SERVICE LINE COUNTER
LX*2~
SV1 PROFESSIONAL SERVICE
SV1*HC:87070*15*UN*1***1~
DTP DATE - SERVICE DATE(S)
DTP*472*D8*20061003~
2400 SERVICE LINE
LX SERVICE LINE COUNTER
LX*3~
SV1 PROFESSIONAL SERVICE
SV1*HC:99214*35*UN*1***2~
DTP DATE - SERVICE DATE(S)
DTP*472*D8*20061010~
2400 SERVICE LINE
LX SERVICE LINE COUNTER
LX*4~
SV1 PROFESSIONAL SERVICE
SV1*HC:86663*10*UN*1***2~
DTP DATE - SERVICE DATE(S)
DTP*472*D8*20061010~
TRAILER
SE TRANSACTION SET TRAILER
SE*42*0021~
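The trailer gives a parser a cheap consistency check. In the example above, `SE*42*0021~` declares 42 segments (counting `ST` and `SE` themselves) and repeats the `0021` control number from the `ST` header. A minimal sketch of that check follows; it assumes the input holds exactly one transaction, and the function name `trailer_matches` is ours.

```python
def trailer_matches(raw):
    """Check the SE trailer against the ST header for a single transaction.

    Returns True when the declared segment count equals the actual count
    (ST and SE included) and the control numbers agree.
    """
    segments = [s.strip().split("*") for s in raw.split("~") if s.strip()]
    header, trailer = segments[0], segments[-1]
    if header[0] != "ST" or trailer[0] != "SE":
        raise ValueError("expected the transaction to start with ST and end with SE")
    declared_count = int(trailer[1])
    return declared_count == len(segments) and trailer[2] == header[2]


sample = """ST*837*0021*005010X222~
BHT*0019*00*244579*20061015*1023*CH~
SE*3*0021~"""

print(trailer_matches(sample))  # True
```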