BytePane

XML to JSON Converter: Transform XML Data to JSON

Data Formats · 15 min read

From XML's Dominance to JSON's Takeover — and Why You Still Need to Convert Between Them

1998

W3C publishes XML 1.0. It becomes the universal format for data interchange. SOAP web services, RSS feeds, WSDL service descriptions, SVG graphics, and enterprise middleware all run on XML. Every major platform — Java, .NET, SAP — builds XML processing at its core.

2001–06

JSON emerges. Douglas Crockford popularizes JSON as a lightweight alternative. json.org launches in 2002. RFC 4627 formalizes the format in 2006. Early adopters: Yahoo!, del.icio.us, Flickr.

2008–14

JSON displaces XML in new APIs. Twitter, GitHub, Stripe, Twilio all launch with JSON APIs. REST replaces SOAP as the dominant API style. Stack Overflow's developer surveys begin tracking JSON as the most common data format.

2026

XML remains entrenched in large swaths of the industry. Healthcare (HL7 FHIR XML, CDA documents), finance (FpML, XBRL financial reporting), publishing (DITA, DocBook), government (NIEM, GML geospatial data), and many enterprise systems built before 2010 still run on XML. RSS and Atom feeds are XML. SVG is XML. Maven's pom.xml is XML. Android layouts are XML.

The result: developers in 2026 regularly need to convert XML to JSON when reading a legacy SOAP API response, parsing an RSS feed, consuming an external partner's data in XML format, or migrating data from an enterprise system to a modern JSON API. The conversion is not always straightforward. XML has structural features that have no direct JSON equivalent, and the conversion decisions you make affect how downstream code consumes the data.

Key Takeaways

  • XML-to-JSON conversion always loses comments and processing instructions, and loses attributes and namespaces unless you adopt a convention to preserve them. JSON has no equivalent constructs for any of these.
  • The biggest conversion trap: with common default settings, an element that appears once under its parent becomes a single value, but the same element appearing two or more times becomes an array. Most libraries have an option to force arrays and avoid this inconsistency.
  • Use fast-xml-parser for Node.js (5–10× faster than xml2js), xmltodict for Python, and encoding/xml for Go.
  • For multi-GB XML files, never load the entire document into memory — use SAX-based streaming parsers (node-expat, saxes in Node.js; xml.sax in Python).
  • Always validate your JSON output against your intended schema — XML attributes and element values merge in unexpected ways depending on which library you use.

XML vs. JSON: The Structural Mismatch

XML and JSON were designed for different purposes. XML was designed for document markup: it descends from SGML, the same lineage as HTML, where mixed content (text interleaved with markup tags) is fundamental. JSON was designed for data interchange: pure data structures with no document semantics. This philosophical difference creates several structural mismatches:

| XML Feature | JSON Equivalent | Conversion Approach | Information Loss? |
|---|---|---|---|
| Attributes | None | Prefix with @ or $, or merge into element object | No (if preserved) |
| Namespaces | None | Keep prefix in key name, or strip prefixes entirely | Yes (if stripped) |
| Comments | None | Dropped; JSON has no comment syntax | Yes (always) |
| Processing instructions | None | Dropped; no JSON equivalent | Yes (always) |
| Mixed content | None (complex) | Special "#text" key for text nodes alongside element children | Structural change |
| CDATA sections | String | Delimiters stripped, content becomes string value | No |
| Document type (DOCTYPE) | None | Dropped; JSON has no schema reference syntax | Yes (schema info) |
| Element ordering | Arrays only | JSON object keys are unordered (per spec); only arrays keep order | Yes (order semantics) |
| Multiple same-name children | Array | Collected into array | No (if done correctly) |
| Text + attributes on same element | None | Special "#text" key + "@attr" keys | Structural change |

The Three Conversion Problems That Break Downstream Code

Problem 1: The One-Child-vs-Many-Children Array Inconsistency

This is the most common source of bugs in XML-to-JSON conversion. When an XML element has a single child of a given name, many libraries represent it as a single value (a string or an object). When it has multiple children of that name, they become an array. The result: your JSON structure changes based on the data, not the schema.

<!-- XML: one item vs many items -->
<cart>
  <item>Laptop</item>          <!-- single item -->
</cart>

<cart>
  <item>Laptop</item>          <!-- multiple items -->
  <item>Mouse</item>
</cart>

// Output from xml2js with explicitArray: false (fast-xml-parser and xmltodict
// behave the same way by default): the structure changes!
// Single item → string:
{ cart: { item: "Laptop" } }

// Multiple items → array:
{ cart: { item: ["Laptop", "Mouse"] } }

// Downstream code written against the single-item shape:
cart.item.toUpperCase()  // works while item is a string
                         // TypeError: cart.item.toUpperCase is not a function
                         // once item becomes an array

The fix: use the library's "force array" option, or normalize after conversion:

// xml2js: explicitArray: true (the library default) wraps every child in an array
const xml2js = require('xml2js')
const xml2jsParser = new xml2js.Parser({ explicitArray: true })

// Result is always an array, regardless of child count:
{ cart: { item: ["Laptop"] } }         // one item
{ cart: { item: ["Laptop", "Mouse"] } } // two items

// fast-xml-parser: isArray callback for targeted forcing
const { XMLParser } = require('fast-xml-parser')
const fxpParser = new XMLParser({
  isArray: (tagName) => ['item', 'product', 'order'].includes(tagName),
  ignoreAttributes: false,
  attributeNamePrefix: '@_',
})
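When the converter cannot be reconfigured, the "normalize after conversion" route can be sketched in a few lines of Python. This is a minimal illustration, not a library API: the ensure_list helper and the two sample dictionaries are hypothetical names and data chosen to mirror the cart example above.

```python
from typing import Any


def ensure_list(value: Any) -> list:
    """Wrap a converted value so callers can always iterate over it."""
    if value is None:
        return []        # element was absent
    if isinstance(value, list):
        return value     # already a list (two or more children)
    return [value]       # single child: wrap it


# Hypothetical converter outputs for a one-item and a two-item cart
single = {"cart": {"item": "Laptop"}}
many = {"cart": {"item": ["Laptop", "Mouse"]}}

for doc in (single, many):
    items = ensure_list(doc["cart"].get("item"))
    print([name.upper() for name in items])
```

Normalizing at the point of access keeps the fix local, but every consumer has to remember to call the helper, which is why forcing arrays in the parser is usually the sturdier option.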

Problem 2: Attributes vs. Element Values on the Same Element

XML elements can have both attributes and text content simultaneously. JSON objects cannot natively express this:

<!-- XML: element with attributes AND text content -->
<price currency="USD" vat="false">19.99</price>

<!-- The challenge: currency and vat are attributes; 19.99 is the element value.
     JSON has no "element + attributes" concept. -->

// Approach 1: @-prefix for attributes, #text for text content
{
  "price": {
    "@currency": "USD",
    "@vat": "false",
    "#text": "19.99"
  }
}

// Approach 2: Flat merge (loses attribute/element distinction)
{
  "price": {
    "currency": "USD",
    "vat": "false",
    "_value": "19.99"
  }
}

// Approach 3: Discard attributes (most lossy but simplest)
{ "price": "19.99" }

The @-prefix convention (used by BadgerFish, JAXB, and many libraries) is the most portable choice — it preserves information while being predictable for downstream consumers.
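To make Approach 1 concrete, here is a rough sketch of how a converter might apply the @-prefix and #text convention using only Python's standard library. The element_to_dict function is a hypothetical helper written for this illustration; it ignores namespaces and mixed content, and it deliberately keeps the default single-value-versus-array behavior described in Problem 1.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem: ET.Element):
    """Convert an element using '@' for attributes and '#text' for text."""
    node = {f"@{name}": value for name, value in elem.attrib.items()}
    for child in elem:
        converted = element_to_dict(child)
        if child.tag in node:
            # Repeated sibling name: collect the values into a list
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(converted)
        else:
            node[child.tag] = converted
    text = (elem.text or "").strip()
    if not node:
        return text          # leaf without attributes: plain string
    if text:
        node["#text"] = text
    return node


root = ET.fromstring('<price currency="USD" vat="false">19.99</price>')
print(json.dumps({root.tag: element_to_dict(root)}, indent=2))
```

Note that every value stays a string; deciding whether "19.99" should become a number is a separate policy choice that the convention itself does not settle.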

Problem 3: XML Namespaces

<!-- SOAP envelope with namespaces -->
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soap:Body>
    <GetOrderResponse xmlns="http://example.com/orders">
      <Order>
        <xsi:type>StandardOrder</xsi:type>
        <orderId>12345</orderId>
      </Order>
    </GetOrderResponse>
  </soap:Body>
</soap:Envelope>

// If you strip namespace prefixes:
{
  "Envelope": {
    "Body": {
      "GetOrderResponse": {
        "Order": { "type": "StandardOrder", "orderId": "12345" }
      }
    }
  }
}
// Problem: if another namespace uses a "type" element, it collides with xsi:type

// If you preserve namespace prefixes as keys:
{
  "soap:Envelope": {
    "soap:Body": {
      "GetOrderResponse": {
        "Order": { "xsi:type": "StandardOrder", "orderId": "12345" }
      }
    }
  }
}
// Problem: colon in JSON key requires bracket notation: obj["soap:Envelope"]
//          Most languages handle this, but it's awkward

For SOAP API integration, the most practical approach is to parse the XML with a namespace-aware library and explicitly map the elements you care about to a clean JSON structure — rather than relying on generic XML-to-JSON conversion.
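A namespace-aware parser sidesteps the collision because it identifies elements by namespace URI rather than by prefix. The sketch below uses Python's xml.etree.ElementTree, which reports tags in the form {uri}local; the split_tag helper and the two example.com namespaces are assumptions made up for the demonstration.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def split_tag(tag: str) -> Tuple[str, str]:
    """Split ElementTree's '{uri}local' tag into (namespace URI, local name)."""
    if tag.startswith("{"):
        uri, _, local = tag[1:].partition("}")
        return uri, local
    return "", tag  # element is in no namespace


xml_doc = """
<root xmlns:a="http://example.com/a" xmlns:b="http://example.com/b">
  <a:type>StandardOrder</a:type>
  <b:type>Express</b:type>
</root>
"""

root = ET.fromstring(xml_doc)
for child in root:
    uri, local = split_tag(child.tag)
    # Same local name, different namespace: the URI keeps them distinct
    print(f"{local!r} from {uri}")
```

Keying on the URI also survives a partner renaming a prefix, since prefixes are only local aliases for the URI.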

XML to JSON Conversion Code in Three Languages

Python: xmltodict

xmltodict is the most Pythonic XML-to-JSON library — it produces a dictionary that mirrors the XML structure using the @-prefix convention for attributes. Per its PyPI page, xmltodict has over 5 million monthly downloads.

# pip install xmltodict
import xmltodict
import json

def xml_to_json(xml_string: str, indent: int = 2) -> str:
    """Convert XML string to JSON string using xmltodict."""
    # force_list ensures elements always become lists, not dicts
    # when there's only one child — prevents the inconsistency bug
    force_list = ('item', 'product', 'order', 'entry', 'record')

    data = xmltodict.parse(
        xml_string,
        force_list=force_list,
        attr_prefix='@',    # attributes prefixed with @
        cdata_key='#text',  # text content stored under #text key
    )
    return json.dumps(data, indent=indent, ensure_ascii=False)

# Example: Parse an RSS feed
rss_xml = """
<rss version="2.0">
  <channel>
    <title>My Blog</title>
    <item>
      <title>First Post</title>
      <link>https://example.com/post/1</link>
      <pubDate>Mon, 21 Apr 2026 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second Post</title>
      <link>https://example.com/post/2</link>
      <pubDate>Tue, 22 Apr 2026 10:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>
"""

json_output = xml_to_json(rss_xml)
data = json.loads(json_output)

# Access feed items safely:
items = data['rss']['channel']['item']  # always a list (due to force_list)
for item in items:
    print(item['title'], item['link'])

# Convert from file:
with open('feed.xml', 'rb') as f:
    data = xmltodict.parse(f)
json_output = json.dumps(data, indent=2)

# Reverse: JSON dict back to XML
xml_output = xmltodict.unparse(data, pretty=True, indent='  ')

For SOAP APIs, use xml.etree.ElementTree from the standard library — it gives you explicit namespace handling:

import xml.etree.ElementTree as ET

def parse_soap_order(xml_string: str) -> dict:
    """Parse a SOAP GetOrderResponse without generic conversion."""
    ns = {
        'soap': 'http://schemas.xmlsoap.org/soap/envelope/',
        'orders': 'http://example.com/orders'
    }
    root = ET.fromstring(xml_string)

    # Navigate with explicit namespace awareness. One path lookup avoids
    # chained .find() calls that raise AttributeError on a missing node.
    order = root.find('soap:Body/orders:GetOrderResponse/orders:Order', ns)
    if order is None:
        raise ValueError('SOAP response contains no Order element')

    # status and amount are example fields not present in the sample
    # envelope above; findtext returns None (or the default) when absent
    return {
        'orderId': order.findtext('orders:orderId', namespaces=ns),
        'status': order.findtext('orders:status', namespaces=ns),
        'amount': float(order.findtext('orders:amount', namespaces=ns, default='0')),
    }
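For XML files too large to load at once, a streaming parse keeps memory use flat. The following is a minimal sketch using the standard library's xml.sax; the ItemHandler class, the assumption that each <item> contains only flat text children, and the sample data are all illustrative choices rather than a general-purpose converter.

```python
import json
import xml.sax


class ItemHandler(xml.sax.ContentHandler):
    """Collect each <item> as a small dict and hand it to a callback."""

    def __init__(self, on_item):
        super().__init__()
        self.on_item = on_item
        self.current = None   # dict for the <item> being read, if any
        self.field = None     # name of the child element being read
        self.buffer = []      # text chunks for the current child

    def startElement(self, name, attrs):
        if name == "item":
            self.current = {}
        elif self.current is not None:
            self.field, self.buffer = name, []

    def characters(self, content):
        if self.field is not None:
            self.buffer.append(content)  # text may arrive in several chunks

    def endElement(self, name):
        if name == "item":
            self.on_item(self.current)   # emit the finished record
            self.current = None
        elif self.current is not None and name == self.field:
            self.current[name] = "".join(self.buffer).strip()
            self.field = None


sample = b"<feed><item><title>A</title></item><item><title>B</title></item></feed>"
records = []
xml.sax.parseString(sample, ItemHandler(records.append))
print(json.dumps(records))
```

For a file on disk, xml.sax.parse accepts a filename or file object in place of the byte string, and the callback can write each record out as a JSON line instead of appending to a list.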

Node.js: fast-xml-parser

fast-xml-parser benchmarks 5–10× faster than the older xml2js library for large documents and has no external dependencies. It is the recommended choice for new Node.js projects. According to its npm page, it sees over 90 million downloads per month as of 2025.

// npm install fast-xml-parser
import { XMLParser, XMLBuilder, XMLValidator } from 'fast-xml-parser'

// Validate before parsing
const isValid = XMLValidator.validate(xmlString)
if (isValid !== true) {
  throw new Error(`Invalid XML: ${isValid.err.msg} at line ${isValid.err.line}`)
}

const parser = new XMLParser({
  ignoreAttributes: false,        // preserve attributes
  attributeNamePrefix: '@_',      // prefix attributes with @_
  allowBooleanAttributes: true,
  parseAttributeValue: true,      // parse "123" attribute values as numbers
  parseTagValue: true,            // parse element text as numbers/booleans where appropriate

  // Force specific tags to always be arrays — critical for stability
  isArray: (tagName: string) =>
    ['item', 'product', 'entry', 'record', 'row'].includes(tagName),
})

const result = parser.parse(xmlString)
const json = JSON.stringify(result, null, 2)

// xml2js (older, slower, but widely used in legacy code):
import xml2js from 'xml2js'
const parser2 = new xml2js.Parser({
  explicitArray: true,   // always arrays — avoids the one-child bug
  mergeAttrs: false,     // keep attributes separate from element values
  attrkey: '@',          // key under which attributes are grouped (default: '$')
  charkey: '#text',      // text content key
  explicitCharkey: true, // always include #text even when no attributes
})
const result2 = await parser2.parseStringPromise(xmlString)

// Streaming for large XML files (SAX-based):
import { createReadStream } from 'fs'
import sax from 'sax'

// createStream returns a writable stream; a bare SAXParser cannot be piped into
const saxStream = sax.createStream(true, {})  // strict mode

saxStream.on('opentag', (node) => { /* handle element start */ })
saxStream.on('text', (text) => { /* handle text content */ })
saxStream.on('closetag', (name) => { /* handle element end */ })
saxStream.on('error', (err) => { /* handle malformed XML */ })

createReadStream('large-data.xml').pipe(saxStream)

Go: encoding/xml

Go's standard library encoding/xml provides the most control — you define Go structs that exactly match the XML structure, ensuring type-safe conversion. This is more code but produces the cleanest, most predictable JSON output:

package main

import (
    "encoding/json"
    "encoding/xml"
    "fmt"
    "strings"
)

// Define structs matching the XML structure
type RSSFeed struct {
    XMLName xml.Name   `xml:"rss" json:"-"`
    Version string     `xml:"version,attr" json:"version"`
    Channel Channel    `xml:"channel" json:"channel"`
}

type Channel struct {
    Title       string   `xml:"title" json:"title"`
    Link        string   `xml:"link" json:"link"`
    Description string   `xml:"description" json:"description"`
    Items       []Item   `xml:"item" json:"items"`
}

type Item struct {
    Title       string   `xml:"title" json:"title"`
    Link        string   `xml:"link" json:"link"`
    Description string   `xml:"description" json:"description"`
    PubDate     string   `xml:"pubDate" json:"pubDate"`
    GUID        string   `xml:"guid" json:"guid"`
}

func ConvertRSSToJSON(xmlData string) (string, error) {
    var feed RSSFeed
    if err := xml.NewDecoder(strings.NewReader(xmlData)).Decode(&feed); err != nil {
        return "", fmt.Errorf("xml decode: %w", err)
    }

    jsonData, err := json.MarshalIndent(feed, "", "  ")
    if err != nil {
        return "", fmt.Errorf("json marshal: %w", err)
    }
    return string(jsonData), nil
}

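// Generic approach for XML whose structure is not known in advance.
// Generic XML-to-map conversion in Go requires a recursive decoder built
// on xml.Decoder.Token, or a tree library such as etree or xmlquery:
// go get github.com/beevik/etree
// go get github.com/antchfx/xmlquery
func GenericXMLToJSON(xmlData string) (map[string]interface{}, error) {
    // Placeholder: return an explicit error rather than a silent nil result
    return nil, fmt.Errorf("generic XML-to-map conversion not implemented")
}

// main is required for package main to build; it runs the struct-based path
func main() {
    out, err := ConvertRSSToJSON(`<rss version="2.0"><channel><title>My Blog</title></channel></rss>`)
    if err != nil {
        fmt.Println("error:", err)
        return
    }
    fmt.Println(out)
}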

Online XML to JSON Converter Tools Compared

For one-off conversions or testing, online tools are faster than writing code. Here's an honest comparison of the main options based on features, not ranking:

| Tool | Strengths | Weaknesses | Privacy |
|---|---|---|---|
| FreeFormatter.com | Widely known, many output options, handles large inputs | Sends data to server; ad-heavy UI | Server-processed |
| ConvertSimple.com | Clean UI, fast, handles CDATA and attributes | Limited error messages on malformed XML | Server-processed |
| CodeBeautify.org | Many format tools in one place, batch conversion | Cluttered UI, slow on large files | Server-processed |
| BytePane XML to JSON | In-browser processing (no upload), fast, clean output with attribute support | Less configurable than CLI tools; no namespace handling | Client-side only |
| transform.tools | Open source, many format conversions, handles namespaces | Occasional inconsistencies with edge cases | Client-side |

For sensitive data (internal API responses, PII-containing XML), prefer client-side tools or local CLI conversion. Server-processed tools upload your data to a third-party server.

After converting XML to JSON, use the BytePane JSON formatter to validate the JSON syntax and explore the structure. For a related conversion, the YAML to JSON guide covers similar conversion patterns with YAML-specific edge cases.

Real-World XML to JSON Use Cases

RSS and Atom Feed Aggregation

RSS (Really Simple Syndication, now version 2.0) and Atom are XML formats. According to the W3C Validator service, RSS 2.0 remains the most widely deployed feed format in 2025, with millions of active feeds. Aggregators, podcast apps, and content pipeline tools all convert RSS XML to JSON for storage and APIs.

# Fetch and convert an RSS feed to JSON
import feedparser  # pip install feedparser
import json

# feedparser handles both RSS and Atom and normalizes the output
feed = feedparser.parse('https://example.com/feed.xml')

# Entries behave like dicts, but feeds often omit fields, so use .get()
# and copy only the JSON-serializable fields you need
articles = [
    {
        'title': entry.get('title'),
        'link': entry.get('link'),
        'published': entry.get('published'),
        'summary': entry.get('summary'),
        'author': entry.get('author'),
    }
    for entry in feed.entries
]

print(json.dumps(articles, indent=2))

SOAP to REST API Migration

Enterprise integration projects frequently involve wrapping a SOAP (XML) service with a REST (JSON) facade so that modern frontend applications can consume legacy backend services. According to the MuleSoft Connectivity Benchmark 2025, 72% of enterprises still maintain at least one SOAP-based integration. A typical facade pattern:

// Express REST endpoint wrapping a SOAP service
import express from 'express'
import axios from 'axios'
import { XMLParser } from 'fast-xml-parser'

const router = express.Router()
const xmlParser = new XMLParser({
  ignoreAttributes: false,
  attributeNamePrefix: '@_',
  // GetOrder returns a single Order, so only the repeating LineItem is forced to an array
  isArray: (tagName) => tagName === 'LineItem',
})

// Escape user input before interpolating it into the XML request body
const XML_ESCAPES: Record<string, string> = {
  '<': '&lt;', '>': '&gt;', '&': '&amp;', "'": '&apos;', '"': '&quot;',
}
const escapeXml = (value: string) => value.replace(/[<>&'"]/g, (c) => XML_ESCAPES[c])

router.get('/orders/:id', async (req, res) => {
  // The XML declaration must start the document: no leading whitespace
  const soapEnvelope = `<?xml version="1.0" encoding="UTF-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetOrder xmlns="http://example.com/orders">
      <OrderId>${escapeXml(req.params.id)}</OrderId>
    </GetOrder>
  </soap:Body>
</soap:Envelope>`

  let soapResponse
  try {
    soapResponse = await axios.post(process.env.SOAP_ENDPOINT!, soapEnvelope, {
      headers: { 'Content-Type': 'text/xml; charset=utf-8', 'SOAPAction': 'GetOrder' },
    })
  } catch (err) {
    // Network errors and SOAP faults (non-2xx) land here
    return res.status(502).json({ error: 'Upstream SOAP service failed' })
  }

  const parsed = xmlParser.parse(soapResponse.data)

  // Navigate to the response body, stripping SOAP envelope
  const order = parsed?.['soap:Envelope']?.['soap:Body']?.GetOrderResponse?.Order

  if (!order) {
    return res.status(404).json({ error: 'Order not found' })
  }

  // Return clean JSON, not raw XML structure
  res.json({
    id: order.OrderId,
    status: order.Status,
    total: parseFloat(order.Total),
    createdAt: order.CreatedDate,
    items: (order.LineItem || []).map((item: any) => ({
      sku: item.SKU,
      quantity: Number.parseInt(item.Quantity, 10),
      unitPrice: parseFloat(item.UnitPrice),
    }))
  })
})

Maven Build Configuration Parsing

Maven's pom.xml files are XML. Build analysis tools, dependency audit systems, and monorepo management tools often convert pom.xml to JSON for programmatic inspection. Maven Central hosts over 400,000 published artifacts, each described by an XML descriptor.
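As an illustration of that kind of inspection, the sketch below pulls the direct dependencies out of a pom.xml and emits them as JSON using only Python's standard library. The pom_dependencies helper and the sample POM are hypothetical; the one fixed fact it relies on is the POM 4.0.0 namespace URI, and it makes no attempt to resolve versions inherited from a parent POM or from dependencyManagement.

```python
import json
import xml.etree.ElementTree as ET

# Namespace declared by standard Maven POM files
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def pom_dependencies(pom_xml: str) -> list:
    """Return the direct <dependency> entries of a POM as plain dicts."""
    root = ET.fromstring(pom_xml)
    deps = []
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        deps.append({
            "groupId": dep.findtext("m:groupId", namespaces=POM_NS),
            "artifactId": dep.findtext("m:artifactId", namespaces=POM_NS),
            # version is None when it is managed elsewhere
            "version": dep.findtext("m:version", namespaces=POM_NS),
        })
    return deps


sample_pom = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <dependencies>
    <dependency>
      <groupId>org.example</groupId>
      <artifactId>demo-lib</artifactId>
      <version>1.2.3</version>
    </dependency>
  </dependencies>
</project>"""

print(json.dumps(pom_dependencies(sample_pom), indent=2))
```

Mapping named fields explicitly, rather than converting the whole document, keeps the JSON shape stable even when a POM adds sections the tool does not care about.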

When NOT to Convert XML to JSON

Not every XML document should be converted to JSON. Some XML is genuinely document-centric and loses critical information in conversion:

  • SVG files — SVG is XML, but converting SVG to JSON destroys the document. Browsers render SVG natively; embed it as-is.
  • DocBook and DITA documentation — technical documentation formats where element ordering, mixed content, and semantic markup are all meaningful. Converting to JSON loses document structure.
  • XHTML — HTML content in XML format. Use a DOM parser or HTML-specific tools, not generic XML-to-JSON conversion.
  • XML with complex namespaces — When namespace URIs carry semantic meaning (XBRL financial data, HL7 CDA clinical documents), generic conversion loses the namespace information required to interpret the data correctly.
  • XSD-validated enterprise XML — If the consuming system can accept XML natively, don't convert. Conversion introduces a transformation layer that can go wrong.
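The mixed-content problem behind several of these cases is easy to see in a short experiment. The sketch below uses Python's xml.etree.ElementTree on a made-up paragraph; the flattened dictionary only mimics how a key-value conversion might pool text and group children, and is not the output of any particular library.

```python
import xml.etree.ElementTree as ET

para = ET.fromstring("<p>Press <kbd>Ctrl</kbd> and <kbd>C</kbd> to copy.</p>")

# Document order is recoverable only by walking text and tail together
ordered = [para.text] + [
    part for kbd in para for part in (f"[{kbd.text}]", kbd.tail)
]
print("".join(ordered))

# A key-value shape groups children by name and pools the loose text,
# so the position of each <kbd> within the sentence is gone
flattened = {
    "kbd": [kbd.text for kbd in para],
    "#text": (para.text or "") + "".join(kbd.tail or "" for kbd in para),
}
print(flattened)
```

Once the text and the child elements are stored under separate keys, nothing in the JSON says where each child sat in the sentence, which is why document-centric formats are better left as XML.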

For a broader comparison of data formats including when XML, JSON, and YAML each make sense, the JSON vs YAML vs XML comparison covers the trade-offs in detail. If you need to validate the XML before converting, use the BytePane XML formatter to check syntax first.

Frequently Asked Questions

Can all XML be converted to JSON?

Not without loss. XML features with no JSON equivalent — attributes (handled via @-prefix conventions), namespaces (kept as key prefixes or stripped), comments (dropped), processing instructions (dropped), and mixed content (element text interleaved with child elements) — require lossy conversion decisions. Simple data-oriented XML converts cleanly; document-centric XML does not.

How are XML attributes converted to JSON?

No single standard exists. Common approaches: prefix with @ (most widely adopted — JAXB, BadgerFish, fast-xml-parser's attributeNamePrefix option), merge into the element object (flat), or create a special $ key for attributes. The @-prefix is recommended — it preserves information and is predictable for downstream consumers.

Why do APIs still use XML in 2026?

Enterprise systems built on SOAP and legacy middleware remain in production. Healthcare (HL7, CDA), finance (FpML, XBRL), government (NIEM), and publishing (DITA) standardized on XML before JSON existed. Per the MuleSoft Connectivity Benchmark 2025, 72% of enterprises maintain at least one SOAP integration. These ecosystems move slowly.

What is the fastest XML to JSON converter for Node.js?

fast-xml-parser benchmarks 5–10× faster than xml2js for large documents, processes ~40MB/s on typical XML, and has zero external dependencies. For multi-GB files, use a SAX-based streaming approach with node-expat or saxes to avoid loading the full document into memory. For most production workloads under 10MB, any maintained library works adequately.

What is CDATA in XML and how does it convert to JSON?

CDATA sections (written as <![CDATA[...]]>) mark text that should not be parsed as XML markup — useful for embedding HTML, JavaScript, or SQL inside XML. When converting to JSON, the CDATA delimiters are stripped and the content becomes a plain JSON string. No information is lost; only the CDATA wrapping disappears.
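A quick way to confirm this behavior is to parse a CDATA section and serialize the result. The sketch below uses Python's standard library; the snippet element and its content are invented for the example.

```python
import json
import xml.etree.ElementTree as ET

xml_doc = "<snippet><![CDATA[<b>bold</b> & more]]></snippet>"
root = ET.fromstring(xml_doc)

# The parser hands back the raw characters; the CDATA wrapper is gone
print(json.dumps({root.tag: root.text}))
```

The angle brackets and ampersand survive as ordinary string characters, so downstream code sees the embedded markup exactly as it was written.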

How do XML namespaces affect JSON conversion?

XML namespaces have no JSON equivalent. Converters either strip namespace prefixes (losing info when same local names exist across namespaces) or keep prefixes as part of key names ("soap:Body" becomes a JSON key "soap:Body"). For complex namespace usage like SOAP or XBRL, use a namespace-aware parser and explicitly map the fields you need rather than relying on generic conversion.

Convert XML to JSON Instantly

Paste your XML and get clean JSON output in your browser: no server uploads, and everything is processed client-side. Handles attributes, CDATA, and nested elements.