XML.com: Character Encodings within XML and Perl

Written By
Web Webster
Apr 30, 2000

“This article examines the handling of character encodings in
XML and Perl. I will look at what character encodings are and
what their relationship to XML is. We will then move on to how
encodings are handled in Perl, and end with some practical examples
of translating between encodings.”

“Encodings! The hidden face of XML. For most people, at least
here in the US, XML is simply a data format that specifies elements
and attributes, and how to write them properly in a nice tree
structure.”

“But the truth is that, in order to encode text or data, you
first need to specify an encoding for it. The most common of all
encodings (at least in Western countries) is without a doubt ASCII.
Other encodings you may have come across include the following:
EBCDIC, which will remind some of you of the good old days when
computer and IBM meant the same thing; Shift-JIS, one of the
encodings used for Japanese characters; and Big 5, a Chinese
encoding.”

“What all of these encodings have in common is that they are
largely incompatible. There are very good reasons for this, the
first being that Western languages can live with 256 characters,
encoded in 8 bits, while Eastern languages use many more, thus
requiring multi-byte encodings. Recently, a new standard was
created to replace all of those various encodings: Unicode, a.k.a.
ISO 10646. (Actually they are two different standards — ISO 10646
from ISO and Unicode from the Unicode consortium — but they are so
close that we can consider them equivalent for most purposes.)”
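
The incompatibility described in the excerpt can be seen with a few lines of code. The following is a minimal sketch in Python rather than Perl, using only the codecs shipped in the Python standard library; the helper name `encode_or_none` and the sample strings are illustrative assumptions, not taken from the XML.com article.

```python
def encode_or_none(text, encoding):
    """Return the encoded bytes, or None if the encoding cannot represent the text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


japanese = "日本語"  # illustrative sample text, three characters

# One byte per character is enough for plain ASCII text...
ascii_bytes = encode_or_none("XML", "ascii")
# ...but ASCII and Latin-1 cannot represent Japanese characters at all.
ascii_fail = encode_or_none(japanese, "ascii")
latin1_fail = encode_or_none(japanese, "latin-1")

# Multi-byte encodings can, but they disagree on the byte sequence.
sjis_bytes = encode_or_none(japanese, "shift_jis")  # 2 bytes per character here
utf8_bytes = encode_or_none(japanese, "utf-8")      # 3 bytes per character here

# EBCDIC (code page 037) and ASCII disagree even on a plain letter.
ebcdic_a = "A".encode("cp037")
ascii_a = "A".encode("ascii")

# Reading Shift-JIS bytes as if they were Big 5 does not give the text back.
misread = sjis_bytes.decode("big5", errors="replace")

if __name__ == "__main__":
    print(len(ascii_bytes), len(sjis_bytes), len(utf8_bytes))
    print(ebcdic_a, ascii_a)
    print(misread == japanese)
```

The same three characters take a different number of bytes in each encoding, and bytes written under one encoding are not recovered when read under another. That is the practical reason an XML document needs to declare which encoding it uses.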


