Standard Generalized Markup Language

Standard Generalized Markup Language (SGML)
– SGML is an ISO standard: ISO 8879:1986, Information processing – Text and office systems – Standard Generalized Markup Language (SGML).
– There are three versions of SGML: Original SGML, SGML (ENR), and SGML (ENR+WWW or WebSGML).
– SGML is part of a trio of enabling ISO standards for electronic documents developed by ISO/IEC JTC 1/SC 34.
– DSSSL (ISO/IEC 10179) and HyTime are the other two standards in that trio.
– SGML was reworked in 1998 into XML, a successful profile of SGML.

History and Terminology
– SGML descended from IBM’s Generalized Markup Language (GML).
– Charles Goldfarb, Edward Mosher, and Raymond Lorie developed GML in the 1960s.
– Goldfarb coined the term GML from the initials of their surnames.
– Goldfarb also wrote the definitive work on SGML syntax in The SGML Handbook.
– SGML was originally designed to enable the sharing of machine-readable large-project documents in government, law, and industry.
– Tag-validity was introduced in SGML (ENR+WWW) to support XML.
– Fully tagged refers to documents with no DOCTYPE declaration or with a DOCTYPE declaration that makes no XML Infoset contributions.
– Integrally stored reflects the XML requirement that elements end in the same entity in which they started.
– Reference-free reflects the HTML requirement that entity references are for special characters and do not contain markup.
– SGML validity commentary before 1997 mainly covers type-validity.

Document Validity and Syntax
– SGML (ENR+WWW) defines two kinds of validity: type-valid SGML document and tag-valid SGML document.
– A type-valid SGML document has an associated document type declaration and conforms to the document type definition (DTD) that the declaration specifies.
– A tag-valid SGML document is fully tagged, and it may or may not have a document type declaration.
– Users may enforce additional constraints on a document, such as integrally-stored or reference-free requirements.
– SGML validity supports the requirement for rigorous markup.
– An SGML document consists of the SGML Declaration, the Prologue (containing a DOCTYPE declaration), and the instance itself.
– SGML documents can be composed of many entities.
– The SGML Declaration specifies the character sets, features, delimiter sets, and keywords that make up the document's concrete syntax; the entities and element types used in the document may be specified with a DTD.
– XML documents have both a logical and a physical structure, indicated by explicit markup.
– SGML syntax has optional features that can be enabled in the SGML Declaration.
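
The "fully tagged" condition behind tag-validity can be sketched in a few lines of Python. This is a toy check, not an SGML parser: the `TAG` pattern and the `is_fully_tagged` helper are hypothetical names, and the sketch assumes attribute-free tags and ignores declarations, entities, and every other SGML feature.

```python
import re

# Hypothetical toy pattern: attribute-free start tags and end tags only.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9.-]*)>")

def is_fully_tagged(instance: str) -> bool:
    """Return True if every start tag has an explicit, properly nested end tag."""
    open_elements = []
    for match in TAG.finditer(instance):
        is_end_tag = match.group(1) == "/"
        name = match.group(2).lower()
        if not is_end_tag:
            open_elements.append(name)
        elif not open_elements or open_elements.pop() != name:
            return False  # end tag that does not close the innermost open element
    return not open_elements  # anything still open is missing its end tag

fully_tagged = "<memo><to>Pat</to><body>Ship it.</body></memo>"
minimized = "<memo><to>Pat<body>Ship it.</memo>"

print(is_fully_tagged(fully_tagged))  # True
print(is_fully_tagged(minimized))     # False
```

The second sample omits the end tags for `to` and `body`. A type-valid SGML document could still accept it if its DTD and SGML Declaration permit that minimization, which is why the two kinds of validity are defined separately.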

Markup Minimization and Formal Characterization
– SGML has features for reducing the number of characters required to mark up a document.
– SGML processors need not support every available feature.
– XML is intolerant of syntax omissions and does not require a DTD for checking well-formedness.
– Omitting start and end tags is allowed in SGML if certain conditions are met.
– The OMITTAG feature in the SGML Declaration enables the omission of tags.
– SGML has features that are difficult to describe using formal automata theory.
– There is no definitive classification of full SGML against a known class of formal grammar.
– Non-validated XML is described as generally parsable like a two-level grammar.
– The SGML productions in the ISO standard are reported to be LL(3) or LL(4).
– The class of documents conforming to a given SGML document grammar forms an LL(1) language.
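
The idea behind end-tag omission can be illustrated with a small Python sketch in which a parser uses content rules to infer the end tags an author left out. Everything here is an assumption made for illustration: `ALLOWED_CHILDREN` is a hypothetical stand-in for a DTD, and `imply_end_tags` and `serialize` are hypothetical helpers. Real SGML also requires OMITTAG to be enabled, the element declaration to mark the tag as omissible, and the omission to be unambiguous; start-tag omission is not covered.

```python
# Hypothetical content rules standing in for a DTD: for each element type,
# the child element types it may contain directly.
ALLOWED_CHILDREN = {
    "list": {"item"},
    "item": {"para"},
    "para": set(),
}

def imply_end_tags(tokens):
    """Insert omitted end tags into a stream of (kind, value) tokens."""
    open_elements, output = [], []
    for kind, value in tokens:
        if kind == "start":
            # Close open elements until one of them may contain the new element.
            while open_elements and value not in ALLOWED_CHILDREN[open_elements[-1]]:
                output.append(("end", open_elements.pop()))
            open_elements.append(value)
        elif kind == "end":
            # Close anything still open inside the element being ended.
            while open_elements and open_elements[-1] != value:
                output.append(("end", open_elements.pop()))
            if open_elements:
                open_elements.pop()
        output.append((kind, value))
    # The end of the document implies end tags for everything still open.
    output.extend(("end", name) for name in reversed(open_elements))
    return output

def serialize(tokens):
    """Turn the token stream back into markup text."""
    forms = {"start": "<{}>", "end": "</{}>", "text": "{}"}
    return "".join(forms[kind].format(value) for kind, value in tokens)

# Token stream for: <list><item><para>One<item><para>Two</list>
minimized = [
    ("start", "list"), ("start", "item"), ("start", "para"), ("text", "One"),
    ("start", "item"), ("start", "para"), ("text", "Two"), ("end", "list"),
]
print(serialize(imply_end_tags(minimized)))
# <list><item><para>One</para></item><item><para>Two</para></item></list>
```

The second `item` start tag cannot appear inside `para` or `item` under these rules, so the sketch closes both before opening it. This dependence on the grammar is what XML gives up by requiring every tag to be explicit.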

Derivatives and Applications
– XML is a profile (subset) of SGML designed to ease implementation.
– XML does not use the grammar (DTD) to change delimiter maps or inform parse modes.
– XML validation of elements is not active in the same sense as SGML validation.
– XML without a DTD is a grammar or a language.
– XML with a DTD is a metalanguage.
– There are other derivatives of SGML, such as HTML and XHTML.
– HTML is an application of SGML and has its own set of rules and syntax.
– XHTML is an XML-based version of HTML.
– XML-based derivatives provide stricter syntax rules and well-formedness requirements.
– Derivatives like HTML and XHTML are simpler than SGML and serve narrower, more specific use cases.
– Document markup languages defined using SGML are called applications.
– The Text Encoding Initiative (TEI), DocBook, CALS, and HyTime are examples of SGML-based markup languages.
– Significant open-source implementations of SGML include ASP-SGML, ARC-SGML, SGMLS, and Project YAO.
– SP and Jade, maintained by the OpenJade project, are common parts of Linux distributions.
– The second edition of the Oxford English Dictionary is marked up with an SGML-based markup language.
– The third edition of the Oxford English Dictionary is marked up as XML.
– Some document markup languages related to SGML and XML cannot be processed using standard SGML and XML tools.
– Examples include the Z Format markup language and XML-like syntax embedded in programming languages such as Scala.
– Related topics include the Organization for the Advancement of Structured Information Standards (OASIS), S-expressions, DSSSL, and LaTeX.
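
XML's stricter well-formedness rules, noted above, can be observed with Python's standard `xml.etree.ElementTree` module, which checks well-formedness without needing a DTD. The `is_well_formed` helper is a hypothetical name used for this sketch only.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the text parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

# Every element is explicitly closed: well-formed, no DTD required.
print(is_well_formed("<list><item>One</item><item>Two</item></list>"))  # True

# End tags for <item> are omitted, as SGML can allow: rejected by an XML parser.
print(is_well_formed("<list><item>One<item>Two</list>"))  # False
```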

The Standard Generalized Markup Language (SGML; ISO 8879:1986) is a standard for defining generalized markup languages for documents. ISO 8879 Annex A.1 states that generalized markup is "based on two postulates":

  • Declarative: Markup should describe a document's structure and other attributes rather than specify the processing that needs to be performed, because it is less likely to conflict with future developments.
  • Rigorous: Markup should be rigorous so that the techniques available for processing rigorously defined objects like programs and databases can also be used for processing documents.
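
The declarative postulate can be illustrated with a short Python sketch in which one marked-up document is processed in two different ways. The markup only names the parts of the document; each processor decides what to do with them. The element names, the `MEMO` sample, and both render functions are assumptions made for illustration, and an XML parser from the standard library stands in for an SGML one.

```python
import xml.etree.ElementTree as ET

# Declarative markup: it says what each part is, not how to present it.
MEMO = "<memo><title>Status</title><para>Build is green.</para></memo>"

def render_plain(root):
    """One processor: plain text, with titles shown in upper case."""
    lines = []
    for child in root:
        text = child.text or ""
        lines.append(text.upper() if child.tag == "title" else text)
    return "\n".join(lines)

def render_html(root):
    """Another processor: map the same structure onto HTML elements."""
    html_tags = {"title": "h1", "para": "p"}
    return "".join(
        f"<{html_tags[child.tag]}>{child.text}</{html_tags[child.tag]}>"
        for child in root
    )

root = ET.fromstring(MEMO)
print(render_plain(root))
print(render_html(root))  # <h1>Status</h1><p>Build is green.</p>
```

Neither renderer required a change to the document, which is the sense in which descriptive markup is less likely to conflict with future processing needs.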

Standard Generalized Markup Language
Internet media type: application/sgml, text/sgml
Uniform Type Identifier (UTI): public.xml
Developed by: ISO
Type of format: Markup language
Extended from: GML
Extended to: HTML, XML
Standard: ISO 8879

DocBook SGML and LinuxDoc are examples of markup languages that were processed with SGML tools.
