This document gives a short description of
ISO 8601, the date and time representation standard.
It also presents some arguments why it should be applied,
especially in Web authoring.
Sample code is given for printing the date and time in ISO 8601
format in some programming languages.
Links to more detailed technical resources are given.
This document
recommends the following simple format
for dates:
1998-05-12 (year-month-day)
and the following format for combined date and time
in international contexts:
1998-05-12T10:20Z
though it may improve readability to replace the letter T by a space.
There are different date and time formats in use in different parts of the world and in different contexts. There are two major practical problems:
These problems are combined in notations like 01/02/03. Does it mean 1st of February, 2003, or 2nd of January, 2003, or 2nd of March, 2001, or what? (In some notations, the year precedes the date.) If a product has the text "Use before 03/06/09" without explanation, what do you do?
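The ambiguity can be made concrete with a small sketch. It reads the same string under three conventions and prints the date each one yields in ISO 8601 notation. The names `CONVENTIONS` and `interpretations` are made up for this illustration, and the two-digit year is expanded by Python's own pivot rule (values 00 to 68 become 2000 to 2068), which is just one of many possible guesses.

```python
from datetime import datetime

AMBIGUOUS = "01/02/03"

# Three conventions that all accept the same eight characters.
# (Illustrative selection; other orders are in use as well.)
CONVENTIONS = {
    "day/month/year": "%d/%m/%y",
    "month/day/year": "%m/%d/%y",
    "year/month/day": "%y/%m/%d",
}


def interpretations(text):
    """Return the ISO 8601 date that each convention reads out of text."""
    return {
        name: datetime.strptime(text, fmt).date().isoformat()
        for name, fmt in CONVENTIONS.items()
    }


for name, iso in interpretations(AMBIGUOUS).items():
    print(f"{name:15} -> {iso}")
```

Each convention parses the string without complaint, so a program has no way to detect that it picked the wrong one.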
Practical problems are also caused by the ambiguity of time designations:
Although some of these problems could be rather easily solved by special solutions in special cases, it is evident that a uniform and universal date and time representation format is needed. For example, in a monolingual context, the first problem (ambiguity of notations like 5/6/98) could be solved by writing the month as a name, not with digits. But this too would introduce an unnecessary language dependency. For example, on a multilingual Web page, one would certainly like to have the "last updated" date expressed just once, in a language-neutral notation.
As an example, consider the following system message which is bilingual for a good reason (from real life, but abbreviated here, and with typos fixed):
* Internet-yhteydet poikki maanantaina 23.11. klo 17 - 20
* ---
* Internet outside of HUT will be inaccessible starting at 5 o'clock PM
* (1700 hours), on Monday, November 23. Estimated duration is
* till 8 o'clock PM (2000 hours) at most.
So the time is expressed in three ways. Users have been observed to get easily confused in such situations. The more you try to explain things in different ways, the more probable it is that the ways get mixed up. Moreover, it all becomes too long for short announcements, headlines, etc. By switching to simple ISO 8601 notations things could be expressed briefly and uniquely:
Internet-yhteydet poikki, Internet connections off:
1998-11-23T17/20
During a transitional period we would, of course, need to accompany such information with text (in "finer print" when applicable) which expresses the time period in older, language dependent notations.
Automatic processing of data is easier to program if dates and times are expressed in a uniform, preferably fixed-length notation. The format should allow simple comparison and sorting of dates and times, which means that the notation should be either fully descending (with the most significant part, such as the year, expressed first, then the next most significant part, such as the month, etc., up to seconds and fractions of a second) or fully ascending (just the opposite). It should be noted that such uniformity would be most beneficial for small, tool-like programs, typically created by private persons or small companies. In a large project by a large software vendor, the cost of code for handling a wide variety of date and time formats is relatively small (although perhaps absolutely large).
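The sorting argument can be sketched in a few lines. The sample dates and the function names below are invented for the illustration. The point is that a fully descending, fixed-length notation sorts correctly as plain text, whereas a day-first notation has to be parsed first.

```python
from datetime import datetime

# Hypothetical sample data: the same three dates in two notations.
iso_dates = ["1998-05-12", "1997-12-31", "1998-01-05"]
dmy_dates = ["12.05.1998", "31.12.1997", "05.01.1998"]


def by_plain_text(strings):
    """Sort as ordinary strings, with no knowledge of dates."""
    return sorted(strings)


def dmy_chronological(strings):
    """Sort day.month.year strings by parsing each one first."""
    return sorted(strings, key=lambda s: datetime.strptime(s, "%d.%m.%Y"))


print(by_plain_text(iso_dates))        # chronological without any parsing
print(by_plain_text(dmy_dates))        # ordered by day number, not by time
print(dmy_chronological(dmy_dates))    # correct, but needs a parser
```

The same property makes string comparison (`<`, `>`) usable for ISO 8601 dates, provided all values use the same format and, for times, the same time zone.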
On the Internet, the notation of times and
dates has always been problematic.
In particular,
the format of Internet e-mail messages, as defined on 1982-08-13
by RFC 822 (with some later modifications),
remained valid as an Internet standard for a very long time.
It specified a relatively uniform notation for date and time.
It allowed some variation, but the most common alternative was
something like
Fri, 8 May 1998 15:57:33 +0300 (EET DST)
There was enough variation to make it difficult to write simple
programs for processing such data, too little variation to please
everyone. In 2001-04,
RFC 2822
was published as a successor to RFC 822. It restricted the
recommended date and time formats to the format exemplified above.
Note that
this format is hardly used
outside the Internet.
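As a rough sketch of how such a header value can be brought into ISO 8601 notation, the following uses `parsedate_to_datetime` from Python's standard `email.utils` module. The function name `rfc2822_to_iso_utc` is invented here, and the sketch assumes that the input carries a numeric zone offset such as +0300. The trailing comment "(EET DST)" of the example above is left out of the test input.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def rfc2822_to_iso_utc(header_value):
    """Convert an RFC 2822 date string to ISO 8601 in UTC, with seconds.

    Assumes the string contains a numeric zone offset, so that the
    parsed value is an aware datetime.
    """
    moment = parsedate_to_datetime(header_value)
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


print(rfc2822_to_iso_utc("Fri, 8 May 1998 15:57:33 +0300"))
```

Converting to UTC before formatting is a design choice of this sketch. Keeping the original offset (as in 1998-05-08T15:57:33+03:00) would be equally conformant.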
In addition, different programs use date and time formats differing from the one specified in RFC 822 and RFC 2822. To illustrate the diversity, let us take a look at the Proposed Standard RFC 2068; in the discussion of time and date formats, it says:
HTTP applications have historically allowed three different formats for the representation of date/time stamps:
Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, updated by RFC 1123
Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036
Sun Nov 6 08:49:37 1994 ; ANSI C's asctime() format
The first format is preferred as an Internet standard and represents a fixed-length subset of that defined by RFC 1123 (an update to RFC 822). The second format is in common use, but is based on the obsolete RFC 850 date format and lacks a four-digit year.
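The cost of this diversity shows up as soon as a program has to read such stamps. The sketch below tries one `strptime` pattern per format and normalizes the result to ISO 8601. The names `HTTP_DATE_PATTERNS` and `http_date_to_iso` are invented for the illustration. The patterns are an approximation of the three formats rather than a complete parser, and they assume English day and month names (the default "C" locale) and that all stamps are in GMT.

```python
from datetime import datetime, timezone

# One strptime pattern per format quoted above (approximations).
HTTP_DATE_PATTERNS = (
    "%a, %d %b %Y %H:%M:%S GMT",   # RFC 822, updated by RFC 1123
    "%A, %d-%b-%y %H:%M:%S GMT",   # RFC 850 (two-digit year)
    "%a %b %d %H:%M:%S %Y",        # ANSI C asctime()
)


def http_date_to_iso(text):
    """Try each pattern in turn and return the stamp as ISO 8601 in UTC."""
    for pattern in HTTP_DATE_PATTERNS:
        try:
            parsed = datetime.strptime(text, pattern)
        except ValueError:
            continue  # not this format, try the next one
        stamped = parsed.replace(tzinfo=timezone.utc)
        return stamped.strftime("%Y-%m-%dT%H:%M:%SZ")
    raise ValueError(f"unrecognized HTTP date: {text!r}")


for stamp in ("Sun, 06 Nov 1994 08:49:37 GMT",
              "Sunday, 06-Nov-94 08:49:37 GMT",
              "Sun Nov  6 08:49:37 1994"):
    print(http_date_to_iso(stamp))
```

All three inputs denote the same moment. With a single agreed notation the loop over patterns, and the guess about the two-digit year, would be unnecessary.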
The ISO 8601 standard, or most officially ISO 8601:2004 Data elements and interchange formats -- Information interchange -- Representation of dates and times, approved by ISO in 1988 and updated in 2000 and again in 2004, defines a large number of alternative representations of dates, times, and time intervals. Thus, rather than the date and time standard, it is just a general framework. To achieve uniformity, we must select one or a few formats from it and apply them consistently.
Luckily, it seems that people who know about ISO 8601 usually stick to the same simple alternatives. The following is an attempt to describe "best current practice" (in the informal sense of this phrase):
1998-05-12, always expressing the year in full,
followed by the month and then the day.
Thus, the example means the 12th of May in 1998.
Use exactly two digits for the
month and exactly two digits for the day, using leading
zeros when necessary.
Notice that there is no time zone indication, although dates too
are time zone dependent in principle; by default, times are relative
to some local time zone. If this is of some concern for dates
(i.e. you need to be very exact with them), you could express the
date in UTC and append a Z
to the date designation to
indicate this. But in such cases, the combined date and time format
is probably preferable (see below).
14:15 or 14:15:00,
always expressing hours and minutes and seconds (if present)
each with exactly two digits.
Express the time as local time in the time zone implied by
the context.
But whenever there is any possibility of misunderstanding
what the time zone is, use the next option:
14:15Z or 14:15:00Z,
always expressing hours and minutes and seconds (if present)
each with exactly two digits.
Express the time as Coordinated Universal Time
(UTC, formerly called Greenwich Mean Time, GMT); the appended
letter Z indicates that the time is represented in UTC.
Alternatively, use a local time with explicit zone designation
as explained in the next item.
T and the time of the day designation, e.g. 1998-05-12T14:15Z.
Note that the standard clearly requires the use of T
in this context. However, such a notation is often regarded as
odd-looking, and people who otherwise use ISO 8601 might deviate
from it here by using a space instead.
/ and an indication of the end of the period.
Of course, one of the formats mentioned above is used for the
start and end.
However, to allow reasonably short expressions,
higher order components of the end designation can be omitted,
in which case the
corresponding values from the start designation are used.
Examples:
1998-05-12T14:15Z/1998-05-13T16:00Z (time interval extending from one day to another)
1998-05-12T14:15Z/16:00Z (time interval within a day)
1998-05-12/15 (time interval from the 12th to 15th of May, 1998).
Note: This is compatible with the format described in the Dates and times subsection of the HTML 4.01 Specification for use with certain HTML constructs. However, the format specified there is stricter in the sense that only the combined date and time format is allowed and it must contain the seconds part, but more permissive in the sense that it allows other time zones than UTC, too.
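The rule for abbreviated periods, where omitted higher-order components of the end are taken from the start, can be sketched as follows. The function name `expand_interval` is invented for the illustration. The sketch assumes that both ends use the same fixed-length notation with the same precision, so that the missing part of the end is simply the leading characters of the start.

```python
def expand_interval(interval):
    """Split 'start/end' and complete an abbreviated end from the start.

    Assumes both designations use the same fixed-length notation, so the
    omitted higher-order components are the leading characters of start.
    """
    start, end = interval.split("/")
    missing = len(start) - len(end)
    if missing > 0:
        # Borrow the leading (higher-order) characters of the start.
        end = start[:missing] + end
    return start, end


for text in ("1998-05-12T14:15Z/1998-05-13T16:00Z",
             "1998-05-12T14:15Z/16:00Z",
             "1998-05-12/15"):
    print(expand_interval(text))
```

A real parser would also have to validate the result and handle ends that use a different precision than the start. This sketch does neither.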
For periods of time, notations such as
1980-85 have often been used. Even if you use an en dash (–) instead
of a hyphen and/or surround that punctuation with spaces,
misunderstandings may arise. According to
ISO 8601, a notation like 2000-02 uniquely means the second
month of year 2000, so it is risky to use it, or any similar notation,
to denote years from 2000 to 2002. Using
The ISO 8601 standard does not specify whether a date or time (or date and time) designation refers to a singular point in time or a time period. In particular, a designation of a date can be used to refer to a full 24-hour day or a specific moment of time within it, probably by default the start of the day (00:00). Similarly, a time notation like 9:00 could refer to nine o'clock absolutely sharp or the period from 09:00 to 09:01 or anything else. When necessary, a specific agreement or verbal indication of the meaning can be given, or the most explicit notation with ISO 8601 could be used. For example, one could write 09:00:00 or 09:00:00/09:01:00 to distinguish between the two interpretations mentioned above.
Within the European standardization organization CEN, a so-called CEN Workshop Agreement (CWA) on various notations has been prepared, and it specifies:
For the date and time conventions, the following numeric forms are recommended to be used in a language-independent, pan-European document.
Long date: 1996-04-28
Abbreviated date and time: 1996-04-28 17:22:06
Abbreviated long date: 1996-04-28
Numeric date: 1996-04-28
Time: 09:22:06
The 24 hour system is used in Europe. Thus the time of the day is given in the range from 00:00:00 to 23:59:59, and the possible leap second 23:59:60. No abbreviation is used for before or after noon.
NOTE The abbreviated date and time is given as the combination of the date format and the time format of ISO 8601; as opposed to the combined day-and-time format of said standard, which includes a "T" between day and time.
- CWA 14051-1, Information Technology - European generic locales - Part 1: General specifications, section 4.1.5 (page 10).
There was a European standard, EN 28601, with the same content as ISO 8601. It has, however, been withdrawn. Members of CEN are thus no longer required to have national standards on this issue.
In the modern approach to localization, data is internally stored and processed in a neutral format as far as possible. If localization is desired, such as the presentation of data in a particular language or notation, it is performed as close to the user as possible. This makes it possible to apply user-selected presentation principles. The approach is described in quite some detail in the Common Locale Data Repository (CLDR) material. Apparently, ISO 8601 is the suitable neutral format for dates and times.
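A minimal sketch of this division of labour: the date is stored once as an ISO 8601 string and turned into a localized form only at the moment of presentation. The table `DISPLAY_PATTERNS` and the function `present` are hypothetical and written by hand for illustration. They are not taken from CLDR, which a real application would consult through a suitable library.

```python
from datetime import date

# Hypothetical display patterns, for illustration only; real locale
# data (e.g. from CLDR) is considerably richer than this.
DISPLAY_PATTERNS = {
    "fi": "{d.day}.{d.month}.{d.year}",
    "en-US": "{d.month}/{d.day}/{d.year}",
    "iso": "{d:%Y-%m-%d}",
}


def present(stored, locale_tag):
    """Format a stored ISO 8601 date for display in the given locale."""
    d = date.fromisoformat(stored)   # neutral internal representation
    return DISPLAY_PATTERNS[locale_tag].format(d=d)


stored = "1998-05-12"
for tag in DISPLAY_PATTERNS:
    print(tag, present(stored, tag))
```

Because the stored value never changes, adding a further locale or a user preference only means adding a presentation pattern.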
The basic separators used according to ISO 8601 are the
The ISO 8601 standard allows these separators to be omitted (e.g., 19980512 for a date), but expressions are much easier to read when separators are used. The separators also make it more obvious that a date or time is given; a string of digits could mean different things.
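Converting between the two variants is mechanical, as the following sketch suggests. The function names are invented for the illustration, and only the plain calendar date is handled.

```python
from datetime import datetime


def basic_to_extended(basic):
    """Turn a separator-free date such as 19980512 into 1998-05-12."""
    return datetime.strptime(basic, "%Y%m%d").date().isoformat()


def extended_to_basic(extended):
    """Turn 1998-05-12 into 19980512, e.g. for an internal data format."""
    return datetime.strptime(extended, "%Y-%m-%d").strftime("%Y%m%d")


print(basic_to_extended("19980512"))
print(extended_to_basic("1998-05-12"))
```

Parsing through `strptime` instead of slicing the string has the side effect of rejecting impossible values such as month 13.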
The separators can be omitted in internal data formats that are
never visible to users. Sometimes they need to be omitted due to
technical restrictions or special considerations.
For example, if you use file names that correspond to dates
(e.g. in news archives), a name like 19980512.html is probably
more convenient than
The standard distinguishes the hyphen from the minus sign as well as hyphen-minus, often called ASCII hyphen. (These concepts are explained in the document Dashes and hyphens.) However, it mentions that both hyphen and minus may be mapped to hyphen-minus when the character repertoire is limited, and this is common practice. Moreover, programs that interpret date notations might expect to see hyphen-minus. In principle, however, U+2010 HYPHEN is the most appropriate character for use in ISO 8601 dates, when available (e.g., in text processing when using a font that contains it).
When ISO 8601 date notations are used in text (or in tables), there might be a risk of a line break after a hyphen. Although that would not be strictly wrong, it cannot be regarded as good presentation. However, technically it would be incorrect (and often ineffective) to use the non-breaking hyphen character. Usually the problem needs to be handled at levels other than character level, e.g. using markup (see notes on preventing line breaks on web pages).
As an example of writing code which outputs a date in the ISO 8601 notation, here is the C code for getting the current date and printing it:
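#include <stdio.h>
#include <time.h>

int main(void) {
    time_t now_t;
    struct tm now;
    time(&now_t);               /* current calendar time */
    now = *localtime(&now_t);   /* broken down into local time */
    printf("%4d-%02d-%02d\n",
           now.tm_year+1900, now.tm_mon+1,   /* years since 1900, months from 0 */
           now.tm_mday);
    return 0;
}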
Naturally, only the printf function call is affected
by the date format used. Notice the use of zero in the field designator
%02d to force the number to be written with exactly two
digits, using a leading zero if needed.
As another example, here is Perl code for getting the current date and time and writing it in UTC:
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) =
gmtime(time);
$t = sprintf "%4d-%02d-%02dT%02d:%02dZ\n",
1900+$year,$mon+1,$mday,$hour,$min;
print $t;
In JavaScript programming,
we can expect currently used browsers to support the
toISOString method, which yields an ISO 8601 conformant
notation. If you just need the date part, you can pick up
a substring consisting of the first ten
characters (because in ISO 8601, the date part is of fixed length):
var today = new Date();
var dateString = today.toISOString().substring(0, 10);
The following information
about clumsier solutions is preserved here mostly
for historical reasons:
In JavaScript,
there are various advanced date functions,
such as getFullYear. Previously they were not
supported by all JavaScript implementations, so it was safest
to use just the basic date functions and "do it yourself" (performing
Y2K corrections too).
Although this is probably irrelevant nowadays, here is
code that constructs an ISO 8601
conformant date notation into the value of the variable
dateString using just the old basic functions:
function getCorrectedYear(year) {
year = year - 0; /* converting to a number */
if (year
Pete Forman has written more detailed ECMAscript code for determining the date and time.
Concerning Java, see section Dates and Times of the Java FAQ and code written by Simon Brooke.
If you use the strftime function
(see Single UNIX®
Specification
for a description), the following format specification would be suitable:
"%Y-%m-%dT%H:%M:%SZ"
to get both date and time.
This means that if you, as a Web author, use
Server Side Includes
(SSI), the following should cause the date and time
denotation
(corresponding to the moment when the server processes and sends
the document)
to be inserted in ISO 8601 format:
<!--#config timefmt="%Y-%m-%dT%H:%M:%SZ"--> <!--#echo var="DATE_GMT"-->
But check server-specific documents and test that
this works on your server! And if you just want to have the
date inserted, you'd use timefmt="%Y-%m-%d".
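The strftime format string given above can be tried out independently of any server. The following sketch uses Python's `time.strftime`, which accepts the same conversion specifications. The names `ISO_UTC_FORMAT` and `iso_utc` are invented for the illustration. Note that the literal Z in the format is only truthful when the time passed in is UTC, which is why the sketch uses `time.gmtime` and not `time.localtime`.

```python
import time

ISO_UTC_FORMAT = "%Y-%m-%dT%H:%M:%SZ"   # the format string discussed above


def iso_utc(seconds_since_epoch=None):
    """Format a POSIX timestamp (default: the current time) in UTC."""
    # gmtime() yields UTC, matching the literal "Z" in the format.
    return time.strftime(ISO_UTC_FORMAT, time.gmtime(seconds_since_epoch))


print(iso_utc())     # the current moment
print(iso_utc(0))    # the POSIX epoch
```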
In PHP, you can write the current date and time on the server as follows:
$now = substr_replace(strftime("%Y-%m-%dT%H:%M:%S%z"), ":", -2, 0);
echo "Server datetime is ", $now;
In Fortran 90, the following code could be used
to print the current date and time (without seconds)
in the local time zone using the standard subroutine
date_and_time:
character*8 date
character*10 time
character*5 zone
integer values(8)
call date_and_time(date,time,zone,values)
print 100, values(1), values(2), values(3), values(5), values(6)
100 format(1X,I4,'-',I2.2,'-',I2.2,'T',I2.2,':',I2.2)
For some other program codes related to formatting dates and times in ISO 8601 notation, see ISO 8601 Date and Time - Converting and implementing by Nikolai Sandved Aasen.
It is probably unnecessary to apply these notations in running monolingual text, where language-dependent traditional expressions with the month expressed with a word like "the 4th of July" or "4. heinäkuuta" can be used without problems. But separate date designations, such as date of issue or date of last update, and tabulated dates, are best presented using the variant of ISO 8601 outlined above.
The CLDR (Common Locale Data Repository) activity, coordinated by the Unicode Consortium, has defined a general-purpose formalism (a markup language, LDML) for specifying formats of date and time representation. It has also collected voluminous information about date and time formats in different locales (languages and language variants) represented in that formalism. The general idea is that internally, in data structures and binary files, dates and times should be represented in ISO 8601 format, but externally, when displaying data to users, they should be formatted according to the language of the context and ultimately according to each user's preferences. Of course, the user's preference could be ISO 8601, too.