When an HTTP client (generally a web browser) requests a URL that points to a directory rather than to a specific web page within it, the web server will generally serve a default page, often referred to as a main or "index" page. A common filename for such a page is index.html, but most modern HTTP servers offer a configurable list of filenames that the server can use as an index. If a server is configured to support server-side scripting, the list will usually include entries that allow dynamic content to be used as the index page (e.g. index.cgi, index.pl, index.php, index.shtml, index.jsp, default.asp). It may nonetheless be more appropriate to name the output format explicitly (index.html.php or index.html.aspx), since HTML output should not be taken for granted. An example is the popular open source web server Apache, where the list of filenames is controlled by the DirectoryIndex directive in the main server configuration file or in the configuration file for that directory. It is also possible to omit file extensions entirely, staying neutral about the content delivery method, and let the server pick the best file automatically through content negotiation.

If the server cannot find a file with any of the names listed in its configuration, it may either return an error (usually 403 Forbidden or 404 Not Found) or generate its own index page listing the files in the directory. This option, often named autoindex, is usually configurable as well.
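The lookup described above can be sketched in a few lines of Python. This is only an illustration of the decision order, not the logic of any particular server: the function name resolve_directory, the INDEX_NAMES list, and the choice of 403 when listings are disabled are all assumptions made for the example.

```python
import os

# Hypothetical default list; real servers make this configurable.
INDEX_NAMES = ("index.html", "index.php", "index.cgi")


def resolve_directory(directory, index_names=INDEX_NAMES, autoindex=False):
    """Decide how to answer a request that maps to a directory.

    Returns a (status, detail) tuple:
      (200, path)       an index file was found
      (200, "listing")  no index file, but autoindex is enabled
      (403, None)       no index file and listings are disabled
      (404, None)       the directory itself does not exist
    """
    if not os.path.isdir(directory):
        return 404, None
    # Try the configured names in order; the first match wins.
    for name in index_names:
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return 200, candidate
    if autoindex:
        return 200, "listing"
    return 403, None
```

Because the names are tried in order, a directory containing both index.html and index.php is answered with index.html under this list; reordering the list changes which file is served.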


History

A scheme in which the web server serves a default file on a per-subdirectory basis has been supported as early as NCSA HTTPd 0.3beta (22 April 1993), which by default serves the index.html file in the directory. The scheme was then adopted by CERN HTTPd since at least 2.17beta (5 April 1994), whose defaults support Welcome.html and welcome.html in addition to the NCSA-originated index.html. Later web servers typically support this default-file scheme in one form or another; it is usually configurable, with index.html being one of the default file names.


Implementation

In some cases, the home page of a website can be a menu of language options for large sites that use geotargeting. It is also possible to avoid this step, for example by using content negotiation.

In cases where no known index.* file exists within a given directory, the web server may be configured to provide an automatically generated listing of the files within the directory instead. With the Apache web server, for example, this behavior is provided by the mod_autoindex module and controlled by the Options +Indexes directive in the web server configuration files. These automated directory listings are sometimes a security risk because they enumerate sensitive files that may not be intended for public access, in a process known as a directory indexing attack. Such a security misconfiguration may also assist other attacks, such as a path or directory traversal attack.
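To make the idea of an automatically generated listing concrete, here is a minimal Python sketch of how such a page could be built. It does not reproduce mod_autoindex or any other server module; the function render_listing, its parameters, and the rule of skipping dotfiles are hypothetical choices for the example.

```python
import html
import os
from urllib.parse import quote


def render_listing(directory, url_path, show_hidden=False):
    """Build a minimal HTML index page for `directory`.

    `url_path` is the request path shown in the title (e.g. "/files/").
    Dotfiles are skipped unless `show_hidden` is true; that filter is an
    assumption of this sketch, not a description of any specific server.
    """
    entries = sorted(os.listdir(directory))
    if not show_hidden:
        entries = [name for name in entries if not name.startswith(".")]

    rows = []
    for name in entries:
        # Mark subdirectories with a trailing slash, as many listings do.
        label = name + "/" if os.path.isdir(os.path.join(directory, name)) else name
        # quote() makes the name safe in a URL; escape() makes it safe in HTML.
        rows.append('<li><a href="{}">{}</a></li>'.format(quote(label), html.escape(label)))

    title = html.escape("Index of " + url_path)
    return "<html><head><title>{0}</title></head><body><h1>{0}</h1><ul>\n{1}\n</ul></body></html>".format(
        title, "\n".join(rows)
    )
```

Leaving a file out of the listing only hides its name. The file can still be fetched by anyone who knows or guesses its URL, so a filter like this does not replace proper access control.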


Performance

When a directory is accessed, the available index methods may also differ in their use of OS resources (RAM, CPU time, etc.) and thus in their effect on web server performance. From fastest to slowest, the methods are:
* using a static index file, e.g. index.html;
* using a web server feature usually named autoindex (when no index file exists) to let the web server generate the directory listing with its internal module;
* using an interpreted file read by the web server's internal program interpreter, e.g. index.php;
* using a compiled CGI executable, e.g. index.cgi.

