linux网络服务器工具

源代码在线查看: mod_unique_id.html.en

软件大小: 7475 K
上传用户: edan1181
关键词: linux 网络服务器
下载地址: 免注册下载 普通下载 VIP

相关代码

																        XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX				              This file is generated from xml source: DO NOT EDIT				        XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX				      -->				mod_unique_id - Apache HTTP Server																												Modules | Directives | FAQ | Glossary | Sitemap				Apache HTTP Server Version 2.2																Apache > HTTP Server > Documentation > Version 2.2 > Modules								Apache Module mod_unique_id								Available Languages:  en  |				 ja  |				 ko 								Description:Provides an environment variable with a unique				identifier for each request				Status:Extension				Module營dentifier:unique_id_module				Source燜ile:mod_unique_id.c				Summary												    This module provides a magic token for each request which is				    guaranteed to be unique across "all" requests under very				    specific conditions. The unique identifier is even unique				    across multiple machines in a properly configured cluster of				    machines. The environment variable UNIQUE_ID is				    set to the identifier for each request. Unique identifiers are				    useful for various reasons which are beyond the scope of this				    document.								Directives				This module provides no				            directives.				Topics								 Theory																Theory				    								    First a brief recap of how the Apache server works on Unix				    machines. This feature currently isn't supported on Windows NT.				    On Unix machines, Apache creates several children, the children				    process requests one at a time. Each child can serve multiple				    requests in its lifetime. For the purpose of this discussion,				    the children don't share any data with each other. We'll refer				    to the children as httpd processes.								    Your website has one or more machines under your				    administrative control, together we'll call them a cluster of				    machines. Each machine can possibly run multiple instances of				    Apache. All of these collectively are considered "the				    universe", and with certain assumptions we'll show that in this				    universe we can generate unique identifiers for each request,				    without extensive communication between machines in the				    cluster.								    The machines in your cluster should satisfy these				    requirements. (Even if you have only one machine you should				    synchronize its clock with NTP.)								    				      The machines' times are synchronized via NTP or other				      network time protocol.								      The machines' hostnames all differ, such that the module				      can do a hostname lookup on the hostname and receive a				      different IP address for each machine in the cluster.				    								    As far as operating system assumptions go, we assume that				    pids (process ids) fit in 32-bits. If the operating system uses				    more than 32-bits for a pid, the fix is trivial but must be				    performed in the code.								    Given those assumptions, at a single point in time we can				    identify any httpd process on any machine in the cluster from				    all other httpd processes. The machine's IP address and the pid				    of the httpd process are sufficient to do this. So in order to				    generate unique identifiers for requests we need only				    distinguish between different points in time.								    To distinguish time we will use a Unix timestamp (seconds				    since January 1, 1970 UTC), and a 16-bit counter. The timestamp				    has only one second granularity, so the counter is used to				    represent up to 65536 values during a single second. The				    quadruple ( ip_addr, pid, time_stamp, counter ) is				    sufficient to enumerate 65536 requests per second per httpd				    process. There are issues however with pid reuse over time, and				    the counter is used to alleviate this issue.								    When an httpd child is created, the counter is initialized				    with ( current microseconds divided by 10 ) modulo 65536 (this				    formula was chosen to eliminate some variance problems with the				    low order bits of the microsecond timers on some systems). When				    a unique identifier is generated, the time stamp used is the				    time the request arrived at the web server. The counter is				    incremented every time an identifier is generated (and allowed				    to roll over).								    The kernel generates a pid for each process as it forks the				    process, and pids are allowed to roll over (they're 16-bits on				    many Unixes, but newer systems have expanded to 32-bits). So				    over time the same pid will be reused. However unless it is				    reused within the same second, it does not destroy the				    uniqueness of our quadruple. That is, we assume the system does				    not spawn 65536 processes in a one second interval (it may even				    be 32768 processes on some Unixes, but even this isn't likely				    to happen).								    Suppose that time repeats itself for some reason. That is,				    suppose that the system's clock is screwed up and it revisits a				    past time (or it is too far forward, is reset correctly, and				    then revisits the future time). In this case we can easily show				    that we can get pid and time stamp reuse. The choice of				    initializer for the counter is intended to help defeat this.				    Note that we really want a random number to initialize the				    counter, but there aren't any readily available numbers on most				    systems (i.e., you can't use rand() because you need				    to seed the generator, and can't seed it with the time because				    time, at least at one second resolution, has repeated itself).				    This is not a perfect defense.								    How good a defense is it? Suppose that one of your machines				    serves at most 500 requests per second (which is a very				    reasonable upper bound at this writing, because systems				    generally do more than just shovel out static files). To do				    that it will require a number of children which depends on how				    many concurrent clients you have. But we'll be pessimistic and				    suppose that a single child is able to serve 500 requests per				    second. There are 1000 possible starting counter values such				    that two sequences of 500 requests overlap. So there is a 1.5%				    chance that if time (at one second resolution) repeats itself				    this child will repeat a counter value, and uniqueness will be				    broken. This was a very pessimistic example, and with real				    world values it's even less likely to occur. If your system is				    such that it's still likely to occur, then perhaps you should				    make the counter 32 bits (by editing the code).								    You may be concerned about the clock being "set back" during				    summer daylight savings. However this isn't an issue because				    the times used here are UTC, which "always" go forward. Note				    that x86 based Unixes may need proper configuration for this to				    be true -- they should be configured to assume that the				    motherboard clock is on UTC and compensate appropriately. But				    even still, if you're running NTP then your UTC time will be				    correct very shortly after reboot.								    The UNIQUE_ID environment variable is				    constructed by encoding the 112-bit (32-bit IP address, 32 bit				    pid, 32 bit time stamp, 16 bit counter) quadruple using the				    alphabet [A-Za-z0-9@-] in a manner similar to MIME				    base64 encoding, producing 19 characters. The MIME base64				    alphabet is actually [A-Za-z0-9+/] however				    + and / need to be specially encoded				    in URLs, which makes them less desirable. All values are				    encoded in network byte ordering so that the encoding is				    comparable across architectures of different byte ordering. The				    actual ordering of the encoding is: time stamp, IP address,				    pid, counter. This ordering has a purpose, but it should be				    emphasized that applications should not dissect the encoding.				    Applications should treat the entire encoded				    UNIQUE_ID as an opaque token, which can be				    compared against other UNIQUE_IDs for equality				    only.								    The ordering was chosen such that it's possible to change				    the encoding in the future without worrying about collision				    with an existing database of UNIQUE_IDs. The new				    encodings should also keep the time stamp as the first element,				    and can otherwise use the same alphabet and bit length. Since				    the time stamps are essentially an increasing sequence, it's				    sufficient to have a flag second in which all machines				    in the cluster stop serving and request, and stop using the old				    encoding format. Afterwards they can resume requests and begin				    issuing the new encodings.								    This we believe is a relatively portable solution to this				    problem. It can be extended to multithreaded systems like				    Windows NT, and can grow with future needs. The identifiers				    generated have essentially an infinite life-time because future				    identifiers can be made longer as required. Essentially no				    communication is required between machines in the cluster (only				    NTP synchronization is required, which is low overhead), and no				    communication between httpd processes is required (the				    communication is implicit in the pid value assigned by the				    kernel). In very specific situations the identifier can be				    shortened, but more information needs to be assumed (for				    example the 32-bit IP address is overkill for any site, but				    there is no portable shorter replacement for it). 																Available Languages:  en  |				 ja  |				 ko 								Copyright 2008 The Apache Software Foundation.Licensed under the Apache License, Version 2.0.				Modules | Directives | FAQ | Glossary | Sitemap							

相关资源