This is a minor release that mainly provides bug fixes and small
enhancements that have been committed to CVS since the last release. Vladimir
Kozlov provided indispensable help for testing and preparing the Win32
release. Special thanks to Wayne Davison, Art Barstow, Peter Stamfest, Zhu
Qun-Ying, Jens Meggers, and Ken Olum for their contributions.
Summary of principal changes (the ChangeLog
provides a detailed account)
The following sample applications could core dump under Windows:
head.c, getheaders.c, chunk.c, chunkbody.c,
multichunk.c
When retrieving an object from the cache, the HTTP headers
associated with the object weren't stored in the HTResponse
object.
The cache garbage collector could go into an endless loop.
The HTResponse object now returns the HTTP reason.
Update of expat to version 19990728.
Optimization, enhancements, and bug fixes to the HTRDF module.
Fixed some memory leaks and compiler warnings.
The robot could hang when accessing local files.
Optimization of the HTChunk and HTHash modules.
Outstanding bugs
The webbot crashes from time to time under Windows. It
seems to be a problem while reading the robots.txt file,
where the application frees a request that's still registered in the
Windows async loop.
The tiny.c sample application doesn't prompt the user
with a text input.
This is the proper first 5.3 release. We changed the minor release number
as many people had started to refer to the previous release as 5.3.0.
What's new
Art Barstow (barstow@w3.org): fixed a bug with the handling of
parseType="Literal" by the RDF parser. None of the elements [and
attributes if there were any] were added to the literal. The only
string that was added to the literal was the elements' content (aka
CDATA).
Changed the Makefile so that .html->.h file conversion will only
be done with Perl. This is faster than using the www tool.
Updated to the latest autoconf, automake, and libtool scripts. Now
using the separate config.guess
and config.sub scripts.
Extended the libwww/SSL API with the HTSSL_setProtocol () and
HTSSL_verifyDepth () functions, which allow changing the default TLS/SSL
protocol and the certificate verification depth. The wwwssl
application was updated accordingly.
Diverse documentation updates provided by Fox One.
Fixed bugs
The www tool wasn't able to convert .html files into .h files
anymore.
The Extrnals.html file wasn't being
generated on all platforms.
Applied various patches for libwww/SSL which should correct the
multiple request and POST problems that people had reported. Many
thanks to Heiner Kallweit and Gertjan van Winger for their
contributions.
Other patches I (unfortunately) forgot to note down and am too lazy
to get them out from CVS... (sorry, will keep better notes next
time!)
This is the first release done after Henrik's departure from W3C. We're
calling it a pre-release, as we're not sure if we prepared it correctly.
Special thanks to Henrik Frystyk, Fox One, Vladimir Kozlov, Raffaele Sena,
and the rest of the libwww user community for their patches and continuous
support.
What's new
Included all the patches committed to CVS since the last release.
It's been over a year since the last release and we don't have an
exact count of what was committed. Many of them consisted of memory
leak fixes and fixes to the asynchronous event handlers and timers
under Win32.
Integrated the libwww-SSL code back into libwww (the US export
restriction doesn't apply anymore to source code).
Upgraded expat to v1.1.
Merged the outstanding Amaya libwww changes into the main tree.
For this release, special thanks go to the libwww hackers Olga Antropova,
Jose Kahan, Vladimir Kozlov, John Punin, Bob Racko, and Raffaele Sena
for all their hard work and cool new
features including:
SSL
transport - this code is not part of the libwww distribution because
of US export restrictions on glue code for cryptographic code. We hope to
make it available shortly.
This release has a lot of bug fixes and new features - primarily as
a result of lots of really cool work done by people hacking away on libwww
including:
This release is the first release after the libwww
CVS Repository was made public in May 1998. At the time of the release
there were more than 500 checked out versions of the libwww code base and
numerous people have contributed to this release in the form of bug fixes, extensions, and new features.
Added support for regex handling of
proxies so that you can now say something like this: all URIs
matching this regex go to this proxy, all others go to that one, etc.
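The regex-based proxy selection described above can be pictured with the following minimal sketch. It does not use the libwww proxy API; the ProxyRule type and the select_proxy function are hypothetical names introduced only for illustration, and the matching relies on the POSIX regex functions (regcomp, regexec, regfree), which is an assumption about the platform.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical rule: URIs matching `pattern` are sent to `proxy`. */
typedef struct {
    const char *pattern;  /* POSIX extended regular expression */
    const char *proxy;    /* proxy URI to use on a match */
} ProxyRule;

/* Return the proxy of the first rule whose regex matches `uri`,
 * or `fallback` when no rule matches. Rules whose pattern does
 * not compile are skipped. */
const char *select_proxy(const ProxyRule *rules, size_t count,
                         const char *uri, const char *fallback)
{
    assert(rules != NULL && uri != NULL);
    for (size_t i = 0; i < count; i++) {
        regex_t re;
        if (regcomp(&re, rules[i].pattern, REG_EXTENDED | REG_NOSUB) != 0)
            continue;                       /* invalid pattern: skip it */
        int matched = (regexec(&re, uri, 0, NULL, 0) == 0);
        regfree(&re);
        if (matched)
            return rules[i].proxy;
    }
    return fallback;
}
```

Rules are tried in order, so more specific patterns should come first; the fallback plays the role of "all others go to that one".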
Better control of how much data is being sent in the Request object. Now you can get both the
total number of bytes read and written and the number of entity
(body) bytes read and written. This uses features provided by the HTNet
object.
The default implementation of progress
notifications now uses the new HTRequest bytes read/written
methods.
The mime parser in HTMIME.c and HTMIMERq.c now uses the updated control of
bytes read/written.
Added support for version conflict detection in Web Commander (update all of WinCom area)
The Web Commander - the Win32 GUI PUT
sample application has been updated from the ground up - try it out!
Made WWW_TraceFlag an unsigned integer in HTUtils.html (it was signed) and changed
the include order in wwwsys.html so that netinet/in.h is now included before
netinet/tcp.h. This was a problem on RC6000.
Put in a few checks for invalid function arguments (NULL etc.) in HTEscape.c and HTParse.c
Added explicit support for HTTP preconditions (the if-* headers) so
that they can now be turned on and off on a per-request basis. This
affects HTReqMan.c, HTReq.html, and HTReqMan.html.
Changed the encoding streamstack builder to
support best matching (wildcard matching) capabilities. This
means that you now can register a default decoder like this:
HTFormat_addCoding("*", HTIdentityCoding, HTIdentityCoding,
0.3), for example where HTIdentityCoding is a coder
provided by libwww. The quality indicates how much you like it (between
0.0 and 1.0)
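As a rough illustration of the wildcard registration described above, the sketch below looks up a coding name in a small table where "*" acts as a catch-all. It is not the libwww implementation: CodingEntry and best_coding are hypothetical names, and the rule "pick the matching entry with the highest quality" is an assumption about what best matching means here. Names are compared case-sensitively for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry entry: a coding name plus a preference. */
typedef struct {
    const char *coding;   /* e.g. "gzip", or "*" as a catch-all */
    double quality;       /* preference between 0.0 and 1.0 */
} CodingEntry;

/* "*" matches any coding name; everything else must match exactly. */
static int coding_matches(const char *pattern, const char *name)
{
    return strcmp(pattern, "*") == 0 || strcmp(pattern, name) == 0;
}

/* Return the matching entry with the highest quality, or NULL
 * when nothing in the table matches `name`. */
const CodingEntry *best_coding(const CodingEntry *table, size_t count,
                               const char *name)
{
    assert(table != NULL && name != NULL);
    const CodingEntry *best = NULL;
    for (size_t i = 0; i < count; i++) {
        if (coding_matches(table[i].coding, name) &&
            (best == NULL || table[i].quality > best->quality))
            best = &table[i];
    }
    return best;
}
```

With a table holding "gzip" at quality 1.0 and "*" at quality 0.3, a lookup for "gzip" picks the specific entry, while any unknown coding falls through to the wildcard default.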
A close socket event wasn't registered on Windows. This is now done in
HTHost.c.
On some platforms, it is not possible to refer to function addresses
across shared libraries. The result is that the parts of the libwww API that use
function addresses don't always work. I have added an opcode version of
the HTAlert API to handle this for user
messages and progress notifications.
Release 5.1m May 21, 1998
New Features
Changed the encoding
streamstack builder to support best matching capabilities.
This means that you now can register a default decoder like this:
HTFormat_addCoding("*", HTIdentityCoding, HTIdentityCoding,
0.3), for example where HTIdentityCoding is a coder
provided by libwww.
Made sure that we write more often as we check for expired timers in
HTWriter.c
Added a "repetitive" flag to timers so that a timer can be created to say
"call me every 30 ms, over and over again".
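A repetitive timer of the kind described above could behave like the following sketch. The Timer structure and timer_dispatch function are hypothetical and are not the libwww HTTimer API; time is passed in explicitly as milliseconds so the logic can be followed without a real clock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical timer record (not the libwww HTTimer object). */
typedef struct {
    unsigned long expires;    /* absolute time of the next expiry, in ms */
    unsigned long interval;   /* timeout period, in ms */
    int repetitive;           /* non-zero: re-arm after each expiry */
    int active;               /* zero once a one-shot timer has fired */
    int fired;                /* number of times the timer has expired */
} Timer;

/* Fire the timer if it has expired at time `now` (in ms).
 * Returns 1 if the timer fired, 0 otherwise. */
int timer_dispatch(Timer *t, unsigned long now)
{
    assert(t != NULL);
    if (!t->active || now < t->expires)
        return 0;
    t->fired++;
    if (t->repetitive)
        t->expires = now + t->interval;   /* re-arm for the next period */
    else
        t->active = 0;                    /* one-shot timers fire once */
    return 1;
}
```

A timer created with an interval of 30 and the repetitive flag set fires at 30, 60, 90 and so on, whereas the same timer without the flag fires once and then stays inactive.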
Added support for building libwww as a shared library on Unix (it is
already using DLLs on Windows). If you have any problems with shared
libraries on your platform then run
configure with the --disable-shared flag and you will
only get the static version. If you are using libwww as a shared library
but do not use the libwww HTML
parser then you MUST include this dummy implementation of the HText interface somewhere in your code. The
reason for this is that the HText interface may be called (it depends a
bit on how your platform supports shared libraries) but an implementation
isn't provided.
Added support for MySQL access in this simple SQL interface which takes care
of most of the common problems encountered when talking to an SQL server.
Currently I use this for handling
logging in the
webbot but many other features like persistent caching, preferences,
etc. can use the SQL module as well. You
can also see some cool examples of how
to use the robot's logs when querying MySQL through www-sql.
Changed the name of "sysdep.h" to "wwwsys.h" for the
same reason
Bug Fixes
Fixed a bug in the generation of relative
URIs which caused the creation of incorrect relative URIs. Libwww
uses relative URIs in the HTTP referer field and other places to save
an (often significant) amount of bandwidth.
Decreased default max number of outstanding (pipelined) requests from
100 to 50
Now checks whether we get a broken pipe when writing to the network
from expired timers. Previously this check was missing, which was a bug.
This is a small add-on to 5.1k that was released two days ago. It has a
few fixes that were discovered while working with the latest Arena browser.
Bug Fixes
Minor stuff that made PUT work nicer from a user perspective (when to
ask what questions, etc)
Fixed a few things in the direct WAIS access module
Release 5.1k March 23, 1998
New Features
Added a simple "single user lock" on the persistent cache as it gets
confused if multiple users are using it.
On Unix, the header files are now installed when running "make
install". The default location is
"/usr/local/include/w3c-libwww".
On Unix, a set of icons that can be used when browsing local file
directories is also installed - the default location is
"/usr/local/share/w3c-libwww"
Bug Fixes
These are the patches that I have received
for the current release. They will be applied to this upcoming
release.
Fixed a few bugs so that Arena runs on top of
the latest version
Release 5.1j March 9, 1998
New Features
Made the distinction between Transfer-Encoding and
Content-Transfer-Encoding clear - they were somewhat mixed up before.
Libwww also now supports multiple transfer-codings as well as multiple
content-codings as required by HTTP/1.1.
Changed the Stream Stack to not call
the SaveLocally stream any more but instead return an
ErrorStream. The reason for this is that there are several places in the
stream stack building algorithm where we might want to save the data to a
local file: when there is a content type that we don't understand, or a
content-encoding or transfer-encoding (or a combination of these) that
we don't understand. Now we can make sure that the
SaveLocally stream is only called once.
Added support for "identity" transfer-coding and
content-coding
Added support for Allow header when doing PUT and POST
Bug Fixes
Fixed a problem handling wrong HTTP responses from NCSA HTTPD derived
servers (they don't include a version number when responding to an
HTTP/1.1 request)
PUT now works much more reliably, and it also works on Windows (at
least NT, I haven't tested it on 95).
Fixed a refresh problem in the Line Mode
Browser which caused it to sometimes render the screen twice.
Release 5.1f January 1998
New Features
Changed the referer HTTP header field to use relative URIs instead of
absolute. This saves a lot of bytes on the wire
Fixed problem where a request was not flushed if using blocking sockets
as reported by this
bug report.
A limitation in the current persistent cache is that it only works in
non-preemptive mode. Hence if using blocking sockets then the cache
should be disabled. This is now the default behavior in the libwww profiles.
Fixed a bug that caused the maxsock variable used in
select() not to be decreased when deleting a socket in the
default event manager
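The maxsock issue above concerns the first argument to select(), which must be one more than the highest descriptor being watched. The sketch below shows one way to keep such a value up to date when a descriptor is removed. SockSet and its functions are hypothetical names for illustration and are not taken from the libwww event manager; the fd_set macros assume a POSIX system.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical bookkeeping for a select()-based event loop. */
typedef struct {
    fd_set fds;     /* descriptors currently registered */
    int maxsock;    /* highest registered descriptor, or -1 if none */
} SockSet;

void sockset_init(SockSet *s)
{
    assert(s != NULL);
    FD_ZERO(&s->fds);
    s->maxsock = -1;
}

void sockset_add(SockSet *s, int fd)
{
    assert(s != NULL && fd >= 0 && fd < FD_SETSIZE);
    FD_SET(fd, &s->fds);
    if (fd > s->maxsock)
        s->maxsock = fd;
}

void sockset_remove(SockSet *s, int fd)
{
    assert(s != NULL && fd >= 0 && fd < FD_SETSIZE);
    FD_CLR(fd, &s->fds);
    /* If the highest descriptor was removed, scan down for the new one. */
    while (s->maxsock >= 0 && !FD_ISSET(s->maxsock, &s->fds))
        s->maxsock--;
}
```

A caller would then pass s.maxsock + 1 as the first argument to select(). Without the scan in sockset_remove, the value only ever grows and select() keeps checking descriptors that are no longer in use.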
Changed the connection management so that it complies with the Connection
Management draft by Jim Gettys and Alan Freier. The HTTP client now
closes idle connections after 60 seconds which is a heuristic period
chosen by Jeff Mogul in the paper "The
Case for Persistent-Connection HTTP". The number can be dynamically
changed using the HTHost_setPersistTimeout() and the
HTHost_persistTimeout() methods. This could be made more
advanced so that we take into account any information given in the
"Keep-Alive" header but isn't for now.
Fixed a problem when a HTTP/1.1 server sent a response including a
Connection: close header using the close of the TCP connection
as a delimiter. This problem was pointed out in this
bug report
Fixed a security hole in the handling of HTTP 305 proxy
redirection codes. The proxy location returned in the responses was
enabled as a permanent proxy without any notification. The operation now
requires explicit acknowledgement from the user
Fixed a potential (but small) security hole when parsing a new rules file. This operation
now requires explicit acknowledgement from the user.
Release 5.1 February 18 1997
New Features and APIs
Added support for pipelining
Support for zlib based decompression in content encoding
This release was originally called 4.1 but because we now have a complete
HTTP/1.1 client side implementation including a
persistent cache manager and full support
for uploading documents, we decided to call it version 5.0 instead.
The focus for version 5.0 of the W3C Sample Code Library
is to provide a set of higher level, application specific APIs for accessing
the Web. These APIs - called profiles -
will help the application, a Web client for example, to more easily use the
full potential of the application independent Library core. Also, the Library contains a
significantly better interface for easy access to the Web through a large set of functions specialized to perform
certain Web operations like PUT, POST,
DELETE, GET and HEAD.
This release contains a TCL add-on to the Robot
example application and a Deja GNU Test suite
for the Library. Also, it supports HTTP/1.1
including persistent connections, two-way PUT, and the
host header. There is also a sample PEP
implementation that, although incomplete, can give an idea of where we're
headed using PEP.
Bug Fixes
Changed HTGetTmpName not
to use tempnam() anymore as it caused problems.
Added argument to HTParseTime so that we can decide
whether we want to expand relative times or not.
Release 4.1b5 is mainly a bug fix release after intensive testing against
the Common Lisp Server which is also an
HTTP/1.1 application. However, it also has a few new features worth
noting.
New Features
Added support for the entity-tag validator headers:
If-Match and If-None-Match and also for the
date validators If-Modified-Since and
If-Unmodified-Since. We only use either etags or date stamps
and the entity-tag validators have precedence over the date
validators.
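The precedence rule above (entity tags before date stamps, and only one of the two) can be sketched as follows for a conditional GET. The conditional_header function is a hypothetical helper and not part of libwww; the caller is assumed to supply an already formatted HTTP date string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write a single validator header into `buf`: If-None-Match when an
 * entity tag is known, otherwise If-Modified-Since when a date is known.
 * Returns the name of the header used, or NULL if neither is available
 * (in which case `buf` is left empty). */
const char *conditional_header(char *buf, size_t size,
                               const char *etag, const char *http_date)
{
    assert(buf != NULL && size > 0);
    buf[0] = '\0';
    if (etag != NULL && strlen(etag) > 0) {
        /* The entity tag takes precedence over the date stamp. */
        snprintf(buf, size, "If-None-Match: %s", etag);
        return "If-None-Match";
    }
    if (http_date != NULL && strlen(http_date) > 0) {
        snprintf(buf, size, "If-Modified-Since: %s", http_date);
        return "If-Modified-Since";
    }
    return NULL;
}
```

The same ordering would apply to the If-Match and If-Unmodified-Since pair when guarding a modifying request such as PUT.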
Added acinclude.m4 in the WWW directory. The file is used by
automake
Changed the configure script to handle
the location of the icons distributed as part of the libwww distribution
package.
Added full support for Cache-Control header and
Connection header. Both headers can have an association list
of name value pairs as directives.
Moved the memory cache handler (history handler) from the line mode browser to be a BEFOREfilter which can be used by other
applications as well. It is now included as part of the client profile.
Changed the expiration handling API to
not include any notification messages. Any messages to be transmitted to
the user are now handled by the Alert
manager.
Changed the cache validation management in the Request object. The validation scheme is now
compatible with HTTP/1.1 caching including handling of the history list.
We have changed the validation enumerations from
Fixed problem when getting a connection header with the
close directive. The problem caused libwww to loop as it
recursively tried to free the input stream pipe
Fixed bug in access authentication which could cause libwww to loop if
the top level of the site was protected.
Fixed bug that caused a core dump if receiving a message body without
any content-type. Now the data is passed to the guessing stream which hopefully can handle
it.
Release 4.1b4 August 20 1996
HTTP version 1.1 allows for effective use of persistent connections.
However, in order to make this work, a client application must be capable of
recovering from a closed connection between sent requests. The beta 4 version
of libwww supports automatic connection recovery and provides the
functionality for performing pipelining of requests. That is, there can be
multiple outstanding requests on the same connection. In order to do this, the
release contains modifications to the Channel Object, the Host object and the Net object.
New Features
Added support for case-insensitive searching for proxies via
environment variables
Added support for proxy authentication
Changed handling of proxies so that they are not included in the URL
but are now instead part of the request object. This allows for better
handling of proxies and also for freer use of the proxy filter as
it doesn't affect the other filters anymore.
Updated News DLL and incorporated News patches from Maciej
Puzio
HT_PERSISTENT is now obsolete and should be replaced by
HT_PENDING
The HTChannelMode enumeration describing the flow of a
channel has been replaced by HTTransportMode as it is now a
part of the transport object and not the channel object
HTNet_idle() has been removed and replaced by
HTNet_isIdle(). The function returns YES if there are no
pending requests in libwww at all.
Added support for checking public information about a host and using this
information when issuing a PUT, for example. Also added a check for the host
element in HTAccess when doing PUT. We
may have public information available
Changed the return codes defined in HTUtils to reflect the values of the HTTP
spec.
Added support for 305 Use Proxy redirections
Updated HTDir module to make better use of the
fact that we know whether an entry is a directory or not. Now it appends a
'/' to the URL if it is a directory. That way we often avoid a
redirection.
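The trailing-slash behavior described above can be illustrated with a small helper. This is a sketch only: dir_entry_url is a hypothetical function and is not taken from the HTDir module.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `url` into `buf`, appending '/' when the entry is a directory
 * and the URL does not already end in one.
 * Returns 0 on success, or -1 if `buf` is too small. */
int dir_entry_url(char *buf, size_t size, const char *url, int is_directory)
{
    assert(buf != NULL && url != NULL);
    size_t len = strlen(url);
    int add_slash = is_directory && (len == 0 || url[len - 1] != '/');
    if (len + (size_t)add_slash + 1 > size)   /* +1 for the NUL byte */
        return -1;
    memcpy(buf, url, len);
    if (add_slash)
        buf[len++] = '/';
    buf[len] = '\0';
    return 0;
}
```

Linking to "docs/" rather than "docs" in a generated listing means a server does not have to answer with a redirect to the slash-terminated form before serving the directory.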
Removed WWWRules as interface and merged it with the WWWApp interface. The reason was that the
two were highly alike and depended on each other
Bug Fixes
Fixed problem with uploading directory listing using chunked
encoding
Fixed autoconf to handle WAIS and updated the HTWAIS.c module
Fixed race problem in PUT on alphas
Tested reentrant version of libwww (uses _REENTRANT
define)
Fixed problem in file name conversion from URL to local format on
Windows
Release 4.1b3 July 20 1996
The file access module now does content
negotiation by default. This means that all local file access (including from
client applications) do content negotiation when accessing local
files. Content negotiation can be turned off by setting a flag in the request object.
A main difference in beta 3 is that we now have a set of "application profiles" that help the
application to initialize libwww core to
work as a typical client, robot or other type of application. This should
replace the huge initialization procedure seen in previous versions. This is
in fact a result of the core being so flexible - it is only a framework for
accessing the Web. The application must initialize all the functionality at
run-time. You can see the various profile functions in the WWWInit interface.
The second main difference is that the BEFORE and AFTER
filters have been made more explicit than before. The HTLoadStart and
HTLoadTerminate functions actually covered many typical
BEFORE and AFTER filter functions like looking for proxies,
searching the cache, looking for rule file matching, and logging.
However, two functions were not covered by this: redirection and
authentication. That is why the application in previous versions had to
supply this functionality. In beta 3 we have split up the
HTLoadStart and HTLoadTerminate functions into a set of filters which each perform only a
single function, for example looking for proxies. The split has two
purposes: first, it shows how you can use filters to add new functionality to
the Library, and second, the filters can be used by more types of applications. A result
of the new filters is that we also have default redirection and
authentication filters so you don't have to provide this anymore.
The filters are set up as part of the
profiles so you will normally not have to
register them individually.
New features and Changes
Introduced HTUserprofile class to
handle host and user specific information
A host name is no longer expanded to an FQDN as this is not reliable
enough. Now we just keep it as is; that is, we don't expand www to
www.w3.org, for example.
Updated HTML parser to support
BASE tag and LINK tag
Introduced HTLib as a new core module.
It contains generic information about the core which used to be in the
HTAccess module.
Added support for HTTP/1.1. Most of the
HTTP/1.1 specification is now in place, we still need some headers and
some features but this version can be considered to be compliant.
Added support for proxy authentication as specified by HTTP/1.1
Added support for case-insensitive searching for proxies via
environment variables
Changed handling of proxies so that they are not included in the URL
but are now instead part of the request object. This allows for better
handling of proxies and also for freer use of the proxy filter as
it doesn't affect the other filters anymore.
Bug Fixes
Fixed problem with uploading directory listing using chunked
encoding
Fixed problem in the rule file parser. It didn't parse the last line of
the config file
Fixed autoconf to handle WAIS and updated the HTWAIS.c module
Fixed race problem in PUT on alphas which caused the PUT operation to
hang under certain circumstances
Release 4.1b1 May 20 1996
New Features
Introduced GNU autoconf configure script for compiling on Unix
platforms instead of the old BUILD script. This should make it a lot
easier to compile on Unix as we get all the advantages of GNU
autoconf.
Introduction of the HTUserprofile Class
which keeps track of a "user" known to the Library
New access authentication interface allowing for dynamic registration
of new access authentication mechanisms. It provides an easy API for
hooking in new schemes.
Improved handling of trace messages which allows for easy redirection
of trace messages
Support for registration of content coders/decoders and content
transfer encoders/decoders. This is done the same way as for media types
by registering a set of streams that can handle the various
encodings.
Support for chunked decoding
Introduction of the HTHost Class which
keeps track of information about remote hosts
The DNS Class has been simplified to
handle DNS queries only. All additional information about the remote host
is defined by the HTHost Class.
We have a new HTEvent module which
allows for dynamic registration of an event manager. This will make it
much easier to use external event managers together with libwww. If you
wish to continue to use the event handlers from HTEvntLst, you must
register them explicitly with HTEvent_register. This call is
demonstrated in HTBrowse.
The HTStream module has been created
containing a set of basic streams such as an error stream etc.
Introduction of the HTTransport
Class. This allows for dynamic registration of transport protocols
such as for example the W3Mux protocol, TCP access, local file access
etc.
All MIME parsing is now done with registered parsers. The HTMIME module only unwraps the MIME
header fields and calls the best parser. The header parsing originally
done in HTMIME can be found in HTInit.c and
is registered with HTMIMEInit. This call
is demonstrated in HTBrowse.
This upgrade release fixes some bugs and it adds functionality for posting
data from memory. This is the full list of changes:
Optimized the HTTP request header by no longer sending one "Accept:" line per
accepted content type and instead using the comma notation.
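The comma notation mentioned above folds all accepted content types into one header line. The following sketch shows the idea. build_accept_header is a hypothetical helper written for illustration and is not how libwww generates its request headers; quality parameters are left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Write "Accept: t1, t2, ..." into `buf`.
 * Returns the length written (excluding the terminating NUL),
 * or -1 if `buf` is too small. */
int build_accept_header(char *buf, size_t size,
                        const char *const *types, size_t count)
{
    static const char prefix[] = "Accept: ";
    assert(buf != NULL && types != NULL);
    size_t used = sizeof prefix - 1;
    if (size <= used)
        return -1;
    memcpy(buf, prefix, used);
    for (size_t i = 0; i < count; i++) {
        size_t sep = (i > 0) ? 2 : 0;      /* ", " between entries */
        size_t len = strlen(types[i]);
        if (used + sep + len >= size)      /* keep room for the NUL byte */
            return -1;
        if (sep) {
            memcpy(buf + used, ", ", 2);
            used += 2;
        }
        memcpy(buf + used, types[i], len);
        used += len;
    }
    buf[used] = '\0';
    return (int)used;
}
```

For three types this yields a single line such as "Accept: text/html, image/png, */*" instead of three separate "Accept:" lines, which saves the repeated field name and line terminator for every type after the first.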
Introduced HTMemory as dynamic
memory handler. This module is a part of the WWWUtils interface and it handles better
management of dynamic memory. You can find a full description in the User's Guide.
We also present an updated list of all public interfaces
available in the Library. You can find it in the Library Internals.
PUT and POST from memory are implemented. You can now post data from
memory using the same POSTWeb model as for posting remote data objects. The
interface is described in the HTAccess
module. There is a very small dummy test implemented in the Line Mode
browser. You can activate it if you type edit from the command
line.
The Mini server now runs (although crude) as a proxy. It is capable of
serving data as a HTTP/HTTP proxy. It is based on the internal Library
event loop and is therefore highly portable. It is not intended to be a
full featured server but a test implementation which shows how to use the
Library in server applications.
Fixed a bug where, if no connection had been made
between the net->target and the request->output_stream,
the latter was not freed when the request was interrupted.
The referer URL header can now contain a URL of unlimited
length
A resolver callback function has been introduced in the HTTee stream. This allows the caller to
assign a callback function to resolve conflicts between the return codes
of the two streams.
The request object has been added as a calling parameter to the HTFWrite stream creation method. This
allows errors from writing to a file to propagate back to the request
object.
The three streams HTSaveLocally,
HTSaveAndExecute and HTSaveAndCallBack have
been optimized and they now all use the HTFWrite stream creation method
We have a new module called HTTPGen.
It generates general HTTP headers. These headers were a part of the MIME
header generator, but by isolating them, we can use the MIME header as a
generic MIME entity header generator.
Created HTErrorStream which always
returns HT_ERROR. HTBlackHole has been replaced with
HTErrorStream in many places in order to improve performance
Made HTTP response stream (which parses the response line only) into a
converter so that we can forward the output exactly as received from the
remote HTTP server. This is important for proxy applications and other
applications that want to see the output untouched.
The HTChunk module has been made more robust and the number of memory
allocations has been reduced.
Memory cleanup fixed in content length counter stream and FTP
module
Introducing HT_CLOSED and HT_PAUSE for handling streams. This was
required, especially after HTTP supports persistent connections where a
document is not delimited by a closed connection.
Release 4.0C January 23 1996
Automatic redirection and access authentication have been taken out of
the HTTP module. Instead, the new mechanism with request callback
functions is used so that the application can register handlers to
handle these situations. The reason for this change is that not all
applications are interested in having this functionality performed
automatically.
Authentication handler and redirection handler added to both the Line
Mode Browser and the Command Line Tool
Added three possible return codes on which a request callback function
can be called:
HT_PERM_REDIRECT for permanently moved objects
HT_TEMP_REDIRECT for temporarily moved objects
HT_NO_ACCESS for insufficient credentials
PUT and POST now work reliably in the Line Mode Browser and the
Command Line Tool. Both can PUT or POST a document from either a
remote HTTP server or the local file system to a remote
HTTP server.
An important bug fix in the internal event manager, which prevented a
socket from being registered for multiple events at the same time.
Cleanup of the POSTWeb management in the file module and the HTTP
module
In addition to progress notification on READ we now support progress
notification when sending a data object
Spelling mistake fixed. preemtive is changed to
preemptive
The W3C Mini Robot now has the ability to stop at a certain depth while
traversing the Web.
Release 4.0A December 11 1995
Created the include HTTPUtil.html - a C file may follow
Changed HTDoAccept so that it automatically inserts the new socket
descriptor in the Net object.
HTLoad_terminate moved from HTReqMan to HTHome. It is not automatically
set up in HTLibInit anymore
Made HTAnchor and HTLink opaque data objects
changed HTLink_newResult to HTLink_result
changed HTLink_newMethod to HTLink_method
Changed HyperDoc to simply a void pointer. This makes it more generic
and the application does not have to actually treat it as a document
anyway
Introduced the WWW_DEBUG internal format that can be used to redirect
debug information, for example from a HTTP redirection message etc.
Request object added to HText_new methods
changed HTExtParse to HTXParse throughout - Håkon
HTXParse now null-terminates buffer - Håkon
return codes from HTTee changed - Håkon
content-length initialized to -1 - Håkon
Fixed bug for handling HEAD in HTTP and MIME streams
Handling of persistent connections now uses a semaphore and not a bool
Changed the event loop so that a timeout handler can return HT_OK or
HT_ERROR. If the code is not HT_OK then the event loop stops
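The timeout handler contract described above can be sketched as a loop that keeps running only while the handler reports success. This is not the libwww event loop: SKETCH_OK and SKETCH_ERROR are hypothetical stand-ins for HT_OK and HT_ERROR (their real values are not assumed here), and run_event_loop and countdown_handler are names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for HT_OK and HT_ERROR. */
enum { SKETCH_OK = 0, SKETCH_ERROR = -1 };

typedef int TimeoutHandler(void *context);

/* Call the timeout handler up to `max_ticks` times and stop as soon as
 * it returns anything other than SKETCH_OK.
 * Returns the number of times the handler was called. */
int run_event_loop(TimeoutHandler *on_timeout, void *context, int max_ticks)
{
    assert(on_timeout != NULL);
    int calls = 0;
    while (calls < max_ticks) {
        calls++;
        if (on_timeout(context) != SKETCH_OK)
            break;                 /* handler asked the loop to stop */
    }
    return calls;
}

/* Example handler: `context` points to an int counter. The handler
 * reports an error, and thereby stops the loop, when it reaches zero. */
int countdown_handler(void *context)
{
    int *remaining = context;
    assert(remaining != NULL);
    return (--*remaining > 0) ? SKETCH_OK : SKETCH_ERROR;
}
```

Letting the handler's return code end the loop gives an application a simple way to shut down after a period of inactivity without reaching into the loop's internals.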
Changed request of header enumeration:
General headers have the prefix HT_G_
Request headers have the prefix HT_C_
Server headers have the prefix HT_S_
Entity headers have the prefix HT_E_
Separated file suffix initialization into HTBInit
Changed the names in HTChunk to be consistent with the Module_method
scheme
Added a parameter for handling server modules to the Protocol module. An access scheme (for
example HTTP) can now be registered together with a client
module and a server module
Changed functions with preemtive in the name to use
preemptive; it was a spelling mistake
Alpha Release 6, November 20 1995
Removed remaining ARGS and PARAMS from the
sources. All function declarations and definitions now follow the ANSI
standard.
Changed HTAlert to be a registration
module instead of a module to be overridden by an application. Now you can
register a Dialog module as a callback
function, which is a much more generic mechanism. It also means that there
is no more English text in the Core of
the Library - it's all in an application module. The Library provides a
default implementation in the new HTDialog module.
Took the WWW relevant stuff out of HTString and put it into the new module
called HTWWWStr. The latter is a part
of the core whereas the former is generic string utilities.
Made the global variable HTLibraryVersion local to HTAccess. There is now only
WWW_TraceFlag left as a global variable in the whole
Library.
There are now four main include files. They are called:
The WWW Utility module contains a lot of the functionality that
makes it possible to make applications, that is container modules
for data objects, basic string functionality etc. This module is
the basis for all of the following modules and is used
extensively.
The WWW Core module is a set of registration modules that glues
an application together. It contains no real functionality in
itself; it is for example not capable of loading a HTML document.
It only provides a large set of hooks which can be used to add
functionality to the Library and to give an application real life.
We will hear a lot more about the structure of the core, and much of
this guide is actually describing how to add functionality to the
core.
This include is the main include file for the Library. It
basically consists of the WWWUtil.h and the WWWCore.h so that the
application only needs to include this one instead of two.
This module contains a huge set of modules that can be hooked
into the Core Library and make the application work. In contrast to
the Core part, you can pick exactly the modules you want from the
WWWApp.h in order to create your special application whether it is
a server, a client, a proxy, a robot or any other Web
application.
HTGuess stream is now a converter
just like any other converter, and it can be registered by the
application if wanted
HTClientHost (global var) is removed from the Library as
it wasn't used there at all
Introduced HTHome module as part of
the application specific part of the Library. It contains handy functions
for the application.
HTGetCurrentDirectoryURL() and HTHomeAnchor() are moved from HTAccess.c
to the new module HTHome.c
HTImProxy is taken out as it can be handled in the setup of the
converters
Created HTList_appendObject() in HTList
module which adds an element to end of list
Introduced pre-request callback functions in HTNet and changed the
interface. This means that an application
can register any action to be executed before and/or
after a request is done. This is described in detail in the User's Guide.
HTHome module contains default
callback functions to be called before and after a request.
Removed the HTInputSocket_get* functions from HTSocket as they are no longer needed
(they were non-reentrant and belonged to the old interface)
Added the following HTLib_* functions to the HTAccess module:
const char * HTLib_name (void)
const char * HTLib_version (void)
BOOL HTLib_secure (void)
void HTLib_setSecure (BOOL mode)
Library is now handling cleanup of anchors in HTLibTerminate() as it
really is a part of the core.
Default addresses are taken out of HTAccess and added to HTHome. They
were mainly some default addresses for WWW_HOME etc.
The Library no longer uses environment variables at all
The flag WWW_WIN_ASYNC was introduced to use the new async
window interface. Add this define to your project if you want it
HTsocketWin and HTwinMsg made internal to HTEvntrg.c. They can now be
accessed through the methods
HTEvent_setWinHandle
HTEvent_winHandle
Added the following formats to the default initialization:
Changed format image/x-tiff to image/tiff and
image/x-png to image/png
Finished the News module using
non-preemptive sockets. Final testing is still missing
Changed HTRules so that it now is a stream. It also handles line
wrapping and has a much more solid parsing mechanism. See description in
User's Guide for more details.
Alpha Release 5, November 8 1995
The Library has now undergone a major restructuring in order to define the
APIs between the various parts and to make it more modular. The new
architecture is described in the Architecture document and includes a
new Net manager that handles a request queue, a DNS manager that handles
persistent connections and a well defined Request manager where the HTRequest
object is an opaque object.
Core Modules
Generalized the Error Manager to work
on lists instead of directly on a request object. This makes it
possible to use the error manager in all situations.
Changed HTAppName and HTAppVersion to parameters to
HTLibInit() instead of as global parameters.
Introduced a new main include file called WWWApp.h which contains applications
specific modules. None of these modules are actually required to compile
and link the Library, but the application can use them if needed.
None of the access modules (HTTP, FTP, NNTP, etc.) are set up by
default anymore. This is now for the applications to do.
Changed TRACE to WWWTRACE as it interferes with a macro
on NT
The balanced binary tree in HTBtree
is not used anymore. The functionality is now provided by the new HTArray module that is a dynamic array of
pointers. It is much like the HTChunk
module but for pointers instead of dynamic strings. The advantage is
that the fast qsort algorithm can be used on the array.
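The idea of a dynamic array of pointers that can be handed to qsort can be sketched as follows. This is a minimal illustration only: the PtrArray type and the ptrarray_* functions are hypothetical names chosen for this example and are not the HTArray interface.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable array of pointers (not the HTArray API). */
typedef struct {
    void **data;   /* element storage */
    size_t size;   /* number of elements in use */
    size_t cap;    /* number of elements allocated */
} PtrArray;

/* Append one pointer, doubling the storage when it is full.
   Returns 0 on success, -1 if memory could not be allocated. */
int ptrarray_add(PtrArray *a, void *item)
{
    if (a->size == a->cap) {
        size_t ncap = a->cap ? a->cap * 2 : 8;
        void **ndata = realloc(a->data, ncap * sizeof *ndata);
        if (ndata == NULL) return -1;
        a->data = ndata;
        a->cap = ncap;
    }
    a->data[a->size++] = item;
    return 0;
}

/* qsort comparator: each argument points at a slot holding a string. */
static int cmp_str(const void *x, const void *y)
{
    const char *s1 = *(void * const *)x;
    const char *s2 = *(void * const *)y;
    return strcmp(s1, s2);
}

/* Sort an array whose elements are all C strings. */
void ptrarray_sort_strings(PtrArray *a)
{
    if (a->size > 1)
        qsort(a->data, a->size, sizeof *a->data, cmp_str);
}

/* Release the storage; the elements themselves are not freed. */
void ptrarray_free(PtrArray *a)
{
    free(a->data);
    a->data = NULL;
    a->size = a->cap = 0;
}
```

Because the elements sit in one contiguous block, sorting needs no tree rebalancing; the cost is that the array must be sorted again after new elements are added.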
Moved the two functions HTWWWToLocal() and
HTLocalToWWW() from the HTFile
Module to the HTString module for
converting between local file names and URLs
It is now possible to register a call back function together with a
timeout that is used in the select call of the event loop. When the
select() call times out, the call back function is called. The
registration can be set up so that the call back is either always
called when the select call times out or only when Library
sockets are in use.
We have a port to PowerMac and 68K!!! This has been provided by Steven
T. Roussey <sroussey@eng.uci.edu> and Brad Barber
<barber@well.com>
We have changed HTThread to HTNet.c as
thread was too confusing. This module has been cleaned up and rewritten
so that HTNet.c now contains the Net
manager. The Net manager controls the net access so that we only keep a
specified number of sockets open simultaneously (called a request queue).
It also partly controls the management of persistent connections together
with the DNS Manager.
The HTRequest Object is now known to the Request Manager only. It is accessible
through a lot of methods just like the Anchor object etc. The HTAccess module is now a user interface
to the Request Manager but it doesn't have to be used - you can use the
Request Manager directly but often it is easier to go through the Access
Module.
The Protocol object has been turned into an object internal to the Protocol manager. This means that protocol
information can be accessed via a set of methods provided by the
manager.
We have changed case sensitive search to insensitive search when
finding an access method (HTTP, FTP, telnet etc.) for a URL. This
catches silly errors in URLs like HTTP://www.foo.com
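A case-insensitive lookup of the access method can be sketched as below. The function find_scheme and its table of scheme names are assumptions made for this illustration; the Library's own lookup goes through its protocol registration and is not shown here.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup: return the index of the scheme in `schemes`
   that matches the part of `url` before the first ':', ignoring
   ASCII case, or -1 if the URL has no ':' or no scheme matches. */
int find_scheme(const char *url, const char *const schemes[], size_t n)
{
    const char *colon = strchr(url, ':');
    size_t len, i, k;

    if (colon == NULL) return -1;
    len = (size_t)(colon - url);

    for (i = 0; i < n; i++) {
        if (strlen(schemes[i]) != len) continue;
        for (k = 0; k < len; k++) {
            if (tolower((unsigned char)url[k]) !=
                tolower((unsigned char)schemes[i][k]))
                break;
        }
        if (k == len) return (int)i;   /* every character matched */
    }
    return -1;
}
```

With a table of "http", "ftp" and "telnet", both HTTP://www.foo.com and http://www.foo.com resolve to the same entry.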
The static protocol declarations have been replaced with a set of
parameters to the creation method of a protocol object in the Protocol Manager which is more handy.
We added HTProtocol_delete() as method in the Protocol manager. This means that it is now
possible to unregister a protocol at runtime. If you are on a
platform with dynamic linking (for example DLLs) then this can save a lot
of space.
Introduction of application call back function for memory cache. This
is a part of separating the HText module
and also to make the memory cache manager more flexible. An example
implementation can be found in the GridText module in the Line Mode Browser.
Full support for the HTLink
object that binds together two anchors. This is a requirement for
keeping track of the Post
Web.
URL fragment identifiers were case insensitive - they are now case
sensitive
Added result field to the HTLink so that the result of a
posting operation is stored in the link object. This means that the
application can see which post operations succeeded and which didn't.
This can for example be used in a GUI client to show the relations
between the source and destination anchor as "dim" links.
Fixed bug in put mechanism that caused destinations to accumulate in
the postweb. Now `old' destinations are still registered but not included
in a new postweb - they can, however, be discarded altogether, but
often the information is nice to have. On a GUI client, it can be shown
as 'dimmed' destinations.
Removed remaining outputs directly to stdout using fprintf().
Now all goes through HTAlert module.
This was a problem in some of the protocol modules as well (especially in
the Telnet module).
The CacheItems structure is removed from anchor object and
replaced by (BOOL) CacheHit. If CacheHit=YES
then the format negotiation and suffix binding are not used in HTFile Module but the object is loaded
directly (from the local file cache) using non-blocking I/O
Internal DNS/hostname cache optimized and made more flexible to support
persistent connections and multiple connections to the same address for
all protocols that support this, for example HTTP, FTP, and NNTP. The
first version has support for HTTP only, but the other ones will follow
shortly. You can now control the garbage collection of DNS entries based
on time. The DNS object also keeps information about the remote server
(class of request, for example HTTP or FTP, and the type of server, for
example HTTP/1.0, HTTP/0.9 etc.). This means that it is possible to
adjust a request to a remote server once the type is known. For example,
the Library now distinguishes between HTTP servers version 0.9, 1.0, and
1.1. New classes and versions can be registered at run time just like
protocol modules.
HTDoConnect rewritten as a state
machine which makes it a lot easier to understand and change. It is
furthermore an important part of the persistent connection
management.
Removed HTMaxRedirect as a global variable. It is now
private to Request manager with two
methods to access it
Support for context swapping and call back function in the HTRequest object. This allows the
application to distinguish between multiple ongoing requests. See more
information in the User's Guide.
Bug fix in HTTCP.c HTGetHostName thanks to
"dave (d.) mielke" <davem@bnr.ca>
Protocol Modules
We have a first version of a new News
module which supports POST, persistent connections and a number of more
efficient NNTP commands.
Re-implemented long directory listings for FILE and FTP access using
streams. The HTML machinery is now separated into the new HTDir module. It provides long directory
listings for Unix, NT, and VMS.
Support for persistent connections in HTTP requests. This has
been enabled by the generic model in the DNS manager.
Support for the methods DELETE, LINK, and
UNLINK.
Re-implemented HTFTP module using
persistent connections, multiplexed (non-blocking) sockets and
streams for input and output. In fact the FTP module is now very capable
and it's half the size of what it was because the Library in
general supports the features needed in the FTP module. NOTE: Generation
of HTML objects from directory listings is disabled for the moment as we
are re-implementing it using streams. However, now it is possible to get
the raw source code directly from the FTP server by typing:
www -source ftp://ftp.w3.org
Re-implemented Gopher module so that
it uses streams for input and output instead of the obsolete HTSocket_getCharacter() method. The
module now also uses non-blocking I/O as the HTTP module and the file
module. A result of using streams is that you now can obtain the "real"
source (the ASCII object sent by the gopher server) by using
WWW_SOURCE as the output format. Before the source was
equivalent of the source of the HTML object that was generated on the
fly.
Updated Telnet module to understand URL
format user:passwd@host:port. Cleaned up the code and fixed a
set of bugs at the same time.
Support for Host header in HTTP
Request module. Actually this was Orig-URI but it was
changed on the fly.
Fixed bug on HTTP redirect if no body part was present. This is the new
"trend" especially from Netscape's server. The old servers always sent
back a small HTML document explaining if the client didn't understand
automatic redirect.
Stream Modules
We now have a MIME multipart parser
stream which supports nested MIME multipart messages. This stream
sets up a new MIME parser stream each time it finds a new body. Preamble
and epilog messages are ignored by default but can be redirected to the
special Debug Stream output of the
Request object.
We have added a content counter stream and a buffer stream in HTConLen module. This can be used to
count the number of bytes in a data object either to be sent to a remote
server or as a check of a received body. Together with the Content Length
counter stream there is also a buffer stream that can buffer up to a
certain amount of data before it goes transparent.
We have replaced the internal format identifier www/mime
with the MIME conforming message/rfc822 format. This may
affect the setup of stream converters that used the old format.
For the first time, it is not required to define dummy
definitions of the external declared function in the HText Interface. It is now only referenced
from the HTML module and the Plain Text Presentation module. This means
that applications that do not use the HTML/HText interface no longer have
the modules linked into the final object code.
The SGML module is no longer included
anywhere anymore except where necessary. The structured stream definition
can be found in HTStruct module
Added registration of callback functions for unknown MIME headers in the
MIME parser. The application can now
more easily experiment with new headers. The registration process is
described in the section How to get Started Writing
an App
Added support for png (lossless graphics format) (patch from unknown?)
into the Guess stream
Bug fixed in HTML parser. It couldn't
handle more than 20 nested HTML tags which can happen for example in
auto-generated HTML documents.
HTLoadError taken out completely. Replaced by the error manager. It was for (stupid)
historic reasons put into the HTML parser.
Application Modules
HTInteractive made a private flag to
HTAlert.c. It can now be reached via two
methods in the module.
All functions in the Alert Module now
have a request object as part of the calling parameters. The reason is
that then the implementation of this module can call a registered call
back function (as described above) so that the user message can be put
into the right context.
Memory leak fixed in HTLog module and
added result of request to the log file. The log module is no longer
called in the Library at all, so if you do not use it, it is not included
in the application.
Rewritten HTHistory module to perform
a better history mechanism - the old version did not work properly. Some
of the new features are
Support for multiple history lists. You can use the context
information in the request object to find the right list at any given
time.
The list now supports both back and forward for
navigation. The list is no longer "back trace with deletion".
All replaced with (wb|ab) if known binary output.
This was a problem on Windows 3.1 platform as it inserted extra
CRLF line terminators.
Created HTNumToStr() in HTString
Module to convert a number to a string using prefixes. This can be
used in the progress notification to write Read 1.6K etc.
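Converting a byte count to a short string with a size prefix can be sketched as follows. The function num_to_str below is a hypothetical stand-in, not the signature of HTNumToStr(), and it assumes that 1K means 1024 and that one decimal is wanted.

```c
#include <stdio.h>

/* Hypothetical helper (not the Library's HTNumToStr signature):
   write n into buf, scaled to K, M or G with one decimal.
   Assumes 1K = 1024; values below 1024 are written unscaled. */
void num_to_str(unsigned long n, char *buf, size_t len)
{
    static const char units[] = "KMG";
    double v = (double)n;
    int u = -1;                      /* -1 means "no prefix" */

    while (v >= 1024.0 && u < 2) {   /* stop at G, the largest prefix */
        v /= 1024.0;
        u++;
    }
    if (u < 0)
        snprintf(buf, len, "%lu", n);
    else
        snprintf(buf, len, "%.1f%c", v, units[u]);
}
```

A progress notifier could call this with the number of bytes read so far and print the result after the word Read.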
Library 3.1, November 14 1995
Official release
Library 3.1pre3 Release Notes, November 13
This is a last test of the 3.1 release. The official 3.1 release will
follow in a few days. The purpose of this third pre-release is simply to
avoid any obvious problems while we still have time. Very little will change
in the final release! The big difference is that Windows NT is fully
supported!
New Features and Interfaces
Changed "rs6000" to "AIX" and "decstation" to "ultrix" in BUILD script.
The previous names were not obvious
Have created HTProt module which
handles protocol module registration. It used to be in HTAccess module, but this module is now
uniquely for user requests
Changed the names of the following functions in the HTProt module
Canonicalization now understands host names terminated with a ":", for
example "www.w3.org:" is converted into "www.w3.org"
HTEvnttd.c is removed from the source tree; it's no longer needed.
Likewise, HTEvent.c is no longer needed; it has been superseded by
HTEvntrg.c and HTEvntrg.html.
except in HTFTP module as it
will be changed later. Some of the functions cause problems on
Solaris...
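The canonicalization of host names terminated with a ":" mentioned above can be illustrated with a small sketch. The function name strip_empty_port is made up for this example, and the sketch covers only the trailing colon, not the rest of the Library's canonicalization.

```c
#include <string.h>

/* Hypothetical helper: remove a single trailing ':' (an empty port)
   from a host string in place and return the same pointer.
   A host with a real port, such as "www.w3.org:8080", is unchanged. */
char *strip_empty_port(char *host)
{
    size_t n = strlen(host);
    if (n > 0 && host[n - 1] == ':')
        host[n - 1] = '\0';
    return host;
}
```

Applied to "www.w3.org:" this yields "www.w3.org", so both spellings map to the same cache key.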
Bug Fixes
Fixed bug in HTBind_getSuffix that caused garbage file suffixes
Fixed memory leak in HTAnchor if a child has more than one
destination
Library 3.1 Prerelease 2, November 1 1995
The code word for the 3.1 release is support for remote collaborative work
where people can use HTTP and the Web as a remote authoring environment. The
reason for the slight delay is that this release has a new "Post Web" model
for implementing PUT and POST, the documentation has been reorganized and
rewritten, and some other important features have been incorporated. The Post
Web model is described in the new documentation, see the reference below.
In addition to the new set of features and functionality this release also
is the first example of source code distributed under the W3C conditions.
This means that the code is available to consortium members only within a
month from the release date.
A lot of the work put into this release has been to update the API of the
Library. This is mainly described in the new "User Guide", so please do read
this and remember that comments are welcome!
New Features and Interfaces
The Post Web model is supported by the following methods to handle the
HTRequest object
HTRequest_removeDestination()
HTRequest_linkDestination()
HTRequest_unlinkDestination()
HTRequest_removePostWeb()
HTRequest_killPostWeb()
HTFile module is completely rewritten
as a state machine much like the HTTP
module. It is now possible to have non-blocking, interruptible I/O
and to PUT and POST from a local file as well as from a remote HTTP
server.
HTCopyAnchor is rewritten to use the PostWeb
The thread model is extended to include a HTThread_kill() function so
that threads are terminated immediately upon request
HTProxy module is introduced. This
module substitutes the environment variables for defining gateways and
proxies. It can now be done dynamically at run time. For backwards
compatibility HTProxy_getEnvVar() can be used to read the most used
environment variables.
Taken socket read/write functionality out of HTFormat and created HTSocket module which handles all the
basic network access
Better handling of WAIS src files and use of gateway information
Upgraded the WAIS to handle version 0.5 of the freeWAIS library.
Better handling of media types from WAIS responses (guessing)
All HTTP headers can now be transmitted and received and they can all
be enabled/disabled using a bitmask.
New functions to support accept encoding, language, and charset
Guess stream now handles macbinhex format
No more circular references in any of the Library include files
Introduced REMOVE and RMDIR as macros instead of direct system
calls
Removed HTRequest_clear() - it's not safe to reuse a request object
HTStrip() is now placed in the HTString module
introducing the method used in the link object in the anchor
HTTP modified to support PostWeb
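One of the items above notes that HTTP headers can be enabled and disabled using a bitmask. A minimal sketch of that technique follows; the HeaderFlag names and the helper functions are invented for this illustration and do not reproduce the Library's own enumeration.

```c
/* Hypothetical header flags, one bit per header
   (illustrative only, not the Library's enum). */
typedef enum {
    HDR_ACCEPT     = 1 << 0,
    HDR_FROM       = 1 << 1,
    HDR_REFERER    = 1 << 2,
    HDR_USER_AGENT = 1 << 3
} HeaderFlag;

/* Return the mask with the given header switched on. */
unsigned header_enable(unsigned mask, HeaderFlag f)
{
    return mask | (unsigned)f;
}

/* Return the mask with the given header switched off. */
unsigned header_disable(unsigned mask, HeaderFlag f)
{
    return mask & ~(unsigned)f;
}

/* Return 1 if the header is switched on in the mask, else 0. */
int header_enabled(unsigned mask, HeaderFlag f)
{
    return (mask & (unsigned)f) != 0;
}
```

A request generator would then test each bit before writing the corresponding header line, so one integer in the request describes the whole header set.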
Bug Fixes
Cache is now avoided if in secure mode (no access to local file
system)
Bug fix to allow simultaneous read and write on the same socket
Stream stack now dumps to local file if no conversion can be made
Bug fix that caused core dumps for HTTP 0.9 servers
Bug fixed in stream stack if system() call is not present. This release
has a new model called "Post Web" for handling put and post from a source
to a multiple number of destinations. Furthermore
Library 3.1 Prerelease 1, May 20 1995
A lot of the work put into this release has been to update the API of the
Library. This is mainly described in the new User Guide,
so please do read this and remember that comments are welcome!
New Features and Interfaces
Windows NT Support (and possibly Windows 3.1 with Win32S) thanks to
Charlie Brooks, <cbrooks@osf.org>
A lot easier handling of headers in the HTTP module including support
for extra headers
Improved HTTP/MIME parser which
recognizes 99% of the HTTP headers. The rest will be ready in the next
release! This includes support for "charset" and "level" parameter
Support for PUT and POST. This is not yet fully implemented but it is
possible to experiment with it
Big enhancement of the anchor module with support for all HTTP/1.0
headers, garbage collection etc.
The Cache manager is made more object oriented with a broader set of
methods. The garbage collector is not the best
HTBind module created for better bindings to file system with new hash
function for improved speed and more functionality than before
Created HTDateTimeStr() and HTMessageID() in HTString
Separated HTStructured into its own stream definition module (HTStruct.h)
Introduced STREAM_TRACE and BIND_TRACE as new debug flags
Original reason messages are now passed back as a parameter by the
error handler when talking to a remote HTTP server
Introduction of error_format in the request object. This can be used to
get debug information out of the Library
Better handling of media types in FTP and Gopher including recognition
of UU encoded files
Bug Fixes
The event loop now responds faster to events as it calls select
more often
HTTP version strings are now strings and not floats
Many platform dependent macros introduced and a lot of cleanup
CERN terminated
its direct engagement in the World Wide Web with the release of Library
version 3.0. The code is now developed and maintained by W3C.
Library 3.0, Mar 21 1995
Many of the modifications and new features are mentioned under the
pre-releases.
Changed all float to double in order to make it
consistent with the working floating type in C
Fixed problems with memory in HTML.c
Introduced definition of errno constants for WIN32, as
WinSock doesn't define them using BSD notation
Library 3.0 Prerelease 3, Mar 10 1995
New or Changed Features
All library include files which contain public information for
applications are now contained in the single include file WWWLib.h which is the only one necessary
PLEASE DON'T USE ANYTHING ELSE!!!
Changed TRACE messages so that the target is the macro TDEST and not
stderr. This means that on platforms that don't support stderr, TRACE
messages can be redirected to a local file.
Due to the PC Port some modules (definition and declaration files) have
changed names so that the max length is 8 characters:
Moved EnableFrom to HTAccess module.
This variable determines whether the HTTP header From: should
be generated. The default value is off
Sockets are no longer assumed to be small, non-negative integers, but
are handled through macros. This should ensure portability to Windows NT.
Introduced socerrno and errno so that WinSock can use its own
definition whereas `local' errno can still use the well-known
version.
Introduced error_stream as a field in the HTRequest object. All
information contained in HTTP responses which don't naturally contain a
body entity, for example redirection codes (3xx) and client error codes
(4xx) will be put down this stream so that it can be put into a debug
window.
HTNewsHost is now a local variable in the HTNews module. Use HTGetNewsHost and
HTSetNewsHost to set and get the current value. The news module will be
rewritten in the near future as it has many problems.
The HTTP module understands all HTTP/1.0
return codes and is more solid
Many portability problems have been solved and optimized. Most system
dependent things are now put into tcp
module
Interface to CSO name server made nicer - generates correct HTML
Bug Fixes
file:// no longer tries ftp:// if host=localhost
Improved proxy support and fixed bug when reloading a document from a
proxy
Bug fixed in HTGetHostName() which didn't include a dot <.>
Bug fixed if UserID/passwd was not correct and don't want to retry
HTErrorAdd and HTErrorSysAdd now always return HT_ERROR
Fixed bug in HTGetDomainName when no domain name is present at all
Add output_flush to request object
Uses IOCTL as a macro now - not fcntl
and a lot of other stuff...
Library 3.0 Prerelease 2, Dec 2 1994
Introduced memory cleanups from Eric Sink into
HTLibTerminate()
Now the client can provoke a call to
HTEventRequestTerminate() even when the request never enters
the eventloop. This is necessary so that the client can cancel busy icons,
spinning globes etc.
Introduced EVENT_TERM as return code for the
HTEventHandler function
HTEventHandler() now has a double pointer so that the
request pointer can be modified from the client
Fixed bug in HTSearch() and HTLoadRelative()
where wrong return code was returned (BOOL instead of int)
Introduced BlockingIO field in the request object to override the
mode registered in the protocol object. This can be used as an easy way
to make blocking I/O
Library 3.0 Prerelease 1, Nov 26 1994
New and Changed Features
Introduced the function HTLibInit() and HTLibTerminate() which MUST be
called when the application starts up and terminates.
Introduced the modules HTThread and HTEvent. HTEvent is the client
interface when using multithreaded functionality and HTThread is the
internal socket management
All __STDC__ defines now concentrated to HTUtils.html where it is
called _STANDARD_CODE_ so that _cplusplus also handles this
`new' and `template' not used as names anymore (confuses C++)
Removed the from field in the HTRequest object. This is now handled by the
functions HTGetMailAddress and HTSetMailAddress and the flag HTEnableFrom
in the HTTP module.
Changed HTSetMailAddress so that a call with a parameter equal to NULL or
"" clears the contents of the mail-address.
The number of parameters to Streamstack function is now compatible with
the arguments to a stream converter, so that we don't lose any
information while putting up the stream stack.
HTOutputSource variable is removed. You should use WWW_SOURCE in the
request object
HTGuess stream is a converter compatible stream so that it actually
can be set up as a converter
The stream methods `abort' and `_free' now return int instead of void.
On success from these methods `_free' returns 0 and `abort' returns
EOF
Taken HTEscape and HTUnescape and put them into the HTEscape module.
The functionality is the same but now they can be used in utility
programs without linking in the whole Library
Bug Fixes
Removed bug in WAIS module which caused a lot of core dumps
Removed bug in format classification from URL suffix in HTGopher when
the file was plaintext
localhost is now recognized again after canonicalization in
HTLoadFile()
WAIS SEARCH now produces proper HTML
Max number of lines in WAIS decreased to 200 as 250 (previous) dumps
core
Bug fixed in HTGetHostName() if we must use the
getdomainname() function.
Added some support for SCO
Handling of gopher items of type ERROR (3) changed so no more core
dumps in server
2.17 Release on November 25 1994
New and Changed Features
The host-cache is now extended so that it tries all
IP-addresses before it fails the request. It always starts with the
fastest IP-address. Different penalties are added to the connect times
depending on the errno returned from connect().
In addition to HTSimplify which canonicalizes the URL path, a new
function now canonicalizes the host-part of the URL. This means that URLs
like
are now all treated identically. This is useful for all the caches based
on URLs such as the Server document cache and the hostname cache
TRACE is now differentiated into a bit flag so that TRACE messages can
be turned on and off for individual groups of messages. This was
necessary as the amount of verbose output was growing too much
Redirection now understands `URI:' and `Location:'. Implemented after
discussion on www-talk
Changed 404 Error Message to `Access Forbidden'. The URL is no more
included in the message as it is sensitive information
FTP client now sends full email address of the user as the password for
anonymous access instead of USER@. This allows access to some servers
which don't accept the old format
The data connection in a FTP session is now based on the return value
sent by the PASV return code rather than the URL. It is not always the
case that the data connection is on the same host as the control
connection
UserID and Passwd in FTP URLs can now contain special characters, like
'@' etc.
The Gopher listings are now slightly lighter and don't contain the
`name' and `files' any more
Support of Gopher info items. They are treated as normal messages. The
gopher code for this is `i'
WAIS module now guesses the stream format when TEXT is returned from
the WAIS library as it might be HTML.
The Protocol modules: FTP, WAIS, and Gopher now produce proper HTML
with etc. in the beginning
Added a function that returns the domain name taken from the same
location as HTGetHostName(). The functionality of obtaining current host
names, mail addresses etc. has improved, see HTTCP.html for more
Introduced the flag HTInteractive in HTAccess.html to tell whether
functions in HTAlert can prompt the User from within the Library or not.
Default is YES.
The common BUILD for the Line Mode Browser, the CERN Server and the W3C
Sample Code Library now accepts a command line option:
BUILD linemode | daemon | library
to build a specific component. The default action is to build all
three parts. BUILD is now also provided in a Bourne Shell version
Bug Fixes
Fixed memory leak in HTWAIS Doc retrieval. However, the functionality
or performance has not changed
If no host is found in the URL then no attempt is made to connect to
host 0.0.0.0 that is localhost. Some hosts do have an alias for this
address
Fixed free memory read in redirections. Put redirection counter into
request object. Now no more than (default) 10 redirections are
allowed
Fixed bug in FTP module to handle really slow hosts in the select call.
The select timed out without the right action taken.
FTP module chopped off the first line of a Windows NT ftp server as it
doesn't send a traditional first UNIX line. Fixed! Well, it did
look like UNIX, but no more than that :-(
Problem in HTTeXGen. Some markup was spread over a new line, and LaTeX
doesn't like that.
Remove the ACCESS_AUTH define as it is never used anymore
(no more compilation without access authentication)
Fixed bug in the ISO Latin 1 translation table in HTML.c. This was a
problem for Estonian documents or others with many special characters
Fixed bug in HTSimplify not skipping host names. HTSimplify is made
faster and is now only called once (both from the server and the client).
Before it was called 2-3 times.
Bug found in FTP URLs containing UserID and Passwd fixed
Bug fixed in name generation in client cache
Removed bug in FTP and IP-rotation on multi-homed hosts. When FTP
server is in PASV mode it sends back a port number on a specific host. In
this case we can't use IP-address rotation.
Bug with FTP IP-address network order fixed for PASV mode
Gopher errors (code 3) are now just put as a string as they are not
real errors
Bug fixed in HTWriter.c function flush() where a partial
success in NETWRITE would produce a wrong output (repeated buffer)
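The kind of bug described in the last item, where a partial write leads to repeated output, is usually avoided by advancing the buffer pointer by the number of bytes actually accepted. The sketch below shows that pattern under stated assumptions: write_fn, flush_all, Sink and sink_write are names invented here, and the code is not the HTWriter implementation.

```c
#include <stddef.h>
#include <string.h>

/* Write primitive that may accept fewer bytes than requested.
   Returns the number of bytes written, or -1 on error. */
typedef long (*write_fn)(void *ctx, const char *buf, size_t len);

/* Write the whole buffer. After a partial write, continue from the
   first unwritten byte instead of resending from the start.
   Returns 0 on success, -1 on error. */
int flush_all(write_fn wr, void *ctx, const char *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        long n = wr(ctx, p, len);
        if (n <= 0) return -1;   /* treat 0 as an error to avoid spinning */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Demo sink that accepts at most 3 bytes per call, to force
   partial writes. */
typedef struct {
    char out[64];
    size_t used;
} Sink;

long sink_write(void *ctx, const char *buf, size_t len)
{
    Sink *s = ctx;
    size_t n = len < 3 ? len : 3;
    if (s->used + n >= sizeof s->out) return -1;   /* keep room for '\0' */
    memcpy(s->out + s->used, buf, n);
    s->used += n;
    s->out[s->used] = '\0';
    return (long)n;
}
```

Feeding "hello world" through flush_all with sink_write leaves exactly that string in the sink, with no repeated fragments.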
2.16 Prerelease 2, November 1994
WAIS Client
The WAIS client has been improved and some bugs have been fixed:
Bug in the parser of the search result from the WAIS module fixed
Maximum number of lines presented from a search made a configuration
variable. Default value is 100 (was 40)
Introduced WAIS's own error messages as they are returned from the WAIS
library
The presentation of WAIS on the screen made nicer (well - I think it
is!)
HTTCP Module
Bug in the host cache fixed
Access Authorisation
Premature free of memory fixed
Missing initialization fixed
2.16 Prerelease 1, April 1994
New Features and Changed Interfaces
HTTP Client
HTTP module contains the code for the HTTP
client. The module is now reorganized and made more modular.
Automatic Redirection
Now supported by the HTTP Module. The name of the new URL is passed
to the client via the error_stack as an ERR_INFO message, see HTError module. The maximum number of
redirections is set by the variable HTMaxRedirections.
Referer Field in HTTP request
Clients are provided the possibility of sending a Referer Field in a HTTP Request. This is done by filling out
the HTRequest->parentAnchor field.
From field in HTTP Request
Clients can now send the full email address of the current user in
the HTTP From field. The feature is
turned off by default as it might get a bit tricky through a Proxy.
204 Response
Support of return code `204 No Response'
FTP Client
HTFTP module contains the code for the FTP
client. The FTP client has changed a lot in this release. It is now a
complete state machine where the actual action executed is a function of the
current state.
The client now follows the suggestions given in RFC 1123: "Requirements
for Internet Hosts -- Application and Support".
Establishment of the data connection now complies with RFC 1579:
"Firewall-Friendly FTP" such that the procedure is
try PASV
if that fails, try PORT
The URL is now parsed according to the (latest) specifications:
url : f t p: / / login / path [ ftptype ]
login : [ user [ : password ] @ ] hostport
hostport: host [ : port ]
ftptype : A formcode | E formcode | I | L digits
formcode: N | T | C
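The login and hostport rules of this grammar can be sketched in C as follows. The FtpLogin type and parse_ftp_login are hypothetical names for this example. Splitting at the last '@' is an assumption made here so that the user part may itself contain '@'; the sketch does not unescape characters or handle the ftptype suffix.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical result record for "[user[:password]@]host[:port]". */
typedef struct {
    char user[64];
    char password[64];
    char host[128];
    int  port;            /* 0 when the login gives no port */
} FtpLogin;

/* Copy n bytes from src into dst, truncating to fit, and terminate. */
static void copy_part(char *dst, size_t dstlen, const char *src, size_t n)
{
    if (n >= dstlen) n = dstlen - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
}

/* Parse the login part of an FTP URL.
   Returns 0 on success, -1 if the host is empty. */
int parse_ftp_login(const char *login, FtpLogin *out)
{
    const char *at = strrchr(login, '@');       /* assumption: last '@' */
    const char *hostport = at ? at + 1 : login;
    const char *colon;

    out->user[0] = out->password[0] = out->host[0] = '\0';
    out->port = 0;

    if (at) {
        /* First ':' before the '@' separates user from password. */
        const char *sep = memchr(login, ':', (size_t)(at - login));
        if (sep) {
            copy_part(out->user, sizeof out->user, login,
                      (size_t)(sep - login));
            copy_part(out->password, sizeof out->password, sep + 1,
                      (size_t)(at - sep - 1));
        } else {
            copy_part(out->user, sizeof out->user, login,
                      (size_t)(at - login));
        }
    }

    colon = strchr(hostport, ':');
    if (colon) {
        copy_part(out->host, sizeof out->host, hostport,
                  (size_t)(colon - hostport));
        out->port = atoi(colon + 1);
    } else {
        copy_part(out->host, sizeof out->host, hostport, strlen(hostport));
    }
    return out->host[0] ? 0 : -1;
}
```

A caller would pass the text between "ftp://" and the next "/" and fall back to anonymous login when the user field comes back empty.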
Both directory listings and file retrieval use the same procedure:
First try to go to the location directly, as we are often talking
to a UNIX server or one that 'understands' UNIX syntax
If it fails, then go to the location step by step using CWD. In
that way we should not have any problems on any platform, and thus it
is not necessary to make special hacks for VMS, etc.
Long directory listings are supported for unix-like systems and VMS.
This includes NetWare and WindowsNT. See Future plans
for more and Directory Listings
Information from the FTP-server is by default presented to the client
using the following rule:
If you are connecting to the root directory at a ftp site, we show
the 'login' message (might be a concatenation of several messages)
just like in a normal ftp session.
If you have a more specific URL, then you probably already know the
site and are less interested in the login message. Instead we show
any local message when making a CWD to the right location.
Gopher Client
The Gopher client has been revised and
its error handling has been improved.
Information Messages
Some Gopher servers send back information messages in a line
containing "error.host". This information is treated like login
information from FTP servers so that it is represented as a message
before or after the actual listing.
Iconized Listings
Listings now contain icons in the same way as the other listings.
CSO Name Server
The CSO Name Server client outputs in HTML and not only <PRE>
as before.
Content Type Recognition
The Gopher module uses its own content-type recognition, inherited
from HTTP, when handling Gopher text and Gopher binary files. This
means that, for example, PostScript files are handled correctly.
Local File Access
The new version of the HTFile module is a lot smaller, as all directory
listing code has moved to the HTDirBrw module. New error handling has been
implemented.
Passive and Active Connection Establishment
Calls to connect() and accept() now go through the functions HTDoConnect()
and HTDoAccept() respectively.
Cache of Host Names and Addresses
HTParseInet(), which is called from within HTDoConnect(), now has an internal
cache of the names and (possibly multiple) IP addresses of visited hosts.
This minimizes access to the file /etc/hosts and the Domain Name Server,
although aliases are not recognized in the cache.
The default cache size is 500 entries, and a host stays in the cache as long
as connect() succeeds. That is, if a connection is refused for some reason, the
host is taken out of the cache.
The time to make a connection to a multihomed host is measured every time
and a mean access time is calculated so that HTDoConnect always takes the
fastest IP-address.
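The selection of the fastest address can be sketched in C along the following lines. The HostEntry layout and the function names are hypothetical and do not reflect the Library's actual cache structure. Trying unmeasured addresses first is an assumption made so that every address gets at least one sample.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ADDRS 8

/* Hypothetical cache entry for one multihomed host. */
typedef struct {
    int    count;               /* number of addresses in use */
    double mean[MAX_ADDRS];     /* mean connect time per address, seconds */
    long   samples[MAX_ADDRS];  /* number of measurements per address */
} HostEntry;

/* Fold one measured connect time into the running mean for address idx. */
void host_record(HostEntry *h, int idx, double seconds)
{
    assert(h != NULL && idx >= 0 && idx < h->count);
    h->samples[idx]++;
    h->mean[idx] += (seconds - h->mean[idx]) / (double)h->samples[idx];
}

/* Return the index of the address to try first: the first address that
   has never been measured, otherwise the one with the lowest mean. */
int host_fastest(const HostEntry *h)
{
    int i, best = 0;
    assert(h != NULL && h->count > 0 && h->count <= MAX_ADDRS);
    for (i = 0; i < h->count; i++) {
        if (h->samples[i] == 0) return i;
        if (h->mean[i] < h->mean[best]) best = i;
    }
    return best;
}
```

The incremental update avoids storing every sample: each new measurement moves the mean by the difference divided by the sample count.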
Improved Functionality of DNS requests
The Library now provides functionality for obtaining the full mail address
of the user, full domain name of the host and also the possibility for
setting both values. This means that the user can use his official email
address, e.g. in the HTTP request.
Long directory listings are supported for HTTP, FTP and files on the local
file system. For the moment only a part of the functionality (e.g. sorting,
which columns to show, etc.) is exploited.
Icon Management
Icons in directory listings are bound to MIME content-types and encoding.
They can be found in the HTIcons module.
The default set of icons is set up using HTStdIconInit() and new icons can be
added dynamically using HTAddIcon().
File Descriptions in Directory Listings
File descriptions are supported for long HTTP directory listings. By
default, the description is taken from the title of each HTML file.
Error and Information Message Management
A new error handling module is introduced in HTError. It uses the error_stack entry in the
HTRequest object. It handles nested error
messages so that we can give a reason for the error, e.g.
Error in ...
This error occurred because ...
This is caused by ...
etc.
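A nested error list of this kind can be sketched in C as shown here. The ErrorStack type, error_add() and error_format() are hypothetical names chosen for illustration. They are not the HTError interface, and the "Error:" and "Reason:" prefixes are likewise assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ERRORS 8

/* Hypothetical error stack; the real error_stack lives in HTRequest. */
typedef struct {
    const char *msg[MAX_ERRORS];
    int depth;
} ErrorStack;

/* Push a message. The first push is the root cause; later pushes add
   the context in which it happened. Returns 0 if the stack is full. */
int error_add(ErrorStack *s, const char *msg)
{
    assert(s != NULL && msg != NULL);
    if (s->depth >= MAX_ERRORS) return 0;
    s->msg[s->depth++] = msg;
    return 1;
}

/* Write the stack into buf, most recent message first, so the user reads
   the general error followed by its reasons. Returns 0 on truncation. */
int error_format(const ErrorStack *s, char *buf, size_t size)
{
    size_t used = 0;
    int i;

    assert(s != NULL && buf != NULL && size > 0);
    buf[0] = '\0';
    for (i = s->depth - 1; i >= 0; i--) {
        const char *prefix = (i == s->depth - 1) ? "Error: " : "Reason: ";
        int n = snprintf(buf + used, size - used, "%s%s\n", prefix, s->msg[i]);
        if (n < 0 || (size_t)n >= size - used) return 0;
        used += (size_t)n;
    }
    return 1;
}
```

The low-level code that detects the failure pushes first, and each caller on the way back up adds its own message, which yields the layered output described above.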
It also makes it possible for the Library to pass information back to the
client so that the Library doesn't act like a `black hole'. An example is
HTTP redirection with status code `Moved 301'. Now the new URL is passed back
to the client via the error_stack so that the client can update the reference
when possible.
The function that generates and outputs the error messages to the user is
put into the HTErrorMsg module so that it can be overridden by a smart client or
server.
Guessing the Content Type of a Stream
The HTGuess module reads part of a stream and determines the content type
with the highest probability from a statistical analysis.
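The statistics that HTGuess actually uses are not described here. As a rough illustration of the idea only, the C sketch below classifies a buffer by the share of control bytes and a couple of leading-byte checks. The function name guess_content_type() and the 5% threshold are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Guess a content type from the first `len` bytes of a stream.
   Hypothetical helper; thresholds and rules are illustrative only. */
const char *guess_content_type(const unsigned char *buf, size_t len)
{
    size_t i, ctrl = 0;

    assert(buf != NULL);
    if (len == 0) return "application/octet-stream";

    /* Count control bytes below 32 other than common whitespace. */
    for (i = 0; i < len; i++) {
        unsigned char c = buf[i];
        if (c < 32 && c != '\n' && c != '\r' && c != '\t') ctrl++;
    }

    /* More than 5% control bytes: assume binary data. */
    if (ctrl * 20 > len) return "application/octet-stream";

    /* PostScript files start with "%!". */
    if (len >= 2 && memcmp(buf, "%!", 2) == 0) return "application/postscript";

    /* Text whose first non-blank character opens a tag: assume HTML. */
    for (i = 0; i < len && (buf[i] == ' ' || buf[i] == '\t' ||
                            buf[i] == '\n' || buf[i] == '\r'); i++)
        ;
    if (i < len && buf[i] == '<') return "text/html";

    return "text/plain";
}
```

A production guesser would weigh many more signals; the point here is only that a decision can be made from a prefix of the stream before the rest has arrived.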
Minor Stuff
tmpnam()
Because of problems on NeXT platforms, the tmpnam() function is now
replaced by HTFWriter_filename() in HTFWriter.c. The function has two
modes: it gives back either a hash name or the last part of the URL (which
is normally more readable).
HTMLPutImg()
New function to make it easier to put out an HTML <IMG>
tag.
HTParseInet()
Added one more parameter to tell whether it is a multihomed host or
not. (This is used in the host cache).
HTInetStatus()
Should no longer be used directly; it is called from HTErrorAdd so that
the message goes all the way back to the user.
HTError
This typedef is now obsolete and will be removed in future
releases
HTLoad()
Added new parameter to HTLoad: BOOL keep_error_stack. If YES then the
error_stack is not cleared. This is used in redirection etc.
HTLoadError()
Because of the new HTError module,
this function in HTML.c is not needed anymore.
Bug Fixes
This is a list of fixed bugs from earlier versions.
Memory faults in HTSimplify() in HTParse.c have been fixed.
README files in directory listings now know how to handle '<',
'>' and '&' correctly. The file still has to be ASCII, though.
tmpnam is no longer used in the Library because of problems on the NeXT
platform. Instead, a new function called HTFWriter_filename() has been
written in HTFWriter.c.
HTInputSocket_getCharacter now returns an int and not a char so that EOF
is no longer a member of the char set.
HTMLGen_start_element() is only allowed to put extra '\n' in
<PRE> mode if it is between parameters in a tag
Changed type of <IMG> into SGML_EMPTY so that it doesn't expect
end tag </IMG>
Nested <PRE> is no longer a problem in HTMLGen_start_element.
Removed all #elif as not all compilers on HPUX like it.
Changed HTChunk such that chunk->data is '\0' terminated at any
time. This actually makes HTChunkTerminate less needed but be aware that
HTChunk->size changes.
Removed non-portable d_namlen field in HTMulti.
Moved definition of NO_GROUPS to tch.html
Moved definition of HT_MAX_PATH to tch.html
Proxy server now closes connection in HTTP.c. This was only a problem in
non-forking servers (VMS).
Definition of HT_NO_DATA moved to HTUtils.html where the other return
codes are placed.
Functions from HTAlert Module that
prompt the user don't get confused about ctrl-D anymore.
Lou Montulli's Lynx and LIST NEWSGROUP diffs put in to HTNews.c
Complete HTML+ DTD in
parser as per Dave Raggett's internet draft 00. All 78 elements in the
spec plus COMMENT, H7, HTML, XMP, LISTING and PLAINTEXT for
back-compatibility.
Backup files created on PUT to local file (in server or client)
according to global flag HTTakeBackup, defaults to YES.
Changed interface to HText
-- added HText_new3
creation method. All clients using this interface should ensure that the
structured stream used during creation is reopen()ed if it is still
needed to hold document structure after the original load has been done.
In this case it should later be freed.
CHANGED INTERFACE to HTAccess: Now
an HTRequest environment is passed to almost all calls. Allows better
reentrant use of library for image embedding, etc. Implies (minor)
changes to all clients.
Changed HTConverter definition. Should be internal only, unless you
register new converters. Now takes more useful parameters.
HTSaveAndCallBack stream allows simple implementation of embedded
graphics: just load them with all conversions you can handle using
HTSaveAndCallBack -- you will be called back with a filename when the
image is on disk.
Initialization: Calling of HTInit functions now not done by library --
if you want them, you call them from the application.
xxx_proxy environment variables for forwarding requests to a proxy
(e.g. http_proxy , ftp_proxy , gopher_proxy ).
no_proxy environment variable to override the proxy; e.g.
Fix to Gopher code -- Now tries to do own filetyping with unknown
gopher types. Even if this fails, it doesn't truncate binary files.
Local directory listing bug with trailing slash fixed.
Better multiformat handling, HTMulti(), with welcome page support.
The eight misplaced cern_httpd access authorization modules moved away
from the library.
Parts with inefficient list handling rewritten -- better
performance.
Non-modal changes for password prompting -- I suspect this doesn't work
yet. Someone who is implementing a GUI client please contact me
<luotonen@dxcern.cern.ch>.
The PlainToHTML text converter now works.
A basic HTMLToTeX converter is implemented. It understands fairly well
written HTML documents but cannot recover from serious HTML errors.
All caching is turned off on the client side in this release until a
more stable solution is found.
tmpnam() not used any more as it caused problems on a NeXT. Instead the
function HTFWriter_filename() returns a hash or the last part of the URL
as filename.
HTML titles now get parsed as well. This is in order to convert foreign
letters etc. But if somebody puts a list into the title....
Bug fix in HTML generation -- line wrapping code produced garbage.
Affected servers working as gateways and also editors writing back
HTML.
Bug fix: static string returned by crypt() was free()'d.
4 Nov 93 Version 2.13
Bug fix: FTP text transfers no longer end when 8-bit characters are used.
They still end on character -1. Thanks to Bjoern Stabell
(bjoerns@staff.cs.uit.no)
A single function HTPromptUsernameAndPassword() so that GUIs can have a
single authentication dialog box.
Extensive filename suffix recognition defaults.
11 Oct 93 Version 2.12
Authorization in. See Ari's documentation. Everything is
back-compatible, but there are large extensions to the rule file syntax,
plus new ACL and protection setup files. Authorization is done by
password in the clear now, with hooks for public key later.
Some other bug fixes. Not all the reported ones.
8 Sept 93 Version 2.11
Binary transfer problems tackled in HTTP. Yukky problem of existing
illegal 0.9 server producing binary which looks like text is unsolved. No
problems with HTTP 1.0 transfer.
Streams: end_document method removed. "Abort" method added as requested
at W5 -- use this when a pipe (stream stack) is broken by an error. This
prevents execution etc. of partially written files, for example.
New stream HTNetToText for converting Net ASCII to local C
representation.
Bug fix: Directories represented with anchors all with empty string (as
opposed to no) names: many NAME="" in anchors.
Bug fix: application/octet-stream replaces application/binary as per
MIME.
CSO Nameserver code to HTGopher added as Mosaic originally from ???.
(AS)
Bug fix: HTTPD would return an old-style (non-MIME) response when
sending back a file in unrecognised format, even if client was using HTTP
1.0.
Version 2.10 Not released due to WWWWW
Bug fix: crashed with error message when doing a Gopher search.
WAIS type "HTML" recognised as HTML. Other types treated as
application/binary.
30 November 1993, version 2.09a
Bug fix: A space was not allowed before the tagcloser (>) of an end
tag. Now it is. Also (untested), empty close tags should be allowed a la
"<h1>heading</>". Not recommended, but in.
Bug fix: HTML nesting stack overflow check.
Bug fix: Compilation under IBM MVS didn't work.
23 November 1993, version 2.09
At last this stupid library is aware of content-transfer-encoding. When
defining file suffixes you must define not only the content-type (eg
application/postscript) but also the encoding (eg binary, or 7bit). The first
effect of this is to get the proper mode selected for FTP transfers. This
means you can do a
On the client side, you can define the presentation method for any given
MIME-style "content type". This is done in the rule file too.
2.08
You can define file suffix conventions in the server and/or client rule
files. This allows you to set up a server which, for example, treats .ai
files as application/postscript, etc, or specifies that .ME is generally
text.
You can define how a new content type will be rendered in the rule file
of a client.
15 November 93, version 2.07
If you telnet to the www client (-h option): 1. it is more secure: No
local file access. 2. you can use /etc/www-remote.url file to specify the
url of the home page for telnet users.
Library uses fopen() rather than open(), allowing file access on
non-unix machines.
7 November 93, 2.06
Check against local hostname in file access is now case
insensitive.
HTML parser: entities were not expanded at outer level.
HTTP client sends client name and version. Requires HTAppName
FTP client fix: Continuation lines which do not have - in are now
handled properly as per the spec.
May 1993 version 2.05a
Bug fix in directory reading of local files -- it didn't work! This
stopped the daemon working on directories.
Bug fix in DIR handling for browsers: tabs work again in directory
listings.
22 April 1993 version 2.04
WAIS handling code in library, no need to go through a gateway if you
compile with the right options and link with freeWAIS. First version of
this, no bells or whistles.
wwwsys.h systems specifics for SCO picked up from mosaic but directory
handling not fixed yet ...
20 April 1993 version 2.03c:
Use of identifier "this" removed to prevent conflicts with C++
_AIX cpp defined picked up for AIX.
In networking code, no assumption that \n is 10, even on ASCII
machines (for Mac/MPW).
Server bug fix after Sebastian.Wilhelmi@isst.fhg.de -- exited with bad
status even if all OK.
19 April 1993 version 2.03b
Distribution bug fix: s/Specific/All/ on line 2 of
WWW/All/Makefile.product
SGML bug fix: treatment of entities was on when it should be off and
vice versa.
Sun binaries linked with -Bstatic to make them transportable
Bug fix: Sometimes, client would ignore first part of returned
file.
17 Mar 93 Version 2.02-beta
Bug fix: Falls back to old HTTP with servers which don't like the new
HTTP.
BUILD file for easy installation of library, httpd and www
15 Mar 93 Version 2.0 alpha:
Incompatible library release has cleaned up interfaces. Developers read
the .h files! Version 2 libraries must be compiled with version 2 products
and vice-versa. Version 2 clients will access (most) version 1 servers OK,
version 2 servers will respond correctly to (all) version 1 clients.
Bug fix: Gopher connections wasted socket numbers and eventually used
them all up. (Thanks Marc Andreessen)
Rule file used by clients as well as servers.
MIME parser takes content-type field from MIME messages and invokes
appropriate registered presentation method for each type.
Local and remote multiformat files.
***HTSearch and HTLoadRelative parameters changed! HTMainAnchor and
HTMainText globals are no longer used by the library.
HTAccess package now registers known protocols in list.
HT{G,S}etNewsHost available from HTNews module.
SGML parser speeded up
New software interface introduced: HTStructuredClass. This should make
use of the library easier with clients, and allow more code in common
between clients and servers.
Hooks for handling new formats with other applications
General hooks for status messages, user query, progress monitoring,
etc. (HTAlert module)
6 January 1993, Version1.1a
Can revisit telnet nodes.
Tn3270 access type accepted.
FTP password for anonymous is now WWWuser@ without a hostname, for
software on ftp.uu.net etc.
Bug fix in HTML.c: Would crash when a list was the first visible
element in a text object.
Added numeric character reference handling for future use.
December 1992, Version 1.0c
Fixed bug in FTP handling (FTP file retrieve put control connection in
bad state)
Fixed bug in Gopher handling on non-ASCII platforms. Also, bug fix in
Gopher search of index whose name contained characters (like blanks)
escaped with %.
Fixed bug in NEWS handling, failed on non-ASCII platforms.
November 1992 Version1.0b
Fix some bugs in Make system.
Memory bug fixed: On failure to connect to HTTP server, it would free()
an uninitialized pointer!
Some trace messages were output to stdout instead of stderr
Allow "ftp:" prefix on URL . Effect is currently equivalent to that of
the "file:" prefix.
Local file access not allowed in secure mode. (telnet access was never
allowed in this mode). Secure mode is used for telnet server and mail
robot.
November 1992 version 1.0
Library libwww.a made independently of browsers to save time and space
and to force good modularity