Document Date : 16th Sept. 2004

Introduction:
--------------

Caravan ver. 3.17 implements a much-needed extension to HTTP 1.1 that makes it possible to transfer large amounts of data over busy and unreliable data links.

The problem is that the larger the amount of data being sent, the longer the transfer takes and the higher the probability that the connection breaks. Really large amounts of data may take a few tries before they are finally sent. On very busy lines, such large data may also block other, more urgent data in the queue, with no way to stop its transmission without losing the time already spent sending it: when it is retransmitted, it has to start all over again.

Most HTTP 1.1 servers already support a solution for this by implementing byte ranges in GET requests. The client accumulates the data received in each aborted session and requests only the remaining byte range in each subsequent request. While this is fine for file downloads initiated by browsers, it does not solve the problem where it is most needed. An application that generates large data and needs to send it across the WAN to other sites uses either the HTTP PUT or the HTTP POST method, and byte ranges cannot be used while putting or posting. This is the problem addressed by Caravan 3.17. It would be very nice if browsers also started supporting this protocol.

The Protocol:
-------------

By using the 'Expect: 100-continue' header along with 'chunked' transfer-encoding and a few additional headers, it is possible to implement such a scheme within the specifications of HTTP 1.1. This is how Caravan implements the new feature, which will be called 'http-resume':

When making an HTTP POST/PUT request to an HTTP 1.1 server, the sender sends the 'Expect' header with value '100-continue', an additional header 'Offset' with value 'bytes', an entity tag 'ETag' associated with the body of this message, and the 'transfer-encoding' header with value 'chunked'.
This is a sample header:

    PUT /myfile.data HTTP/1.1
    ETag: xyz
    Expect: 100-continue
    transfer-encoding: chunked
    Offset: bytes

When the receiver gets this header:

1. If the message is acceptable, it sends the status code 100 (Continue). If it is not acceptable, it may send an error status such as 400 (Bad Request) and discard the message body.

2. If the receiver understands the Offset header and can retrieve the entity with the given ETag, it returns the offset header with the size of its own copy of the entity in bytes (hexadecimal format), and also the entity tag.

This is the 100-continue from a receiver that understands http-resume:

    HTTP/1.1 100 Continue
    ETag: xyz
    offset: FFF

A server which does not understand the Offset header will ignore it and will not include the 'offset' header in its response.

If the sender reads a 100-continue status that carries the offset header, it can assume that the receiver understands the http-resume protocol, and it sends the body of the message, skipping the number of bytes specified in the offset header. If the offset header is not present, or if the 100-continue is missed, the sender starts sending the body from the beginning.

The chunked transfer encoding fits nicely here because it allows additional information to be sent along with each chunk size, and also some trailing header fields. Without these the receiver would not be able to verify that its 100-continue has been seen by the sender. With a chunked body, one can send additional information even after the header has been sent. In this scheme the first chunk size is followed by the offset value the sender is using, and a trailing header 'txstatus' with value 'over' or 'not-over' is sent after the last chunk.

The chunked body will look like this:

    abcd ; offset=FFF
    ..........
    bbb
    .............
    0
    txstatus: over

If the sender was forced to abort the transmission because of a higher-priority message, the chunked body will look like this:

    abcd ; offset=FFF
    ..........
    0
    txstatus: not-over

The receiver assembles the body by appending the new data to the existing entity. If the txstatus is 'over', it continues to service the request normally. If the txstatus is not 'over', or if the connection was broken, the new data is still appended to the existing entity and the status code 406 (Not Acceptable) is returned to the sender. The txstatus is to be discarded and not added to the request headers.

This sub-protocol is designed to maintain HTTP/1.1 persistent connections whenever possible and to fit nicely with the existing scheme of things. For this to work, both the sender and the receiver must be HTTP 1.1 compliant and must support the new headers.

Programming caravan for http-resume:
--------------------------------------

To enable http-resume in any Caravan 'PUT' or 'POST', one has to indicate this by creating an _Etag property in the form object.

    form myform
    myform(_url)="action.html"
    myform(_server)="targergetserver.com"
    myform(_port)="88"
    myform(_content)=myfile(file)
    myform(_Etag)="xyz"
    myform(put)
    if reply(_responsecode)="406"
        "File was not sent completely"
    elseif reply(_responsecode)="200"
        "ok file sent 100%"
    endif

The _Etag value must be unique: the receiver should be able to uniquely identify this entity among all other entities it might have in its local storage.

Caravan automatically uses http-resume for all puts and posts done through an event handler. Event handlers are written to process Caravan message queues. For an explanation of queues and event handlers, please see the documentation.

If the receiving system is not HTTP 1.1 compliant, you must disable this protocol by setting _Etag="". For HTTP 1.1 systems it will not create any issues.

If this code is in the event handler, Caravan automatically adds an _Etag if none is present. This is guaranteed to be unique for post methods and for files from the Caravan database. Caravan uses (location + queueid + CRC of body) to create the ETag.
Still, it is better to create an ETag which is independent of the queueid, because http-resume will then be effective even if the item is removed from the queue and re-enqueued later.

To use http-resume from other scripts, the programmer will have to set '_Etag' in the form or 'Etag' in the _request object.

Please see the document "Caravan : Focus on HTTP" on using Caravan for messaging applications.

Requirement : Caravan 3.17 or higher.
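
Appendix: sender-side sketch
----------------------------

As an illustration of the sender side of the protocol described above, here is a minimal Python sketch. It is not Caravan code and not the Caravan implementation. The function names parse_resume_offset and build_chunked_body, the chunk_size parameter, and the choice to write the chunk extension without spaces are assumptions made for this example only. The first function reads the hexadecimal offset from a 100-continue response; the second wraps the bytes sent in one session into a chunked body whose first chunk carries the offset and whose trailer carries txstatus.

```python
def parse_resume_offset(response_head: str) -> int:
    """Return the offset (in bytes) announced by a 100-continue response.

    Returns 0 when the status is not 100 or no usable offset header is
    present, which means "send the body from the beginning".
    """
    lines = response_head.split("\r\n")
    status = lines[0].split()
    if len(status) < 2 or status[1] != "100":
        return 0
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "offset":
            try:
                return int(value.strip(), 16)  # offset is hexadecimal
            except ValueError:
                return 0
    return 0


def build_chunked_body(payload: bytes, offset: int,
                       finished: bool = True,
                       chunk_size: int = 4096) -> bytes:
    """Encode `payload` (the bytes sent in this session) as a chunked body.

    `offset` is the position in the full entity at which `payload` starts.
    `finished` selects the trailer: 'over' or 'not-over'.
    """
    out = bytearray()
    extension = ";offset=" + format(offset, "X")
    for start in range(0, len(payload), chunk_size):
        chunk = payload[start:start + chunk_size]
        size_line = format(len(chunk), "x") + extension
        extension = ""  # only the first chunk carries the offset
        out += size_line.encode("ascii") + b"\r\n" + chunk + b"\r\n"
    # Assumption of this sketch: if there was no data to send, the offset
    # is attached to the terminating zero-size chunk instead.
    out += ("0" + extension).encode("ascii") + b"\r\n"
    out += b"txstatus: " + (b"over" if finished else b"not-over") + b"\r\n\r\n"
    return bytes(out)
```

A sender would typically call parse_resume_offset on the interim response, slice the entity at the returned offset, and pass the remainder to build_chunked_body. When the transfer has to be aborted early, it would pass only the bytes actually sent, with finished=False.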