Lecture 11
HTTP and Web Services
Ketan Mayer-Patel
University of North Carolina
HTTP
- Protocol for retrieving "resources"
- What's a protocol?
- What's a resource?
Protocol
- A formal set of rules for communicating
- Example of a human protocol?
- Networking protocols are a bit more formal
- E-mail: SMTP
- Transferring files: SCP, FTP, SFTP
- Web: HTTP
Web Resources via HTTP
- Addressed by a URL
- Possibly dynamic
- Generated on the fly
- Using parameters provided in the request
- Not specific to (X)HTML
- HTTP can be used to transfer almost any type of information.
- Web Pages: (X)HTML
- Images: JPEG, PNG, GIF
- Stylesheets: CSS
- Scripts: JavaScript
- Structured Information: XML
Overview of an HTTP Exchange
- Connection established
- Client sends request
- Server sends response
- Connection closed
Connection Established
- HTTP uses TCP for connection services
- Protocols are "layered".
- More complex protocols use the services of simpler protocols.
- TCP
- Transmission Control Protocol
- Reliable pipe-like connection between programs
- Server: listening for new connections.
- Client: initiates the connection.
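The four steps of the exchange can be sketched with plain TCP sockets. This is a minimal illustration, not a real web server: it runs both ends on the loopback address, and the names serve_once, fetch, and run_exchange are hypothetical helpers invented for this example. The request and reply text it sends uses the message format covered in the following slides.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server side: accept one connection, answer it, then close it."""
    conn, _addr = listener.accept()
    with conn:
        request = b""
        # Read until the empty line that ends the request headers.
        while b"\r\n\r\n" not in request:
            data = conn.recv(4096)
            if not data:
                break
            request += data
        body = b"hello"
        head = (
            "HTTP/1.0 200 OK\r\n"
            "Content-Type: text/plain\r\n"
            f"Content-Length: {len(body)}\r\n"
            "\r\n"
        )
        conn.sendall(head.encode("ascii") + body)
    # Leaving the "with" block closes the connection (step 4).


def fetch(host: str, port: int, path: str) -> bytes:
    """Client side: connect, send one request, read until the server closes."""
    with socket.create_connection((host, port)) as sock:   # 1. connection established
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))               # 2. client sends request
        chunks = []
        while True:                                         # 3. server sends response
            data = sock.recv(4096)
            if not data:                                    # 4. connection closed
                break
            chunks.append(data)
    return b"".join(chunks)


def run_exchange() -> bytes:
    """Run one complete exchange on 127.0.0.1 using an OS-chosen free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)                  # server: listening for new connections
        port = listener.getsockname()[1]
        server = threading.Thread(target=serve_once, args=(listener,))
        server.start()
        reply = fetch("127.0.0.1", port, "/")   # client: initiates the connection
        server.join()
    finally:
        listener.close()
    return reply
```

Calling run_exchange() returns the raw bytes of the reply, status line first and body last.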
HTTP Request Overview
- 3 parts:
- Request line
- Zero or more header lines
- Empty line indicates the end of headers
- Request body
- Sometimes referred to as the "message body"
- Can be empty.
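As a rough sketch, the three parts of a request can be assembled into the bytes that go over the wire. The function name build_request and its parameters are assumptions made for this example. Each line of an HTTP message ends with a carriage return plus line feed (CRLF), so the empty line that ends the headers shows up as two CRLF pairs in a row.

```python
def build_request(method, resource, headers=None, body=b"", version="HTTP/1.0"):
    """Assemble request line + header lines + empty line + body into bytes."""
    lines = [f"{method} {resource} {version}"]        # part 1: request line
    for name, value in (headers or {}).items():       # part 2: zero or more headers
        lines.append(f"{name}: {value}")
    head = "\r\n".join(lines) + "\r\n\r\n"            # empty line ends the headers
    return head.encode("ascii") + body                # part 3: body (may be empty)
```

For example, build_request("POST", "/form", {"Content-Length": "3"}, b"a=1") yields a request line, one header, the empty line, and a three-byte body.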
HTTP Reply Overview
- 3 parts:
- Status line
- Zero or more header lines
- Reply (or message) body
- This will be the requested resource.
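A reply can be taken apart along the same three-part structure. The sketch below assumes the whole reply is already in memory as bytes; split_reply is a hypothetical helper name, and the sample reply in the usage note is made up for illustration.

```python
def split_reply(raw: bytes):
    """Split a raw reply into (status line, header lines, body)."""
    # The first empty line (CRLF CRLF) separates the headers from the body.
    head, _separator, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line, header_lines = lines[0], lines[1:]
    return status_line, header_lines, body
```

Given b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<p>hi</p>", this returns the status line as text, a one-element list of header lines, and the body as bytes.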
Request Line
- Single line of text with 3 parts:
METHOD RESOURCE VERSION
METHOD
- Indicates what the client wants to do with the resource
RESOURCE
- Identifies the resource.
- Either a full URL or just the path part
VERSION
- Identifies the HTTP version
- Either: "HTTP/1.0" or "HTTP/1.1"
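A minimal sketch of how a server might split the request line into its three parts. RequestLine and parse_request_line are names chosen for this example, and the version check only covers the two versions listed above.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str
    resource: str
    version: str


def parse_request_line(line: str) -> RequestLine:
    """Split 'METHOD RESOURCE VERSION' into its three parts."""
    parts = line.strip().split(" ")
    if len(parts) != 3:
        raise ValueError(f"expected 3 parts, got {len(parts)}: {line!r}")
    method, resource, version = parts
    if version not in ("HTTP/1.0", "HTTP/1.1"):
        raise ValueError(f"unsupported version: {version!r}")
    return RequestLine(method, resource, version)
```

For instance, parse_request_line("GET /index.html HTTP/1.1") gives a method of "GET", a resource of "/index.html", and a version of "HTTP/1.1".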
Request Methods
- Not all servers will support all of these methods.
- GET
- PUT
- POST
- DELETE
- Less commonly used: OPTIONS, HEAD, TRACE, CONNECT
Naming Resources
PROTOCOL://HOST/PATH
Examples:
http://www.cs.unc.edu/~kmp
http://scores.espn.go.com/nfl/nflpreview?gameId=301003001
http://www.nytimes.com/2010/10/03/world/asia/03marines.html?_r=1&hp
HTTP only requires the path portion (including any query string) as the resource name.
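To make the split concrete, Python's standard urllib.parse.urlsplit breaks a URL into the pieces named above. The helper resource_name is a hypothetical function for this sketch; it returns the part of the URL that would appear in a request line.

```python
from urllib.parse import urlsplit


def resource_name(url: str) -> str:
    """Return the path (plus query string, if any) that names the resource."""
    parts = urlsplit(url)
    path = parts.path or "/"          # an empty path means the root resource
    return f"{path}?{parts.query}" if parts.query else path


parts = urlsplit("http://scores.espn.go.com/nfl/nflpreview?gameId=301003001")
protocol = parts.scheme               # PROTOCOL
host = parts.netloc                   # HOST
resource = resource_name("http://scores.espn.go.com/nfl/nflpreview?gameId=301003001")
```

Here protocol is "http", host is "scores.espn.go.com", and resource is "/nfl/nflpreview?gameId=301003001".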
Request Line Example
GET /nfl/nflpreview?gameId=301003001 HTTP/1.0
Headers
- Provide additional information about the request/reply
- General headers
- Used for both requests and replies
- Request and response headers
- Specific to either request or reply
- Entity headers
- Information about message body
Header Format
- One header per line
- Syntax: HEADER_ID: VALUE
Date: Mon, 4 Oct 2010 17:30:00 GMT
Empty line indicates no more headers
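The header format lends itself to a short parsing sketch. parse_headers is a hypothetical helper; it assumes the lines have already been split apart and stripped of their line endings. Header names are case-insensitive, so the sketch lowercases them. As a simplification, a repeated header overwrites the earlier value.

```python
def parse_headers(lines):
    """Turn 'HEADER_ID: VALUE' lines into a dict, stopping at the empty line."""
    headers = {}
    for line in lines:
        if line == "":                # empty line: no more headers
            break
        # Split at the first colon only; values such as dates contain colons too.
        name, separator, value = line.partition(":")
        if not separator:
            raise ValueError(f"not a header line: {line!r}")
        headers[name.strip().lower()] = value.strip()
    return headers
```

Passing ["Date: Mon, 4 Oct 2010 17:30:00 GMT", ""] produces a dict with the single key "date" mapped to the full date string.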
Common Headers
- Content-length
- Needed when a request has a body; optional for replies
- Content-type
- MIME type specifier: general/specific
- Identifies the type of the message body
- Examples: text/html, application/xhtml+xml, text/xml, image/gif, ...
Common Headers, cont'd
- Expires
- User-agent
- Host
- Indicates the name of the server that a request is intended for.
- Required by HTTP/1.1 to support virtual hosting (several web sites served from one machine).
- Cookie
Reply Status Line
- Format: VERSION CODE REASON
- Status Code
- 3 digits
- 1xx (Informational)
- 2xx (Success)
- 3xx (Redirection)
- 4xx (Client Error)
- 5xx (Server Error)
- Reason
- Short human-readable phrase describing the code, e.g. "OK" or "Not Found"
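The status line format and the five code classes can be sketched as follows. parse_status_line, status_class, and STATUS_CLASSES are names made up for this example.

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def parse_status_line(line: str):
    """Split 'VERSION CODE REASON' into (version, code, reason)."""
    # Split at most twice so a reason such as "Not Found" keeps its space.
    version, code_text, *rest = line.strip().split(" ", 2)
    if len(code_text) != 3 or not code_text.isdigit():
        raise ValueError(f"status code must be 3 digits: {code_text!r}")
    reason = rest[0] if rest else ""
    return version, int(code_text), reason


def status_class(code: int) -> str:
    """Map a status code to its class using the first digit."""
    return STATUS_CLASSES.get(code // 100, "Unknown")
```

For example, parse_status_line("HTTP/1.1 404 Not Found") gives the version "HTTP/1.1", the code 404, and the reason "Not Found", and status_class(404) reports "Client Error".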
1.0 vs 1.1
- Host header required for 1.1
- Persistent connection mode.
- Allows multiple request/reply exchanges using the same connection.
- 1.0 requires a new connection for each request/reply exchange.
- Many new headers, e.g. Cache-Control (caching) and Transfer-Encoding (chunked transfer)
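Persistent connections can be observed with Python's standard http.server and http.client modules. In this sketch, which is an illustration rather than a production setup, a loopback server replies with the client's source port. If every reply reports the same port, all requests travelled over one TCP connection. PortEchoHandler and fetch_ports are hypothetical names for this example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PortEchoHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 lets the standard-library server keep connections open.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        # Reply with the TCP source port of the client's connection.
        body = str(self.client_address[1]).encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # With a persistent connection the client needs the length to know
        # where this reply ends and the next one begins.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_ports(paths):
    """Send one GET per path over a single HTTPConnection; return ports seen."""
    server = HTTPServer(("127.0.0.1", 0), PortEchoHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
    ports = []
    try:
        for path in paths:
            conn.request("GET", path)      # http.client adds the Host header
            reply = conn.getresponse()
            ports.append(int(reply.read().decode("ascii")))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return ports
```

Calling fetch_ports(["/a", "/b", "/c"]) should return three identical port numbers. Changing protocol_version to "HTTP/1.0" makes the server close the connection after each reply, so the client reconnects and the ports usually differ.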
Putting it together
- httpview.cgi
- Request for a page:
GET /nfl/nflpreview?gameId=301003001 HTTP/1.0
Host: scores.espn.go.com
Content-length: 0